The tech company has launched initial versions of its latest large language model and a real-time image generator in an effort to keep pace with OpenAI
On Thursday, Meta Platforms unveiled preliminary versions of its newest large language model, Llama 3, and a real-time image generator. This move is part of Meta’s efforts to compete with the leading generative AI provider, OpenAI.
These models will be incorporated into the Meta AI virtual assistant, which Meta touts as the most advanced among its free-to-use counterparts. The assistant will receive greater prominence within Meta’s Facebook, Instagram, WhatsApp, and Messenger apps, as well as on a new standalone website. This positioning aims to enable Meta to directly challenge ChatGPT, the successful offering from Microsoft-backed OpenAI.
The announcement coincides with Meta’s urgent efforts to introduce generative AI products to its extensive user base, aiming to rival OpenAI’s dominant position in the field. This endeavor involves revamping computing infrastructure and merging previously separate research and product teams.
Chris Cox, Meta’s chief product officer, mentioned in an interview that Llama 3 has been enhanced with new computer coding capabilities and trained on images as well as text. However, the current version of the model will only produce text outputs.
Cox further stated that future versions will incorporate more advanced reasoning abilities, such as the capacity to devise longer, multi-step plans. Meta also indicated in blog posts that upcoming versions planned for release in the coming months will feature “multimodality,” allowing them to generate both text and images.
“The ultimate aim is to assist in lightening your load, making your life easier, whether it involves interacting with businesses, writing, or planning a trip,” Cox explained.
He noted that incorporating images into Llama 3’s training would enhance an upcoming update to the Ray-Ban Meta smart glasses, a collaboration with glasses manufacturer EssilorLuxottica. This update will enable Meta AI to recognize objects seen by the wearer and provide information about them.
Additionally, Meta revealed a new collaboration with Google, a subsidiary of Alphabet, to incorporate real-time search results into the assistant’s responses. This complements an existing partnership with Microsoft’s Bing.
The Meta AI assistant is expanding its reach to more than a dozen markets beyond the US, including Australia, Canada, Singapore, Nigeria, and Pakistan. Cox mentioned that Meta is still refining its approach for Europe, where privacy regulations are stricter and the upcoming AI Act is expected to introduce requirements such as disclosing models’ training data.
The insatiable data appetite of generative AI models has become a significant point of contention in the technology’s advancement.
As part of its efforts to catch up, Meta has been making models like Llama 3 available for free commercial use by developers. This move is strategic, as the success of a robust free option could disrupt competitors’ plans to monetize their proprietary technology. However, critics have expressed concerns about the safety implications of allowing unscrupulous developers access to such models.
In a video accompanying the announcement, Meta CEO Mark Zuckerberg acknowledged this competitive landscape, referring to Meta AI as “the most intelligent AI assistant that you can freely use.”
Zuckerberg said the largest version of Llama 3 is still being trained, has 400 billion parameters, and has already achieved a score of 85 on MMLU (Massive Multitask Language Understanding), a benchmark used to gauge the capability and performance of AI models. The two smaller versions being released now have 8 billion and 70 billion parameters, respectively; he added that the latter scored approximately 82 on MMLU in tests.
Developers had expressed frustration with the previous version, Llama 2, for failing to grasp basic context. For example, it would misinterpret queries about “killing” a computer program as requests for instructions on committing murder. Rival Google has run into similar problems, recently pausing its Gemini AI image generation tool after criticism that it produced inaccurate depictions of historical figures.
Meta stated that it addressed these issues in Llama 3 by utilizing “high-quality data” to help the model understand nuances better. However, it did not provide specifics about the datasets used. Meta mentioned that it fed seven times more data into Llama 3 compared to Llama 2 and utilized “synthetic” or AI-generated data to enhance areas such as coding and reasoning.
Cox stated that there was “not a major change in posture” regarding how the company sourced its training data.