Google is launching what it considers its largest and most capable artificial intelligence model Wednesday as pressure mounts on the company to answer how it’ll monetize AI.
The large language model Gemini will come in three sizes: Gemini Ultra, its largest, most capable category; Gemini Pro, which scales across a wide range of tasks; and Gemini Nano, which it will use for specific tasks and mobile devices.
For now, the company is planning to license Gemini to customers through Google Cloud for them to use in their own applications. Starting Dec. 13, developers and enterprise customers can access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI. Android developers will also be able to build with Gemini Nano. Gemini will also be used to power Google products like its Bard chatbot and Search Generative Experience, which tries to answer search queries with conversational-style text (SGE is not widely available yet).
Companies and enterprises could use it for more advanced customer service engagement via chatbots and product recommendations, as well as identifying trends for companies looking to advertise products. Gemini could also be used for content creation if a company wants to create marketing campaigns or blog content, as well as productivity apps that may want to summarize meetings or generate code for developers.
In one example the company gave, Gemini took a screenshot of a chart, analyzed hundreds of pages of research and then updated the chart. In another, it analyzed a photo of a person’s math homework, identifying correct answers and pointing out incorrect ones.
Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities, the company said in a blog post Wednesday. It can supposedly understand nuance and reasoning in complex subjects.
“Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research,” wrote CEO Sundar Pichai in a blog post Wednesday. “It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video.”
Starting today, Google’s chatbot Bard will use Gemini Pro to help with advanced reasoning, planning, understanding and other capabilities. Early next year, the company will launch “Bard Advanced,” which will use Gemini Ultra, executives said on a call with reporters Tuesday. The change represents the biggest update to Bard, Google’s ChatGPT-like chatbot.
The update comes eight months after the search giant first launched Bard and one year after OpenAI launched ChatGPT on GPT-3.5. In March of this year, the Sam Altman-led startup launched GPT-4. Executives said Tuesday that Gemini Pro outperformed GPT-3.5 but dodged questions about how it stacked up against GPT-4.
However, Gemini’s Ultra model outperformed GPT-4 in a few benchmarks, according to a white paper Google released Wednesday.
When asked if Google has plans to charge for access to “Bard Advanced,” Google’s general manager for Bard, Sissie Hsiao, said it is focused on creating a good experience and doesn’t have any monetization details yet.
When asked at a press briefing whether Gemini has any novel capabilities compared with current-generation LLMs, Eli Collins, vice president of product at Google DeepMind, answered, “I suspect it does,” but added that the company is still working to understand Gemini Ultra’s novel capabilities.
Google reportedly postponed the launch of Gemini because it wasn’t ready, bringing back memories of the company’s rocky rollout of its AI tools at the beginning of the year.
Multiple reporters asked about the delay, to which Collins answered that testing the more advanced models takes longer. Collins said Gemini is the most highly tested AI model the company has built and that it has “the most comprehensive safety evaluations” of any Google model.
Collins said that despite being its largest model, Gemini Ultra is significantly cheaper to serve. “It’s not just more capable, it’s more efficient,” he said. “We still require significant compute to train Gemini but we’re getting much more efficient in terms of our ability to train these models.”
Collins said the company will release a technical white paper with more details of the model on Wednesday but said it won’t be releasing the parameter count. Earlier this year, CNBC found Google’s PaLM 2 large language model, its latest AI model at the time, used nearly five times the amount of text data for training as its predecessor LLM.
Also on Wednesday, Google introduced its next-generation tensor processing unit for training AI models. The TPU v5p chip, which Salesforce and startup Lightricks have begun using, offers better performance for the price than the TPU v4 announced in 2021, Google said. But the company didn’t provide information on performance compared with market leader Nvidia.
The chip announcement comes weeks after cloud rivals Amazon and Microsoft showed off custom silicon targeting AI.
During Google’s third-quarter earnings conference call in October, investors pressed executives with more questions about how the company is going to turn AI into actual profit.
In August, Google launched an “early experiment” called Search Generative Experience, or SGE, which lets users see what a generative AI experience would look like when using the search engine — search is still a major profit center for the company. The result is more conversational, reflecting the age of chatbots. However, it is still considered an experiment and has yet to launch to the general public.
Investors have been asking for a timeline for SGE since May, when the company first announced the experiment at its annual developer conference Google I/O. The Gemini announcement Wednesday hardly mentioned SGE and executives were vague about its plans to launch to the general public, saying that Gemini would be incorporated into it “in the next year.”
“This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company,” Pichai said in Wednesday’s blog post. “I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere.”
— CNBC’s Jordan Novet contributed to this report.