Google I/O Keynote – Gemini 1.5 Pro

Gemini, Google’s family of generative AI models, can now analyze longer documents, codebases, videos and audio recordings than before.

During a keynote at the Google I/O 2024 developer conference Tuesday, Google announced the private preview of a new version of Gemini 1.5 Pro, the company’s current flagship model, that can take in up to 2 million tokens. That’s double the previous maximum.

At 2 million tokens, the new version of Gemini 1.5 Pro supports the largest input of any commercially available model. The next-largest, Anthropic’s Claude 3, tops out at 1 million tokens.

In the AI field, “tokens” are subdivided bits of raw data, like the syllables “fan,” “tas” and “tic” in the word “fantastic.” Two million tokens is equivalent to around 1.4 million words, two hours of video or 22 hours of audio.

Beyond being able to analyze large files, models that can take in more tokens can sometimes achieve improved performance.

Unlike models with small maximum token inputs (the maximum input is otherwise known as context), models such as the 2-million-token-input Gemini 1.5 Pro won’t easily “forget” the content of very recent conversations and veer off topic. Large-context models can also, hypothetically at least, better grasp the flow of the data they take in and generate contextually richer responses.
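For a rough sense of scale, the figures quoted above (2 million tokens for around 1.4 million words) imply about 0.7 words per token. The Python sketch below uses that ratio to estimate whether a text of a given length would fit in the 2-million-token input. The function names and the fixed ratio are illustrative assumptions, not part of any Google API, and real token counts vary with the tokenizer, the language and the content.

```python
# Ratio implied by the article's figures: 2,000,000 tokens ~ 1,400,000 words.
# This is an approximation; actual tokenizers will produce different counts.
WORDS_PER_TOKEN = 1_400_000 / 2_000_000  # about 0.7

# Input limit quoted for the Gemini 1.5 Pro private preview.
CONTEXT_LIMIT_TOKENS = 2_000_000


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count for a text of `word_count` words (hypothetical helper)."""
    return round(word_count / WORDS_PER_TOKEN)


def fits_in_context(word_count: int, limit: int = CONTEXT_LIMIT_TOKENS) -> bool:
    """Return True if the estimated token count is within `limit` tokens."""
    return estimate_tokens(word_count) <= limit


if __name__ == "__main__":
    for words in (100_000, 1_400_000, 1_500_000):
        print(words, estimate_tokens(words), fits_in_context(words))
```

An estimate like this is only good for a first check; an exact count has to come from the tokenizer of the model in question.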