Google’s latest AI models compared with features and performance


Gemini 3 vs Gemini 3 Pro vs Gemini 3 Deep Think: The rollout of Gemini 3, Google’s latest large language model (LLM), appears to be the tech giant’s strongest AI debut in recent months, drawing positive feedback from users and developers alike.

Early reviews suggest that Gemini 3 is a highly capable foundational AI model, especially when it comes to handling reasoning-heavy tasks. The model was shipped on November 18, with Google promoting its arrival as a ‘new era of intelligence’.

Gemini 3 is designed to provide better answers to more complex questions compared with prior models. It is also said to be the best model that Google has built for ‘vibe-coding’, the controversial practice where users mostly rely on AI tools to generate code and build software.

According to Google, its advances with Gemini 3 are reflected in the model’s performance across several benchmark tests. The company claimed that Gemini 3 outperforms its predecessor on every major AI benchmark, topping the LMArena leaderboard and earning top marks on Humanity’s Last Exam and GPQA Diamond.

However, public benchmarks have been criticised as unreliable indicators of real-world AI performance because they can be easy to game, and hands-on use can surface quirks that scores do not capture. For instance, famed AI researcher Andrej Karpathy pointed out that Gemini 3 refused to believe that it was 2025 since its pre-training data only included information up to 2024. But he also acknowledged that his early impression of Gemini 3 was positive.

As feedback continues to roll in over the next few weeks, let’s take a closer look at the Gemini 3 family of models and what each of them has to offer.

Gemini 3

Gemini 3 is said to possess multimodal reasoning capabilities, meaning that it combines reasoning abilities with vision and spatial understanding as well as multilingual skills. It also has a one-million-token context window, which allows users to ask long, complex and nuanced questions.

For developers, Gemini 3 is capable of handling complex prompts and instructions to render richer, more interactive web UI. According to Google, Gemini 3 is ‘exceptional’ at zero-shot generation, which means that it can generate software elements without being explicitly trained on such elements.

In terms of use cases, Google said that users could, for instance, ask Gemini 3 to decipher and translate handwritten recipes in different languages into a shareable family cookbook. “It can even analyse videos of your pickleball match, identify areas where you can improve and generate a training plan for overall form improvements,” the company said.

Gemini 3 has been subjected to several safety tests in order to reduce sycophancy and improve resistance to malicious prompt injection attacks, as per Google.

On the benchmark front, Gemini 3 topped the WebDev Arena leaderboard with a score of 1487 Elo. It also scored 54.2 per cent on Terminal-Bench 2.0, which tests a model’s ability to use tools to operate a computer via the terminal, and outperformed Gemini 2.5 Pro on SWE-bench Verified (76.2 per cent), a benchmark that measures the performance of coding agents. The model further topped the Vending-Bench 2 leaderboard, which tests longer-horizon planning by having the model manage a simulated vending machine business.

Gemini 3 Pro

“Gemini 3 Pro demonstrates better long-horizon planning to generate significantly higher returns compared to other frontier models,” Google said.

Its responses are smart, concise, and direct. “It acts as a true thought partner that gives you new ways to understand information and express yourself, from translating dense scientific concepts by generating code for high-fidelity visualizations to creative brainstorming,” the company added.

Gemini 3 Pro outperforms 2.5 Pro on every major AI benchmark. It topped the LMArena leaderboard with a breakthrough score of 1501 Elo. It received top scores on Humanity’s Last Exam (37.5 per cent without the use of any tools) and GPQA Diamond (91.9 per cent).

Gemini 3 Pro is said to be highly capable at solving complex problems across a vast array of topics like science and mathematics. It set a new high score (23.4 per cent) on MathArena Apex, a benchmark for evaluating frontier models on mathematics. Its multimodal reasoning extends beyond text as the model scored 81 per cent on MMMU-Pro and 87.6 per cent on Video-MMMU.

Gemini 3 Pro’s responses are also more likely to be factually accurate, as it scored 72.1 per cent on SimpleQA Verified.

Gemini 3 Deep Think

Gemini 3 Deep Think is an enhanced reasoning mode that pushes Gemini 3’s multimodal reasoning capabilities even further to help users solve more complex problems.

In testing, Gemini 3 Deep Think outperformed Gemini 3 Pro on Humanity’s Last Exam (41.0 per cent without the use of tools) and GPQA Diamond (93.8 per cent). It also achieved 45.1 per cent on ARC-AGI-2 (with code execution, ARC Prize Verified), demonstrating its ability to solve novel challenges.
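
To put those figures side by side, the short Python sketch below works out the gap between Deep Think and Gemini 3 Pro on the two benchmarks for which the article reports both scores. The numbers are the ones quoted above; the dictionary layout and the function name `deep_think_gains` are purely illustrative and not part of any Google tooling.

```python
# Scores (per cent) as reported in this article; names here are illustrative.
reported_scores = {
    "Humanity's Last Exam (no tools)": {
        "Gemini 3 Pro": 37.5,
        "Gemini 3 Deep Think": 41.0,
    },
    "GPQA Diamond": {
        "Gemini 3 Pro": 91.9,
        "Gemini 3 Deep Think": 93.8,
    },
}


def deep_think_gains(scores):
    """Return the Deep Think minus Pro gap, in percentage points, per benchmark."""
    return {
        name: round(vals["Gemini 3 Deep Think"] - vals["Gemini 3 Pro"], 1)
        for name, vals in scores.items()
    }


if __name__ == "__main__":
    for name, gain in deep_think_gains(reported_scores).items():
        print(f"{name}: +{gain} percentage points")
```

The gaps are expressed in percentage points rather than relative percentages, since the two benchmarks sit at very different baselines.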

However, Google said that Gemini 3 Deep Think mode is still undergoing safety evaluations and will be made available to Google AI Ultra subscribers in the coming weeks, after the company gathers input from safety testers. The company has also said it plans to release additional models in the Gemini 3 series soon.



