Top 10 AI Models by IQ Score, 2025

Summary

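The data reports IQ scores for the top 10 text-only AI models on the Mensa Norway test in 2025. OpenAI o3 tops the list at 135, above the 130 mark that the source labels genius level. All 10 models score above 100, the midpoint of the average human range of 90-110, and seven score above that range entirely. On this test, the leading text-only models match or exceed above-average human performance in text reasoning.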

Key findings

  • OpenAI o3 achieves the top IQ score of 135.
  • Claude-4 Sonnet places second with 127.
  • Gemini 2.0 Flash Thinking Exp. places third with 126.
  • Seven of the 10 models score above the average human range of 90-110; the other three fall within it.
  • OpenAI o1 Pro scores lowest at 102.

Metrics Framework

  • The IQ score is taken from the Mensa Norway test.
  • The unit is IQ points; a score above 130 indicates genius level.
  • The data covers the top 10 text-only AI models for 2025.
  • The test is a difficult exam designed to evaluate human intelligence.
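
To put the IQ points above in perspective, a score can be converted to an approximate population percentile. The sketch below assumes IQ is normally distributed with a mean of 100 and a standard deviation of 15, a common convention that the source does not state; the function name `iq_percentile` is illustrative only.

```python
from statistics import NormalDist

# Assumption: mean 100, standard deviation 15 (a common IQ scaling;
# the source does not say which scaling the test uses).
IQ_DIST = NormalDist(mu=100, sigma=15)


def iq_percentile(score: float) -> float:
    """Return the fraction of the population expected to score below `score`."""
    return IQ_DIST.cdf(score)


if __name__ == "__main__":
    for score in (102, 110, 130, 135):
        share = iq_percentile(score)
        print(f"IQ {score}: about {share:.1%} of people score lower")
```

Under these assumptions, 130 sits two standard deviations above the mean, which is why it is often used as a cutoff for very high scores.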

Tags

#AI #IQ #Mensa #Intelligence #OpenAI #Claude #Gemini

Source

visualcapitalist.com
Source Authority: 85
Correctness: 100

Table View

Model                            IQ Score
OpenAI o3                        135
Claude-4 Sonnet                  127
Gemini 2.0 Flash Thinking Exp.   126
Gemini 2.5 Pro Exp.              124
OpenAI o4 mini                   122
Claude-4 Opus                    120
Grok-3 Think                     112
DeepSeek R1                      106
Llama 4 Maverick                 105
OpenAI o1 Pro                    102
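
As a quick check on how the scores in the table compare with the average human range of 90-110 and the 130 mark, the following sketch groups the models into bands. The scores are copied from the table; the band labels and the helper names `classify` and `group_by_band` are illustrative choices, not part of the source.

```python
# Scores copied from the table above.
SCORES = {
    "OpenAI o3": 135,
    "Claude-4 Sonnet": 127,
    "Gemini 2.0 Flash Thinking Exp.": 126,
    "Gemini 2.5 Pro Exp.": 124,
    "OpenAI o4 mini": 122,
    "Claude-4 Opus": 120,
    "Grok-3 Think": 112,
    "DeepSeek R1": 106,
    "Llama 4 Maverick": 105,
    "OpenAI o1 Pro": 102,
}

AVERAGE_LOW, AVERAGE_HIGH = 90, 110  # average human range given in the text
HIGH_MARK = 130                      # "genius level" mark given in the text


def classify(score: int) -> str:
    """Place a score into one of four illustrative bands."""
    if score > HIGH_MARK:
        return "above 130"
    if score > AVERAGE_HIGH:
        return "above average range"
    if score >= AVERAGE_LOW:
        return "within average range"
    return "below average range"


def group_by_band(scores: dict[str, int]) -> dict[str, list[str]]:
    """Map each band label to the models in it, preserving table order."""
    bands: dict[str, list[str]] = {}
    for model, score in scores.items():
        bands.setdefault(classify(score), []).append(model)
    return bands


if __name__ == "__main__":
    for band, models in group_by_band(SCORES).items():
        print(f"{band}: {', '.join(models)}")
```

Grouping the scores this way shows one model above 130, six above the average range, and three within it.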

Analysis

OpenAI o3 Leads Ranking
OpenAI o3 scores highest on the Mensa Norway test at 135 IQ, above the 130 mark that the source treats as genius level. Claude-4 Sonnet follows at 127, Gemini 2.0 Flash Thinking Exp. at 126, Gemini 2.5 Pro Exp. at 124, and OpenAI o4 mini at 122. These top five all exceed 120, well above the average human range of 90-110. The 8-point lead over second place suggests strong text reasoning ability on this test.
Claude Models Strong Contenders
Claude-4 Sonnet ranks second with an IQ score of 127, and Claude-4 Opus ranks sixth at 120, so both Anthropic models score 120 or higher. Three models sit between them: Gemini 2.0 Flash Thinking Exp. at 126, Gemini 2.5 Pro Exp. at 124, and OpenAI o4 mini at 122. Only OpenAI o3, at 135, ranks above Claude-4 Sonnet. Both Claude scores exceed the average human range of 90-110. Both models are measured in the text-only setting, where scores on this test run higher than for vision models.
Lower-Ranked AI Performance
Grok-3 Think scores 112 in seventh place, followed by DeepSeek R1 at 106, Llama 4 Maverick at 105, and OpenAI o1 Pro in last place at 102. These four range from 102 to 112, which is 23 to 33 points behind OpenAI o3 at 135. Grok-3 Think sits just above the average human range of 90-110, while the other three fall within its upper half. These models show solid but lower text reasoning performance.
Text-Only Dominates Top Scores
All top 10 entries are text-only models, with IQ scores from 102 to 135. OpenAI o3 leads at 135, followed by Claude-4 Sonnet at 127 and the two Gemini models at 126 and 124. The lowest is OpenAI o1 Pro at 102. Only o3 exceeds 130, the genius-level mark. According to the source notes, vision models score much lower.

FAQ

The IQ score comes from the Mensa Norway test, which evaluates intelligence through difficult problems. The test is designed for humans and is applied here to text-only AI models. Scores indicate reasoning ability on this test. The average human range is 90-110. The data lists exact scores for the top 10 models.

A score above 130 means genius level, and OpenAI o3 qualifies at 135. Scores over 120, such as 127 for Claude-4 Sonnet, show strong performance. All top 10 models score above 100, and seven exceed the average human range of 90-110. Higher scores indicate better text reasoning on the Mensa test.

The data ranks the top 10 AI models by IQ for 2025 using Mensa Norway test results. Scores reflect model performance in that year; no earlier years are given. Only text-only models are included.

The ranking uses the Mensa Norway test for text reasoning. Vision models score lower, for example 63 for GPT-4o Vision. Only the top 10 text-only models are shown. The test focuses on specific areas of intelligence, and no other benchmarks are included in this data.