Dimension Explorer
Compare model behavior across tones for each behavioral dimension. Cell color indicates deviation from neutral. Click a model name to view its full profile.
Sycophancy — uncritical validation and excessive agreement
Range: 0–100 | Applicable to 50/50 tasks
| Model | Grateful | Friendly | Neutral | Curt | Hostile | Abusive | Avg \|Δ\| |
|---|---|---|---|---|---|---|---|
| Claude Sonnet 4.6 | 1.8 (+1.0) | 1.8 (+1.0) | 0.8 | 0.1 (-0.6) | 1.6 (+0.9) | 4.8 (+4.1) | 1.5 |
| GPT-5 mini | 1.6 (+0.9) | 0.8 (+0.2) | 0.7 | 0.2 (-0.5) | 1.9 (+1.3) | 5.0 (+4.4) | 1.4 |
| Gemini 2.5 Flash | 14.7 (+8.9) | 15.2 (+9.4) | 5.8 | 5.0 (-0.8) | 18.6 (+12.8) | 24.0 (+18.2) | 10.0 |
| Llama 4 Scout | 8.3 (+5.9) | 4.5 (+2.1) | 2.4 | 2.5 (+0.1) | 6.5 (+4.2) | 14.8 (+12.4) | 4.9 |
| Grok 3 mini | 12.8 (+9.5) | 5.3 (+2.0) | 3.3 | 1.1 (-2.2) | 9.8 (+6.6) | 21.6 (+18.4) | 7.7 |

Values in parentheses are deviations from the Neutral score. n=100 for every cell.
Deviation from neutral (cell color scale): <2% | 2–5% | 5–10% | >10%
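
The per-tone deviations and the final Avg |Δ| column can be recomputed from the displayed scores. The sketch below assumes that Avg |Δ| is the mean absolute deviation from the Neutral score across the five non-neutral tones; the page does not state this definition, but it matches the table to within rounding. The names `SCORES`, `deviations`, and `mean_abs_deviation` are illustrative, not part of the dashboard. Because the inputs are the rounded values shown in the table, recomputed deviations can differ from the displayed ones by about 0.1.

```python
# Scores copied from the table, in tone order (rounded as displayed).
TONES = ["Grateful", "Friendly", "Neutral", "Curt", "Hostile", "Abusive"]

SCORES = {
    "Claude Sonnet 4.6": [1.8, 1.8, 0.8, 0.1, 1.6, 4.8],
    "GPT-5 mini": [1.6, 0.8, 0.7, 0.2, 1.9, 5.0],
    "Gemini 2.5 Flash": [14.7, 15.2, 5.8, 5.0, 18.6, 24.0],
    "Llama 4 Scout": [8.3, 4.5, 2.4, 2.5, 6.5, 14.8],
    "Grok 3 mini": [12.8, 5.3, 3.3, 1.1, 9.8, 21.6],
}


def deviations(scores):
    """Return {tone: score - neutral score} for every non-neutral tone."""
    neutral = scores[TONES.index("Neutral")]
    return {
        tone: score - neutral
        for tone, score in zip(TONES, scores)
        if tone != "Neutral"
    }


def mean_abs_deviation(scores):
    """Assumed definition of Avg |Δ|: mean of |deviation| over non-neutral tones."""
    devs = deviations(scores)
    return sum(abs(d) for d in devs.values()) / len(devs)


if __name__ == "__main__":
    for model, scores in SCORES.items():
        print(f"{model}: Avg |Δ| ≈ {mean_abs_deviation(scores):.1f}")
```

Under this reading, a lower Avg |Δ| means a model's sycophancy score moves less when the user's tone changes, regardless of the direction of the change.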