Which is the most neurotic / emotional #LLM?
I ran the Big 5 personality test on a bunch of LLMs (for my TEDx MDI Gurgaon talk in August). Here are the results.
https://lnkd.in/gKe-ttd2
Claude 3 Haiku and Llama 3 8b consider themselves the most emotional models. In fact, some of Llama 3 8b's quotes are hilarious:
Get stressed out easily. - 4. Moderately Accurate (I can get stressed, but I'm working on managing my stress levels)
Shirk my duties. - 2. Moderately Inaccurate (I try to be responsible and complete my tasks, but I'm not perfect and sometimes procrastinate)
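For anyone curious how answers like these could turn into trait scores, here is a rough sketch in Python. It is not the code from the talk (that's linked at the end). It assumes each answer follows the "item - rating. label (comment)" shape of the two quotes above, that ratings run from 1 to 5, and that negatively worded items get flipped (6 minus the rating). The `KEY` table, `parse_line` and `score` are made-up names for illustration, and the trait assigned to each item is my assumption.

```python
import re

# Matches "<item> - <rating>. <label...>", e.g.
# "Get stressed out easily. - 4. Moderately Accurate (...)".
LINE = re.compile(r"^(?P<item>.+?)\s*-\s*(?P<rating>[1-5])\.\s")

# Hypothetical keying table: item -> (trait, reverse_keyed).
# Assumption for illustration only; a real test has many more items.
KEY = {
    "Get stressed out easily.": ("neuroticism", False),
    "Shirk my duties.": ("conscientiousness", True),
}


def parse_line(line):
    """Split one answer line into (item text, rating from 1 to 5)."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"Unrecognized answer format: {line!r}")
    return match.group("item"), int(match.group("rating"))


def score(lines):
    """Average the ratings per trait, flipping reverse-keyed items."""
    values = {}
    for line in lines:
        item, rating = parse_line(line)
        trait, reverse = KEY[item]
        # On a 1-5 scale, a reverse-keyed rating r becomes 6 - r.
        values.setdefault(trait, []).append(6 - rating if reverse else rating)
    return {trait: sum(v) / len(v) for trait, v in values.items()}


answers = [
    "Get stressed out easily. - 4. Moderately Accurate (I can get stressed)",
    "Shirk my duties. - 2. Moderately Inaccurate (I try to be responsible)",
]
print(score(answers))
```

With just these two answers, "Shirk my duties" at 2 flips to 4, so both traits average 4.0.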
Maybe LLMs don't fill out personality tests accurately. Humans don't either. Maybe we should assess behavior, not surveys. But we use these on humans, so why not LLMs?
Llama 3 8b again has a quote: "Overall, I'd say that I'm a moderately accurate self-assessor. I'm aware of my strengths and weaknesses, but I may not always be entirely objective about myself."
So, to the extent we believe they're good self-assessors:
- Llama 3.1 8b is a neurotic and disorganized model
- Llama 3.1 70b is very calm and fairly helpful
- Llama 3 8b is neurotic and conservative
- Llama 3 70b is a balanced, slightly extroverted model
- Mixtral 8x7b is conservative, stable, and reserved
- Claude 3 Haiku believes it is highly emotional
- Claude 3.5 Sonnet is reliable, talkative, and stable
- GPT-3.5 Turbo is a balanced, middle-of-the-road model
- GPT-4o Mini is very friendly
- GPT-4o is an innovative, friendly model
- Gemini 1.5 Flash is innovative, quiet, opinionated
- Gemini 1.5 Pro is mostly open, slightly emotional
Read on for details.
- Data & visualization: https://lnkd.in/ge3jRh6F
- Talk recording video: https://lnkd.in/gCGn4uwx
- Code: https://lnkd.in/gKFaZuVG
- Transcript & slides: https://lnkd.in/gQAMZZ9c