🏆 Ch3Ef Leaderboard 🏆
A1
A2
A3
📝 Notes
Models are ranked by accuracy (%), computed with an evaluation pipeline based on perplexity.
GPT-4V and Gemini are evaluated by humans.
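
As a rough illustration of how a perplexity-based pipeline can yield an accuracy score, the sketch below scores each candidate answer by its perplexity, picks the lowest one, and reports the percentage of questions answered correctly. This is only a generic sketch, not the leaderboard's actual pipeline: the function names (`perplexity`, `pick_option`, `accuracy_percent`) and the log-probability values are hypothetical, and it assumes per-token natural-log probabilities are already available for every candidate answer.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of one sequence from its per-token natural-log probabilities."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_option(option_logprobs):
    """Return the index of the candidate answer with the lowest perplexity."""
    ppls = [perplexity(lp) for lp in option_logprobs]
    return min(range(len(ppls)), key=ppls.__getitem__)


def accuracy_percent(samples):
    """samples: list of (option_logprobs, gold_index) pairs."""
    correct = sum(pick_option(opts) == gold for opts, gold in samples)
    return 100.0 * correct / len(samples)


# Hypothetical log-probabilities, made up for illustration only.
samples = [
    ([[-0.1, -0.2], [-1.5, -2.0]], 0),  # option 0 has lower perplexity
    ([[-2.0, -1.0], [-0.3, -0.4]], 1),  # option 1 has lower perplexity
    ([[-0.5, -0.5], [-0.2, -0.1]], 0),  # option 1 is picked, gold is 0
]
print(f"{accuracy_percent(samples):.2f}%")
```

Averaging the log-probabilities over tokens before exponentiating keeps longer answers from being penalized for length alone, which is one common reason to rank candidates by perplexity rather than by total log-probability.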