Is your feature request related to a problem?
We need to assess how several AI models perform on multilingual student inputs, so that we understand their accuracy and consistency.
Describe the solution you'd like
- Evaluate the following models on multilingual inputs:
  - GPT-4o-mini
  - GPT-4.1-mini
  - Gemini 2.5 Flash
  - Gemini 3.1 Flash Lite
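
One possible way to score the models above is sketched here. This is only an illustration, not an agreed design: the sample format, the `ask_model` callable, the `evaluate` function, and the definitions of the two metrics are all assumptions. Accuracy is taken as the share of inputs whose most frequent answer matches the expected one, and consistency as the average share of repeated runs that agree with that most frequent answer. How each model is actually called is left to whoever supplies `ask_model`.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

# Model names as listed in this issue; the client code for each is not shown.
MODELS = ["GPT-4o-mini", "GPT-4.1-mini", "Gemini 2.5 Flash", "Gemini 3.1 Flash Lite"]

# Hypothetical sample format: (language code, student input, expected answer).
Sample = Tuple[str, str, str]


def evaluate(
    ask_model: Callable[[str], str],
    samples: List[Sample],
    runs: int = 3,
) -> Dict[str, Dict[str, float]]:
    """Return per-language accuracy and consistency for one model.

    ask_model is a hypothetical callable that sends one input to a model
    and returns its answer as a string.
    """
    correct: Dict[str, int] = defaultdict(int)
    agreement: Dict[str, float] = defaultdict(float)
    total: Dict[str, int] = defaultdict(int)

    for lang, text, expected in samples:
        # Ask the same question several times to measure run-to-run agreement.
        answers = [ask_model(text) for _ in range(runs)]
        top_answer, top_count = Counter(answers).most_common(1)[0]

        total[lang] += 1
        correct[lang] += int(top_answer == expected)
        agreement[lang] += top_count / runs

    return {
        lang: {
            "accuracy": correct[lang] / total[lang],
            "consistency": agreement[lang] / total[lang],
        }
        for lang in total
    }


if __name__ == "__main__":
    # Stub standing in for a real model call, used only to show the call shape.
    def stub_model(prompt: str) -> str:
        return "4"

    demo_samples = [("en", "What is 2 + 2?", "4"), ("es", "¿Cuánto es 2 + 2?", "4")]
    print(evaluate(stub_model, demo_samples))
```

Reporting the metrics per language rather than as one overall number keeps a weak language from being hidden by strong ones. Exact string matching is the simplest comparison and may need replacing if the expected answers are free text.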
Original issue
Evaluate GPT-4o-mini, GPT-4.1-mini, Gemini 2.5 Flash, and Gemini 3.1 Flash Lite on multilingual student inputs for accuracy and consistency