**Is your feature request related to a problem? Please describe.**
The current `llm/call` endpoint only supports text and audio input types. However, the AI Assessment module from InquiLab requires support for images and PDFs (including multiple files) as well as mixed-content understanding (text + image/PDF). This limitation prevents the existing LLM integration from handling the required multimodal assessment workflows.
**Describe the solution you'd like**
- Extend the existing providers `openai` and `google` to support both images and PDFs.
- The `input` field of `QueryParams` should accept either a single dict or a list of dicts, allowing mixed content in a single request:
  - `input: dict | list[dict]`
- Single image input:

  ```json
  {
    "input": {
      "type": "image",
      "content": {
        "format": "url",
        "value": "public_url"
      }
    }
  }
  ```
- Mixed content (image and text):

  ```json
  {
    "input": [
      {
        "type": "image",
        "content": {
          "format": "url",
          "value": "public_url"
        }
      },
      {
        "type": "text",
        "content": {
          "format": "text",
          "value": "What is in the image?"
        }
      }
    ]
  }
  ```
**Additional context**
Image and PDF content should support both base64-encoded data and public URLs.
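For the `openai` provider, both formats could map onto a single Chat Completions image content part, since that API accepts base64 payloads wrapped in a data URL. A sketch — the helper name and the default MIME type are assumptions, not existing code:

```python
def to_openai_image_part(content: dict, mime: str = "image/png") -> dict:
    """Map the proposed {"format": ..., "value": ...} content dict to an
    OpenAI chat image content part. Base64 payloads are wrapped in a
    data URL; public URLs pass through unchanged."""
    if content["format"] == "base64":
        url = f"data:{mime};base64,{content['value']}"
    else:  # "url"
        url = content["value"]
    return {"type": "image_url", "image_url": {"url": url}}


# Public URL passes through as-is:
to_openai_image_part({"format": "url", "value": "https://example.com/a.png"})
# Base64 is wrapped in a data URL:
to_openai_image_part({"format": "base64", "value": "iVBORw0KG..."})
```

A similar mapping would be needed per provider, since Google's API takes inline base64 data and file URIs through different fields.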