Feature/add file support for telegram#184
Conversation
09b0b41 to
54def41
Compare
There was a problem hiding this comment.
Pull request overview
This PR adds multimodal support (images and files) to the Telegram integration, enabling users to send photos and documents alongside text messages for AI analysis. The changes also include session memory improvements to preserve conversation context with multimodal inputs.
- Added file and image download functionality with base64 encoding for Telegram messages
- Implemented session memory persistence for multimodal conversations in OpenAI framework
- Added new Google ADK example (
server_adk.py) demonstrating alternative agent framework - Updated documentation with comprehensive multimodal features guide
Reviewed changes
Copilot reviewed 7 out of 23 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
ak-py/src/agentkernel/integration/telegram/telegram_chat.py |
Core implementation adding file/image download, processing, and multi-request handling |
ak-py/src/agentkernel/framework/openai/openai.py |
Session memory fix to manually save multimodal conversations for future reference |
examples/api/telegram/server_adk.py |
New example showing Google ADK framework integration with Telegram |
examples/api/telegram/build.sh |
Added adk to package extras for Google ADK support |
examples/api/telegram/README.md |
Comprehensive documentation update with multimodal usage examples |
docs/docs/integrations/telegram.md |
Integration guide updated with multimodal features, supported formats, and limitations |
ak-py/src/agentkernel/integration/telegram/README.md |
Technical documentation for multimodal support and file handling |
Multiple uv.lock files |
Dependency updates including openai-agents downgrade from 0.6.4 to 0.6.3 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
7b6ec6a to
a906b91
Compare
…upport_for_telegram
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 29 changed files in this pull request and generated 17 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 29 out of 30 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 31 out of 32 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 31 out of 32 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
ak-py/src/agentkernel/framework/langgraph/langgraph.py:426
StructuredTool.from_functionis called withfunc=Nonefor coroutine tools. In LangChain,from_functiontypically expects a real callable forfunc(even whencoroutineis provided); passingNonerisks a runtime error when binding async tools. Consider passingfunc=funcas well, or using the dedicated async tool constructor/pattern supported by your pinnedlangchain_coreversion.
if asyncio.iscoroutinefunction(func):
tools.append(
StructuredTool.from_function(
func=None,
coroutine=func,
name=func.__name__,
description=func.__doc__ or func.__name__,
)
)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| AttachmentStorageDriver, | ||
| ) | ||
|
|
||
| _log = logging.getLogger("ak.core.multimodal.storage.in_memory") |
| AttachmentStorageDriver, | ||
| ) | ||
|
|
||
| _log = logging.getLogger("ak.core.multimodal.storage.dynamodb") |
| _log = logging.getLogger("ak.core.multimodal.storage.redis") | ||
|
|
||
|
|
||
| class RedisStorageDriver(AttachmentStorageDriver): |
There was a problem hiding this comment.
All ak related redis keys should have a ak prefix (Refer to config.session.redis for an example)
| if current and current.id == session_id: | ||
| session = current | ||
| else: | ||
| from ...runtime import Runtime |
There was a problem hiding this comment.
Add lazy loading only when required
| """ | ||
| High-level API for attachment storage. | ||
|
|
||
| Resolves the storage driver from ``AKConfig.multimodal.storage_type`` |
There was a problem hiding this comment.
Remove AI generated unrelated comments
| """ | ||
| Attachment storage manager for multimodal memory. | ||
|
|
||
| This module provides the ``AttachmentStorageManager`` class — a high-level API |
| if asyncio.iscoroutinefunction(func): | ||
| tools.append( | ||
| StructuredTool.from_function( | ||
| func=None, |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 31 out of 32 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ```bash | ||
| export AK_MULTIMODAL__ENABLED=true # Enable multimodal support (default: true) | ||
| export AK_MULTIMODAL__MAX_ATTACHMENTS=5 # Keep last N files in session (default: 5) | ||
| export AK_MULTIMODAL__ATTACHMENT_TTL=604800 # File lifetime in seconds (default: 604800 = 1 week) | ||
| ``` |
There was a problem hiding this comment.
This README documents AK_MULTIMODAL__ENABLED as defaulting to true and introduces AK_MULTIMODAL__ATTACHMENT_TTL, but the config defaults to enabled=false and there is no attachment_ttl field/env var (TTL is storage-backend specific, e.g. AK_MULTIMODAL__REDIS__TTL). Please update the documented env vars and defaults to match AKConfig.multimodal.
| ``{prefix}{session_id}:{attachment_id}``. An additional sorted-set | ||
| ``{prefix}{session_id}:_index`` tracks attachment order for pruning. |
There was a problem hiding this comment.
Docstring says the Redis index is a “sorted-set”, but the implementation uses a Redis list (RPUSH/LPOP/LLEN). Update the docstring to match the actual data structure (or switch implementation to a sorted set) to avoid operational confusion when inspecting Redis keys.
| ``{prefix}{session_id}:{attachment_id}``. An additional sorted-set | |
| ``{prefix}{session_id}:_index`` tracks attachment order for pruning. | |
| ``{prefix}{session_id}:{attachment_id}``. An additional Redis list | |
| ``{prefix}{session_id}:_index`` tracks attachment order (insertion order) | |
| for pruning. |
| def analyze_attachments(attachment_ids: list[str], prompt: str) -> str: | ||
| """ | ||
| Analyze attachments (images/files) using LLM and return ONLY the analysis response. | ||
|
|
||
| :param attachment_ids: List of attachment IDs to analyze | ||
| :param prompt: The question/prompt for analyzing the attachments | ||
| :return: Only the LLM analysis response text | ||
| """ | ||
| if not attachment_ids: | ||
| return "No attachments provided" | ||
|
|
||
| try: | ||
| from ..tool import ToolContext | ||
|
|
||
| ctx = ToolContext.get() | ||
| session = ctx.session | ||
|
|
||
| attachments = AttachmentStorageManager(session_id=session.id).get_attachment_data(attachment_ids=attachment_ids) | ||
|
|
||
| if not attachments: | ||
| return "No attachments found for the given IDs in this session" | ||
|
|
There was a problem hiding this comment.
This new multimodal tool/path (attachment storage + analyze_attachments + prehook injection) isn’t covered by unit tests in this PR. Add tests that (1) store attachments in a session via the prehook, (2) ensure raw attachments are removed/injected as expected, and (3) verify analyze_attachments can retrieve session-scoped attachments (for at least in-memory + one persistent driver stub).
| @router.post("/telegram/webhook") | ||
| async def handle_webhook(request: Request): | ||
| async def handle_webhook(request: Request, background_tasks: BackgroundTasks): | ||
| """ | ||
| Handle incoming Telegram webhook updates. | ||
| """ | ||
| return await self._handle_webhook(request) | ||
| # Read body first to avoid stream consumption issues in background | ||
| body = await request.json() | ||
| background_tasks.add_task(self._process_webhook_body, body, request.headers) | ||
| return {"ok": True} |
There was a problem hiding this comment.
Webhook secret validation is now deferred to the background task, but the HTTP handler always returns 200/{"ok": true} even when the X-Telegram-Bot-Api-Secret-Token is invalid. This weakens auth and makes it hard to detect misconfiguration/attacks (Telegram will think delivery succeeded). Validate the secret token in handle_webhook before scheduling the background task and return a 403 on mismatch; only enqueue _process_webhook_body when authorized.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 68 out of 81 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -233,6 +237,31 @@ def get_a2a_card(self): | |||
| # TODO extract description from graph | |||
| return A2ACardBuilder.build(name=self.name, description="", skills=skills) | |||
|
|
|||
| def attach_tool(self, tool: Any) -> None: | |||
| """ | |||
| Accepts a raw Callable and wraps it with LangGraphToolBuilder before storing. | |||
| Follows the same pattern as ADK, OpenAI, and CrewAI. | |||
| Note: LangGraph tools must be passed to the graph BEFORE compile(); these wrapped | |||
| tools are stored on the agent wrapper for inspection and future use. | |||
| :param tool: Raw Python callable or already-wrapped LangChain StructuredTool. | |||
| """ | |||
| # Delegate to the tool builder to handle binding | |||
| wrapped = LangGraphToolBuilder.bind([tool]) | |||
| for w in wrapped: | |||
| if w not in self._tools: | |||
| self._tools.append(w) | |||
|
|
|||
| def override_system_prompt(self, prompt: str) -> None: | |||
| """ | |||
| Stores the system prompt suffix on the agent wrapper. | |||
| Follows the same pattern as ADK, OpenAI, and CrewAI. | |||
| Note: LangGraph compiled graphs do not expose a mutable system prompt field; | |||
| the value is stored here for inspection. Set the prompt in your graph nodes | |||
| before compile() for it to take effect at inference time. | |||
| """ | |||
| if prompt not in self._system_prompt: | |||
| self._system_prompt += ("\n" if self._system_prompt else "") + prompt | |||
|
|
|||
There was a problem hiding this comment.
LangGraphAgent.attach_tool() and override_system_prompt() only store values on the wrapper (self._tools, self._system_prompt), but LangGraphRunner.run() never uses them when invoking the compiled graph. This means system tools like analyze_attachments are effectively not attached for LangGraph (despite being auto-attached by the base Agent), and the injected system prompt suffix is never applied. Either wire these into the graph before compile / invocation (e.g., ensure create_react_agent(..., tools=[...]) includes the system tools) or explicitly disable auto-attachment for LangGraph and document that users must add the tool/prompt manually.
| #!/bin/sh | ||
| set -e | ||
| # Docker Engine for Linux installation script. | ||
| # | ||
| # This script is intended as a convenient way to configure docker's package | ||
| # repositories and to install Docker Engine, This script is not recommended | ||
| # for production environments. Before running this script, make yourself familiar | ||
| # with potential risks and limitations, and refer to the installation manual | ||
| # at https://docs.docker.com/engine/install/ for alternative installation methods. | ||
| # |
There was a problem hiding this comment.
This repo now vendors the full upstream get.docker.com installer script (~760 lines). That makes it hard to keep in sync with upstream security/compatibility updates and encourages running a privileged installer from the repo. Consider removing this file and linking to the official Docker installation docs (or documenting a minimal docker run redis:alpine ... alternative) instead of vendoring the whole installer.
| multimodal: | ||
| enabled: true | ||
| storage_type: inmemory |
There was a problem hiding this comment.
storage_type: inmemory doesn’t match the allowed values in the config schema (in_memory, redis, dynamodb, session_cache). With the current schema, this YAML will fail validation / won’t load as intended. Use storage_type: in_memory (or omit it, since in-memory is the default).
| multimodal: | ||
| enabled: true | ||
| storage_type: inmemory |
There was a problem hiding this comment.
storage_type: inmemory doesn’t match the allowed values in the config schema (in_memory, redis, dynamodb, session_cache). With the current schema, this YAML will fail validation / won’t load as intended. Use storage_type: in_memory (or omit it, since in-memory is the default).
|
|
||
| import httpx | ||
| from fastapi import APIRouter, HTTPException, Request | ||
| from fastapi import APIRouter, BackgroundTasks, HTTPException, Request |
There was a problem hiding this comment.
HTTPException is imported but no longer used (webhook processing now returns early instead of raising). Consider removing the unused import to avoid lint warnings.
| from fastapi import APIRouter, BackgroundTasks, HTTPException, Request | |
| from fastapi import APIRouter, BackgroundTasks, Request |
Description
This change will add file & image support for telegram integrations
Type of Change
Testing
Checklist
Screenshots (if applicable)
Additional Notes