[Python] Add agent-framework-azure-ai-contentunderstanding package by yungshinlintw · Pull Request #4829 · microsoft/agent-framework

yungshinlintw · 2026-03-22T04:56:02Z

Reviewer's Guide

This PR addresses #4942.

This package adds a BaseContextProvider implementation that bridges Azure Content Understanding (CU) with the Agent Framework. When a user sends file attachments (PDF, images, audio, video), the provider intercepts them in before_run(), sends them to CU for analysis, and injects the structured results (markdown + extracted fields) back into the LLM context — so the agent can answer questions about the files without the developer writing any extraction code.

Quick usage:

cu = ContentUnderstandingContextProvider(
    endpoint="https://my-resource.services.ai.azure.com/",
    credential=AzureCliCredential(),
)
agent = Agent(
    client=client,
    name="DocQA",
    instructions="You are a document analyst.",
    context_providers=[cu],
)
# Files in Message.contents are auto-analyzed; results injected into LLM context
response = await agent.run(
    Message(role="user", contents=[
        Content.from_text("What's on this invoice?"),
        Content.from_uri("https://example.com/invoice.pdf", media_type="application/pdf",
                         additional_properties={"filename": "invoice.pdf"}),
    ]),
    session=session,
)

Suggested review order

1. Start with samples — they show the feature set and usage patterns end-to-end:

| Sample | What it demonstrates |
| --- | --- |
| `01_document_qa.py` | Simplest flow: upload a PDF via URL, ask a question about it. Shows `Content.from_uri()`, `context_providers=[cu]`, and how CU results appear in the agent's response. |
| `02_multi_turn_session.py` | `AgentSession` persistence: upload a file on turn 1, ask follow-up questions on turns 2–3 without re-uploading. Shows how `state["documents"]` carries across turns. |
| `03_multimodal_chat.py` | PDF + audio + video in a single session (5 turns). Shows auto-detection of media types, parallel analysis, and multi-segment video output with per-segment fields. |
| `04_invoice_processing.py` | Per-file analyzer override: uses `additional_properties={"analyzer_id": "prebuilt-invoice"}` to extract structured invoice fields (vendor, total, line items) instead of generic markdown. |
| `05_background_analysis.py` | Non-blocking analysis with `max_wait=0.5`: the file starts analyzing in the background while the agent responds immediately. The next turn resolves the pending result. Shows the `analyzing` → `ready` status flow. |
| `06_large_doc_file_search.py` | CU extraction + OpenAI vector store for RAG: large documents are analyzed by CU, uploaded to a vector store, and retrieved via the `file_search` tool instead of injecting full content into context. |

2. Then review the core implementation:

| Priority | File | Why |
| --- | --- | --- |
| 🔴 High | `_context_provider.py` (1087 lines) | Core logic: `before_run()` hook, file detection/stripping, CU analysis with timeout + background fallback, output formatting, tool registration. Most important file to review. |
| 🔴 High | `_models.py` | Public API surface: `DocumentEntry`, `DocumentStatus`, `AnalysisSection`, `FileSearchConfig` TypedDicts and enums exposed to users |
| 🟡 Medium | `_file_search.py` | `FileSearchBackend` protocol + OpenAI/Foundry factory methods for vector store integration |
| 🟡 Medium | `__init__.py` | Public exports: verify the right symbols are exposed |
| 🟡 Medium | `pyproject.toml` | Package metadata, dependencies, version constraints |
| 🟢 Low | `tests/` | 78 unit tests + 5 live integration tests |

MAF API usage (needs team alignment)

This package uses the following internal/private MAF APIs — if any of these are changing or not intended for external use, this package may need updates:

- `BaseContextProvider` and its `before_run()` hook
- `SessionContext.extend_instructions()`, `extend_messages()`, `extend_tools()`
- `Content.from_data()`, `Content.from_uri()`, `Content.type`, `Content.media_type`, `Content.additional_properties`
- `FunctionTool` for registering `list_documents()`
- `agent_framework._sessions.AgentSession`
- `agent_framework._settings.load_settings()`

This PR adds agent-framework-azure-ai-contentunderstanding, an optional connector package that integrates Azure Content Understanding (CU) into the Agent Framework as a context provider.

What's Included

Core (_context_provider.py, _models.py, _file_search.py)

- ContentUnderstandingContextProvider -- auto-analyzes file attachments (PDF, images, audio, video) via Azure CU and injects structured results (markdown, fields) into LLM context
- Auto-detects media type and selects the right CU analyzer (prebuilt-documentSearch, prebuilt-audioSearch, prebuilt-videoSearch)
- Multi-document session state with status tracking (analyzing/uploading/ready/failed)
- Configurable timeout (max_wait) with async background fallback
- Output filtering (>90% token reduction) via AnalysisSection enum
- Auto-registered list_documents() tool for status queries
- Document content injected into conversation history for follow-up turns
- Multi-segment video/audio: per-segment fields with time ranges
- MIME sniffing for misidentified files (application/octet-stream)
- Per-file analyzer ID override via Content.additional_properties["analyzer_id"] -- mix different analyzers in the same turn (e.g., prebuilt-invoice for invoices alongside prebuilt-documentSearch for general docs)
- Duplicate filename rejection (filenames must be unique within a session)
- Optional FileSearchConfig for vector store integration (OpenAI/Foundry backends)
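
The analyzer selection rules listed above (media-type auto-detection plus a per-file override) amount to a small decision function. The following is a minimal sketch of that logic only; `pick_analyzer` is a hypothetical helper name, not a function from this package, and the real provider may apply further checks.

```python
from __future__ import annotations


def pick_analyzer(media_type: str, override: str | None = None) -> str:
    """Choose a CU analyzer ID from a MIME type, honoring a per-file override."""
    if override:
        # e.g. additional_properties={"analyzer_id": "prebuilt-invoice"}
        return override
    if media_type.startswith("audio/"):
        return "prebuilt-audioSearch"
    if media_type.startswith("video/"):
        return "prebuilt-videoSearch"
    # Documents and images fall through to the document analyzer.
    return "prebuilt-documentSearch"


print(pick_analyzer("video/mp4"))
print(pick_analyzer("application/pdf", override="prebuilt-invoice"))
```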

Samples (6 scripts)

- 01_document_qa.py -- Single PDF upload + Q&A
- 02_multi_turn_session.py -- AgentSession persistence across turns
- 03_multimodal_chat.py -- PDF + audio + video parallel analysis (5 turns)
- 04_invoice_processing.py -- Structured field extraction with prebuilt-invoice
- 05_background_analysis.py -- Non-blocking analysis with max_wait + status tracking
- 06_large_doc_file_search.py -- CU extraction + vector store RAG

Tests

- 66 unit tests covering all major flows
- 5 live integration tests (CU endpoint required)
- Test fixtures for PDF, audio, video, image, invoice modalities

Add Azure Content Understanding integration as a context provider for the Agent Framework. The package automatically analyzes file attachments (documents, images, audio, video) using Azure CU and injects structured results (markdown, fields) into the LLM context.

Key features:
- Multi-document session state with status tracking (pending/ready/failed)
- Configurable timeout with async background fallback for large files
- Output filtering via AnalysisSection enum
- Auto-registered list_documents() and get_analyzed_document() tools
- Supports all CU modalities: documents, images, audio, video
- Content limits enforcement (pages, file size, duration)
- Binary stripping of supported files from input messages

Public API:
- ContentUnderstandingContextProvider (main class)
- AnalysisSection (output section selector enum)
- ContentLimits (configurable limits dataclass)

Tests: 46 unit tests, 91% coverage, all linting and type checks pass.

- Replace synthetic fixtures with real CU API responses (sanitized)
- Update test assertions to match real data (Contoso vs CONTOSO, TotalAmount vs InvoiceTotal, field values from real analysis)
- Add --pre install note in README (preview package)
- Document unenforced ContentLimits fields (max_pages, duration)

Align naming with Azure SDK convention and AF pattern:
- Directory: azure-contentunderstanding -> azure-ai-contentunderstanding
- PyPI: agent-framework-azure-contentunderstanding -> agent-framework-azure-ai-contentunderstanding
- Module: agent_framework_azure_contentunderstanding -> agent_framework_azure_ai_contentunderstanding

CI fixes:
- Inline conftest helpers to avoid cross-package import collision in xdist
- Remove PyPI badge and dead API reference link from README (package not published yet)

markwallace-microsoft · 2026-03-23T20:46:02Z

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | --- | --- | --- | --- |
| packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding | | | | |
| `_context_provider.py` | 449 | 54 | 87% | 334–335, 337, 341–342, 345, 349, 405, 407, 414, 527, 533, 538–543, 604–606, 676, 680, 723, 744, 771, 776, 846, 956, 988, 1009–1013, 1024, 1112, 1116, 1122, 1142–1147, 1149–1154, 1162, 1171–1172 |
| `_file_search.py` | 23 | 4 | 82% | 58, 65, 69, 72 |
| `_models.py` | 48 | 1 | 97% | 125 |
| TOTAL | 28532 | 3471 | 87% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| --- | --- | --- | --- | --- |
| 5534 | 20 💤 | 0 ❌ | 0 🔥 | 1m 26s ⏱️ |

- document_qa.py: Single PDF upload, CU context provider, follow-up Q&A
- invoice_processing.py: Structured field extraction with prebuilt-invoice
- multimodal_chat.py: Multi-file session with status tracking
- Add ruff per-file-ignores for samples/ directory
- Update README with samples section, env vars, and run instructions

…earch)
- S3: devui_multimodal_agent/: DevUI web UI with CU-powered file analysis
- S4: large_doc_file_search.py: CU extraction + OpenAI vector store RAG
- Update README and samples/README.md with all 5 samples

Add FileSearchConfig. When provided, CU-extracted markdown is automatically uploaded to an OpenAI vector store and a file_search tool is registered on the context. This enables token-efficient RAG retrieval for large documents without users needing to manage vector stores manually.
- FileSearchConfig dataclass (openai_client, vector_store_name)
- Auto-create vector store, upload markdown, register file_search tool
- Auto-cleanup on close()
- When file_search is enabled, skip full content injection (use RAG instead)
- Update large_doc_file_search sample to use the integration
- 4 new tests (50 total, 90% coverage)

Follow established AF pattern: check for API key env var first, fall back to AzureCliCredential. Supports AZURE_OPENAI_API_KEY and AZURE_CONTENTUNDERSTANDING_API_KEY environment variables.

…zy init

_context_provider.py:
- Make analyzer_id optional (default None) with auto-detection by media type prefix: audio->audioSearch, video->videoSearch, else documentSearch
- Add _ensure_initialized() for lazy client creation in before_run()
- Add FileSearchConfig-based vector store upload
- Fix: background-completed docs in file_search mode now upload to vector store instead of injecting full markdown into context messages
- Add _pending_uploads queue for deferred vector store uploads

devui_file_search_agent/ (new sample):
- DevUI agent combining CU extraction + OpenAI file_search RAG

azure_responses_agent (existing sample fix):
- Add AzureCliCredential support and AZURE_AI_PROJECT_ENDPOINT fallback

Tests (19 new), Docs updated (AGENTS.md, README.md)

…tor store expiration
- Add three-layer MIME detection (fast path → filetype binary sniff → filename fallback) to handle unreliable upstream MIME types (e.g. mp4 sent as application/octet-stream). Adds filetype>=1.2,<2 dependency.
- Media-aware output formatting: video shows duration/resolution + all fields as JSON; audio promotes Summary as prose; document unchanged.
- Unified timeout for all media types (removed file_search special-case that waited indefinitely for video/audio). All files use max_wait with background polling fallback.
- Vector store created with expires_after=1 day as crash safety net.
- Add 8 MIME sniffing tests (TestMimeSniffing class).
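
The three-layer MIME detection named in this commit can be sketched with the standard library alone. This is an illustration, not the package's implementation: the commit uses the `filetype` library for the binary sniff, whereas `sniff_magic` below is a tiny hand-rolled stand-in that knows three signatures, and the reading of "fast path" as "trust a specific declared type" is an assumption. `detect_mime` and `GENERIC_TYPES` are hypothetical names.

```python
from __future__ import annotations

import mimetypes

GENERIC_TYPES = {"", "application/octet-stream"}


def sniff_magic(data: bytes) -> str | None:
    """Stand-in for a binary sniffer; recognizes only a few file signatures."""
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # ISO base media files carry "ftyp" at offset 4; treating all of them
    # as MP4 is a simplification made for this sketch.
    if len(data) >= 12 and data[4:8] == b"ftyp":
        return "video/mp4"
    return None


def detect_mime(declared: str | None, data: bytes, filename: str | None) -> str:
    # Layer 1 (fast path): trust a declared type unless it is generic.
    if declared and declared not in GENERIC_TYPES:
        return declared
    # Layer 2: inspect the leading bytes of the payload.
    sniffed = sniff_magic(data[:262])
    if sniffed:
        return sniffed
    # Layer 3: fall back to the filename extension.
    if filename:
        guessed, _ = mimetypes.guess_type(filename)
        if guessed:
            return guessed
    return "application/octet-stream"


mp4_header = b"\x00\x00\x00\x18ftypmp42" + b"\x00" * 8
print(detect_mime("application/octet-stream", mp4_header, None))
```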

CU's prebuilt-videoSearch and prebuilt-audioSearch analyzers split long media files into multiple `contents[]` segments. Previously, `_extract_sections()` only read `contents[0]`, causing truncated duration, missing transcript, and incomplete fields for any video/audio longer than a single scene.

Now iterates all segments and merges:
- duration: global min(startTimeMs) → max(endTimeMs)
- markdown: concatenated with `---` separators
- fields: same-named fields collected into per-segment list
- metadata (kind, resolution): taken from first segment

Single-segment results (documents, short audio) are unaffected. Update test fixture to realistic 3-segment video structure and expand assertions to verify multi-segment merging. Add documentation for multi-segment processing and speaker diarization limitation.
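
The merge rules in this commit message can be expressed as a short function over a list of segment dicts. The sketch below is hedged: `merge_segments` is a hypothetical name, the output keys (`start_ms`, `end_ms`) and the simplified field values are assumptions, and the sample segments are invented. Only the merge rules themselves come from the commit message.

```python
from __future__ import annotations


def merge_segments(contents: list[dict]) -> dict:
    """Merge `contents[]` segments following the rules in the commit message."""
    if not contents:
        return {}
    first = contents[0]
    merged: dict = {
        # Metadata such as kind is taken from the first segment.
        "kind": first.get("kind"),
        # Markdown from every segment, joined with `---` separators.
        "markdown": "\n\n---\n\n".join(c["markdown"] for c in contents if c.get("markdown")),
    }
    starts = [c["startTimeMs"] for c in contents if "startTimeMs" in c]
    ends = [c["endTimeMs"] for c in contents if "endTimeMs" in c]
    if starts and ends:
        # Global span: earliest start to latest end.
        merged["start_ms"], merged["end_ms"] = min(starts), max(ends)
    if len(contents) == 1:
        # Single-segment results keep their fields unchanged.
        merged["fields"] = dict(first.get("fields", {}))
        return merged
    fields: dict[str, list] = {}
    for index, segment in enumerate(contents):
        for name, value in segment.get("fields", {}).items():
            # Same-named fields are collected into a per-segment list.
            fields.setdefault(name, []).append({"segment": index, "value": value})
    merged["fields"] = fields
    return merged


video = merge_segments([
    {"kind": "audioVisual", "startTimeMs": 0, "endTimeMs": 4000,
     "markdown": "Intro", "fields": {"Summary": "opening"}},
    {"kind": "audioVisual", "startTimeMs": 4000, "endTimeMs": 9500,
     "markdown": "Demo", "fields": {"Summary": "walkthrough"}},
])
print(video["start_ms"], video["end_ms"], video["fields"]["Summary"])
```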

- Improve class docstring: clarify endpoint (Azure AI Foundry URL with example), credential (AzureKeyCredential vs Entra ID), and analyzer_id (prebuilt/custom with auto-selection behavior and reference links)
- Add SUPPORTED_MEDIA_TYPES comments explaining MIME-based matching behavior and add missing file types per CU service docs
- Use namespaced logger to align with other packages
- Remove ContentLimits and related code/tests
- Rename DEFAULT_MAX_WAIT to DEFAULT_MAX_WAIT_SECONDS for clarity

- Add vector_store_id field to FileSearchConfig (None = auto-create)
- Track _owns_vector_store to only delete auto-created stores on close()
- Remove vector_store_name; use internal _DEFAULT_VECTOR_STORE_NAME
- Add inline comments for private state fields
- Document output_sections default in docstring
- Update AGENTS.md, samples, and tests
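
The ownership rule in this commit (delete only the stores the provider created itself) is a common resource-handling pattern. A minimal sketch follows, assuming a synchronous in-memory backend; `InMemoryBackend` and `VectorStoreHandle` are hypothetical classes written for this example and do not mirror the package's `FileSearchConfig` or backend API.

```python
from __future__ import annotations


class InMemoryBackend:
    """Hypothetical backend that only records which stores exist."""

    def __init__(self) -> None:
        self.stores: set[str] = set()
        self._counter = 0

    def create_store(self, name: str) -> str:
        self._counter += 1
        store_id = f"vs_{self._counter}"
        self.stores.add(store_id)
        return store_id

    def delete_store(self, store_id: str) -> None:
        self.stores.discard(store_id)


class VectorStoreHandle:
    def __init__(self, backend: InMemoryBackend, vector_store_id: str | None = None) -> None:
        self._backend = backend
        # None means "auto-create"; only auto-created stores are ours to delete.
        self._owns_vector_store = vector_store_id is None
        self.vector_store_id = vector_store_id or backend.create_store("cu-documents")

    def close(self) -> None:
        if self._owns_vector_store:
            self._backend.delete_store(self.vector_store_id)


backend = InMemoryBackend()
caller_store = backend.create_store("caller-managed")
owned = VectorStoreHandle(backend)  # auto-creates a second store
borrowed = VectorStoreHandle(backend, vector_store_id=caller_store)
owned.close()
borrowed.close()
remaining = set(backend.stores)
print(remaining)
```

After both handles are closed, only the caller-supplied store is left, which is the point of tracking ownership separately from the store ID.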

Resolve conflict in azure_responses_agent/agent.py by taking upstream (AzureOpenAIResponsesClient -> FoundryChatClient rename)

Follow Azure AI Search provider pattern: create the client eagerly in __init__, make __aenter__ a no-op. This ensures __aexit__/close() is always safe to call and eliminates the _ensure_initialized() workaround.

Replace direct OpenAI client usage with FileSearchBackend ABC:
- OpenAIFileSearchBackend: for OpenAIChatClient (Responses API)
- FoundryFileSearchBackend: for FoundryChatClient (Azure Foundry)
- Shared base _OpenAICompatBackend for common vector store CRUD

FileSearchConfig now takes a backend instead of openai_client. Factory methods from_openai() and from_foundry() for convenience.

BREAKING: FileSearchConfig(openai_client=...) -> FileSearchConfig.from_openai(...)

- Poll vector store indexing (create_and_poll) to ensure file_search returns results immediately after upload
- Set status to failed when vector store upload fails
- Skip get_analyzed_document tool in file_search mode to prevent LLM from bypassing RAG
- Simplify sample auth: single credential, direct parameters
- Use from_foundry backend for Foundry project endpoints

- Add module-level docstrings to __init__.py and _context_provider.py
- Use Self return type for __aenter__ (with typing_extensions fallback)
- Use explicit typed params for __aexit__ signature
- Add sync TokenCredential to AzureCredentialTypes union
- Pass AGENT_FRAMEWORK_USER_AGENT to ContentUnderstandingClient
- Remove unused ContentLimits from public API and tests
- Fix FileSearchConfig tests to match refactored backend API
- Fix lifecycle tests to match eager client initialization

- Refactor _analyze_file to return DocumentEntry instead of mutating dict
- Remove TokenCredential from AzureCredentialTypes (fixes mypy/pyright CI)
- Remove OpenAIFileSearchBackend/FoundryFileSearchBackend from public API (internal to FileSearchConfig factory methods)
- Remove DocumentStatus from public exports (implementation detail)
- Update file_search comments to reflect backend-agnostic design
- Add DocumentStatus enum, analysis/upload duration tracking
- Add combined timeout for CU analysis + vector store upload

- Trim analyze_pdf_result.json from 4427 to 23 lines by removing pages, words, lines, paragraphs, sections, spans, and source fields that are not used by any unit test.
- Add docstring note that filename must be unique within a session; duplicate filenames are rejected and the file will not be analyzed.

Copilot

Pull request overview

Adds a new optional Python connector package, agent-framework-azure-ai-contentunderstanding, integrating Azure Content Understanding (CU) into the Agent Framework as a BaseContextProvider for automatic attachment analysis and optional vector-store (file_search) indexing.

Changes:

Introduces ContentUnderstandingContextProvider plus supporting models and vector-store upload abstraction (FileSearchBackend / FileSearchConfig).
Adds extensive unit + integration tests and CU result fixtures, along with script + DevUI samples.
Wires the new workspace package into python/pyproject.toml and python/uv.lock.

Reviewed changes

Copilot reviewed 36 out of 38 changed files in this pull request and generated 8 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `python/uv.lock` | Adds the new workspace member and locks new deps (`azure-ai-contentunderstanding`, `filetype`). |
| `python/pyproject.toml` | Registers the package in workspace deps and adds pyright test env config. |
| `python/packages/azure-ai-contentunderstanding/pyproject.toml` | New package metadata, deps, and tooling config (pytest/ruff/mypy/pyright). |
| `python/packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/__init__.py` | Public exports for provider/models/backends. |
| `python/packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_models.py` | Defines `DocumentStatus`, `AnalysisSection`, `DocumentEntry`, `FileSearchConfig`. |
| `python/packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_file_search.py` | Adds backend abstraction for vector store upload/delete across OpenAI/Foundry clients. |
| `python/packages/azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_context_provider.py` | Implements CU analysis, session tracking, background analysis, MIME sniffing, and optional vector-store upload. |
| `python/packages/azure-ai-contentunderstanding/tests/cu/conftest.py` | Adds fixtures and mock CU client factory. |
| `python/packages/azure-ai-contentunderstanding/tests/cu/test_models.py` | Unit tests for enums/typed models and `FileSearchConfig` factories. |
| `python/packages/azure-ai-contentunderstanding/tests/cu/test_context_provider.py` | Comprehensive unit tests for provider flows (analysis, background, sniffing, file_search). |
| `python/packages/azure-ai-contentunderstanding/tests/cu/test_integration.py` | Live CU integration tests (skipped unless env var is set). |
| `python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_pdf_result.json` | CU PDF fixture for unit tests. |
| `python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_invoice_result.json` | CU invoice fixture for unit tests. |
| `python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_image_result.json` | CU image fixture for unit tests. |
| `python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_audio_result.json` | CU audio fixture for unit tests. |
| `python/packages/azure-ai-contentunderstanding/tests/cu/fixtures/analyze_video_result.json` | CU video fixture for unit tests. |
| `python/packages/azure-ai-contentunderstanding/README.md` | Package README with setup guidance and usage examples. |
| `python/packages/azure-ai-contentunderstanding/LICENSE` | Adds MIT license file for the new package. |
| `python/packages/azure-ai-contentunderstanding/AGENTS.md` | Package-specific agent/dev notes and architecture description. |
| `python/packages/azure-ai-contentunderstanding/.gitignore` | Ignores local-only artifacts under the package. |
| `python/packages/azure-ai-contentunderstanding/samples/README.md` | Top-level samples index for scripts and DevUI examples. |
| `python/packages/azure-ai-contentunderstanding/samples/01-get-started/01_document_qa.py` | Script sample: single PDF upload + Q&A. |
| `python/packages/azure-ai-contentunderstanding/samples/01-get-started/02_multi_turn_session.py` | Script sample: session persistence across turns. |
| `python/packages/azure-ai-contentunderstanding/samples/01-get-started/03_multimodal_chat.py` | Script sample: PDF+audio+video parallel CU analysis. |
| `python/packages/azure-ai-contentunderstanding/samples/01-get-started/04_invoice_processing.py` | Script sample: per-file analyzer override for invoice extraction. |
| `python/packages/azure-ai-contentunderstanding/samples/01-get-started/05_background_analysis.py` | Script sample: short `max_wait` triggers background analysis + status. |
| `python/packages/azure-ai-contentunderstanding/samples/01-get-started/06_large_doc_file_search.py` | Script sample: CU extraction + vector-store indexing for `file_search`. |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/01-multimodal_agent/agent.py` | DevUI agent: CU-powered upload + chat. |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/01-multimodal_agent/__init__.py` | DevUI agent module export. |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/01-multimodal_agent/README.md` | DevUI setup/usage doc for multimodal agent. |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/azure_openai_backend/agent.py` | DevUI agent: CU + `file_search` (Azure OpenAI backend). |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/azure_openai_backend/__init__.py` | DevUI agent module export. |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/azure_openai_backend/README.md` | DevUI setup/usage doc for Azure OpenAI file_search agent. |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/foundry_backend/agent.py` | DevUI agent: CU + `file_search` (Foundry backend). |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/foundry_backend/__init__.py` | Foundry backend sample package init. |
| `python/packages/azure-ai-contentunderstanding/samples/02-devui/02-file_search_agent/foundry_backend/README.md` | DevUI setup/usage doc for Foundry file_search agent. |
| `python/AGENTS.md` | Adds the new package to the Python “Azure Integrations” index. |

...e-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_context_provider.py

python/packages/azure-ai-contentunderstanding/tests/cu/test_context_provider.py

python/packages/azure-ai-contentunderstanding/samples/02-devui/01-multimodal_agent/agent.py

python/packages/azure-ai-contentunderstanding/AGENTS.md

...-ai-contentunderstanding/samples/02-devui/02-file_search_agent/azure_openai_backend/agent.py

...on/packages/azure-ai-contentunderstanding/samples/01-get-started/06_large_doc_file_search.py

…azure_ai_contentunderstanding/_context_provider.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…/02-file_search_agent/azure_openai_backend/agent.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…/01-multimodal_agent/agent.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…tarted/06_large_doc_file_search.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

… helper

AGENTS.md:
- Remove _ensure_initialized() reference (client is created in __init__)
- Fix multi-segment docs: segments kept as list, not merged into fields
- Remove get_analyzed_document() reference (only list_documents registered)
- Update sample names to match current directory structure

test_context_provider.py:
- Simplify _make_data_uri(): remove unused 'encoded' variable

- Change _resolve_pending_tasks() instruction from 'Use file_search' to 'being indexed' since the upload hasn't completed yet at that point.
- Add LLM instruction on upload failure in step 1b so the agent can inform the user the document isn't searchable.

Copilot

Pull request overview

Copilot reviewed 36 out of 38 changed files in this pull request and generated 3 comments.

.../azure-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_file_search.py

python/packages/azure-ai-contentunderstanding/samples/02-devui/01-multimodal_agent/README.md

...e-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_context_provider.py

…led tasks
- _file_search.py: Remove unused logger and logging import
- 01-multimodal_agent/README.md: Remove accidentally pasted Python script
- _context_provider.py close(): Await cancelled tasks before closing client to prevent 'Task destroyed but pending' warnings
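
The `close()` fix in this commit relies on a general `asyncio` rule: calling `cancel()` only requests cancellation, so the tasks must still be awaited before the resources they use are released. A minimal sketch of that pattern follows; `close_pending` and `slow_job` are hypothetical names and this is not the provider's actual `close()` body.

```python
import asyncio


async def slow_job() -> None:
    """Stand-in for a long-running background analysis."""
    await asyncio.sleep(10)


async def close_pending(tasks: list) -> None:
    """Cancel unfinished tasks, then wait until each one has actually stopped."""
    for task in tasks:
        if not task.done():
            task.cancel()
    # return_exceptions=True collects each CancelledError instead of raising it.
    await asyncio.gather(*tasks, return_exceptions=True)
    # Only after this point is it safe to close a client the tasks were using.


async def demo() -> list:
    tasks = [asyncio.create_task(slow_job()) for _ in range(2)]
    await asyncio.sleep(0)  # let the tasks start running
    await close_pending(tasks)
    return [task.cancelled() for task in tasks]


cancelled = asyncio.run(demo())
print(cancelled)
```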

Copilot

Pull request overview

Copilot reviewed 36 out of 38 changed files in this pull request and generated 3 comments.

...e-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_context_provider.py

- Add _sanitize_doc_key() to strip control characters, collapse whitespace, and cap length at 255 chars; prevents prompt injection via crafted filenames in extend_instructions() calls.
- Track accepted doc_keys in step 3 so step 5 only injects content for files actually analyzed this turn, not pre-existing duplicates.
- Soften duplicate upload instruction wording (remove IMPORTANT/caps).
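
The three sanitization steps named in this commit (strip control characters, collapse whitespace, cap at 255 characters) can be sketched as follows. This is an illustration rather than the package's `_sanitize_doc_key()`: the public name `sanitize_doc_key` is hypothetical, and replacing control characters with spaces (instead of deleting them) is a choice made here so that words separated only by a newline stay separated.

```python
import re
import unicodedata

MAX_KEY_LENGTH = 255


def sanitize_doc_key(name: str) -> str:
    """Neutralize a filename before it is embedded in LLM instructions."""
    # Unicode category "Cc" covers control characters such as \n, \t and \x00.
    cleaned = "".join(" " if unicodedata.category(ch) == "Cc" else ch for ch in name)
    # Collapse any run of whitespace into a single space and trim the ends.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Cap the length so an oversized name cannot flood the prompt.
    return cleaned[:MAX_KEY_LENGTH]


print(sanitize_doc_key("invoice\n\nIGNORE  previous\x00.pdf"))
```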

Previously _pending_tasks, _pending_uploads, and _uploaded_file_ids were stored on self, shared across all sessions. This caused cross-session leakage: Session A's background task results could be injected into Session B's context. Now these are stored in the per-session state dict. Global copies (_all_pending_tasks, _all_uploaded_file_ids) are kept on self only for best-effort cleanup in close(). Add 2 new TestSessionIsolation tests verifying that background tasks and resolved content stay within their originating session.
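
The isolation fix described above can be illustrated with a small provider that keeps pending tasks in the per-session state dict and keeps a global list only for cleanup. The sketch is an assumption-laden illustration: `SessionScopedProvider`, `fake_analysis`, and the `"pending_tasks"` key are hypothetical names, not the package's own.

```python
import asyncio


class SessionScopedProvider:
    def __init__(self) -> None:
        # Global copy, used only for best-effort cleanup at shutdown.
        self._all_pending_tasks = []

    def start(self, state: dict, doc_key: str, coro) -> None:
        task = asyncio.create_task(coro)
        # The task is stored on the session's own state, not on self.
        state.setdefault("pending_tasks", {})[doc_key] = task
        self._all_pending_tasks.append(task)

    def resolve(self, state: dict) -> dict:
        """Return finished results that belong to this session only."""
        pending = state.setdefault("pending_tasks", {})
        finished = [key for key, task in pending.items() if task.done()]
        return {key: pending.pop(key).result() for key in finished}


async def fake_analysis(text: str) -> str:
    await asyncio.sleep(0.01)
    return text


async def demo():
    provider = SessionScopedProvider()
    session_a, session_b = {}, {}
    provider.start(session_a, "a.pdf", fake_analysis("contents of a.pdf"))
    await asyncio.sleep(0.05)
    seen_by_b = provider.resolve(session_b)  # a different session sees nothing
    seen_by_a = provider.resolve(session_a)
    return seen_by_a, seen_by_b


seen_by_a, seen_by_b = asyncio.run(demo())
print(seen_by_a, seen_by_b)
```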

Only MARKDOWN and FIELDS are handled by _extract_sections(). Remove FIELD_GROUNDING, TABLES, PARAGRAPHS, SECTIONS to avoid exposing dead options to users.

Copilot

Pull request overview

Copilot reviewed 36 out of 38 changed files in this pull request and generated 1 comment.

Copilot · 2026-03-27T07:16:46Z

...e-ai-contentunderstanding/agent_framework_azure_ai_contentunderstanding/_context_provider.py

+            kind = filetype.guess(binary_data[:262])  # type: ignore[reportUnknownMemberType]
+            if kind:
+                mime: str = kind.mime  # type: ignore[reportUnknownMemberType]


The # type: ignore[reportUnknownMemberType] annotations use a Pyright-only error code. In this repo, MyPy is run in strict mode for packages, and MyPy will treat unknown ignore codes as an error. Replace these ignores with a MyPy-compatible approach (e.g., use Any/cast, or getattr to access mime without a type-ignore code that MyPy doesn't understand).

Suggested change

-            kind = filetype.guess(binary_data[:262])  # type: ignore[reportUnknownMemberType]
-            if kind:
-                mime: str = kind.mime  # type: ignore[reportUnknownMemberType]
+            kind_any: Any = filetype.guess(binary_data[:262])
+            if kind_any:
+                mime = cast(str, kind_any.mime)

yungshinlintw added 3 commits March 21, 2026 15:41

chore: add connector .gitignore, update uv.lock

8e6e73b

markwallace-microsoft added documentation Improvements or additions to documentation python labels Mar 22, 2026

github-actions bot changed the title ~~[WIP] [Python] Add agent-framework-azure-contentunderstanding package (DO NOT REVIEW)~~ Python: [WIP] [Python] Add agent-framework-azure-contentunderstanding package (DO NOT REVIEW) Mar 22, 2026

yungshinlintw and others added 19 commits March 23, 2026 14:49

fix: add key-based auth support to all samples

c4fe308

Merge upstream/main into yslin/contentunderstanding-context-provider

14234d2

fix: remove ContentLimits from README code block

04e8dce

refactor: create CU client in __init__ instead of __aenter__

637a3a4

docs: add file_search param to class docstring

1f451b6

refactor: FileSearchBackend abstraction + caller-owned vector store

cb9b5b6

perf: set max_num_results=10 for file_search to reduce token usage

90284e6

fix: move import to top of file (E402 lint)

67975c6

chore: remove unused imports

4345cbc

yungshinlintw changed the title ~~Python: [WIP] [Python] Add agent-framework-azure-contentunderstanding package (DO NOT REVIEW)~~ Python: [WIP] [Python] Add agent-framework-azure-ai-contentunderstanding package (DO NOT REVIEW) Mar 26, 2026

yungshinlin added 2 commits March 26, 2026 10:36

yungshinlintw changed the title ~~Python: [WIP] [Python] Add agent-framework-azure-ai-contentunderstanding package (DO NOT REVIEW)~~ [Python] Add agent-framework-azure-ai-contentunderstanding package Mar 27, 2026

yungshinlintw marked this pull request as ready for review March 27, 2026 05:14

Copilot AI review requested due to automatic review settings March 27, 2026 05:14

Copilot started reviewing on behalf of yungshinlintw March 27, 2026 05:16

Merge branch 'main' into yslin/contentunderstanding-context-provider

6ee5d98

Copilot AI reviewed Mar 27, 2026

yungshinlintw and others added 8 commits March 26, 2026 22:28

Update python/packages/azure-ai-contentunderstanding/agent_framework_…

5ee0514

Update python/packages/azure-ai-contentunderstanding/agent_framework_…

d0e98b3

Update python/packages/azure-ai-contentunderstanding/samples/02-devui…

dd1fffb

Update python/packages/azure-ai-contentunderstanding/samples/02-devui…

0714d17

Update python/packages/azure-ai-contentunderstanding/samples/01-get-s…

c456327

fix: wrap long line in devui agent instructions (E501)

d288fc6

yungshinlintw requested a review from Copilot March 27, 2026 05:58

yungshinlintw mentioned this pull request Mar 27, 2026

.NET: [Feature]: Azure Content Understanding context provider for multimodal document analysis #4942

Open

Copilot started reviewing on behalf of yungshinlintw March 27, 2026 06:00

Copilot AI reviewed Mar 27, 2026

yungshinlintw requested a review from Copilot March 27, 2026 06:21

Copilot started reviewing on behalf of yungshinlintw March 27, 2026 06:22

Copilot AI reviewed Mar 27, 2026

yungshinlin added 3 commits March 26, 2026 23:43

fix: add type annotation to tasks_to_cancel for pyright

0afc812

yungshinlintw requested a review from Copilot March 27, 2026 07:12

Copilot started reviewing on behalf of yungshinlintw March 27, 2026 07:13

Remove unused AnalysisSection enum values

898478f

Copilot AI reviewed Mar 27, 2026


4 participants
