Skip to content

feat(api): add MCP user-identity forwarding#36839

Open
CourTeous33 wants to merge 33 commits into
mainfrom
feat/mcp-token-transfer-api
Open

feat(api): add MCP user-identity forwarding#36839
CourTeous33 wants to merge 33 commits into
mainfrom
feat/mcp-token-transfer-api

Conversation

@CourTeous33
Copy link
Copy Markdown
Contributor

Summary

  • Adds end-user SSO identity forwarding to outbound MCP requests. When an MCP provider has forward_user_identity enabled, the API calls dify-enterprise's /inner/api/mcp/issue-token to mint a fresh per-user OIDC id_token (or OAuth2 access_token) for the calling workflow user and stamps it as Authorization: Bearer <token> on every outbound MCP request — so the MCP server can act on behalf of the verified end user rather than seeing only "Dify is calling."
  • Two new columns on tool_mcp_providers:
    • forward_user_identity (bool, default false) — master gate exposed to the UI as a toggle.
    • identity_mode (varchar(32), default "off") — "off" or "idp_token". Forward-compatible so a future RFC 8693 token-exchange mode lands without UI/DB churn.
  • Errors map cleanly to workflow halts: 428MCPNoRefreshTokenError (user has no stored SSO refresh token), 401MCPIdentityRefreshError (IdP rejected the refresh). No silent downgrade to anonymous.

Test plan

  • flask db upgrade applies the new migration on a fresh DB.
  • Existing MCP providers continue to work unchanged (forward_user_identity defaults to false).
  • With a workspace using SSO + an MCP provider with forward_user_identity=true, invoking a tool from a workflow forwards an audience-bound JWT (verifiable via a JWKS-checking MCP server).
  • With forward_user_identity=true but the calling user has no stored refresh token, the workflow fails with MCPNoRefreshTokenError instead of silently going anonymous.
  • GET /console/api/workspaces/current/tools/mcp returns the two new fields in each provider entry.

This is the server-side half. The UI toggle ships in a separate PR (feat/mcp-token-transfer-web).

When an MCP provider has forward_user_identity enabled, MCPTool now asks
dify-enterprise to mint a short-lived per-user SSO id_token (via the M1
/inner/api/mcp/issue-token endpoint) and stamps it as the Authorization
Bearer on every outbound MCP request — so an MCP server can act on behalf
of the verified end user instead of seeing only "Dify is calling."

- Adds forward_user_identity (bool) + identity_mode ("off" | "idp_token")
  to tool_mcp_providers, plumbed through MCPProviderEntity, the controller,
  service-layer CRUD, and the tool/provider runtime.
- EnterpriseService.issue_mcp_token wraps the enterprise endpoint and maps
  428 → MCPNoRefreshTokenError, 401 → MCPIdentityRefreshError so workflows
  halt with a clear "please re-authenticate" instead of silently going
  anonymous.
- Identity_mode is intentionally an enum-shaped string column so future
  modes (e.g. RFC 8693 token exchange) land without UI/DB churn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@CourTeous33 CourTeous33 requested a review from a team May 30, 2026 02:53
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label May 30, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 30, 2026

Pyrefly Type Coverage

Metric Base PR Delta
Type coverage 46.88% 46.88% +0.00%
Strict coverage 46.38% 46.38% +0.00%
Typed symbols 25,884 25,900 +16
Untyped symbols 29,644 29,662 +18
Modules 2819 2819 0

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 30, 2026

Pyrefly Diff

base → PR
--- /tmp/pyrefly_base.txt	2026-06-04 18:33:05.140660082 +0000
+++ /tmp/pyrefly_pr.txt	2026-06-04 18:32:50.065570824 +0000
@@ -63,7 +63,7 @@
 ERROR Returned type `list[DatasetRetrieverTool]` is not assignable to declared return type `list[DatasetRetrieverBaseTool] | None` [bad-return]
     --> core/rag/retrieval/dataset_retrieval.py:1243:16
 ERROR Class member `MCPToolProviderController.entity` overrides parent class `ToolProviderController` in an inconsistent manner [bad-override-mutable-attribute]
-  --> core/tools/mcp_tool/provider.py:33:14
+  --> core/tools/mcp_tool/provider.py:34:14
 ERROR Class member `PluginToolProviderController.entity` overrides parent class `BuiltinToolProviderController` in an inconsistent manner [bad-override-mutable-attribute]
   --> core/tools/plugin_tool/provider.py:12:5
 ERROR Cannot set item in `dict[str, dict[str, Any]]` [unsupported-operation]
@@ -4256,7 +4256,7 @@
 ERROR Argument `Iterator[DatasourceMessage]` is not assignable to parameter `messages` with type `Generator[DatasourceMessage]` in function `core.datasource.utils.message_transformer.DatasourceFileMessageTransformer.transform_datasource_invoke_messages` [bad-argument-type]
    --> tests/unit_tests/core/datasource/utils/test_message_transformer.py:329:26
 ERROR Argument `SimpleNamespace` is not assignable to parameter `db_provider` with type `MCPToolProvider` in function `core.entities.mcp_provider.MCPProviderEntity.from_db_model` [bad-argument-type]
-  --> tests/unit_tests/core/entities/test_entities_mcp_provider.py:59:46
+  --> tests/unit_tests/core/entities/test_entities_mcp_provider.py:60:46
 ERROR `None` is not subscriptable [unsupported-operation]
    --> tests/unit_tests/core/entities/test_entities_provider_configuration.py:666:12
 ERROR Argument `SimpleNamespace` is not assignable to parameter `credential_record` with type `ProviderCredential | ProviderModelCredential` in function `core.entities.provider_configuration.ProviderConfiguration._update_load_balancing_configs_with_credential` [bad-argument-type]
@@ -5394,6 +5394,8 @@
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
+ERROR Argument `str` is not assignable to parameter `identity_mode` with type `IdentityMode` in function `core.tools.mcp_tool.tool.MCPTool.__init__` [bad-argument-type]
+   --> tests/unit_tests/core/tools/test_mcp_tool.py:175:23
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
 ERROR Argument `SimpleNamespace` is not assignable to parameter `agent_message` with type `Message` in function `core.tools.tool_engine.ToolEngine._create_message_files` [bad-argument-type]
    --> tests/unit_tests/core/tools/test_tool_engine.py:138:31

@autofix-ci autofix-ci Bot requested a review from crazywoola as a code owner May 30, 2026 02:56
@github-actions github-actions Bot added the web This relates to changes on the web. label May 30, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 30, 2026

Codecov Report

❌ Patch coverage is 87.37010% with 316 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.95%. Comparing base (d3058d6) to head (85f1ee3).
⚠️ Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
api/core/workflow/generator/runner.py 82.59% 50 Missing and 32 partials ⚠️
api/core/workflow/nodes/agent_v2/validators.py 57.62% 31 Missing and 19 partials ⚠️
api/controllers/web/human_input_file_upload.py 71.18% 25 Missing and 9 partials ⚠️
api/services/human_input_file_upload_service.py 71.90% 17 Missing and 17 partials ⚠️
api/services/agent/composer_validator.py 78.94% 7 Missing and 9 partials ⚠️
api/core/workflow/node_runtime.py 60.52% 9 Missing and 6 partials ⚠️
...workflow/nodes/agent_v2/runtime_request_builder.py 83.52% 11 Missing and 3 partials ⚠️
api/core/workflow/human_input_policy.py 64.86% 8 Missing and 5 partials ⚠️
api/services/human_input_service.py 88.11% 6 Missing and 6 partials ⚠️
api/controllers/console/app/agent_app_workspace.py 94.28% 10 Missing ⚠️
... and 15 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #36839      +/-   ##
==========================================
+ Coverage   85.88%   85.95%   +0.07%     
==========================================
  Files        4573     4602      +29     
  Lines      224404   227335    +2931     
  Branches    41206    41807     +601     
==========================================
+ Hits       192730   195412    +2682     
- Misses      27955    28101     +146     
- Partials     3719     3822     +103     
Flag Coverage Δ
api 85.45% <87.37%> (+0.06%) ⬆️
dify-ui 95.57% <ø> (ø)
web 86.63% <ø> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

CourTeous33 and others added 13 commits May 31, 2026 23:00
…complete error mapping, immutable PATCH defaults

Three correctness fixes flagged in self-review of the M2 PR:

1. Fail-closed when forwarded identity can't be obtained. MCPClientWithAuthRetry
   gained a `forward_identity_active` flag; when set, an MCPAuthError from the
   MCP server is re-raised immediately instead of falling through to the
   static-OAuth retry path (which would silently replace the rejected user
   token with the provider's static client identity). MCPTool also now refuses
   to invoke when forward_user_identity is enabled but no user_id was threaded
   through — better to halt the workflow than fingerprint as Dify-the-app.

2. EnterpriseService.issue_mcp_token now catches EnterpriseServiceError (the
   real base class) instead of EnterpriseAPIError. The 400/401/403/404
   subclasses inherit directly from EnterpriseServiceError, so the old
   catch-list let 403/404 leak through as generic ToolInvokeError text. 403
   now surfaces as MCPIdentityRefreshError (license/policy refusal).

3. MCPProviderBasePayload.forward_user_identity / identity_mode are now
   bool | None / Literal | None defaulting to None — matching the
   service-layer "None means leave unchanged" contract on update_provider.
   Previously a PATCH that omitted the fields would reset the toggle to
   False/"off". Create path now materializes None → safe defaults.

Also tightens the expires_at parser: accepts int or float, rejects bool
(which is an int subclass and would have passed isinstance previously).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers the M2 surfaces and the review-fix behaviors:

- core/mcp/auth_client.py — MCPClientWithAuthRetry short-circuits on
  MCPAuthError when forward_identity_active=True (no silent downgrade to
  static OAuth identity); default constructor still defaults to False.

- services/enterprise/enterprise_service.py — issue_mcp_token maps
  401→MCPIdentityRefreshError, 428→MCPNoRefreshTokenError,
  403→MCPIdentityRefreshError (license/policy), other status→MCPTokenError.
  Accepts int/float expires_at; rejects bool (which is a Python int subclass
  and would have slipped past a naive isinstance check).

- core/tools/mcp_tool/tool.py — _inject_forwarded_identity stamps the
  Bearer header (preserving non-Authorization headers) and translates EE
  errors to ToolInvokeError; invoke_remote_mcp_tool now fails closed when
  forwarding is on but no user_id was supplied (reordered the check ahead
  of the DB session so it's testable without a Flask app context).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The M2 PR's stated promise that identity_mode is "forward-compatible for a
future RFC 8693 token_exchange mode" was undercut by Literal["off","idp_token"]
inlined in 6 layers (entity, model, controller payload, service signatures,
runtime comparisons). Adding a third value would have been a multi-file edit.

Introduces a single IdentityMode(StrEnum) in core/entities/mcp_provider.py
(next to the existing MCPSupportGrantType) and threads it through:

  - MCPProviderEntity                  (entity field + DB hydration)
  - MCPProviderCreatePayload           (Pydantic validation at the boundary)
  - MCPToolProviderController          (constructor + attribute)
  - MCPTool                            (constructor + comparison at invoke)
  - MCPToolManageService.create/update (service signatures)

Adding a new mechanism is now one line — append it to the enum and the entire
stack (including Pydantic boundary validation) picks it up automatically.

DB column stays varchar(32); MCPProviderEntity.from_db_model now coerces
through IdentityMode(...) so an unknown stored value raises at the boundary
rather than silently leaking a bare string into the runtime checks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Simplicity pass — no behavior changes:

- Drop multi-line "M2/M3 — …" narration above forward_user_identity fields
  in MCPProviderEntity and MCPProviderBasePayload; the field names and
  default values speak for themselves.
- Shrink IdentityMode docstring (drop the forward-compat marketing copy).
- Drop the redundant `assert user_id` inside the inject block — the
  fail-closed check 30 lines above already establishes it; the assert plus
  its comment is duplicate guarding.
- Drop the duplicate security comment block in MCPClientWithAuthRetry —
  the kwarg docstring already explains the contract.
- Shrink MCPNoRefreshTokenError / MCPIdentityRefreshError docstrings to a
  single sentence each; drop the leaf-mount workaround comment that won't
  matter post-merge.
- Drop the per-test docstrings and a no-op `patch.object(client,
  "_has_retried", False)` from test_auth_client_inheritance.py — the test
  names are explicit and the attribute is already False after __init__.
- Drop test_forwarding_inactive_tool_keeps_legacy_construction — it
  asserted attribute round-tripping (constants in, constants out) without
  exercising any code path.
- Drop the `noqa: F401` dead import in test_happy_path_returns_token_and_expiry.

22 tests still pass (was 23 — minus the tautological one).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two CI fixes — no runtime behavior changes:

1. tests/unit_tests/core/entities/test_entities_mcp_provider.py — the
   SimpleNamespace stub for `db_provider` was missing the two new columns
   that MCPProviderEntity.from_db_model now reads. Add forward_user_identity
   and identity_mode to the mock.

2. migrations: rebase down_revision from 7885bd53f9a9 onto 121e7346074d.
   main grew two new migrations off 7885bd53f9a9 since this branch was
   cut (a1b2c3d4e5f6 add_credential_visibility, 121e7346074d
   unify_agent_runtime_sessions_table), so our 3df4dbcc1e21 was a
   divergent second head — flask-migrate refused to apply.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-requested compatibility behavior for non-enterprise installs:
even with a stale `forward_user_identity=True` row in the DB, the
runtime treats the toggle as a no-op when `dify_config.ENTERPRISE_ENABLED`
is False. No enterprise inner-API call is made, no Bearer header is
injected, and the legacy provider-identity path runs unchanged.

This protects non-enterprise OSS users from a 5xx if a row was created
when the deployment was Enterprise and later switched off (or via a
migration import). The DB columns persist, but only become active when
the enterprise side that can mint the tokens is actually present.

- Adds MCPTool._forwarding_requested property combining
  forward_user_identity + identity_mode + ENTERPRISE_ENABLED.
- Both the fail-closed check and the inject path consult it.
- New test test_invoke_skips_forwarding_when_enterprise_disabled
  asserts no _inject_forwarded_identity call when ENTERPRISE_ENABLED
  is False.
- Existing fail-closed test now explicitly patches ENTERPRISE_ENABLED=True
  to lock in that branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread api/models/tools.py Outdated
from services.enterprise.enterprise_service import EnterpriseService

try:
token, _expires_at = EnterpriseService.issue_mcp_token(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all enterprise related must be gated using dify_config.ENTERPRISE_ENABLED, including UI and backend invocations.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the route are gated already.

One possible non-gated senario is that someone curl to the controller with identity_mode=idp_token on a non-EE install. should we consider to gate such attack?

Copy link
Copy Markdown
Contributor

@wylswz wylswz Jun 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Both API and backend implementation should be gated.

authentication: MCPAuthentication | None = None,
validation_result: ServerUrlValidationResult | None = None,
forward_user_identity: bool | None = None,
identity_mode: IdentityMode | None = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just keep one parameter,
identity_mode: IdentityMode = IdentityMode.OFF, then None check is unnecessary

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes forward_user_identity can be removed

but PUT /workspaces/current/tool-provider/mcp can have the identity_mode = None if unchanged, so I would rather keep the none check.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API related behavior could be handled by controller layer

CourTeous33 added a commit that referenced this pull request Jun 4, 2026
…ty_mode (#36840)

API PR #36839 collapses the redundant `forward_user_identity` boolean
into the `identity_mode` enum — there's only one knob now. Mirror
that on the web side:

- types/ + service: stop declaring and sending forward_user_identity
- modal: omit it from the create/update payload; identity_mode alone
  drives backend state
- hook: hydrate the local toggle from `data.identity_mode !== 'off'`
  instead of the dropped bool
- tests: assert the new payload shape and the new hydration source

UI surface (the toggle label, switch state name) is unchanged; the
boolean lives only in component state and gets translated to
identity_mode at submit time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
autofix-ci Bot and others added 13 commits June 4, 2026 17:25
Reverts 1b78307. On reflection the validator was paranoia, not
defense in depth:

- the UI already hides the toggle on non-EE installs
- the runtime `_forwarding_requested` check no-ops a non-OFF row
  on non-EE before any enterprise call is made

The only callers it actually rejected were hand-rolled API requests
on non-EE — whose worst case was a harmless persisted column the
runtime ignores. Worse, it turned that case into a 4xx, which would
break an EE-export → non-EE-import migration for no benefit.

The remaining two layers (UI hide + runtime gate) already satisfy
wylswz's "UI and backend invocations" review note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e inspector (#36984)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
…uman Input node (#36322)

Co-authored-by: JzoNg <jzongcode@gmail.com>
Co-authored-by: GPT 5.4 <codex@openai.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
…lidation (ENG-367/368) (#37033)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@dosubot dosubot Bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jun 4, 2026
CourTeous33 and others added 2 commits June 4, 2026 19:49
Two migrations diverged from 121e7346074d after the most recent rebase
onto main — 3df4dbcc1e21 (this branch's identity_mode column) and
8d4c2a1b9f03 (human-input upload tables, on main). DB Migration Test
fails with "Multiple head revisions are present" until they're joined.

Routing-only merge; no DDL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Jun 4, 2026
@wylswz wylswz added this to the 1.15.0 milestone Jun 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L This PR changes 100-499 lines, ignoring generated files. web This relates to changes on the web.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants