[FLINK-39604][table] Show state and PTF capabilities in DESCRIBE FUNCTION EXTENDED #28114

Draft
nateab wants to merge 1 commit into apache:master from nateab:FLINK-39604-describe-function-ptf

nateab commented May 4, 2026

What is the purpose of the change

Extend DESCRIBE FUNCTION EXTENDED to surface function metadata that previously had no introspection path from SQL: state entries (with type and TTL) and three PTF capability flags.

JIRA: FLINK-39604

DESCRIBE FUNCTION shipped under FLINK-35822 before the PTF infrastructure (FLIP-440 / FLINK-36705) and has not been touched since. The EXTENDED form already calls FunctionDefinition#getTypeInference(...) to render the signature row, but it ignores getStateTypeStrategies(), disableSystemArguments(), the ChangelogFunction interface, and the presence of an onTimer method — all class-level facts the user can't otherwise see.

Per-argument trait info was considered but rejected because it is already encoded in the signature row (e.g. f(input => {TABLE, SET SEMANTIC TABLE, OPTIONAL PARTITION BY})).

Brief change log

  • DescribeFunctionOperation: new private buildPtfMetadataRows(FunctionDefinition, TypeInference) helper appends:
    • state: <name> rows (type + TTL, best-effort via inferType(null) / getTimeToLive(null)) — applies to PTFs and aggregate functions (the accumulator surfaces as state: acc).
    • When kind == PROCESS_TABLE:
      • accepts system arguments: !disableSystemArguments() (whether uid / on_time are auto-injected).
      • is changelog function: definition instanceof ChangelogFunction (whether the PTF implements the interface that lets it emit +U/-U/-D).
      • uses timers: reuses ExtractionUtils.collectMethods(cls, UserDefinedFunctionHelper.PROCESS_TABLE_ON_TIMER), the same lookup the planner uses.
  • DescribeFunctionTestPtf, DescribeFunctionTestMinimalPtf, DescribeFunctionTestAgg (new): three small test fixtures in flink-sql-client/src/test/java/.../cli/utils/, reachable on the test classpath via FQN. Together they cover both true and false for each of the three capability flags, plus state entries on both a PTF and an aggregate.
  • function.q golden: three new blocks register my_ptf, my_minimal_ptf, and my_agg and assert their DESCRIBE FUNCTION EXTENDED output. Existing temp_upperudf / SUM blocks unchanged.
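For reviewers who want a feel for the three capability checks without opening the diff, here is a minimal, self-contained sketch. It is not the code in DescribeFunctionOperation: ChangelogCapable, TimerPtf, MinimalPtf, and CapabilityRowsSketch are hypothetical stand-ins, and plain java.lang.reflect replaces the ExtractionUtils.collectMethods lookup used in the PR. Only the row labels and the onTimer method name come from the description above.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for Flink's ChangelogFunction interface (not the real type). */
interface ChangelogCapable {}

/** Hypothetical PTF-like class that implements the marker and declares onTimer. */
class TimerPtf implements ChangelogCapable {
    public void eval(String row) {}
    public void onTimer(long timestamp) {}
}

/** Hypothetical PTF-like class with no extra capabilities. */
class MinimalPtf {
    public void eval(String row) {}
}

final class CapabilityRowsSketch {

    /** Returns true if the class exposes a public method with the given name. */
    static boolean hasPublicMethod(Class<?> cls, String name) {
        for (Method m : cls.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Builds the three capability rows in a stable order.
     * systemArgsDisabled models the result of disableSystemArguments() (assumption).
     */
    static Map<String, String> capabilityRows(Object function, boolean systemArgsDisabled) {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("accepts system arguments", String.valueOf(!systemArgsDisabled));
        rows.put("is changelog function", String.valueOf(function instanceof ChangelogCapable));
        rows.put("uses timers", String.valueOf(hasPublicMethod(function.getClass(), "onTimer")));
        return rows;
    }

    public static void main(String[] args) {
        capabilityRows(new TimerPtf(), false)
                .forEach((name, value) -> System.out.println(name + " | " + value));
        capabilityRows(new MinimalPtf(), false)
                .forEach((name, value) -> System.out.println(name + " | " + value));
    }
}
```

All three checks are static, class-level facts, which is why they can be answered at DESCRIBE time without any call context.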

Output schema

Unchanged: still (info name, info value). Additional rows appended only when applicable. No new SQL syntax.

Example (PTF, all capabilities present)

+---------------------------+---------------------------------------------------------------------+
|                 info name |                                                          info value |
+---------------------------+---------------------------------------------------------------------+
| ...                       | ...                                                                 |
|                 signature | my_ptf(input => {TABLE, SET SEMANTIC TABLE, OPTIONAL PARTITION BY}) |
|              state: state |                                 type=ROW<`count` BIGINT>, ttl=PT24H |
|  accepts system arguments |                                                                true |
|     is changelog function |                                                                true |
|               uses timers |                                                                true |
+---------------------------+---------------------------------------------------------------------+

Example (Minimal PTF, no capabilities)

| ...                       | ...                                                  |
|                 signature | my_minimal_ptf(input => {TABLE, SET SEMANTIC TABLE}) |
|  accepts system arguments |                                                 true |
|     is changelog function |                                                false |
|               uses timers |                                                false |

Example (Aggregate)

| ...        | ...                                                                                        |
| signature  | my_agg(value => BIGINT)                                                                    |
| state: acc | type=STRUCTURED<'...DescribeFunctionTestAgg$Acc', `count` BIGINT, `sum` BIGINT>, ttl=PT48H |

For non-PTF / non-stateful functions (most scalar UDFs, SUM, etc.) the output is unchanged from today.
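The ttl values in the examples (PT24H, PT48H) read like ISO-8601 durations, which is what java.time.Duration#toString produces. The sketch below shows how a state row of that shape could be assembled. StateRowSketch and its two helpers are hypothetical names for illustration, and the type string is passed in as plain text rather than derived from a real DataType. How the actual helper handles a state entry whose type or TTL cannot be resolved is not shown here.

```java
import java.time.Duration;

final class StateRowSketch {

    /** Builds the "info name" cell for a state entry, e.g. "state: acc". */
    static String infoName(String stateName) {
        return "state: " + stateName;
    }

    /** Builds the "info value" cell; Duration#toString yields the ISO-8601 form. */
    static String infoValue(String typeString, Duration ttl) {
        return "type=" + typeString + ", ttl=" + ttl;
    }

    public static void main(String[] args) {
        // Mirrors the PTF example above: a ROW-typed state entry with a 24 hour TTL.
        System.out.println(
                infoName("state")
                        + " | "
                        + infoValue("ROW<`count` BIGINT>", Duration.ofHours(24)));
    }
}
```

Duration never folds hours into days in its string form, so a 48 hour TTL prints as PT48H rather than P2D, matching the aggregate example.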

Verifying this change

  • flink-table-api-java: ./mvnw test -pl flink-table/flink-table-api-java — all tests pass.
  • flink-sql-client: ./mvnw test -pl flink-table/flink-sql-client -Dtest=CliClientITCase — all 14 tests pass with the three new function blocks in function.q.

Does this pull request potentially affect one of the following parts?

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no (the helper is private and the class is @Internal)
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no (extension of existing DESCRIBE FUNCTION EXTENDED)
  • If yes, how is the feature documented? N/A

Out of scope (deliberately)

  • Per-argument rows: redundant with the signature row, which already encodes name, type, and traits via f(arg => TYPE {TRAITS}).
  • New SQL syntax (e.g. DESCRIBE FUNCTION ... SHOW STATE): would require a FLIP.
  • Adding columns to the result schema: output remains (info name, info value).
  • Resolved changelog mode: ChangelogFunction#getChangelogMode(ChangelogContext) and ChangelogModeStrategy#inferChangelogMode(...) both require call-time context, so only the static instanceof boolean is exposed here. The label "is changelog function" (rather than "emits updates") reflects that this is a class-level check, not a runtime fact.
  • Time / late-record / ordering behavior: all per-call.

flinkbot commented May 4, 2026

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

nateab force-pushed the FLINK-39604-describe-function-ptf branch 4 times, most recently from f69104b to d4cdae5, on May 5, 2026 22:16
nateab changed the title from "[FLINK-39604][table] Surface PTF metadata in DESCRIBE FUNCTION EXTENDED" to "[FLINK-39604][table] Show state and PTF capabilities in DESCRIBE FUNCTION EXTENDED" on May 5, 2026
nateab force-pushed the FLINK-39604-describe-function-ptf branch from d4cdae5 to 305d449 on May 5, 2026 22:26
…TION EXTENDED

Append rows to DESCRIBE FUNCTION EXTENDED that surface metadata
already accessible at definition time but not previously rendered:

- state entries (with type and TTL) — for PTFs and user-defined
  aggregate functions whose accumulator declares a StateTypeStrategy.
- PTF capability flags — when kind == PROCESS_TABLE, three additional
  rows: accepts system arguments (whether the framework auto-injects
  uid / on_time), emits updates (ChangelogFunction implementation),
  uses timers (presence of an onTimer method).

Per-argument trait info was considered but rejected because it is
already encoded in the signature row.

Output schema is unchanged: still (info name, info value); rows are
appended only when applicable.
nateab force-pushed the FLINK-39604-describe-function-ptf branch from 305d449 to 929b477 on May 6, 2026 21:53