Summary
The pre-publish validator is producing a false positive for normal Python code that assigns a function return value to a variable named token.
Steps To Reproduce
Current behavior
Publishing fails with an error like:
Pre-publish validation failed:
scripts/f2e_mock.py line 1842 contains a value that looks like a secret or token. Replace real credentials with placeholders before publishing.
A representative line that triggers the error is:
token = extract_group_token_value(response, group_choice.group_id)
This line does not contain a hardcoded credential. It only assigns the return value of a function call.
Minimal example
def maybe_generate_group_token(...):
    requested_token = secrets.token_hex(20)
    ...
    token = extract_group_token_value(response, group_choice.group_id)
    if token:
        return token
    ...
Why this looks like a false positive
The pre-publish validator currently uses a regex-based heuristic in:
- server/skillhub-domain/src/main/java/com/iflytek/skillhub/domain/skill/validation/BasicPrePublishValidator.java
Relevant rule:
(?i)(api[_-]?key|access[_-]?key|secret|password|token)\s*[:=]\s*['"]?([A-Za-z0-9_-]{12,})
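This regex scans text line by line and does not parse Python syntax. Because the opening quote (['"]?) is optional, any unquoted run of 12 or more characters from [A-Za-z0-9_-] after the = is accepted as a value. As a result:
- the left-hand side matches token
- the right-hand side matches the identifier extract_group_token_value (25 characters, ending at the opening parenthesis)
- the validator treats that identifier as a “secret-like value”
So function calls and other long identifiers can be mistaken for leaked secrets.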
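The match can be reproduced outside the validator. The snippet below is a minimal sketch that runs the quoted rule through Python's re module, which accepts the same syntax for this pattern. It assumes the key-name alternatives read api[_-]?key and access[_-]?key; the result shown depends only on the token alternative. The name SECRET_PATTERN is illustrative and does not come from the validator source.

```python
import re

# Rule transcribed from the issue text (assumption: `[_-]?` in the key-name alternatives).
SECRET_PATTERN = re.compile(
    r"""(?i)(api[_-]?key|access[_-]?key|secret|password|token)\s*[:=]\s*['"]?([A-Za-z0-9_-]{12,})"""
)

# The line reported by the validator.
flagged = "token = extract_group_token_value(response, group_choice.group_id)"
match = SECRET_PATTERN.search(flagged)
print(match.group(1))  # token
print(match.group(2))  # extract_group_token_value

# A neighbouring line from the minimal example is not flagged, only because
# the run after "=" ("secrets") is shorter than 12 characters.
not_flagged = "requested_token = secrets.token_hex(20)"
print(SECRET_PATTERN.search(not_flagged))  # None
```

Whether a line is flagged therefore depends on the length of the identifier after the =, not on whether the value is a literal.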
Related code path
The validator is invoked during publish here:
- server/skillhub-domain/src/main/java/com/iflytek/skillhub/domain/skill/service/SkillPublishService.java
Expected behavior
The validator should block obvious hardcoded credentials, but should not reject:
- variable assignments
- function-call return values
- ordinary identifiers that happen to contain words like token, secret, etc.
Example that should be allowed:
token = extract_group_token_value(response, group_choice.group_id)
Expected Behavior
Suggested regression test
A test case similar to this should pass:
token = extract_group_token_value(response, group_choice.group_id)
while real hardcoded secrets such as:
token = "ghp_xxxxxxxxxxxxxxxxxxxx"
api_key = "sk-xxxxxxxxxxxxxxxxxxxx"
should still fail.
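One possible direction, sketched here in Python rather than in the validator's Java, is to require the value to be a quoted string literal with a matching closing quote. This is an assumption about how the rule could be tightened, not the project's actual fix; STRICT_PATTERN and looks_like_hardcoded_secret are hypothetical names, and the key-name alternatives are assumed to read api[_-]?key and access[_-]?key.

```python
import re

# Hypothetical stricter rule: the value must open with a quote and close
# with the same quote (backreference \2), so bare identifiers never match.
STRICT_PATTERN = re.compile(
    r"""(?i)(api[_-]?key|access[_-]?key|secret|password|token)\s*[:=]\s*(['"])([A-Za-z0-9_-]{12,})\2"""
)


def looks_like_hardcoded_secret(line: str) -> bool:
    """Return True if the line assigns a quoted, secret-like literal."""
    return STRICT_PATTERN.search(line) is not None


# Lines the issue expects to pass validation.
should_pass = [
    "token = extract_group_token_value(response, group_choice.group_id)",
    "requested_token = secrets.token_hex(20)",
]

# Lines the issue expects to keep failing validation.
should_fail = [
    'token = "ghp_xxxxxxxxxxxxxxxxxxxx"',
    'api_key = "sk-xxxxxxxxxxxxxxxxxxxx"',
]

for line in should_pass:
    assert not looks_like_hardcoded_secret(line), line
for line in should_fail:
    assert looks_like_hardcoded_secret(line), line
```

The trade-off is that unquoted secrets, for example in .env or YAML files, would no longer be caught by this rule alone, so a real fix may need a per-file-type check or a separate pattern for those formats.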
Environment
No response
API Contract Impact
No response
Logs Or Screenshots
No response