Copilot AI commented Jan 3, 2026

When a ShEx schema contains value sets with typed literals (e.g., "true"^^xsd:boolean), the JSON-LD context and TypeScript types were not inferring the datatype. This PR adds datatype inference for such cases.

Changes

  • New helper function (inferDataTypeFromValueSet.ts): Extracts common datatype from value sets where all ObjectLiteral values share the same type
  • Context generation: ShexJContextVisitor now adds @type to predicates with typed literal value sets
  • TypeScript generation: ShexJTypingTransformer generates correct primitive types (boolean, number, string) for ObjectLiteral values; refactored datatype mapping into reusable datatypeToDomType helper
  • Tests: Unit tests for the helper function and integration test case for boolean value sets
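The helper described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `ObjectLiteral` and `ValueSetValue` shapes are simplified stand-ins for the ShExJ typings, and returning `undefined` for empty, mixed, or untyped value sets is an assumption about the intended behaviour.

```typescript
// Simplified, hypothetical stand-ins for the ShExJ value set member types.
interface ObjectLiteral {
  value: string;
  type?: string; // datatype IRI, e.g. xsd:boolean
  language?: string;
}
type ValueSetValue = string | ObjectLiteral | { type: string };

// Type guard: an ObjectLiteral is an object carrying a string `value`.
function isObjectLiteral(v: ValueSetValue): v is ObjectLiteral {
  return (
    typeof v === "object" &&
    v !== null &&
    typeof (v as ObjectLiteral).value === "string"
  );
}

// Returns the shared datatype IRI when every member is a typed literal
// of the same datatype; otherwise returns undefined (assumed behaviour).
function inferDataTypeFromValueSet(
  values: ValueSetValue[] | undefined,
): string | undefined {
  if (!values || values.length === 0) return undefined;
  let common: string | undefined;
  for (const v of values) {
    if (!isObjectLiteral(v) || !v.type) return undefined;
    if (common === undefined) common = v.type;
    else if (common !== v.type) return undefined;
  }
  return common;
}
```

Returning `undefined` whenever the members disagree keeps the context free of a misleading `@type` in mixed cases.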

Example

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX ex: <http://example.com/>  # assumed namespace; not given in the original
ex:FeatureShape {
  ex:isEnabled [ "true"^^xsd:boolean "false"^^xsd:boolean ] ;
}

Before:

  • Context: isEnabled: {}
  • TypeScript: isEnabled: undefined

After:

  • Context: isEnabled: { "@type": "http://www.w3.org/2001/XMLSchema#boolean" }
  • TypeScript: isEnabled: boolean
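To illustrate the context side of the before/after above, here is a small sketch of how a term definition might gain `@type` once a datatype has been inferred. The `TermDefinition` shape, the `termDefinitionFor` name, and the presence of `@id` are assumptions made for this example; the actual `ShexJContextVisitor` code may be structured differently.

```typescript
// Hypothetical, simplified JSON-LD term definition.
interface TermDefinition {
  "@id": string;
  "@type"?: string;
}

// Builds a term definition and attaches "@type" only when a datatype
// was inferred for the predicate's value set.
function termDefinitionFor(
  predicateIri: string,
  inferredDatatype?: string,
): TermDefinition {
  const definition: TermDefinition = { "@id": predicateIri };
  if (inferredDatatype) {
    definition["@type"] = inferredDatatype;
  }
  return definition;
}

const isEnabled = termDefinitionFor(
  "http://example.com/isEnabled", // assumed example IRI
  "http://www.w3.org/2001/XMLSchema#boolean",
);
```

Here `isEnabled` carries the `xsd:boolean` IRI as its `@type`, matching the "After" context entry.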

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • www.w3.org (dns block)

Original prompt

This section details the original issue you should resolve

<issue_title>Infer dataType from value set</issue_title>
<issue_description>For instance I would expect the following to have a boolean datatype

PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX shsh: <http://www.w3.org/ns/shacl-shacl#>

shsh:ShapeShape {
      sh:deactivated ["true"^^xsd:boolean "false"^^xsd:boolean]?;
}

rather than erroring with

SyntaxError: Type expected. (27:19)
  25 |         "@id": string;
  26 |     };
> 27 |     deactivated?: ;
     |                   ^
  28 |     class?: {
  29 |         "@id": string;
  30 |     }[];
    at Kq (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:23:498)
    at xz (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:25:794)
    at Object.Cz [as parse] (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:25:1194)
    at parse4 (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/index.mjs:22117:24)
    at async coreFormat (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/index.mjs:22607:7)
    at async formatWithCursor (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/index.mjs:22809:14)
    at async Module.format2 (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/index.mjs:24194:25)
    at async /home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/@ldo/cli/dist/build.js:40:128
    at async Promise.all (index 3)
    at async /home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/@ldo/cli/dist/build.js:33:9 {
  loc: { start: { line: 27, column: 19 }, end: { line: 27, column: 19 } },
  cause: Vm: Type expected.
      at RB (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:21:119402)
      at YB (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:21:122980)
      at YU (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:21:191880)
      at $q (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:21:203680)
      at Xq (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:21:203346)
      at file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:25:1160
      at Zq (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:23:652)
      at Object.Cz [as parse] (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/plugins/typescript.mjs:25:1132)
      at parse4 (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/index.mjs:22117:24)
      at async coreFormat (file:///home/jeswr/Documents/GitHub/jeswr/reasoning-test/shacl2shex/node_modules/prettier/index.mjs:22607:7) {
    fileName: 'estree.ts',
    location: {
      start: { line: 27, column: 18, offset: 592 },
      end: { line: 27, column: 18, offset: 592 }
    }
  },
  codeFrame: '\x1B[0m \x1B[90m 25 |\x1B[39m         \x1B[32m"@id"\x1B[39m\x1B[33m:\x1B[39m string\x1B[33m;\x1B[39m\x1B[0m\n' +
    '\x1B[0m \x1B[90m 26 |\x1B[39m     }\x1B[33m;\x1B[39m\x1B[0m\n' +
    '\x1B[0m\x1B[31m\x1B[1m>\x1B[22m\x1B[39m\x1B[90m 27 |\x1B[39m     deactivated\x1B[33m?\x1B[39m\x1B[33m:\x1B[39m \x1B[33m;\x1B[39m\x1B[0m\n' +
    '\x1B[0m \x1B[90m    |\x1B[39m                   \x1B[31m\x1B[1m^\x1B[22m\x1B[39m\x1B[0m\n' +
    '\x1B[0m \x1B[90m 28 |\x1B[39m     \x1B[36mclass\x1B[39m\x1B[33m?\x1B[39m\x1B[33m:\x1B[39m {\x1B[0m\n' +
    '\x1B[0m \x1B[90m 29 |\x1B[39m         \x1B[32m"@id"\x1B[39m\x1B[33m:\x1B[39m string\x1B[33m;\x1B[39m\x1B[0m\n' +
    '\x1B[0m \x1B[90m 30 |\x1B[39m     }[]\x1B[33m;\x1B[39m\x1B[0m'
}
</issue_description>

## Comments on the Issue (you are @copilot in this section)

<comments>
</comments>



Copilot AI and others added 2 commits January 3, 2026 13:43
- Added inferDataTypeFromValueSet helper function to extract datatype from ObjectLiteral values
- Updated ShexJContextVisitor to infer @type for predicates with typed literal value sets
- Updated ShexJTypingTransformer to generate correct TypeScript types for typed literal values
- Added unit tests for inferDataTypeFromValueSet
- Added booleanValueSet test case demonstrating the feature

Co-authored-by: jeswr <63333554+jeswr@users.noreply.github.com>
- Use Set for tracking added primitive types in TypeScript generation
- Add robust isObjectLiteral type guard function
- Improve null checks for value detection

Co-authored-by: jeswr <63333554+jeswr@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix inference of dataType from value set" to "Infer dataType from value set with typed literals" Jan 3, 2026
Copilot AI requested a review from jeswr January 3, 2026 13:50
@jeswr jeswr marked this pull request as ready for review January 3, 2026 14:04
Copilot AI review requested due to automatic review settings January 3, 2026 14:04
@jeswr jeswr marked this pull request as draft January 3, 2026 14:04
Copilot AI left a comment

Pull request overview

This PR adds datatype inference for ShEx value sets containing typed literals (e.g., "true"^^xsd:boolean), fixing an issue where the JSON-LD context and TypeScript types did not properly infer datatypes from such value sets. Previously, this resulted in empty TypeScript types and missing @type annotations in the JSON-LD context.

Key Changes

  • Introduced a helper function (inferDataTypeFromValueSet) to extract common datatypes from value sets where all ObjectLiteral values share the same type
  • Modified context generation to add @type annotations when typed literal value sets are detected
  • Enhanced TypeScript generation to produce correct primitive types (boolean, number, string) for ObjectLiteral values
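The datatype-to-primitive mapping mentioned above could look something like the sketch below. It returns plain type names rather than dts-dom nodes to stay self-contained, and the particular set of XSD datatypes covered is an assumption; the PR's `datatypeToDomType` may recognise a different list.

```typescript
const XSD = "http://www.w3.org/2001/XMLSchema#";

type PrimitiveName = "boolean" | "number" | "string";

// Assumed mapping from XSD datatype IRIs to TypeScript primitive names.
const PRIMITIVE_BY_DATATYPE = new Map<string, PrimitiveName>([
  [XSD + "boolean", "boolean"],
  [XSD + "integer", "number"],
  [XSD + "decimal", "number"],
  [XSD + "double", "number"],
  [XSD + "float", "number"],
  [XSD + "string", "string"],
]);

// Returns the primitive name for a datatype IRI, or undefined when the
// datatype is not recognised (the caller decides on a fallback).
function datatypeToPrimitive(datatype: string): PrimitiveName | undefined {
  return PRIMITIVE_BY_DATATYPE.get(datatype);
}
```

Leaving the unrecognised case as `undefined` makes the fallback decision explicit at the call site.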

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/schema-converter-shex/src/context/util/inferDataTypeFromValueSet.ts | New helper function that infers a common datatype from value sets with typed ObjectLiterals |
| packages/schema-converter-shex/test/inferDataTypeFromValueSet.test.ts | Unit tests for the new helper function covering various edge cases |
| packages/schema-converter-shex/test/testData/booleanValueSet.ts | Integration test case demonstrating boolean value set inference |
| packages/schema-converter-shex/test/testData/testData.ts | Imports and registers the new boolean value set test case |
| packages/schema-converter-shex/src/context/ShexJContextVisitor.ts | Uses the new helper to add @type to predicates with inferred datatypes |
| packages/schema-converter-shex/src/typing/ShexJTypingTransformer.ts | Refactored datatype-to-TypeScript mapping into reusable datatypeToDomType function; handles ObjectLiteral values in value sets |
| packages/schema-converter-shex/src/context/JsonLdContextBuilder.ts | Minor formatting fixes (semicolon and quotes) |
| packages/react/src/methods/useDataset.ts | Minor formatting fixes (imports and trailing semicolons) |
| packages/subscribable-dataset/src/TransactionDatasetFactory.ts | Minor formatting fix (multi-line import) |


Comment on lines +365 to +367
    if (domType && !addedPrimitiveTypes.has(domType)) {
      addedPrimitiveTypes.add(domType);
      valuesUnion.members.push(domType);
Copilot AI Jan 3, 2026

The logic in the NodeConstraint transformer for handling value sets has a potential issue. When an ObjectLiteral has a type that is recognized by datatypeToDomType, it correctly adds the primitive type to the union. However, if datatypeToDomType returns undefined (for an unrecognized datatype), the code silently ignores that value and doesn't add anything to the union.

This means that if you have a value set with an unrecognized datatype like "value"^^custom:type, nothing will be added to the TypeScript type for that value, which could lead to incomplete type definitions. Consider adding a fallback to treat unrecognized datatypes as strings, similar to the behavior when an ObjectLiteral has no type specified (lines 370-375).

Suggested change
-   if (domType && !addedPrimitiveTypes.has(domType)) {
-     addedPrimitiveTypes.add(domType);
-     valuesUnion.members.push(domType);
+   // If datatype is unrecognized, fall back to string (same as no datatype)
+   const effectiveType = domType ?? dom.type.string;
+   if (!addedPrimitiveTypes.has(effectiveType)) {
+     addedPrimitiveTypes.add(effectiveType);
+     valuesUnion.members.push(effectiveType);
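
The fallback proposed in this comment can be tried out in isolation with the sketch below. It swaps dts-dom types for plain type names so it runs on its own; `collectPrimitiveMembers` and the `lookup` parameter are hypothetical names introduced for illustration, not identifiers from the PR.

```typescript
type PrimitiveName = "boolean" | "number" | "string";

// Collects the distinct primitive types for a list of literal datatypes.
// Untyped literals and unrecognised datatypes both fall back to "string".
function collectPrimitiveMembers(
  datatypes: Array<string | undefined>,
  lookup: (datatype: string) => PrimitiveName | undefined,
): PrimitiveName[] {
  const added = new Set<PrimitiveName>();
  const members: PrimitiveName[] = [];
  for (const datatype of datatypes) {
    const effective: PrimitiveName =
      (datatype ? lookup(datatype) : undefined) ?? "string";
    if (!added.has(effective)) {
      added.add(effective);
      members.push(effective);
    }
  }
  return members;
}

// Hypothetical lookup that only recognises xsd:boolean.
const XSD_BOOLEAN = "http://www.w3.org/2001/XMLSchema#boolean";
const lookupBooleanOnly = (datatype: string): PrimitiveName | undefined =>
  datatype === XSD_BOOLEAN ? "boolean" : undefined;

const members = collectPrimitiveMembers(
  [XSD_BOOLEAN, XSD_BOOLEAN, "http://example.com/custom"],
  lookupBooleanOnly,
);
```

With this fallback, a value such as `"value"^^custom:type` contributes `string` to the union instead of being dropped, and the `Set` keeps each primitive from being added twice.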

Comment on lines +378 to +379
    // If there's only one type in the union, return that type directly
    if (valuesUnion.members.length === 1) {
Copilot AI Jan 3, 2026

The code doesn't handle the case where valuesUnion.members is empty. If a value set contains only unrecognized value types (e.g., only IriStemRange, LiteralStemRange, etc. that aren't currently handled), the union will have zero members. This would result in returning an empty union type, which may cause TypeScript errors. Consider adding a check: if valuesUnion.members.length === 0, return a default type like dom.type.undefined or dom.type.string to ensure valid TypeScript is always generated.

Suggested change
-   // If there's only one type in the union, return that type directly
-   if (valuesUnion.members.length === 1) {
+   // If no recognized values were added, fall back to undefined
+   if (valuesUnion.members.length === 0) {
+     return dom.type.undefined;
+   } else if (valuesUnion.members.length === 1) {
+     // If there's only one type in the union, return that type directly
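
The three cases discussed in this comment (no members, one member, several members) can be sketched as below. The example renders type names as strings instead of building dts-dom nodes, and `renderUnion` is a hypothetical helper; the choice of `undefined` as the default fallback follows the suggestion above rather than confirmed behaviour of the merged code.

```typescript
// Renders a list of type names as a TypeScript type expression.
// An empty list yields the fallback so the output is never blank,
// which is what produced `deactivated?: ;` in the original issue.
function renderUnion(members: string[], fallback: string = "undefined"): string {
  if (members.length === 0) {
    return fallback;
  }
  if (members.length === 1) {
    // A single member needs no union syntax.
    return members[0];
  }
  return members.join(" | ");
}

const emptyCase = renderUnion([]);
const singleCase = renderUnion(["boolean"]);
const unionCase = renderUnion(["boolean", "string"]);
```

Whichever fallback is chosen, the point is that the generator always emits some type after the colon, so Prettier never receives unparseable TypeScript.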

Development

Successfully merging this pull request may close these issues.

Infer dataType from value set

2 participants