Move jsonargparse to main dependencies, lazy-load sklearn/scipy in PubChem #153

Draft
Copilot wants to merge 2 commits into dev from copilot/add-jsonargparse-dependency

Conversation

Copilot AI commented Feb 17, 2026

Inference-only installations failed for two reasons: jsonargparse (required by the Lightning CLI) was missing, and the PubChem dataset module imported scikit-learn eagerly at import time.

Changes

Dependencies (pyproject.toml)

  • Move jsonargparse[signatures]>=4.17 from [dev] to main dependencies
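One way to confirm the move after installing the package is to inspect the installed metadata and see which requirements are unconditional and which still sit behind an extra. The following is a minimal sketch, not part of this PR: the helper names are hypothetical, and the distribution name `chebai` in the usage note is an assumption.

```python
import re
from importlib import metadata

# Requirements that belong to an extra carry a marker such as `extra == "dev"`.
_EXTRA_MARKER = re.compile(r"extra\s*==")


def split_requirements(requirements):
    """Split requirement strings into (main, optional) lists based on the extra marker."""
    main, optional = [], []
    for requirement in requirements:
        # Everything after the first ';' is the environment marker, if any.
        _, _, marker = requirement.partition(";")
        if _EXTRA_MARKER.search(marker):
            optional.append(requirement)
        else:
            main.append(requirement)
    return main, optional


def declared_requirements(distribution):
    """Return the declared requirements of an installed distribution, or [] if it is absent."""
    try:
        return metadata.requires(distribution) or []
    except metadata.PackageNotFoundError:
        return []
```

Calling `split_requirements(declared_requirements("chebai"))` in an inference-only environment should then list the jsonargparse requirement in the first returned list, provided the distribution is really published under that name.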

Lazy imports (chebai/preprocessing/datasets/pubchem.py)

  • Move sklearn.cluster.KMeans import into PubChemKMeans._build_clusters() and cluster_centers_superclustered property
  • Move sklearn.model_selection.train_test_split into PubChem.setup_processed() and PubChemTokens.setup_processed()
  • Move scipy.spatial import into PubChemKMeans._exclude_clusters()

This allows importing PubChem classes without requiring sklearn/scipy to be installed, deferring the dependency until dataset methods that actually use these libraries are called.
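The deferred-import pattern described above can be sketched as follows. This is an illustration only: `LazyClusterer` and `build_clusters` are hypothetical stand-ins, not the actual `PubChemKMeans` code, and the explicit error message is an addition that the PR does not claim to make.

```python
class LazyClusterer:
    """Hypothetical stand-in for a dataset class that needs sklearn only for clustering."""

    def __init__(self, n_clusters=2):
        # No sklearn import is needed to define or instantiate the class.
        self.n_clusters = n_clusters

    def build_clusters(self, points):
        # Deferred import: evaluated only when this method is called,
        # so importing the enclosing module never touches sklearn.
        try:
            from sklearn.cluster import KMeans
        except ImportError as exc:
            raise ImportError(
                "scikit-learn is required for clustering; install it to use this method."
            ) from exc
        return KMeans(n_clusters=self.n_clusters, n_init=10).fit(points)
```

The trade-off is that a missing dependency surfaces at call time rather than at import time, which is why a descriptive error message can be worth the extra lines.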

# Now works without sklearn installed
from chebai.preprocessing.datasets.pubchem import PubChem

# sklearn only required when calling setup_processed() or other split/clustering methods
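To check this behaviour in an environment where sklearn is installed, one option is to simulate its absence for the duration of a test. The sketch below uses only the standard library; `hide_modules` and `can_import` are hypothetical helpers and are not part of this PR.

```python
import importlib
import sys
from contextlib import contextmanager

_ABSENT = object()  # sentinel for "was not in sys.modules before hiding"


@contextmanager
def hide_modules(*names):
    """Temporarily make imports of the named packages fail, as if they were not installed."""
    # Include submodules that are already loaded, e.g. "sklearn.cluster".
    affected = set(names) | {
        key for key in sys.modules
        if any(key.startswith(name + ".") for name in names)
    }
    saved = {key: sys.modules.get(key, _ABSENT) for key in affected}
    try:
        for key in affected:
            # A None entry in sys.modules makes the import raise ImportError.
            sys.modules[key] = None
        yield
    finally:
        # Restore the previous state exactly.
        for key, module in saved.items():
            if module is _ABSENT:
                sys.modules.pop(key, None)
            else:
                sys.modules[key] = module


def can_import(module_name):
    """Return True if the module imports without raising ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True
```

Inside `with hide_modules("sklearn", "scipy"):`, a call to `can_import("chebai.preprocessing.datasets.pubchem")` would be expected to return True after this change. The check is only meaningful if that module has not already been imported in the same process, because a cached module is returned without re-running its imports.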
Original prompt

This section details the original issue you should resolve

<issue_title>Add jsonargparse to main dependencies, only use sklearn if needed</issue_title>
<issue_description>We had two issues when installing chebai for inference only:

  • missing jsonargparse caused some error in lightning
  • missing scikit-learn causes issues with the pubchem dataset

Both dependencies are included in the [dev] extra.

Todo

  • move jsonargparse to the regular dependencies
  • only load sklearn when a PubChem dataset is actually instantiated</issue_description>

Comments on the Issue (you are @copilot in this section)

@aditya0by0 Let's try assigning this to Copilot to check whether it can take on minor issues.

… sklearn/scipy

Co-authored-by: aditya0by0 <65857172+aditya0by0@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add jsonargparse to main dependencies and optimize sklearn usage" to "Move jsonargparse to main dependencies, lazy-load sklearn/scipy in PubChem" Feb 17, 2026
Copilot AI requested a review from aditya0by0 February 17, 2026 20:53
Development

Successfully merging this pull request may close these issues.

Add jsonargparse to main dependencies, only use sklearn if needed

2 participants