feat: python client #45 #55

Closed
jdpearce4 wants to merge 8 commits into main from jpearce-python-client

@jdpearce4
Collaborator

Summary

Introduce a first-class Python client (TranscriptFormerClient) for inference and artifact/data downloads. The client mirrors the CLI configuration behavior while providing a simple, programmatic API that returns an in-memory AnnData object.

Key Features

  • In-memory inference: inference(...) returns an anndata.AnnData without writing to disk.
  • Config parity with CLI: Builds a Hydra-compatible config by:
      • Loading the same CLI YAML defaults.
      • Applying dataclass overrides from kwargs (InferenceConfig, DataConfig).
      • Merging with the checkpoint config.json via the same utility used by the CLI.
  • Convenience downloads:
      • download_model(...) for checkpoints and embeddings.
      • download_data(...) for CellxGene datasets by species.
      • download_dataset(...) for curated sources (e.g., Tabula Sapiens, Bgee).
  • Logging control: Optional log_level argument to run quietly or verbosely without affecting global logging.
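The three configuration steps listed under "Config parity with CLI" amount to layering dictionaries. The snippet below is a minimal, standard-library sketch of that idea and not the client's actual code: `deep_merge`, every key, and every value are hypothetical placeholders, and the real client is described as reusing the CLI's own merge utility. Which layer wins on a conflict is also an assumption here (later layers override earlier ones).

```python
import copy
from typing import Any, Mapping


def deep_merge(base: Mapping[str, Any], override: Mapping[str, Any]) -> dict[str, Any]:
    """Return a new dict in which `override` wins over `base`, recursing into nested mappings."""
    merged = copy.deepcopy(dict(base))
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            # Both sides are mappings: merge them key by key instead of replacing wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical stand-ins for the three layers; real keys come from the CLI YAML and config.json.
cli_defaults = {
    "model": {
        "inference_config": {"batch_size": 8, "output_keys": ["embeddings"]},
        "data_config": {"use_raw": None},
    }
}
kwarg_overrides = {"model": {"inference_config": {"batch_size": 32}}}
checkpoint_config = {"model": {"model_config": {"placeholder_param": 1}}}

# Assumed order: defaults first, then kwargs overrides, then the checkpoint config.
config = deep_merge(deep_merge(cli_defaults, kwarg_overrides), checkpoint_config)
```

Because `deep_merge` copies its inputs, the defaults stay untouched and can be reused across calls.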

API Surface

  • TranscriptFormerClient.inference(data_file, checkpoint_path, **kwargs) -> anndata.AnnData
      • Accepts most InferenceConfig and DataConfig fields as kwargs (e.g., batch_size, output_keys, gene_col_name, use_raw, use_oom_dataloader, n_data_workers).
      • Returns a single AnnData with obsm/uns populated per output_keys.
  • TranscriptFormerClient.download_model(model, checkpoint_dir=...) -> None
  • TranscriptFormerClient.download_data(species=[...], output_dir=..., ...) -> int
  • TranscriptFormerClient.download_dataset(dataset, ...) -> anndata.AnnData | None
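One way flat kwargs could be routed onto the two config dataclasses is sketched below. This is an illustration under stated assumptions, not the client's implementation: the dataclasses are simplified stand-ins, their defaults are invented, the assignment of each field to InferenceConfig or DataConfig is a guess, and `split_kwargs` is a hypothetical helper. Whether the real client rejects unknown kwargs is not stated in this PR; the sketch raises `TypeError` for them.

```python
from dataclasses import dataclass, field, fields, replace
from typing import Any, Optional


@dataclass
class InferenceConfig:  # hypothetical stand-in; field placement and defaults are assumptions
    batch_size: int = 8
    output_keys: list[str] = field(default_factory=lambda: ["embeddings"])


@dataclass
class DataConfig:  # hypothetical stand-in; field placement and defaults are assumptions
    gene_col_name: str = "ensembl_id"
    use_raw: Optional[bool] = None
    n_data_workers: int = 0


def split_kwargs(kwargs: dict[str, Any]) -> tuple[InferenceConfig, DataConfig]:
    """Route each kwarg to the dataclass that declares a field with that name."""
    inference_names = {f.name for f in fields(InferenceConfig)}
    data_names = {f.name for f in fields(DataConfig)}

    unknown = set(kwargs) - inference_names - data_names
    if unknown:
        raise TypeError(f"Unknown config field(s): {sorted(unknown)}")

    # Start from defaults and override only the fields the caller supplied.
    inference = replace(
        InferenceConfig(), **{k: v for k, v in kwargs.items() if k in inference_names}
    )
    data = replace(DataConfig(), **{k: v for k, v in kwargs.items() if k in data_names})
    return inference, data


inference_cfg, data_cfg = split_kwargs({"batch_size": 16, "use_raw": True})
```

Deriving the accepted names from `dataclasses.fields` keeps the kwargs surface in sync with the dataclasses, so a field added to either config would become a valid kwarg without extra wiring.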

jdpearce4 and others added 8 commits July 22, 2025 17:21
…ce and artifact downloading

- Deleted `download_artifacts.py` and `inference.py` scripts as they are now replaced by CLI commands.
- Updated CLI commands to improve user experience and added progress tracking for downloads and extractions.
- Enhanced inference configuration to support backward compatibility for checkpoint paths.
- Updated documentation in the inference configuration YAML file to clarify model types and embedding options.
@jdpearce4 jdpearce4 closed this Aug 21, 2025
@jdpearce4 jdpearce4 deleted the jpearce-python-client branch August 25, 2025 23:32