PolyCore is a Python-based tool for core genome analysis in polyploid organisms.
It loads reference and sample FASTA files, collapses identical sequences, filters by genome fraction and core fraction, and produces:
- Core / Full alignment FASTA files
- Pairwise distance matrices (wide and long formats)
- Per-sample summary table
- Progressive core fraction plot (HTML)
Features
- Handles haploid and polyploid genomes (auto-detects ploidy if not specified)
- Soft-core / progressive core fraction calculation
- Collapsing and re-expansion of identical sequences
- Distance matrices with efficient chunking (auto memory-aware)
- Output in CSV, FASTA, and VCF formats
- Interactive visualization with Plotly
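The collapsing and re-expansion feature can be pictured with a minimal Python sketch. This is an illustration only, not PolyCore's actual code: the function names `collapse_identical` and `expand_distances`, and the use of a plain Hamming distance, are assumptions made for this example. The idea is that distances are computed once per unique sequence and then copied to every sample that shares it.

```python
from collections import defaultdict


def collapse_identical(seqs):
    """Group samples by identical sequence (hypothetical helper).

    seqs: dict mapping sample name -> aligned sequence string.
    Returns (unique, members):
      unique  maps a representative sample name to its sequence,
      members maps each representative to all samples sharing that sequence.
    """
    by_seq = defaultdict(list)
    for name, seq in seqs.items():
        by_seq[seq].append(name)
    unique = {names[0]: seq for seq, names in by_seq.items()}
    members = {names[0]: names for names in by_seq.values()}
    return unique, members


def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def expand_distances(unique, members):
    """Compute distances on representatives, then copy them to all members."""
    reps = list(unique)
    dist = {}
    for i, r1 in enumerate(reps):
        for r2 in reps[i:]:
            d = 0 if r1 == r2 else hamming(unique[r1], unique[r2])
            for s1 in members[r1]:
                for s2 in members[r2]:
                    dist[(s1, s2)] = dist[(s2, s1)] = d
    return dist


# Toy input: s1 and s2 are identical, so only two sequences are compared.
seqs = {"s1": "ACGT", "s2": "ACGT", "s3": "ACGA"}
unique, members = collapse_identical(seqs)
dist = expand_distances(unique, members)
```

With many duplicate sequences, the number of pairwise comparisons drops from the square of the sample count to the square of the unique-sequence count, which is presumably the motivation for collapsing before computing distances.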
Installation
Pull the Docker image:
docker pull public.ecr.aws/o8h2f0o1/polycore:1.0.0
Or install from source:
git clone https://github.com/WA-DOH/polycore.git
cd polycore
pip install -e .
PolyCore requires Python 3.10+. Dependencies (numpy, screed, psutil, plotly) are installed automatically.
Run PolyCore from the command line:
polycore \
--progressive \
--ref reference.fasta \
sample1.fasta sample2.fasta
Key options:
- --min-gf: Minimum genome fraction per sample (default: 0.9)
- --min-cf: Minimum fraction of population required per site (default: 0.95)
- --ploidy: Force ploidy (otherwise auto-detected)
- --progressive: Enable soft-core (progressive) calculation
- --split: Treat each contig in each assembly as a separate sample
For full options:
polycore --help
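To make the two thresholds concrete, here is a minimal Python sketch of one plausible reading of --min-gf and --min-cf. It is not PolyCore's implementation: the function name `filter_alignment`, the choice of `-` and `N` as uncalled characters, and the order of the two steps are all assumptions made for illustration.

```python
import numpy as np

# Assumption: these characters mark positions with no called base.
UNCALLED = ("-", "N")


def filter_alignment(aln, min_gf=0.9, min_cf=0.95):
    """Apply a per-sample and a per-site threshold (hypothetical helper).

    aln: dict mapping sample name -> sequence aligned to the reference
         (all sequences must have the same length).
    Returns (kept_names, core_sites), where core_sites holds 0-based indices.
    """
    names = list(aln)
    mat = np.array([list(aln[n].upper()) for n in names])
    called = ~np.isin(mat, UNCALLED)  # True where a base is present

    # Step 1: drop samples whose genome fraction is below min_gf.
    genome_fraction = called.mean(axis=1)
    keep = genome_fraction >= min_gf
    kept_names = [n for n, k in zip(names, keep) if k]
    if not kept_names:
        return [], np.array([], dtype=int)

    # Step 2: among the kept samples, keep sites called in >= min_cf of them.
    site_fraction = called[keep].mean(axis=0)
    core_sites = np.flatnonzero(site_fraction >= min_cf)
    return kept_names, core_sites


# Toy alignment: sample "c" is mostly uncalled, and site 6 is missing in "b".
aln = {
    "a": "ACGTACGTAC",
    "b": "ACGTAC-TAC",
    "c": "NNNNNNGTAC",
}
kept, core = filter_alignment(aln, min_gf=0.8, min_cf=0.95)
```

In this toy input, sample "c" fails the genome-fraction threshold, and site 6 then fails the per-site threshold because only one of the two remaining samples has a base there. Lowering `min_cf` admits such partially covered sites, which is the general idea behind a soft core.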
Outputs
- core.aln: Core alignment (variants only, FASTA)
- core.full.aln: Full core alignment (FASTA)
- full.csv: Per-site summary for all passing samples
- dist_wide.csv: Pairwise distance matrix (wide)
- dist_long.csv: Pairwise distance matrix (long/tidy)
- summary.csv: Per-sample statistics
- core_fraction_plot.html: Interactive visualization of soft-core genome fraction
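The difference between the wide and long distance formats can be shown with a short Python sketch. The exact column names and layout of dist_wide.csv and dist_long.csv are not specified here, so the header used below (`sample` followed by one column per sample), the toy distances, and the helper name `wide_to_long` are assumptions for illustration only.

```python
import csv
import io

# Hypothetical wide-format matrix: one row and one column per sample.
wide_text = """sample,s1,s2,s3
s1,0,2,5
s2,2,0,4
s3,5,4,0
"""


def wide_to_long(wide_csv_text):
    """Turn a square wide matrix into (sample_a, sample_b, distance) rows."""
    rows = list(csv.reader(io.StringIO(wide_csv_text)))
    column_names = rows[0][1:]  # skip the row-label column
    long_rows = []
    for row in rows[1:]:
        sample_a = row[0]
        for sample_b, value in zip(column_names, row[1:]):
            long_rows.append((sample_a, sample_b, int(value)))
    return long_rows


long_rows = wide_to_long(wide_text)
```

The wide form is convenient for reading by eye or loading as a matrix, while the long (tidy) form has one pair per row and is easier to filter, join, or plot.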