Releases · Ensembl/plant-scripts

12 Sep 14:39

59b392e

20250904 Latest

Main changes in GET_PANGENES:

06032025: get_pangenes.pl: sort & concat alignment results using a tempfile that lists the filenames to sort, to avoid "Argument list too long"
24032025: BED matrix produced by _cluster_analysis.pl is 0-based
25032025: added -i to match_cluster.pl to control sequence identity of matches
25032025: added -F to match_cluster.pl to produce a FASTA file with sequence index that can be exported as a gene-based pangenome for mapping, with <global pangenome positions> estimated from the reference genome
25032025: updated Makefiles and documentation
08042025: match_cluster.pl TSV output updated, tested with barley
08042025: added pangenome coords example to documentation
14052025: added POCS to troubleshooting to explain small cores
19052025: check_quality.pl does not assume GFF files are available
27052025: _cluster_analysis.pl -t now affects pangene set growth simulation
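
The 0-based BED output noted under 24032025 matters when comparing against GFF input, which uses 1-based, inclusive coordinates. The following is a minimal Python sketch of the usual conversion, assuming the standard BED convention (0-based start, exclusive end). The function name `gff_to_bed_interval` is hypothetical and is not part of _cluster_analysis.pl.

```python
from typing import Tuple


def gff_to_bed_interval(gff_start: int, gff_end: int) -> Tuple[int, int]:
    """Convert a 1-based, inclusive GFF interval to a 0-based, half-open BED interval.

    Hypothetical helper for illustration only; not taken from the plant-scripts code.
    """
    if gff_start < 1 or gff_end < gff_start:
        raise ValueError("expected 1 <= gff_start <= gff_end")
    # Only the start shifts by one; the end keeps its value because BED ends are exclusive.
    return gff_start - 1, gff_end


# A feature covering bases 11..20 in GFF keeps its length of 10 in BED.
bed_start, bed_end = gff_to_bed_interval(11, 20)
print(bed_start, bed_end, bed_end - bed_start)
```

With this convention the feature length is simply end minus start, which is one reason BED-style intervals are convenient for matrix output.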

Plus changes to phylogenomics scripts described in #16

Finally, the tag format was changed to 1.3 for conda compatibility

23 Jan 10:34

brunocontrerasmoreira

20250123

732b4fe

20250123

This release adapts 04102024 for bioconda and adopts ISO date formats for version numbers.
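
As an illustration of the version-number change, earlier releases such as 04102024 read as DDMMYYYY, while 20250123 reads as YYYYMMDD. The sketch below, which assumes exactly those two layouts, converts an old-style string to the new one using Python's standard library. The function name is hypothetical.

```python
from datetime import datetime


def ddmmyyyy_to_yyyymmdd(version: str) -> str:
    """Convert a DDMMYYYY version string to YYYYMMDD (ISO 8601 basic format).

    Hypothetical helper; assumes the input really is DDMMYYYY.
    """
    return datetime.strptime(version, "%d%m%Y").strftime("%Y%m%d")


old_versions = ["04102024", "11012024", "15112023"]
new_versions = [ddmmyyyy_to_yyyymmdd(v) for v in old_versions]

# Year-first strings sort chronologically with a plain string sort.
print(sorted(new_versions))
```

A practical benefit of the year-first layout is that plain string sorting matches chronological order, which is not true of DDMMYYYY.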

25 Oct 07:44

brunocontrerasmoreira

04102024

bbc0669

04102024

This release ships with get_pangenes.pl version 04102024.

Main changes are:
25092024: added section 'Example 6: estimation of haplotype diversity'
03102024: get_pangenes.pl expects min 95% sequence identity for WGA-based gene alignments, as in GET_HOMOLOGUES-EST, to help avoid diverged tandem copies
04102024: get_pangenes.pl now sets MAXDISTNEIGHBORS=2, so neighbor genes in a cluster cannot be more than 2 genes away
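
The neighbor-distance rule can be pictured with a small Python sketch. It is an illustration only and not the get_pangenes.pl implementation: it assumes genes are numbered by their order along a chromosome and that "genes away" means the difference between those ordinal positions. Only the value MAXDISTNEIGHBORS=2 comes from the notes above; the function name is hypothetical.

```python
MAXDISTNEIGHBORS = 2  # value stated in the release notes


def within_neighbor_distance(pos_a: int, pos_b: int,
                             max_dist: int = MAXDISTNEIGHBORS) -> bool:
    """Return True if two genes are at most max_dist ordinal positions apart.

    Assumption: pos_a and pos_b are gene indices along the same chromosome.
    """
    return abs(pos_a - pos_b) <= max_dist


# Gene 7 is two positions from gene 5, so it passes; gene 8 does not.
print(within_neighbor_distance(5, 7), within_neighbor_distance(5, 8))
```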

08 Feb 16:22

brunocontrerasmoreira

v1.2

df9cfde

11012024

This release ships with updates to GET_PANGENES: code changes since the publication of the manuscript, involving:

fixed bug in handling of minus-strand coords in sub query2ref_coords
sub _parseCIGARfeature now correctly handles 1bp CS-type SNPs when computing overlap with optional query coord
tested rename_pangenes.pl with MAGIC16 rice dataset; check AgBioData nomenclature rules at https://github.com/Ensembl/plant-scripts/blob/df9cfdef5e49e6f463a08e7ed8ec8a04556735ff/pangenes/rename_pangenes.pl#L5C48-L5C57 ; code to update a previous cluster set is not yet in place

16 Nov 07:31

brunocontrerasmoreira

v1.1b

58a3636

15112023

This release ships with updates to:

GET_PANGENES: code and documentation changes since the publication of the manuscript, involving improved handling of input GFF files and calculation of overlap coordinates from WGA segments in different strands.
REST-based recipes.

03 Apr 14:40

brunocontrerasmoreira

Apr2023

d6a187e

pangenes_benchmark

Pangene sets of Arabidopsis (ACK), rice, wheat and barley datasets produced while benchmarking get_pangenes as described at https://doi.org/10.1186/s13059-023-03071-z and https://www.biorxiv.org/content/10.1101/2023.01.03.520531v2

The HOWTO* files contain the actual commands required to produce these results with the input FASTA & GFF files (32GB), which should first be downloaded from

The source code was archived as but has been updated since.

04 Jan 20:12

brunocontrerasmoreira

v0.4

8e136b1

test_rice

Toy dataset to test the scripts for pan-gene analysis.

16 Feb 12:23

brunocontrerasmoreira

v0.3

1133c16

nrTEplants

Release 0.3 (Jun2020) of the nrTEplants library of plant transposable elements, which minimizes overlap with sequences containing protein domains known to be part of NLR genes. This sequence set was computed by combining TREP, SINEbase, REdat, RepetDB, EDTArice, EDTAmaize, SoyBaseTE, TAIR10TE, SunflowerTE, MelonTE, RosaTE and SUNREP and then obtaining a non-redundant collection with GET_HOMOLOGUES-EST.

Check the code and documentation at https://github.com/Ensembl/plant_tools/tree/master/bench/repeat_libs

Citation: Contreras-Moreira,B., Filippi,C.V., Naamati,G., Girón,C.G., Allen,J.E. and Flicek,P. (2021) Efficient masking of plant genomes by combining kmer counting and curated repeats Genomics. Plant Genome https://doi.org/10.1002/tpg2.20143

23 Oct 13:11

manuelcarbajo

v0.1.2

9de5841

23102020

This release was created to obtain a DOI from Zenodo
