taffish/agat

agat

TAFFISH wrapper for AGAT, Another GTF/GFF Analysis Toolkit. AGAT is a suite of Perl tools for standardizing, fixing, converting, filtering, editing, extracting from, and summarizing genome annotation files in GFF/GTF-related formats.

Package identity:

  • name: agat
  • command: taf-agat
  • kind: tool
  • TAFFISH version: 1.7.0-r1
  • container image: ghcr.io/taffish/agat:1.7.0-r1
  • upstream: NBISweden/AGAT tag v1.7.0
  • runtime version check: the output of agat --version contains 1.7.0
  • TAFFISH package license: Apache-2.0
  • upstream license: GPL-3.0
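
The runtime version check listed above can be sketched as a small shell helper. This is an illustration, not the package's actual check: the function name check_version is hypothetical, and it only tests whether the expected version string appears anywhere in a command's output.

```shell
# Return 0 if the output of a command contains the expected version string.
# Usage: check_version EXPECTED COMMAND [ARGS...]
# The ( ) function body runs in a subshell, so its variables stay local.
check_version() (
  expected="$1"
  shift
  # Capture stdout and stderr, since tools differ in where they print versions.
  output="$("$@" 2>&1 || true)"
  case "$output" in
    *"$expected"*) return 0 ;;
    *)
      printf 'expected version %s, got: %s\n' "$expected" "$output" >&2
      return 1
      ;;
  esac
)
```

Assuming the version is reachable through command mode in the same way as agat --tools, a call could look like check_version 1.7.0 taf-agat agat --version.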

What Is Packaged

This app packages the Bioconda agat=1.7.0 distribution in a TAFFISH container. Bioconda is used because AGAT is a large Perl toolkit with many runtime modules and helper commands; the Bioconda package pins and installs those Perl/R/samtools dependencies as a coherent runtime.

The default command is:

taf-agat -- --help

Because this is a normal TAFFISH tool app with command mode enabled, all installed AGAT scripts can also be called directly inside the same container:

taf-agat agat --tools
taf-agat agat_convert_sp_gxf2gxf.pl --help
taf-agat agat_sp_statistics.pl --help
taf-agat agat_sp_extract_sequences.pl --help

Common Workflows

List AGAT tools:

taf-agat agat --tools

Show the default AGAT command help:

taf-agat -- --help

Standardize or repair a GFF/GTF file into sorted GFF3:

taf-agat agat_convert_sp_gxf2gxf.pl \
  --gff annotation.gff3 \
  --output annotation.fixed.gff3

Convert GFF3 to GTF:

taf-agat agat_convert_sp_gff2gtf.pl \
  --gff annotation.fixed.gff3 \
  --output annotation.gtf

Convert GFF/GTF to BED:

taf-agat agat_convert_sp_gff2bed.pl \
  --gff annotation.fixed.gff3 \
  --output annotation.bed

Generate annotation statistics:

taf-agat agat_sp_statistics.pl \
  --gff annotation.fixed.gff3 \
  --output annotation.statistics.txt

Extract sequences from an annotation and genome FASTA:

taf-agat agat_sp_extract_sequences.pl \
  --gff annotation.fixed.gff3 \
  --fasta genome.fa \
  --type cds \
  --output cds.fa
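
The standardization, conversion, and statistics steps above can be chained so that every later step reads the repaired file. The following is a minimal sketch under stated assumptions: the function name agat_pipeline and the TAF_AGAT override variable are hypothetical (TAF_AGAT is not a TAFFISH feature; it only allows a dry run), and the option names are the ones shown in the commands above.

```shell
# Chain the workflows above: standardize once, then convert and summarize.
# TAF_AGAT is a hypothetical override so the function can be dry-run,
# for example with TAF_AGAT=echo. It defaults to the real taf-agat command.
agat_pipeline() (
  input="$1"    # annotation file to standardize
  prefix="$2"   # prefix shared by all output files
  run="${TAF_AGAT:-taf-agat}"
  fixed="$prefix.fixed.gff3"

  # Each step runs only if the previous one succeeded.
  "$run" agat_convert_sp_gxf2gxf.pl --gff "$input" --output "$fixed" &&
  "$run" agat_convert_sp_gff2gtf.pl --gff "$fixed" --output "$prefix.gtf" &&
  "$run" agat_convert_sp_gff2bed.pl --gff "$fixed" --output "$prefix.bed" &&
  "$run" agat_sp_statistics.pl --gff "$fixed" --output "$prefix.statistics.txt"
)
```

Running agat_pipeline annotation.gff3 annotation would then produce annotation.fixed.gff3, annotation.gtf, annotation.bed, and annotation.statistics.txt. Setting TAF_AGAT=echo first prints the four commands instead of executing them.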

AGAT Script Families

AGAT contains many agat_* commands. Upstream groups the most important tools roughly as:

  • agat_convert_*: convert between annotation-related formats.
  • agat_convert_sp_*: convert using AGAT's full parser/standardizer.
  • agat_sp_*: full-parser tools that load the annotation model in memory for robust editing, filtering, statistics, extraction, and comparison.
  • agat_sq_*: sequential tools that stream line by line and use less memory.
  • agat config: expose or adjust AGAT configuration such as parser attributes.

Use taf-agat agat --tools for the complete list installed in this release.
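
The naming convention above can be expressed as a prefix match. This is only a sketch of the grouping described in this README: the function name agat_family and the labels it prints are illustrative, not something AGAT provides.

```shell
# Map an AGAT script name to the family described above, based on its prefix.
# Order matters: agat_convert_sp_* must be tested before agat_convert_*.
agat_family() {
  case "$1" in
    agat_convert_sp_*) echo "convert (full parser)" ;;
    agat_convert_*)    echo "convert" ;;
    agat_sp_*)         echo "full parser" ;;
    agat_sq_*)         echo "sequential" ;;
    *)                 echo "other" ;;
  esac
}
```

For example, agat_family agat_sp_statistics.pl prints "full parser", which signals that the script loads the whole annotation into memory.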

Inputs And Outputs

AGAT mainly works with GFF, GFF3, GTF, BED, FASTA, and related annotation files. Most scripts take the annotation input through --gff, -g, or -f and write results to --output, --out, or --outfile. Exact option names vary by script, so check the help of the individual script:

taf-agat agat_convert_sp_gxf2gxf.pl --help

AGAT's _sp_ tools parse the complete annotation into memory. This makes them robust for difficult GFF/GTF files, but memory use scales with annotation size. For very large annotations, check whether an _sq_ sequential tool can perform the same task.
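
Because memory use grows with annotation size, a quick feature count can help gauge a file before choosing between an _sp_ and an _sq_ tool. The helper below is a rough sketch: the name gff_feature_count is hypothetical, it uses only awk rather than AGAT, and it does not imply any particular size threshold.

```shell
# Count feature lines in a GFF/GTF file. Blank lines and '#' comment lines
# are skipped, and counting stops at an embedded '##FASTA' section.
gff_feature_count() {
  awk '
    /^##FASTA/         { exit }   # END still runs after exit
    /^#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
    { n++ }
    END { print n + 0 }           # n + 0 prints 0 for an empty file
  ' "$1"
}
```

A call such as gff_feature_count annotation.gff3 prints a single number that can be compared across annotations.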

Runtime Dependencies

The Bioconda package brings AGAT's Perl runtime stack, including BioPerl, YAML, Moose, Parallel::ForkManager, File::ShareDir, R / Statistics::R support, and samtools. The smoke tests verify the main Perl module, representative commands, Rscript, and samtools.

The resulting image is comparatively large, about 1.8 GB in local amd64/arm64 test builds. That is expected for this r1 packaging strategy: keeping the Bioconda runtime preserves AGAT's broad upstream command surface, including scripts that rely on R / Statistics::R and bundled BioPerl dependencies.

This app does not bundle external genome databases or reference files. Scripts that require a genome FASTA, ontology file, BUSCO result, or another annotation must be given those inputs explicitly by the user.
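
Since every input must be supplied explicitly, a pre-flight check can catch a missing genome FASTA or annotation before a long run starts. This is a minimal sketch; the function name require_inputs is hypothetical and not part of TAFFISH or AGAT.

```shell
# Check that every listed input is a regular, non-empty file.
# Prints each problem to stderr and returns 1 if any input fails the check.
# The ( ) function body runs in a subshell, so its variables stay local.
require_inputs() (
  status=0
  for f in "$@"; do
    if [ ! -f "$f" ] || [ ! -s "$f" ]; then
      printf 'missing or empty input: %s\n' "$f" >&2
      status=1
    fi
  done
  return "$status"
)
```

It could guard a sequence extraction like this: require_inputs annotation.fixed.gff3 genome.fa && taf-agat agat_sp_extract_sequences.pl --gff annotation.fixed.gff3 --fasta genome.fa --type cds --output cds.fa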

Platform Notes

The image is declared for native linux/amd64 and linux/arm64. AGAT itself is distributed by Bioconda as a noarch package; compiled dependencies are resolved by Bioconda for the target Linux architecture.
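
To see whether a Linux host matches one of the declared platforms, the machine name reported by uname -m can be mapped to a platform string. The helper name platform_for is hypothetical, and the mapping covers only the two platforms named above.

```shell
# Map a machine name (as printed by `uname -m`) to an image platform string.
# Returns 1 for machine names outside the two declared platforms.
platform_for() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *)
      echo "unsupported machine: $1" >&2
      return 1
      ;;
  esac
}
```

On a host, platform_for "$(uname -m)" prints the matching platform or reports that the machine is not covered.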

Smoke Coverage

The smoke tests check:

  • wrapper package and upstream runtime version binding
  • agat --help and agat --tools
  • representative helper command help
  • Perl module imports for AGAT and core dependencies
  • a minimal GFF3 standardization path with agat_convert_sp_gxf2gxf.pl
  • GFF3 to GTF and BED conversion
  • annotation statistics
  • presence of Rscript and samtools runtime dependencies

Smoke tests are packaging and integration checks. They do not validate every AGAT script or every possible annotation edge case.
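
For a quick local check along the same lines, a tiny GFF3 file is enough to exercise the standardization path. The fixture below is an illustration with made-up coordinates and IDs, not the file used by the package's smoke tests; the function name write_minimal_gff3 is hypothetical.

```shell
# Write a minimal single-gene GFF3 (gene > mRNA > exon + CDS) to the given path.
# Columns are tab-separated, as GFF3 requires. Coordinates and IDs are made up.
write_minimal_gff3() {
  {
    printf '##gff-version 3\n'
    printf 'chr1\texample\tgene\t1\t300\t.\t+\t.\tID=gene1\n'
    printf 'chr1\texample\tmRNA\t1\t300\t.\t+\t.\tID=mrna1;Parent=gene1\n'
    printf 'chr1\texample\texon\t1\t300\t.\t+\t.\tID=exon1;Parent=mrna1\n'
    printf 'chr1\texample\tCDS\t1\t300\t.\t+\t0\tID=cds1;Parent=mrna1\n'
  } > "$1"
}
```

After write_minimal_gff3 test.gff3, the file can be passed to taf-agat agat_convert_sp_gxf2gxf.pl --gff test.gff3 --output test.fixed.gff3 to confirm that the container runs end to end.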

Citation

AGAT upstream asks users to cite:

Dainat J. 2022. Another Gtf/Gff Analysis Toolkit (AGAT): Resolve interoperability issues and accomplish more with your annotations. Plant and Animal Genome XXIX Conference. https://github.com/NBISweden/AGAT.

Zenodo DOI: https://doi.org/10.5281/zenodo.3552717

Sources
