03. Offline Setup
This setup is crucial to successfully execute the metagenomics pipeline in an air-gapped system. It is assumed that you have completed the Install directions, including the creation and activation of your metag environment.
When running the Offline Setup, you may specify which workflow files and dependencies to download, or you may choose to download all of the workflow files and dependencies at once (see the Workflow Setup Options table below).
NOTE: The offline setup command must be executed from the metagenomics/workflows/ directory.
[user@localhost ~]$ source activate metag
(metag)[user@localhost ~]$ cd metagenomics/workflows
(metag)[user@localhost workflows]$ python download_offline_files.py --workflow {workflow_setup_options}

Workflow Setup Options

| Setup Option | Description |
|---|---|
| test_files | Downloads the Shakya subset 10 datasets |
| read_filtering | Copies the adapter file to the data directory, downloads the biocontainers needed for the read filtering workflow, and creates singularity images |
| assembly | Downloads biocontainers needed for the assembly workflow and creates singularity images |
| comparison | Downloads biocontainer needed for the metagenome comparison workflow and creates singularity image |
| taxonomic_classification | Downloads all databases and biocontainers needed for tools within the taxonomic classification workflow and creates singularity images |
| sourmash | Downloads only the sourmash databases and biocontainer; creates the sourmash singularity image |
| kaiju | Downloads only the kaiju database and biocontainer; creates the kaiju singularity image |
| functional_inference | Downloads the databases and biocontainers needed for the functional inference workflow and creates singularity images |
| all | Downloads all files and biocontainers needed for all workflows and creates all singularity images |
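
If you only need some of the options in the table, the setup can be scripted. The sketch below is a minimal, hypothetical wrapper (the function names `setup_command` and `run_offline_setup` and the `DRY_RUN` variable are not part of the pipeline). It assumes one invocation of `download_offline_files.py` per option, because this page does not say whether `--workflow` accepts several values at once. By default it only prints the commands so you can review them first.

```shell
# Hypothetical helper: print the offline-setup command for one option.
# Only the script name and the --workflow flag come from this page.
setup_command() {
    printf 'python download_offline_files.py --workflow %s\n' "$1"
}

# Hypothetical helper: check each option against the Workflow Setup Options
# table, then print (DRY_RUN=1, the default) or run (DRY_RUN=0) one setup
# command per option, stopping at the first failure.
run_offline_setup() {
    for option in "$@"; do
        case "$option" in
            test_files|read_filtering|assembly|comparison|taxonomic_classification|sourmash|kaiju|functional_inference|all)
                ;;
            *)
                echo "unknown setup option: $option" >&2
                return 1
                ;;
        esac
        if [ "${DRY_RUN:-1}" = "1" ]; then
            setup_command "$option"
        else
            python download_offline_files.py --workflow "$option" || return 1
        fi
    done
}
```

To actually download the files, run `DRY_RUN=0 run_offline_setup test_files read_filtering` from the metagenomics/workflows/ directory with the metag environment active.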
Once you have successfully finished the setup, you will have the files and images needed to proceed with executing each workflow offline.
To proceed to the Read Filtering workflow in an offline environment, you should have run the workflow setup with either 1) both the test_files and read_filtering options or 2) the all option.
IMPORTANT: If you did not download all of the workflow files and dependencies, please keep in mind that the workflows are executed in a specific order (i.e., read filtering -> assembly -> comparison -> taxonomic classification -> functional inference). It is recommended that users run the example dataset through the workflows in that order to learn how everything operates, and the workflows are described in that order throughout subsequent pages of this wiki. The subsequent wiki pages walk through each workflow step by step using the example dataset and default config files.
If you are ready to skip ahead to run your own samples, it is recommended that you review the Workflow Architecture page to decide if you would like to use the default config or a custom config to process your samples. The Workflow Architecture page also contains an example of a custom config file that you can copy, edit, and save with the name of your specific samples and parameters. One advantage of custom config files is that they can be uniquely named, which may be helpful for organizing and keeping a record of the analyses that you run.
You can skip ahead to processing your own samples by following these steps:
- Complete the installation and offline setup for v1.1
- Check to make sure you have all the container images you need in the metagenomics/container_images directory
- Check to make sure you have the Trimmomatic adapter file and all of the taxonomic and functional databases you need in the metagenomics/workflows/data directory
- Move your input files to the metagenomics/workflows/data directory
- Set up your default or custom config file as needed (i.e., change file names and parameters throughout the config to process your sample)
- Save your updated config file in the metagenomics/workflows/config directory
- Activate your metag environment
- Run screen or something similar, since these workflows can run for a while
- Navigate to the metagenomics/workflows directory
- Set the singularity bindpath
- Run a command to execute the snakemake rules
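
The directory checks in the list above can be scripted before you start a long run. The following is a minimal sketch, not part of the pipeline: `check_offline_layout` is a hypothetical helper that only confirms the three directories named above exist and are not empty. It cannot tell whether the specific images, databases, or config files you need are present.

```shell
# Hypothetical pre-flight check: confirm that the directories named in the
# steps above exist and contain at least one entry. The first argument is the
# path to the metagenomics directory (defaults to ./metagenomics).
check_offline_layout() {
    root="${1:-metagenomics}"
    status=0
    for dir in container_images workflows/data workflows/config; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing directory: $root/$dir" >&2
            status=1
        elif [ -z "$(ls -A "$root/$dir")" ]; then
            echo "empty directory: $root/$dir" >&2
            status=1
        else
            echo "ok: $root/$dir"
        fi
    done
    return "$status"
}
```

Calling `check_offline_layout metagenomics` from the directory that contains metagenomics prints one line per directory and returns a non-zero status if any directory is missing or empty.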
These commands will activate your metag environment, run screen, navigate to the metagenomics/workflows directory, and set the singularity bindpath:
source activate metag
screen
cd metagenomics/workflows
export SINGULARITY_BINDPATH="data:/tmp"
The following command will run all available rules within the metagenomics/workflows/config/default_workflowconfig.settings default config file (i.e., if you directly edited the default config with the name of your sample and parameters, then run this command):
snakemake --use-singularity read_filtering_pretrim_workflow read_filtering_posttrim_workflow read_filtering_multiqc_workflow read_filtering_khmer_interleave_reads_workflow read_filtering_khmer_count_unique_kmers_workflow read_filtering_khmer_subsample_interleaved_reads_workflow read_filtering_khmer_split_interleaved_reads_workflow read_filtering_fastq_to_fasta_workflow assembly_all_workflow assembly_quast_workflow assembly_multiqc_workflow comparison_reads_assembly_workflow taxclass_signatures_workflow taxclass_gather_workflow taxclass_kaijureport_workflow taxclass_kaijureport_filtered_workflow taxclass_kaijureport_filteredclass_workflow taxclass_add_taxonnames_workflow taxclass_kaiju_species_summary_workflow taxclass_visualize_krona_kaijureport_workflow taxclass_visualize_krona_kaijureport_filtered_workflow taxclass_visualize_krona_kaijureport_filteredclass_workflow taxclass_visualize_krona_species_summary_workflow functional_with_srst2_workflow functional_prokka_with_megahit_workflow functional_prokka_with_metaspades_workflow functional_abricate_with_megahit_workflow functional_abricate_with_metaspades_workflow
The following command will run all available rules within a custom config, provided that all of these rules are included in the custom config (i.e., if you created a custom config file and saved it in the metagenomics/workflows/config/ directory with a unique name, then run this command):
snakemake --use-singularity --configfile=config/custom_config_name.json read_filtering_pretrim_workflow read_filtering_posttrim_workflow read_filtering_multiqc_workflow read_filtering_khmer_interleave_reads_workflow read_filtering_khmer_count_unique_kmers_workflow read_filtering_khmer_subsample_interleaved_reads_workflow read_filtering_khmer_split_interleaved_reads_workflow read_filtering_fastq_to_fasta_workflow assembly_all_workflow assembly_quast_workflow assembly_multiqc_workflow comparison_reads_assembly_workflow taxclass_signatures_workflow taxclass_gather_workflow taxclass_kaijureport_workflow taxclass_kaijureport_filtered_workflow taxclass_kaijureport_filteredclass_workflow taxclass_add_taxonnames_workflow taxclass_kaiju_species_summary_workflow taxclass_visualize_krona_kaijureport_workflow taxclass_visualize_krona_kaijureport_filtered_workflow taxclass_visualize_krona_kaijureport_filteredclass_workflow taxclass_visualize_krona_species_summary_workflow functional_with_srst2_workflow functional_prokka_with_megahit_workflow functional_prokka_with_metaspades_workflow functional_abricate_with_megahit_workflow functional_abricate_with_metaspades_workflow
Note that --configfile=config/custom_config_name.json is used to specify the name of the custom config in the above command.
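
The target lists in the commands above are long and easy to mistype. One possible way to manage them, shown here as a hedged sketch, is to keep the targets in shell variables grouped by workflow and assemble the command from the groups you want. The variable names, the `snakemake_command` helper, and the `CONFIGFILE` variable are hypothetical. The target names and the `--use-singularity` and `--configfile` options are copied from the commands above. The helper only prints the command; it does not run snakemake.

```shell
# Targets copied from the commands above, grouped by workflow.
READ_FILTERING_TARGETS="read_filtering_pretrim_workflow \
read_filtering_posttrim_workflow \
read_filtering_multiqc_workflow \
read_filtering_khmer_interleave_reads_workflow \
read_filtering_khmer_count_unique_kmers_workflow \
read_filtering_khmer_subsample_interleaved_reads_workflow \
read_filtering_khmer_split_interleaved_reads_workflow \
read_filtering_fastq_to_fasta_workflow"

ASSEMBLY_TARGETS="assembly_all_workflow \
assembly_quast_workflow \
assembly_multiqc_workflow"

COMPARISON_TARGETS="comparison_reads_assembly_workflow"

TAXCLASS_TARGETS="taxclass_signatures_workflow \
taxclass_gather_workflow \
taxclass_kaijureport_workflow \
taxclass_kaijureport_filtered_workflow \
taxclass_kaijureport_filteredclass_workflow \
taxclass_add_taxonnames_workflow \
taxclass_kaiju_species_summary_workflow \
taxclass_visualize_krona_kaijureport_workflow \
taxclass_visualize_krona_kaijureport_filtered_workflow \
taxclass_visualize_krona_kaijureport_filteredclass_workflow \
taxclass_visualize_krona_species_summary_workflow"

FUNCTIONAL_TARGETS="functional_with_srst2_workflow \
functional_prokka_with_megahit_workflow \
functional_prokka_with_metaspades_workflow \
functional_abricate_with_megahit_workflow \
functional_abricate_with_metaspades_workflow"

# Hypothetical helper: print a snakemake command for the given targets.
# If CONFIGFILE is set, --configfile is added as in the custom-config command.
snakemake_command() {
    if [ -n "${CONFIGFILE:-}" ]; then
        echo "snakemake --use-singularity --configfile=$CONFIGFILE $*"
    else
        echo "snakemake --use-singularity $*"
    fi
}
```

For example, `snakemake_command $READ_FILTERING_TARGETS $ASSEMBLY_TARGETS` prints a command covering only the first two workflows (the variables are left unquoted on purpose so that each target becomes a separate argument). After reviewing the printed command, paste it into the terminal from the metagenomics/workflows directory.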