This repository already has a few files in input_data/. You need to add data in input_data/pathways/ and input_data/genes/.
The files in input_data/pathways/ contain the associations between each phenotype and pathway. Their format is:
- filenames: `PheCode_*_{GO,Curated}.wConditional.txt.gz`
  - eg, `PheCode_986_Curated.wConditional.txt.gz` contains data about the Curated pathways for phecode 986.
- format: tab-delimited with columns `pathway_name`, `pathway_url`, `association_pvalue`, `pathway_genes_that_the_method_selected_as_meaningful`.
  - eg, `KEGG_GLYCOLYSIS_GLUCONEOGENESIS http://www.broadinstitute.org/gsea/msigdb/cards/KEGG_GLYCOLYSIS_GLUCONEOGENESIS 0.993 ALDH3A2,ADH1B,HK3,ALDH1B1,FBP1,TPI1,PFKP`
  - optionally, you may include unused columns before and after `pathway_genes_that_the_method_selected_as_meaningful`
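To sanity-check a pathway file before building the database, a reader along these lines may help. This is only a sketch: the function name `read_pathway_rows` is hypothetical, and it assumes the plain four-column layout with no optional unused columns.

```python
import gzip
from typing import Iterator


def read_pathway_rows(path: str) -> Iterator[dict]:
    """Yield one dict per row of a gzipped, tab-delimited pathway file.

    Assumes exactly the four columns described above, in that order
    (hypothetical helper, not part of pathweb).
    """
    with gzip.open(path, "rt") as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 4:
                continue  # skip blank or truncated lines
            name, url, pvalue, genes = fields[:4]
            yield {
                "pathway_name": name,
                "pathway_url": url,
                "association_pvalue": float(pvalue),
                "selected_genes": genes.split(","),
            }
```

For example, `list(read_pathway_rows("input_data/pathways/PheCode_986_Curated.wConditional.txt.gz"))` would return one dict per pathway, provided the file follows that layout.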
The files in input_data/genes/ contain the associations between each phenotype and gene. Their format is:
- filenames: `OUTF_PheCode_*.txt.gz`
  - eg, `OUTF_PheCode_803.2.txt.gz` contains data about phenotype 803.2
- format: tab-delimited or space-delimited with columns `gene_name`, `pvalue_or_NA`
  - eg, `7SK NA` or `A1BG 8.35e-01`
  - optionally, you may add 3 columns before and after `pvalue_or_NA`.
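A gene file can be checked the same way. The sketch below assumes the simple two-column layout (no optional extra columns) and maps `NA` to `None`; the name `read_gene_rows` is hypothetical.

```python
import gzip
from typing import Iterator, Optional, Tuple


def read_gene_rows(path: str) -> Iterator[Tuple[str, Optional[float]]]:
    """Yield (gene_name, pvalue) pairs from a gzipped gene file.

    Assumes gene_name is the first field and pvalue_or_NA the second
    (hypothetical helper, not part of pathweb).
    """
    with gzip.open(path, "rt") as f:
        for line in f:
            fields = line.split()  # handles both tabs and spaces
            if len(fields) < 2:
                continue  # skip blank or truncated lines
            gene, pvalue = fields[0], fields[1]
            yield gene, (None if pvalue == "NA" else float(pvalue))
```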
These files can be collected from Flux by running:
mkdir -p ~/pathweb-data/pathways/ && cd /scratch/leeshawn_fluxod/diptavo/1000G/RESULTS/ && find . -name 'PheCode_*[dO].wConditional.txt'|sort -n|while read f; do echo $f; cat $f|cut -d' ' -f1-3,5|tr " " "\t"|gzip - > ~/pathweb-data/pathways/$(basename $f).gz; done
mkdir -p ~/pathweb-data/genes/ && cd /scratch/leeshawn_fluxod/diptavo/1000G/RESULTS/ && find . -name 'OUTF_PheCode_*.txt'|sort -n|while read f; do echo $f; cat $f|perl -nae 'print "$F[0]\t";if ($F[4] eq "NA") {print("NA\n")} else {printf("%.2e\n",$F[4])}' | gzip - > ~/pathweb-data/genes/$(basename $f).gz; done
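For readers less familiar with Perl, the per-line transformation in the genes command appears to be: keep the first whitespace-separated field, then print the fifth field either as `NA` or in `%.2e` notation. A rough Python equivalent is sketched below. The name `format_gene_line` is hypothetical, and unlike the Perl one-liner it returns `None` for lines with fewer than five fields.

```python
from typing import Optional


def format_gene_line(line: str) -> Optional[str]:
    """Convert one raw OUTF_PheCode_*.txt line to 'gene<TAB>pvalue' form.

    Sketch of the Perl one-liner's behavior; short lines return None,
    which is an assumption rather than what the Perl code does.
    """
    fields = line.split()  # like perl -a: split on runs of whitespace
    if len(fields) < 5:
        return None
    gene, pvalue = fields[0], fields[4]
    if pvalue == "NA":
        return f"{gene}\tNA\n"
    return f"{gene}\t{float(pvalue):.2e}\n"
```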
If you are on an Ubuntu server, you can simply run `./setup-server.sh`, which should install the required tools, process the data, and configure and start the Flask app and Nginx using systemd.
If you are on a laptop or otherwise don't want to use that script, then:
- run `pip3 install -r requirements.txt` (which may require you to set up and activate a `virtualenv` or `miniconda`, or use `sudo`)
- run `python3 pathweb/make_sqlite3_db.py` to produce `pathweb/pheno_pathway_assoc.db`.
- run `python3 pathweb/make_gene_sqlite3_db.py` to produce `pathweb/gene.db`.
- run `python3 pathweb/make_tables.py` to produce `pathweb/static/phenotypes.json` and `pathweb/static/pathways.json`.
- run the server with one of these:
  - `python3 pathweb/serve.py` (insecure and slow, for development/debugging)
  - `cd pathweb && gunicorn serve:app -k gevent -w4 --bind 0.0.0.0:8000` (fast, for production)
  - `sudo systemctl restart gunicorn-pathweb` (if the server is run by `/etc/systemd/system/gunicorn-pathweb.service`)
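Before starting the server, it can be useful to confirm that the build steps produced their outputs. The sketch below only checks that the four files named above exist under the repository root. The function name `missing_outputs` is hypothetical, and it does not validate file contents.

```python
import os
from typing import List

# Output paths taken from the build steps above, relative to the repo root.
EXPECTED_OUTPUTS = [
    "pathweb/pheno_pathway_assoc.db",
    "pathweb/gene.db",
    "pathweb/static/phenotypes.json",
    "pathweb/static/pathways.json",
]


def missing_outputs(repo_root: str = ".") -> List[str]:
    """Return the expected output files that do not exist yet."""
    return [
        rel for rel in EXPECTED_OUTPUTS
        if not os.path.isfile(os.path.join(repo_root, rel))
    ]


if __name__ == "__main__":
    for rel in missing_outputs():
        print(f"missing: {rel}")
```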