From fc91c1e7e0cab935f98d1225164d887fa595b751 Mon Sep 17 00:00:00 2001
From: Alexandre Pelletier <45462633+AlexandrePelletier@users.noreply.github.com>
Date: Tue, 27 Jan 2026 12:12:43 -0500
Subject: [PATCH] Update README.md

---
 data/genes/README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/data/genes/README.md b/data/genes/README.md
index 8bf865d..ed00eaa 100644
--- a/data/genes/README.md
+++ b/data/genes/README.md
@@ -35,10 +35,14 @@ For more details please check out [this page](https://statfungen.github.io/xqtl-
 
 ## Others contents
 - `res_allanalysis_ADloci_overlap.csv.gz` : a long/flatten version of the excel table, where each row is a variant-ADlocus-Method-context-gene_name information, facilitating querying informations. It contains also all variants associated to the loci (instead of top variants per loci)
-- `AD_loci_unified_cs95orColocs_Pval1e5_variant_level.csv.gz` : a table for the unified AD loci where each row is a variant of these loci
+- `AD_loci_unified_cs95orColocs_Pval1e5_variant_level.csv.gz` : a table for the unified AD loci where each row is a variant of these loci.
 - `context_meta.tsv` generated by Ru Feng
 
 
+### generation of the unified AD loci
+The AD credible sets (CS with coverage 0.95) and the AD colocalized sets (CoS, derived from colocboost or the 'standard' coloc made with fsusie) obtained from each AD GWAS study were joined into a single locus if they share any variants, or if any of their variants have r > 0.8 AND all their variants have r > 0.5. The loci were then filtered to keep only those with a minimum p-value < 1e-5 across the GWAS datasets. To rank the variants in each locus, we determine for each variant its `max_variant_inclusion_probability`, which corresponds to the maximum of the PIP, the VCP (from colocboost) and the PPH4 (from the standard coloc) for that variant across GWAS datasets.
+
+
 ### generation of `context_meta.tsv`
 ```bash
 cat /mnt/vast/hpc/homes/rf2872/codes/xqtl-analysis/analysis/Wang_Columbia/susie_twas/*/commands_to_submit.txt |grep -oP '(?<=--phenotype-names )[^\\]*'| sed 's/\s--.*$//' | awk -v OFS=',' '{for (i=1; i<=NF; i++) printf "%s%s", $i, (i==NF?ORS:OFS)}' > context.txt