Closed
36 commits
c7c0cd1
Update metaboscape_formatter.py
lfnothias Nov 28, 2020
757129f
Updating the description of the requirements for MetaboScape format
lfnothias Nov 28, 2020
f9f7c70
Remove .d extension for the filename
lfnothias Nov 28, 2020
8d4debe
Add note metadata filename requirement
lfnothias Nov 28, 2020
db6f2bf
Add test files metaboscape 5.0 + test
lfnothias Nov 28, 2020
aac0cb7
Adding a metadata file for testing 5.0
lfnothias Nov 28, 2020
e04099d
Adding new test task for MetaboScape 5.0
lfnothias Nov 28, 2020
4e2f6de
Clean up code
lfnothias Nov 28, 2020
5bf01f6
Merge pull request #36 from CCMS-UCSD/master
lfnothias Nov 28, 2020
3ad80d6
Update README.md
lfnothias Nov 28, 2020
49fceb8
Merge branch 'CCMS-UCSD-master' of https://github.com/lfnothias/GNPS_…
lfnothias Nov 28, 2020
0e7da88
Fix test_tasks
lfnothias Dec 18, 2020
271e496
fix
lfnothias Dec 18, 2020
64918e3
sync
lfnothias Dec 18, 2020
78ccb28
fix
lfnothias Dec 18, 2020
7458883
Merge pull request #37 from CCMS-UCSD/master
lfnothias Dec 18, 2020
1405aa9
Adding MetaboScape additional test tasks
lfnothias Dec 18, 2020
4617385
Merge branch 'CCMS-UCSD-master' of https://github.com/lfnothias/GNPS_…
lfnothias Dec 18, 2020
d52ec27
Update Optimus script for FBMN
lfnothias Dec 18, 2020
b39ccbc
Add Legacy to Optimus support option
lfnothias Dec 18, 2020
0395592
Revert "Update Optimus script for FBMN"
lfnothias Dec 19, 2020
fe5e01e
Revert "Revert "Update Optimus script for FBMN""
lfnothias Dec 19, 2020
0e636fd
Update input.xml
lfnothias Dec 19, 2020
c431396
Merge pull request #38 from CCMS-UCSD/master
lfnothias Jan 14, 2021
90277e2
[FBMN] Adding support for SIRIUS processing
lfnothias Jan 14, 2021
b7d9559
Revert "[FBMN] Adding support for SIRIUS processing"
lfnothias Jan 14, 2021
3b70622
Merge pull request #39 from CCMS-UCSD/master
lfnothias Mar 2, 2021
63b72c7
Sync the test_tasks
lfnothias Apr 13, 2021
e0d1f9e
Merge pull request #45 from CCMS-UCSD/master
lfnothias Apr 13, 2021
88dedef
Merge branch 'CCMS-UCSD:master' into CCMS-UCSD-master
lfnothias Oct 5, 2021
f232a2a
Update requirements description for Progenesis QI converter
lfnothias Oct 5, 2021
2ea94fe
Update Progenesis QI converter with extra support
lfnothias Oct 5, 2021
2da8549
Adding a Progenesis converter test for European CSV
lfnothias Oct 5, 2021
f039f1a
Input files for Progenesis EU CSV test
lfnothias Oct 5, 2021
e44bf5d
Reference output files for test Progenesis QI converter of EU CSV
lfnothias Oct 5, 2021
c37bb99
Update test_converters.py
lfnothias Oct 5, 2021
59 changes: 42 additions & 17 deletions feature-based-molecular-networking/README.md
@@ -1,6 +1,6 @@
## Feature Based Molecular Networking

For more information on FBMN, see the workflow documentation on GNPS [https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/](https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/) and our preprint: Nothias, L.F. et al. [Feature-based Molecular Networking in the GNPS Analysis Environment](https://www.biorxiv.org/content/10.1101/812404v1) bioRxiv 812404 (2019).
For more information on FBMN, see the workflow documentation on GNPS [https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/](https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/) and our paper: Nothias, L.F. et al. [Feature-based Molecular Networking in the GNPS Analysis Environment](https://www.nature.com/articles/s41592-020-0933-6) Nature Methods volume 17, pages 905–908 (2020).

Representative input files for each supported tool are available at:
[https://github.com/CCMS-UCSD/GNPS_Workflows/tree/master/feature-based-molecular-networking/test/reference_input_file_for_formatter](https://github.com/CCMS-UCSD/GNPS_Workflows/tree/master/feature-based-molecular-networking/test/reference_input_file_for_formatter)
@@ -39,35 +39,60 @@ Additionally, it is assumed there are additional columns where the per sample qu

The MGF output should contain the "SCANS" header, and its value must match the "row ID" identifier of the corresponding feature. It has to be unique, but it does not need to be sequential.
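
The two constraints on "SCANS" (unique, and matching a "row ID") can be checked before submitting a job. The snippet below is a minimal sketch of such a check, not part of the workflow itself; the function names `scan_ids` and `check_scans` are hypothetical, and it assumes the usual `SCANS=<value>` line syntax inside each `BEGIN IONS` / `END IONS` block.

```python
from collections import Counter


def scan_ids(mgf_text):
    """Return the SCANS values found in an MGF document, in file order."""
    ids = []
    for line in mgf_text.splitlines():
        line = line.strip()
        if line.upper().startswith("SCANS="):
            ids.append(line.split("=", 1)[1].strip())
    return ids


def check_scans(mgf_text, row_ids):
    """Return (duplicated SCANS values, SCANS values with no matching row ID).

    Hypothetical helper: row_ids is any iterable of identifiers taken from
    the "row ID" column of the feature quantification table.
    """
    ids = scan_ids(mgf_text)
    duplicates = sorted(s for s, n in Counter(ids).items() if n > 1)
    known = {str(r) for r in row_ids}
    unmatched = sorted(s for s in set(ids) if s not in known)
    return duplicates, unmatched
```

Both returned lists are empty when the MGF file satisfies the requirements; non-sequential identifiers such as 1, 7, 12 pass as long as each appears once.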

### Metaboscape
### MetaboScape

#### For MetaboScape 5.0

The feature quantification table (.CSV file, comma separated) should include columns with the following headers:

1. FEATURE_ID
2. RT
3. PEPMASS
4. MaxIntensity
1. SHARED_NAME
2. FEATURE_ID
3. RT
4. PEPMASS
5. CCS (optional, only tims/PASEF data)
6. SIGMA_SCORE
7. NAME_METABOSCAPE
8. MOLECULAR_FORMULA
9. ADDUCT
10. KEGG
11. CAS
12. MaxIntensity
13. {GroupName}_MeanIntensity (0-n times, dependent on the groups defined in MetaboScape)
14. Sample Intensities

Sample headers do not include the file format extension ".d" (DDA) or ".tdf" (PASEF). The columns "FEATURE_ID", "RT", "PEPMASS", "MaxIntensity" are mandatory.
Important: In the metadata table, the filename MUST NOT include that extension suffix.
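
As an illustration of these requirements, the sketch below inspects only the header row of a MetaboScape 5.0 export. It is not the logic of `metaboscape_formatter.py`; the function name `check_metaboscape_5_header` is hypothetical, and the only facts it relies on are the mandatory column names and the extension rule stated above.

```python
import csv
import io

# Mandatory columns for MetaboScape 5.0, as listed in this README.
MANDATORY_5_0 = ("FEATURE_ID", "RT", "PEPMASS", "MaxIntensity")


def check_metaboscape_5_header(csv_text):
    """Return (missing mandatory columns, headers that still carry .d/.tdf)."""
    header = next(csv.reader(io.StringIO(csv_text)))
    missing = [c for c in MANDATORY_5_0 if c not in header]
    with_extension = [c for c in header if c.lower().endswith((".d", ".tdf"))]
    return missing, with_extension
```

A table that follows the 5.0 layout yields two empty lists. A non-empty second list suggests the table was exported by an earlier MetaboScape version, which keeps the ".d" suffix in the sample headers.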

#### Earlier versions of MetaboScape (<5.0)

For ion mobility data, it must include a "CCS" column.
The feature quantification table (.CSV file, comma separated) should include columns with the following headers:

All sample headers are not including the file format extension ".d" (DDA) or ".tdf" (PASEF)
1. SHARED_NAME
2. FEATURE_ID
3. RT
4. PEPMASS
5. NAME
6. MOLECULAR_FORMULA
7. ADDUCT
8. KEGG
9. CAS
10. {GroupName}_MeanIntensity (0-n times, dependent on the groups defined in MetaboScape)
11. Sample Intensities

Sample headers include the file format extension ".d". The columns "FEATURE_ID", "RT", "PEPMASS", "CAS" are mandatory.
Important: In the metadata table, the filename MUST include the ".d" extension suffix.

### Progenesis QI

The feature quantification table (text file, comma separated). The row 1/2 are used to deduced the number of samples and sample name and metadata are starting in row 3 with the following columns:
The feature quantification table of Progenesis is a text file. We now support either a comma-separated text file with a dot as the decimal separator, or the European standard: a semicolon-separated text file with a comma as the decimal separator. Make sure you follow the export instructions at [https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking-with-progenesisQI](https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking-with-progenesisQI/). Representative input files are available at [https://github.com/CCMS-UCSD/GNPS_Workflows/tree/master/feature-based-molecular-networking/test/reference_input_file_for_formatter](https://github.com/CCMS-UCSD/GNPS_Workflows/tree/master/feature-based-molecular-networking/test/reference_input_file_for_formatter).

At the second row:
1. Raw abundance
2. Normalised abundance
The "Normalised abundances" columns are mandatory, along with "Compound", "Retention time (min)", and "m/z".
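
To make the two supported layouts concrete, here is a minimal sketch of how a reader could tell them apart and parse numbers accordingly. It is an illustration under stated assumptions, not the implementation in `progenesis_formatter.py`: the helper names are hypothetical, and the heuristic simply assumes that the first line of the export contains more of its field delimiter than of any other candidate character.

```python
import csv
import io


def detect_dialect(text):
    """Guess (field delimiter, decimal mark) from the first line of the export."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    if first_line.count(";") > first_line.count(","):
        return ";", ","  # European export: semicolon fields, decimal comma
    return ",", "."      # default export: comma fields, decimal dot


def read_rows(text):
    """Split the export into rows of cells, returning (rows, decimal mark)."""
    delimiter, decimal = detect_dialect(text)
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return rows, decimal


def to_float(value, decimal):
    """Parse a numeric cell written with the given decimal mark."""
    return float(value.replace(decimal, "."))
```

With either layout, `read_rows` returns the same grid of cells, so the check for the mandatory headers can be written once and applied to the header row.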

At the third row:
1. Compound
2. m/z
3. Retention time (min)
The conversion script is located at [https://github.com/CCMS-UCSD/GNPS_Workflows/blob/master/feature-based-molecular-networking/tools/feature-based-molecular-networking/scripts/progenesis_formatter.py](https://github.com/CCMS-UCSD/GNPS_Workflows/blob/master/feature-based-molecular-networking/tools/feature-based-molecular-networking/scripts/progenesis_formatter.py).

For ion mobility data, it must include a "CCS (angstrom^2)" column for consistency

The samples headers are deduced from parsing the first row of the input feature quantification table by deducing the number of samples from the difference between the Normalized and Raw abundance columns (Raw abundance column are relabelled with a .1 suffix). We output only the Normalized intensities. All sample headers are not including the filename extension (such as ".raw").
If both 'Raw abundances' and 'Normalised abundances' are present, we output only the 'Normalised abundances' columns. Sample headers do not include the filename extension (such as ".raw").
We output most metadata columns, except the 'Accepted Description' column. These columns are in row 3 of the input feature quantification table.

We use the .MSP file format for the MS/MS spectral summary and convert it to a .MGF file. Only the first MS/MS entry associated with a compound name is kept in the .MGF file (the name following "Comment: "). This is an imperfect solution, and we welcome volunteers to improve it.
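
The "keep the first entry per compound" rule can be sketched as follows. This is a simplified illustration, not the converter's code: the function name `first_entry_per_compound` is hypothetical, and it assumes that MSP records are separated by blank lines and that each record carries one `Comment: <compound name>` line.

```python
def first_entry_per_compound(msp_text):
    """Map each compound name to the lines of its first MSP record.

    Assumes blank-line-separated records, each with a "Comment: <name>" line.
    Records without a Comment line are skipped.
    """
    kept = {}
    for block in msp_text.replace("\r\n", "\n").split("\n\n"):
        lines = [line for line in block.splitlines() if line.strip()]
        name = None
        for line in lines:
            if line.lower().startswith("comment:"):
                name = line.split(":", 1)[1].strip()
                break
        # Later records for an already-seen compound are dropped.
        if name is not None and name not in kept:
            kept[name] = lines
    return kept
```

Because every later spectrum for the same compound is discarded, a better-quality spectrum that appears further down the file is lost, which is the limitation the paragraph above acknowledges.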