Added peak level annotations support #83

Open
bhavyak7-lab wants to merge 13 commits into wfondrie:main from bhavyak7-lab:dia_functionality_clean

@bhavyak7-lab

Changes made:

  • Added support for peak-level annotations by creating a new parser that stores the additional annotations
  • Modified MassSpectrum to retain peak-level annotations

Users can extract peak-level annotations by defining a custom field for them; the annotations are stored in the spectrum dictionary.


raise ValueError("Invalid precursor charge.")

class AsfParser(BaseParser):
Contributor

Can the ASF parser extend the MGF parser instead? That might reduce some redundancy.

if "=" in line:
    key, value = line.split("=", 1)
    if key == "CHARGE":
        if len(value) == 2:
Contributor

What is this length = 2 check for?
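
For context: MGF-style CHARGE values are commonly written with a trailing sign, such as `2+` or `3-`, so one possible reading is that the check strips that sign from single-digit charges. A minimal sketch of a more explicit parse under that assumption follows; `parse_charge` is a hypothetical helper and is not part of this PR.

```python
import re

# Hypothetical helper, not part of the PR. Accepts "2", "2+", "3-", or "+2".
_CHARGE_RE = re.compile(r"^\s*([+-]?)(\d+)([+-]?)\s*$")


def parse_charge(value: str) -> int:
    """Parse a CHARGE string into a signed integer."""
    match = _CHARGE_RE.match(value)
    if match is None:
        raise ValueError(f"Invalid precursor charge: {value!r}")
    lead, digits, trail = match.groups()
    if lead and trail:
        # A sign on both sides (e.g. "+2-") is ambiguous, so reject it.
        raise ValueError(f"Invalid precursor charge: {value!r}")
    sign = -1 if "-" in (lead, trail) else 1
    return sign * int(digits)
```

Unlike a length check, this also handles multi-digit charges such as `10+` and negative-mode charges.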

if len(value) == 2:
    value = value[0]

if key == "PEPMASS":
Contributor

We do not always expect a precursor intensity to be provided as well (in fact, unless you run a special feature extraction step on DIA data, we will never have it). We should handle the case where it is missing.

Author

This case would be handled by the final line: spectrum["params"][key.lower()] = [value]
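
To make the missing-intensity case explicit, here is a minimal sketch of a PEPMASS parse that treats the intensity as optional. It assumes the value is whitespace-separated as `m/z [intensity]`; `parse_pepmass` is a hypothetical helper and does not reflect the PR's actual code.

```python
from typing import Optional, Tuple


def parse_pepmass(value: str) -> Tuple[float, Optional[float]]:
    """Split a PEPMASS value into (m/z, intensity or None).

    Hypothetical helper: assumes whitespace-separated fields where the
    first is the precursor m/z and the optional second is its intensity.
    """
    fields = value.split()
    if not fields:
        raise ValueError("PEPMASS is empty.")
    mz = float(fields[0])
    # The intensity is frequently absent (for example in DIA data).
    intensity = float(fields[1]) if len(fields) > 1 else None
    return mz, intensity
```

Returning `None` for a missing intensity keeps the two cases distinguishable downstream, rather than relying on the shape of a stored list.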

if key == "SEQ":
    spectrum["params"][key.lower()] = value
    continue

Contributor

Can we add support for skipping spectra whose key parameters are missing or incorrectly formatted, and warn users that these spectra were skipped?

Author

If the charge is outside of the range, the iter_batches method throws an error. Should I also throw an error here?
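
One way to picture the skip-and-warn behavior requested above is a small filter that drops incomplete spectra and emits a single summary warning. This is only a sketch: `iter_valid_spectra` and `REQUIRED_PARAMS` are hypothetical names, and the set of required keys is an assumption rather than something the PR specifies.

```python
import warnings
from typing import Iterable, Iterator

# Assumed set of required keys; the PR does not define this list.
REQUIRED_PARAMS = ("pepmass", "charge")


def iter_valid_spectra(spectra: Iterable[dict]) -> Iterator[dict]:
    """Yield spectra that have all required params; warn about the rest."""
    skipped = 0
    for spectrum in spectra:
        params = spectrum.get("params", {})
        if all(key in params for key in REQUIRED_PARAMS):
            yield spectrum
        else:
            skipped += 1
    # Emit one summary warning once the input is exhausted.
    if skipped:
        warnings.warn(
            f"Skipped {skipped} spectra with missing required parameters."
        )
```

Whether an out-of-range charge should be skipped with a warning or raised as an error, as `iter_batches` does, remains the open question in this thread.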

yield _iter()


def parse_spectrum(self, spectrum: dict) -> MassSpectrum:
Contributor

This function is 99% the same as the one for MGF parser. If we extend that class instead, this can just call super() and then set the peak_annotations=spectrum["peak_annotations"] on the result.
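
The suggested refactor could look roughly like the sketch below. The `MassSpectrum` and `MgfParser` classes here are simplified stand-ins written only for illustration; the library's real classes have different fields and constructors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MassSpectrum:
    """Simplified stand-in for the library's MassSpectrum."""

    precursor_mz: float
    precursor_charge: int
    peak_annotations: Optional[list] = None


class MgfParser:
    """Simplified stand-in for the library's MGF parser."""

    def parse_spectrum(self, spectrum: dict) -> MassSpectrum:
        params = spectrum["params"]
        return MassSpectrum(
            precursor_mz=float(params["pepmass"][0]),
            precursor_charge=int(params["charge"]),
        )


class AsfParser(MgfParser):
    """Reuses the MGF parsing and only adds the peak annotations."""

    def parse_spectrum(self, spectrum: dict) -> MassSpectrum:
        parsed = super().parse_spectrum(spectrum)
        parsed.peak_annotations = spectrum["peak_annotations"]
        return parsed
```

With this shape, the ASF parser only needs to own the code that differs from the MGF parser.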

@bhavyak7-lab
Author

The failing TDF tests are caused by a file-reading issue on macOS; the same tests pass on Linux.
