A simple program to analyze protein-compound complex RapidFire data at Vicinitas using either UniDec or OpenMS FlashDeconv.
Currently, the program is in the development stage and runs as a pipeline, driven mainly by the main.py script. The program takes in a folder of either raw MS data or mzML files and runs them through the pipeline. If the data is raw MS data, a conversion Docker container is called via a REST API; more details below.
- The program takes in a folder of data
- Uploads a meta-data file that contains protein masses, compound masses, file identifications, and other information. Note that if IC50 values are needed, concentration values must also be included in the meta-data file.
- Either UniDec or FlashDeconv is called to process the data.
- If UniDec is called, the program runs the data through the Python API for UniDec.
- If FlashDeconv is called, the program runs the data through a CLI call (see the sketch after this list).
- Results from the process are then uploaded to a database.
- Compound complex modifications are then calculated and matched per each well.
- Within each well a percentage intensity is calculated for each protein-compound complex.
- These matches are uploaded separately to the database.
- Using each protein-compound modification number, i.e. Mod0 or Mod1, the IC50 values are calculated and uploaded to the database.
- Plots of each curve are generated and written to ... PNG files.
- TODO create a web UI to display the results.
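
For the FlashDeconv path, the CLI call is conceptually a subprocess invocation like the sketch below. This is illustrative only: `run_flashdeconv` is a hypothetical helper, the flags assume the standard OpenMS TOPP-style `-in`/`-out` interface, and the exact options depend on your FLASHDeconv version.

```python
# Illustrative sketch of the FlashDeconv CLI step; not the actual
# pyRapidFire code. Verify the flags against `FLASHDeconv --help`.
import subprocess

def run_flashdeconv(mzml_path: str, out_path: str) -> None:
    """Deconvolute one mzML file with the FLASHDeconv command-line tool."""
    subprocess.run(
        ["FLASHDeconv", "-in", mzml_path, "-out", out_path],
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
```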
Both installation via pip and poetry are supported. The program is designed to run in a Docker container, and the folder layout allows a package to be built. To build the package, run the following commands:

```bash
which poetry || pip install poetry
poetry build
```

This will create a whl file that can be installed via pip:

```bash
pip install dist/pyRapidFire-VERSION_NUMBER-py3-none-any.whl
```

Additionally, a full docker_compose file is provided to run the program. The docker_compose file will start the program, a database, and the needed converter API functions.
- `main.py` - the main file that runs the program. Has a pipeline function that calls most of the other functions.
- `database.py` - contains the database class that is used to upload data to the database.
- `protein_deconvolution.py` - contains the functions that are used to process the data. It has two classes, `protein_well` and `protein_decon_unidec`. The `protein_decon_unidec` class is used to aggregate the wells by a single compound/VCNT-ID. The `protein_well` class is used to store the data for each well; within this class is also the matching function `simple_match` that is used to match the protein.
  - When UniDec is used, the method needs to know the estimated mass of the protein and the range of masses to search. Additionally, it is helpful for it to know the charge state of the protein.
  - FlashDeconv does not need to know the estimated mass of the protein or the range of masses to search, and it has improved resolution/mass accuracy.
- `helper.py` - contains the helper functions that are used to process the data: mainly functions to find the files and a function to help fit the IC50 curves (see the sketch after this list).
- `analysis.py` - contains the functions that are used to analyze the data, mainly functions that calculate the IC50 values. The IC50 values are processed in the `IC50_Curves` class.
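
For orientation, dose-response IC50 fits of this kind are commonly done with a four-parameter logistic curve. The sketch below illustrates that idea with scipy; it is not the actual `helper.py`/`IC50_Curves` implementation, and all names and data are made up.

```python
# Minimal four-parameter logistic (4PL) IC50 fit, assuming numpy and
# scipy are installed. Illustrative only; not the pyRapidFire code.
import numpy as np
from scipy.optimize import curve_fit

def four_pl(conc, bottom, top, ic50, hill):
    """Response as a function of concentration (4PL model)."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)

conc = np.array([0.01, 0.1, 1.0, 10.0, 100.0])      # compound concentrations (made-up)
response = np.array([97.0, 88.0, 52.0, 14.0, 4.0])  # e.g. percentage intensity per well

params, _ = curve_fit(four_pl, conc, response, p0=[0.0, 100.0, 1.0, 1.0])
print(f"Fitted IC50 = {params[2]:.2f}")
```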
The system is designed with a database in mind. The database is used to store the data and the results of the analysis, and most of the methods and functions are designed around it. Additionally, there is a custom logger that logs to both a file and the database. If a logger object is not passed to the database class, a default logger is made. The caveat here is that the logger needs a database connection. == This means that environment variables are needed ==. These are:

- `DB_USER` - the username for the database
- `DB_PASS` - the password for the database
- `DB_HOST` - the host for the database
- `DB_NAME` - the name of the database
- `DB_CERT_PATH` - the path to the certificate for the database
- `DB_CERT_NAME` - the name of the certificate file
- `DATA_PATH` - the path to the data folder for data to be processed from
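
These typically live in a `.env` file that python-dotenv reads. As a quick illustration (not part of the package), the following checks that all of the variables listed above are set after loading:

```python
# Illustrative sanity check for the required variables listed above;
# not part of pyRapidFire itself.
import os
from dotenv import load_dotenv

load_dotenv()
required = ["DB_USER", "DB_PASS", "DB_HOST", "DB_NAME",
            "DB_CERT_PATH", "DB_CERT_NAME", "DATA_PATH"]
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise RuntimeError(f"Missing environment variables: {missing}")
```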
Due to the logger and the need for the database connection, the modules need to be loaded in a specific order. If you are creating a new run script/program, you will need to load the dotenv module prior to loading the pyrapidfire.RapidfireDB and logging_db modules, because the database connection is needed for the logger. An example would be as follows:
```python
import os
from dotenv import load_dotenv
from pyrapidfire import database
from pyrapidfire import logging_db

load_dotenv()
logger = logging_db.get_logger()
logger.name = "pyRapidFire"  # Set the name of the logger; can also be __name__
obj = database.RapidFireDB(sqlalchemy=True, direct_connect=True, logger=logger)
obj.get_experiments()
```

The logger works by creating a custom logger that logs to both a file and the database. It is created by the `logging_db.get_logger()` function, which returns a logger object that can be used to log messages. The logger object has a custom handler that logs to the database; additional handlers can be added to the logger object with `logger.addHandler()` to log to the console or another file. The logger object also has two custom attributes: `expid`, which can be set to the experiment id so that it is logged to the database, and `name`, which can be set to the name of the logger and is logged to the database as well.
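
For example, adding a console handler uses the standard library `logging` API (a generic pattern, not a pyRapidFire-specific call; `logger` is the object returned by `logging_db.get_logger()` above):

```python
import logging

# Attach a console handler alongside the existing file/database handlers.
console = logging.StreamHandler()
console.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s"))
logger.addHandler(console)
```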
Setting the experiment id on the database handler looks like this:

```python
import os
from dotenv import load_dotenv
from pyrapidfire import database
from pyrapidfire import logging_db

load_dotenv()
logger = logging_db.get_logger()
logger.handlers[0].db.expid = 1  # Set the experiment id for the logger
```

TODO:

- Add a docker container for running either UniDec or FlashDeconv
- Change to a class-based processing system, removing main.py from the run
- Add a logger to the program
- Add a web UI for the program
- Move code around into a better file and folder structure