Add notebook for downloading McFarland 2020 Figure 1 data #2
Open
ethanweinberger wants to merge 1 commit into theislab:main from
This PR adds a Jupyter notebook to download the data from McFarland et al., 2020 used to produce Figure 1 (i.e., response to idasanutlin and control DMSO for different cell lines). This PR also adds a `utils.py` file to the datasets folder containing reusable functions for downloading/preprocessing.
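For reference, a reusable download helper of the kind described for `utils.py` might look roughly like the following. This is only a sketch: the function name `download_file`, its parameters, and the skip-if-already-present behavior are assumptions made for illustration, not the contents of the actual file in this PR.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, dest: Path, overwrite: bool = False) -> Path:
    """Download `url` to `dest` and return the destination path.

    Hypothetical helper: skips the download when `dest` already exists,
    unless `overwrite` is True.
    """
    dest = Path(dest)
    if dest.exists() and not overwrite:
        return dest

    dest.parent.mkdir(parents=True, exist_ok=True)

    # Write to a temporary ".part" file first so that an interrupted
    # download never leaves a truncated file at the final location.
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(dest)
    return dest
```

A notebook could then call `download_file(url, Path("data") / "raw_counts.zip")` once at the top, where both the URL and the file name are placeholders, so that re-running the notebook does not fetch the author-provided files again.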
ethanweinberger pushed a commit to ethanweinberger/sc-pert that referenced this pull request Mar 25, 2022
This PR adds a notebook to download and preprocess the Norman 2019 dataset, starting directly from the raw counts. The notebook currently downloads the data and fills in various metadata values. I made this PR because the current Norman 2019 notebook depends on downloading another h5ad file first; I personally like being able to see the full workflow (i.e., going from author-provided files to the final anndata) as part of the notebooks. As mentioned in theislab#2, I'm not sure what QC steps you prefer, so this notebook simply produces an anndata with raw counts.
Member
Hey Ethan, great questions! I'll post the answers here for now but ideally there would be some other documentation somewhere other than an obscure
yugeji pushed a commit that referenced this pull request Mar 31, 2022
* Add Norman 2019 notebook with more details

  This PR adds a notebook to download and preprocess the Norman 2019 dataset, starting directly from the raw counts. The notebook currently downloads the data and fills in various metadata values. I made this PR because the current Norman 2019 notebook depends on downloading another h5ad file first; I personally like being able to see the full workflow (i.e., going from author-provided files to the final anndata) as part of the notebooks. As mentioned in #2, I'm not sure what QC steps you prefer, so this notebook simply produces an anndata with raw counts.

* Add standard metadata fields
* standardize naming

Authored-by: Ethan Weinberger <ethanweinberger@pop-os.localdomain>
Contributor
Author
Got it. The distinction between the curation and preprocessing notebooks makes sense to me. Based on that distinction, it seems like it makes sense to have the
Contributor
Author
Closing since this is taken care of by `mcfarland_2020_curation.ipynb`
Contributor
Author
Reopening per @yugeji's request
A couple things that should probably be hashed out before this gets merged:
* … `h5ad` file.
* … `anndata` object in my notebook just contains raw counts.