Add offsides data from nsides #110
apoorvasrinivasan26 wants to merge 11 commits into OpenBioML:main
data/nsides/offsides/meta.yaml
Outdated
description: Standard error of the PRR estimate
type: continuous
names:
  - Proportional reporting ratio error
do you think this is something the model should be able to predict? (I'm just curious)
Overall, looks quite good to me. Your PR does not cause the pre-commit errors.

No, I don't think the error can be predicted, but it can be calculated from other columns in the dataset, namely columns A, B, C, and D.
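
For reference, the calculation mentioned in this reply can be sketched as follows. This is a minimal sketch, assuming A, B, C, and D are the cells of the usual 2x2 contingency table (A: reports for the drug that mention the side effect, B: reports for the drug that do not, C and D: the same counts for the comparison drugs) and that the error column is the standard error of ln(PRR). Neither assumption is confirmed in this thread, and the function names are hypothetical.

```python
import math


def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio from 2x2 contingency counts.

    Assumed layout (not confirmed in this thread):
      a: reports for the drug that mention the side effect
      b: reports for the drug that do not
      c: reports for comparison drugs that mention the side effect
      d: reports for comparison drugs that do not
    """
    return (a / (a + b)) / (c / (c + d))


def log_prr_standard_error(a: int, b: int, c: int, d: int) -> float:
    """Standard error of ln(PRR), using the usual log risk ratio formula."""
    return math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    a, b, c, d = 20, 80, 50, 950
    print(f"PRR = {prr(a, b, c, d):.3f}")
    print(f"SE(ln PRR) = {log_prr_standard_error(a, b, c, d):.3f}")
```

Because the error is a deterministic function of the four counts under these assumptions, it carries no information beyond them, which is the point made in the reply.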

Great! Lmk if something doesn't work. I have another similar dataset in the works that I will commit once this is merged.

in this case, I would consider removing this column as it might add more confusion than signal to the model

Has this dataset been used in some benchmarks/papers? Thanks again for your contribution 💯
kjappelbaum left a comment
Thanks again for your contribution; it would be great if we could address the comments/clarifications. Let me know if you want to discuss or need a hand.

@kjappelbaum I've removed those columns, and yes, the dataset has been used in the following paper: https://pubmed.ncbi.nlm.nih.gov/22422992/ Pls lmk if anything is unclear or if you'd like me to make further changes.

Thanks a lot! Can you still remove the column from the target list in the |
…emnlp into dataset_drugxdrug

Done! Lmk if I missed anything else.
data/nsides/offsides/meta.yaml
Outdated
- id: mean_reporting_frequency
  description: Proportion of reports for the drug that report the side effect
  type: continuous
  names:
    - mean reporting frequency
is the absolute number something a model should be able to predict?
yeah, according to the paper, a model should be able to predict it
data/nsides/offsides/meta.yaml
Outdated
- id: drug_concept_name
  description: RxNorm name string for the drug
  type: categorical
- id: condition_concept_name
  description: MedDRA identifier for the side effect
  type: categorical
Do we need both of them simultaneously for the ratio to be meaningful?
That is, a correct prompt would ask the model something like
"What is the proportional reporting ratio for … for …"
yeah, so the prompt would be "what is the PRR of <condition_concept_name> for the <drug_concept_name>?". A higher PRR means the side effect is reported more often for that particular drug relative to other drugs.
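
A minimal sketch of how such a prompt could be filled from one row of the dataset, assuming each row is available as a dictionary keyed by the column ids in `meta.yaml`. The template wording follows the prompt suggested in this reply; the names `PRR_PROMPT` and `build_prompt` and the example row are hypothetical.

```python
# Template wording follows the prompt suggested in the reply above.
PRR_PROMPT = "What is the PRR of {condition_concept_name} for the {drug_concept_name}?"


def build_prompt(row: dict) -> str:
    """Fill the template from one dataset row; both identifiers are required."""
    return PRR_PROMPT.format(
        condition_concept_name=row["condition_concept_name"],
        drug_concept_name=row["drug_concept_name"],
    )


if __name__ == "__main__":
    # Hypothetical row, for illustration only (not taken from the dataset).
    example = {"drug_concept_name": "aspirin", "condition_concept_name": "nausea"}
    print(build_prompt(example))
```

Looking up both keys explicitly means a row that lacks either identifier raises a `KeyError` instead of producing a prompt about only one of them, which matches the point that the ratio is only meaningful for a drug and side effect pair.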
data/nsides/offsides/meta.yaml
Outdated
bibtex: "\n @article{Tatonetti2012,\n author = {Tatonetti, Nicholas P. and Ye, Peter P. and Daneshjou, Roxana and Altman, Russ B.},\n \
  \ title = {Data-driven prediction of drug effects and interactions},\n journal = {Sci Transl Med},\n volume = {4},\n number\
  \ = {125},\n pages = {125ra31},\n year = {2012},\n doi = {10.1126/scitranslmed.3003377},\n pmid = {22422992},\n pmcid\
  \ = {PMC3382018}\n }\n "
can we remove those newlines somehow and just have a multiline string? I can help with that
sure! i'll let you take care of it if that's ok

Sorry for being unclear with my last reviews. I have more suggestions, but I can also take care of them if you prefer. Thanks for your contribution!
Co-authored-by: Kevin M Jablonka <32935233+kjappelbaum@users.noreply.github.com>
MicPie left a comment
lgtm, I just added some minor changes to the text
- id: drug_concept_name
  description: RxNorm name string for the drug
  type: categorical
- id: condition_concept_name
  description: MedDRA identifier for the side effect
  type: categorical
this will need to use prompt templates, I guess, because they are not independent.