Conversation
data/USPTO_500k/meta.yaml
- https://bioportal.bioontology.org/ontologies/AFO?p=classes&conceptid=http%3A%2F%2Fpurl.allotrope.org%2Fontologies%2Fquality%23AFQ_0000227
- https://en.wikipedia.org/wiki/Yield_(chemistry)
I would use the "id" from the ontology table, but I can show you with an example when we discuss.
Co-authored-by: Kevin M Jablonka <32935233+kjappelbaum@users.noreply.github.com>
for more information, see https://pre-commit.ci
phalem left a comment
Add benchmark field
link: https://tdcommons.ai/
split_column: split
identifiers:
- id: reaction_SMILES
There is a new entry for that.
Yeah, great, I see it. I will edit the file and open the PR again.
kjappelbaum left a comment
Similar comments as for the other PRs :)
Co-authored-by: Kevin M Jablonka <32935233+kjappelbaum@users.noreply.github.com>
I will add the benchmark field to the TDC version of USPTO
I split up the reaction, i.e.,
@MicPie, yes, I'd add reaction SMILES, as this is the best hope to remove duplicates.
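To illustrate the duplicate-removal idea, here is a minimal sketch, not the code used in this PR. It assumes each record carries a reaction SMILES string and treats two reactions as duplicates when they list the same species in each role, regardless of order. The helper names `normalize_reaction_smiles` and `deduplicate` are hypothetical. Note that this only sorts the strings; it does not canonicalize the molecules themselves, which would need a cheminformatics toolkit such as RDKit.

```python
from typing import Iterable, List


def normalize_reaction_smiles(rxn: str) -> str:
    """Sort the dot-separated species inside each '>'-separated part.

    Assumption: the input has the form 'reactants>agents>products'
    with no extra annotations after the SMILES.
    """
    parts = rxn.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"expected two '>' separators, got: {rxn!r}")
    # Sorting makes 'A.B>>C' and 'B.A>>C' compare equal.
    return ">".join(".".join(sorted(part.split("."))) for part in parts)


def deduplicate(reactions: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each reaction, preserving input order."""
    seen = set()
    unique = []
    for rxn in reactions:
        key = normalize_reaction_smiles(rxn)
        if key not in seen:
            seen.add(key)
            unique.append(rxn)
    return unique


if __name__ == "__main__":
    data = [
        "CCO.CC(=O)O>>CC(=O)OCC",
        "CC(=O)O.CCO>>CC(=O)OCC",  # same reaction, reactants swapped
        "CCO>>CC=O",
    ]
    print(deduplicate(data))
```

Differently written SMILES for the same molecule (for example `OCC` vs. `CCO`) would still slip through, so this is a lower bound on the duplicates that can be found.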
As I'm coming from the bio side: wouldn't we need more info for a reaction SMILES, or is it always:
I'm not sure what data is in the TDC yields set. They are not very specific in their documentation: https://tdcommons.ai/single_pred_tasks/yields/#uspto
On the other hand, I know the data from The
Currently, things seem to be a bit mixed up in this pull request.
You are right. There can be plenty of reactants, reagents, solvents, and catalysts leading to one or more products in a reaction SMILES.
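For readers less familiar with the format, a small sketch of how such a string breaks down. It assumes the common `reactants>agents>products` layout, where species within a part are separated by `.` and the middle part (reagents, solvents, catalysts) may be empty. The `Reaction` type and `split_reaction_smiles` function are hypothetical names for illustration only, and the example reaction is a textbook esterification, not a row from the dataset.

```python
from typing import List, NamedTuple


class Reaction(NamedTuple):
    reactants: List[str]
    agents: List[str]  # reagents, solvents, catalysts
    products: List[str]


def split_reaction_smiles(rxn: str) -> Reaction:
    """Split 'reactants>agents>products' into lists of species SMILES.

    Caveat: '.' also separates the ions of a salt, so '[Na+].[Cl-]'
    ends up as two entries in this simple sketch.
    """
    parts = rxn.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"expected two '>' separators, got: {rxn!r}")
    # Drop empty strings so an empty part becomes an empty list.
    reactants, agents, products = (
        [s for s in part.split(".") if s] for part in parts
    )
    return Reaction(reactants, agents, products)


if __name__ == "__main__":
    # Ethanol + acetic acid, sulfuric acid as agent, giving ethyl acetate + water.
    rxn = split_reaction_smiles("CCO.CC(=O)O>OS(=O)(=O)O>CC(=O)OCC.O")
    print(rxn.reactants, rxn.agents, rxn.products)
```

So one reaction SMILES can carry several species per role, which is why a single `reaction_SMILES` identifier can stand in for the whole reaction.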
Add uspto raw from drfp until I finish uspto from tdc