Generate X unique similarity values for X fingerprints with respect to all molecules in the dataset.  #1

@AdamCoxson

Hi,

Say I have a set of 10,000 atoms, each with a fingerprint of 1000 continuous (normalised) scalar values describing it. Can I use this software to generate 10,000 scalar values, one per atom, each representing the similarity of that atom's fingerprint to all other fingerprints simultaneously, or to some arbitrary reference?

I've been playing with the code, but from my understanding it only generates a single scalar value showing the similarity of the dataset as a whole? I've gotten a bit lost!

Basically, I have used N-body Iteratively Contracted Equivariants to build up representations of the local atomic environments for all of the atoms in a set of 4000 organic molecules. A representation for a single atom can consist of many continuous scalar values (let's just say 10,000 atoms with 1000 elements in each atomic 'fingerprint' for the sake of argument). I can treat these like fingerprints, but I don't want a pairwise comparison. I want to apply some similarity metric that compares these representations and returns an array of 'similarity' scores, one for each fingerprint. Then I can plot a heatmap like the one below, where the phi metric on the colourbar scale has been replaced by the 'similarity of atomic environment'.

Obviously, I could just take the sum of all 1000 elements per atom and use that, but surely there is some sort of similarity metric that does a better job.
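
To make the request concrete, here is a minimal NumPy sketch of the kind of output I mean. It is not taken from this software: the function names are hypothetical, and cosine similarity is just an assumed stand-in for whatever metric is appropriate. The first function returns one score per fingerprint (its mean cosine similarity to all the other fingerprints) without building the full pairwise matrix; the second scores every fingerprint against an arbitrary reference vector.

```python
import numpy as np


def mean_cosine_similarity_to_others(fingerprints):
    """One score per row: mean cosine similarity of that row to all other rows.

    Hypothetical helper, not part of this software. Uses the identity
    sum_{j != i} u_i . u_j = u_i . (sum_j u_j) - u_i . u_i for unit rows u,
    so the cost is O(N * D) rather than O(N^2 * D).
    """
    X = np.asarray(fingerprints, dtype=float)
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two fingerprints")
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("all-zero fingerprints have no defined cosine similarity")
    U = X / norms                      # unit-length rows
    total = U.sum(axis=0)              # sum of all unit rows, shape (D,)
    # U @ total includes each row's self-similarity (exactly 1), so subtract it.
    return (U @ total - 1.0) / (n - 1)


def cosine_similarity_to_reference(fingerprints, reference):
    """One score per row: cosine similarity of that row to a reference vector."""
    X = np.asarray(fingerprints, dtype=float)
    r = np.asarray(reference, dtype=float)
    denom = np.linalg.norm(X, axis=1) * np.linalg.norm(r)
    if np.any(denom == 0):
        raise ValueError("zero-norm vector has no defined cosine similarity")
    return (X @ r) / denom


# Illustrative random data only: 100 'atoms' with 20 elements each.
rng = np.random.default_rng(0)
demo = rng.random((100, 20))
scores = mean_cosine_similarity_to_others(demo)          # shape (100,)
ref_scores = cosine_similarity_to_reference(demo, demo.mean(axis=0))
```

Either array has one value per atom, so it could be used directly as the colour scale in the heatmap.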

[Image: example heatmap with the phi metric on the colourbar scale]
