Description
Hi,
Say I have a set of 10,000 atoms, each with a fingerprint of 1000 continuous (normalised) scalar values describing it. Can I use this software to generate 10,000 scalar values, one per atom, each representing the similarity of that atom's fingerprint to all other fingerprints simultaneously, or to some arbitrary reference?
I've been playing with the code, but as far as I understand, it only generates a single scalar value describing the similarity of the dataset as a whole. Is that right? I've gotten a bit lost!
Basically, I have used N-body Iteratively Contracted Equivariants to build up representations of the local atomic environments for all of the atoms in a set of 4000 organic molecules. The representation for a single atom can consist of many continuous scalar values (let's just say 10,000 atoms with 1000 elements in each atomic 'fingerprint', for the sake of argument). I can treat these like fingerprints, but I don't want a pairwise comparison. I want to apply some similarity metric that compares these representations and returns an array of 'similarity' scores, one for each fingerprint. Then I can plot a heatmap like the one below, where the phi metric on the colourbar scale has been replaced by the 'similarity of atomic environment'.
Obviously, I could just take the sum of all 1000 elements per atom and use that, but surely there is some sort of similarity metric that does a better job.
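To make the request concrete, here is a rough NumPy sketch of the kind of output I mean. It is not based on this package's API: the function name `per_atom_similarity` is hypothetical, and cosine similarity is just an assumed stand-in for whatever metric would be more appropriate. With no reference it returns each fingerprint's average similarity to all the others without building the full pairwise matrix; with a reference it returns each fingerprint's similarity to that one vector.

```python
import numpy as np


def per_atom_similarity(X, reference=None):
    """Return one cosine-similarity score per row (atom) of X.

    Hypothetical helper for illustration only, not part of this package.
    X has shape (n_atoms, n_features). If `reference` is None, each score
    is the mean cosine similarity of that row to all *other* rows;
    otherwise it is the cosine similarity of that row to `reference`.
    """
    X = np.asarray(X, dtype=float)
    # Scale every row to unit length (all-zero rows are left as zeros).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = X / np.clip(norms, 1e-12, None)

    if reference is not None:
        r = np.asarray(reference, dtype=float)
        r = r / max(np.linalg.norm(r), 1e-12)
        return U @ r

    n = U.shape[0]
    # U @ U.sum(axis=0) gives, for each row, the sum of its cosine
    # similarities to every row (itself included) in O(n * d) time.
    total = U @ U.sum(axis=0)
    # Remove each row's similarity to itself before averaging.
    self_sim = np.einsum("ij,ij->i", U, U)
    return (total - self_sim) / (n - 1)


# Small random stand-in for the real (10,000 x 1000) fingerprint array.
rng = np.random.default_rng(0)
fingerprints = rng.random((10, 5))
scores = per_atom_similarity(fingerprints)                      # vs. all others
scores_ref = per_atom_similarity(fingerprints, fingerprints[0])  # vs. a reference
```

The resulting `scores` array has one value per atom, which is what I would map onto the colourbar in place of phi.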
