Skip to content

Adding cleavage benchmarks#404

Open
vue1999 wants to merge 30 commits intoddmms:mainfrom
vue1999:cleavage_benchmark
Open

Adding cleavage benchmarks#404
vue1999 wants to merge 30 commits intoddmms:mainfrom
vue1999:cleavage_benchmark

Conversation

@vue1999
Copy link
Copy Markdown

@vue1999 vue1999 commented Mar 4, 2026

Pre-review checklist for PR author

PR author must check the checkboxes below when creating the PR.

Summary

Adding a benchmark to evaluate the accuracy of predicting cleavage energies of crystalline surfaces.

Linked issue

Resolves #403

Progress

  • Calculations
  • Analysis
  • Application
  • Documentation

Testing

MACE omat, ORB v3

New decorators/callbacks

no

@ElliottKasoar ElliottKasoar added the new benchmark Proposals and suggestions for new benchmarks label Mar 4, 2026
@joehart2001
Copy link
Copy Markdown
Collaborator

joehart2001 commented Mar 22, 2026

Hey @vue1999, thanks for the PR and its looking super good. A few things:

  • ive made some code suggestions to the calc script so we save files in the generalised format. I know there are 30,000 files... but what do you think @ElliottKasoar?
  • ive made some knock on suggestions to the analysis based on these calc changes
  • ive also made suggestions which implement the density scatter plot + structure visualisation, as theres a lot of structures
  • We are also going to in the future add the ability to swithc betwen types of errors e.g. mae and rmse, so ive suggested we just keep mae for now unless you're against this?

Let me know if you've got any ideas or are unsure about any of my suggestions, thanks!

vue1999 and others added 19 commits March 25, 2026 11:42
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
vue1999 and others added 9 commits March 25, 2026 11:51
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
…gy.py

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
@vue1999
Copy link
Copy Markdown
Author

vue1999 commented Mar 25, 2026

Thanks for the review @joehart2001! Happy to accept all suggestions and fine to keep just MAE for now.

Co-authored-by: Joseph Hart <92541539+joehart2001@users.noreply.github.com>
Copy link
Copy Markdown
Collaborator

@ElliottKasoar ElliottKasoar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, @vue1999, and sorry for the delayed review on my side, it's looking great so far!

I've left a few minor comments. Could you run the pre-commit (see: https://ddmms.github.io/ml-peg/developer_guide/get_started.html#automatic-coding-style-check), to tidy up a few things for the automated tests?

I am also a bit concerned by the 30,000 files. The analysis just for a single model took me three minutes. We definitely still want access to the structure files, and it's good to save the data with them, but I wonder if it's worth also saving this to a simpler file format e.g. as you had previously with JSON (sorry if this feels back-and-forth) to be more practical.

It could also be worth exploring saving to a reduced number of files e.g. all structs in each mpid_dir, but I'm not sure it would make a huge difference.

Thoughts @joehart2001?

idx = 0
for mpid_dir in tqdm(sorted(d for d in data_dir.iterdir() if d.is_dir())):
for xyz_file in sorted(mpid_dir.glob("*.xyz")):
structs = read(xyz_file, index=":")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you set the charge and spin (multiplicity) for these structures? Orb-omol requires them to be set

from ml_peg.analysis.utils.utils import (
load_metrics_config,
mae,
write_density_trajectories,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get an import error trying from this import. This branch may need rebasing?

model_ref = []

for xyz_file in sorted(model_dir.glob("*.xyz"), key=lambda p: int(p.stem)):
slab = read(xyz_file)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think read is currently imported?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

new benchmark Proposals and suggestions for new benchmarks

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cleavage energy benchmark

3 participants