Invalid orbital range bug by calvinp0 · Pull Request #865 · ReactionMechanismGenerator/ARC

calvinp0 · 2026-04-10T13:45:43Z

Bugs:

Guard 1 (prevention, scheduler.py:1440): Case bug. The guard tested 'DLPNO' in level.method, but Level.__init__ normalizes the method to lowercase, so the test could never be true. It has been dead code since the day it was written and never fired once.
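
The case mismatch is easy to reproduce in isolation. The sketch below is not ARC code: `LevelSketch` is a hypothetical stand-in that only mimics the lowercase normalization described above.

```python
class LevelSketch:
    """Hypothetical stand-in for ARC's Level: lowercases the method on init."""

    def __init__(self, method: str):
        self.method = method.lower()


level = LevelSketch(method='DLPNO-CCSD(T)')

# The original guard compared against an uppercase literal, so it never matched.
buggy_guard_fires = 'DLPNO' in level.method
# Comparing against the normalized (lowercase) form matches as intended.
fixed_guard_fires = 'dlpno' in level.method
```

With the uppercase literal the membership test is always False, which is why the guard never triggered regardless of the species.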

Guard 2 (troubleshooting, trsh.py:1070): Structurally unreachable. This one actually used lowercase 'dlpno' correctly, but it sat in an elif after the Memory branch, and the error flow made it impossible to reach:

1. ORCA crashes with INVALID ORBITAL RANGE in err.txt.
2. determine_ess_status reads the log file, finds "ORCA finished by error termination in MDCI", and scans for "Please increase MaxCore" or "parallel calculation exceeds number of pairs"; it finds neither.
3. It falls through the for-else to "MDCI error in Orca. Assuming memory allocation error." → keywords = ['MDCI', 'Memory'].
4. trsh_ess_job sees 'Memory' in keywords → enters the Memory branch → increases memory → resubmits.
5. Same crash → back to step 2 → infinite loop.

The DLPNO check at step 4 was behind an elif, so it could never fire when Memory was in the keywords. Two bugs compounded: the first guard should have prevented the problem, and the second should have caught it but couldn't because of the control flow.
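
To make the control-flow problem concrete, here is a minimal sketch of the two branch orderings. The function names and return labels are hypothetical and only mirror the structure described above, not the actual body of trsh_ess_job.

```python
def choose_branch_original(keywords, method, is_monoatomic):
    """Pre-fix ordering as described: the Memory test comes first."""
    if 'Memory' in keywords:
        return 'memory'
    elif 'dlpno' in method and is_monoatomic:
        # Unreachable whenever 'Memory' is among the keywords.
        return 'dlpno_monoatomic'
    return 'unhandled'


def choose_branch_reordered(keywords, method, is_monoatomic):
    """Reordered: the DLPNO + monoatomic test runs before the Memory test."""
    if 'dlpno' in method and is_monoatomic:
        return 'dlpno_monoatomic'
    elif 'Memory' in keywords:
        return 'memory'
    return 'unhandled'


keywords = ['MDCI', 'Memory']
original = choose_branch_original(keywords, 'dlpno-ccsd(t)', True)
reordered = choose_branch_reordered(keywords, 'dlpno-ccsd(t)', True)
```

For the keywords produced by this crash, the original ordering always lands in the memory branch, while the reordered version reaches the DLPNO handling.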

Following up on why ARC kept troubleshooting ad infinitum:

1. ORCA fails → determine_ess_status sees "ORCA finished by error termination in MDCI", doesn't find "Please increase MaxCore" or "parallel calculation exceeds number of pairs" in the log → falls through to else: keywords = ['MDCI', 'Memory']
2. trsh_ess_job enters Orca Memory branch → 'memory' not in ess_trsh_methods → appends 'memory' → calculates new memory via estimate_orca_mem_cpu_requirement(num_heavy_atoms=0) → couldnt_trsh stays False
3. Scheduler resubmits with new memory → same ORCA crash
4. trsh_ess_job enters Orca Memory branch again → 'memory' already in list (not re-added) → calculates same memory estimate → couldnt_trsh stays False
5. Repeat steps 3-4 forever
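
The cycle above can be replayed with a small simulation. This is only a sketch under stated assumptions: `classify_mdci_error`, `estimate_memory`, and its formula are hypothetical simplifications, and the 'recognized' keyword is a placeholder because the real keywords for the recognized phrases are not shown in this PR.

```python
KNOWN_PHRASES = (
    'Please increase MaxCore',
    'parallel calculation exceeds number of pairs',
)


def classify_mdci_error(log_lines):
    """Return status keywords for an MDCI failure (simplified sketch)."""
    for line in log_lines:
        if any(phrase in line for phrase in KNOWN_PHRASES):
            keywords = ['MDCI', 'recognized']  # placeholder keywords
            break
    else:
        # No known phrase was found: fall back to assuming a memory problem.
        keywords = ['MDCI', 'Memory']
    return keywords


def estimate_memory(num_heavy_atoms):
    """Hypothetical deterministic estimate: same input, same output."""
    return 2000 + 500 * num_heavy_atoms


def simulate_trsh(cycles):
    """Replay the crash/troubleshoot cycle a fixed number of times."""
    log_lines = ['ORCA finished by error termination in MDCI']
    ess_trsh_methods, memory_requests = [], []
    couldnt_trsh = False
    for _ in range(cycles):
        keywords = classify_mdci_error(log_lines)
        if 'Memory' in keywords:
            if 'memory' not in ess_trsh_methods:
                ess_trsh_methods.append('memory')
            memory_requests.append(estimate_memory(num_heavy_atoms=0))
        # Nothing here ever sets couldnt_trsh to True, so the job would be
        # resubmitted and crash identically on the next cycle.
    return couldnt_trsh, ess_trsh_methods, memory_requests


couldnt_trsh, methods, memory_requests = simulate_trsh(cycles=5)
```

Every cycle requests the same amount of memory and never reports failure, so without an external cap the loop has no exit condition.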

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes a DLPNO + monoatomic edge case that could trigger ORCA “INVALID ORBITAL RANGE” failures and an infinite memory-troubleshooting loop.

Changes:

Normalize the DLPNO monoatomic guard in the scheduler to actually trigger (case/normalization fix) and downgrade to the canonical (non-DLPNO) method.
Pass monoatomic context into ESS troubleshooting.
Reorder ORCA troubleshooting to detect DLPNO + monoatomic/H before memory-based retries.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
arc/scheduler.py	Detects DLPNO on monoatomic species and rewrites the level of theory; forwards monoatomic flag into troubleshooting.
arc/job/trsh.py	Adds `is_monoatomic` parameter and prioritizes DLPNO+monoatomic/H handling before memory retries for ORCA.


arc/scheduler.py

alongd · 2026-04-10T15:42:58Z

We must add a trsh counter (or do we already have one?), so we don't do anything infinitely

alongd

Thanks!! Added some comments

arc/scheduler.py

alongd · 2026-04-10T15:49:40Z

arc/job/trsh.py


    elif 'orca' in software:
-        if 'Memory' in job_status['keywords']:
+        if 'dlpno' in level_of_theory.method and (is_monoatomic or is_h):


This shouldn't happen; if it does, it means Scheduler is buggy. I almost think we could raise an error here (so devs know).

Fair, changed it

calvinp0 · 2026-04-10T15:54:04Z

We must add a trsh counter (or do we already have one?), so we don't do anything infinitely

Depends on what we are troubleshooting - like for TS guess, we eventually try everything and then declare all methods attempted

calvinp0 · 2026-04-10T16:24:30Z

We must add a trsh counter (or do we already have one?), so we don't do anything infinitely

I added a counter now, and defaulted it to 10 in the settings.py
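
A global cap of this kind could look roughly like the sketch below. The names `MAX_ESS_TRSH_ATTEMPTS` and `should_troubleshoot` are hypothetical, not the actual identifiers in settings.py; the value 10 is the default mentioned in this comment.

```python
MAX_ESS_TRSH_ATTEMPTS = 10  # hypothetical setting name; value from this comment


def should_troubleshoot(attempts_so_far, max_attempts=MAX_ESS_TRSH_ATTEMPTS):
    """Allow another troubleshooting round only while under the cap."""
    return attempts_so_far < max_attempts


attempts = 0
while should_troubleshoot(attempts):
    # Each pass stands in for one troubleshoot-and-resubmit cycle.
    attempts += 1
```

The loop ends once the cap is reached, no matter what the individual troubleshooting branch reports.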

alongd · 2026-04-10T18:34:59Z

I think the trsh counter could be per trsh method (not to try the same one too many times), Maybe we can add a trsh_counter dict to Species?
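
One way to picture the per-method idea is a dictionary of counts keyed by troubleshooting method. This is a sketch only: `SpeciesSketch`, `may_attempt`, and the limit values are hypothetical and do not reflect ARC's Species class.

```python
from collections import defaultdict


class SpeciesSketch:
    """Hypothetical stand-in for a species carrying per-method trsh counts."""

    def __init__(self, label):
        self.label = label
        self.trsh_counter = defaultdict(int)


def may_attempt(species, method, limits, default_limit=3):
    """Record and allow an attempt unless this method's limit is exhausted."""
    limit = limits.get(method, default_limit)
    if species.trsh_counter[method] >= limit:
        return False
    species.trsh_counter[method] += 1
    return True


spc = SpeciesSketch('H')
limits = {'memory': 2}  # illustrative limit, not an ARC default
results = [may_attempt(spc, 'memory', limits) for _ in range(3)]
```

Here the third 'memory' attempt is refused while other methods keep their own independent budgets.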

codecov · 2026-04-10T20:20:40Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.20%. Comparing base (61a711a) to head (7ff2e77).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #865      +/-   ##
==========================================
+ Coverage   60.10%   60.20%   +0.10%     
==========================================
  Files         102      102              
  Lines       31041    31052      +11     
  Branches     8082     8084       +2     
==========================================
+ Hits        18657    18696      +39     
+ Misses      10071    10033      -38     
- Partials     2313     2323      +10

Flag	Coverage Δ
functionaltests	`60.20% <ø> (+0.10%)`	⬆️
unittests	`60.20% <ø> (+0.10%)`	⬆️


calvinp0 · 2026-04-10T20:56:17Z

I think the trsh counter could be per trsh method (not to try the same one too many times), Maybe we can add a trsh_counter dict to Species?

I guess so. The thing is, most of our trsh methods are safe - they have limits. For example, rotor scans are capped at 4 in the settings. TS guesses have a flag I implemented a year or more ago that tries all relevant methods and then sets 'all_attempted' to indicate exhaustion. Orca memory has a guard that checks whether 'memory' is in ess_trsh_methods, and Molpro has an elif chain of guards.

The issue now is Gaussian memory: it has no "'memory' not in ess_trsh_methods" guard, so it can keep doubling until hitting 95% of node memory. I think that was for ATLAS? Then there is general ESS trsh, which has no global max-attempt counter, so it relies entirely on the ess_trsh_methods list eventually matching attempted_ess_trsh_methods.

So, I am not so sure having an ESS trsh counter per method is really relevant here.

calvinp0 · 2026-04-10T21:06:30Z

Ok

I think the trsh counter could be per trsh method (not to try the same one too many times), Maybe we can add a trsh_counter dict to Species?

After investigating further, I am hesitant about a per-trsh-method counter, because then we would need to set a limit for a fair few methods and also allow the user to change each of them (could be overwhelming). As for ORCA memory, I misspoke: it doesn't have a guard, because we sometimes have to troubleshoot the memory multiple times. ORCA can keep saying 'okay, allocate more'; we do, and then it complains and asks for even more.

DLPNO methods are incompatible with monoatomic species. This change generalizes the previous hydrogen-specific check to all monoatomic species and automatically falls back to the canonical method by stripping the 'dlpno-' prefix. It also adds a cap on the number of ESS troubleshooting attempts.
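
The fallback described here boils down to a prefix strip. A minimal sketch, assuming the method string is already (or can be) lowercased; `canonical_method` is a hypothetical helper name, not a function from ARC.

```python
def canonical_method(method: str) -> str:
    """Drop a leading 'dlpno-' to get the canonical (non-DLPNO) method name."""
    method = method.lower()
    prefix = 'dlpno-'
    if method.startswith(prefix):
        return method[len(prefix):]
    return method


downgraded = canonical_method('DLPNO-CCSD(T)')
unchanged = canonical_method('ccsd(t)')
```

Methods without the prefix pass through untouched, so the helper is safe to call on any level of theory.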


calvinp0 requested review from Lilachn91, alongd and Copilot April 10, 2026 13:45

Copilot AI reviewed Apr 10, 2026


github-actions bot added Module: Scheduler Module: trsh Troubleshooting labels Apr 10, 2026

Copilot started reviewing on behalf of calvinp0 April 10, 2026 14:14

calvinp0 force-pushed the mono_atom_orca branch from 69457e1 to e3a59d3 on April 10, 2026 15:30

alongd reviewed Apr 10, 2026


calvinp0 force-pushed the mono_atom_orca branch from e3a59d3 to da8f011 on April 10, 2026 16:18

calvinp0 requested a review from alongd April 10, 2026 16:24

calvinp0 force-pushed the mono_atom_orca branch from da8f011 to f799aba on April 10, 2026 16:28

calvinp0 force-pushed the mono_atom_orca branch 4 times, most recently from 557b52d to 46b213f on April 12, 2026 11:31

github-actions bot added the Module: SSH label Apr 12, 2026

calvinp0 added 4 commits April 12, 2026 15:58

Handle monoatomic species for DLPNO methods in Orca

3ce9bb7

Generalize the H-atom-specific check to all monoatomic species when using DLPNO methods in Orca, as these methods are incompatible with single-atom systems that lack electron pairs to correlate. Added tests for trsh regard monoatomic

Max TRSH ESS counter

c76d65b

Added a counter now to how many times ARC will troubleshoot an ESS job. This is set in the settings.py - default is 25 times.

Fix logging reference in ssh.py

dfd53e3

Correctly import and use the logging module to set the Paramiko log level in the SSHClient class, replacing an undefined logger reference.
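
For context, setting a third-party library's log level through the standard logging module usually looks like the sketch below. The WARNING level is an assumption for illustration; the commit message does not say which level ARC sets.

```python
import logging

# Paramiko logs under the 'paramiko' logger name; getLogger works even if
# Paramiko itself is not imported in this module.
paramiko_logger = logging.getLogger('paramiko')
paramiko_logger.setLevel(logging.WARNING)  # assumed level for illustration
```

Going through `logging.getLogger` avoids relying on a module-level logger object that may not be defined in the file.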

calvinp0 force-pushed the mono_atom_orca branch from 46b213f to dfd53e3 on April 12, 2026 12:58
