CREST Adapter#807

Open
calvinp0 wants to merge 67 commits into main from crest_adapter
calvinp0 (Member) commented Nov 27, 2025

Addition of CREST Adapter that complements the heuristic adapter.

This pull request adds support for the CREST conformer and transition state (TS) search method to the ARC project, along with several related improvements and code cleanups. The most important changes include integrating CREST as a TS search adapter, updating configuration and constants, and enhancing the heuristics TS search logic for better provenance tracking and code clarity.

CREST Integration:

  • Added CREST as a supported TS search method: updated JobEnum (arc/job/adapter.py), included CREST in the list of adapters and RMG family mapping, and registered it as a default incore adapter (arc/job/adapters/common.py, arc/job/adapters/ts/__init__.py).
  • Implemented a new test suite for CREST input generation (arc/job/adapters/ts/crest_test.py).
  • Added a Makefile target and installation script for CREST (Makefile).

Constants and Configuration:

  • Added the angstrom_to_bohr conversion constant to both Cython and Python constants modules (arc/constants.pxd, arc/constants.py).
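
As a rough illustration of what such a constant is for, here is a minimal sketch. The numeric value is an assumption taken from the CODATA 2018 Bohr radius and may differ in precision from what arc/constants.py actually defines; `coords_to_bohr` is a hypothetical helper, not part of ARC.

```python
# Assumption: 1 bohr = 0.529177210903 angstrom (CODATA 2018 Bohr radius).
# The constant in arc/constants.py may be written with different precision.
BOHR_RADIUS_IN_ANGSTROM = 0.529177210903
angstrom_to_bohr = 1.0 / BOHR_RADIUS_IN_ANGSTROM


def coords_to_bohr(coords_angstrom):
    """Hypothetical helper: convert a list of (x, y, z) tuples from angstrom to bohr."""
    return [tuple(value * angstrom_to_bohr for value in xyz) for xyz in coords_angstrom]


if __name__ == "__main__":
    # A point one Bohr radius along x maps to approximately (1, 0, 0) in bohr.
    print(coords_to_bohr([(0.529177210903, 0.0, 0.0)]))
```

A conversion like this is typically needed when a program expects atomic units for distances while coordinates are stored in angstrom.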

Heuristics TS Search Enhancements and Refactoring:

  • Refactored heuristics TS search logic to track and combine method provenance for TS guesses, allowing for more precise attribution when multiple methods contribute to a guess (arc/job/adapters/ts/heuristics.py).
  • Improved code readability and maintainability by reformatting imports and function calls, and clarifying data structures and comments in heuristics TS search (arc/job/adapters/ts/heuristics.py).

calvinp0 force-pushed the crest_adapter branch 2 times, most recently from 3e88c36 to 7674f5f, February 2, 2026 09:49
calvinp0 requested a review from Copilot, February 2, 2026 12:10

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

calvinp0 force-pushed the crest_adapter branch 2 times, most recently from 9be935e to eab7647, February 4, 2026 20:14
calvinp0 requested a review from alongd, February 4, 2026 21:30
calvinp0 added 8 commits April 8, 2026 01:07
MockAdapter writes YAML output (not real ESS logs), so
determine_ess_status falsely reports a convergence failure.
Patching _parse_ess_error to return None in TestRunTask and
TestWorkerLoop skips the ESS check for these MockAdapter-based tests.

cluster_tsgs() merges methods into the method_sources list, not into the
method string. Updated test_cluster_tsgs to expect the representative's
original method and execution_time. Updated test_as_dict and
test_from_dict to include method_sources in the expected output.
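
The clustering behavior described in this commit message can be sketched roughly as follows. This is not ARC's `cluster_tsgs()`: `GuessSketch`, `geometry_key`, and `cluster_guesses` are hypothetical stand-ins, and a string key replaces the real geometry similarity check.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GuessSketch:
    """Hypothetical stand-in for a TS guess; only the fields named in the commit message."""
    method: str
    geometry_key: str  # assumption: replaces a real geometry comparison
    execution_time: float
    method_sources: List[str] = field(default_factory=list)

    def __post_init__(self):
        # A fresh guess is its own only source.
        if not self.method_sources:
            self.method_sources = [self.method]


def cluster_guesses(guesses: List[GuessSketch]) -> List[GuessSketch]:
    """Merge guesses with the same geometry, keeping the first one as representative."""
    clusters: Dict[str, GuessSketch] = {}
    for guess in guesses:
        representative = clusters.get(guess.geometry_key)
        if representative is None:
            clusters[guess.geometry_key] = GuessSketch(
                guess.method, guess.geometry_key, guess.execution_time, list(guess.method_sources)
            )
            continue
        # Provenance goes into method_sources; method and execution_time stay untouched.
        for source in guess.method_sources:
            if source not in representative.method_sources:
                representative.method_sources.append(source)
    return list(clusters.values())
```

Under these assumptions, a heuristics guess and a CREST guess with the same geometry collapse into one entry whose `method` is still `'heuristics'` and whose `method_sources` is `['heuristics', 'crest']`.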
Updated Arkane validation logic to provide more granular feedback when atom energy corrections (AEC) or bond additivity corrections (BAC) are missing.

Key changes include:
- Added `check_arkane_aec` to verify atom energy corrections independently when BAC is disabled.
- Enhanced `check_arkane_bacs` to specifically identify and log whether AEC, BAC, or both are missing from the RMG database.
- Improved log reporting to distinguish between PBAC and MBAC types.
Updated the energy correction retrieval logic to perform independent fuzzy matching for AEC and BAC keys. This ensures corrections can be retrieved even if they are stored under slightly different level-of-theory definitions in the RMG database (e.g., one including the software attribute and the other not).

Key changes include:
- Modified `_get_energy_corrections` to find `aec_key` and `bac_key` independently.
- Added specific search ranges for PBAC and MBAC sections within the database files.
- Updated the `get_qm_corrections.py` script interface to handle separate keys for atom and bond corrections.
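
A minimal sketch of the independent matching idea, assuming levels of theory can be represented as plain dicts. The real `_get_energy_corrections` searches RMG database files; `fuzzy_match` and `find_correction_keys` are hypothetical names used only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Level = Dict[str, str]


def fuzzy_match(query: Level, candidates: List[Level],
                ignore: Tuple[str, ...] = ("software",)) -> Optional[Level]:
    """Return an exact match if present, else a match that ignores the given attributes."""
    for candidate in candidates:
        if candidate == query:
            return candidate

    def strip(level: Level) -> Level:
        return {key: value for key, value in level.items() if key not in ignore}

    for candidate in candidates:
        if strip(candidate) == strip(query):
            return candidate
    return None


def find_correction_keys(query: Level, aec_levels: List[Level],
                         bac_levels: List[Level]) -> Tuple[Optional[Level], Optional[Level]]:
    """Look up the AEC and BAC keys separately, so one can match when the other does not."""
    return fuzzy_match(query, aec_levels), fuzzy_match(query, bac_levels)
```

With this shape, an AEC entry stored with a software attribute and a BAC entry stored without one can both be resolved from the same query.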

Remove unused import "get_arkane_model_chemistry"
Updated the logic for determining the Arkane level of theory to provide logging regarding its source and integrated AEC validation when BAC is not used.

Key changes include:
- Added logging to identify if the Arkane level of theory was explicitly set or inferred from the composite method or single point level.
- Integrated a call to `check_arkane_aec` to verify atom energy corrections when `bac_type` is not specified.
Updated the logic for retrieving QM corrections to handle separate keys for atom energy corrections (AEC) and bond additivity corrections (BAC). This ensures that corrections can be resolved independently if they are stored under different level-of-theory definitions in the RMG database.

Key changes include:
- Updated the script to process `aec_key` and `bac_key` independently.
- Maintained backward compatibility by using `matched_key` as a fallback for `aec_key`.
- Modified BAC retrieval to utilize the dedicated `bac_key`.
calvinp0 added 17 commits April 8, 2026 15:55
Updated Arkane validation logic to provide more granular feedback when atom energy corrections (AEC) or bond additivity corrections (BAC) are missing.

Key changes include:
- Added `check_arkane_aec` to verify atom energy corrections independently when BAC is disabled.
- Enhanced `check_arkane_bacs` to specifically identify and log whether AEC, BAC, or both are missing from the RMG database.
- Improved log reporting to distinguish between PBAC and MBAC types.

Enhance Arkane AEC and BAC validation logging and error handling

Updated the validation logic for Arkane energy corrections to handle potential input errors and provide more granular feedback when entries are missing.

Key changes include:
- Added exception handling to catch cases where Arkane quantum corrections data cannot be loaded from the RMG database.
- Improved logging to specifically identify the matched BAC key and correction type (PBAC or MBAC).
- Added a general warning message when no matching Arkane entry is found for a given level of theory.
Updated the energy correction retrieval logic to perform independent fuzzy matching for AEC and BAC keys. This ensures corrections can be retrieved even if they are stored under slightly different level-of-theory definitions in the RMG database (e.g., one including the software attribute and the other not).

Key changes include:
- Modified `_get_energy_corrections` to find `aec_key` and `bac_key` independently.
- Added specific search ranges for PBAC and MBAC sections within the database files.
- Updated the `get_qm_corrections.py` script interface to handle separate keys for atom and bond corrections.

Remove unused import "get_arkane_model_chemistry"
Updated the logic for determining the Arkane level of theory to provide logging regarding its source and integrated AEC validation when BAC is not used.

Key changes include:
- Added logging to identify if the Arkane level of theory was explicitly set or inferred from the composite method or single point level.
- Integrated a call to `check_arkane_aec` to verify atom energy corrections when `bac_type` is not specified.
Updated the logic for retrieving QM corrections to handle separate keys for atom energy corrections (AEC) and bond additivity corrections (BAC). This ensures that corrections can be resolved independently if they are stored under different level-of-theory definitions in the RMG database.

Key changes include:
- Updated the script to process `aec_key` and `bac_key` independently.
- Maintained backward compatibility by using `matched_key` as a fallback for `aec_key`.
- Modified BAC retrieval to utilize the dedicated `bac_key`.

Support backward compatibility for BAC keys in the QM corrections script

Updated the logic for retrieving QM corrections to handle legacy input formats where atom and bond corrections are not defined as independent keys.

Key changes include:
- Maintained backward compatibility by using `matched_key` as a fallback for `bac_key`, mirroring the existing behavior for `aec_key`.
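
The fallback described here amounts to a few lines. The sketch below assumes the script receives its input as a dict; `resolve_keys` is a hypothetical name, while `aec_key`, `bac_key`, and `matched_key` come from the commit messages.

```python
from typing import Optional, Tuple


def resolve_keys(params: dict) -> Tuple[Optional[str], Optional[str]]:
    """Prefer the dedicated keys; fall back to the legacy matched_key for either one."""
    matched_key = params.get("matched_key")
    aec_key = params.get("aec_key") or matched_key
    bac_key = params.get("bac_key") or matched_key
    return aec_key, bac_key
```

A legacy input that only carries `matched_key` resolves both corrections to that key, while a newer input with separate keys is used as given.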
In the Scheduler, only break after conformer troubleshooting if new jobs (conf_opt or conf_sp) are actually running for that species. This ensures the scheduler correctly falls through to the "all conformers done" check if troubleshooting was attempted but failed to launch new tasks.
Don't trigger a resubmission if "fresh" pending tasks (attempt_index == 0) still exist. The presence of fresh tasks indicates that some workers from the initial submission are still queued in the HPC scheduler; these workers will pick up both fresh and retried tasks once they start, making a new submission unnecessary.
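
The resubmission rule above could look roughly like this. The task representation (dicts with `status` and `attempt_index`) and the function name `should_resubmit` are assumptions; any other conditions the real coordinator checks are left out.

```python
from typing import Dict, List


def should_resubmit(tasks: List[Dict]) -> bool:
    """Resubmit only when pending work exists and none of it is fresh."""
    pending = [task for task in tasks if task["status"] == "PENDING"]
    if not pending:
        return False
    # Fresh tasks (attempt_index == 0) imply that workers from the initial
    # submission are still queued and will pick up retried tasks as well.
    has_fresh = any(task["attempt_index"] == 0 for task in pending)
    return not has_fresh
```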
DLPNO methods are incompatible with monoatomic species. This update generalizes the previous H-atom-specific check to all monoatomic species, substituting the DLPNO method with its canonical equivalent and logging a warning. It also adds a monoatomic status flag to the scheduler's job metadata.
Generalize the H-atom-specific check to all monoatomic species when using DLPNO methods in Orca, as these methods are incompatible with single-atom systems that lack electron pairs to correlate.
DLPNO methods are incompatible with monoatomic species. This change generalizes the previous hydrogen-specific check to all monoatomic species and automatically falls back to the canonical method by stripping the 'dlpno-' prefix.
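
A minimal sketch of the fallback, assuming the method is a plain string and the atom count is known. `method_for_species` is a hypothetical name; only the `'dlpno-'` prefix stripping and the warning come from the commit messages.

```python
import logging

logger = logging.getLogger(__name__)


def method_for_species(method: str, number_of_atoms: int) -> str:
    """Return the canonical method for monoatomic species when a DLPNO method was requested."""
    prefix = "dlpno-"
    if number_of_atoms == 1 and method.lower().startswith(prefix):
        canonical = method[len(prefix):]
        logger.warning("DLPNO is incompatible with monoatomic species; using %s instead of %s.",
                       canonical, method)
        return canonical
    return method
```

For example, `method_for_species('dlpno-ccsd(t)', 1)` returns `'ccsd(t)'`, while polyatomic species and non-DLPNO methods pass through unchanged.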
When a TS guess fails validation (e.g., NMD check), switch_ts picks the next guess but previously left stale state behind:

1. IRC species from the invalidated guess were never cleaned up. delete_all_species_jobs('TS0') only deletes jobs under the TS0 label, but IRC species like IRC_TS0_1 are separate entries in running_jobs/species_dict/etc. These orphaned species continued running in parallel with the new guess, potentially interfering with job processing.
2. job_types flags (freq, sp, opt) were never reset. After guess N's freq completed, job_types['freq'] = True carried over to guess N+1, causing the scheduler to skip re-running freq for the new geometry.
3. convergence was never reset to None.
4. The old line self.output[label]['geo'] = ... wrote to the wrong dict level (top-level keys instead of self.output[label]['paths']), making it dead code.
5. Pending pipe batches from the old guess were never discarded.
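
The cleanup implied by the five points above can be sketched on a plain dict. This is not the scheduler's code: the `state` layout, the `pending_pipe_batches` key, and `reset_state_for_next_guess` are assumptions, and clearing the `geo` path is only one guess at how item 4 might be handled.

```python
from typing import Any, Dict


def reset_state_for_next_guess(state: Dict[str, Any], ts_label: str) -> None:
    """Drop stale per-guess state so the next TS guess starts clean."""
    # 1. Remove IRC species spawned by the invalidated guess (e.g. IRC_TS0_1).
    irc_prefix = f"IRC_{ts_label}_"
    for container in ("running_jobs", "species_dict"):
        for label in [name for name in state[container] if name.startswith(irc_prefix)]:
            del state[container][label]

    entry = state["output"][ts_label]
    # 2. Reset job-type flags so opt, freq, and sp are rerun for the new geometry.
    for job_type in ("opt", "freq", "sp"):
        entry["job_types"][job_type] = False
    # 3. Convergence is unknown again.
    entry["convergence"] = None
    # 4. Assumption: the stale geometry path lives under 'paths' and is cleared.
    entry["paths"]["geo"] = ""
    # 5. Discard pending pipe batches that belong to the old guess.
    state["pending_pipe_batches"] = [
        batch for batch in state["pending_pipe_batches"] if batch["label"] != ts_label
    ]
```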
Passing server_job_ids allows the pipe coordinator to monitor the status of jobs on the server, facilitating job cancellation handling as indicated by the branch name.
Moving task spec reading and scratch directory creation into the try-except block ensures that initialization errors are properly caught. This allows the worker to mark the task as failed instead of leaving it stuck in a RUNNING state.
If the underlying scheduler job for a pipe run is no longer alive, any CLAIMED or RUNNING tasks are immediately marked as orphaned and PENDING tasks are cancelled. This ensures the pipe can reach a terminal state and doesn't hang indefinitely when its workers are lost.
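
A minimal sketch of this reconciliation step, assuming tasks are dicts with a `status` field. The states CLAIMED, RUNNING, and PENDING are named in the commit message; the target names `ORPHANED` and `CANCELLED` and the function name are assumptions.

```python
from typing import Dict, List


def reconcile_lost_workers(tasks: List[Dict], scheduler_job_alive: bool) -> List[Dict]:
    """If the scheduler job is gone, move every unfinished task to a terminal state."""
    if scheduler_job_alive:
        return tasks
    for task in tasks:
        if task["status"] in ("CLAIMED", "RUNNING"):
            task["status"] = "ORPHANED"  # a worker held this task and was lost
        elif task["status"] == "PENDING":
            task["status"] = "CANCELLED"  # no worker is left to pick it up
    return tasks
```

Once no task remains in a non-terminal state, the pipe as a whole can be closed out instead of waiting on workers that no longer exist.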

Use server job IDs to check if a pipe's scheduler job is still in the cluster queue. This allows the coordinator to identify when workers have been lost so that orphaned or stuck tasks can be cleaned up immediately. The implementation includes logic to match both standard and array job ID formats (e.g., Slurm/PBS).
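
One way to match both standard and array job IDs is to compare only the leading numeric part. The sketch below assumes Slurm array tasks look like `12345_7` and PBS array tasks like `12345[7].server`; `base_job_id` and `is_job_alive` are hypothetical names, and the actual implementation may parse queue output differently.

```python
import re
from typing import Iterable


def base_job_id(job_id: str) -> str:
    """Return the leading numeric part of a job ID, or the stripped ID if there is none."""
    match = re.match(r"\s*(\d+)", job_id)
    return match.group(1) if match else job_id.strip()


def is_job_alive(server_job_ids: Iterable[str], queue_job_ids: Iterable[str]) -> bool:
    """Check whether any submitted job still appears in the cluster queue listing."""
    wanted = {base_job_id(job_id) for job_id in server_job_ids}
    return any(base_job_id(job_id) in wanted for job_id in queue_job_ids)
```

Comparing whole base IDs rather than substrings avoids treating job `123456` as a match for job `12345`.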

Fixes for SLURM
@github-actions github-actions bot added the Module: trsh Troubleshooting label Apr 12, 2026