Migrating ARC to Python 14 #867

Draft
alongd wants to merge 5 commits into main from py14

alongd (Member) commented Apr 11, 2026

Bump the target runtime from 3.12 to 3.14 and modernize the codebase accordingly.

  • Update Python, Cython (≥3.1), and RDKit (≥2026.03) pins across environment.yml, requirements.txt, Dockerfile, and pyproject.toml
  • Fix stale check_python() gate that still referenced Python 3.7 / RMG-Py
  • Remove dead from __future__ import print_function and deprecated assertDictEqual
  • Replace deprecated typing generics with built-in equivalents (list, dict, X | None, X | Y, collections.abc) across 70 files (−828 lines net)

RMG's separate Python 3.9 environment is unchanged.



- def get_git_branch(path: Optional[str] = None) -> str:
+ def get_git_branch(path: str | None = None) -> str:
Comment on lines +1307 to +1309
def is_same_pivot(torsion1: list | str,
torsion2: list | str,
) -> bool | None:
@alongd alongd marked this pull request as draft April 11, 2026 10:32
Copilot AI left a comment

Pull request overview

This PR migrates ARC’s runtime baseline to Python 3.14 and modernizes the codebase accordingly (dependency pins, version gating, and widespread typing updates).

Changes:

  • Bump Python target/runtime metadata (3.12 → 3.14) and update dependency minimums (e.g., Cython/RDKit) across env/packaging artifacts.
  • Update Python-version gate logic and remove deprecated/legacy Python constructs.
  • Replace deprecated typing generics with built-in generics / X | None annotations across many modules.

Reviewed changes

Copilot reviewed 76 out of 77 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| utilities.py | Updates runtime Python version gate messaging/threshold. |
| requirements.txt | Raises minimum Python/RDKit/Cython versions (but see review comments). |
| pyproject.toml | Adds Python version metadata for packaging. |
| environment.yml | Pins conda env to Python 3.14 and updates Cython/RDKit pins. |
| Dockerfile | Builds ARC micromamba env with Python 3.14. |
| arc/utils/scale.py | Modernizes typing annotations and removes legacy typing imports. |
| arc/statmech/factory.py | Modernizes typing annotations for factory registration and adapter creation. |
| arc/species/vectors.py | Modernizes typing annotations for vector utilities. |
| arc/scripts/rmg_thermo.py | Updates typing annotations in the RMG thermo helper script. |
| arc/scripts/rmg_kinetics.py | Updates typing annotations in the RMG kinetics helper script. |
| arc/scripts/pipe_worker.py | Updates typing annotations in the pipe-mode worker script. |
| arc/scripts/common.py | Updates typing annotations for YAML I/O helpers. |
| arc/processor.py | Updates typing annotations around postprocessing helpers. |
| arc/parser/factory.py | Modernizes adapter registration typing. |
| arc/parser/adapters/yaml.py | Modernizes typing annotations for YAML parsing adapter. |
| arc/parser/adapters/xtb.py | Modernizes typing annotations for xTB parsing adapter. |
| arc/parser/adapters/terachem.py | Modernizes typing annotations for TeraChem parsing adapter. |
| arc/parser/adapters/qchem.py | Modernizes typing annotations for Q-Chem parsing adapter. |
| arc/parser/adapters/psi_4.py | Modernizes typing annotations for Psi4 parsing adapter. |
| arc/parser/adapters/orca.py | Modernizes typing annotations for ORCA parsing adapter. |
| arc/parser/adapters/molpro.py | Modernizes typing annotations for Molpro parsing adapter. |
| arc/parser/adapters/cfour.py | Modernizes typing annotations for CFOUR parsing adapter. |
| arc/parser/adapter.py | Modernizes abstract adapter typing for ESS parsers. |
| arc/output.py | Modernizes typing annotations while writing consolidated output.yml. |
| arc/molecule/resonance.py | Updates typing annotations for resonance utilities. |
| arc/molecule/molecule_test.py | Replaces deprecated assertDictEqual usage. |
| arc/molecule/draw.py | Updates typing annotations for drawing utilities. |
| arc/mapping/driver.py | Updates typing annotations for reaction mapping utilities. |
| arc/main.py | Modernizes typing annotations in ARC main class and helpers. |
| arc/level.py | Updates typing annotations and switches to collections.abc.Iterable. |
| arc/job/ssh.py | Modernizes typing annotations for SSH client utilities. |
| arc/job/pipe/pipe_state.py | Modernizes typing annotations for pipe state/records. |
| arc/job/pipe/pipe_planner.py | Modernizes typing annotations for pipe task planning. |
| arc/job/pipe/pipe_coordinator.py | Modernizes typing annotations for coordinating pipe runs. |
| arc/job/local.py | Modernizes typing annotations for local execution helpers. |
| arc/job/factory.py | Modernizes typing annotations for job adapter factory/registration. |
| arc/job/adapters/xtb_adapter.py | Modernizes typing annotations for xTB job adapter. |
| arc/job/adapters/ts/xtb_gsm.py | Modernizes typing annotations for xTB-GSM TS adapter. |
| arc/job/adapters/ts/orca_neb.py | Modernizes typing annotations for ORCA NEB TS adapter. |
| arc/job/adapters/ts/kinbot_ts.py | Modernizes typing annotations for KinBot TS adapter. |
| arc/job/adapters/ts/gcn_ts.py | Modernizes typing annotations for GCN TS adapter. |
| arc/job/adapters/ts/autotst_ts.py | Modernizes typing annotations for AutoTST TS adapter. |
| arc/job/adapters/torch_ani.py | Modernizes typing annotations for TorchANI adapter/scripts. |
| arc/job/adapters/terachem.py | Modernizes typing annotations for TeraChem job adapter. |
| arc/job/adapters/scripts/xtb_gsm/tm2orca.py | Removes dead __future__ import usage. |
| arc/job/adapters/scripts/tani_script.py | Modernizes typing annotations in helper script. |
| arc/job/adapters/scripts/autotst_script.py | Modernizes typing annotations in helper script. |
| arc/job/adapters/qchem.py | Modernizes typing annotations for Q-Chem job adapter. |
| arc/job/adapters/psi_4.py | Modernizes typing annotations for Psi4 job adapter. |
| arc/job/adapters/orca.py | Modernizes typing annotations for ORCA job adapter. |
| arc/job/adapters/obabel.py | Modernizes typing annotations for OpenBabel job adapter. |
| arc/job/adapters/molpro.py | Modernizes typing annotations for Molpro job adapter. |
| arc/job/adapters/mockter.py | Modernizes typing annotations for mock adapter. |
| arc/job/adapters/gaussian.py | Modernizes typing annotations for Gaussian job adapter. |
| arc/job/adapters/common.py | Modernizes typing annotations for shared job-adapter helpers. |
| arc/job/adapters/cfour.py | Modernizes typing annotations for CFOUR job adapter. |
| arc/job/adapter.py | Modernizes typing annotations for the base JobAdapter API. |
| arc/checks/common.py | Modernizes typing annotations for common checks helpers. |
Comments suppressed due to low confidence (1)

arc/output.py:37

  • write_output_yml() still annotates parameters as Dict/List, but those names are no longer imported (and there is no postponed annotation evaluation). This will raise NameError when importing arc.output. Replace these with built-in generics (dict[...], list[...]) or re-import the typing aliases, and apply the same fix to other Dict/List annotations in this file.

Comment on lines +82 to 85
  def _output_command_error_message(command: list[str],
                                    error: subprocess.CalledProcessError,
-                                   logging_func: Union[logger.warning, logger.error],
+                                   logging_func: logger.warning | logger.error,
                                    ) -> None:

Copilot AI Apr 11, 2026

_output_command_error_message() uses logging_func: logger.warning | logger.error as a type annotation. Since this module does not use postponed annotation evaluation, this expression is evaluated at import time and will raise TypeError (bitwise-or between bound methods). Use a Callable[[object], object]/Callable[..., Any] annotation instead (or stringize via from __future__ import annotations).
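
A minimal sketch of the `Callable` alternative suggested above. The logger name, the simplified signature (a `returncode` integer instead of a `subprocess.CalledProcessError`), and the message format are assumptions made for illustration and do not reproduce ARC's helper. The sketch evaluates the `|` expression directly, so its behavior does not depend on when the interpreter evaluates annotations.

```python
import logging
from collections.abc import Callable

logger = logging.getLogger("annotation_sketch")  # hypothetical logger name


def bound_method_union_error() -> str:
    """Evaluate the annotation expression from the diff and return the error text."""
    try:
        logger.warning | logger.error  # bound methods do not implement `|`
    except TypeError as exc:
        return str(exc)
    return ""


def output_command_error_message(command: list[str],
                                 returncode: int,
                                 logging_func: Callable[..., None],
                                 ) -> str:
    """Log a failure message through whichever logging callable was passed in."""
    message = f"Command {' '.join(command)} failed with exit code {returncode}"
    logging_func(message)
    return message
```

Any callable that accepts a message fits the annotation, so `logger.warning`, `logger.error`, or a test double such as `list.append` can be passed.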


- def combine_parameters(input_dict: dict, terms: list) -> Tuple[dict, List]:
+ def combine_parameters(input_dict: dict, terms: list) -> tuple[dict, List]:

Copilot AI Apr 11, 2026

combine_parameters() annotates its return type as tuple[dict, List], but List is no longer imported in this module. Because annotations are evaluated at function definition time here, importing this module will raise NameError: name 'List' is not defined. Replace List with a built-in list[...] type (or re-import List).

Suggested change
def combine_parameters(input_dict: dict, terms: list) -> tuple[dict, List]:
def combine_parameters(input_dict: dict, terms: list) -> tuple[dict, list]:
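
Leftovers of this kind can be located without importing the affected module. The following is a rough sketch using the standard `ast` module; the helper name `find_unimported_typing_names` and the `LEGACY_NAMES` set are hypothetical. It only recognizes `from typing import ...` style imports and does not look inside quoted annotations.

```python
import ast

# Legacy typing aliases targeted by the migration (assumed list, based on the PR text).
LEGACY_NAMES = {"Dict", "List", "Tuple", "Set", "Type", "Optional", "Union"}


def find_unimported_typing_names(source: str) -> list[tuple[int, str]]:
    """Return (line, name) pairs for legacy typing names used but not imported."""
    tree = ast.parse(source)
    imported: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module == "typing":
            imported.update(alias.asname or alias.name for alias in node.names)
    hits = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Name)
                and node.id in LEGACY_NAMES
                and node.id not in imported):
            hits.append((node.lineno, node.id))
    return sorted(hits)
```

Because the source is only parsed and never executed, the scan works even on modules that would fail at import time.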

Comment on lines 68 to 70
  @abstractmethod
- def parse_frequencies(self) -> Optional['np.ndarray']:
+ def parse_frequencies(self) -> 'np.ndarray' | None:
      """

Copilot AI Apr 11, 2026

This module now uses return annotations like 'np.ndarray' | None (e.g. parse_frequencies). Without postponed annotation evaluation (from __future__ import annotations), this is evaluated at import time and raises TypeError because it's a str | None expression. Use np.ndarray | None (and import numpy at runtime) or use a single quoted forward reference like 'np.ndarray | None' / enable postponed evaluation.
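
The difference between quoting one operand and quoting the whole annotation can be shown with a small sketch that needs no NumPy install. The function names are hypothetical stand-ins. The failing expression is evaluated directly here; on Python 3.14, where annotations are evaluated lazily (PEP 649), the same `TypeError` would be expected to surface when the annotation is introspected rather than at definition time.

```python
def quoted_operand_union_error() -> str:
    """Evaluate `'np.ndarray' | None` as a plain expression and return the error text."""
    try:
        'np.ndarray' | None  # a str and None cannot be combined with `|`
    except TypeError as exc:
        return str(exc)
    return ""


def parse_frequencies_stub() -> 'np.ndarray | None':
    """Stand-in with the whole annotation quoted, so it is stored as one string."""
    return None


# The quoted annotation is kept verbatim and is never evaluated as an expression.
RETURN_ANNOTATION = parse_frequencies_stub.__annotations__['return']
```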

Comment on lines 48 to 59
  def parse_frequencies(self) -> 'np.ndarray' | None:
      """
      Parse the frequencies from a freq job output file.

-     Returns: Optional[np.ndarray]
+     Returns: np.ndarray | None
          The parsed frequencies (in cm^-1).
      """
      freqs = self.data.get('freqs')
      return np.array(freqs, dtype=np.float64) if freqs else None

- def parse_normal_mode_displacement(self) -> Tuple[Optional['np.ndarray'], Optional['np.ndarray']]:
+ def parse_normal_mode_displacement(self) -> tuple['np.ndarray' | None, 'np.ndarray' | None]:
      """

Copilot AI Apr 11, 2026

This file uses return annotations like 'np.ndarray' | None / tuple['np.ndarray' | None, ...] (see parse_frequencies / parse_normal_mode_displacement). Since annotations are evaluated immediately here, this will raise TypeError at import time. Prefer np.ndarray | None (no quotes) or enable postponed evaluation of annotations, and make sure np is available if using it directly.

return None

- def parse_nd_scan_energies(self) -> Optional[Dict]:
+ def parse_nd_scan_energies(self) -> Dict | None:

Copilot AI Apr 11, 2026

parse_nd_scan_energies() is annotated as -> Dict | None, but Dict is no longer imported. With eager annotation evaluation, importing this module will raise NameError: name 'Dict' is not defined. Change this to dict | None (or re-import Dict).

return None

- def parse_nd_scan_energies(self) -> Optional[Dict]:
+ def parse_nd_scan_energies(self) -> Dict | None:

Copilot AI Apr 11, 2026

parse_nd_scan_energies() is annotated as -> Dict | None but Dict is not imported in this module. With eager annotation evaluation, importing the module will raise NameError. Use dict | None (or re-import Dict).

return None

- def parse_nd_scan_energies(self) -> Optional[Dict]:
+ def parse_nd_scan_energies(self) -> Dict | None:

Copilot AI Apr 11, 2026

parse_nd_scan_energies() is annotated as -> Dict | None but Dict is not imported in this module anymore. This causes a NameError during import unless annotations are postponed. Replace Dict with dict (or re-import Dict).

return None

- def parse_nd_scan_energies(self) -> Optional[Dict]:
+ def parse_nd_scan_energies(self) -> Dict | None:

Copilot AI Apr 11, 2026

parse_nd_scan_energies() is annotated as -> Dict | None but Dict is no longer imported in this module. This will raise NameError when importing the module. Use dict | None (or re-import Dict).

return None

- def parse_nd_scan_energies(self) -> Optional[Dict]:
+ def parse_nd_scan_energies(self) -> Dict | None:

Copilot AI Apr 11, 2026

parse_nd_scan_energies() is annotated as -> Dict | None but Dict is not imported in this module. This will raise NameError at import time. Replace with dict | None (or re-import Dict).

)


def write_input_file(self, settings : dict()) -> None:

Copilot AI Apr 11, 2026

write_input_file() annotates settings as dict(), which is a value expression evaluated at function definition time, not a type. This creates an empty dict object in __annotations__ and defeats type checking/introspection. Use dict or a parameterized type like dict[str, Any] instead.

Suggested change
def write_input_file(self, settings : dict()) -> None:
def write_input_file(self, settings: dict) -> None:
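
What ends up in `__annotations__` in each case can be checked with a short sketch. The two module-level functions are stand-ins for the method in the diff (no `self`, empty bodies), named hypothetically for this illustration.

```python
def write_input_file_before(settings: dict()) -> None:
    """Stand-in for the original signature: the annotation is a call, not a type."""


def write_input_file_after(settings: dict) -> None:
    """Stand-in for the suggested signature: the annotation is the dict type itself."""


# Reading __annotations__ yields an empty dict instance in the first case
# and the built-in dict class in the second.
BEFORE = write_input_file_before.__annotations__["settings"]
AFTER = write_input_file_after.__annotations__["settings"]
```

Tools that introspect annotations expect a type such as `dict` or `dict[str, Any]`, so the instance stored by `dict()` carries no usable type information.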

calvinp0 (Member) commented Apr 11, 2026

I can't seem to comment on the line directly, and I know this has gone back to draft and that I was not requested to review, but I did see that output.py still uses Dict without it being either imported or converted to dict.

  from arc.common import get_logger
  from arc.exceptions import DependencyError
- from arc.molecule.converter import to_rdkit_mol, from_rdkit_mol, to_ob_mol, from_ob_mol
+ from arc.molecule.converter import to_rdkit_mol, from_rdkit_mol


- def write_input_file(self, settings : dict()) -> None:
+ def write_input_file(self, settings: dict) -> None:
alongd and others added 4 commits April 12, 2026 08:01
Update all version pins across environment.yml, requirements.txt,
Dockerfile, and pyproject.toml for the Python 3.12 → 3.14 transition.
Bump Cython (≥3.1 for 3.14 Limited API) and RDKit (≥2026.03 for 3.14
wheels).  Fix the stale check_python() gate in utilities.py that still
referenced Python 3.7.  Remove dead `from __future__ import
print_function` in tm2orca.py and deprecated assertDictEqual in
molecule_test.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace deprecated typing generics with built-in equivalents across
70+ files:

  - List/Dict/Tuple/Set/Type → list/dict/tuple/set/type
  - Optional[X] → X | None
  - Union[X, Y] → X | Y
  - Sequence/Iterable/Callable → from collections.abc
  - typing.Match → re.Match

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OpenBabel is effectively unmaintained (no Python 3.13+ build on
conda-forge, stalled upstream PR since July 2025).  RDKit already
handles every critical path as the primary backend.

Deleted:
  - obabel.py adapter, ob_script.py, and their tests
  - to_ob_mol/from_ob_mol, xyz_to_pybel_mol, pybel_to_inchi
  - openbabel_force_field, openbabel_force_field_on_rdkit_conformers,
    mix_rdkit_and_openbabel_force_field
  - OB perception fallback in alternative_perception()
  - _openbabel_translator backend (RDKit is now sole backend)
  - ob_environment.yml, install_ob.sh, install-ob Make target
  - All settings, enums, and data references

The only loss is GAFF force field support (RDKit has MMFF94s/UFF).
Accepts 'openbabel-first' backend kwarg for backwards compat.
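
One way such a backwards-compatible keyword could be handled is sketched below. This is an assumption-laden illustration, not ARC's implementation: the function name `resolve_backend`, the `'rdkit'` value, and the error behavior are hypothetical; only the `'openbabel-first'` value comes from the commit message.

```python
# 'openbabel-first' is taken from the commit message; every other name is assumed.
LEGACY_BACKENDS = {"openbabel-first"}
SUPPORTED_BACKENDS = {"rdkit"}


def resolve_backend(backend: str = "rdkit") -> str:
    """Map a requested backend name onto the single remaining backend."""
    normalized = backend.lower()
    if normalized in SUPPORTED_BACKENDS:
        return normalized
    if normalized in LEGACY_BACKENDS:
        # Accepted for backwards compatibility, but served by RDKit.
        return "rdkit"
    raise ValueError(f"Unknown backend: {backend!r}")
```

Keeping the legacy value in an explicit set makes it easy to drop later without touching call sites that still pass it.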

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use explicit micromamba run -n arc_env for the docs build step
(micromamba-shell defaulted to rmg_env).  Add custom cache key
to bust stale Python 3.12 environment cache.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The typing modernization script replaced np.array with np.ndarray in
function CALLS as well as annotations.  Restore np.array() calls while
keeping np.ndarray in type annotations.  Also fix OB-specific SMILES
'C(1)CC(1)' → 'C1CC1' in test_scissors.
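
A regression of this kind (`np.ndarray` used as a call instead of in an annotation) can be detected statically. The sketch below uses the standard `ast` module and a hypothetical helper name, `find_ndarray_calls`; it assumes NumPy is imported under the alias `np` and does not need NumPy installed, because the source is only parsed.

```python
import ast


def find_ndarray_calls(source: str) -> list[int]:
    """Return the line numbers where `np.ndarray(...)` is called."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "ndarray"
                and isinstance(node.func.value, ast.Name)
                and node.func.value.id == "np"):
            lines.append(node.lineno)
    return sorted(lines)
```

Annotations such as `-> np.ndarray | None` are attribute accesses rather than calls, so they are not reported.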

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>