Skip to content

Add clean-room math kit; drop FFTW link and LGPL quadrature#290

Closed
krystophny wants to merge 6 commits into
drop-gsl-collision-freqsfrom
math-kit
Closed

Add clean-room math kit; drop FFTW link and LGPL quadrature#290
krystophny wants to merge 6 commits into
drop-gsl-collision-freqsfrom
math-kit

Conversation

@krystophny

@krystophny krystophny commented Jun 9, 2026

Copy link
Copy Markdown
Member

Adds src/math/, a clean-room math kit (modules neo_bessel_i, neo_bessel_k, neo_dawson, neo_gauss_kronrod, neo_gauss_quadrature, neo_fft), and swaps the two license-problematic dependencies onto it: magfie no longer links GPL FFTW, and transport no longer carries the LGPL IQPACK gen_laguerre_rule.f90.

Resolves #287
Resolves #288

Stacked on #289 (license-docs); the diff includes its THIRD_PARTY.md commit until that PR merges.

Modules

Module API Algorithm sources
neo_bessel_i bessel_in(n, x) (elemental), bessel_in_array(nmax, x, values) DLMF 10.25.2 power series for small x; Miller downward recurrence normalized by DLMF 10.29.1 / A&S 9.6.36; DLMF 10.40.1 asymptotic expansion for large x
neo_bessel_k bessel_kn(n, x) (pure) DLMF 10.31.1/10.31.2 power series for x <= 3; Chebyshev fits of the DLMF 10.40.2 asymptotic form for x > 3; upward recurrence DLMF 10.29.1
neo_dawson dawson(x) (elemental) Maclaurin series DLMF 7.6.4; Rybicki sampling-theorem method (Computers in Physics 3, 1989) for moderate x; asymptotic expansion DLMF 7.12.2
neo_gauss_kronrod integrate_gk(f, a, b, epsabs, epsrel, result, abserr, ierr[, key, limit]), keys 15/21/31/61 QUADPACK QAG semantics (Piessens et al. 1983, public domain); GK nodes/weights re-derived with mpmath at 50 digits via Stieltjes-polynomial construction
neo_gauss_quadrature gauss_legendre(n, x, w), gauss_legendre_ab(n, a, b, x, w), gauss_gen_laguerre(n, alpha, x, w) Legendre: Newton iteration on P_n roots (A&S 22.16.6 start, 25.4.29 weights); generalized Laguerre: Golub-Welsch via LAPACK dsterf with Christoffel-number weights (Szego Thm 3.4.2, DLMF 3.5.32)
neo_fft fft_r2c(x, c[, plan]), fft_c2c(z, sign), neo_fft_plan_t + neo_fft_plan_init Stockham self-sorting mixed-radix 2/3/4/5 (Temperton, JCP 52, 1983); Bluestein chirp-z (IEEE TAU 18, 1970) for arbitrary n; r2c via half-length complex packing (Cooley/Lewis/Welch 1970). FFTW forward convention, unnormalized

All coefficient tables (series, Chebyshev, quadrature nodes/weights) were generated or re-derived independently with mpmath at 40 to 50 digits; only DLMF, A&S, and the cited papers were consulted, no GSL/FFTW/Numerical Recipes source.

Thread safety

No module-level mutable state, no save, no allocations in scalar hot paths. All special-function routines are pure or elemental; the Gauss-Kronrod workspace is a local automatic array; the FFT plan is a caller-owned read-only table. This removes the GSL-workspace and FFTW-plan sharing hazards in OpenMP regions (the parallel loop in coil_tools now reuses one read-only plan across threads).

Dependency swaps

  • src/transport/gauss_laguerre_integration: init_gauss_laguerre_integration now calls gauss_gen_laguerre; gen_laguerre_rule.f90 (2175 lines, LGPL) deleted. test_transport passes with its hardcoded expected values unchanged.
  • src/magfie/coil_tools.f90: biot_savart_fourier and vector_potential_biot_savart_fourier use fft_r2c with a plan built once per nphi; src/fftw3.F90 deleted, FFTW removed from link lines, CI packages, and the superbuild. FFTW detection remains only to enable the optional test_fft_oracle.

Benchmarks (gfortran -O2, Apple Silicon)

Case ours GSL / FFTW ratio
bessel_in, scalar, mixed n in {0..100} 101-103 ns/call 112-113 ns/call 1.1x faster
bessel_in_array, nmax=200 587-633 ns/call 712-746 ns/call 1.2x faster
bessel_kn, n in 0..10, x in [0.1, 30] 56-63 ns/call 87-89 ns/call 1.4x faster
dawson, x in (0, 12.5] 21.1 ns/call 56.7 ns/call 2.7x faster
integrate_gk, smooth exp on [0,1], single panel 74.3 ns/call 88.4 ns/call 1.2x faster
integrate_gk, sin(50x) on [0,1], adaptive (315 evals) 2005 ns/call 2067 ns/call 1.0x
gauss_legendre, n=33 (GSL Newton path) 2211 ns/call 5182 ns/call 2.3x faster
gauss_gen_laguerre, n=32, alpha=2.5 17.7 us/call 23.1 us/call 1.3x faster
fft_r2c, n=256 / 1024 / 4096, plan reuse 460 / 2037 / 8585 ns 254 / 1129 / 5345 ns 1.6-1.8x slower than FFTW

The FFT trails FFTW's SIMD codelets (3-4x at n=64/128, under 2x for n >= 256) but is fast enough for the coil_tools Fourier path, where the Biot-Savart evaluation dominates.

Verification

Full suite (FC=gfortran CC=cc make test, 2026-06-10):

83% tests passed, 14 tests failed out of 82
Total Test time (real) =  35.89 sec

All 14 failures are pre-existing environment failures, identical on main on this machine (optional map2disc/chartmap Python module not installed, plus tilted_coil_field_validation missing the libneo Python module): setup_chartmap_volume, setup_vmec_chartmap_python, validate_vmec_rho_lcfs_auto, validate_vmec_map2disc_chartmap_boundary_{python,cli,python_padded}, plot_vmec_map2disc_chartmap_boundary_{python,cli,python_padded}, test_chartmap_coordinates, test_python_chartmap_generator, test_vector_conversion, test_refcoords_file_detection, tilted_coil_field_validation.

New math and swap-affected tests:

20/82 Test #20: test_bessel_i ............................   Passed    0.24 sec
21/82 Test #21: test_bessel_k ............................   Passed    0.24 sec
22/82 Test #22: test_dawson ..............................   Passed    0.24 sec
23/82 Test #23: test_gauss_kronrod .......................   Passed    0.24 sec
24/82 Test #24: test_gauss_quadrature ....................   Passed    0.24 sec
25/82 Test #25: test_fft .................................   Passed    0.02 sec
26/82 Test #26: test_fft_oracle ..........................   Passed    2.01 sec
28/82 Test #28: test_bessel_i_oracle .....................   Passed    0.24 sec
29/82 Test #29: test_bessel_k_oracle .....................   Passed    0.53 sec
30/82 Test #30: test_dawson_oracle .......................   Passed    1.85 sec
31/82 Test #31: test_gauss_kronrod_oracle ................   Passed    0.39 sec
32/82 Test #32: test_gauss_quadrature_oracle .............   Passed    0.26 sec
33/82 Test #33: test_transport ...........................   Passed    0.01 sec
72/82 Test #72: test_coil_tools_biot_savart ..............   Passed    0.23 sec
74/82 Test #74: tilted_coil_fourier_modes ................   Passed    1.68 sec

Oracle agreement (max relative error vs GSL/FFTW, plus independent mpmath references):

Family vs GSL / FFTW tol vs mpmath (40+ digits)
bessel_i 7.1e-14 scalar, 7.2e-16 array 5e-13 4.9e-15 (128-pt table), 5.0e-15 (5100-pt sweep)
bessel_k 1.2e-13 (4649 pts, n to 200) 1e-13 + GSL err est 1.5e-14 (169 cases)
dawson 1.9e-15 (60001 pts) 1e-13 9.4e-17 table, 6.7e-16 dense sweep
gauss_kronrod 7.9e-14 (24-case battery) epsrel 1e-10 within 1e-10 of 40-digit refs; abserr a true bound on every case
gauss_quadrature 4.2e-14 (table n), 9.1e-11 (GSL Newton n, GSL-attributed) 1e-13 / 1e-9 nodes 1e-14 abs, weights 1e-13 rel down to w ~ 2e-101
fft 8.9e-17 scaled, 23 sizes incl. primes 97/1009/4093 1e-13 1.0e-17 on embedded constants

test_transport passes with its pre-existing hardcoded expected values unchanged, proving numerical equivalence of the Laguerre swap. Likewise tilted_coil_fourier_modes and test_coil_tools_biot_savart pass unchanged after the FFT swap.

No-FFTW build proof (-DCMAKE_DISABLE_FIND_PACKAGE_FFTW=TRUE, scratch tree):

-- FFTW not found, skipping FFT oracle test
-- Configuring done (4.9s)
-- Generating done (0.1s)
[177/178] Linking Fortran static library src/math/libneo_math.a
[178/178] Linking Fortran static library src/magfie/libmagfie.a
NOFFTW_MAGFIE_OK

No FFTW symbol remains in src/ outside the convention note in src/math/fft.f90:

$ rg -n -i fftw src/ | rg -v 'src/math/fft.f90'
(no output)

Closes #292

The MIT label in LICENSE covers only original code. Add THIRD_PARTY.md
listing what else ships and under which terms: vendored LGPL
gen_laguerre_rule.f90 (replacement tracked in #288), Apache-2.0
ninja_syntax.py, BSD-3 findFFTW cmake module, the GPL FFTW link
dependency (removal tracked in #287), permissive HDF5/NetCDF/BLAS/
LAPACK, GCC runtimes under the Runtime Library Exception, the optional
GSL test oracle (never conveyed, so no GPL obligation arises), and the
separately installed Python dependencies. Point to it from the README.
krystophny added a commit to itpplasma/KAMEL that referenced this pull request Jun 10, 2026
Replace all GSL usage with in-tree numerics and the libneo math kit.

KIM (Fortran):
- erf: Fortran 2008 intrinsic instead of gsl_sf_erf binding
- modified Bessel I_n: neo_bessel_i from libneo (LIBNEO::math)
- delete gsl_mod.f90, unlink GSL from KIM_lib

KiLCA / QL-Balance (C++), new KiLCA/math/numeric modules:
- adaptive_quad: Gauss-Kronrod QAG with QUADPACK semantics
  (public-domain Netlib constants and error tests)
- quadpack_qagi: QAGI/QAGIU as a faithful translation of the
  public-domain QUADPACK dqagie/dqk15i/dqelg/dqpsrt, including the
  epsilon-algorithm extrapolation; matches gsl_integration_qagi to
  3e-16 over 216 resonant vel_integral parameter sets
- ode_rk8pd: Prince-Dormand RK8(7)13M stepper; step control and
  evolve semantics follow the gsl_odeiv documented behavior
  (truncated final-step suggestion, no-progress acceptance);
  single steps match GSL bitwise in 98 percent of probes
- brent_root: Brent-Dekker bracketing solver
- deriv_central: adaptive 5-point central difference
- sort_index_doubles (std::sort) replaces gsl_heapsort_index
- 2D Newton iteration replaces gsl_multiroot hybridsj in the
  eigenmode determinant zero search
- delete dead GSL-based code paths (Levin acceleration variant,
  circle-integral quadrature, unused analytic Jacobians)

Build system: remove cmake/FetchGSL.cmake and all GSL link and
include references; KIM_lib now links LIBNEO::math.

Requires itpplasma/libneo math kit (itpplasma/libneo#290).
gfortran emits executable-stack trampolines for internal procedures
passed as actual arguments, rejected by -Werror=trampolines. Move
the callbacks and their shared neval counter into a module.

Fixes #292
On macOS with Homebrew LLVM, CMake finds flang before gfortran.
Flang does not expose OpenMP to FindOpenMP, breaking the build.
Pin gfortran since libneo does not support flang.
@krystophny

Copy link
Copy Markdown
Member Author

Superseded by an atomic fortnum-based stack that drops the same dependencies without adding an in-tree math kit. fortnum already provides bessel_i/k, dawson, fft, gauss_kronrod, gauss_legendre, and gauss_gen_laguerre, so the avoidable numerical deps are removed by depending on fortnum (FetchContent, tag a7faa3c) rather than re-homing the math in libneo.

Replacement PRs (a stack off main):

Both tips build green and the FFT (Biot-Savart Fourier) and transport (D11/D12) tests pass. The collision_freqs incomplete-gamma move to fortnum is handled separately by #316. Leaving this open for a human to close.

@krystophny

Copy link
Copy Markdown
Member Author

Superseded by the fortnum-based PRs: #316 (collision_freqs gamma), #317 (coil_tools FFT), #318 (transport gen-Laguerre). The clean-room math kit is replaced by depending on lazy-fortran/fortnum (which now provides bessel I/K, dawson, FFT, Gauss-Kronrod, Gauss-Legendre, and generalized Gauss-Laguerre). All three replacement PRs build green. Closing in favor of them.

@krystophny krystophny closed this Jun 14, 2026
krystophny added a commit that referenced this pull request Jun 15, 2026
## Merge order

Merge sequence: #316 -> #317 -> #318.

Each PR is based on `main` individually, so each diff is cumulative
against `main` and the three share code. Merge them in order. Once a
predecessor merges, the next PR's diff shrinks to its own increment.

## Scope

src/magfie/coil_tools.f90 used FFTW3 for the one-dimensional
real-to-complex transforms in `biot_savart_fourier` and
`vector_potential_biot_savart_fourier`. This re-homes the field of the
math-kit PR #290 / FFTW-drop PR #287 by routing those transforms through
fortnum instead of an in-tree math kit. fortnum `fft_r2c` shares FFTW's
unnormalized forward convention and `n/2+1` complex packing, so the
transform output is identical and both public subroutine interfaces are
unchanged. The FFTW `find_package` / `ExternalProject` paths and the
bundled `cmake/Modules/findFFTW` module are removed; fortnum is fetched
via `FetchContent` at tag `a7faa3c`, guarded by `if(NOT TARGET
fortnum)`.

## Dependency removed

FFTW3 (`src/fftw3.F90` binding wrapper, `FFTW::Double` /
`FFTW::DoubleThreads` link, FFTW detection and from-source
ExternalProject).

## Verification

Configure (no FFTW detection runs; fortnum fetched at a7faa3c):

```
-- === Detecting External Dependencies ===
-- Found system HDF5: 2.1.1
--   -> ABI compatible, using system HDF5
--   -> ABI compatible, using system NetCDF
-- === Dependency Detection Complete ===
-- Configuring done (9.8s)
-- Build files have been written to: /tmp/libneo-l1
```

Build: `[570/570] Linking Fortran shared module
_efit_to_boozer.cpython-314-x86_64-linux-gnu.so`

No FFTW symbols remain in the magfie archive:

```
$ nm /tmp/libneo-l1/src/magfie/libmagfie.a | grep -i fftw
$ echo $?   # (no matches)
```

FFT-path tests (Biot-Savart Fourier modes and field validation) pass:

```
1/6 Test #50: test_biotsavart ..................   Passed    0.45 sec
2/6 Test #52: test_biotsavart_field ............   Passed    0.07 sec
3/6 Test #59: test_coil_tools_biot_savart ......   Passed   29.23 sec
4/6 Test #60: tilted_coil_generate_geometry ....   Passed    0.13 sec
5/6 Test #61: tilted_coil_fourier_modes ........   Passed  199.77 sec
6/6 Test #62: tilted_coil_field_validation .....   Passed   12.36 sec

100% tests passed, 0 tests failed out of 6
```

Note: fortnum pin updated to current main (92de6e9) after a fortnum
history rewrite; old shas no longer resolve.
krystophny added a commit that referenced this pull request Jun 15, 2026
…le (#318)

## Merge order

Merge sequence: #316 -> #317 -> #318.

Each PR is based on `main` individually, so each diff is cumulative
against `main` and the three share code. Merge them in order. Once a
predecessor merges, the next PR's diff shrinks to its own increment.

## Scope

src/transport/gen_laguerre_rule.f90 was an LGPL generalized
Gauss-Laguerre quadrature generator (`cgqf` / `r8_huge` / `rule_write`).
This re-homes the field of the math-kit PR #290 /
LGPL-quadrature-removal PR #288 by routing the rule through fortnum
instead of an in-tree math kit. `init_gauss_laguerre_integration` called
the LGPL code only to build the order-32 rule for the weight `x^alpha
exp(-x)` on `[0, inf)` with `alpha = 5/2` (D11) and `7/2` (D12); the
`a=0, b=1` scaling is the identity. fortnum `gauss_gen_laguerre(n,
alpha, x, w)` returns exactly that rule (Golub-Welsch), so the public
interfaces of `init_gauss_laguerre_integration` and `calc_D_one_over_nu`
are unchanged and the D11/D12 reference values reproduce within
tolerance.

Stacked on #317 (FFTW drop); review and merge that first.

## Dependency removed

The LGPL `src/transport/gen_laguerre_rule.f90` quadrature generator (and
the dead `calculate_gauss_laguerre_rule` / `write_to_file` / `debugging`
state that only fed it).

## Verification

Build (stack tip, both aspects applied):

```
[568/568] ...
ninja: no work to do.
```

No LGPL quadrature symbols remain in the transport archive (only the
fortnum reference):

```
$ nm /tmp/libneo-l2/src/transport/libtransport.a | grep -iE 'cgqf|r8_huge|rule_write|gen_laguerre'
                 U __fortnum_quadrature_MOD_gauss_gen_laguerre
```

Transport test (D11e/D11i/D12e/D12i within 5% of references) passes:

```
    Start 20: test_transport
1/1 Test #20: test_transport ...................   Passed    0.11 sec

100% tests passed, 0 tests failed out of 1
```

Note: fortnum pin updated to current main (92de6e9) after a fortnum
history rewrite; old shas no longer resolve.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant