Skip to content

Conversation

@unalmis
Copy link
Collaborator

@unalmis unalmis commented Sep 2, 2025

A computationally inferior approach for the alternative style free surface optimization in dp/laplace is implemented in this PR to appease the optimizer. The fact that this approach works, while the original in dp/laplace does not, along with other support from debugging that is not detailed here, implies that the discretization error of the quadrature causes the optimizer to fail in dp/laplace.

The only thing that is changed is that |B_out| is computed from equation 4.26 of the shared paper instead of the equation sandwiched between 4.23 and 4.24. That is, on dp/laplace once $\Phi$ is known, |B_out| is computed directly via Fourier differentiation of the $C^{\infty}$ potential. On this branch, |B_out| is instead backed out from another singular integral transform of $\Phi$ because using that integral transform suppresses the discretization error of the original. This approach currently seems necessary because otherwise the discretization error for the high order polar quadrature is larger than the change in the potential that arises from the optimizer's perturbation of the boundary. I don't expect this to have ever been an issue to debug for forward-style free surface optimization because there the discretization error of $B \cdot n$ is a subset of the objective which is minimized by the optimizer.

This is a patch until a quadrature whose error is less strongly dependent on the boundary surface and integrand is added which should help resolve the issue. I also believe #1208 will be useful to resolve.

Vacuum free surface solutions can be obtained with this patch that could not be obtained on dp/laplace.
vacplot

code for above optimization

@unalmis unalmis self-assigned this Sep 3, 2025
@unalmis unalmis changed the title Patch to work around discretization error affecting optimizer Patch to work around discretization error impeding optimizer Sep 3, 2025
@unalmis
Copy link
Collaborator Author

unalmis commented Sep 3, 2025

@dpanici can you share what you have tried for finite beta

@dpanici
Copy link
Collaborator

dpanici commented Sep 3, 2025

run_laplace_finite_beta_maxiter.nbconvert.ipynb
this notebook shows that even starting from the free-boundary solution from master, using this patch and the neumaan method results in the boundary moving far from the nominal solution

I've not done a lot of tweaking of code yet to try to debug why this is happening but I've confirmed this behavior also on current-constrained tokamak equilibria with coils (as opposed to this notebook which is iota constrained and external field from mgrid)

@dpanici
Copy link
Collaborator

dpanici commented Sep 3, 2025

TODO

  • vacuum free bdry only works for SourceFreeField right now (bugs in obj logic with FreeSurfaceOuterField)
  • only works for same eval and source grid (bugs in logic when the two grids are different)

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 3, 2025

TODO

* [ ]  vacuum free bdry only works for SourceFreeField right now (bugs in obj logic with FreeSurfaceOuterField)

* [ ]  only works for same eval and source grid (bugs in logic when the two grids are different)

this is expected because the patch only supports same grid. The hope is that the discretization error affecting the optimizer can be resolved and we can use dp/laplace without this branch.

@dpanici
Copy link
Collaborator

dpanici commented Sep 3, 2025

  • check obj value btwn the two PRs
  • check obj gradient btwn the two PRs (maybe test with only one bdry mode free)
  • same as above also btwn finite beta and vacuum

@dpanici
Copy link
Collaborator

dpanici commented Sep 4, 2025

Objective value (blue is this PR, orange is the dp/laplace PR) for the correct and stalled (as in, the solution we get from solving with #1360 ) free bdry solution, using the same grids and SourceFreeField resolution used to solve each (which was relatively coarse at M_Phi=N_Phi=4 and grid M=N=12, no symmetry). I also note that using the patch objective to optimize for free bdry starting from the stalled freebdry solution from the dp/laplace PR, it converges fine to the correct free bdry solution (with same resolutions as was used in the stalled problem).

Both have similar but not exactly the same values (perhaps showing that the discretization errors do make a difference in the objective). Both have much lower values at the correct solution than at the stalled solution.

image image

Looking at the gradient (obj.grad) for each, we see relatively similar gradients (vert lines dictate when indices correspond to R_lmn to Z_lmn to L_lmn etc.)

image

@dpanici
Copy link
Collaborator

dpanici commented Sep 4, 2025

the gradients if I use double the eval and source grid resolution (also maybe to be clearer, I set maxiter=0 in each case shown here to avoid any conflation with the iterative fixed point method)

The higher grid resolution seems to make things agree closer between the two, especially for the stalled where this actually matters (the correctly solved eq I am not surprised gradient increased, it is now looking at eval points it did not see in the actual optimization)

However running an optimization with higher grid resolutions with the obj from #1360 does not actually fix the stalling issue

image

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 4, 2025

(also maybe to be clearer, I set maxiter=0 in each case shown here to avoid any conflation with the iterative fixed point method)

Both the original and patched version use iterative method to compute Phi, so that should not be necessary.

@dpanici
Copy link
Collaborator

dpanici commented Sep 4, 2025

notebook I am using:
run_laplace_WORKED_VAC.ipynb

branch that has both objectives I've been using: https://github.com/PlasmaControl/DESC/tree/dp/debug-laplace

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 4, 2025

What I gather from these plots is more confirmation of the issue described in the PR header. I still think switching the quadrature to the method in the overleaf will help solve this and will be more performant.

However it may be worth checking again whether I is still identically zero at all the points of these plots? I wonder if the issue for the finite beta case is that somehow by passing I as a float into the compute function and multiplying by the numpy array theta to compute I * data["theta"] that JAX auto diff looses track of an important gradient. If so we should make a PR that casts the theta and zeta compute quantities on master to jax arrays to avoid such issues.

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 4, 2025

I don't expect this to have ever been an issue to debug for forward-style free surface optimization because there the discretization error of $B \cdot n$ is a subset of the objective which is minimized by the optimizer.

could add B dot n as a residual to prove or disprove.

@dpanici
Copy link
Collaborator

dpanici commented Sep 5, 2025

What I gather from these plots is more confirmation of the issue described in the PR header. I still think switching the quadrature to the method in the overleaf will help solve this and will be more performant.

However it may be worth checking again whether I is still identically zero at all the points of these plots? I wonder if the issue for the finite beta case is that somehow by passing I as a float into the compute function and multiplying by the numpy array theta to compute I * data["theta"] that JAX auto diff looses track of an important gradient. If so we should make a PR that casts the theta and zeta compute quantities on master to jax arrays to avoid such issues.

If that was the case I would think that it would also be problematic for optimizations with FourierCurrentPotentialField objects, but I have done that just fine in the past. I is zero for the equilibria plotted here (or at least very close to zero since it is computed as the integral of B_theta)

@dpanici
Copy link
Collaborator

dpanici commented Sep 5, 2025

I don't expect this to have ever been an issue to debug for forward-style free surface optimization because there the discretization error of
B

n
is a subset of the objective which is minimized by the optimizer.

could add B dot n as a residual to prove or disprove.

which B dot n? n dot (B_coil + grad(phi_periodic))?

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 5, 2025

Yes with B_out dot n as another residual alongside pressure balance

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 5, 2025

If that was the case I would think that it would also be problematic for optimizations with FourierCurrentPotentialField objects, but I have done that just fine in the past.

i think it might matter though because the I in fourier current potential is explicitly marked as an optimizable parameter. our I here is not since it's a derived quantity. i can try after move in

@dpanici
Copy link
Collaborator

dpanici commented Sep 5, 2025

Yes with B_out dot n as another residual alongside pressure balance

Isn't this basically like what the patch is doing? Instead of using Phi we use grad(Phi) and a singular integral to evaluate grad(Phi) on the boundary?

@dpanici
Copy link
Collaborator

dpanici commented Sep 5, 2025

Perhaps additional evidence that I is the likely culprit for the finite beta not working well: this is a zero net tor current (so I~0), finite beta ellipse which exhibits strong shafranov shift despite the modest beta of 0.25%. The patch objective works fine for this (once a quadgrid is used for the forcebalance subobj at least, which I think may be a nuance of this eq and IC as even the usual free bdry obj needs a quadgrid for force balance to be used to converge correctly as well)

image

Shown against some of the vacuum flux surfaces to show that this actually was a finite beta free bdry solve that was non-trivial:

image

This was ran on a stellar GPU on dp/debug-laplace but I think I dont use anything that is not on this patch PR

laplace_finite_beta_zero_current.zip

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 5, 2025

Yes with B_out dot n as another residual alongside pressure balance

Isn't this basically like what the patch is doing? Instead of using Phi we use grad(Phi) and a singular integral to evaluate grad(Phi) on the boundary?

No, both methods use n x grad phi to compute B_out. On dp/laplace the potential cannot contribute a normal component to B_out regardless of discretization error. The patched method computes grad phi via an integral expression which will always contribute some nonzero normal component to B_out due to discretization error.

This is another reason i prefer the mathematics of dp/laplace but apparently there's this bad synergy discussed in the PR header with the quadrature error and the optimizer for this problem.

@dpanici
Copy link
Collaborator

dpanici commented Sep 9, 2025

Perhaps additional evidence that I is the likely culprit for the finite beta not working well: this is a zero net tor current (so I~0), finite beta ellipse which exhibits strong shafranov shift despite the modest beta of 0.25%. The patch objective works fine for this (once a quadgrid is used for the forcebalance subobj at least, which I think may be a nuance of this eq and IC as even the usual free bdry obj needs a quadgrid for force balance to be used to converge correctly as well)

image Shown against some of the vacuum flux surfaces to show that this actually was a finite beta free bdry solve that was non-trivial: image This was ran on a stellar GPU on `dp/debug-laplace` but I think I dont use anything that is not on this patch PR

laplace_finite_beta_zero_current.zip

Note that this also seems to work (or at least gets very close) with the dp/laplace implementation (i.e. not using an additional biot-savart integral to compute |B_out|^2)

image

@dpanici
Copy link
Collaborator

dpanici commented Sep 9, 2025

Perhaps additional evidence that I is the likely culprit for the finite beta not working well: this is a zero net tor current (so I~0), finite beta ellipse which exhibits strong shafranov shift despite the modest beta of 0.25%. The patch objective works fine for this (once a quadgrid is used for the forcebalance subobj at least, which I think may be a nuance of this eq and IC as even the usual free bdry obj needs a quadgrid for force balance to be used to converge correctly as well)
image
Shown against some of the vacuum flux surfaces to show that this actually was a finite beta free bdry solve that was non-trivial:
image
This was ran on a stellar GPU on dp/debug-laplace but I think I dont use anything that is not on this patch PR
laplace_finite_beta_zero_current.zip

Note that this also seems to work (or at least gets very close) with the dp/laplace implementation (i.e. not using an additional biot-savart integral to compute |B_out|^2)

image

some weird behavior tho, using M=N=20 for both grids seems to give better results than using M=N=30 or 40, for the same Phi resolution of M=N=10. I would naively expect that the solution should stay similar if the grid resolution increases but the basis resolution stayed the same. I will think about this more

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 13, 2025

I suggest checking the automatic differentiation derivatives of the objective against finite differences with https://docs.jax.dev/en/latest/jax.test_util.html.

@dpanici
Copy link
Collaborator

dpanici commented Sep 15, 2025

I suggest checking the automatic differentiation derivatives of the objective against finite differences with https://docs.jax.dev/en/latest/jax.test_util.html.

Running on #1360 (not this PR, so checking the desired implementation which we have issues with for the optimizer), jax.test_util.check_grads seems to pass for small enough eps step size. I also did a taylor remainder test with the following code and it seems the difference btwn the gradient and the finite difference approx scales as expected (O(h^2)), this is using the finite beta solovev that the objective (both here and on original PR) seems to not be correct at all for, whether optimizing from an initial state or starting at the solution. This makes me think it may not be a JAX deriv issue, at least not an obvious one.

image
from desc.objectives import FreeSurfaceError, ObjectiveFunction
from desc.magnetic_fields import SourceFreeField, SplineMagneticField
from desc.io import load
from desc.profiles import *
from desc.geometry import *
from desc.equilibrium import Equilibrium
from desc.grid import LinearGrid
import numpy as np
from desc.utils import dot

def taylor_remainder_test(objective, h0, dx,eq,nsteps=3):
    f0=objective.compute_scalar(objective.x(eq))
    g0=objective.grad(objective.x(eq))
    x0 = objective.x(eq)
    fs=[]
    gs=[]
    hs=[]
    remainders=[]
    h=h0
    for i in range(nsteps):
        x=x0 + h * dx
        new_f = objective.compute_scalar(x)
        remainders.append(new_f - f0 - h * dot(g0, dx))
        hs.append(h)
        h = h/2 
    return remainders, hs


# need to specify currents in the coil circuits when using mgrid, just like in VMEC
extcur = [
    3.884526409876309e06,
    -2.935577123737952e05,
    -1.734851853677043e04,
    6.002137016973160e04,
    6.002540940490887e04,
    -1.734993103183817e04,
    -2.935531536308510e05,
    -3.560639108717275e05,
    -6.588434719283084e04,
    -1.154387774712987e04,
    -1.153546510755219e04,
    -6.588300858364606e04,
    -3.560589388468855e05,
]
ext_field = SplineMagneticField.from_mgrid(
    r"/home/dpanici/DESC/tests/inputs/mgrid_solovev.nc", extcur=extcur
)
pres = PowerSeriesProfile([1.25e-1, 0, -1.25e-1])
iota = PowerSeriesProfile([-4.9e-1, 0, 3.0e-1])
surf = FourierRZToroidalSurface(
    R_lmn=[4.0, 1.0],
    modes_R=[[0, 0], [1, 0]],
    Z_lmn=[-1.0],
    modes_Z=[[-1, 0]],
    NFP=1,
)

eq_init = Equilibrium(M=10, N=0, Psi=1.0, surface=surf, pressure=pres, iota=iota)
eq_init.solve(verbose=0);

free_out = SourceFreeField(eq_init.surface, M=20, N=0, sym=False, B0=ext_field)
eval_grid = LinearGrid(M=20, N=0, NFP=64)
grid = LinearGrid(M=40, N=0, NFP=64)
objective = ObjectiveFunction(FreeSurfaceError(eq=eq_init, field=free_out,
                                               eval_grid=eval_grid, grid=grid, 
                                               maxiter=30,chunk_size=25),
                              jac_chunk_size=25)
objective.build()

h0 = 1e-3
dx = np.random.rand(objective.x(eq_init).size)
rms, hs = taylor_remainder_test(objective, h0, dx,eq_init, nsteps=8)


import matplotlib.pyplot as plt
plt.figure()
plt.scatter(hs, rms)
plt.xscale("log")
plt.yscale("log")
plt.plot(hs, 1e6*np.array(hs)**2,"--",label="h**2")
plt.legend()

@dpanici
Copy link
Collaborator

dpanici commented Sep 15, 2025

Things to check

@unalmis
Copy link
Collaborator Author

unalmis commented Sep 25, 2025

I'll try some where net toroidal current profile is constrained rather than the rotational transform.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants