Skip to content

Conversation

@Lomholy
Copy link
Collaborator

@Lomholy Lomholy commented Aug 14, 2025

Allows Phonon_simple to use more than 1 solution, by removing an if statement, and extending the vf_list to be as long as the new input parameters, that allow for finetuning scan steps below and above the elastic line.

Also change the parms array to be a struct, containing names of the parameters.

Lomholy and others added 4 commits July 10, 2025 11:25
Merge newest commit to phonon branch
Make sure that vf_list has length of 100 and allow the user to choose the amount of scan steps actually taken, if they need to optimise the instrument. This counteracts overflow of vf_list.

Also change from parms array to parms struct in order to improve readability.
@willend
Copy link
Contributor

willend commented Aug 14, 2025

@Lomholy thanks, will have a closer look later - but one non-McStas-3 (and GPU friendly) feature you added is the structure that contains mixed component and neutron data...

They must be separated to avoid thread collisions on GPU.

Functions applied in TRACE should really instead make use of _class_particle *_particle in the footprint. This carries all neutron state parameter and allows seamless calls to e.g. the rand01() and other RNG-oriented macros.

@willend
Copy link
Contributor

willend commented Aug 14, 2025

Just to quickly illustrate the collision issue, I get megabytes of these "FATAL ERROR" messages running on GPU - and the results are completely off ;-)

FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
FATAL ERROR: Did not hit cylinder from inside.
*** TRACE end *** 

Save [Samples_Phonon]
Detector: mon1_I=1.69899e-25 mon1_ERR=7.12726e-29 mon1_N=7.10252e+06 "e.dat"
<E> : 13.6705 meV , E-width : 4.13346 meV 

Finally [Samples_Phonon: synts3]. Time: 4 [s] 
INFO: Placing instr file copy Samples_Phonon.instr in dataset synts3
INFO: Placing generated c-code copy Samples_Phonon.c in dataset synts3
[pkwi@hypatia rerun]$ grep Example Samples_Phonon.instr 
* %Example: E=10 -n 1e5 Detector: mon1_I=2.86265e-25

@willend
Copy link
Contributor

willend commented Aug 18, 2025

Why not simply use the normal particle struct?

@Lomholy
Copy link
Collaborator Author

Lomholy commented Aug 18, 2025

I reckoned that particle struct didn't have the intermediate values that will be changed during the calculations

@Lomholy
Copy link
Collaborator Author

Lomholy commented Aug 18, 2025

Would you make a struct that wraps the particle struct inside it? I don't think I am quite following you there

@willend
Copy link
Contributor

willend commented Aug 18, 2025

  1. No neutron data can go onto something defined in DECLARE. DECLARE stuff / parameters are GLOBAL
  2. The incoming particle is known as _particle, inside TRACE do something along the lines of
    _class_particle particle_incoming = _particle;
    to preserve the incoming state and work on _particle directly.

@Lomholy
Copy link
Collaborator Author

Lomholy commented Aug 18, 2025

When I wrote the intermediate values i meant the unit vector of the incoming neutron, and the candidate final velocities for the algorithm etc.

@willend
Copy link
Contributor

willend commented Aug 20, 2025

Once I patch your version of the code in the direction indicated above, the code compiles on GPU but crashes... (Still very likely a thread-collision issue...)

I will likely start from fresh on a new branch and implement your changes one by one / add safety belts on the way... Stay tuned.

…allow correct GPU execution.

Secondary changes:
1. calloc not available on gpu (in TRACE), use malloc
2. Don't use e_steps_low*e_steps_high but e_steps_low+e_steps_high for array allocation
3. Cosmetics to comp header
@willend
Copy link
Contributor

willend commented Aug 20, 2025

@Lomholy I finally got the code to agree (to within machine precision) between GPU and cpu. I in essence overlooked the changes in findroots that did not make it across to findroots_gpu.

Screenshot 2025-08-20 at 14 25 20

I am in principle ready to merge - but please confirm that my observation wrt. required length of vf_list is correct: I.e. that the the number of solutions is e_steps_low+e_steps_high and not e_steps_low*e_steps_high... :-)

@Lomholy
Copy link
Collaborator Author

Lomholy commented Aug 20, 2025 via email

@willend
Copy link
Contributor

willend commented Aug 20, 2025

OK @Lomholy then I will proceed to merge. I have on my list to look at the Bent mono also in respect to the GPU side.

@willend willend merged commit 9e43d88 into main Aug 20, 2025
44 checks passed
@Lomholy Lomholy deleted the phonon_tolerate_more_than_2_solutions branch August 26, 2025 08:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants