Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection is Not Necessary for this Pull Request. |
|
Status Flag 'Pull Request AutoTester' - Failure: Timed out waiting for job SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements to start: Total Wait = 303
|
|
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements_Make-Dist
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MT-2
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MR-2
Test Name: SST__AutotestGen2_NewFW_OSX-15-XC15-ARM2_OMPI-4.1.6_PY3.10_sst-elements
Using Repos:
Pull Request Author: nab880 |
|
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED
Build Information
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements_Make-Dist
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MT-2
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.9_sst-elements_MR-2
Test Name: SST__AutotestGen2_NewFW_OSX-15-XC15-ARM2_OMPI-4.1.6_PY3.10_sst-elements
|
|
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
|
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
This PR adds the Carcosa interface layer and supporting components so Hali can sit between CPUs/sensors and the memory hierarchy, coordinate fault injection via a shared registry, and support both sensor/ring workloads and Vanadis MMIO coordination (e.g. ping-pong). It also adds the MemHierarchy PortModule faultInjectorMemH, the PMDataRegistry for manager–injector communication, and tests that verify Hali, the manager logic, and fault injection behavior.
Hali – Main interface component.
Sits in the memHierarchy data path (highlink = CPU side, lowlink = cache/memory side) and forwards MemEvents between the two links.
Uses a FaultInjManager to attach data that PortModules can read to modify behavior.
A working example is the added fault injector faultInjectorMemH. Ideally this would be abstracted out into the general "FaultInjector API", but I leave that to future work.
Working Features:
MMIO mode when control_addr_base/control_addr_size are set: intercepts loads/stores to a control region for Vanadis coordination (command/status registers), with optional ring-based “done” sync (see the ping-pong example in the Carcosa tests).
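To make the MMIO interception rule concrete, here is a minimal Python sketch of the address check. It is not the Hali implementation: `ControlRegion` and `route` are hypothetical names, the base and size values are illustrative, and treating a size of 0 as "MMIO mode disabled" is an assumption on my part.

```python
from dataclasses import dataclass


@dataclass
class ControlRegion:
    """Hypothetical model of the region described by
    control_addr_base / control_addr_size."""
    base: int
    size: int

    def contains(self, addr: int) -> bool:
        # Assumption: size == 0 means MMIO mode is not enabled.
        return self.size > 0 and self.base <= addr < self.base + self.size


def route(region: ControlRegion, addr: int) -> str:
    """Return 'intercept' for accesses inside the control region and
    'forward' for everything else (normal highlink -> lowlink traffic)."""
    return "intercept" if region.contains(addr) else "forward"


# Illustrative values only; real values come from the test configuration.
region = ControlRegion(base=0x8000_0000, size=0x100)
```

Under these assumptions, an access to `0x80000000` would be intercepted, while an access to `0x80000100` (one byte past the region) would be forwarded unchanged.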
FaultInjManagerAPI / FaultInjManager – Subcomponent used by Hali. Queues highlink/lowlink PortModule requests and uses the event ID to coordinate behavior through the PMDataRegistry.
PMDataRegistry – Shared state between manager and injectors.
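The manager/injector handshake through the registry can be sketched as follows. This is a simplified Python model under stated assumptions, not the PMDataRegistry code: the `Registry` class, its `attach`/`take` methods, the request fields `byte` and `bit`, and the single-bit-flip fault are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict, Optional


class Registry:
    """Hypothetical stand-in for the shared registry: event ID -> request."""

    def __init__(self) -> None:
        self._data: Dict[int, dict] = {}

    def attach(self, event_id: int, request: dict) -> None:
        # Manager side: record a request for one specific event.
        self._data[event_id] = request

    def take(self, event_id: int) -> Optional[dict]:
        # Injector side: consume the request so it applies only once.
        return self._data.pop(event_id, None)


def inject(registry: Registry, event_id: int, data: bytes) -> bytes:
    """Flip one bit of the payload if a request exists for this event ID;
    otherwise return the payload untouched."""
    request = registry.take(event_id)
    if request is None:
        return data
    out = bytearray(data)
    out[request["byte"]] ^= 1 << request["bit"]
    return bytes(out)
```

Keying on the event ID means the injector needs no direct reference to the manager: it only looks up the event it is currently handling, and events without a registered request pass through unmodified.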
CarcosaMemCtrl – MemHierarchy-style memory controller with optional iflLinks_N. Used for backing manipulation.
hyades.h – Single-header API for Vanadis to talk to Hali’s MMIO control region.
Applications that use hyades should have a jump table. When the application accesses the MMIO control region, the Hali interface intercepts the memory access and returns a jump table index. See pingpong.c in carcosa/tests for an example. This will likely need to be expanded for more complex use cases.
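The jump table pattern described above looks roughly like this. The sketch is in Python for brevity and does not mirror hyades.h or pingpong.c: the handler names, the table contents, and the `run` loop that replays a fixed list of indices (standing in for values read from the control region) are all hypothetical.

```python
from typing import Callable, List


def do_ping(log: List[str]) -> None:
    log.append("ping")


def do_pong(log: List[str]) -> None:
    log.append("pong")


def do_exit(log: List[str]) -> None:
    log.append("exit")


# Hypothetical table: position i is the handler for index i.
JUMP_TABLE: List[Callable[[List[str]], None]] = [do_ping, do_pong, do_exit]


def dispatch(index: int, log: List[str]) -> None:
    """Call the handler selected by an index returned from the control region."""
    if not 0 <= index < len(JUMP_TABLE):
        raise IndexError(f"jump table index out of range: {index}")
    JUMP_TABLE[index](log)


def run(indices: List[int], log: List[str]) -> List[str]:
    # `indices` stands in for successive reads of the control region.
    for index in indices:
        dispatch(index, log)
    return log
```

Bounds-checking the index before the call is a design choice worth keeping in the C version too, since the index originates outside the application.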
testCarcosaPingPong.py – Two Vanadis cores, one VanadisNodeOS, two processes (same binary, roles 0 and 1). Puts Hali in each core’s data path (CPU → Hali → dTLB → L1D), MMIO params