Network Storage (NetworkIO) Implementation for SST Firefly NIC #2643
RishankPratikHPLabs wants to merge 2 commits into
SSD module integration with Firefly NIC for realistic storage I/O modeling.
End-to-end I/O path simulation: compute nodes -> network -> SSD storage nodes.
Distributed address mapping with round-robin striping across storage devices.
Direct integration with Ember motif framework for HPC workload simulation.

New components:
- NetworkIO API layer in Hermes interface
- HadesNetworkIO with distributed address mapping
- NIC-level packet handling for NetworkIO (read, write, ACK)
- SimpleSSD storage device model (configurable bandwidth, latency, multi-queue bus)
- Ember NetworkIO motif and test configurations

Co-authored-by: Rishank Pratik <rishank.pratik@hpe.com>
Co-authored-by: Pawan Kumar <pawan.kumar4@hpe.com>
Co-authored-by: Sumant Kalra <sumant.kalra@hpe.com>
Co-authored-by: Shridhar Joshi <shridhar@hpe.com>
Status Flag 'Pre-Test Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
Add SPDX-FileCopyrightText and SPDX-License-Identifier (BSD-3-Clause) headers to all new and modified files per HPE OSRB attribution requirements. Update CONTRIBUTORS.TXT with HPE entry. Fix corrupted copyright line in nicEvents.h.
630bdc2 to f85e717
feldergast left a comment
Code generally looks good. There are a few comments about code clean-up. There will also need to be at least one test, preferably more, for the new code. The tests should use the new merlin.base-derived Python modules from the merlin library and should not use emberload.py, as it uses the old deprecated merlin Python libraries.
        m_networkIOLib = static_cast<EmberNetworkIOLib*>(getLib("networkIO"));
        assert(m_shmemLib);
        assert(m_networkIOLib);
    }
    \ No newline at end of file
Add a newline at the end of the file.

    [JOB_ID] 2
    [NID_LIST] generateNidList=generateNidListRange({SSD_START_NODE},{SSD_NODES})
    [MOTIF] Null
    \ No newline at end of file
Add a newline at the end of the file.
        params.find<uint32_t>("verboseMask",-1),
        Output::STDOUT );

    auto parse = [](const std::string& s)
Is this used somewhere that I'm missing?
        m_storageNodesList.push_back(m_ssd_start_node + i);
    }

    m_storageNodeCapacity = params.find<UnitAlgebra>("storageNodeCapacity", "1GiB").getRoundedValue();
Consider adding a check on the units to make sure the value is passed in bytes.

        int nodeIndex = (offset/m_storageNodeCapacity)%m_storageNodesList.size();
        return m_storageNodesList.at(nodeIndex);
    }
    \ No newline at end of file
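For readers following the thread, here is a minimal standalone sketch of the offset-to-node mapping shown in the two hunks above. It is not the PR's code: `StorageMap` and `nodeForOffset` are hypothetical names, and the real logic lives in HadesNetworkIO members (`m_storageNodesList`, `m_storageNodeCapacity`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical standalone helper (not part of the PR); it mirrors the
// storage-node list and modulo mapping visible in the hunks above.
struct StorageMap {
    std::vector<int> storageNodes;  // node IDs of the storage nodes
    std::uint64_t nodeCapacity;     // capacity of one storage node, in bytes

    // Build the list as start, start + 1, ..., start + count - 1.
    StorageMap(int ssdStartNode, int ssdNodes, std::uint64_t capacityBytes)
        : nodeCapacity(capacityBytes) {
        assert(ssdNodes > 0 && capacityBytes > 0);
        for (int i = 0; i < ssdNodes; ++i) {
            storageNodes.push_back(ssdStartNode + i);
        }
    }

    // nodeIndex = (offset / storageNodeCapacity) % numStorageNodes
    int nodeForOffset(std::uint64_t offset) const {
        const std::size_t idx =
            static_cast<std::size_t>((offset / nodeCapacity) % storageNodes.size());
        return storageNodes.at(idx);
    }
};
```

With a start node of 2, two storage nodes, and a 1 GiB capacity, offsets in [0, 1 GiB) map to node 2, offsets in [1 GiB, 2 GiB) map to node 3, and the mapping wraps back to node 2 at 2 GiB.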
        m_nic.qSendEntry(entry);

        delete event;
    }
    \ No newline at end of file

        m_toCoreLink->send( delay , event );
    }

    void sendNetworkIO( SimTime_t delay, SST::Event * event ) {
This function could likely just use the sendShmem() function above, but give it a more generic name. Could likely even just be called send() since it has different parameters than the original send() function above.
    {
        m_dbg.debug(CALL_INFO,2,0,"src=%#lx len=%zu\n", src, len);
        sendCmd(0, new NicNetworkIOWriteCmdEvent( targetNid, src, len, callback ) );
    }
    \ No newline at end of file

    virtual ~Interface() = default;

    // Network IO READ - reads from network storage to local buffer
Comments describing a function would be better formatted as Doxygen comments, as we are trying to move everything to Doxygen.
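As an illustration of that suggestion, the sketch below shows how the two NetworkIO calls might look with Doxygen comments. The call names and parameter order come from the PR description; the `Addr` and `Callback` aliases, the parameter types, and the `ImmediateStub` class are assumptions made only so the example compiles on its own, and the real types are defined in the Hermes headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>

// Assumed stand-ins for the real Hermes address and callback types.
using Addr = std::uint64_t;
using Callback = std::function<void(int)>;

class Interface {
public:
    virtual ~Interface() = default;

    /**
     * @brief Read from network storage into a local buffer.
     *
     * The call is asynchronous: @p callback fires when the read completes.
     *
     * @param dest     Local destination buffer address.
     * @param offset   Global byte offset into network storage.
     * @param length   Number of bytes to read.
     * @param callback Invoked on completion.
     */
    virtual void networkIORead(Addr dest, std::uint64_t offset,
                               std::size_t length, Callback callback) = 0;

    /**
     * @brief Write a local buffer to network storage.
     *
     * The call is asynchronous: @p callback fires when the write completes.
     *
     * @param offset   Global byte offset into network storage.
     * @param src      Local source buffer address.
     * @param length   Number of bytes to write.
     * @param callback Invoked on completion.
     */
    virtual void networkIOWrite(std::uint64_t offset, Addr src,
                                std::size_t length, Callback callback) = 0;
};

// Hypothetical stub that completes every request immediately with status 0.
// It exists only to show the documented interface being implemented.
class ImmediateStub : public Interface {
public:
    std::size_t bytesRead = 0;
    std::size_t bytesWritten = 0;

    void networkIORead(Addr, std::uint64_t, std::size_t length,
                       Callback callback) override {
        assert(static_cast<bool>(callback));
        bytesRead += length;
        callback(0);
    }

    void networkIOWrite(std::uint64_t, Addr, std::size_t length,
                        Callback callback) override {
        assert(static_cast<bool>(callback));
        bytesWritten += length;
        callback(0);
    }
};
```

The `@brief` and `@param` commands are standard Doxygen; the wording of each description is only a suggestion based on the existing `// Network IO READ` comment.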
Network Storage (NetworkIO) Implementation for SST Firefly NIC
What this does
Adds network-attached SSD storage simulation to the Firefly NIC. Compute nodes can now issue async read/write operations to storage nodes over the simulated network, with a SimpleSSD model handling the storage-side latency and bandwidth.
We needed this to study I/O performance in distributed storage setups — how network latency, SSD throughput, node placement, and access patterns interact — without needing real hardware. This plugs into the existing Ember/Firefly/Hermes stack, so existing MPI simulations are unaffected.
How it works
Here's what each layer does:
- Hermes API (networkIOapi.h): defines two calls, networkIORead(dest, offset, length, callback) and networkIOWrite(offset, src, length, callback). Both are async: you pass a callback that fires when the op completes.
- Hades (hadesNetworkIO): takes the global byte offset and figures out which storage node to hit. Uses a simple modulo scheme: nodeIndex = (offset / storageNodeCapacity) % numStorageNodes. Then hands it down to the NIC.
- Firefly NIC (nicNetworkIO, nicNetworkIOSendEntry, nicNetworkIOStream): this is where most of the work happens:
  - On the compute side: the NIC creates a send entry (read or write), generates a response key (respKey) with a callback attached, and queues a small request packet: [MsgHdr::NetworkIO][Read|Write][offset][addr][length][respKey]. The respKey gets stored so the NIC can match the ACK later.
  - On the storage side: a NetworkIOStream receives the packet, extracts the op type and respKey, and passes it to SimpleSSD (or falls back to DMA if SimpleSSD isn't loaded). When the SSD finishes, the stream creates a tiny ACK packet containing just the respKey and sends it back.
  - Back on the compute side: another NetworkIOStream receives the ACK, pulls out the respKey, looks up the stored callback via getRespKeyValue(), and invokes it. This completes the async operation.
  NetworkIO traffic stays separated from MPI: it uses MsgHdr::NetworkIO (vs MsgHdr::Msg), its own NetworkIOStream (vs MsgStream), and vNic 0.
- Ember (TestNetworkIO motif): the workload layer. Issues configurable read/write calls with messageSize, iterations, optype, and fileSize params. Randomizes offsets within the file range.
- SimpleSSD: sits on each storage node as a subcomponent. Has a multi-lane queue structure (nSSDsPerNode × queuesCountPerSSD lanes). Requests go round-robin across lanes. Each request gets a delay of overheadLatency + (bytes / bandwidth), scheduled via a self-link timer. When the delay fires, the completion callback runs and the ACK gets sent.

SimpleSSD model
firefly.SimpleSSD is a subcomponent with these params:
- nSSDsPerNode
- queuesCountPerSSD
- readBandwidthPerSSD_GBps
- writeBandwidthPerSSD_GBps
- readOverheadLatency_ns
- writeOverheadLatency_ns

Delay = overhead + (bytes / bandwidth). Requests go round-robin across all SSD×queue lanes.
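The delay formula and round-robin lane selection described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's SimpleSSD: the `SimpleSsdParams` and `SimpleSsdSketch` types and their method names are hypothetical, only the parameter names come from the description, and 1 GB/s is taken as decimal (1 byte per nanosecond). The sketch does not model queuing within a lane or the self-link timer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Parameter names follow the description; the struct itself is hypothetical.
struct SimpleSsdParams {
    unsigned nSSDsPerNode;
    unsigned queuesCountPerSSD;
    double readBandwidthPerSSD_GBps;
    double writeBandwidthPerSSD_GBps;
    double readOverheadLatency_ns;
    double writeOverheadLatency_ns;
};

enum class Op { Read, Write };

// Hypothetical sketch of lane selection and per-request delay.
class SimpleSsdSketch {
public:
    explicit SimpleSsdSketch(const SimpleSsdParams& p) : m_p(p) {
        assert(p.nSSDsPerNode > 0 && p.queuesCountPerSSD > 0);
        assert(p.readBandwidthPerSSD_GBps > 0.0 && p.writeBandwidthPerSSD_GBps > 0.0);
    }

    // Total lanes = nSSDsPerNode x queuesCountPerSSD.
    std::size_t laneCount() const {
        return static_cast<std::size_t>(m_p.nSSDsPerNode) * m_p.queuesCountPerSSD;
    }

    // Round-robin: return the current lane, then advance with wrap-around.
    std::size_t nextLane() {
        const std::size_t lane = m_next;
        m_next = (m_next + 1) % laneCount();
        return lane;
    }

    // Delay = overhead + bytes / bandwidth.
    // Assumes decimal GB, so 1 GB/s equals 1 byte per nanosecond.
    double delayNs(Op op, std::uint64_t bytes) const {
        const bool isRead = (op == Op::Read);
        const double overhead =
            isRead ? m_p.readOverheadLatency_ns : m_p.writeOverheadLatency_ns;
        const double bandwidth =
            isRead ? m_p.readBandwidthPerSSD_GBps : m_p.writeBandwidthPerSSD_GBps;
        return overhead + static_cast<double>(bytes) / bandwidth;
    }

private:
    SimpleSsdParams m_p;
    std::size_t m_next = 0;
};
```

For example, with an illustrative 2 GB/s read bandwidth and 1000 ns read overhead, a 4096-byte read gets 1000 + 4096 / 2 = 3048 ns. These values are made up for the arithmetic and are not the PR's defaults.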
Node setup
Some nodes are compute, some are storage. Configured in loadNetworkIO:
- Compute nodes run the motif (TestNetworkIO) and issue I/O
- Storage nodes run Null (idle) and serve requests through SimpleSSD
- The split is set by the SSD_START_NODE and SSD_NODES vars in the load file

Files
New files (20):
- hermes/networkIOapi.h: NetworkIO API
- firefly/hadesNetworkIO.{cc,h}: offset-to-node mapping
- firefly/nicNetworkIO.{cc,h}: NIC command handling
- firefly/nicNetworkIOSendEntry.h: packet construction
- firefly/nicNetworkIOStream.{cc,h}: incoming packet processing
- firefly/storageModel/simpleSSD.{cc,h}: SSD simulation
- ember/libs/emberNetworkIOLib.h: library wrapper
- ember/libs/networkIOEvents/emberNetworkIO{Event,ReadEvent,WriteEvent}.h: events
- ember/networkIO/emberNetworkIOGen.{cc,h}: generator base
- ember/networkIO/motifs/emberTestNetworkIO.{cc,h}: test motif
- ember/test/loadNetworkIO: test config
- ember/test/networkIOParams.py: platform params

Modified files (16):
- firefly/nic.{cc,h}: NetworkIO handler + SimpleSSD setup
- firefly/nicEvents.h: new event types
- firefly/nicRecvCtx.cc, nicRecvMachine.h: stream dispatch
- firefly/nicSendEntry.h, nicSendMachine.{cc,h}: send support
- firefly/nicVirtNic.h, virtNic.{cc,h}: delegation methods
- firefly/Makefile.am, hermes/Makefile.am, ember/Makefile.am: build updates
- ember/.gitignore: ignore patterns
- CONTRIBUTORS.TXT: added HPE entry

Testing
Tested with three configs, all pass:
Built with SST Core + Elements from devel, GCC 14.

How to run
Build:
Verify SimpleSSD registered:
Run the default test (2 compute + 2 SSD):

    cd src/sst/elements/ember/test
    sst emberLoad.py -- --topo=torus --shape=2x2 --numNodes=4 --numCores=1 --platform=networkIO --loadFile=loadNetworkIO

To change the node layout, edit loadNetworkIO:
Then run with a matching topology:
Motif params in loadNetworkIO: messageSize=<bytes>, iterations=<N>, op=read|write, fileSize=<bytes>.

Co-authored-by: Rishank Pratik <rishank.pratik@hpe.com>
Co-authored-by: Pawan Kumar <pawan.kumar4@hpe.com>
Co-authored-by: Sumant Kalra <sumant.kalra@hpe.com>
Co-authored-by: Shridhar Joshi <shridhar@hpe.com>