Keep track of per-module DPInitPending state in software to work around transceiver firmware issues #696
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
@aditya-nexthop I believe this is some issue with module firmware as each datapath should have its own
Yes, this is an issue with module firmware. That's why this is still a draft and not marked ready for review.
Certain transceiver firmware clears DPInitPending on other datapaths when setting it for the datapaths that are currently transitioning. This requires keeping track of DPInitPending state in software so that the config loop does not fail when two datapaths in a module are configured in an interleaved manner.
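As a rough illustration of the software-side bookkeeping, here is a minimal sketch in Python. It is not the code in this PR: the class `DpInitTracker`, its method names, and the use of a module index plus logical port name as keys are all hypothetical. Only the state names `CMIS_INSERTED` and `CMIS_FAILED` come from the description.

```python
from typing import Dict, Optional

# States that release the module, as named in the PR description.
RELEASING_STATES = {"CMIS_INSERTED", "CMIS_FAILED"}


class DpInitTracker:
    """Hypothetical per-module record of which datapath is mid-initialization."""

    def __init__(self) -> None:
        # module index -> logical port whose datapath currently owns init
        self._owner: Dict[int, str] = {}

    def try_begin(self, module: int, port: str) -> bool:
        """Claim the module for `port`; return False if another port holds it."""
        owner = self._owner.setdefault(module, port)
        return owner == port

    def release(self, module: int, port: str) -> None:
        """Clear the software state, but only if `port` is the current owner."""
        if self._owner.get(module) == port:
            del self._owner[module]

    def on_state_change(self, module: int, port: str, state: str) -> None:
        """Release the module when the owning port enters a releasing state."""
        if state in RELEASING_STATES:
            self.release(module, port)

    def owner(self, module: int) -> Optional[str]:
        """Return the port currently initializing on `module`, if any."""
        return self._owner.get(module)
```

In a config loop, a port would call `try_begin` before applying datapath init and skip the iteration when it returns `False`. It would call `release` after a successful configuration, and `on_state_change` would cover the `CMIS_INSERTED` and `CMIS_FAILED` cases.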
Description
Software state is maintained for the datapaths undergoing initialization, and a second datapath on a module is not configured while a different datapath on that module is still initializing. This serializes initialization, so the firmware never has the opportunity to overwrite bits in the DpInitPending register.
When a datapath enters the CMIS_INSERTED or CMIS_FAILED state, or when it is configured successfully, the software state is cleared to allow other datapaths on the transceiver to be initialized.
Motivation and Context
During reboot + link up testing, certain transceivers entered the CMIS failed state with the error EthernetX: datapath init not pending. Investigation showed that the affected transceivers' firmware clears a previously set DpInitPending when it needs to be set on a different datapath, which causes xcvrd's DpInitPending check to fail and the state to be set to failed.
How Has This Been Tested?
Successfully ran 500+ reboot + link up tests with transceivers initializing every single time. Previously, links would fail to come up within 10 iterations.
Additional Information (Optional)
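The failure mode described under Motivation and Context can be illustrated with a toy model. This is a sketch under stated assumptions, not real firmware or xcvrd code: `QuirkyModule` and its methods are hypothetical, and the pending state is modeled as a bitmask with one bit per lane, where arming one datapath overwrites the bits instead of OR-ing them in.

```python
class QuirkyModule:
    """Toy model (assumption): arming pending bits for one datapath clears the rest."""

    def __init__(self) -> None:
        self.dp_init_pending = 0  # hypothetical bitmask, one bit per lane

    def apply_dp_init(self, lane_mask: int) -> None:
        # Well-behaved firmware would OR the new bits in; the quirk overwrites.
        self.dp_init_pending = lane_mask

    def finish_init(self, lane_mask: int) -> None:
        # Initialization done: clear the pending bits for these lanes.
        self.dp_init_pending &= ~lane_mask

    def is_pending(self, lane_mask: int) -> bool:
        return (self.dp_init_pending & lane_mask) == lane_mask


def interleaved(module: QuirkyModule) -> tuple:
    """Arm datapath A, then B, before checking A (the failing sequence)."""
    module.apply_dp_init(0x0F)  # datapath A on lanes 0-3
    module.apply_dp_init(0xF0)  # datapath B on lanes 4-7 wipes A's bits
    return module.is_pending(0x0F), module.is_pending(0xF0)


def serialized(module: QuirkyModule) -> tuple:
    """Arm, check, and finish one datapath before starting the next."""
    results = []
    for mask in (0x0F, 0xF0):
        module.apply_dp_init(mask)
        results.append(module.is_pending(mask))
        module.finish_init(mask)
    return tuple(results)


print(interleaved(QuirkyModule()))  # (False, True): A's pending check fails
print(serialized(QuirkyModule()))   # (True, True): both checks pass
```

In this model the interleaved sequence loses datapath A's pending bits, which corresponds to the "datapath init not pending" failure, while the serialized sequence never has two datapaths armed at once.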