43 changes: 43 additions & 0 deletions aci-preupgrade-validation-script.py
@@ -6012,6 +6012,48 @@ def apic_vmm_inventory_sync_faults_check(**kwargs):
doc_url=doc_url)


@check_wrapper(check_title='Rogue/COOP Exception List missing on switches')
def rogue_ep_coop_exception_mac_check(cversion, tversion, **kwargs):
    result = PASS
    headers = ["Rogue Exception MACs Count", "presListener Count"]
    data = []
    recommended_action = 'Remove the affected EP exception configurations and re-add them'
    doc_url = 'https://datacenter.github.io/ACI-Pre-Upgrade-Validation-Script/validations/#roguecoop-exception-list-missing-on-switches'

    # Target version check
    if not tversion:
        prints("Target version not provided, skipping check.")

Review comment (Collaborator), suggested change: delete the following line.

prints("Target version not provided, skipping check.")

        return Result(result=MANUAL, msg=TVER_MISSING)

    # Affected source version is in range [5.2(3):6.0(3)] . Fixed on 6.0(9e)+ and 6.1(4)+.
    # if cversion.newer_than("3.1(2v)") and tversion.older_than("6.1(3g)"):
    if (
        (cversion.same_as("5.2(3e)") or cversion.newer_than("5.2(3e)")) and
        (cversion.same_as("6.0(3g)") or cversion.older_than("6.0(3g)")) and

Review comment (Collaborator):

Why 6.0(3g)? The enhancement that caused CSCwp64296 was introduced in the first release of 6.0(3) which is 6.0(3d).
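
To make the boundary question concrete, here is a minimal sketch of the source-version range check with an exclusive upper bound at 6.0(3d). It does not use the script's own version class; `parse_version` and `is_affected_source` are hypothetical helpers, the lower bound 5.2(3e) is simply copied from the PR, and the parser assumes a single trailing patch letter.

```python
import re

# Hypothetical helper, not the script's own version class.
_VERSION_RE = re.compile(r'^(\d+)\.(\d+)\((\d+)([a-z]*)\)$')


def parse_version(text):
    """Turn '6.0(3g)' into a comparable tuple (6, 0, 3, 'g')."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError("unrecognized version: {}".format(text))
    major, minor, maint, patch = match.groups()
    return (int(major), int(minor), int(maint), patch)


def is_affected_source(cversion, lower="5.2(3e)", first_enhanced="6.0(3d)"):
    """True when lower <= cversion < first_enhanced.

    Assumption: sources already running the first 6.0(3) release or later
    are outside the range, as the comment above implies.
    """
    current = parse_version(cversion)
    return parse_version(lower) <= current < parse_version(first_enhanced)
```

With these bounds, `is_affected_source("6.0(2j)")` is true while `is_affected_source("6.0(3d)")` and `is_affected_source("6.0(3g)")` are false, which is the difference from the inclusive `6.0(3g)` bound in the PR.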

        (
            tversion.older_than("6.0(9e)") or
            ((tversion.same_as("6.1(1f)") or tversion.newer_than("6.1(1f)")) and tversion.older_than("6.1(4h)"))
        )
    ):
        # endpoint to fetch the rogue exception MACs
        exception_mac_api = 'fvRogueExceptionMac.json?query-target-filter=and(wcard(fvRogueExceptionMac.dn,"([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}"))'

Review comment (Collaborator):

Why are you matching MAC address in regex here? fvRogueExceptionMac is for MAC as the name suggests. Its DN always has a MAC address.

Unless I'm missing something, this is just causing unnecessary query work on the APICs.


        # endpoint to fetch the presListener entries
        presListener_api = 'presListener.json?query-target-filter=and(wcard(presListener.dn,"exceptcont"))'

Review comment (Collaborator):

Instead of a regex match (wcard), you should do an exact match on lstDn, such as eq(presListener.lstDn,"exceptcont")

And use rsp-subtree-include=count instead of having APICs return all of the contents because what we need to know is only the number of objects.
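
A rough sketch of what that suggestion could look like follows. The helper names are hypothetical, and the response shape (a single `moCount` object whose `count` attribute holds the total) is an assumption about what the APIC returns for `rsp-subtree-include=count`; verify it against a real response before relying on it.

```python
def build_count_query(class_name, attr, value):
    """Build a class query that asks only for the number of matching objects."""
    exact_filter = 'eq({0}.{1},"{2}")'.format(class_name, attr, value)
    return '{0}.json?query-target-filter={1}&rsp-subtree-include=count'.format(
        class_name, exact_filter
    )


def parse_count(imdata):
    """Extract the object count from the returned imdata list.

    Assumed shape: [{"moCount": {"attributes": {"count": "32"}}}]
    """
    if not imdata:
        return 0
    return int(imdata[0]['moCount']['attributes']['count'])


# Stubbed response for illustration only; no APIC is contacted here.
sample_imdata = [{"moCount": {"attributes": {"count": "32"}}}]
query = build_count_query('presListener', 'lstDn', 'exceptcont')
count = parse_count(sample_imdata)
```

In the check itself, the string from `build_count_query` would be passed to `icurl('class', ...)` in place of the wcard-based query.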


        exception_macs = icurl('class', exception_mac_api)

        if exception_macs:
            prints("Found {} exception MACs, checking presListener entries...".format(len(exception_macs)))

Review comment (Collaborator):

Do not use prints() inside a check function.

In the new framework with the progress bar, the output ends up like this, which breaks the progress bar view. Please use logging instead, if needed. Otherwise, that information should be stored in the data so that it can be displayed in the result table and the JSON file.

--- skip ---
Collecting VPC Node IDs...

Found 1 exception MACs, checking presListener entries...-------------------------------------------------------| 0/1 checks completed     <--- !!!!!
Progress: |████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 checks completed


=== Check Result (failed only) ===
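
One way to follow this advice is sketched below with Python's standard `logging` module. The logger name and the `summarize_counts` helper are hypothetical; the threshold of 32 and the two result columns come from the PR.

```python
import logging

# Hypothetical logger name; the script may already configure its own logger.
log = logging.getLogger("rogue_ep_coop_exception_mac_check")


def summarize_counts(exception_mac_count, listener_count, expected=32):
    """Return result-table rows instead of printing to the console.

    Diagnostic detail goes to the log, so the progress bar is not disturbed.
    """
    log.info(
        "Found %d exception MACs and %d presListener entries",
        exception_mac_count,
        listener_count,
    )
    data = []
    if exception_mac_count > 0 and listener_count < expected:
        # Same columns as the PR's headers:
        # ["Rogue Exception MACs Count", "presListener Count"]
        data.append([exception_mac_count, listener_count])
    return data
```

The returned rows can be handed to `Result(..., data=data)` so the counts appear in the result table and the JSON file rather than on stdout.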

            presListener_response = icurl('class', presListener_api)
            if len(presListener_response) >= 0 and len(presListener_response) < 32:
                prints("Insufficient presListener entries ({} found) for {} exception MACs.".format(len(presListener_response), len(exception_macs)))
                result = FAIL_O
                data.append([len(exception_macs), len(presListener_response)])

    return Result(result=result, headers=headers, data=data, recommended_action=recommended_action, doc_url=doc_url)


@check_wrapper(check_title='APIC downgrade compatibility when crossing 6.2 release')
def apic_downgrade_compat_warning_check(cversion, tversion, **kwargs):
    result = NA
@@ -6216,6 +6258,7 @@ class CheckManager:
isis_database_byte_check,
configpush_shard_check,
auto_firmware_update_on_switch_check,
rogue_ep_coop_exception_mac_check,

]
ssh_checks = [
38 changes: 37 additions & 1 deletion docs/docs/validations.md
@@ -194,6 +194,7 @@ Items | Defect | This Script
[ISIS DTEPs Byte Size][d27] | CSCwp15375 | :white_check_mark: | :no_entry_sign:
[Policydist configpushShardCont Crash][d28] | CSCwp95515 | :white_check_mark: | :no_entry_sign:
[Auto Firmware Update on Switch Discovery][d29] | CSCwe83941 | :white_check_mark: | :no_entry_sign:
[Rogue/COOP Exception List missing on switches][d30] | CSCwp64296 | :white_check_mark: | :no_entry_sign:

[d1]: #ep-announce-compatibility
[d2]: #eventmgr-db-size-defect-susceptibility
@@ -224,6 +225,8 @@ Items | Defect | This Script
[d27]: #isis-dteps-byte-size
[d28]: #policydist-configpushshardcont-crash
[d29]: #auto-firmware-update-on-switch-discovery
[d30]: #roguecoop-exception-list-missing-on-switches


## General Check Details

@@ -2648,6 +2651,7 @@ Due to [CSCwp95515][59], upgrading to an affected version while having any `conf

If any instances of `configpushShardCont` are flagged by this script, Cisco TAC must be contacted to identify and resolve the underlying issue before performing the upgrade.


### Auto Firmware Update on Switch Discovery

[Auto Firmware Update on Switch Discovery][63] automatically upgrades a new switch to the target firmware version before registering it to the ACI fabric. This feature activates in three scenarios:
@@ -2668,6 +2672,37 @@ To avoid this risk, consider disabling Auto Firmware Update before upgrading to
This issue occurs because older switch firmware versions are not compatible with switch images 6.0(3) or newer. The APIC version is not a factor.


### Rogue/COOP Exception List missing on switches

Rogue Endpoint Control and COOP Dampening are features that mitigate the impact of flapping endpoints by temporarily pausing the learning of such endpoints. However, in some environments, certain MAC or IP addresses are expected to move frequently.

The **Rogue/COOP Exception List** feature, introduced in 5.2(3), allows you to exclude specific MAC addresses from Rogue Endpoint Control and COOP Dampening. Initially, each MAC address had to be configured individually in each bridge domain. In 6.0(3), this feature was enhanced to support fabric-wide exception lists with wildcard options per bridge domain and the ability to exclude MAC addresses in L3Outs.

However, due to [CSCwp64296][64], when spine switches are upgraded to version 6.0(3)+ from an older version with a **Rogue/COOP Exception List** configured, some exception lists may not be pushed to the spine switches. As a result, the feature may stop functioning after the upgrade.

!!! info
    The root cause is that internal objects called `presListener` for **Rogue/COOP Exception List**, which publish the configuration from APICs to switches, may be missing on the APICs after an upgrade. This is due to [CSCwp64296][64], introduced with the enhancement in 6.0(3).

    The total number of `presListener` objects for **Rogue/COOP Exception List** on APICs should be 32, but APICs may fail to create all of them when upgrading from an older version to 6.0(3)+. If the spine switches are then upgraded while some `presListener` objects are missing on APICs, they cannot retrieve the complete lists.

This rule alerts you to [CSCwp64296][64] if:

* Your current version is between 5.2(3) and 6.0(2).
* Your target version is affected by [CSCwp64296][64].
* A **Rogue/COOP Exception List** is configured for bridge domains.

OR

* Both your current and target APIC versions are the same and affected by [CSCwp64296][64].
* The oldest current switch version is between 5.2(3) and 6.0(2).
* A **Rogue/COOP Exception List** is configured for bridge domains.
* The total number of `presListener` objects for **Rogue/COOP Exception List** is less than 32.

If the first set of conditions is met, you should change your target version to one with the fix for [CSCwp64296][64].

If the second set of conditions is met, it means that the APICs were already upgraded and affected by [CSCwp64296][64], but some switches have yet to be upgraded. In this case, you need to contact Cisco TAC to resolve the issue of missing `presListener` objects on APICs (see info above) to prevent the switches from failing to retrieve the exception lists.
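
The two sets of conditions above can be summarized in a short sketch. This is an illustration of the documented logic only, not the script's implementation; the function name, its boolean inputs, and the returned strings are all hypothetical.

```python
EXPECTED_PRES_LISTENERS = 32  # total stated in the text above


def assess(current_in_range, target_affected, apic_already_on_target,
           oldest_switch_in_range, exception_list_configured,
           pres_listener_count):
    """Map the documented conditions to the documented next step.

    current_in_range / oldest_switch_in_range: version is 5.2(3) to 6.0(2).
    apic_already_on_target: current and target APIC versions are the same.
    """
    if not exception_list_configured:
        return "no alert"
    if apic_already_on_target:
        # Second set: APICs already upgraded, some switches still pending.
        if (target_affected and oldest_switch_in_range
                and pres_listener_count < EXPECTED_PRES_LISTENERS):
            return "contact Cisco TAC"
        return "no alert"
    # First set: upgrade not started yet.
    if current_in_range and target_affected:
        return "choose a fixed target version"
    return "no alert"
```

For example, `assess(True, True, False, True, True, 0)` yields the first outcome, while `assess(False, True, True, True, True, 20)` yields the second.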


[0]: https://github.com/datacenter/ACI-Pre-Upgrade-Validation-Script
[1]: https://www.cisco.com/c/dam/en/us/td/docs/Website/datacenter/apicmatrix/index.html
[2]: https://www.cisco.com/c/en/us/support/switches/nexus-9000-series-switches/products-release-notes-list.html
@@ -2731,4 +2766,5 @@ To avoid this risk, consider disabling Auto Firmware Update before upgrading to
[60]: https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html#Inter
[61]: https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html#EnablePolicyCompression
[62]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwe83941
[63]: https://www.cisco.com/c/en/us/td/docs/dcn/aci/apic/all/apic-installation-aci-upgrade-downgrade/Cisco-APIC-Installation-ACI-Upgrade-Downgrade-Guide/m-auto-firmware-update.html
[63]: https://www.cisco.com/c/en/us/td/docs/dcn/aci/apic/all/apic-installation-aci-upgrade-downgrade/Cisco-APIC-Installation-ACI-Upgrade-Downgrade-Guide/m-auto-firmware-update.html
[64]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwp64296
@@ -0,0 +1 @@
[]