[BUG] Muted faults appear in the web UI and Foxglove #413

@evTessellate

Bug report

Steps to reproduce

Here's a minimal example where a reporter reports a MAIN_FAULT and a CASCADED_FAULT, where the CASCADED_FAULT should be muted by the MAIN_FAULT via a correlation_rule.

  1. Change directory to a ROS 2 workspace
  2. Clone the minimal example: git clone git@github.com:evTessellate/medkit_correlation_example.git
  3. Build the package
  4. source install/setup.bash
  5. Run the fault manager, the gateway and the mock reporter with: ros2 launch medkit_correlation_example demo.launch.yaml

Expected behavior

When I open the ros2_medkit web UI, I expect to see only the MAIN_FAULT and not the CASCADED_FAULT, based on the provided correlation_rules:

correlation:
  enabled: true
  default_window_ms: 5000

  patterns:
    cascaded-fault:
      codes: ["CASCADED_FAULT"]

  rules:
    - id: main-masks-cascaded
      name: "Main Fault Masks Cascaded Fault"
      mode: hierarchical
      root_cause:
        codes:
          - MAIN_FAULT
      symptoms:
        - pattern: cascaded-fault
      mute_symptoms: true
      auto_clear_with_root: true
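
To make the expectation concrete, here is a minimal Python sketch of how I read the rule above. It is not ros2_medkit's implementation: the Fault class and the muted_codes and visible_faults helpers are hypothetical, and I am assuming a symptom is muted when it first occurs no earlier than the root cause and within default_window_ms of it.

```python
from dataclasses import dataclass


@dataclass
class Fault:
    code: str
    first_occurred_ms: int  # first occurrence, in milliseconds


def muted_codes(faults, root_codes, symptom_codes, window_ms):
    """Return symptom codes that occur within window_ms after a root cause.

    Assumption: this mirrors mode: hierarchical with mute_symptoms: true.
    """
    root_times = [f.first_occurred_ms for f in faults if f.code in root_codes]
    muted = set()
    for f in faults:
        if f.code not in symptom_codes:
            continue
        if any(0 <= f.first_occurred_ms - t <= window_ms for t in root_times):
            muted.add(f.code)
    return muted


def visible_faults(faults, muted, include_muted):
    """Filter the fault list the way I expect include_muted to behave."""
    return [f for f in faults if include_muted or f.code not in muted]


# Timestamps taken from the reported faults, converted to milliseconds.
faults = [
    Fault("MAIN_FAULT", 1780949039335),
    Fault("CASCADED_FAULT", 1780949039381),
]
muted = muted_codes(faults, {"MAIN_FAULT"}, {"CASCADED_FAULT"}, window_ms=5000)
print([f.code for f in visible_faults(faults, muted, include_muted=False)])
# prints ['MAIN_FAULT']
```

With include_muted=True the same helper returns both faults, which is the only case where I would expect to see CASCADED_FAULT.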

I also expect curl "http://localhost:8080/api/v1/faults?include_muted=false" to return only the MAIN_FAULT since I don't want to include muted faults.

Actual behavior

In the web UI I see both faults:

[Image: screenshot of the web UI showing both faults]

The request curl "http://localhost:8080/api/v1/faults?include_muted=false" also returned both faults, including the muted CASCADED_FAULT:

{
  "items": [
    {
      "description": "Fault that causes other faults.",
      "fault_code": "MAIN_FAULT",
      "first_occurred": 1780949039.335378,
      "last_occurred": 1780950700.8036692,
      "occurrence_count": 233,
      "reporting_sources": [
        "/medkit_correlation_example"
      ],
      "severity": 2,
      "severity_label": "ERROR",
      "status": "CONFIRMED"
    },
    {
      "description": "Fault that is caused by MAIN_FAULT.",
      "fault_code": "CASCADED_FAULT",
      "first_occurred": 1780949039.3812168,
      "last_occurred": 1780950700.8099415,
      "occurrence_count": 233,
      "reporting_sources": [
        "/medkit_correlation_example"
      ],
      "severity": 2,
      "severity_label": "ERROR",
      "status": "CONFIRMED"
    }
  ],
  "x-medkit": {
    "cluster_count": 0,
    "count": 2,
    "muted_count": 1
  }
}

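I'm fairly sure the web UI behavior and the API response are linked. Note that x-medkit reports muted_count: 1 while items still contains both faults.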
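
For anyone who wants to check this without reading the JSON by hand, here is a small Python sketch that flags muted faults leaking into the response. The fetch_faults and leaked_muted_faults helpers are hypothetical names of mine, and the only thing assumed about the API is the response shape shown above (an items list with fault_code entries).

```python
import json
import urllib.request

URL = "http://localhost:8080/api/v1/faults?include_muted=false"


def fetch_faults(url=URL):
    """Fetch the fault list from a running gateway (not called below)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def leaked_muted_faults(response, expected_muted):
    """Return codes that should be muted but still appear in items."""
    present = {item["fault_code"] for item in response.get("items", [])}
    return sorted(present & set(expected_muted))


# Trimmed copy of the response I got, so the check runs without a gateway.
sample = {
    "items": [{"fault_code": "MAIN_FAULT"}, {"fault_code": "CASCADED_FAULT"}],
    "x-medkit": {"cluster_count": 0, "count": 2, "muted_count": 1},
}
print(leaked_muted_faults(sample, ["CASCADED_FAULT"]))
# prints ['CASCADED_FAULT']
```

Against the running demo, leaked_muted_faults(fetch_faults(), ["CASCADED_FAULT"]) should return an empty list once the filter works as I expect.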

Environment

  • ros2_medkit version (commit hash): a0162ad
  • ROS 2 distro: Humble
  • OS: Ubuntu 22.04

Additional information

I plan to submit a PR to fix this, but I'd like your input on whether it is a bug or not. It might be something that should be handled by the UI, but I feel that a request with include_muted=false should return only non-muted faults.
