Log a warning if more than a certain threshold of objects are updated in an import #340

@hancush

We encountered an issue over in LA Metro where audio links disappeared from every event in a single scrape / import due to an outage in Legistar: Metro-Records/la-metro-councilmatic#713

Assuming an existing database, we generally only expect a handful of updates per scrape. Mass updates could indicate an important or breaking change at the scraping source. In this case, a warning would have alerted us that something had gone wrong and allowed us to be more proactive in reaching a resolution.

It would be awesome if pupa had a configurable expected update threshold with a sane default, such as 75%, and would log a warning if more than that percentage of scraped entities are updated in a given run.
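
To make the request concrete, here is a rough sketch of the kind of check I have in mind. It is not pupa code: the function names, the `DEFAULT_UPDATE_THRESHOLD` constant, and the shape of the `counts` dict (per-type `insert` / `update` / `noop` tallies) are all assumptions for illustration, and the real hook would need to live wherever the import results are tallied.

```python
import logging

logger = logging.getLogger("import_report")

# Hypothetical default, taken from the 75% figure suggested above.
DEFAULT_UPDATE_THRESHOLD = 0.75


def update_ratio(counts):
    """Return the fraction of scraped entities that were updated.

    `counts` is assumed to be a dict with optional integer keys
    "insert", "update", and "noop" for one entity type.
    """
    updated = counts.get("update", 0)
    total = counts.get("insert", 0) + updated + counts.get("noop", 0)
    if total == 0:
        # Nothing was scraped, so there is nothing to compare against.
        return 0.0
    return updated / total


def warn_on_mass_update(entity_type, counts,
                        threshold=DEFAULT_UPDATE_THRESHOLD, log=logger):
    """Log a warning if more than `threshold` of entities were updated.

    Returns True when the warning fired, False otherwise.
    """
    ratio = update_ratio(counts)
    exceeded = ratio > threshold
    if exceeded:
        log.warning(
            "%s: %.0f%% of scraped objects were updated (threshold %.0f%%)",
            entity_type, ratio * 100, threshold * 100,
        )
    return exceeded
```

With something like this, a run where 80 of 100 events were updated would log a warning, while a typical run with a handful of updates would stay quiet. The threshold is a keyword argument so it could be wired to a setting.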
