fix!: redacting user retirement data in lms #37886

ktyagiapphelix2u · 2026-01-14T11:27:24Z

Description

Instead of deleting user retirement records completely, this PR updates the system to replace personal information (name, email, username) with safe placeholder values while keeping the records in the database.

Private Ticket

https://2u-internal.atlassian.net/browse/BOMS-293

pdpinch · 2026-01-14T13:31:34Z

Maybe this is out of scope for this PR, but I'm concerned about the explicit reference to Jenkins. I understand the desirability of recording the job id, but not everyone uses Jenkins as their runner.

deborahgu · 2026-01-14T13:37:41Z

openedx/core/djangoapps/user_api/accounts/views.py

@@ -1045,7 +1046,15 @@ def cleanup(self, request):
            if len(usernames) != len(retirements):
                raise UserRetirementStatus.DoesNotExist("Not all usernames exist in the COMPLETE state.")

-            retirements.delete()
+            # Redact PII fields instead of deleting records
+            # This ensures Fivetran syncs redacted data to Snowflake instead of creating soft deletes with PII


Fivetran and Snowflake are 2U-specific. Can you rewrite this to be more general, and not specific to one instance's tooling?

Rewrote the comments to be more general. Resolved.

deborahgu · 2026-01-14T13:43:58Z

openedx/core/djangoapps/user_api/accounts/views.py

@@ -1032,6 +1032,7 @@ def cleanup(self, request):
        """
        try:
            usernames = request.data["usernames"]
+            jenkins_run_id = request.data.get("jenkins_run_id", "unknown")


Looks like this requires a change to the docstring, if you want a second, optional value to be sent.

Please generalise the name, so it's not about a Jenkins job in particular. Or better, is there a reason to associate this with Jenkins? Could this just use a random value, or just a fixed string (i.e. "redacted")? Why do we care about associating this with a specific ID?

Updated the docstring. Resolved.

Made the suggested changes. The run_id provides traceability to the specific retirement batch/execution.

robrap · 2026-01-14T21:51:47Z

@pdpinch: @ktyagiapphelix2u is out until Friday, but we will be removing references to 2U and Jenkins. Thanks.

ktyagiapphelix2u · 2026-01-16T04:48:05Z

@pdpinch Agreed, I am removing the Jenkins name reference. Thanks.

robrap · 2026-01-16T14:52:15Z

openedx/core/djangoapps/user_api/accounts/views.py

        """
        try:
            usernames = request.data["usernames"]
+            redaction_id = request.data.get("redaction_id", "unknown")


Can this be 3 inputs: redacted_username, redacted_email, and redacted_name? All decisions about what the redacted values look like would then be made by the caller.

I have implemented the suggested method. You can take a look.

robrap · 2026-01-20T16:10:23Z

openedx/core/djangoapps/user_api/accounts/views.py

        """
        try:
            usernames = request.data["usernames"]
+            redacted_value = request.data.get("redacted_value", "redacted")


I was asking for 3 different redacted values, like the following:

    redacted_username = request.data.get("redacted_username", "redacted")
    redacted_email = request.data.get("redacted_email", "redacted")
    redacted_name = request.data.get("redacted_name", "redacted")

I know that 2U will send the same value for all 3, but others might want different values.

Got it. I thought a single value would be cleaner than passing all 3 arguments, but I will make the change accordingly. Thanks, updating now.
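For readers following along, here is a minimal standalone sketch of the three-input shape requested above. It is not the code from this PR: `RetirementRecord`, `parse_redaction_values`, and `redact` are hypothetical names, and a plain dict stands in for `request.data`. Only the field names (`original_username`, `original_email`, `original_name`) and the `"redacted"` default come from the diff.

```python
from dataclasses import dataclass

DEFAULT_REDACTED = "redacted"


@dataclass
class RetirementRecord:
    """Hypothetical stand-in for a retirement status row."""
    original_username: str
    original_email: str
    original_name: str


def parse_redaction_values(data):
    """Read the three optional values; each one falls back to the default on its own."""
    return {
        "original_username": data.get("redacted_username", DEFAULT_REDACTED),
        "original_email": data.get("redacted_email", DEFAULT_REDACTED),
        "original_name": data.get("redacted_name", DEFAULT_REDACTED),
    }


def redact(record, values):
    """Overwrite each PII field with the placeholder chosen by the caller."""
    for field_name, value in values.items():
        setattr(record, field_name, value)
    return record


# Assumed payload: the caller overrides only the email placeholder.
payload = {"usernames": ["jdoe"], "redacted_email": "redacted@example.invalid"}
record = RetirementRecord("jdoe", "jdoe@example.com", "Jane Doe")
redact(record, parse_redaction_values(payload))
print(record)
```

Because each value defaults independently, a caller that sends one string for all three fields and a caller that sends three different strings both go through the same path.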

robrap · 2026-01-20T17:43:44Z

openedx/core/djangoapps/user_api/accounts/tests/test_retirement_views.py

+            assert retirement.original_username == 'redacted'
+            assert retirement.original_email == 'redacted'
+            assert retirement.original_name == 'redacted'


Can we also get a test for non-defaults, to ensure that the right data makes it to the right places?

Got it, I have added a test for non-defaults.
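A rough sketch of what a non-default test could check, assuming a hypothetical helper `redact_fields` that works on plain dicts rather than the real model and view. The point is to give each field a distinct placeholder so that a swapped assignment (for example, the email placeholder landing in the name field) fails the test.

```python
def redact_fields(record, data):
    """Hypothetical helper: return a copy of record with PII fields replaced."""
    redacted = dict(record)
    redacted["original_username"] = data.get("redacted_username", "redacted")
    redacted["original_email"] = data.get("redacted_email", "redacted")
    redacted["original_name"] = data.get("redacted_name", "redacted")
    return redacted


def test_non_default_values_land_in_matching_fields():
    record = {
        "original_username": "jdoe",
        "original_email": "jdoe@example.com",
        "original_name": "Jane Doe",
    }
    # Distinct values per field, so a mix-up between fields is detectable.
    data = {
        "redacted_username": "username-placeholder",
        "redacted_email": "email-placeholder",
        "redacted_name": "name-placeholder",
    }
    result = redact_fields(record, data)
    assert result["original_username"] == "username-placeholder"
    assert result["original_email"] == "email-placeholder"
    assert result["original_name"] == "name-placeholder"
    # The input record is left untouched by this helper.
    assert record["original_username"] == "jdoe"


test_non_default_values_land_in_matching_fields()
```

With identical placeholders for all three fields, as in the default-value assertions above, this kind of mix-up would go unnoticed.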

robrap · 2026-01-20T17:54:41Z

@ktyagiapphelix2u: Also, I think we should change this to a breaking change PR. That means updating the title to fix!: .... Also, we should open a Fast Track DEPR (i.e. automatically accepted) for this as an inform.

We should note that switching from deleting records to redacting records is not a breaking change from the point of view of user retirement, because in either case the sensitive data has been safely taken care of. The breaking change is that these records will no longer be deleted, so any operator scripts that run against the table after retirement, that also relied on the fact that data was being deleted, would need to be updated.

ktyagiapphelix2u · 2026-01-21T06:18:42Z

@robrap I was thinking of updating the archive cleanup script for Open edX. Is it expected to be updated there as well, or should it be left as is, given that they have moved to CI workflows rather than Jenkins?

bmtcril · 2026-01-21T15:53:07Z

Hi all, can we get a description of the problem this is trying to address, since we can't see the internal ticket? I have some (very small) amount of concern about this table growing given how often it's called, but mostly I want to understand why this change is necessary.

Since this is a breaking change in the API, I think it should be versioned to a V2 endpoint as part of the DEPR.

@fghaas I know you've been involved in tutor-contrib-retirement so wanted to give you a heads up as well

robrap · 2026-01-26T17:37:38Z

@bmtcril: In thinking through explaining the issue to you, I may have come up with a backward-compatible version of this change, so that is what I am going to propose we do.

The reason we wanted this change is because in Snowflake, we've used different technologies for sync, but all of them treat the status record deletes as soft-deletes, which then requires an additional custom job to clear out the soft deleted data to fully remove the sensitive data. We decided that redacting here, just like we do everywhere else, would avoid the custom jobs and ongoing maintenance.

I just realized that we could have the best of both worlds, and have this api first redact, and then delete. That way it will still be backward-compatible with the deletes, but we wouldn't have to clean up the soft-deletes, because those records would already be redacted.

@ktyagiapphelix2u: Can you please implement this? You'll need to ensure that the redacted data gets saved to the DB before the delete. I'm not certain exactly how to ensure this, but maybe work with Dave W. to confirm that the soft-deleted records contain the redacted data? Hopefully neither MySQL nor Django nor the sync code tries to be too smart.
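To illustrate the ordering only, here is a small sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the table names, the `soft_deleted` table, and the `capture_delete` trigger, which crudely imitates a sync tool that keeps the last state of a deleted row. It says nothing about how MySQL, Django, or a real sync pipeline behaves; that still needs to be verified against the actual tooling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE retirement_status (
        id INTEGER PRIMARY KEY,
        original_username TEXT,
        original_email TEXT,
        original_name TEXT
    );
    -- Hypothetical stand-in for the soft-deleted copy kept by a sync tool.
    CREATE TABLE soft_deleted (
        id INTEGER,
        original_username TEXT,
        original_email TEXT,
        original_name TEXT
    );
    -- Copy the row as it looks at delete time into soft_deleted.
    CREATE TRIGGER capture_delete BEFORE DELETE ON retirement_status
    BEGIN
        INSERT INTO soft_deleted
        VALUES (OLD.id, OLD.original_username, OLD.original_email, OLD.original_name);
    END;
    """
)
conn.execute(
    "INSERT INTO retirement_status VALUES (1, 'jdoe', 'jdoe@example.com', 'Jane Doe')"
)

# Step 1: redact, and commit so the redacted values are persisted.
with conn:
    conn.execute(
        "UPDATE retirement_status "
        "SET original_username = ?, original_email = ?, original_name = ? "
        "WHERE id = ?",
        ("redacted", "redacted", "redacted", 1),
    )

# Step 2: delete in a separate transaction, after the redaction is committed.
with conn:
    conn.execute("DELETE FROM retirement_status WHERE id = ?", (1,))

rows = conn.execute("SELECT * FROM soft_deleted").fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM retirement_status").fetchone()[0]
print(rows, remaining)
```

In this toy setup the captured row holds only the placeholders, and the table ends up empty, which matches the existing delete behavior. Committing the redaction separately before the delete is a deliberate choice here, so that nothing can merge the two steps into a single delete of the original values.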

robrap · 2026-01-26T17:39:19Z

@ktyagiapphelix2u: Once we ensure we can get everything working with redact + delete, we'd be able to close the DEPR:

[DEPR]: User Retirement Records Redaction #37921.

fix: redacting user retirement data in lms

19fc427

ktyagiapphelix2u requested a review from a team as a code owner January 14, 2026 11:27

ktyagiapphelix2u force-pushed the ktyagi/redaction branch from 70f2265 to 7a7ef1b Compare January 14, 2026 11:39

deborahgu reviewed Jan 14, 2026

ktyagiapphelix2u force-pushed the ktyagi/redaction branch from 99b47d4 to fa6aa32 Compare January 14, 2026 13:53

robrap reviewed Jan 16, 2026

ktyagiapphelix2u force-pushed the ktyagi/redaction branch from 63da9fe to a7cdaa6 Compare January 20, 2026 05:22

ttak-apphelix approved these changes Jan 20, 2026

robrap reviewed Jan 20, 2026

ktyagiapphelix2u force-pushed the ktyagi/redaction branch from 19bc46e to eb5b4df Compare January 20, 2026 17:13

robrap reviewed Jan 20, 2026

ktyagiapphelix2u changed the title ~~fix: redacting user retirement data in lms~~ fix!: redacting user retirement data in lms Jan 21, 2026

ktyagiapphelix2u force-pushed the ktyagi/redaction branch from 0c174cf to 536a3bd Compare January 21, 2026 06:05

ktyagiapphelix2u mentioned this pull request Jan 21, 2026

[DEPR]: User Retirement Records Redaction #37921

Open

fix: redacting user retirement data in lms

5ac51b6

ktyagiapphelix2u force-pushed the ktyagi/redaction branch from 536a3bd to 5ac51b6 Compare January 21, 2026 07:15

6 participants