Skip to content

adding model caching and resolving n+1 issues#3484

Closed
praffq wants to merge 23 commits intodevelopfrom
prafful/resolve-n+1-issues
Closed

adding model caching and resolving n+1 issues#3484
praffq wants to merge 23 commits intodevelopfrom
prafful/resolve-n+1-issues

Conversation

@praffq
Copy link
Copy Markdown
Contributor

@praffq praffq commented Jan 21, 2026

Proposed Changes

  • Fixing N+1 issues reported on Sentry
  • added caching for UserSpec in all the used places

Added caching for -

  • UserSpec

Associated Issue

Architecture changes

  • Remove this section if not used

Merge Checklist

  • Tests added/fixed
  • Update docs in /docs
  • Linting Complete
  • Any other necessary step

Only PR's with test cases included and passing lint and test pipelines will be reviewed

@ohcnetwork/care-backend-maintainers @ohcnetwork/care-backend-admins

Summary by CodeRabbit

  • Bug Fixes

    • Fixed incorrect organization reference in user profile serialization for accurate display.
  • Performance Improvements

    • Reduced redundant database lookups by eager-loading related records and using cached user lookups.
    • Standardized audit-user serialization across resources to simplify output and avoid duplicated logic.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 21, 2026

📝 Walkthrough

Walkthrough

Eager-load related objects with select_related in several viewsets; centralize audit-user serialization into a shared helper across many resource specs; replace direct User object serialization with cache-backed lookups via model_from_cache; and consolidate location/organization serialization to use ORM queries.

Changes

Cohort / File(s) Summary
Viewset query optimizations
care/emr/api/viewsets/facility.py, care/emr/api/viewsets/facility.py (AllFacilityViewSet), care/emr/api/viewsets/organization.py, care/emr/api/viewsets/scheduling/availability.py
Added select_related() calls to eager-load geo_organization, role, and availability where applicable.
Viewset variable & cache refactors
care/emr/api/viewsets/location.py, care/emr/api/viewsets/patient.py, care/emr/api/viewsets/scheduling/booking.py
Renamed encounter_organizationslocation_organizations; simplified patient querysets and added select_related("geo_organization") in search paths; switched user serialization to model_from_cache(UserSpec, id=...) in user-related handlers.
Base utilities / cache behavior
care/emr/resources/base.py
Guarded audit-user attribute access with hasattr; model_from_cache now returns data.model_dump(mode="json") instead of dict(data).
Audit-user consolidation
care/emr/resources/*/spec.py (many files: allergy_intolerance, condition, facility, facility_organization, location, medication/administration, meta_artifact, notes/*, questionnaire, questionnaire_response, valueset, patient, etc.)
Replaced explicit per-field created_by/updated_by serialization with centralized cls.serialize_audit_users(mapping, obj) calls across multiple Read/Retrieve specs.
Cache-based user resolution
care/emr/resources/consent/spec.py, care/emr/resources/device/history_spec.py, care/emr/resources/file_upload/spec.py, care/emr/resources/observation/spec.py, care/emr/resources/organization/organization_user_spec.py, care/emr/resources/resource_request/spec.py
Replaced direct ORM User lookups or UserSpec.serialize(...) with model_from_cache(UserSpec, id=...) for fields like verified_by, updated_by, uploaded_by, archived_by, user, assigned_to, and data_entered_by.
Location & organization serialization changes
care/emr/resources/encounter/spec.py, care/emr/resources/healthcare_service/spec.py, care/emr/resources/medication/dispense/dispense_order.py, care/emr/resources/scheduling/resource/spec.py, care/emr/resources/service_request/spec.py
Moved some model imports to local scope; replaced per-item lookups with filtered ORM queries or objects.get(id=...) and serialized results (e.g., FacilityLocation handling and organization select_related).
Patient & user spec tweaks
care/emr/resources/patient/spec.py, care/emr/resources/user/spec.py
Consolidated patient audit serialization to shared helper; fixed organization serialization loop to use the loop variable when serializing organizations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.80% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main objective of the PR: addressing N+1 query problems and implementing model caching, which aligns with the extensive changes made across 30+ files.
Description check ✅ Passed The description covers the main changes (N+1 fixes and UserSpec caching), references associated GitHub issues, and includes the merge checklist, though it could be more detailed about the scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch prafful/resolve-n+1-issues

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 72.58065% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.54%. Comparing base (9607f4d) to head (9e2b2d8).
⚠️ Report is 9 commits behind head on develop.

Files with missing lines Patch % Lines
care/emr/api/viewsets/patient.py 40.00% 3 Missing ⚠️
care/emr/resources/observation/spec.py 25.00% 3 Missing ⚠️
care/emr/resources/device/history_spec.py 50.00% 1 Missing and 1 partial ⚠️
care/emr/resources/file_upload/spec.py 71.42% 1 Missing and 1 partial ⚠️
care/emr/resources/resource_request/spec.py 60.00% 1 Missing and 1 partial ⚠️
care/emr/resources/scheduling/resource/spec.py 33.33% 2 Missing ⚠️
care/emr/api/viewsets/facility.py 50.00% 1 Missing ⚠️
...re/emr/resources/medication/administration/spec.py 0.00% 1 Missing ⚠️
care/emr/resources/valueset/spec.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3484      +/-   ##
===========================================
+ Coverage    74.49%   74.54%   +0.05%     
===========================================
  Files          471      471              
  Lines        21679    21628      -51     
  Branches      2257     2226      -31     
===========================================
- Hits         16149    16122      -27     
+ Misses        5029     5017      -12     
+ Partials       501      489      -12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@praffq praffq marked this pull request as ready for review January 27, 2026 07:34
@praffq praffq requested a review from a team as a code owner January 27, 2026 07:34
@praffq praffq changed the title resolving n+1 issues adding model caching and resolving n+1 issues Jan 27, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@care/emr/api/viewsets/encounter.py`:
- Around line 270-276: The list comprehension that builds data from
EncounterOrganization rows may include None when
model_from_cache(FacilityOrganizationReadSpec, id=org_id) misses the cache;
update the logic that creates data (using encounter_organizations and
model_from_cache) to filter out None results (or skip those org_ids) so the
returned list contains only valid FacilityOrganizationReadSpec objects; locate
the comprehension that assigns data and add a guard (e.g., only include entries
where model_from_cache returned a non-None value) to avoid nulls in the
response.

In `@care/emr/resources/consent/spec.py`:
- Around line 119-124: The code mutates obj.verification_details in-place
causing cached model data to be corrupted; instead, make a shallow or deep copy
of the list/dicts before modifying them (e.g., new_list = [dict(v) for v in
obj.verification_details] or use copy.deepcopy), then replace
mapping["verification_details"] with that modified copy; keep the use of
model_from_cache(UserSpec, external_id=...) but apply it to the copied dicts so
obj.verification_details remains untouched.

In `@care/emr/resources/device/history_spec.py`:
- Around line 43-47: The current logic sets history["updated_by"] to
model_from_cache(UserSpec, id=user) when user is truthy, which can yield None
for deleted/missing users, but sets {} when user is falsy, causing inconsistent
API responses; change the assignment so that the result of
model_from_cache(UserSpec, id=user) is normalized to {} when it is falsy (e.g.,
history["updated_by"] = model_from_cache(UserSpec, id=user) or {}), ensuring
history["updated_by"] is always an object; update the block that populates
history in the function handling edit_history to use this normalized value.

In `@care/emr/resources/healthcare_service/spec.py`:
- Around line 71-81: The current serialization loop reintroduces an N+1 by
querying FacilityLocation objects without preloading related objects used by
FacilityLocationListSpec (notably current_encounter); change the queryset in the
serialization block that iterates over
FacilityLocation.objects.filter(id__in=obj.locations) to preload the related
fields (e.g., use select_related('current_encounter') or appropriate
prefetch_related for any FK/M2M accessed by FacilityLocationListSpec) so
FacilityLocationListSpec.serialize(...) does not trigger extra DB queries.

In `@care/emr/resources/scheduling/resource/spec.py`:
- Around line 15-16: The current code does a fresh DB lookup via
FacilityLocation.objects.get(id=obj.location_id) causing N+1 queries; change it
to use the already-loaded relation accessor obj.location (and fall back safely
if missing) and pass that instance into
FacilityLocationListSpec.serialize(...).to_json() so
select_related("resource__location") will be honored; update the code paths that
use this logic (the serializer function/method containing
FacilityLocation.objects.get and the call to FacilityLocationListSpec.serialize)
to rely on obj.location instead of querying by id.
🧹 Nitpick comments (8)
care/emr/resources/medication/dispense/dispense_order.py (1)

68-80: Duplicated logic between parent and child class.

The location fetching and serialization logic on lines 78-79 is identical to lines 63-64 in MedicationDispenseOrderReadSpec. Since MedicationDispenseOrderRetrieveSpec inherits from MedicationDispenseOrderReadSpec, consider calling super().perform_extra_serialization(cls, mapping, obj) and then overriding only what differs (the patient serialization and audit users).

This would reduce duplication and ensure any fix to the N+1 issue only needs to be applied once.

♻️ Suggested refactor
 class MedicationDispenseOrderRetrieveSpec(MedicationDispenseOrderReadSpec):
     patient: dict = {}
 
     created_by: dict = {}
     updated_by: dict = {}
 
     `@classmethod`
     def perform_extra_serialization(cls, mapping, obj):
+        super().perform_extra_serialization(mapping, obj)
         cls.serialize_audit_users(mapping, obj)
-        mapping["id"] = obj.external_id
-        location = FacilityLocation.objects.get(id=obj.location_id)
-        mapping["location"] = FacilityLocationListSpec.serialize(location).to_json()
         mapping["patient"] = PatientRetrieveSpec.serialize(obj.patient).to_json()
care/emr/resources/tag/config_spec.py (1)

177-194: Consider moving import to module level.

The model_from_cache import on line 179 is inside the method, while EMRResource from the same module is already imported at the top (line 14). Unless there's a circular import concern, moving this to the module-level imports would be more consistent with the rest of the codebase.

That said, the actual logic change using obj.facility_organization_id and obj.organization_id with model_from_cache is correct and resolves the N+1 nicely.

♻️ Optional: Move import to module level
-from care.emr.resources.base import EMRResource, cacheable, model_string
+from care.emr.resources.base import EMRResource, cacheable, model_from_cache, model_string

Then remove line 179.

care/emr/resources/service_request/spec.py (1)

169-172: Add select_related to avoid N+1 queries during locations serialization.

The FacilityLocation.objects.filter(id__in=obj.locations) query doesn't prefetch related fields. Since FacilityLocationListSpec.serialize() accesses both obj.current_encounter and obj.parent (via get_parent_json()), you'll trigger separate queries for each of the N locations.

♻️ Suggested optimization
         mapping["locations"] = [
             FacilityLocationListSpec.serialize(location).to_json()
-            for location in FacilityLocation.objects.filter(id__in=obj.locations)
+            for location in FacilityLocation.objects.filter(id__in=obj.locations).select_related("current_encounter", "parent")
         ]
care/emr/resources/file_upload/spec.py (1)

115-116: Consider using serialize_audit_users for consistency.

Other specs in this PR (e.g., ObservationReadSpec, ResourceRequestRetrieveSpec) delegate audit-field serialization to cls.serialize_audit_users(mapping, obj). This spec handles updated_by manually while the parent handles uploaded_by (which maps to created_by).

If the intent is to only show updated_by in the retrieve spec and uploaded_by in the list spec, the current approach is fine. Just noting the slight deviation from the pattern elsewhere.

care/emr/api/viewsets/location.py (1)

180-186: Consider filtering out None values from the results.

model_from_cache returns None when a model isn't found (with quiet=True by default). If an organization is deleted between fetching IDs and the cache lookup, the response could include None entries in the results array.

♻️ Suggested improvement
         data = [
             model_from_cache(FacilityOrganizationReadSpec, id=org_id)
             for org_id in location_organizations
+            if model_from_cache(FacilityOrganizationReadSpec, id=org_id) is not None
         ]

Or more efficiently:

-        data = [
-            model_from_cache(FacilityOrganizationReadSpec, id=org_id)
-            for org_id in location_organizations
-        ]
+        data = [
+            org_data
+            for org_id in location_organizations
+            if (org_data := model_from_cache(FacilityOrganizationReadSpec, id=org_id)) is not None
+        ]
care/emr/api/viewsets/patient.py (1)

284-288: Same None-filtering consideration as other list comprehensions.

If a user is deleted between the PatientUser query and the cache lookup, None could appear in results. This is probably an edge case, but worth noting for consistency with how you handle it elsewhere.

care/emr/resources/inventory/supply_request/request_order.py (1)

110-121: Use the ForeignKey relations directly to avoid unnecessary queries.
FacilityLocation.objects.get(...) makes extra trips to the database when obj.origin and obj.destination are already available. While you're at it, might as well grab them straight from the model instance.

♻️ Suggested tweak
-        if obj.origin_id:
-            origin = FacilityLocation.objects.get(id=obj.origin_id)
-            mapping["origin"] = FacilityLocationListSpec.serialize(origin).to_json()
-        if obj.destination_id:
-            destination = FacilityLocation.objects.get(id=obj.destination_id)
-            mapping["destination"] = FacilityLocationListSpec.serialize(
-                destination
-            ).to_json()
+        if obj.origin_id:
+            mapping["origin"] = FacilityLocationListSpec.serialize(obj.origin).to_json()
+        if obj.destination_id:
+            mapping["destination"] = FacilityLocationListSpec.serialize(
+                obj.destination
+            ).to_json()
care/emr/resources/organization/spec.py (1)

6-67: Avoid accidentally caching user-scoped retrieve specs.

@cacheable is inherited, so OrganizationRetrieveSpec becomes cacheable too. If anyone ever uses model_from_cache with it, permissions won’t be user-specific. It’s safer to explicitly disable caching on the retrieve spec.

♻️ Suggested safeguard
 class OrganizationRetrieveSpec(OrganizationReadSpec):
+    __cacheable__ = False
     permissions: list[str] = []

Comment thread care/emr/api/viewsets/encounter.py
Comment thread care/emr/resources/consent/spec.py
Comment thread care/emr/resources/device/history_spec.py
Comment thread care/emr/resources/healthcare_service/spec.py Outdated
Comment thread care/emr/resources/scheduling/resource/spec.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
care/emr/resources/user/spec.py (1)

203-205: Bug: Loop variable shadowed by outer parameter obj.

Line 204 still uses obj when it should use the loop variable from iterating over user_facilities. This serializes the same User object repeatedly instead of each facility.

🐛 Proposed fix
         mapping["facilities"] = [
-            FacilityBareMinimumSpec.serialize(obj).to_json() for obj in user_facilities
+            FacilityBareMinimumSpec.serialize(facility).to_json() for facility in user_facilities
         ]
🧹 Nitpick comments (1)
care/emr/resources/facility/spec.py (1)

103-110: Add updated_by field declaration for consistency with other ReadSpec implementations.

The serialize_audit_users method in base.py handles both created_by_id and updated_by_id, and if the Facility model has an updated_by_id, it will be included in the serialized output. However, FacilityReadSpec only declares created_by: dict = {} while nearly all other ReadSpec classes that call serialize_audit_users declare both fields. Adding updated_by: dict = {} aligns the implementation with the established pattern throughout the codebase.

Suggested field addition
     geo_organization: dict = {}
     created_by: dict = {}
+    updated_by: dict = {}
     invoice_number_expression: str | None = None

@praffq
Copy link
Copy Markdown
Contributor Author

praffq commented Jan 27, 2026

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 27, 2026

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@care/emr/resources/medication/dispense/dispense_order.py`:
- Around line 78-79: MedicationDispenseOrderRetrieveSpec duplicates per-object
DB lookups for location (FacilityLocation.objects.get(id=obj.location_id)),
causing an N+1; update the code to reuse the parent's serialization or read
obj.location directly and remove the separate query, and ensure the viewset that
provides these objects uses select_related("location") so obj.location is
pre-fetched; adjust MedicationDispenseOrderRetrieveSpec to call
FacilityLocationListSpec.serialize(obj.location).to_json() (or delegate to
MedicationDispenseOrderReadSpec’s serialization) instead of performing a new
FacilityLocation.objects.get.
- Around line 63-64: Add select_related('location', 'patient') to
DispenseOrderViewSet.get_queryset() so related objects are fetched in the same
query, then stop doing per-object lookups: replace the
FacilityLocation.objects.get(id=obj.location_id) usage and any similar manual
queries (lines using mapping["location"]) with direct access to obj.location and
serialize via FacilityLocationListSpec.serialize(obj.location).to_json(); other
code already accesses obj.patient directly so mirror that pattern. Ensure no
additional DB fetches remain for location in dispense_order.py.
🧹 Nitpick comments (1)
care/emr/resources/service_request/spec.py (1)

169-172: Consider adding select_related to avoid N+1 on current_encounter.

The refactor consolidates the locations query nicely, but FacilityLocationListSpec.perform_extra_serialization accesses obj.current_encounter (see care/emr/resources/location/spec.py lines 140-143). Without eager loading, this could trigger an additional query per location.

♻️ Suggested improvement
         mapping["locations"] = [
             FacilityLocationListSpec.serialize(location).to_json()
-            for location in FacilityLocation.objects.filter(id__in=obj.locations)
+            for location in FacilityLocation.objects.filter(id__in=obj.locations).select_related("current_encounter")
         ]

Based on learnings, optimize query performance using Django ORM's select_related for related object fetching.

Comment thread care/emr/resources/medication/dispense/dispense_order.py
Comment thread care/emr/resources/medication/dispense/dispense_order.py
@praffq
Copy link
Copy Markdown
Contributor Author

praffq commented Jan 28, 2026

resolved in #3496

@praffq praffq closed this Jan 28, 2026
@praffq praffq deleted the prafful/resolve-n+1-issues branch January 28, 2026 06:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

N+1 issues Adding model caching for User and tags for all the relavent specs

1 participant