Skip to content

Conversation

@fogelito
Copy link
Contributor

@fogelito fogelito commented Dec 4, 2025

We need to find a better way to count statusCounters , not per document ID

Summary by CodeRabbit

  • Bug Fixes

    • Fixed counting and caching so row/document items now maintain per-status counters and only increment when not successful; counters stored consistently.
    • Corrected pending/countdown logic so reductions use the actual numeric amount for those resource types.
  • Chores

    • Adjusted internal indexing to keep counters and pending totals synchronized.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 4, 2025

Walkthrough

Cache::add and Cache::update (src/Migration/Cache.php) were changed for resources of TYPE_ROW or TYPE_DOCUMENT: cache entries now maintain per-status counters at $this->cache[$name][$status] (stored as strings) and add() increments that counter; update() only calls add() when status != SUCCESS instead of writing the status at a composite key. In Transfer::getStatusCounters (src/Migration/Transfer.php) the inner loop now uses the iteration key ($k) alongside the resource value, converting string-numeric resources to integers and updating counters and the pending decrement using the numeric index/value.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Review focus:
    • Confirm cache shape change (composite key → per-status map) and stringified counter values in Cache.
    • Verify Cache::update’s changed conditional (skip on SUCCESS) matches intended semantics.
    • Ensure Transfer::getStatusCounters’ use of $k and numeric conversion aligns with new cache layout and correctly adjusts pending counts.
    • Check behavior with non-numeric resource strings and mixed-type resource lists for off-by-one or type-cast issues.
    • Search codebase for consumers that expect the old composite-key entries and adjust tests accordingly.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'Fix document cache' is vague and does not clearly convey the actual changes. The PR objectives indicate the intent is to 'Disable document cache' and find a better way to count statusCounters, but the title suggests a simple fix rather than the architectural change being made. Consider a more descriptive title such as 'Refactor cache counting to track by status instead of document ID' or 'Count status counters per status instead of per document ID' to better reflect the actual changes.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch disable-documents-cache

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e3f256 and 36be9d3.

📒 Files selected for processing (1)
  • src/Migration/Cache.php (1 hunks)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/Migration/Cache.php (1)

173-186: remove() is inconsistent with ROW/DOCUMENT storage (status string vs Resource) and will always throw

For TYPE_ROW / TYPE_DOCUMENT:

  • add() stores $status (a string) at $this->cache[$name][$key].
  • remove() first checks isset($this->cache[$name][$key]), then does in_array($resource, $this->cache[$name]).

Once the first check passes, the second will always fail for ROW/DOCUMENT because the array contains strings, not Resource instances, so Line 181–183 will always throw Resource does not exist in cache even when the entry exists.

A simpler and consistent fix is to rely solely on the key‑existence check for all resource types:

     public function remove(Resource $resource): void
     {
         $key = $this->resolveResourceCacheKey($resource);
-        if ($resource->getName() == Resource::TYPE_ROW || $resource->getName() == Resource::TYPE_DOCUMENT) {
-            if (! isset($this->cache[$resource->getName()][$key])) {
-                throw new \Exception('Resource does not exist in cache');
-            }
-        }
-        if (! in_array($resource, $this->cache[$resource->getName()])) {
-            throw new \Exception('Resource does not exist in cache');
-        }
-
-        unset($this->cache[$resource->getName()][$key]);
+        if (! isset($this->cache[$resource->getName()][$key])) {
+            throw new \Exception('Resource does not exist in cache');
+        }
+
+        unset($this->cache[$resource->getName()][$key]);
     }

This also aligns with the fact that ROW/DOCUMENT entries are now capped and may be absent entirely.

🧹 Nitpick comments (2)
src/Migration/Cache.php (2)

91-99: 10k per‑type cap silently drops new ROW/DOCUMENT status entries

When count($this->cache[$resource->getName()] ?? []) >= 10000, new TYPE_ROW / TYPE_DOCUMENT resources are never cached, and update() will also effectively no‑op for them (because add() returns early). That might be acceptable, but it changes behavior from “track every row/doc status” to “track first ~10k per type, ignore the rest” with no visibility.

Consider either:

  • Making the threshold a named constant / configuration and documenting the behavior, and/or
  • Emitting some signal (log/metric) when skipping caching so it’s debuggable.

Also, this still counts status per document ID; if the long‑term goal is “statusCounters not per document ID”, this change alone doesn’t move to an aggregated counter model, so you may want to clarify intent in the PR or follow‑up work.


146-151: in_array($resource->getName(), $this->cache) is the wrong existence check

Line 146 checks if (! in_array($resource->getName(), $this->cache)), but $this->cache is keyed by resource name and its values are arrays of resources/status strings. This condition will never be true for a normal cache state and is semantically incorrect (and may emit notices depending on PHP version).

If you want to add the resource only when its type is not yet present, this should instead check keys:

-        if (! in_array($resource->getName(), $this->cache)) {
-            $this->add($resource);
-        }
+        if (! isset($this->cache[$resource->getName()])) {
+            $this->add($resource);
+        }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 17a7144 and d6126f2.

📒 Files selected for processing (1)
  • src/Migration/Cache.php (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/Migration/Cache.php (2)
src/Migration/Resource.php (1)
  • getName (150-150)
src/Migration/Resources/Database/Row.php (1)
  • getName (76-79)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/Migration/Cache.php (2)

139-146: Critical: Incompatible with new add() logic.

The update method checks for $this->cache[$resource->getName()][$key] (line 140), but the modified add method stores counters at $this->cache[$resource->getName()][$status] instead. This means:

  1. The isset check at line 140 will always fail because add() never creates [$key] entries
  2. Line 144 assigns $status as a value to [$key], conflicting with the counter-based structure where $status is the key

Either the counter-based approach should not support update operations (since counters are aggregates, not individual resources), or this method needs to be rewritten to increment/decrement the appropriate status counters.


179-182: Critical: Incompatible with new add() logic.

The remove method checks isset($this->cache[$resource->getName()][$key]) (line 180), but the modified add method stores counters at $this->cache[$resource->getName()][$status] instead. This check will always fail and throw an exception because add() never creates [$key] entries for TYPE_ROW/TYPE_DOCUMENT resources.

Given the new counter-based approach where resources are aggregated rather than tracked individually, the concept of "removing" a specific resource doesn't align with the design. Consider either:

  1. Disallowing remove operations for TYPE_ROW/TYPE_DOCUMENT resources
  2. Decrementing the counter for the resource's status instead
src/Migration/Transfer.php (1)

175-186: Fix the logic to skip 'pending' status when processing cached resource counters.

The code at lines 175-186 processes cached TYPE_ROW/TYPE_DOCUMENT resources where $k represents the status name and $resource is the counter. However, pending-status resources are cached by Source.cache->addAll() before processing, causing $cache['row']['pending'] to exist as a cache key. When the loop encounters $k = 'pending', line 179 overwrites the correct pending count (set from previousReport at lines 166-172), and lines 181-183 immediately decrement it back to 0.

The 'pending' key should be skipped in this loop since pending resources are pre-counted from previousReport, and only completed/processed resources (success, error, skipped, warning, processing) should decrement the pending counter.

Add a condition to skip when $k === 'pending':

if ($k === 'pending') {
    continue;
}
🧹 Nitpick comments (1)
src/Migration/Transfer.php (1)

177-177: Consider error handling for non-numeric strings.

The code converts string counters to integers with intval($resource). If a counter string is malformed or contains non-numeric characters, intval will return 0 or a partial number without warning, potentially causing incorrect status counts.

Consider adding validation:

 if (($resourceType === Resource::TYPE_ROW || $resourceType === Resource::TYPE_DOCUMENT) && is_string($resource)) {
+    if (!is_numeric($resource)) {
+        throw new \Exception("Invalid counter value in cache for {$resourceType}: {$resource}");
+    }
     $resource = intval($resource);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71e867c and cf8797e.

📒 Files selected for processing (2)
  • src/Migration/Cache.php (1 hunks)
  • src/Migration/Transfer.php (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ItzNotABug
Repo: utopia-php/migration PR: 80
File: src/Migration/Sources/Supabase.php:300-308
Timestamp: 2025-06-28T09:47:58.757Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the user ItzNotABug prefers to keep the existing query logic unchanged even if it becomes semantically incorrect with the new naming. The focus is purely on resource type renaming, not on fixing logical issues that become apparent after the terminology change.
🧬 Code graph analysis (2)
src/Migration/Cache.php (1)
src/Migration/Resources/Database/Row.php (1)
  • getName (76-79)
src/Migration/Transfer.php (1)
src/Migration/Resource.php (1)
  • Resource (5-278)

@fogelito fogelito changed the title Disable document cache Fix document cache Dec 4, 2025
@abnegate abnegate merged commit 409983b into main Dec 5, 2025
4 checks passed
@abnegate abnegate deleted the disable-documents-cache branch December 5, 2025 05:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants