-
Notifications
You must be signed in to change notification settings - Fork 3
Fix document cache #126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix document cache #126
Conversation
WalkthroughCache::add and Cache::update (src/Migration/Cache.php) were changed for resources of TYPE_ROW or TYPE_DOCUMENT: cache entries now maintain per-status counters at $this->cache[$name][$status] (stored as strings) and add() increments that counter; update() only calls add() when status != SUCCESS instead of writing the status at a composite key. In Transfer::getStatusCounters (src/Migration/Transfer.php) the inner loop now uses the iteration key ($k) alongside the resource value, converting string-numeric resources to integers and updating counters and the pending decrement using the numeric index/value. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…isable-documents-cache # Conflicts: # src/Migration/Cache.php
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/Migration/Cache.php (1)
173-186:remove()is inconsistent with ROW/DOCUMENT storage (status string vs Resource) and will always throwFor
TYPE_ROW/TYPE_DOCUMENT:
add()stores$status(a string) at$this->cache[$name][$key].remove()first checksisset($this->cache[$name][$key]), then doesin_array($resource, $this->cache[$name]).Once the first check passes, the second will always fail for ROW/DOCUMENT because the array contains strings, not
Resourceinstances, so Line 181–183 will always throwResource does not exist in cacheeven when the entry exists.A simpler and consistent fix is to rely solely on the key‑existence check for all resource types:
public function remove(Resource $resource): void { $key = $this->resolveResourceCacheKey($resource); - if ($resource->getName() == Resource::TYPE_ROW || $resource->getName() == Resource::TYPE_DOCUMENT) { - if (! isset($this->cache[$resource->getName()][$key])) { - throw new \Exception('Resource does not exist in cache'); - } - } - if (! in_array($resource, $this->cache[$resource->getName()])) { - throw new \Exception('Resource does not exist in cache'); - } - - unset($this->cache[$resource->getName()][$key]); + if (! isset($this->cache[$resource->getName()][$key])) { + throw new \Exception('Resource does not exist in cache'); + } + + unset($this->cache[$resource->getName()][$key]); }This also aligns with the fact that ROW/DOCUMENT entries are now capped and may be absent entirely.
🧹 Nitpick comments (2)
src/Migration/Cache.php (2)
91-99: 10k per‑type cap silently drops new ROW/DOCUMENT status entriesWhen
count($this->cache[$resource->getName()] ?? []) >= 10000, newTYPE_ROW/TYPE_DOCUMENTresources are never cached, andupdate()will also effectively no‑op for them (becauseadd()returns early). That might be acceptable, but it changes behavior from “track every row/doc status” to “track first ~10k per type, ignore the rest” with no visibility.Consider either:
- Making the threshold a named constant / configuration and documenting the behavior, and/or
- Emitting some signal (log/metric) when skipping caching so it’s debuggable.
Also, this still counts status per document ID; if the long‑term goal is “statusCounters not per document ID”, this change alone doesn’t move to an aggregated counter model, so you may want to clarify intent in the PR or follow‑up work.
146-151:in_array($resource->getName(), $this->cache)is the wrong existence checkLine 146 checks
if (! in_array($resource->getName(), $this->cache)), but$this->cacheis keyed by resource name and its values are arrays of resources/status strings. This condition will never be true for a normal cache state and is semantically incorrect (and may emit notices depending on PHP version).If you want to add the resource only when its type is not yet present, this should instead check keys:
- if (! in_array($resource->getName(), $this->cache)) { - $this->add($resource); - } + if (! isset($this->cache[$resource->getName()])) { + $this->add($resource); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Migration/Cache.php(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/Migration/Cache.php (2)
src/Migration/Resource.php (1)
getName(150-150)src/Migration/Resources/Database/Row.php (1)
getName(76-79)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/Migration/Cache.php (2)
139-146: Critical: Incompatible with new add() logic.The
updatemethod checks for$this->cache[$resource->getName()][$key](line 140), but the modifiedaddmethod stores counters at$this->cache[$resource->getName()][$status]instead. This means:
- The
issetcheck at line 140 will always fail becauseadd()never creates[$key]entries- Line 144 assigns
$statusas a value to[$key], conflicting with the counter-based structure where$statusis the keyEither the counter-based approach should not support
updateoperations (since counters are aggregates, not individual resources), or this method needs to be rewritten to increment/decrement the appropriate status counters.
179-182: Critical: Incompatible with new add() logic.The
removemethod checksisset($this->cache[$resource->getName()][$key])(line 180), but the modifiedaddmethod stores counters at$this->cache[$resource->getName()][$status]instead. This check will always fail and throw an exception becauseadd()never creates[$key]entries for TYPE_ROW/TYPE_DOCUMENT resources.Given the new counter-based approach where resources are aggregated rather than tracked individually, the concept of "removing" a specific resource doesn't align with the design. Consider either:
- Disallowing
removeoperations for TYPE_ROW/TYPE_DOCUMENT resources- Decrementing the counter for the resource's status instead
src/Migration/Transfer.php (1)
175-186: Fix the logic to skip 'pending' status when processing cached resource counters.The code at lines 175-186 processes cached TYPE_ROW/TYPE_DOCUMENT resources where
$krepresents the status name and$resourceis the counter. However, pending-status resources are cached bySource.cache->addAll()before processing, causing$cache['row']['pending']to exist as a cache key. When the loop encounters$k = 'pending', line 179 overwrites the correct pending count (set frompreviousReportat lines 166-172), and lines 181-183 immediately decrement it back to 0.The 'pending' key should be skipped in this loop since pending resources are pre-counted from
previousReport, and only completed/processed resources (success, error, skipped, warning, processing) should decrement the pending counter.Add a condition to skip when
$k === 'pending':if ($k === 'pending') { continue; }
🧹 Nitpick comments (1)
src/Migration/Transfer.php (1)
177-177: Consider error handling for non-numeric strings.The code converts string counters to integers with
intval($resource). If a counter string is malformed or contains non-numeric characters,intvalwill return 0 or a partial number without warning, potentially causing incorrect status counts.Consider adding validation:
if (($resourceType === Resource::TYPE_ROW || $resourceType === Resource::TYPE_DOCUMENT) && is_string($resource)) { + if (!is_numeric($resource)) { + throw new \Exception("Invalid counter value in cache for {$resourceType}: {$resource}"); + } $resource = intval($resource);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/Migration/Cache.php(1 hunks)src/Migration/Transfer.php(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ItzNotABug
Repo: utopia-php/migration PR: 80
File: src/Migration/Sources/Supabase.php:300-308
Timestamp: 2025-06-28T09:47:58.757Z
Learning: In the utopia-php/migration codebase, during the terminology swap from Collection/Attribute/Document to Table/Column/Row, the user ItzNotABug prefers to keep the existing query logic unchanged even if it becomes semantically incorrect with the new naming. The focus is purely on resource type renaming, not on fixing logical issues that become apparent after the terminology change.
🧬 Code graph analysis (2)
src/Migration/Cache.php (1)
src/Migration/Resources/Database/Row.php (1)
getName(76-79)
src/Migration/Transfer.php (1)
src/Migration/Resource.php (1)
Resource(5-278)
We need to find a better way to count statusCounters , not per document ID
Summary by CodeRabbit
Bug Fixes
Chores
✏️ Tip: You can customize this high-level summary in your review settings.