Open
37 commits
3bc3283
Added routes for main version control actions
be-smith Sep 23, 2025
28ad426
Added a robust get version number helper to make sure users can't try…
be-smith Oct 9, 2025
978ed05
Changed to refcode rather than item id
be-smith Oct 10, 2025
ccf208e
Add deepdiff dependency for nested structure comparison
be-smith Oct 10, 2025
c24b5e6
Improve compare_versions to handle nested structures with DeepDiff
be-smith Oct 10, 2025
72e000a
Add data validation and field protection to restore_version
be-smith Oct 10, 2025
e8d1f67
Add action field for version history audit trail
be-smith Oct 10, 2025
e3adc63
Fix transaction safety in save_item by reversing operation order
be-smith Oct 10, 2025
d7e66bc
Add version field to HasRevisionControl trait and fix auto-increment
be-smith Oct 12, 2025
c6229c9
Add comprehensive tests for version control endpoints
be-smith Oct 12, 2025
f2d965f
Fix version control restore audit trail
be-smith Oct 13, 2025
5f294c7
Add version control UI to webapp
be-smith Oct 13, 2025
1c8c79d
Add comprehensive action field tests for version control
be-smith Oct 13, 2025
a767f64
update uv lock
be-smith Oct 13, 2025
86e1ca1
Ensures json is serializable for difference displaying
be-smith Oct 15, 2025
01a7d9d
Renamed old data to data to better reflect that the current snapshot …
be-smith Nov 4, 2025
8c1b43c
Ensured an initial version is created upon sample creation.
be-smith Nov 4, 2025
78934e6
Hiding UI for now
be-smith Nov 5, 2025
144cba2
Added indexes to mongodb for efficient lookup
be-smith Nov 5, 2025
f5e8d21
Changed how user info is stored, now has the object which is a snapsh…
be-smith Nov 5, 2025
41db7e9
Adds tests to ensure we are populating a user_id field in the version…
be-smith Nov 5, 2025
1197376
Fixed software versioning
be-smith Nov 5, 2025
0be43e6
Added comprehensive sample lifecycle test for creating, modifying and…
be-smith Nov 5, 2025
6dd63c4
Ensured correct software_version is added to version data
be-smith Nov 5, 2025
65a7476
Added pydantic models
be-smith Nov 6, 2025
3fc6ff4
Updated how version number is found
be-smith Nov 11, 2025
3085336
Naming changes of version -> version_number, software_version -> data…
be-smith Nov 11, 2025
9ca38db
Removed user snapshot related models. Updated everything to use versi…
be-smith Nov 11, 2025
d498f85
Added validation to item end points
be-smith Nov 11, 2025
9f1de5a
Added proper use of enum for version actions to pydantic models and t…
be-smith Nov 11, 2025
7acf79b
Moved protected fields to its own helper function
be-smith Nov 11, 2025
51aa0d4
Changed restored_from_version field to ObjectID not str
be-smith Nov 11, 2025
2dec696
Tidied up tests to remove "user" field and updated the field name cha…
be-smith Nov 11, 2025
cbbc9af
Removed user from list versions route
be-smith Nov 11, 2025
8a0fef3
Added pydantic validation to _get_version_number, compare_versions, r…
be-smith Nov 14, 2025
c7bd6ba
Fixed version number call for the UI component after the name change
be-smith Dec 15, 2025
ecfce07
Fixed how software version is retrieved in save version snapshot
be-smith Dec 17, 2025
1 change: 1 addition & 0 deletions pydatalab/pyproject.toml
@@ -27,6 +27,7 @@ dependencies = [
"pint ~= 0.24",
"pandas[excel] ~= 2.2",
"pymongo ~= 4.7",
"deepdiff ~= 8.1",
]

[project.urls]
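The `deepdiff` dependency is added for comparing nested item snapshots (see the commit "Improve compare_versions to handle nested structures with DeepDiff"). The PR's `compare_versions` is not shown in this excerpt, so the following is only a stdlib sketch of the kind of path-keyed nested comparison involved. The function `nested_diff` and the sample data are hypothetical, and the output format does not reproduce DeepDiff's API.

```python
from typing import Any


def nested_diff(old: Any, new: Any, path: str = "root") -> dict[str, dict[str, Any]]:
    """Recursively compare two JSON-like values and report changes keyed by path."""
    changes: dict[str, dict[str, Any]] = {}
    if isinstance(old, dict) and isinstance(new, dict):
        # Walk the union of keys so that additions and removals are both seen.
        for key in sorted(set(old) | set(new), key=str):
            child = f"{path}[{key!r}]"
            if key not in new:
                changes[child] = {"removed": old[key]}
            elif key not in old:
                changes[child] = {"added": new[key]}
            else:
                changes.update(nested_diff(old[key], new[key], child))
    elif isinstance(old, list) and isinstance(new, list):
        # Lists are compared position by position (order-sensitive).
        for i in range(max(len(old), len(new))):
            child = f"{path}[{i}]"
            if i >= len(new):
                changes[child] = {"removed": old[i]}
            elif i >= len(old):
                changes[child] = {"added": new[i]}
            else:
                changes.update(nested_diff(old[i], new[i], child))
    elif old != new:
        changes[path] = {"old": old, "new": new}
    return changes


# Hypothetical snapshots of the same item at two versions.
v1 = {"name": "cell A", "synthesis": {"steps": ["mix", "dry"], "temp_C": 80}}
v2 = {
    "name": "cell A",
    "synthesis": {"steps": ["mix", "dry", "press"], "temp_C": 90},
    "notes": "redo",
}
diff = nested_diff(v1, v2)
```

A flat, top-level comparison would only report that `synthesis` changed; descending into the structure pinpoints the individual fields, which is presumably why a dedicated library was chosen.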
10 changes: 10 additions & 0 deletions pydatalab/schemas/cell.json
@@ -36,6 +36,11 @@
"title": "Revisions",
"type": "object"
},
"version": {
"title": "Version",
"default": 1,
"type": "integer"
},
"creator_ids": {
"title": "Creator Ids",
"default": [],
@@ -622,6 +627,11 @@
"title": "Revisions",
"type": "object"
},
"version": {
"title": "Version",
"default": 1,
"type": "integer"
},
"creator_ids": {
"title": "Creator Ids",
"default": [],
10 changes: 10 additions & 0 deletions pydatalab/schemas/equipment.json
@@ -36,6 +36,11 @@
"title": "Revisions",
"type": "object"
},
"version": {
"title": "Version",
"default": 1,
"type": "integer"
},
"creator_ids": {
"title": "Creator Ids",
"default": [],
@@ -586,6 +591,11 @@
"title": "Revisions",
"type": "object"
},
"version": {
"title": "Version",
"default": 1,
"type": "integer"
},
"creator_ids": {
"title": "Creator Ids",
"default": [],
10 changes: 10 additions & 0 deletions pydatalab/schemas/sample.json
@@ -48,6 +48,11 @@
"title": "Revisions",
"type": "object"
},
"version": {
"title": "Version",
"default": 1,
"type": "integer"
},
"creator_ids": {
"title": "Creator Ids",
"default": [],
@@ -675,6 +680,11 @@
"title": "Revisions",
"type": "object"
},
"version": {
"title": "Version",
"default": 1,
"type": "integer"
},
"creator_ids": {
"title": "Creator Ids",
"default": [],
10 changes: 10 additions & 0 deletions pydatalab/schemas/startingmaterial.json
@@ -48,6 +48,11 @@
"title": "Revisions",
"type": "object"
},
"version": {
"title": "Version",
"default": 1,
"type": "integer"
},
"creator_ids": {
"title": "Creator Ids",
"default": [],
@@ -728,6 +733,11 @@
"title": "Revisions",
"type": "object"
},
"version": {
"title": "Version",
"default": 1,
"type": "integer"
},
"creator_ids": {
"title": "Creator Ids",
"default": [],
13 changes: 13 additions & 0 deletions pydatalab/src/pydatalab/models/__init__.py
@@ -7,6 +7,13 @@
from pydatalab.models.people import Person
from pydatalab.models.samples import Sample
from pydatalab.models.starting_materials import StartingMaterial
from pydatalab.models.versions import (
    CompareVersionsQuery,
    ItemVersion,
    RestoreVersionRequest,
    VersionAction,
    VersionCounter,
)
Comment on lines +10 to +16
Member

No need to export any of these at the top level


ITEM_MODELS: dict[str, type[BaseModel]] = {
    "samples": Sample,
@@ -24,4 +31,10 @@
    "Collection",
    "Equipment",
    "ITEM_MODELS",
    "ItemVersion",
    "VersionCounter",
    "UserSnapshot",
    "VersionAction",
    "RestoreVersionRequest",
    "CompareVersionsQuery",
)
3 changes: 3 additions & 0 deletions pydatalab/src/pydatalab/models/traits.py
@@ -22,6 +22,9 @@ class HasRevisionControl(BaseModel):
    revisions: dict[int, Any] | None = None
    """An optional mapping from old revision numbers to the model state at that revision."""

    version: int = 1
    """The version number used by the version control system for tracking snapshots."""
Comment on lines +25 to +26
Member

This is the same as revision, no? I would choose one and delete the other.



class HasBlocks(BaseModel):
    blocks_obj: dict[str, DataBlockResponse] = Field({})
117 changes: 117 additions & 0 deletions pydatalab/src/pydatalab/models/versions.py
@@ -0,0 +1,117 @@
"""Pydantic models for version control system."""

from datetime import datetime
from enum import Enum

from pydantic import BaseModel, Field, validator

from pydatalab.models.utils import PyObjectId, Refcode


class VersionAction(str, Enum):
    """Valid actions that can create a version snapshot."""

    CREATED = "created"
    MANUAL_SAVE = "manual_save"
    AUTO_SAVE = "auto_save"
    RESTORED = "restored"


class ItemVersion(BaseModel):
    """A complete snapshot of an item at a specific point in time.
    This model represents a version entry in the `item_versions` collection.
    Each version captures the complete state of an item, allowing users to
    view history and restore previous states.
    """

    refcode: Refcode = Field(..., description="The refcode of the item this version belongs to")
    version: int = Field(..., ge=1, description="Sequential version number (1-indexed)")
    timestamp: datetime = Field(
        ..., description="When this version was created (ISO format with timezone)"
    )
    action: VersionAction = Field(
        ...,
        description="The action that triggered this version: 'created' (item creation), "
        "'manual_save' (user save), 'auto_save' (system save), or 'restored' (version restore)",
    )
    user_id: PyObjectId | None = Field(
        None, description="User's ObjectId for efficient querying and indexing"
Member

Which user? This can be multivalued, right? I don't think it's any less efficient to query on nested fields like data.creator_ids or whatever

    )
    datalab_version: str = Field(
        ..., description="Version of datalab-server that created this snapshot"
    )
    data: dict = Field(..., description="Complete snapshot of the item data at this version")
    restored_from_version: PyObjectId | None = Field(
        None,
        description="ObjectId of the version that was restored from (only present if action='restored')",
    )

    @validator("restored_from_version")
    def validate_restored_from_version(cls, v, values):
        """Ensure restored_from_version is only present when action='restored'."""
        action = values.get("action")
        if action == VersionAction.RESTORED and v is None:
            raise ValueError("restored_from_version must be provided when action='restored'")
        if action != VersionAction.RESTORED and v is not None:
            raise ValueError(
                f"restored_from_version should only be present when action='restored', got action='{action}'"
            )
        return v


class VersionCounter(BaseModel):
    """Atomic counter for tracking version numbers per item.
    This model represents a document in the `version_counters` collection.
    It ensures atomic increment of version numbers to prevent race conditions.
    """

    refcode: Refcode = Field(..., description="The refcode this counter belongs to")
    counter: int = Field(
        1, ge=1, description="Current version counter value (1-indexed, matches version numbers)"
    )

    class Config:
        extra = "ignore"  # Allow MongoDB's _id field and other internal fields


class RestoreVersionRequest(BaseModel):
    """Request body for restoring a version."""

    version_id: str = Field(..., description="ObjectId string of the version to restore to")

    @validator("version_id")
    def validate_version_id_format(cls, v):
        """Validate that version_id is a valid ObjectId string."""
        try:
            from bson import ObjectId

            ObjectId(v)
        except Exception as e:
            raise ValueError(f"version_id must be a valid ObjectId string: {e}")
        return v

    class Config:
        extra = "forbid"


class CompareVersionsQuery(BaseModel):
    """Query parameters for comparing two versions."""

    v1: str = Field(..., description="ObjectId string of the first version")
    v2: str = Field(..., description="ObjectId string of the second version")

    @validator("v1", "v2")
    def validate_version_ids(cls, v):
        """Validate that version IDs are valid ObjectId strings."""
        try:
            from bson import ObjectId

            ObjectId(v)
        except Exception as e:
            raise ValueError(f"Version ID must be a valid ObjectId string: {e}")
        return v

    class Config:
        extra = "forbid"
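The `ItemVersion` model above couples two fields: `restored_from_version` must be set exactly when `action` is `restored`. The rule is sketched below in plain Python, alongside what the `str`-mixin enum provides (string comparison and direct JSON serialisation). `check_restore_link` and `is_consistent` are hypothetical helpers, not part of the PR. One thing that may be worth checking in the PR itself: with pydantic v1-style validators, a validator on an optional field is skipped when the field is omitted unless `always=True` is passed, so the "must be provided" branch might not fire for omitted values.

```python
import json
from enum import Enum
from typing import Optional


class VersionAction(str, Enum):
    """Mirror of the enum in the diff, repeated so this sketch is self-contained."""

    CREATED = "created"
    MANUAL_SAVE = "manual_save"
    AUTO_SAVE = "auto_save"
    RESTORED = "restored"


def check_restore_link(action: VersionAction, restored_from_version: Optional[str]) -> None:
    """Hypothetical helper: raise ValueError unless the two fields agree."""
    if action is VersionAction.RESTORED and restored_from_version is None:
        raise ValueError("restored_from_version must be provided when action='restored'")
    if action is not VersionAction.RESTORED and restored_from_version is not None:
        raise ValueError("restored_from_version should only be present when action='restored'")


def is_consistent(action: str, restored_from_version: Optional[str]) -> bool:
    """Return True if the raw action string is valid and the pair satisfies the rule."""
    try:
        # VersionAction(...) itself raises ValueError for unknown action strings.
        check_restore_link(VersionAction(action), restored_from_version)
    except ValueError:
        return False
    return True


# Because the enum subclasses str, members serialise as their plain string value.
serialized = json.dumps({"action": VersionAction.MANUAL_SAVE})
```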
Comment on lines +63 to +117
Member

Fine in principle -- we should have pydantic models for requests, but we don't yet -- let's remember to move this somewhere better later on
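Both request models repeat the same ObjectId check in their validators. If these models are moved later, one option would be a single shared helper that both validators call. The sketch below assumes a stdlib regex as a stand-in for `bson.ObjectId(v)`; it only checks the 24-character hexadecimal string form, and `ensure_object_id_str` is a hypothetical name.

```python
import re

# An ObjectId rendered as a string is 24 hexadecimal characters.
_OBJECT_ID_HEX = re.compile(r"[0-9a-fA-F]{24}")


def ensure_object_id_str(value: object, field_name: str = "version_id") -> str:
    """Return the value unchanged if it looks like an ObjectId string, else raise."""
    if not isinstance(value, str) or _OBJECT_ID_HEX.fullmatch(value) is None:
        raise ValueError(f"{field_name} must be a 24-character hexadecimal ObjectId string")
    return value


def looks_like_object_id(value: object) -> bool:
    """Boolean convenience wrapper around ensure_object_id_str."""
    try:
        ensure_object_id_str(value)
    except ValueError:
        return False
    return True
```

Each validator body would then reduce to a single call such as `return ensure_object_id_str(v, "v1")`, keeping the error message consistent across endpoints.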

17 changes: 17 additions & 0 deletions pydatalab/src/pydatalab/mongo.py
@@ -116,6 +116,11 @@ def create_default_indices(
    - An index over item type,
    - A unique index over `item_id` and `refcode`.
    - A text index over user names and identities.
    - Version control indexes:
        - Index on item_versions.refcode for fast version history lookup
        - Index on item_versions.user_id for fast user contribution queries
        - Compound index on (refcode, version) for sorted version history
        - Unique index on version_counters.refcode for atomic version numbering

    Parameters:
        background: If true, indexes will be created as background jobs.
@@ -238,4 +243,16 @@ def create_group_fts():
db.users.drop_index(group_fts_name)
ret += create_group_fts()

    # Version control indexes
    ret += db.item_versions.create_index("refcode", name="version refcode", background=background)
    ret += db.item_versions.create_index("user_id", name="version user_id", background=background)
Member

See comment above about user IDs -- need to make sure we can handle multiple creators here, but I'm also not sure why we need fast querying by user (surely we always know the item ID when doing this)

    ret += db.item_versions.create_index(
        [("refcode", pymongo.ASCENDING), ("version", pymongo.DESCENDING)],
        name="refcode and version",
        background=background,
    )
    ret += db.version_counters.create_index(
        "refcode", unique=True, name="unique refcode counter", background=background
    )

    return ret
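The unique index on `version_counters.refcode` supports the "atomic version numbering" mentioned in the docstring: one counter document per refcode, incremented and read back in a single step. The PR's helper is not shown in this excerpt. With pymongo this would typically be a single `find_one_and_update` using `$inc` with `upsert=True`, but that is an assumption. The sketch below is only an in-memory analogue of increment-and-return under concurrency, using a lock in place of the database's atomic update. The class name and the refcodes are hypothetical.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class InMemoryVersionCounters:
    """In-memory stand-in for a per-refcode counter collection."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counters: dict[str, int] = defaultdict(int)

    def next_version(self, refcode: str) -> int:
        """Increment and return the counter as one indivisible step."""
        with self._lock:
            self._counters[refcode] += 1
            return self._counters[refcode]


counters = InMemoryVersionCounters()

# Simulate many concurrent saves of the same item.
with ThreadPoolExecutor(max_workers=8) as pool:
    issued = list(pool.map(lambda _: counters.next_version("demo:ABC123"), range(200)))
```

If the read and the increment were separate steps, two concurrent saves could be handed the same version number; doing both under one lock (or in one database operation) means every number is issued exactly once.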