
feat(sync): add AWSync client for pushing events to aw-sync-server#105

Closed
TimeToBuildBob wants to merge 1 commit into ActivityWatch:master from TimeToBuildBob:feat/aw-sync-client

Conversation

@TimeToBuildBob (Contributor)

Summary

Adds aw_client/sync.py with AWSync — a client that incrementally pushes local ActivityWatch bucket events to a self-hosted aw-sync-server.

This is the client-side complement to the aw-sync-server proof-of-concept.

Design

  • Uses the existing ActivityWatchClient for the local AW instance
  • Talks to the sync server with requests + Authorization: Bearer <api_key> header
    (the sync server exposes the same bucket+events API as aw-server)
  • Persists a per-bucket high-water mark to ~/.config/activitywatch/aw-sync-state.json
    so re-runs only upload new events (incremental sync)
  • Bucket creation on the sync server is automatic (idempotent)
  • Per-bucket errors are caught and returned as -1 so one failure doesn't abort the whole sync
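
The state-file mechanics described above might look roughly like this (an illustrative sketch, not the PR's actual implementation; the `load_state`/`save_state` names are assumptions):

```python
import json
from pathlib import Path
from typing import Dict

STATE_FILE = Path.home() / ".config" / "activitywatch" / "aw-sync-state.json"

def load_state(path: Path = STATE_FILE) -> Dict[str, str]:
    """Return the per-bucket high-water marks ({bucket_id: ISO timestamp})."""
    try:
        return json.loads(path.read_text())
    except FileNotFoundError:
        return {}  # first run: no state yet, sync everything

def save_state(state: Dict[str, str], path: Path = STATE_FILE) -> None:
    """Persist the marks so the next run only uploads newer events."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2))
```

On each run, the stored timestamp for a bucket is passed as the `start=` argument when fetching local events, which is what makes the sync incremental.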

Usage

from aw_client.sync import AWSync

sync = AWSync(
    sync_url="http://localhost:5667",  # aw-sync-server base URL
    api_key="my-api-key",
)

# Sync all buckets
results = sync.sync()  # {"aw-watcher-window_host": 42, ...}

# Sync only window-activity buckets
results = sync.sync(bucket_filter="aw-watcher-window")

Files changed

  • aw_client/sync.py — new AWSync class (~130 lines)
  • aw_client/__init__.py — export AWSync
  • tests/test_sync.py — 9 unit tests (all passing)

Privacy note

This sync is intended for self-hosted use only. The aw-sync-server README includes a clear Privacy / Design Philosophy section stating it is not intended for third-party hosting of raw user data.

Closes / relates to: ActivityWatch/activitywatch#35

Commit message:

Adds `aw_client/sync.py` with `AWSync` — a lightweight client that
incrementally pushes local ActivityWatch bucket events to a self-hosted
aw-sync-server (https://github.com/TimeToBuildBob/aw-sync-server).

Key design:
- Uses `ActivityWatchClient` for the local AW instance
- Talks to the sync server with raw `requests` + Bearer token auth
- Persists a per-bucket high-water mark to `~/.config/activitywatch/aw-sync-state.json`
  so that re-runs only upload new events (incremental sync)
- Handles bucket creation on the sync server automatically
- Errors per bucket are caught and returned as -1 so one bad bucket
  doesn't abort the whole sync

9 unit tests cover: happy path, empty bucket skip, prefix filter,
state persistence, incremental since= arg, error handling, missing
state file, existing remote bucket not recreated, auth header present.
@TimeToBuildBob (Contributor, Author)

@greptileai review

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR introduces aw_client/sync.py with AWSync, a new client class that incrementally pushes local ActivityWatch bucket events to a self-hosted aw-sync-server. It integrates cleanly with the existing ActivityWatchClient, uses a JSON state file to track per-bucket high-water marks for incremental syncs, and is exported from the package __init__.py. Tests cover the core paths.

Key findings:

  • Duplicate-upload bug for zero-duration events (logic): The high-water mark is computed as max(e.timestamp + (e.duration or timedelta(0))). Because timedelta(0) is falsy in Python, a zero-duration or None-duration event produces latest = e.timestamp. On the next sync get_events(start=since) returns events with timestamp >= since, re-fetching (and re-uploading) that event.
  • _get_remote_buckets() called N times per sync (logic): The method is invoked inside sync_bucket(), so a single sync() call with N buckets issues N identical GET requests to the sync server. Remote bucket data should be fetched once in sync() and passed down.
  • Non-portable default state file path (style): _DEFAULT_STATE_FILE is hard-coded to ~/.config/activitywatch/…, which is not the conventional location on Windows. The existing codebase already uses aw_core.dirs.get_data_dir() for platform-aware paths and the same pattern should be followed here.
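
The zero-duration bug can be reproduced in isolation (a minimal sketch; plain dicts stand in for aw-core Event objects):

```python
from datetime import datetime, timedelta, timezone

# timedelta(0) is falsy, so `duration or timedelta(0)` collapses both the
# None and the zero-duration cases to a zero offset.
assert not timedelta(0)

ts = datetime(2026, 3, 11, 12, 0, tzinfo=timezone.utc)
events = [{"timestamp": ts, "duration": timedelta(0)}]

# High-water mark as computed in the PR: equals the event's own timestamp.
latest = max(e["timestamp"] + (e["duration"] or timedelta(0)) for e in events)
assert latest == ts

# A `timestamp >= since` query on the next run therefore re-fetches the event.
assert ts >= latest

# With the reviewer's one-microsecond advance, it no longer matches.
since = latest + timedelta(microseconds=1)
assert not (ts >= since)
```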

Confidence Score: 3/5

  • Safe to merge for an experimental/PoC feature, but the duplicate-upload bug and per-bucket redundant network calls should be addressed before production use.
  • The implementation is well-structured and has solid test coverage, but contains a confirmed logic bug (zero-duration events get uploaded twice due to the high-water mark equalling the event timestamp) and a performance issue (remote bucket list fetched N times instead of once per sync). The cross-platform state path is a secondary concern. None of these are security issues, but the duplicate-upload bug could silently corrupt data on the sync server.
  • aw_client/sync.py — specifically the high-water mark computation (lines 155–158) and the _get_remote_buckets() call inside sync_bucket() (line 148).

Important Files Changed

  • aw_client/sync.py — New AWSync class that pushes local AW events to a self-hosted sync server; has a duplicate-upload bug for zero-duration events and fetches remote buckets N times per sync call instead of once.
  • aw_client/__init__.py — Exports AWSync alongside ActivityWatchClient; change is minimal and correct.
  • tests/test_sync.py — 9 unit tests covering happy path, filtering, incremental sync, error handling, and auth; does not include a test for the zero-duration duplicate-upload edge case.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant AWSync
    participant LocalAW as ActivityWatchClient (local)
    participant StateFile as aw-sync-state.json
    participant SyncServer as aw-sync-server

    Caller->>AWSync: sync(bucket_filter?)
    AWSync->>LocalAW: get_buckets()
    LocalAW-->>AWSync: {bucket_id: info, ...}

    loop for each matching bucket
        AWSync->>StateFile: _load_state() / read since timestamp
        AWSync->>LocalAW: get_events(bucket_id, start=since)
        LocalAW-->>AWSync: [Event, ...]

        AWSync->>SyncServer: GET /api/0/buckets/ (Auth: Bearer)
        SyncServer-->>AWSync: {existing buckets}

        alt bucket not on server
            AWSync->>SyncServer: POST /api/0/buckets/{id} (create)
            SyncServer-->>AWSync: 201 Created
        end

        AWSync->>SyncServer: POST /api/0/buckets/{id}/events (upload)
        SyncServer-->>AWSync: 200 OK

        AWSync->>StateFile: _save_state() / write new high-water mark
    end

    AWSync-->>Caller: {bucket_id: events_uploaded, ...}

Last reviewed commit: 5ca7e48

Comment on lines +155 to +158:

    latest: datetime = max(
        e.timestamp + (e.duration or timedelta(0)) for e in events
    )
    self._state[bucket_id] = latest.isoformat()

Duplicate uploads for zero-duration events

The high-water mark is computed as e.timestamp + (e.duration or timedelta(0)). In Python, timedelta(0) is falsy, so both the None and the zero-duration cases collapse to a zero offset: when the latest event has a zero (or absent) duration, latest equals that event's timestamp.

On the next sync, get_events(bucket_id, start=since) fetches events with timestamp >= since. Since that zero-duration event's timestamp == since, it gets included in the response again and is re-uploaded, causing a duplicate on the server.

The fix is to advance the marker by one microsecond (the minimum representable tick) so that strictly-later events are fetched next time:

Suggested change

Before:

    latest: datetime = max(
        e.timestamp + (e.duration or timedelta(0)) for e in events
    )
    self._state[bucket_id] = latest.isoformat()

After:

    latest: datetime = max(
        e.timestamp + (e.duration or timedelta(0)) for e in events
    )
    # Add 1 µs so a zero-duration event is not re-fetched on the next sync
    from datetime import timedelta as _td
    latest += _td(microseconds=1)
    self._state[bucket_id] = latest.isoformat()
    self._save_state()

Or more cleanly, keep the import at module level and apply the offset:

latest: datetime = max(
    e.timestamp + (e.duration or timedelta(0)) for e in events
) + timedelta(microseconds=1)
self._state[bucket_id] = latest.isoformat()
self._save_state()

Comment on lines +148 to +150:

    remote_buckets = self._get_remote_buckets()
    if bucket_id not in remote_buckets:
        self._ensure_remote_bucket(bucket_id, event_type, hostname)

Remote bucket list fetched once per bucket, not once per sync

_get_remote_buckets() is called inside sync_bucket(), which is called for every bucket during a single sync() run. This means N identical GET requests hit the sync server for N local buckets — even though the list of remote buckets doesn't change between iterations within the same sync() call.

The fix is to fetch remote buckets once in sync() and pass the result down to sync_bucket():

# In sync():
remote_buckets = self._get_remote_buckets()
for bucket_id, info in buckets.items():
    ...
    count = self.sync_bucket(bucket_id, info, remote_buckets=remote_buckets)

# sync_bucket signature:
def sync_bucket(self, bucket_id: str, bucket_info: dict, remote_buckets: Optional[Dict[str, dict]] = None) -> int:
    ...
    if remote_buckets is None:
        remote_buckets = self._get_remote_buckets()
    if bucket_id not in remote_buckets:
        self._ensure_remote_bucket(bucket_id, event_type, hostname)

This also makes sync_bucket() still usable as a standalone method (with a fresh fetch) while being efficient when called from sync().


    logger = logging.getLogger(__name__)

    _DEFAULT_STATE_FILE = Path.home() / ".config" / "activitywatch" / "aw-sync-state.json"

Hardcoded ~/.config path is not cross-platform

Path.home() / ".config" / "activitywatch" is XDG-style and works on Linux/macOS, but on Windows it resolves to C:\Users\<user>\.config\activitywatch\, which is not the conventional location for config files there (normally %APPDATA%\activitywatch\).

The existing client.py already uses from aw_core.dirs import get_data_dir for a platform-aware data path. If aw_core.dirs exposes a get_config_dir() (or a state-directory equivalent), it should be used here to stay consistent with how ActivityWatch manages paths across platforms.

Suggested change

Before:

    _DEFAULT_STATE_FILE = Path.home() / ".config" / "activitywatch" / "aw-sync-state.json"

After:

    _DEFAULT_STATE_FILE = Path(get_data_dir("aw-client")) / "aw-sync-state.json"

(Adjust to the correct aw_core.dirs helper once confirmed; the important point is to avoid hardcoding ~/.config.)
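
For illustration only, a stdlib-only platform-aware fallback could look like this (a sketch; the PR should prefer whatever helper aw_core.dirs actually provides, and `default_state_file` is a hypothetical name):

```python
import os
from pathlib import Path

def default_state_file() -> Path:
    """Pick a conventional per-platform config location for the state file."""
    if os.name == "nt":
        # Windows: %APPDATA%\activitywatch\
        base = Path(os.environ.get("APPDATA", str(Path.home() / "AppData" / "Roaming")))
    else:
        # Linux/macOS: $XDG_CONFIG_HOME, falling back to ~/.config
        base = Path(os.environ.get("XDG_CONFIG_HOME", str(Path.home() / ".config")))
    return base / "activitywatch" / "aw-sync-state.json"
```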

@ErikBjare (Member)

No, we're not doing this "sync server" idea

@ErikBjare ErikBjare closed this Mar 11, 2026