feat: Use GUID to identify articles in RSS feeds#171

Merged
guyfedwards merged 1 commit into guyfedwards:master from larsks:feature/guid
Oct 18, 2025

@larsks larsks commented Oct 14, 2025

Use Link rather than title to correlate feed items with database items.
This lets us handle title changes correctly without creating
duplicate entries.

Closes #167
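
To illustrate the idea in the description, here is a minimal sketch of an upsert keyed on the link instead of the title. It uses an in-memory map in place of the real database, and the `Item`, `Store`, and `Upsert` names are assumptions made for this example, not nom's actual types.

```go
package main

// Item is a hypothetical, simplified feed item. The GUID is stored
// but is not used as the lookup key.
type Item struct {
	Title string
	Link  string
	GUID  string
}

// Store is an in-memory stand-in for the items table, keyed by link.
type Store struct {
	byLink map[string]Item
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{byLink: make(map[string]Item)}
}

// Upsert inserts the item if its link has not been seen before,
// otherwise it overwrites the stored row in place. It reports
// whether a new row was created.
func (s *Store) Upsert(it Item) bool {
	_, exists := s.byLink[it.Link]
	s.byLink[it.Link] = it
	return !exists
}

// Len returns the number of stored items.
func (s *Store) Len() int {
	return len(s.byLink)
}
```

With this keying, an item whose title changes between fetches updates the existing row, because its link is unchanged.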

larsks commented Oct 14, 2025

How modified titles are treated before this change:

[image: guid-before]

And how they work after this change:

[image: guid-after]

larsks commented Oct 14, 2025

The downside here is that, because we weren't previously recording the GUID, the first run with this change will create a bunch of duplicate entries.

// Index based so all new migrations must go at the end of the array
migrations := []string{
	`alter table items add favourite boolean not null default 0;`,
	`alter table items add guid text`,
Owner

I think we need an `update items set guid = link;` here to fix existing entries.

Owner

I think this will fix the issue of duplicate entries, as the first time nom loads with these changes it will migrate the existing entries.
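
The suggestion above could be sketched roughly as follows: the backfill goes at the end of the index-based list, and a runner applies only the statements past the number already applied. This is a sketch of the reviewer's proposal, not necessarily what was merged; the `Execer` interface, the `Migrate` function, and the `recorder` fake are hypothetical names introduced for the example.

```go
package main

// Execer is a hypothetical stand-in for whatever runs SQL statements.
type Execer interface {
	Exec(query string) error
}

// migrations returns the index-based list. New entries must go at the end.
func migrations() []string {
	return []string{
		`alter table items add favourite boolean not null default 0;`,
		`alter table items add guid text`,
		// Assumed backfill step, following the review suggestion.
		`update items set guid = link;`,
	}
}

// Migrate runs every migration whose index is at or beyond applied,
// in order, and returns how many migrations have been applied in total.
// On failure it returns the index of the statement that failed.
func Migrate(db Execer, applied int) (int, error) {
	ms := migrations()
	for i := applied; i < len(ms); i++ {
		if err := db.Exec(ms[i]); err != nil {
			return i, err
		}
	}
	return len(ms), nil
}

// recorder is a fake Execer that only remembers the statements it was given.
type recorder struct {
	ran []string
}

// Exec records the statement and always succeeds.
func (r *recorder) Exec(query string) error {
	r.ran = append(r.ran, query)
	return nil
}
```

A database that already has the first two migrations would run only the backfill on its next start.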

Contributor Author

@larsks larsks Oct 14, 2025

That will address some cases, but some blogs use e.g. UUIDs for the <guid> value, so we'll still see duplicates. The alternative would be to just use the <link> value instead of <guid>, which maybe is fine?
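
A small sketch of the case described above, assuming existing rows were backfilled with `guid = link` and a feed then reports a UUID-style `<guid>`. The `row` type, the helper functions, and the sample values are made up for illustration.

```go
package main

// row is a hypothetical stored item.
type row struct {
	GUID string
	Link string
}

// findBy returns the index of the first row whose key equals want,
// or -1 if no row matches.
func findBy(rows []row, key func(row) string, want string) int {
	for i, r := range rows {
		if key(r) == want {
			return i
		}
	}
	return -1
}

// byGUID and byLink select which column is used as the identifier.
func byGUID(r row) string { return r.GUID }
func byLink(r row) string { return r.Link }
```

For a stored row `{GUID: "https://example.com/post", Link: "https://example.com/post"}` and an incoming item with the same link but a UUID as its GUID, a lookup by GUID misses (so a duplicate would be inserted), while a lookup by link finds the existing row.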

Contributor Author

Using <link> works well; I just didn't go that route originally because, you know, there was a GUID. But using <link> seems better for compatibility. I'm going to update the PR.

larsks commented Oct 14, 2025

I've updated the PR to use the Link attribute instead of GUID as the unique identifier, but I've left in the code that gathers and stores the GUID from the feed.

@guyfedwards guyfedwards merged commit 89c1fed into guyfedwards:master Oct 18, 2025
2 checks passed
@guyfedwards
Owner

thanks @larsks

Development

Successfully merging this pull request may close these issues.

Use a unique identifier in UpsertItem to identify duplicate items