Skip to content

jpom IEEE parser#182

Open
jpom wants to merge 3 commits into
adsabs:mainfrom
jpom:jpomieee
Open

jpom IEEE parser#182
jpom wants to merge 3 commits into
adsabs:mainfrom
jpom:jpomieee

Conversation

@jpom
Copy link
Copy Markdown
Member

@jpom jpom commented Dec 12, 2025

IEEE parser

@seasidesparrow seasidesparrow self-requested a review December 12, 2025 16:21
@jpom jpom changed the title Jpomieee jpom IEEE parser Dec 12, 2025
# Conference location
if self.publicationinfo.find("conflocation") is not None:
confloc = self.publicationinfo.find("conflocation").text
self.base_metadata["conf_location"] = confloc
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line will fail if self.publicationinfo.find("conflocation") is None -- the variable confloc will be undefined at L96.

Copy link
Copy Markdown
Member

@seasidesparrow seasidesparrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good first round. You need to make sure the project passes existing IEEE unit tests, and then add unit tests specifically for conference proceedings.

# self.base_metadata["collection"] = colls_uniq

# TO DO: append confDates & confLocation to %J
if confdate:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need an option to not output confloc if it is NoneType or ""

# Article sequence number
if articleinfo.find("articleseqnum"):
articleid = articleinfo.find("articleseqnum").get_text()
self.base_metadata["electronic_id"] = articleid
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, if there are firstPage, lastPage and articleseqnum, I'd like to field all three. Right now, the existing test cases have a firstPage and lastPage under pagination, so I'd like to add electronicID rather than using it exclusively.

Comment thread adsingestp/parsers/ieee.py Outdated
# Conference volume number
if self.volumeinfo:
self.base_metadata["volume"] = self.volumeinfo.find("volumenum").get_text()
self.base_metadata["issue"] = self.volumeinfo.find("issue").find("issuenum").get_text()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restore L50, because you're losing issue numbers when available.

else:
self.base_metadata["volume"] = ""

# Conferences don't have an issue number
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, the parser should be able to handle both conferences and journal articles without giving special instructions to the parser.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jun 2, 2026

Codecov Report

❌ Patch coverage is 0% with 366 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.36%. Comparing base (f1bfe6f) to head (4f51dc9).

Files with missing lines Patch % Lines
adsingestp/parsers/ieee.py 0.00% 207 Missing ⚠️
adsingestp/parsers/ieeeOrig.py 0.00% 159 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #182      +/-   ##
==========================================
- Coverage   91.05%   81.36%   -9.70%     
==========================================
  Files          25       27       +2     
  Lines        3073     3439     +366     
==========================================
  Hits         2798     2798              
- Misses        275      641     +366     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants