Language declaration at the document level #27

@seasidesparrow

adsabs/ADSIngestParser#83 raised the question of whether to capture the language of the overall document in the fulltext.language value of Document.json. In JATS at least, the @xml:lang attribute isn't declared on the body tag itself, although it is declared on subelements such as caption.

If the metadata itself declares a language (for example in article-meta), we should try to capture that within Document.json, but I don't think we should populate the language key-value pair under fulltext with a language declaration that sits higher in the document hierarchy.
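
To make the distinction concrete, here is a minimal sketch of the behavior I have in mind. It is illustrative only and not taken from the parser: the function name `split_language_declarations` and the output keys `document_language` and `fulltext_language` are hypothetical, and it assumes the higher-level declaration lives on the root `article` element. The point is that the fulltext value is filled only when `body` itself carries `@xml:lang`, never by inheriting from an ancestor or borrowing from a subelement like `caption`.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def split_language_declarations(jats_xml: str) -> dict:
    """Report document-level and body-level language separately.

    Hypothetical helper: the key names below are placeholders,
    not the actual Document.json schema.
    """
    root = ET.fromstring(jats_xml)
    body = root.find("body")  # direct child of <article>, if present
    return {
        # Assumption: the document-level language is on the root element.
        "document_language": root.get(XML_LANG),
        # Only a declaration on <body> itself counts; no inheritance.
        "fulltext_language": body.get(XML_LANG) if body is not None else None,
    }


sample = """<article xml:lang="en">
  <front><article-meta/></front>
  <body><fig><caption xml:lang="fr"><p>Figure</p></caption></fig></body>
</article>"""

result = split_language_declarations(sample)
print(result)
```

With this sample, the document-level language is reported but the fulltext language stays empty, because neither the `article` declaration above `body` nor the `caption` declaration below it is treated as a declaration for the body as a whole.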
