Metadata in collections that needs to be standardized #8

@kwalcock

These are the metadata fields that UA has at times included with each collection. Except for the last collection in the list (Doc17k), each set is a subset of what was found in the PDFs (via pdfinfo) plus what may have been provided in a spreadsheet, possibly with some renaming to align the PDF metadata with the data in the Doc17k collection.

Doc10
Set(personalAuthor_en, corporateAuthor_en, Famine Early Warning Systems Network (FEWS NET), title, Keywords, creation date)

Doc52
Set(publicationDate, personalAuthor_en, series_en, corporateAuthor_en, title, Keywords, creation date, vroy@fews.net)

Doc350
Set(title, publisherName, creation date)

Doc500
Set(title, publisherName, creation date)
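
To see how far the four PDF-derived collections above already agree, one option is to intersect their field sets and list what each collection has beyond the shared core. The following is only a sketch: the field names are copied from the sets above, while the helper names `shared_fields` and `extra_fields` are made up for illustration.

```python
# Field names copied verbatim from the four PDF-derived sets above.
collections = {
    "Doc10": {
        "personalAuthor_en", "corporateAuthor_en",
        "Famine Early Warning Systems Network (FEWS NET)",
        "title", "Keywords", "creation date",
    },
    "Doc52": {
        "publicationDate", "personalAuthor_en", "series_en",
        "corporateAuthor_en", "title", "Keywords", "creation date",
        "vroy@fews.net",
    },
    "Doc350": {"title", "publisherName", "creation date"},
    "Doc500": {"title", "publisherName", "creation date"},
}


def shared_fields(sets):
    """Return the fields present in every collection."""
    return set.intersection(*sets.values())


def extra_fields(sets):
    """Return, per collection, the sorted fields outside the shared core."""
    common = shared_fields(sets)
    return {name: sorted(fields - common) for name, fields in sets.items()}


common = shared_fields(collections)
extras = extra_fields(collections)

print("shared:", sorted(common))
for name, fields in extras.items():
    print(name, "extra:", fields)
```

Entries such as `Famine Early Warning Systems Network (FEWS NET)` and `vroy@fews.net` show up among the extras; they look more like values than field names and may deserve a separate check.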

Doc17k - these come not from PDFs but from a database
Set(accessRights, DraftPages, localizedTranslationURL_ms, _dlc_DocIdItemGuid, AG, TitleNumbering, agrovoc_es, Author, defaultTranslationURL_cs, seriesName_ru, localizedTranslationURL_cs, division_es, series_id, defaultTranslationURL_ne, defaultTranslationURL_mg, gsaentity_google_type, thumb100, localizedTranslationURL_mg, region_zh, personalAuthor_es, Minor Version, RESPONSE_SENDER_NAME, localizedTranslationURL_de, session, GTS_PDFXConformance, ShareDoc, MAIL_MSG_ID1, department_zh, localizedCardURL, _EmailSubject, defaultTranslationURL_ko, defaultTranslationURL_zh, localizedTranslationURL_ko, Language, localizedTranslationURL_zh, defaultTranslationURL_fj, Maintained by, defaultTranslationURL_so, region_ru, ParagraphNumberingLegal, jobNumber, localizedTranslationURL_so, Major Version, localizedTranslationURL_to, docRepCollection, author_id, KeyWords, Title, localizedTranslationURL_hy, defaultTranslationURL_ta, defaultTranslationURL_sk, region_en, defaultTranslationURL_ba, seriesName_es, Subject, localizedTranslationURL_sm, localizedTranslationURL_ca, docType_en, defaultTranslationURL_ka, country_es, localizedTranslationURL_ka, agrovoc_id, defaultTranslationURL_ur, meeting_ru, localizedTranslationURL_ur, MTEquationSection, issn, Universal PDF, 213, localizedTranslationURL_mn, defaultTranslationURL_hi, corporateAuthor_id, Description, agrovoc_en, GTS_PDFXVersion, defaultTranslationURL_sr, department_ar, division_en, RapportAuteur, sharepoint_id, defaultTranslationURL_ar, thumb200, _PreviousAdHocReviewCycleID, pages, FilePreviewStatus, MAIL_MSG_ID2, series_ar, robots, division, defaultTranslationURL_ms, mobiUrl, series_es, AssocFileName, Last Modified, allLanguages, distribution, Version, _EmailStoreID, seriesName_id, ICNAppPlatform, defaultTranslationURL_sl, localizedTranslationURL_sl, localizedTranslationURL_rn, localizedTranslationURL_es, collection_ar, seriesDetail, note, Mendeley Citation Style_1, workUuid, country_ru, gsaentity_City, customTitle_es, AGA, 
seriesName_en, collection_es, ICNAppVersion, first_open, meeting_ar, SourceModified, defaultTranslationURL_dual, Symbol1, defaultTranslationURL_lo, country_en, region_id, localizedTranslationURL_ne, defaultTranslationURL_to, gsaentity_google_language, e-isbn, region_fr, gsaentity_file_type, UseDefaultLanguage, _DocHome, defaultTranslationURL_de, defaultTranslationURL_km, department_id, defaultTranslationURL_id, localizedTranslationURL_mk, PAA activities, PDFVersion, department_fr, Generator, localizedTranslationURL_dual, abstract_es, defaultTranslationURL_fr, defaultTranslationURL_sm, localizedTranslationURL_fr, defaultTranslationURL_et, cardURL, localizedTranslationURL_uk, Operator, corporateAuthor_ar, defaultTranslationURL_da, localizedTranslationURL_sw, localizedTranslationURL_da, customTitle_zh, localizedTranslationURL_et, sdg, title, Keywords, defaultTranslationURL_mt, DirectFormatting, DocumentToConvert, meeting_id, defaultTranslationURL_hy, Direction, division_fr, localizedTranslationURL_sk, localizedTranslationURL_ru, language, TransitPubID, docType_zh, author_en, defaultTranslationURL_hu, localizedTranslationURL_hu, gsaentity_google_lastmod, agrovoc_ru, collection_fr, ContentTypeId, edition, LinksUpToDate, localizedTranslationURL_en, division_ru, confNumber, abstract_zh, localizedTranslationURL_si, defaultTranslationURL_mn, _AdHocReviewCycleID, database_id, defaultLanguage, author, RapportTaalDocument, revision date, project name, _EmailEntryID, department_es, series, defaultTranslationURL_es, personalAuthor_en, defaultTranslationURL_sv, region_ar, country_id, HeaderDone, _AuthorEmailDisplayName, country_fr, _AuthorEmail, Trapped, defaultTranslationURL_th, docType_ar, PTEX.Fullbanner, Comments, DocType, defaultTranslationURL_no, localizedTranslationURL_no, DocSecurity, gsaentity_Country, abstract, agrovoc_ar, defaultTranslationURL_pt, localizedTranslationURL_pt, division_ar, gsaentity_Location, defaultTranslationURL_ky, localizedTranslationURL_lo, 
localizedTranslationURL_ky, RapportTitel, localizedTranslationURL_hr, series_zh, geoSelfGoverning, localizedTranslationURL_fa, Afdrukken, department_ru, ICNAppName, codeMantra, LLC, localizedTranslationURL_id, IniName, Your guide to the eatwellplate , localizedTranslationURL_vi, subtitle, series number, _ReviewCycleID, customTitle_fr, LastSaved, RapportDatum, defaultTranslationURL_ki, collection_zh, defaultTranslationURL_si, division_id, placeOfPublication, author_zh, description, meeting_zh, HyperlinksChanged, docType_id, series_en, Category, Docear4Word_StyleTitle, AuthoritativeDomain[2], docType_fr, customTitle_ru, Build, country_ar, collection_ru, defaultTranslationURL_el, keywords, localizedTranslationURL_is, localizedTranslationURL_el, defaultTranslationURL_te, localizedTranslationURL_te, Created, defaultTranslationURL_ml, sortpubdate, customTitle_en, agrovoc_zh, localizedTranslationURL_nl, GENERATOR, collection_en, localizedTranslationURL_ta, customTitle, isbn, CreationDate--Text, User, Division, corporateAuthor_zh, personalAuthor_zh, defaultTranslationURL_ja, ElsevierWebPDFSpecifications, gsaentity_country_content, defaultTranslationURL_he, abstract_ru, localizedTranslationURL_he, Translated, SjabloonVersieDatum, EcoNote, localizedTranslationURL_sv, uuid, defaultTranslationURL_it, localizedTranslationURL_it, Company, AppVersion, author_ar, abstract_en, region_es, Status, localizedTranslationURL_th, localizedTranslationURL_sr, doi, localizedTranslationURL_ar, WPS-JOURNALDOI, docType_es, seriesName_zh, geoNonSelfGoverning, defaultTranslationURL_fa, file_length, publisherName, OLV0_XMD_PAGE_COUNT, MTWinEqns, defaultTranslationURL_mk, otherEntitiesInvolved, CreatorVersion, defaultTranslationURL_vi, epubUrl, XPressPrivate, visibility, WPS-ARTICLEDOI, defaultTranslationURL_uk, meeting_es, defaultTranslationURL_sw, docType, personalAuthor_ar, series_fr, Creator, abstract_ar, defaultTranslationURL_ru, WkDocID, faoProject, defaultTranslationURL_pl, TaalDocument, 
publicationDate, localizedTranslationURL_pl, Prepared, defaultTranslationURL_is, e-issn, series_ru, collection_id, department_en, localizedTranslationURL_ki, MTEquationNumber2, corporateAuthor_es, defaultTranslationURL_sq, gsaentity_google_encoding, localizedTranslationURL_sq, defaultTranslationURL_nl, seriesName_ar, author_fr, docType_ru, meeting_fr, ScaleCrop, AuthoritativeDomain[1], division_zh, RapportVoettekst, localizedTranslationURL_km, _AssemblyLocation, _AssemblyName, defaultTranslationURL_ca, EMAIL_OWNER_ADDRESS, defaultTranslationURL_ro, gsaentity_doc_source, Base Target, localizedTranslationURL_ro, PXCViewerInfo, localizedTranslationURL_fj, WPS-PROCLEVEL, SOURCE, Type, localizedTranslationURL_mt, author_ru, defaultTranslationURL_lv, localizedTranslationURL_ml, localizedTranslationURL_lv, abstract_fr, agrovoc_fr, year, JobNo, cardText, LCID, meetingDocSymbol, localizedTranslationURL_ba, corporateAuthor_fr, meeting_en, personalAuthor_fr, Papiersoort, localizedTranslationURL_ja, homepage, defaultTranslationURL_tr, localizedTranslationURL_tr, creation date, database_en, gsaentity_file_type_content, project code, gsaentity_Date, ContentType, corporateAuthor_ru, localizedTranslationURL_hi, country_zh, publisher, personalAuthor_ru, author_es, defaultTranslationURL_hr, corporateAuthor_en, seriesName_fr, customTitle_ar, defaultTranslationURL_rn, Subjects, defaultTranslationURL_fi, localizedTranslationURL_fi, alternativeVersion, ADBE_ProducerDetails)
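
Many names in the Doc17k set differ only in capitalization (for example `Keywords`, `KeyWords`, and `keywords`). A minimal sketch of how such variants could be surfaced is shown below. It assumes that case and punctuation carry no meaning, which may not hold for every field; the sample is a small hand-picked subset of the set above, and `normalize` and `variant_groups` are hypothetical helper names.

```python
from collections import defaultdict

# A small sample of names copied from the Doc17k set above.
sample_fields = [
    "Keywords", "KeyWords", "keywords",
    "title", "Title",
    "language", "Language",
    "division", "Division",
    "docType", "DocType",
    "Generator", "GENERATOR",
    "description", "Description",
    "author", "Author",
    "publisher", "publisherName",
]


def normalize(name):
    """Fold case and drop non-alphanumeric characters (an assumed rule)."""
    return "".join(ch for ch in name.casefold() if ch.isalnum())


def variant_groups(names):
    """Group names by normalized form and keep groups with several spellings."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


groups = variant_groups(sample_fields)
for key, variants in sorted(groups.items()):
    print(key, "->", variants)
```

Each group is a candidate for merging into a single standardized name. Names that are merely similar, such as `publisher` and `publisherName`, are not grouped by this rule and would need a manual decision.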
