2 changes: 1 addition & 1 deletion R/extractWorks.R
@@ -1,4 +1,4 @@
-#' Extract works associated with a concept in openAlex and store data as a compressed R list object
+#' Extract works associated with a source or concept in openAlex and store data as a compressed R list object
#'
#' @param data_style options for how much/how little data to return, see @details
#' @param mailto email address of user, needed to get in 'polite pool' of API
1 change: 1 addition & 0 deletions R/queryConcepts.R
@@ -8,6 +8,7 @@
#' @param variables to return in data.table
#' @description Search spreadsheet of openAlex concept tree (see https://docs.openalex.org/about-the-data/concept). Current google sheet url is: https://docs.google.com/spreadsheets/d/1LBFHjPt4rj_9r0t0TTAlT68NwOtNH8Z21lBMsJDMoZg/edit#gid=1473310811
#' @return datatable of results
#' @example man/examples/concept.R
#' @export
#' @details Note that https://api.openalex.org/concepts doesn't seem to tolerate regex at this point, so regex filtering needs to be done on the output
#' @importFrom stringr str_detect
1 change: 1 addition & 0 deletions R/querySources.R
@@ -5,6 +5,7 @@
#' @param type which type of source should be included in query, defaults to all
#' @description Primary use of this function is to get source ID for use in API
#' @export
#' @example man/examples/sources.R
#' @import jsonlite
#' @import stringr
#' @import httr
1 change: 1 addition & 0 deletions R/queryTitles.R
@@ -15,6 +15,7 @@
#' @import jsonlite
#' @import magrittr
#' @import data.table
#' @example man/examples/titles.R
#' @details Because extracted records can be large, complicated, nested json files, there is an optional "data_style" argument that lets the user specify what to return. Currently there are three active options: (1) bare_bones returns the OpenAlex ID and DOI, i.e., results that can be used to look up the work again; (2) citation returns typical citation information, like journal name, author, etc., plus a couple of bonus items like source.id to link back to openAlex; and (3) comprehensive returns author institutional affiliations, open access info, funding data, etc. A fourth option, all [not active], returns the entire result in original json format.
#' @export
#'
18 changes: 16 additions & 2 deletions README.md
@@ -8,12 +8,26 @@ The full openAlex database is ~300GB and so hosting the entire database is not a

# functions

-indexBuild currently does three main tasks: (1) search and identify IDs for venues (e.g., journals) and concept tags in openAlex; (2) query works associated with venues or concepts in openAlex and return a json database; (3) turn json file trees for openAlex works into a row-wise data.table object with a simple subset of metadata. Right now, the first two tasks are split across venues and concepts, e.g., there are separate extractVenues() and extractConcepts() functions. At some point, these can be combined.
+indexBuild currently does three main tasks:

## (1) search and identify IDs for venues (e.g., journals) and concept tags in openAlex

- `queryConcepts()` lets you search for the concepts that works are tagged with in openAlex. Example use case: find the openAlex ID for "public administration" and use that ID to query for works tagged with this concept.
- `querySources()` lets you search for journals in openAlex. Example use case: get journal IDs that can then be used to extract journal information or access all works associated with a journal.
- `queryTitles()` lets you input the name/title of a reference and search for matches in openAlex. Example use case: generate a list of candidate matches that can be indexed for more comprehensive, multivariate search.
- `lookupJournal()` is a convenience function for matching ISSNs to openAlex identifiers. Example use case: link journal data from SCImago to openAlex.


## (2) extract data for works associated with a given venue or concept in openAlex and return a json database

- `extractWorks()` takes a source or concept ID and returns a query result containing all works associated with that ID in openAlex.

Currently, `extractWorks()` handles processing internally, applying the `processWork()` function to JSON query results to develop a flat file (data.table) representation of the results. `processWork` has multiple return options, including 'bare_bones' which returns just DOI and openAlex ID (useful for further query), 'citation' which returns basic reference information, and 'comprehensive' which returns extra information like authors' institutional affiliations, available funding data, and open access information. Note that where necessary, `processWork` collapses entries using the ';' separator to store multiple entries (e.g., co-author names and IDs) in a single table entry.
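
To illustrate the ';' convention described above, here is a minimal base-R sketch of collapsing multiple entries into one table cell and splitting them back out. It is not the package's `processWork()` implementation: the `work` record, its field names, and the `collapse_field()` helper are hypothetical and exist only for this example.

```r
# Hypothetical nested record, loosely shaped like a work with two authors.
# Field names and IDs are made up for illustration.
work <- list(
  id = "W0000000001",
  authors = list(
    list(name = "A. Author", id = "A0000000001"),
    list(name = "B. Author", id = "A0000000002")
  )
)

# Hypothetical helper: pull one field from each entry and
# collapse the values into a single ';'-separated string.
collapse_field <- function(entries, field, sep = ";") {
  values <- vapply(entries, function(x) as.character(x[[field]]), character(1))
  paste(values, collapse = sep)
}

# One row per work; multi-valued fields live in a single cell.
flat <- data.frame(
  id = work$id,
  author_names = collapse_field(work$authors, "name"),
  author_ids = collapse_field(work$authors, "id"),
  stringsAsFactors = FALSE
)

# Recover the individual entries later by splitting on the separator.
author_names <- strsplit(flat$author_names, ";", fixed = TRUE)[[1]]
```

Splitting with `fixed = TRUE` treats ';' as a literal separator rather than a regular expression. The round trip only works if the values themselves never contain ';'.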

# example
To get information about a journal, you can feed in a journal title:
```
-queryVenues(venue_string = 'Journal of Public Administration Research and Theory')
+querySources(venue_string = 'Journal of Public Administration Research and Theory')
```

To get information about a concept, you can feed in a concept string.
5 changes: 5 additions & 0 deletions man/examples/concept.R
@@ -0,0 +1,5 @@
# find concepts in openAlex
id <- queryConcepts(concept_string = 'public policy')
print(id)


4 changes: 4 additions & 0 deletions man/examples/sources.R
@@ -0,0 +1,4 @@
# find journal in openAlex
id <- querySources(source = 'Journal of Public Administration Research and Theory')
print(id)

5 changes: 5 additions & 0 deletions man/examples/titles.R
@@ -0,0 +1,5 @@
# find titles in openAlex
id <- queryTitles('Does collaborative governance work?')
print(id)


2 changes: 1 addition & 1 deletion man/extractWorks.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 7 additions & 0 deletions man/queryConcepts.Rd


6 changes: 6 additions & 0 deletions man/querySources.Rd


4 changes: 4 additions & 0 deletions man/queryTitle.Rd
