Retrieval: Check GRIN Transfer to download material (Google Books) #3

Why is this use case interesting for this application? It lets us test alternative ways to retrieve datasets.


**Harvard** created a **data pipeline** (https://github.com/institutional/institutional-books-1-pipeline) and an associated tool for obtaining materials from **Google Books** (https://www.institutional.org/posts/grin-transfer).

We do not have access to Google Books, but we could implement the same approach against the Hugging Face dataset instead.
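A minimal sketch of the Hugging Face route, under stated assumptions: the repository ID `institutional/institutional-books-1.0` is assumed rather than confirmed in this issue, and if the dataset is gated the account running the code would first need to be granted access and authenticated. It streams records with `datasets.load_dataset(..., streaming=True)` so nothing is downloaded in full. `stream_books` and `take_records` are hypothetical helper names.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List

# Assumption: repository ID of the dataset on the Hugging Face Hub.
DATASET_ID = "institutional/institutional-books-1.0"


def stream_books(dataset_id: str = DATASET_ID, split: str = "train") -> Iterator[Dict[str, Any]]:
    """Lazily yield records from the Hub without downloading the full dataset."""
    # Imported here so the rest of the module works without `datasets` installed.
    from datasets import load_dataset

    yield from load_dataset(dataset_id, split=split, streaming=True)


def take_records(records: Iterable[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    """Return at most the first n records of any iterable (hypothetical helper)."""
    return list(islice(records, n))


# Example (needs network access and the `datasets` package):
#   for book in take_records(stream_books(), 3):
#       print(sorted(book.keys()))
```

Printing the keys of the first few records is a cheap way to see which fields are actually available before building anything on top of them.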

- **GRIN Transfer**: Download books
    * See how well it works
    * What does the output look like?
    * How long does the OCR cleanup process take?
    * Could it make sense to use **FastAPI** here, or does **LangChain** have a data pipeline to access this resource?
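The first three questions could be answered with a small measurement harness. The sketch below is one assumed way to run such a check and is not part of GRIN Transfer: `profile_records` and the `process` callback are hypothetical names, and the records can come from any iterable of dictionaries.

```python
import time
from itertools import islice
from typing import Any, Callable, Dict, Iterable


def profile_records(
    records: Iterable[Dict[str, Any]],
    process: Callable[[Dict[str, Any]], Any],
    limit: int = 100,
) -> Dict[str, Any]:
    """Run `process` on up to `limit` records and report timing and field names."""
    count = 0
    fields = set()
    start = time.perf_counter()
    # islice stops after `limit` items without pulling an extra record.
    for record in islice(records, limit):
        process(record)
        fields.update(record.keys())
        count += 1
    # Elapsed time covers both fetching the records and processing them.
    elapsed = time.perf_counter() - start
    return {
        "records": count,
        "fields": sorted(fields),
        "seconds": elapsed,
        "seconds_per_record": elapsed / count if count else 0.0,
    }
```

Passing a no-op such as `lambda record: None` as `process` measures retrieval alone; passing a cleanup function gives a rough per-record cost for that step.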
