A command-line tool to search and extract full conversations from an OpenAI ChatGPT data export (conversations.json inside the ZIP archive).
Conversations matching your filters are written into structured folders, with attachments restored.
Before you can use openai_extract, you need to request your personal data export from OpenAI:
- Go to https://chat.openai.com and log in.
- Click on your profile picture → Settings.
- Navigate to Data controls.
- Click Export data.
- Confirm the request. You’ll receive an email once your export is ready.
- Download the .zip file from the email link.
This archive contains conversations.json and any uploaded files.
Use this .zip file as the input for openai_extract.
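As a rough illustration of working directly on the export archive, the Go sketch below locates conversations.json inside a ZIP and decodes it. It assumes the file holds a top-level JSON array of conversation objects. The helper names (findEntry, loadConversations, sampleArchive) are hypothetical and are not taken from the openai_extract source.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"path"
)

// findEntry returns the first archive entry whose base name equals name.
func findEntry(r *zip.Reader, name string) (*zip.File, error) {
	for _, f := range r.File {
		if path.Base(f.Name) == name {
			return f, nil
		}
	}
	return nil, fmt.Errorf("%s not found in archive", name)
}

// loadConversations decodes conversations.json into raw JSON objects.
// Assumption: the file is a top-level JSON array.
func loadConversations(r *zip.Reader) ([]json.RawMessage, error) {
	f, err := findEntry(r, "conversations.json")
	if err != nil {
		return nil, err
	}
	rc, err := f.Open()
	if err != nil {
		return nil, err
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	if err != nil {
		return nil, err
	}
	var convs []json.RawMessage
	if err := json.Unmarshal(data, &convs); err != nil {
		return nil, err
	}
	return convs, nil
}

// sampleArchive builds a tiny in-memory ZIP so the sketch runs without a real export.
func sampleArchive() (*zip.Reader, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	fw, err := w.Create("conversations.json")
	if err != nil {
		return nil, err
	}
	if _, err := fw.Write([]byte(`[{"title":"a"},{"title":"b"}]`)); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	r, err := sampleArchive()
	if err != nil {
		panic(err)
	}
	convs, err := loadConversations(r)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(convs), "conversations")
}
```

For a real export on disk, zip.OpenReader("export.zip") returns a *zip.ReadCloser whose embedded Reader can be passed to loadConversations as &rc.Reader.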
- Works directly on the exported .zip file (conversations.json + attachments).
- Search by one or more text patterns (AND semantics: all must match).
- Restrict results by:
  - Content type (e.g. code, code_interpreter)
  - Programming languages (detected from metadata and code fences).
- Outputs:
  - conversation.json (pretty-printed full conversation)
  - files/ with any referenced attachments
- Each conversation gets its own folder, named by its start timestamp.
Clone and build:
git clone https://github.com/yourname/openai_extract.git
cd openai_extract
go build -o openai_extract ./cmd/cli
This produces a binary named openai_extract.
openai_extract \
-f <archive_file.zip> \
-p <pattern> [-p <pattern> ...] \
-o <output_folder> \
[--content-type code,code_interpreter] \
[--language python,go]
- -f, --file: Path to your OpenAI export .zip
- -p, --pattern: Search term or regex. Repeat -p to AND multiple patterns.
- -o, --output: Output folder where matched conversations are written.
- --content-type: Require all of these content types. Example: --content-type code,code_interpreter
- -l, --language: Require all of these languages. Example: -l go -l python
Match conversations containing both feedback and service:
openai_extract \
-f export.zip \
-p feedback -p service \
-o assets/output
Match conversations that contain both patterns and include Go and JavaScript code:
openai_extract \
-f export.zip \
-p feedback -p service \
-o assets/output \
-l go -l javascript \
--content-type code
assets/output/
  090125-1836/          # folder name from conversation start time
    conversation.json   # full conversation (pretty JSON)
    files/              # any linked attachments
      image.png
      dataset.csv
Run vet/tests:
go vet ./...
go test ./...
- Pattern matching is case-insensitive by default unless you pass an explicit regex.
- Every filter (pattern, language, content-type) is ANDed. Each extra filter makes the match more restrictive.
- Designed for local use; no API calls.
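The AND semantics and case-insensitive default described above might be sketched in Go as follows. This is one possible reading, labeled as an assumption: every pattern is compiled as a regex with the (?i) flag prefixed, and a pattern can opt out by starting with (?-i). The tool's real rule for distinguishing plain terms from explicit regexes is not documented here, and compileAll and matchesAll are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileAll compiles each pattern as a case-insensitive regex.
// Assumption: prefixing (?i) is how the default is applied.
func compileAll(patterns []string) ([]*regexp.Regexp, error) {
	res := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile("(?i)" + p)
		if err != nil {
			return nil, fmt.Errorf("invalid pattern %q: %w", p, err)
		}
		res = append(res, re)
	}
	return res, nil
}

// matchesAll reports whether every pattern matches text (AND semantics).
func matchesAll(text string, res []*regexp.Regexp) bool {
	for _, re := range res {
		if !re.MatchString(text) {
			return false
		}
	}
	return true
}

func main() {
	text := "Customer Feedback about the billing service"

	both, err := compileAll([]string{"feedback", "service"})
	if err != nil {
		panic(err)
	}
	fmt.Println(matchesAll(text, both)) // true

	stricter, err := compileAll([]string{"feedback", "refund"})
	if err != nil {
		panic(err)
	}
	fmt.Println(matchesAll(text, stricter)) // false
}
```

Because each added pattern is another condition that must hold, a conversation that matches a longer pattern list always matches any subset of it.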
openai_extract is released under the MIT License.