A tool to extract PDF files from binary files such as memory dumps or firmware images.
This tool searches a given file for the PDF magic number (%PDF-) and extracts each matching section as a PDF file. This is very similar to what binwalk does.
It can be used for data recovery, forensics or reverse engineering.
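The magic-number search described above can be sketched in a few lines of Go. This is a minimal illustration rather than the code pdfdump666 actually uses: the helper name findMagicOffsets and the sample data are hypothetical, and the sketch only reports where each %PDF- header starts.

```go
package main

import (
	"bytes"
	"fmt"
)

// pdfMagic is the byte sequence that opens a PDF header.
var pdfMagic = []byte("%PDF-")

// findMagicOffsets returns the offset of every occurrence of pdfMagic in data.
// Hypothetical helper for illustration; not taken from pdfdump666.
func findMagicOffsets(data []byte) []int {
	var offsets []int
	pos := 0
	for {
		i := bytes.Index(data[pos:], pdfMagic)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, pos+i)
		// Continue searching just past the match that was found.
		pos += i + len(pdfMagic)
	}
}

func main() {
	// Made-up stand-in for a memory dump with two embedded PDF headers.
	dump := []byte("\x00\x01junk%PDF-1.7 ...body...\xff\xfe%PDF-1.4 ...")
	fmt.Println(findMagicOffsets(dump)) // prints [6 27]
}
```

Each reported offset is only a candidate. A header found this way may belong to a truncated or overwritten document, which is why some extracted files do not open.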
- Install Go using the package manager of your operating system (apt, pacman, winget) or follow the official installation guide
- Clone this repository using
      git clone
- Open a terminal in the repository
- Compile the program with
      go build .
- The file ./pdfdump666 should now exist
You can use gcore, which is part of gdb, to create a memory dump of a running process on Linux without killing it.
Do a dry run first to see whether your file might contain a PDF.
    ./pdfdump666 -d <path to file>
If no matches are found, the data might be compressed or encrypted, which this tool does not support.
Next, extract the PDF files.
    ./pdfdump666 <path to file>
Extracted PDF files will be put into an out directory within your current working directory.
This tool might extract multiple files, and only some of them may work. Extracted files may be damaged or incomplete, so try opening them with several PDF readers.
Consider processing the files with qpdf, which can attempt to repair damaged PDFs.
If none of the files can be opened, try another extraction that includes extra bytes before and after each match, using the -b and -a options. A good range is between 0 and 1000 bytes.
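The effect of the -b and -a options can be pictured as widening the extracted byte range. The sketch below is an assumption-laden illustration, not the tool's implementation: how pdfdump666 decides where a match ends is not documented here, so the function simply takes a half-open range [start, end) and the hypothetical name extractionWindow, then clamps the widened range to the file size.

```go
package main

import "fmt"

// extractionWindow widens the half-open range [start, end) by `before` bytes
// on the left and `after` bytes on the right, clamped to [0, size].
// Hypothetical helper; the real tool may compute its ranges differently.
func extractionWindow(start, end, before, after, size int) (int, int) {
	lo := start - before
	if lo < 0 {
		lo = 0 // cannot read before the beginning of the file
	}
	hi := end + after
	if hi > size {
		hi = size // cannot read past the end of the file
	}
	return lo, hi
}

func main() {
	// A match at [100, 5000) in a 5500-byte file, widened by 200 and 1000 bytes.
	lo, hi := extractionWindow(100, 5000, 200, 1000, 5500)
	fmt.Println(lo, hi) // prints 0 5500
}
```

Clamping matters for matches near the start or end of the input, where the requested padding would otherwise reach outside the file.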
Usage:
  pdfdump666 <file> [flags]

Flags:
  -a, --after int       include N bytes after
  -b, --before int      include N bytes before
  -d, --dry-run         dry run
  -h, --help            help for pdfdump666
  -o, --output string   output directory (default "out")
Copyright 2026 Daniel Gekeler
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.