mmediasoftwarelab/BIP39RecoveryTool-public

BIP39 Recovery Tool - Core Engine

Made by M Media


You had a wallet. You lost the seed phrase. Somewhere on that drive - maybe in a deleted file, a browser cache, a backup from three years ago - those words still exist as bytes. This engine finds them.


What This Is

This repository contains the complete scanning and BIP-39 algorithm engine that powers the BIP39 Recovery Tool - a commercial Windows application for recovering BIP-39 cryptocurrency seed phrases from storage devices.

The code published here is the real engine. Not a demo. Not a stripped-down reference implementation. The same LowLevelScanner, BIP39Checksum, and Bip39Sequence routines that run in the commercial product are what you're reading right now.

These components are published for transparency, peer review, and for developers who want to integrate BIP-39 scanning capabilities into their own tools.


Air-Gapped by Design

These components make zero network calls. Zero.

No telemetry. No analytics. No "phone home." No external library calls to anything that touches a socket. The entire scan - from the first disk sector to the final CSV row - happens between your CPU and your storage device. Your seed words, your file paths, and your results never leave your machine.

You can verify this yourself:

grep -rn "http\|socket\|network\|QNetworkAccess\|curl\|telemetry" *.h *.cpp

You will find nothing. That is intentional. When you're recovering a cryptocurrency wallet, the last thing you need is a tool that might be logging what it finds.


Components

BIP39Checksum - Spec-Compliant Validator

#include "bip39_checksum.h"

bool valid = BIP39Checksum::validate(words, wordlist);

Implements the full checksum algorithm defined in the BIP-39 specification, using SHA-256 (FIPS 180-4) as the hash:

  1. Maps each word to its 11-bit index in the 2048-word canonical list
  2. Concatenates indices into a single bitstream (MSB-first)
  3. Splits into entropy bytes + checksum bits (CS = N/3 bits for an N-word phrase, i.e. 4/5/6/7/8 bits for phrase lengths 12/15/18/21/24)
  4. SHA-256 hashes the entropy; compares the first CS bits against the extracted checksum

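False-positive rate for random BIP-39 word sequences: 2^-CS, which ranges from 1 in 16 for a 12-word phrase (CS = 4) to 1 in 256 for a 24-word phrase (CS = 8). When validate() returns true, you have a phrase whose checksum verifies - not just a word count match. A random run of list words can still pass by chance at the rates above, so treat each hit as a candidate to confirm against the wallet.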
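Steps 2 and 3 above can be sketched in plain C++ as follows. This is an illustration only, not the code in bip39_checksum.cpp: the names SplitResult and splitMnemonicBits are hypothetical, and the word-to-index lookup (step 1) and the SHA-256 comparison (step 4) are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch.
struct SplitResult {
    std::vector<std::uint8_t> entropy;  // ENT/8 bytes, MSB-first
    unsigned checksum = 0;              // the CS checksum bits, right-aligned
    unsigned csBits = 0;                // CS = N/3 for an N-word phrase
    bool ok = false;                    // false for invalid length or index
};

// Packs 11-bit word indices into one bitstream and splits it into
// entropy bytes and trailing checksum bits.
SplitResult splitMnemonicBits(const std::vector<std::uint16_t>& indices) {
    SplitResult r;
    const std::size_t n = indices.size();
    if (n < 12 || n > 24 || n % 3 != 0) return r;   // only 12/15/18/21/24

    r.csBits = static_cast<unsigned>(n / 3);
    const std::size_t entBits = n * 11 - r.csBits;  // 128..256, multiple of 32
    r.entropy.assign(entBits / 8, 0);

    std::size_t bitPos = 0;
    for (std::uint16_t idx : indices) {
        if (idx >= 2048) return r;                  // not a valid 11-bit index
        for (int b = 10; b >= 0; --b, ++bitPos) {
            const unsigned bit = (idx >> b) & 1u;
            if (bitPos < entBits) {
                if (bit) {
                    r.entropy[bitPos / 8] |=
                        static_cast<std::uint8_t>(0x80u >> (bitPos % 8));
                }
            } else {
                r.checksum = (r.checksum << 1) | bit;
            }
        }
    }
    r.ok = true;
    return r;
}
```

Because CS is at most 8, step 4 then reduces to comparing r.checksum against the top csBits bits of the first byte of SHA-256(entropy), that is hash[0] >> (8 - csBits).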

No Qt dependency. STL + the bundled sha256 only.


Bip39Sequence - Sliding-Window Extractor

#include "bip39_sequence.h"

QSet<QString> wordSet = Bip39Sequence::buildWordSet(wordlist);
QVector<Bip39Sequence::Match> matches = Bip39Sequence::extract(data, wordSet);

Tokenizes raw bytes on non-alpha boundaries and extracts every consecutive BIP-39 word run of exactly 12, 15, 18, 21, or 24 words. Uses a sliding window across longer runs - a 13-word sequence yields two 12-word candidates (offset 0 and offset 1), so nothing is missed.

Works on both QByteArray (raw disk data) and QString (decoded text). Each Match carries the word list and the byte offset of the first word in the source buffer, so you always know exactly where in a file or disk block the candidate phrase started.
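The sliding-window behaviour described above can be sketched without Qt as follows. This is a simplified stand-in, not the Bip39Sequence source: Candidate and extractCandidates are hypothetical names, std::string replaces QByteArray, and matching is assumed to be case-sensitive against a lowercase word set.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical match type: the words plus the byte offset of the first word.
struct Candidate {
    std::vector<std::string> words;
    std::size_t offset;
};

std::vector<Candidate> extractCandidates(
        const std::string& data,
        const std::unordered_set<std::string>& wordSet) {
    struct Token { std::string text; std::size_t offset; };
    static const std::size_t kLengths[] = {12, 15, 18, 21, 24};

    std::vector<Candidate> out;
    std::vector<Token> run;  // current run of consecutive list words

    // Emits every window of each valid length from the finished run.
    auto flushRun = [&]() {
        for (std::size_t len : kLengths) {
            for (std::size_t start = 0; start + len <= run.size(); ++start) {
                Candidate c;
                c.offset = run[start].offset;
                for (std::size_t i = 0; i < len; ++i)
                    c.words.push_back(run[start + i].text);
                out.push_back(std::move(c));
            }
        }
        run.clear();
    };

    std::size_t i = 0;
    while (i < data.size()) {
        if (!std::isalpha(static_cast<unsigned char>(data[i]))) { ++i; continue; }
        const std::size_t begin = i;
        while (i < data.size() &&
               std::isalpha(static_cast<unsigned char>(data[i]))) ++i;
        std::string token = data.substr(begin, i - begin);
        if (wordSet.count(token)) run.push_back({std::move(token), begin});
        else flushRun();  // a non-list word ends the run
    }
    flushRun();
    return out;
}
```

With this sketch, a 13-word run produces two 12-word candidates whose offsets point at the first and second word of the run. Each candidate would then go through checksum validation.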


LowLevelScanner - Raw Disk Engine

#include "lowlevelscanner.h"

LowLevelScanner scanner;
scanner.startScan(R"(\\.\PhysicalDrive2)", outputDir, 1024 * 1024);

Reads every byte of a physical disk device using the Windows raw I/O API (CreateFileW / ReadFile). At the default 1 MB block size, a 1 TB drive is processed as roughly 953,000 discrete blocks - each one independently scanned for BIP-39 word sequences and checksum-validated on hit.
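The block count quoted above follows from a ceiling division of the device size by the block size. A minimal sketch, assuming "1 TB" means 10^12 bytes and using a hypothetical helper name:

```cpp
#include <cstdint>

// Number of blocks needed to cover a device, counting a final partial block.
constexpr std::uint64_t blockCount(std::uint64_t deviceBytes,
                                   std::uint64_t blockBytes) {
    return blockBytes == 0 ? 0 : (deviceBytes + blockBytes - 1) / blockBytes;
}
```

For 10^12 bytes and 1024 * 1024-byte blocks this gives 953,675 blocks: 953,674 full blocks plus one partial block at the end of the device.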

Key design decisions:

Feature                  Implementation
Block size               Configurable, defaults to 1 MB
Pause / Resume           QMutex + QWaitCondition - zero CPU burn while paused
Stop with partial save   writePartialCsv() called automatically on abort
Thread safety            std::atomic<bool> stopRequested - no locks in the hot path
Progress reporting       Qt signals: progressUpdated(currentBlock, totalBlocks)
Match enrichment         wordFoundInBlock signal carries word, block number, byte offset, hex context
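The pause/resume and stop rows describe a common pattern: atomic flags checked once per block, with a mutex and condition variable touched only while paused. The sketch below shows that pattern with std::mutex and std::condition_variable swapped in for QMutex and QWaitCondition so that it compiles without Qt. The class name ScanGate and its methods are hypothetical and are not taken from lowlevelscanner.cpp.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

class ScanGate {
public:
    void pause() { paused_.store(true, std::memory_order_release); }

    void resume() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            paused_.store(false, std::memory_order_release);
        }
        wake_.notify_all();
    }

    void requestStop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_.store(true, std::memory_order_release);
        }
        wake_.notify_all();  // also wakes a paused scan so it can exit
    }

    // Called by the scanning thread once per block.
    // Returns false when the scan should end.
    bool checkpoint() {
        if (stop_.load(std::memory_order_acquire)) return false;
        if (paused_.load(std::memory_order_acquire)) {
            // Only a paused scan takes the lock; it sleeps here without spinning.
            std::unique_lock<std::mutex> lock(mutex_);
            wake_.wait(lock, [this] { return !paused_.load() || stop_.load(); });
        }
        return !stop_.load(std::memory_order_acquire);
    }

private:
    std::atomic<bool> paused_{false};
    std::atomic<bool> stop_{false};
    std::mutex mutex_;
    std::condition_variable wake_;
};
```

A read loop built on this would call checkpoint() before each block and, when it returns false, write whatever partial results it holds before exiting.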

System drive (C:\) is blocked by design. The scanner will refuse to open it.


Scanner - File-System Scan

#include "scanner.h"

Scanner scanner;
scanner.setWordlist(wordlist);
scanner.scanDirectory(sourcePath, outputPath, onMatch, onProgress, shouldStop);

Walks a directory tree recursively using QDirIterator, checking .txt and .ini files for BIP-39 content. Uses a two-stage filter: a fast containsBip39Word() pre-check before invoking the full Bip39Sequence::extract() on confirmed candidates. Matched files are copied to the output folder with a yyyyMMdd_HHmmss_ prefix.
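The two-stage filter can be pictured as a cheap early-exit check guarding an expensive call. The sketch below is an assumption-laden illustration, not the Scanner source: hasAnyListedWord stands in for containsBip39Word(), the full extraction is passed in as a callback, and std::string replaces the Qt types.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Stage 1: returns as soon as a single list word is seen.
bool hasAnyListedWord(const std::string& text,
                      const std::unordered_set<std::string>& wordSet) {
    std::size_t i = 0;
    while (i < text.size()) {
        if (!std::isalpha(static_cast<unsigned char>(text[i]))) { ++i; continue; }
        const std::size_t begin = i;
        while (i < text.size() &&
               std::isalpha(static_cast<unsigned char>(text[i]))) ++i;
        if (wordSet.count(text.substr(begin, i - begin))) return true;
    }
    return false;
}

// Stage 2 runs only for text that passed stage 1.
// Returns true if the full extraction was invoked.
bool scanText(const std::string& text,
              const std::unordered_set<std::string>& wordSet,
              const std::function<void(const std::string&)>& fullExtract) {
    if (!hasAnyListedWord(text, wordSet)) return false;
    fullExtract(text);
    return true;
}
```

Most files on a typical drive contain no list words at all, so the pre-check lets the scan discard them without building any candidate windows.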

All callbacks are std::function - composable and testable without a running Qt event loop.


ScanWorker - Qt Thread Worker

#include "scanworker.h"

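QThread *scanThread = new QThread(this);
ScanWorker *worker = new ScanWorker();
worker->moveToThread(scanThread);
connect(scanThread, &QThread::started, worker, &ScanWorker::startScan);
connect(scanThread, &QThread::finished, worker, &QObject::deleteLater);
scanThread->start();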

A QObject subclass designed for Qt's moveToThread() pattern. Wraps Scanner in a background thread, manages std::atomic<bool> stopRequested, emits typed progress/match signals, and writes the final CSV on completion. CSV output includes file path, copy path, matched words, and timestamp.
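File paths can contain commas and quotes, so a CSV row with the four columns named above needs field quoting. The sketch below shows one conventional way to do that in plain C++. It is not the ScanWorker code: csvField and csvRow are hypothetical helpers, and the exact quoting and timestamp format used by the tool are not documented here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Quotes a field only when it contains a comma, quote, or line break;
// embedded quotes are doubled.
std::string csvField(const std::string& value) {
    if (value.find_first_of(",\"\r\n") == std::string::npos) return value;
    std::string quoted = "\"";
    for (char c : value) {
        if (c == '"') quoted += '"';
        quoted += c;
    }
    quoted += '"';
    return quoted;
}

// Joins fields into one CSV line (without the trailing newline).
std::string csvRow(const std::vector<std::string>& fields) {
    std::string row;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) row += ',';
        row += csvField(fields[i]);
    }
    return row;
}
```

A row for one match would then be built from the file path, copy path, matched words, and timestamp, in that order.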


DriveUtils - Physical Device Resolution

#include "driveutils.h"

QString devicePath = DriveUtils::driveLetterToPhysicalPath("G:/");
// Returns the physical device path, e.g. "\\.\PhysicalDrive2"

Converts a Windows drive letter to the corresponding physical device path using DeviceIoControl with IOCTL_STORAGE_GET_DEVICE_NUMBER. Required because LowLevelScanner operates on physical devices, not drive letters.
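The lookup described above can be sketched with the Win32 API directly. This is a minimal illustration under stated assumptions, not the DriveUtils source: the function names are hypothetical, std::wstring replaces QString, and volumes that span several physical disks are not handled. The Windows-only part is guarded so the string helpers compile on any platform.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <winioctl.h>
#endif

// L'G' -> \\.\G:  (volume device path, no trailing slash)
std::wstring volumeDevicePath(wchar_t driveLetter) {
    return std::wstring(L"\\\\.\\") + driveLetter + L":";
}

// 2 -> \\.\PhysicalDrive2
std::wstring physicalDrivePath(unsigned long deviceNumber) {
    return L"\\\\.\\PhysicalDrive" + std::to_wstring(deviceNumber);
}

#ifdef _WIN32
// Asks the storage stack which physical device backs the volume.
// Returns an empty string on failure.
std::wstring resolvePhysicalPath(wchar_t driveLetter) {
    // Access mask 0 is enough to query device metadata.
    HANDLE h = CreateFileW(volumeDevicePath(driveLetter).c_str(), 0,
                           FILE_SHARE_READ | FILE_SHARE_WRITE, nullptr,
                           OPEN_EXISTING, 0, nullptr);
    if (h == INVALID_HANDLE_VALUE) return std::wstring();

    STORAGE_DEVICE_NUMBER sdn{};
    DWORD bytesReturned = 0;
    const BOOL ok = DeviceIoControl(h, IOCTL_STORAGE_GET_DEVICE_NUMBER,
                                    nullptr, 0, &sdn, sizeof(sdn),
                                    &bytesReturned, nullptr);
    CloseHandle(h);
    return ok ? physicalDrivePath(sdn.DeviceNumber) : std::wstring();
}
#endif
```

The device number comes from the operating system, so the same drive letter can map to a different PhysicalDriveN after devices are unplugged or re-enumerated.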


Wordlist Variants

Three header-only wordlist formats for different use cases:

Header                 Type                       Use case
bip39_wordlist.h       std::vector<QString>       Qt UI, Scanner, Bip39Sequence
bip39_wordlist_std.h   std::vector<std::string>   BIP39Checksum, non-Qt code
bip39_wordlist_raw.h   std::vector<QByteArray>    LowLevelScanner raw-byte matching

All three contain the same canonical 2048-word BIP-39 English list from bitcoin/bips. Multiple formats exist to avoid conversion overhead in the hot scanning paths.


Building

Dependencies: Qt 6.9+ (MinGW 64-bit) on Windows 10/11.

These components do not have a main() - they are designed to be integrated into a Qt application. See the full commercial tool for a complete implementation with UI, licensing, and installer.

To use in your own project, add the files to your .pro:

HEADERS += scanner.h scanworker.h lowlevelscanner.h driveutils.h \
           bip39_checksum.h bip39_sequence.h sha256.h \
           bip39_wordlist.h bip39_wordlist_raw.h bip39_wordlist_std.h

SOURCES += scanner.cpp scanworker.cpp lowlevelscanner.cpp driveutils.cpp \
           bip39_checksum.cpp bip39_sequence.cpp sha256.cpp

lowlevelscanner and driveutils require Windows (windows.h, winioctl.h). All other components are cross-platform Qt/C++17.


Why We Published This

Trust has to be earned, especially for security tools.

If you're going to run software on a machine that might contain - or have once contained - a cryptocurrency seed phrase, you deserve to be able to read every line of the code that touches your data.

The scanning engine is published so that:

  • Security researchers can audit the code
  • Advanced users who won't run an unsigned binary can build from source
  • The community can verify the air-gap claim independently
  • Other developers working on wallet recovery tools have a solid, tested foundation to build on

The commercial product that wraps this engine handles the UI, installer, and customer support. The engine itself is yours to read, compile, and use under the MIT license.


About M Media

M Media builds focused, professional Windows utilities. We don't build suites. We don't build subscriptions. We build tools that solve one problem completely and work reliably for years.

The BIP39 Recovery Tool is one of those tools. It exists because wallet recovery is a real problem that affects real people - and the options that existed before it ranged from inadequate to untrustworthy.

Support: support@mmediasoftwarelab.com


License

The source files in this repository are released under the MIT License.

sha256.cpp / sha256.h are public domain, based on Brad Conte's reference implementation (bradconte.com) of the SHA-256 algorithm specified in FIPS 180-4.

The BIP-39 wordlist is from the Bitcoin Improvement Proposals repository and is in the public domain.

The commercial BIP39 Recovery Tool application (UI, installer, licensing system) is proprietary software and is not included in this repository.


©2026 M Media. All rights reserved.
