A collection of various smaller tools to work with Wikimedia projects (Wikipedia, Commons, Wikidata)
Latest update: 28 October 2025
This repo contains the following tools:
- Wikimedia Commons File Metadata Downloader: Collects metadata from Wikimedia Commons files or categories and writes it into an Excel sheet, safely, in chunks, and with per-file JSON snapshots.
- Wikimedia Commons File Downloader: A robust, Windows-safe downloader for Wikimedia Commons files. Download files by nested category tree or flat list, preview before downloading, slice a subset, use Windows-safe unique filenames, and log to Excel.
- Wikimedia Commons URL M-ID Excel Extractor: Reads a Wikimedia Commons FileURL column from an Excel sheet, looks up the corresponding MediaInfo entity IDs (M-IDs), and writes the results back into the same Excel workbook.
- TO ADD: General File Downloader: A simple tool for quickly downloading non-Wikimedia Commons files.
- TO ADD: URL HTTP status checker tool (coming soon): A script to check the HTTP status codes of a list of URLs, with support for retries, timeouts, and detailed logging. Works for any URL, not only Wikimedia-related ones.
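
To illustrate what an M-ID lookup involves, here is a minimal sketch in Python. It is not the code of the extractor in this repo, only one possible approach, based on the fact that a file's MediaInfo entity ID is the letter `M` followed by the page ID of its file page on Commons. The function names, the User-Agent string, and the timeout value are illustrative assumptions.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"


def title_from_file_url(file_url):
    """Turn a Commons file page URL into a page title such as 'File:Example.jpg'."""
    path = urllib.parse.urlparse(file_url).path
    # File pages live under /wiki/<title>; decode percent-escapes and underscores.
    title = urllib.parse.unquote(path.rsplit("/wiki/", 1)[-1])
    return title.replace("_", " ")


def mid_from_api_response(data):
    """Derive the M-ID from a parsed action=query response (formatversion=2).

    Returns None if the page is missing or the response has no page ID.
    """
    pages = data.get("query", {}).get("pages", [])
    if not pages or pages[0].get("missing") or "pageid" not in pages[0]:
        return None
    return "M" + str(pages[0]["pageid"])


def lookup_mid(file_url, timeout=10.0):
    """Ask the Commons API for the page ID of a file URL and return its M-ID."""
    params = urllib.parse.urlencode({
        "action": "query",
        "titles": title_from_file_url(file_url),
        "format": "json",
        "formatversion": "2",
    })
    request = urllib.request.Request(
        API_URL + "?" + params,
        # Hypothetical User-Agent; replace with one that identifies your tool.
        headers={"User-Agent": "mid-lookup-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return mid_from_api_response(json.load(response))
```

Calling `lookup_mid("https://commons.wikimedia.org/wiki/File:Example.jpg")` would return a string of the form `M<pageid>`, or `None` if no such file page exists. For a whole Excel column, one would call this per row (or batch several titles into one request) and write the results to a new column.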
The code in this repo is released into the public domain under the CC0 1.0 public domain dedication. Feel free to reuse and adapt. Attribution (KB, National Library of the Netherlands) is appreciated but not required.
- Author: Olaf Janssen, Wikimedia coordinator @ KB, National Library of the Netherlands
- Contact via KB expert page or Wikimedia user page.