How Generative AI Concentrates Rather Than Degrades Technical Knowledge Communities
A natural experiment using Stack Overflow data to test whether ChatGPT's launch degraded the cognitive quality of process automation questions. ICIS 2026 submission.
python -m venv venv
venv/Scripts/pip install -r requirements.txt # Windows
# or: venv/bin/pip install -r requirements.txt # Unix
# Copy .env.example to .env and add your API keys
cp .env.example .env

The project runs in 5 stages. Each stage is independently executable.
# Stage 1: Download SO questions via Stack Exchange API
python -m src.data_acquisition.main fetch-data
# Stage 2: Stratified sampling + feature engineering
python -m src.sampling.main run-all
# Stage 3: Rate questions with Claude API (LLM-as-judge)
python -m src.rating_pipeline.main rate --mode batch --yes
# Stage 4: DiD regression + robustness checks + figures
python -m src.analysis.main run-all
# Stage 5: Generate LaTeX paper
python -m src.paper.main run-all

Project structure:

src/
data_acquisition/ # Stage 1: SO API client, CSV processing, validation
sampling/ # Stage 2: Stratified sampling, feature engineering
rating_pipeline/ # Stage 3: Claude API batch rating, JSON parsing
analysis/ # Stage 4: DiD regression, robustness, figures
paper/ # Stage 5: LaTeX paper generation
data/
raw/ # Downloaded CSVs (gitignored)
processed/ # Parquet files (gitignored)
ratings/ # LLM rating outputs (gitignored)
figures/ # Publication-quality PNGs
paper/ # Generated LaTeX paper (gitignored)
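Stage 3 rates each question with an LLM-as-judge and parses the JSON it returns. A minimal sketch of that parsing step is below; the dimension names and the `parse_rating` helper are illustrative assumptions, not the repo's actual rubric or API.

```python
import json
import re

# Hypothetical rating dimensions -- the actual rubric lives in the repo's prompts.
DIMENSIONS = ["problem_articulation", "prior_effort", "reproducibility"]

def parse_rating(raw: str) -> dict:
    """Extract the first JSON object from an LLM reply, tolerating code fences."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object in model output")
    rating = json.loads(match.group(0))
    missing = [d for d in DIMENSIONS if d not in rating]
    if missing:
        raise ValueError(f"rating missing dimensions: {missing}")
    return rating

# Example: a typical judge reply wrapped in a markdown fence.
reply = """```json
{"problem_articulation": 4, "prior_effort": 3, "reproducibility": 5}
```"""
print(parse_rating(reply)["reproducibility"])  # -> 5
```

Defensive parsing like this matters in batch mode: a single malformed reply should raise loudly rather than silently dropping a rated question.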
Key findings

No significant decline in cognitive quality across any dimension (all p > 0.36), despite a 55% drop in question volume. ChatGPT acts as a filter -- absorbing routine questions -- rather than eroding the quality of those that remain. A significant increase in minimal reproducible examples (p = 0.036) supports this interpretation.
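The Stage 4 specification can be sketched as a canonical two-group difference-in-differences on synthetic data; the column names (`treated`, `post`, `quality`) and the robust-SE choice are illustrative assumptions, not the repo's actual model.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in for the question panel: `treated` marks the focal tag group,
# `post` marks questions asked after ChatGPT's launch (Nov 2022).
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),
    "post": rng.integers(0, 2, n),
    "quality": rng.normal(3.0, 1.0, n),  # placeholder LLM quality score
})

# The coefficient on the treated:post interaction is the DiD effect of interest.
model = smf.ols("quality ~ treated + post + treated:post", data=df).fit(
    cov_type="HC1"  # heteroskedasticity-robust standard errors
)
print(model.params["treated:post"])
```

On the paper's data, a null interaction term across quality dimensions is what grounds the "no decline" finding; the same frame with an indicator for minimal reproducible examples yields the significant p = 0.036 result.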