Date: October 2025
Hosted by: Digital Research and Innovation Lab (AI in Research series)
Instructor: Jessica Breen, PhD (Program Director, Geospatial Research Support)
Tags: AI • R • Prompt Engineering • Research Tools • Workshop
This hands-on workshop introduces strategies for cleaning messy research data and shows how generative AI tools can support that process. You’ll learn how to identify common issues in tabular datasets, including missing values, inconsistent formatting, and duplicates, and how to plan a reproducible cleanup workflow. Then we’ll explore how to use generative AI to help write R code for cleaning tasks, with an emphasis on producing reusable scripts and well-documented steps. Designed for graduate students working with real-world data, this session is useful for anyone preparing a dataset for analysis, visualization, or sharing. Prior experience with R is helpful but not required.
- Zip File (ZIP): workshop_materials.zip
- Identifying common issues in tabular datasets
- Planning a reproducible data cleaning workflow
- Describing datasets for AI collaboration
- Prompting AI tools to write and debug R code
- Documenting, testing, and saving reproducible workflows in R Notebooks
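
As a rough illustration of the topics listed above, the three issue types named in the description (missing values, inconsistent formatting, and duplicates) might be handled in base R along the following lines. This is a minimal sketch, not code from the workshop materials: the `survey` data frame, its column names, and the placeholder strings treated as missing are all invented for the example.

```r
# Hypothetical toy data; names and values are invented for illustration only.
survey <- data.frame(
  id    = c(1, 2, 2, 3, 4),
  site  = c("North", " north", " north", "SOUTH ", "South"),
  score = c("10", "NA", "NA", "n/a", "7"),
  stringsAsFactors = FALSE
)

# 1. Inconsistent formatting: trim stray whitespace and standardize case.
survey$site <- tolower(trimws(survey$site))

# 2. Missing values: map placeholder strings (an assumed set) to NA,
#    then convert the column from character to numeric.
survey$score[survey$score %in% c("NA", "n/a", "")] <- NA
survey$score <- as.numeric(survey$score)

# 3. Duplicates: keep only the first occurrence of each fully identical row.
clean <- survey[!duplicated(survey), ]

# Simple checks worth recording alongside the cleaning steps.
colSums(is.na(clean))       # missing values remaining per column
nrow(survey) - nrow(clean)  # number of duplicate rows dropped
```

Keeping each step as a separate, commented statement makes the script easier to review, rerun, and paste into an R Notebook chunk. The same structure also gives an AI tool a concrete starting point when asking it to extend or debug the code.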
This workshop is part of the AI Tools in Research series hosted by the Digital Research and Innovation Lab.
Breen, J. (2025). AI Tools for Research: AI-Assisted Strategies for Cleaning and Preparing Research Data (workshop materials). Geospatial Research Lab, American University Library. https://github.com/GeospatialResearchLab/workshop-data-cleaning-GenAI-R-2025