The Work of AI Red Teaming: Automation and the Human Infrastructure

As AI systems become more advanced and are taken up in a widening range of applications, the need to identify the risks they pose is growing. AI red teaming, the adversarial testing of AI systems for risks, has been proposed as one approach to meeting this need. It encompasses practices ranging from manual evaluation to fully automated methods that use AI models to generate and assess test cases. While automated and hybrid human-AI approaches are valued for their scalability and cost-effectiveness, they risk deprioritizing human expertise, obscuring the labor and judgment needed to surface meaningful harms, and harming workers’ well-being. Our SIG builds on efforts to theorize AI red teaming as a sociotechnical practice and explores how automation is reshaping the human infrastructure of AI red teaming. Drawing on CSCW research on labor in data annotation and content moderation, we ask: What forms of human expertise remain essential in red teaming? Which tasks can or should be automated, and under what conditions? How do choices about automation redistribute responsibility, agency, and labor? By convening scholars and practitioners across CSCW, HCI, and adjacent fields, we aim to surface key sociotechnical challenges in AI red teaming automation and to foster a community that advances thoughtful, responsible practices.

Alice Qian Zhang, Jiayin Zh, Srravya Chandhiramowuli, Hong Shen, Laura Dabbish, Theodora Skeadas, Sarah Amos, and Jina Suh

Publication Year: 2025

In: Companion Publication of the 2025 Conference on Computer-Supported Cooperative Work and Social Computing