Chau Minh Pham

Hi there! My name is Chau. I use she/her pronouns and publish under Chau Minh Pham. I am a Ph.D. student in Computer Science at the University of Maryland, College Park, where I am advised by Professor Mohit Iyyer in the CLIP Lab. My recent work has focused on building instruction-following datasets to enable long-form text generation and long-context reasoning.

Previously, I completed my master's degree at UMass Amherst. Before that, I graduated from Colgate University, where I was advised by Professor Joel Sommers. I spent most of my undergraduate years working on statistical test visualization apps at the Data Science Collaboratory. I also worked briefly on AI harm anticipation at Microsoft Research (FATE) and COVID-19 emotion analysis in the CRA-WP DREU program.

News

June 2025: Interning with the Document Intelligence lab at Adobe (mentor: Varun Manjunatha).

May 2025: Released a preprint on Frankentexts, a new type of LLM narrative generated under extreme constraints, with implications for detecting mixed-authorship AI text and simulating co-writing scenarios.

March 2025: Got my MS from UMass Amherst!

Feb 2025: Released a preprint on CLIPPER, a synthetic data generation pipeline for long-context narrative reasoning tasks.

Jan 2025: Transferred to UMD College Park with my advisor!

Nov 2024: Presenting Suri at EMNLP 2024 (+ WNU)! See you in Miami 🌴

Nov 2024: Released a Python package for TopicGPT! See the TopicGPT page for more details.

Oct 2024: Gave a guest lecture at Mount Holyoke College (COMSC 341NL - Topics: 'Natural Language Processing') on generating and reasoning over long-form texts.

Research

Frankentext: Stitching random text fragments into long-form narratives
Chau Minh Pham, Jenna Russell, Dzung Pham, Mohit Iyyer
Under review
[Paper] [Code] [BibTeX]

CLIPPER: Compression enables long-context synthetic data generation
Chau Minh Pham, Yapei Chang, Mohit Iyyer
COLM 2025; 7th WNU
[Paper] [Code] [HuggingFace] [BibTeX]

Suri: Multi-constraint Instruction Following for Long-form Text Generation
Chau Minh Pham, Simeng Sun, Mohit Iyyer
EMNLP 2024 (Findings); 6th WNU
[Paper] [Project Page] [Code] [Poster] [BibTeX]

TopicGPT: A Prompt-based Topic Modeling Framework
Chau Minh Pham, Alexander Hoyle, Simeng Sun, Philip Resnik, Mohit Iyyer
NAACL 2024
[Paper] [Project Page] [Code] [Poster] [BibTeX]

2025

Frankentext: Stitching random text fragments into long-form narratives
Chau Minh Pham, Jenna Russell, Dzung Pham, Mohit Iyyer
arXiv 2025
[Paper] [Code] [BibTeX]

CLIPPER: Compression enables long-context synthetic data generation
Chau Minh Pham, Yapei Chang, Mohit Iyyer
COLM 2025
[Paper] [Code] [HuggingFace] [BibTeX]

Can Large Language Models Really Recognize Your Name?
Dzung Pham, Peter Kairouz, Niloofar Mireshghallah, Eugene Bagdasarian, Chau Minh Pham, Amir Houmansadr
arXiv 2025
[Paper] [BibTeX]

OWL: Probing Cross-Lingual Recall of Memorized Texts via World Literature
Alisha Srivastava*, Emir Korukluoglu*, Minh Nhat Le*, Duyen Tran, Chau Minh Pham, Marzena Karpinska, Mohit Iyyer
arXiv 2025
[Paper] [BibTeX]

BEARCUBS: A benchmark for computer-using web agents
Yixiao Song, Katherine Thai, Chau Minh Pham, Yapei Chang, Mazin Nadaf, Mohit Iyyer
COLM 2025
[Paper] [Website] [BibTeX]

Whose story is it? Personalizing story generation by inferring author styles
Nischal Ashok Kumar, Chau Minh Pham, Mohit Iyyer, Andrew Lan
arXiv 2025
[Paper] [BibTeX]

ProxyGPT: Enabling Anonymous Queries in AI Chatbots with (Un)Trustworthy Browser Proxies
Dzung Pham, Jade Sheffey, Chau Minh Pham, Amir Houmansadr
MADWeb 2025
[Paper] [BibTeX]

2024

Interactive Topic Models with Optimal Transport
Garima Dhanania*, Sheshera Mysore*, Chau Minh Pham, Mohit Iyyer, Hamed Zamani, Andrew McCallum
arXiv 2024
[Paper] [BibTeX]

Suri: Multi-constraint Instruction Following for Long-form Text Generation
Chau Minh Pham, Simeng Sun, Mohit Iyyer
EMNLP 2024 (Findings); 6th Workshop on Narrative Understanding
[Paper] [Project Page] [Code] [Poster] [BibTeX]

The prompt report: A systematic survey of prompting techniques
Sander Schulhoff, Michael Ilie, Nishant Balepur, Konstantine Kahadze, Amanda Liu, Chenglei Si, Yinheng Li, Aayush Gupta, HyoJung Han, Sevien Schulhoff, Pranav Sandeep Dulepet, Saurav Vidyadhara, Dayeon Ki, Sweta Agrawal, Chau Pham, Gerson Kroiz, Feileen Li, Hudson Tao, Ashay Srivastava, Hevander Da Costa, Saloni Gupta, Megan L Rogers, Inna Goncearenco, Giuseppe Sarli, Igor Galynker, Denis Peskoff, Marine Carpuat, Jules White, Shyamal Anadkat, Alexander Hoyle, Philip Resnik
arXiv 2024
[Paper] [BibTeX]

TopicGPT: A Prompt-based Topic Modeling Framework
Chau Minh Pham, Alexander Hoyle, Simeng Sun, Philip Resnik, Mohit Iyyer
NAACL 2024
[Paper] [Project Page] [Code] [Poster] [BibTeX]

2023

AHA!: Facilitating AI Impact Assessment by Generating Examples of Harms
Zana Buçinca, Chau Minh Pham, Maurice Jakesch, Marco Tulio Ribeiro, Alexandra Olteanu, Saleema Amershi
arXiv 2023
[Preprint] [BibTeX]

2022

Emotion analysis and detection during COVID-19
Tiberiu Sosea, Chau Pham, Alexander Tekle, Cornelia Caragea, Junyi Jessy Li
LREC 2022
[Paper] [Code] [BibTeX]

2021

Reassessing the constancy of end-to-end internet latency
Lily Davisson*, Joakim Jakovleski*, Nhiem Ngo*, Chau Pham*, Joel Sommers
TMA 2021
[Paper] [BibTeX]

Teaching/Mentoring

UMD College Park

TA - CMSC 848O - S25: Seminar on Long-context Language Models

UMass Amherst

Mentor - 2025: Industry Mentorship Program
Mentor - 2023-24: Early Research Scholars Program
TA - COMPSCI 685 - S24: Advanced Natural Language Processing
TA - COMPSCI 110 - Su23: Foundations of Programming

Colgate University

TA - CS 480A - F21: Natural Language Processing
TA - CS 101 - F19, S20, S21, S22: Introduction to Algorithms and Data Structures

Miscellaneous

I use my middle name, Minh, in publications to distinguish myself from other Chau Phams who are also PhD students. "Minh" means bright and intelligent, while "Châu" refers to a precious pearl. Together, the name "Minh Châu" carries my parents' wishes for me to grow into a gifted and pure human being.

In my free time, I enjoy lifting weights, running, reading, and cooking experimentally.

I am originally from Hanoi, Vietnam 🇻🇳. If you are planning to visit Hanoi, I would recommend checking out Nguyen Phan Que Mai's "Read Your Way Through Hanoi" and trying cháo đậu cà (green pea porridge with marinated fried tofu and fermented Thai eggplants).

Last updated: June 2025.