Public datasets
I develop and share public datasets to strengthen the empirical foundations of social science and expand access to knowledge for the public good.
- Validated Names for Experimental Studies on Ethnicity and Race
- Summary: A large and fast-growing number of studies across the social sciences use names in experiments to signal race. However, names may also convey other perceived attributes such as socioeconomic status and citizenship. This dataset provides the largest collection of validated name perceptions to date, based on three U.S. surveys. It includes 44,170 evaluations from 4,026 respondents for 600 names, with ratings of perceived race, income, education, and citizenship, along with respondent characteristics. These data are designed to help researchers isolate the causal effects of race in experimental research.
- Use case: Featured in “Validated names for experimental studies on race and ethnicity”, Nature Scientific Data (2023), with Charles Crabtree, S. Michael Gaddis, John B. Holbein, Cameron Guage, and William Marx.
- Collaborators: Charles Crabtree, S. Michael Gaddis, John B. Holbein, Cameron Guage, and William Marx
- Linked Fate Literature Review Dataset (1999–2019)
- Summary: A hand-coded dataset of 160 studies on linked fate published between 1999 and 2019. It captures definitions, measurement strategies, theoretical framings, and the populations studied in political science and related disciplines.
- Use case: Used in “Rewiring Linked Fate: Bringing Back History, Agency, and Power”, Perspectives on Politics (2023), with Reuel Rogers.
- Asian American and Latino Advocacy and Community Service Organizations Dataset (1868–2016)
- Summary: A historical dataset documenting over 600 nonprofit organizations serving Asian American and Latino communities in the U.S. from 1868 to 2016. It includes founding years, organizational types (advocacy, service, hybrid), and panethnic status. The dataset captures both early ethnic organizations and those that evolved into panethnic entities, such as the National Council of La Raza (NCLR). Validation relied on organization websites, news archives, and direct outreach, with founding dates confirmed for 77% of Asian American and 71% of Latino organizations.
- Use case: Used in “How Other Minorities Gained Access: The War on Poverty and Asian American and Latino Community Organizing”, Political Research Quarterly (2021).