Validated Names Selection

A database of 600 names with 44,000+ evaluations for experimental studies on race and ethnicity.

About this dataset & Field Descriptions

Background: Many experimental studies use names to signal race. However, names often bundle other signals, such as class or citizenship. This dataset provides validated name perceptions to help researchers isolate racial signals. It is described in Crabtree, C., Kim, J.Y., Gaddis, S.M., Holbein, J.B., Guage, C. & Marx, W.W. (2023). Validated names for experimental studies on race and ethnicity. Sci Data 10, 130.

Methodology: Data was collected from three surveys (N=4,026 respondents) in the US. Respondents evaluated 600 names on race, citizenship, income, and education. The dataset includes over 44,170 individual name evaluations.

R Package: This tool is a web interface for the validatednamesr R package.


Data Dictionary:

  • Identity: The intended racial identity of the name, based on US Census prevalence data (e.g., names ≥90% used by a specific group).
  • Pr(Correct) / Correct Probability: The proportion of respondents who correctly identified the name's intended racial group. Higher values indicate a stronger racial signal.
  • Avg Income: Perceived income level (1=Low <$40k, 2=Middle, 3=High >$120k).
  • Avg Education: Perceived education level (1=High School, 2=Bachelor's, 3=Master's, 4=PhD).
  • Avg Citizenship: The probability (0-1) that respondents perceived the person as a US citizen.
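
To make the fields above concrete, here is a minimal sketch of how per-name summaries of this kind could be derived from individual ratings. It is an illustration under stated assumptions, not the authors' processing pipeline: the tuple layout, the placeholder names, and all values are invented, and it assumes each respondent gives a yes/no (1/0) answer on citizenship.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw evaluations, one tuple per respondent rating:
# (name, intended identity, perceived race, income 1-3, education 1-4, citizen 0/1).
# Names and values are invented for illustration; this is not the dataset's schema.
evaluations = [
    ("Name A", "Black", "Black", 1, 2, 1),
    ("Name A", "Black", "Black", 2, 2, 1),
    ("Name A", "Black", "Black", 2, 3, 1),
    ("Name A", "Black", "White", 3, 1, 0),
    ("Name B", "Hispanic", "Hispanic", 1, 1, 0),
    ("Name B", "Hispanic", "White", 2, 2, 1),
]


def summarize(rows):
    """Aggregate individual ratings into one summary record per name."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row)

    summary = {}
    for name, ratings in grouped.items():
        identity = ratings[0][1]
        summary[name] = {
            "identity": identity,
            # Share of respondents whose perceived race matches the intended identity.
            "pr_correct": mean([1 if r[2] == identity else 0 for r in ratings]),
            "avg_income": mean([r[3] for r in ratings]),
            "avg_education": mean([r[4] for r in ratings]),
            # Mean of 0/1 answers, i.e. the share who perceived a US citizen.
            "avg_citizenship": mean([r[5] for r in ratings]),
        }
    return summary


summary = summarize(evaluations)
for name, stats in summary.items():
    print(name, stats)
```

In this toy input, three of the four respondents rating "Name A" pick the intended group, so its Pr(Correct) is 0.75.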

Potential Use Cases

  • Isolating Racial Signals: Select names that vary by race but have similar perceived income, education, and citizenship to conduct rigorous tests of racial discrimination (controlling for confounding variables).
  • Bundled Treatments: Intentionally select names that signal both race and specific socioeconomic classes to study intersectional biases.
  • Audit Studies: Use validated names to signal attributes (like citizenship or class) in contexts where explicitly stating them is difficult (e.g., resume or email audits).
  • Robustness Checks: Select multiple names per racial group to ensure findings are not driven by unique features of a single name.
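
The first and last use cases above can be combined in a simple selection routine. This is a rough sketch of one possible heuristic, not a procedure taken from the paper or from the validatednamesr package. The function `select_matched_names`, the field names, the thresholds, and every row of data are hypothetical. It keeps names with a strong racial signal whose perceived income, education, and citizenship fall within a tolerance of a caller-supplied target profile, then takes several names per group.

```python
# Hypothetical name-level rows mirroring the fields in the data dictionary.
# Names and values are invented for illustration; they are not from the dataset.
names = [
    {"name": "Name A", "identity": "White", "pr_correct": 0.92,
     "avg_income": 2.1, "avg_education": 2.0, "avg_citizenship": 0.95},
    {"name": "Name B", "identity": "White", "pr_correct": 0.88,
     "avg_income": 2.0, "avg_education": 2.1, "avg_citizenship": 0.93},
    {"name": "Name C", "identity": "White", "pr_correct": 0.95,
     "avg_income": 2.8, "avg_education": 3.0, "avg_citizenship": 0.97},
    {"name": "Name D", "identity": "Black", "pr_correct": 0.90,
     "avg_income": 2.0, "avg_education": 1.9, "avg_citizenship": 0.94},
    {"name": "Name E", "identity": "Black", "pr_correct": 0.85,
     "avg_income": 2.2, "avg_education": 2.1, "avg_citizenship": 0.92},
    {"name": "Name F", "identity": "Black", "pr_correct": 0.60,
     "avg_income": 2.0, "avg_education": 2.0, "avg_citizenship": 0.95},
]

COVARIATES = ("avg_income", "avg_education", "avg_citizenship")


def select_matched_names(rows, target, tolerance, min_pr_correct=0.8, per_group=2):
    """Pick up to `per_group` names per identity that match the target profile."""
    selected = {}
    # Visit names with the strongest racial signal first.
    for row in sorted(rows, key=lambda r: r["pr_correct"], reverse=True):
        if row["pr_correct"] < min_pr_correct:
            continue  # racial signal too weak
        if any(abs(row[c] - target[c]) > tolerance[c] for c in COVARIATES):
            continue  # perceived class or citizenship too far from the target
        picks = selected.setdefault(row["identity"], [])
        if len(picks) < per_group:
            picks.append(row["name"])
    return selected


# Illustrative target profile and tolerances (assumptions, not recommendations).
target = {"avg_income": 2.0, "avg_education": 2.0, "avg_citizenship": 0.95}
tolerance = {"avg_income": 0.25, "avg_education": 0.25, "avg_citizenship": 0.05}

matched = select_matched_names(names, target, tolerance)
print(matched)
```

With these made-up rows, "Name C" is dropped because its perceived income and education are well above the target, and "Name F" is dropped for its weak racial signal. Two names remain per group, which supports the robustness check of not relying on a single name.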

Results

Name | Identity | Pr(Correct) | Avg Income | Avg Education | Avg Citizenship