Integrating Human and Machine Coding to Measure Political Issues in Ethnic Newspaper Articles


The voices of racial minority groups have rarely been examined systematically with large-scale text analysis in political science. This study fills that gap by applying an integrated classification framework, combining human and machine coding, to analyze the commonalities and differences in the political issues that appeared in 78,305 articles from Asian American and African American newspapers from the 1960s to the 1980s. The automated text classification shows that Asian American newspapers focused on promoting collective gains more often than African American newspapers did. Conversely, African American newspapers concentrated on preventing collective losses more often than Asian American newspapers did. The content analysis demonstrates that issue priorities varied between the corpora, especially with respect to policy contexts. Gaining access to government resources was a more urgent issue for Asian Americans, while reducing or ending state violence, such as police brutality, was a more pressing matter for African Americans. The content analysis also helped avoid extreme interpretations of the machine coding: the misalignment of political agendas between the two corpora widened by a factor of up to 10 when the training data were measured using the minimum, rather than the maximum, reliability threshold.

Journal of Computational Social Science