Replication Data for: Multi-label Prediction for Political Text-as-Data

Berliner, D., Erlich, A., Dantas, S., Bagozzi, B. & Palmer-Rubin, B. (2021). Replication Data for: Multi-label Prediction for Political Text-as-Data. [Dataset]. Harvard Dataverse. https://doi.org/10.7910/dvn/sovpa4

Political scientists increasingly use supervised machine learning to code multiple relevant labels from a single set of texts. The current "best practice" of individually applying supervised machine learning to each label ignores information on inter-label association(s), and is likely to under-perform as a result. We introduce multi-label prediction as a solution to this problem. After reviewing the multi-label prediction framework, we apply it to code multiple features of (i) access to information requests made to the Mexican government and (ii) country-year human rights reports. We find that multi-label prediction outperforms standard supervised learning approaches, even in instances where the correlations among one's multiple labels are low. This repository replicates the figures and tables in the article and appendix. More information can be found in the "README.md" file. (2021-02-12)

Item Type	Dataset
Publisher	Harvard Dataverse
DOI	10.7910/dvn/sovpa4
Date made available	1 April 2021
Keywords	Computer and Information Science, Social Sciences
Resource language	Other
Departments	LSE

Explore Further

Berliner, Daniel

Erlich, A., Dantas, S. G., Bagozzi, B. E., Berliner, D. & Palmer-Rubin, B. (2022). Multi-label prediction for political text-as-data. Political Analysis, 30(4), 463–480. https://doi.org/10.1017/pan.2021.15 (Repository Output)

Available at: 10.7910/dvn/sovpa4

Access level: Open

Licence: CC0 1.0
