Multi-label prediction for political text-as-data
Political scientists increasingly use supervised machine learning to code multiple relevant labels from a single set of texts. The current "best practice" of applying supervised machine learning to each label individually ignores information on inter-label associations, and is likely to underperform as a result. We introduce multi-label prediction as a solution to this problem. After reviewing the multi-label prediction framework, we apply it to code multiple features of (i) access-to-information requests made to the Mexican government and (ii) country-year human rights reports. We find that multi-label prediction outperforms standard supervised learning approaches, even in instances where the correlations among the labels are low.
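As a rough illustration of the contrast drawn in the abstract, the sketch below fits the same base classifier two ways with scikit-learn: once per label independently, and once as a classifier chain, in which each label's model also sees the labels earlier in the chain. This is not the authors' implementation. The classifier chain is only one of several multi-label techniques, chosen here for brevity, and the toy texts, the label names and the chain order are all invented for the example.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

# Hypothetical toy corpus; real applications would use many more documents.
texts = [
    "request for budget records of the health ministry",
    "report describes torture and arbitrary detention",
    "request for contracts and budget of the police",
    "report notes detention of journalists and censorship",
    "request for health statistics",
    "report describes censorship of the press",
]

# Hypothetical label columns: [is_request, mentions_budget, mentions_detention]
Y = np.array([
    [1, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 0],
])

# Bag-of-words features shared by both approaches.
X = TfidfVectorizer().fit_transform(texts)

# Approach 1: one independent classifier per label (ignores label associations).
independent = MultiOutputClassifier(LogisticRegression())
independent.fit(X, Y)
pred_independent = independent.predict(X)

# Approach 2: classifier chain; each label's model receives the features plus
# the labels that precede it in the chain, so label associations can be used.
chain = ClassifierChain(LogisticRegression(), order=[0, 1, 2], random_state=0)
chain.fit(X, Y)
pred_chain = chain.predict(X)

print(pred_independent)
print(pred_chain)
```

In practice the two approaches would be compared on held-out documents rather than on the training texts, and the chain order (or an ensemble over several orders) would itself be a modelling choice.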
| Item Type | Article |
|---|---|
| Keywords | classification,machine learning,multi-label,prediction,text-as-data |
| Departments | Government |
| DOI | 10.1017/pan.2021.15 |
| Date Deposited | 02 Jul 2021 10:09 |
| URI | https://researchonline.lse.ac.uk/id/eprint/110971 |
Accepted Version, available under Creative Commons: Attribution-NonCommercial-No Derivative Works 4.0