quanteda/quanteda: CRAN v4.3.0
Benoit, K., Obeng, A., Paskhalis, T., Watanabe, K., Wang, H., Nulty, P., Müller, S., Lua, J. W., Matsuo, A., Bivand, R., Atria, J. T., Delmarcelle, O., Lowe, W., Barberá, P., Rinker, T., Padgham, M., Gandrud, C., Robitaille, A. L., Chirico, M., Leeper, T. J., Malavin, S., Kearney, M. W., Reuning, K., Hughitt, K., Ucar, I., Baird, J. & Leinweber, K. (2017). quanteda/quanteda: CRAN v4.3.0. [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.596731
Changes and additions

- Added corpus_chunk() for chunking texts into smaller documents.
- Significantly reduced the memory usage of the c() operation on large tokens and tokens_xptr objects.
- Further improved the verbose messages for corpus, tokens, dfm and fcm objects.
- tokens_ngrams() now includes a new argument, apply_if, which functions like the same argument in tokens_compound() and tokens_lookup() (#2390).
- Replaced remove_unigram with match_pattern in object2id() to control the matching of single-word or multi-word patterns.
- data_corpus_inaugural is now updated for Trump 2025.

| Item Type | Dataset |
|---|---|
| Publisher | Zenodo |
| DOI | 10.5281/zenodo.596731 |
| Date made available | 11 January 2017 |
| Resource language | Other |
| Departments | LSE > Academic Departments > Methodology; LSE > Institutes > Data Science Institute |
ORCID: https://orcid.org/0000-0002-0797-564X
ORCID: https://orcid.org/0000-0001-9298-8850