quanteda/quanteda: CRAN v4.3.0

Benoit, K., Watanabe, K., Wang, H., Nulty, P., Müller, S., Lua, J. W., Matsuo, A., Bivand, R., Atria, J. T., Delmarcelle, O., Lowe, W., Barberá, P., Rinker, T., Padgham, M., Gandrud, C., Robitaille, A. L., Chirico, M., Leeper, T. J., Malavin, S., Kearney, M. W., Reuning, K., Hughitt, K., Ucar, I., Baird, J. & Leinweber, K. (2017). quanteda/quanteda: CRAN v4.3.0. [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.596731

Changes and additions
- Added corpus_chunk() for chunking texts into smaller documents.
- Significantly reduced the memory usage of the c() operation on large tokens and tokens_xptr objects.
- Further improved the verbose messages for corpus, tokens, dfm and fcm objects.
- tokens_ngrams() now includes a new argument apply_if, which functions similarly to the same argument in tokens_compound() and tokens_lookup() (#2390).
- Replaced remove_unigram with match_pattern in object2id() to control whether single-word or multi-word patterns are matched.
- data_corpus_inaugural is now updated for Trump 2025.

Item Type	Dataset
Publisher	Zenodo
DOI	10.5281/zenodo.596731
Date made available	11 January 2017
Resource language	Other
Departments	LSE > Academic Departments > Methodology; LSE > Institutes > Data Science Institute

Explore Further

Available at: 10.5281/zenodo.596731

Access level: Open

Licence: Creative Commons: GNU GPL 3.0
