quanteda/quanteda: CRAN v4.3.0

Benoit, K., Obeng, A., Paskhalis, T., Watanabe, K., Wang, H., Nulty, P., Müller, S., Lua, J. W., Matsuo, A., Bivand, R., Atria, J. T., Delmarcelle, O., Lowe, W., Barberá, P., Rinker, T., Padgham, M., Gandrud, C., Robitaille, A. L., Chirico, M., Leeper, T. J., Malavin, S., Kearney, M. W., Reuning, K., Hughitt, K., Ucar, I., Baird, J. & Leinweber, K. (2017). quanteda/quanteda: CRAN v4.3.0. [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.596731

Changes and additions:

- Added corpus_chunk() for chunking texts into smaller documents.
- Significantly reduced the memory usage of the c() operation on large tokens and tokens_xptr objects.
- Further improved the verbose messages for corpus, tokens, dfm and fcm objects.
- tokens_ngrams() now includes a new argument, apply_if, which functions similarly to the same argument in tokens_compound() and tokens_lookup() (#2390).
- Replaced remove_unigram with match_pattern in object2id() to control the matching of single-word or multi-word patterns.
- data_corpus_inaugural is now updated for Trump 2025.

Available at: 10.5281/zenodo.596731

Access level: Open

Licence: Creative Commons: GNU GPL 3.0


