Install

Two lines of R.

TALL is distributed on CRAN and GitHub. The entire installation — including the Shiny interface, analytical modules, and on‑demand language models — fits in a standard R package with MIT‑compatible dependencies.

Stable · CRAN recommended

Install from CRAN

The official release, indexed on CRAN and RDocumentation. Versioned, metadata‑rich, MIT‑licensed.

# Install TALL from CRAN
install.packages("tall")

# Launch the Shiny interface
library(tall)
tall()
View on CRAN
Development · GitHub bleeding edge

Install from GitHub

Latest features not yet on CRAN. Requires build tools: Rtools on Windows, Xcode Command Line Tools on macOS.

# Install pak if it is not already available
if (!requireNamespace("pak", quietly = TRUE)) install.packages("pak")

# Install remotes if needed, then the development version from GitHub
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
remotes::install_github("massimoaria/tall")

# Launch the Shiny interface
library(tall)
tall()
View on GitHub
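
Before installing from GitHub, it can help to confirm that a build toolchain is visible to R. The following is a minimal sketch in base R only. Looking for `make` on the PATH is a heuristic assumed here, not a check that TALL itself performs, and the helper name `has_make` is hypothetical.

```r
# Heuristic check (an assumption, not part of TALL): source installs
# usually need `make`, which Rtools and the Xcode Command Line Tools provide.
has_make <- function() {
  # Sys.which() returns "" when the program is not found on the PATH
  unname(nzchar(Sys.which("make")))
}

if (has_make()) {
  message("make found: a build toolchain appears to be available.")
} else {
  message("make not found: install Rtools (Windows) or Xcode Command Line Tools (macOS).")
}
```

This may report a false negative when the toolchain is installed but not on the PATH. The pkgbuild package's `has_build_tools()` performs a more thorough check if that package is installed.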
§ 1 · Requirements

What you need before installing.

R
Runtime

R ≥ 4.2.0

TALL targets modern R versions (4.2.0 or later) and is compatible with the latest R release available from CRAN.

Download R
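
The version requirement can be checked from an R session before installing. This is a small sketch using base R's `getRversion()`; the variable names `min_r` and `r_ok` are illustrative.

```r
# Minimum R version stated in the requirements above
min_r <- "4.2.0"

# getRversion() returns the running R version as a comparable object
r_ok <- getRversion() >= min_r

if (r_ok) {
  message("R ", as.character(getRversion()), " meets the minimum of ", min_r, ".")
} else {
  message("R ", as.character(getRversion()), " is older than ", min_r, "; update R first.")
}
```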
IDE

RStudio (recommended)

A friendly IDE for R — Positron and VS Code with the R extension also work well.

Download RStudio
Network

Internet connection

Needed at install time and for on‑demand language‑model downloads. Once cached, models work offline.

§ 2 · Hardware guidance

Choose the right setup for your corpus.

TALL preprocessing scales linearly with corpus size. Rcpp‑accelerated modules (multi‑word extraction, Reinert clustering, collocation measures) keep things fast at scale. Memory is the practical ceiling, and reference‑vocabulary tasks such as keyness analysis are the primary driver of memory use.

Benchmarks reported on Apple M4 Pro · 48 GB RAM · macOS 15 · R 4.5.2.

Corpus size              | Recommended RAM | Typical hardware             | Runtime
Small (< 10⁵ tokens)     | 1–2 GB          | Standard laptop              | < 10 s preprocessing
Medium (10⁵–10⁶ tokens)  | ~ 5 GB          | Research workstation         | ~ 1 min preprocessing
Large (> 10⁶ tokens)     | 10+ GB          | Server / high‑memory desktop | ~ 7 min · 5.3M tokens
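
Because preprocessing is described as scaling linearly, the large-corpus benchmark (about 7 minutes for 5.3M tokens) can be turned into a rough per-token rate. The sketch below is a back-of-the-envelope extrapolation that assumes strict linearity and applies to the benchmark machine only. `estimate_preprocessing_seconds` is a hypothetical helper, not a TALL function.

```r
# Reference point taken from the benchmark above: ~7 min for 5.3M tokens
bench_seconds <- 7 * 60
bench_tokens  <- 5.3e6

# Assumes strictly linear scaling through the origin (a simplification)
estimate_preprocessing_seconds <- function(n_tokens) {
  stopifnot(is.numeric(n_tokens), all(n_tokens >= 0))
  n_tokens * bench_seconds / bench_tokens
}

# Rough estimates for a 100k-token and a 1M-token corpus
round(estimate_preprocessing_seconds(c(1e5, 1e6)))
```

Under these assumptions the estimates come out at roughly 8 seconds and 79 seconds, in line with the "< 10 s" and "~ 1 min" figures reported for the small and medium tiers.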

For corpora exceeding 1M tokens, institutional Shiny Server deployment is recommended — the same codebase runs unchanged.
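
The tiers above can be expressed as a simple lookup. This is an illustrative sketch only: the thresholds and RAM figures are copied from the table, while the function name `recommend_setup` and its return format are assumptions made for the example.

```r
# Map a token count to the size tiers listed in the table above.
# Thresholds and RAM figures come from that table; everything else
# (name, return shape) is illustrative.
recommend_setup <- function(n_tokens) {
  stopifnot(is.numeric(n_tokens), length(n_tokens) == 1, n_tokens >= 0)
  if (n_tokens < 1e5) {
    list(tier = "Small", ram = "1-2 GB", hardware = "Standard laptop")
  } else if (n_tokens <= 1e6) {
    list(tier = "Medium", ram = "~5 GB", hardware = "Research workstation")
  } else {
    list(tier = "Large", ram = "10+ GB", hardware = "Server / high-memory desktop")
  }
}

# Example: the 5.3M-token benchmark corpus falls in the large tier
recommend_setup(5.3e6)$tier
```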

§ 3 · Adoption

Joining a growing community.

Since its February 2025 CRAN release, TALL has been downloaded more than 4,000 times, with sustained monthly growth and rapid early adoption between April and June 2025. It is currently used in postgraduate and doctoral programs at several Italian universities.

4k+
CRAN downloads
Feb 2025
First public release
Figure: Cumulative CRAN downloads of TALL, February–November 2025 (source: CRANlogs).
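
A cumulative curve like the one in the figure can be approximated from daily download counts. The sketch below assumes the cranlogs package is installed and that its `cran_downloads()` result has `date` and `count` columns. The exact date range is a guess based on the caption, and `cumulative_downloads` is a hypothetical helper rather than the code used to produce the figure.

```r
# Turn daily download counts into a cumulative series.
# Expects a data frame with `date` and `count` columns, the shape
# returned by cranlogs::cran_downloads().
cumulative_downloads <- function(daily) {
  daily <- daily[order(daily$date), ]
  daily$cumulative <- cumsum(daily$count)
  daily
}

# Query the CRAN logs service only in an interactive session with
# cranlogs installed; the dates are assumed from the figure caption.
if (interactive() && requireNamespace("cranlogs", quietly = TRUE)) {
  daily <- cranlogs::cran_downloads("tall", from = "2025-02-01", to = "2025-11-30")
  totals <- cumulative_downloads(daily)
  plot(totals$date, totals$cumulative, type = "l",
       xlab = "Date", ylab = "Cumulative downloads")
}
```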

Ready to get started?

Try one of the built‑in example datasets — BBC News, US Airline Tweets — before loading your own corpus.