Four worked examples drawn from the paper and supplementary materials: thematic analysis of news, sentiment detection on social media, science mapping of academic literature, and topic modeling of encyclopedic text. All four are executed end‑to‑end in TALL, without writing code.
This example uses 386 short news articles from the BBC News corpus, retrieved from the BBC website (Greene & Cunningham, 2005). After loading, TALL applies standard pre‑processing (tokenisation and lemmatisation via the EN GUM model), retaining adjectives, nouns, proper nouns, and verbs.
Built‑in BBC News dataset loaded; TALL AI context is populated automatically.

EN GUM model selected. Adjectives, nouns, proper nouns, verbs retained.

RAKE extracts salient collocations; manual curation preserves domain phrases.

Descriptive statistics, lexical richness, and TF‑IDF for discriminating terms.

Concordance view surfaces co‑text environments for salient terms.

LDA recovers thematic structure across the full news collection.
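
For readers who want to see what the TF‑IDF step in the workflow above computes, here is a minimal Python sketch. It is not TALL's implementation: the weighting (relative term frequency times the natural log of inverse document frequency), the function name `tf_idf`, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenised document."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Toy, already-lemmatised documents (invented for illustration).
docs = [
    ["film", "award", "actor", "film"],
    ["band", "album", "award", "chart"],
    ["film", "director", "festival"],
]
scores = tf_idf(docs)
# Highest-scoring term in each document.
top_terms = [max(s, key=s.get) for s in scores]
```

Terms that occur in every document receive a score of zero under this weighting, which is why TF‑IDF favours terms that discriminate between documents over terms that are merely frequent.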

"The entertainment news focuses heavily on film awards and the movie industry. Music‑related news forms a significant portion of the content — organised around performance, industry and sales."
14,640 tweets directed at six major US airlines during February 2015 (Figure Eight / Kaggle). TALL detects platform‑specific entities — hashtags, mentions, emojis — and runs polarity detection via the Hu‑Liu opinion lexicon.
Built‑in corpus loaded; TALL AI receives pre‑populated dataset context.

EN EWT model handles social‑media text. Hashtags, mentions, emojis retained as entities.

Automatic tagging of hashtags, mentions, emojis with frequency ranking.

Concordance for @AmericanAirlines reveals customer‑service discourse patterns.

Hu‑Liu lexicon · donut chart with five levels from Very Negative to Very Positive.

Automated key takeaways & managerial implications.
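
The lexicon‑based polarity step above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not TALL's scoring rule: the two word sets are tiny stand‑ins for the Hu‑Liu lexicon (which has several thousand entries), and the score formula, the cut‑offs, and the names of the three middle levels are hypothetical.

```python
# Tiny stand-in word lists (assumption); not the actual Hu-Liu entries.
POSITIVE = {"thanks", "great", "helpful", "love"}
NEGATIVE = {"delay", "cancel", "lost", "rude"}

def polarity(tokens):
    """Score in [-1, 1]: (positives - negatives) / matched words; 0.0 if none match."""
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    matched = pos + neg
    return (pos - neg) / matched if matched else 0.0

def label(score):
    """Map a score to five levels; the cut-offs are illustrative assumptions."""
    if score <= -0.6:
        return "Very Negative"
    if score < -0.2:
        return "Negative"
    if score <= 0.2:
        return "Neutral"
    if score < 0.6:
        return "Positive"
    return "Very Positive"

# Invented, already-lemmatised tweets.
tweets = [
    ["flight", "delay", "then", "cancel", "rude", "staff"],
    ["thanks", "for", "the", "helpful", "agent"],
    ["delay", "but", "great", "crew"],
]
labels = [label(polarity(t)) for t in tweets]
```

Counting the labels across a corpus gives the proportions that a five‑level chart would display.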

"The dominance of delay and cancel in negative tweets suggests airlines should focus on improving their processes for handling flight disruptions. Both positive and negative tweets highlight the importance of customer service — the presence of gratitude‑related terms in positive sentiment indicates that effective customer service responses can mitigate dissatisfaction and generate positive outcomes."
TALL natively ingests biblioshiny files from the bibliometrix package. Scientific abstracts flow into the same pre‑processing pipeline — multilingual tokenisation, lemmatisation, co‑word analysis — and can be layered with citation‑based metrics for hybrid semantic/bibliometric analysis.
A natural fit for systematic literature reviews, conceptual‑structure analyses, and policy‑oriented research synthesis.
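
As a rough illustration of the co‑word analysis mentioned above, the following Python sketch counts how often pairs of terms appear in the same abstract. It assumes document‑level co‑occurrence on already‑lemmatised tokens; the function name `co_word_counts` and the sample abstracts are hypothetical, and TALL's own network construction may differ.

```python
from collections import Counter
from itertools import combinations

def co_word_counts(docs):
    """Count how many documents each unordered term pair co-occurs in."""
    pairs = Counter()
    for doc in docs:
        # Deduplicate and sort so each pair is counted once per document
        # and always stored in the same order.
        for a, b in combinations(sorted(set(doc)), 2):
            pairs[(a, b)] += 1
    return pairs

# Invented keyword lists standing in for pre-processed abstracts.
abstracts = [
    ["text", "mining", "bibliometrics"],
    ["text", "mining", "topic"],
    ["bibliometrics", "citation"],
]
pairs = co_word_counts(abstracts)
strongest = pairs.most_common(1)[0]
```

The resulting pair counts can serve as edge weights in a term network, where densely connected groups of terms suggest the conceptual structure of the literature.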

Figure · Bibliometrix: Biblioshiny ingestion, corpus filtering, co‑word analysis, and TALL AI interpretation of the emerging conceptual structure.

Figure · Wikipedia: An encyclopedic corpus processed through import, overview, multi‑word extraction, co‑word analysis, and topic modeling, ending with TALL AI's topic interpretation.
An encyclopedic corpus pushes TALL through the full analytical narrative: from basic frequency and TF‑IDF to multi‑word extraction, co‑word networks, and finally LDA topic modeling with TALL AI producing concise narrative labels for each topic.
Ideal for digital humanities, educational‑content audits, and cross‑linguistic cultural analysis given TALL's 56‑language coverage.
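
The multi‑word extraction step can be illustrated with a simplified version of RAKE, the method named in the BBC News example. The sketch below follows the basic RAKE idea: split text into candidate phrases at stopwords and punctuation, score each word by its degree divided by its frequency, and score each phrase as the sum of its word scores. The stopword list is a tiny assumption, the function names are hypothetical, and TALL's implementation may differ in its details.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (assumption); real runs use a much longer one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "for", "to", "on"}

def candidate_phrases(text):
    """Split text into runs of content words, breaking at stopwords and punctuation."""
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases

def rake(text):
    """Rank candidate phrases by the sum of their words' degree/frequency ratios."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, including the word itself.
            degree[word] += len(phrase)
    scores = {p: sum(degree[w] / freq[w] for w in p) for p in set(phrases)}
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

text = "Topic modeling of encyclopedic text is a task for natural language processing."
ranked = rake(text)
```

Longer phrases made of words that rarely occur alone rise to the top of the ranking, which is why the output usually benefits from manual curation afterwards.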
Large‑scale thematic and sentiment analyses of institutional reports, parliamentary proceedings, and online debates.
Multilingual architecture for cross‑linguistic corpus studies and computational cultural analysis across 56 languages.
Integration with bibliometrix combines semantic and citation‑based analyses for science mapping and literature reviews.
Analyses of patient feedback, clinical records and satisfaction surveys (Valentino et al., 2025).
Currently used in postgraduate and doctoral programs at several Italian universities for teaching data‑driven text methods.
Sentiment tracking and theme discovery across customer feedback, reviews, and social media at scale.