[Corpora-List] English-Corpora.org: new AI/LLM features for corpus analysis

2 Sep 2025


      At English-Corpora.org, we’ve added new AI/LLM-based tools directly into
the corpus interface, while still keeping the corpus data at the center of
analysis. An overview of the features is available at
https://www.english-corpora.org/ai-llms/.
Using nine different LLMs (like GPT, Gemini, and Claude), users can now do
things such as:
-- semantically cluster and categorize collocates and phrases, such as the
collocates of *identity* or the highly polysemous *bow*, or results for the
phrase *soft* NOUN
-- compare words via collocates, such as *quandary*/*predicament*,
*provoke*/*incite*, or *completely*/*entirely*
-- analyze differences in frequency or collocates across corpus sections,
such as genres, historical periods, or dialects
-- analyze KWIC lines, including semantic prosody, collocates, grammatical
patterns, text types, and pragmatic functions
-- generate words and phrases by topic, translation, or rephrasing -- and
then see their frequency in different sections of the corpus
Users can also:
-- switch easily between LLMs to compare analyses across nine different
models
-- view results in 30 different languages
-- select one of 14 "user profiles" (e.g. linguist, translator, teacher, or
learner), for customized results
-- save, retrieve, and annotate AI results (categorizations, analyses, and
generated words/phrases)
The goal is not to replace careful corpus analysis, but to complement it.
The LLMs can suggest patterns, categories, and comparisons -- but the
underlying corpus data is always visible, so users can verify, adapt, or
challenge the AI output. We hope these tools will be useful for learners,
teachers, researchers, translators, and anyone interested in richer ways of
exploring corpus data.
============================================
Mark Davies
english-corpora.org
mark-davies.org
============================================
