New paper: Schwa density as a phonological register classifier (pre-registered, four-corpus replication) - Corpora

16 Apr 2026


Dear colleagues,

I'd like to share a new preprint on single-feature register
  classification in English text:
"Schwa Density as a Phonological Stylistic Classifier: Primary
  Stylistic, Secondary Modality -- A Four-Corpus Pre-Registered
  Replication"
Preprint: https://ling.auf.net/lingbuzz/009926/current.pdf?_s=WPGovroKhmABLC0P
Materials/code: https://github.com/kylegtownsend-collab/schwa-density-spgc
Paper site: https://papers.letsharkness.com/schwa-density/

The paper tests whether schwa density -- the proportion of vowel
  phones in a text that are unstressed schwa (CMUdict AH0) -- can
  serve as a phonologically motivated single-feature register
  classifier. A pre-registered confirmatory plan was applied to NLTK
  multi-source (N=164) and the Standardized Project Gutenberg Corpus
  (N=2,767), with sensitivity analyses on Brown (N=313) and OANC
  (N=4,375).
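
For readers who want a concrete picture of the measure, here is a minimal sketch of the definition above: count the vowel phones in a text's pronunciations and take the share that are AH0. It relies on the CMUdict convention that vowel phones carry a trailing stress digit. The names `TOY_LEXICON`, `schwa_density` and `density_for_text` are illustrative assumptions, not the repository's analyser; the toy lexicon stands in for CMUdict, and out-of-lexicon words are simply skipped here, whereas the paper uses a G2P fallback.

```python
from typing import Dict, Iterable, List

# Hypothetical four-word stand-in for CMUdict (ARPAbet phones with stress digits).
TOY_LEXICON: Dict[str, List[str]] = {
    "the": ["DH", "AH0"],
    "banana": ["B", "AH0", "N", "AE1", "N", "AH0"],
    "is": ["IH1", "Z"],
    "ripe": ["R", "AY1", "P"],
}


def is_vowel(phone: str) -> bool:
    # In CMUdict, only vowel phones end in a stress digit (0, 1 or 2).
    return phone[-1].isdigit()


def schwa_density(pronunciations: Iterable[List[str]]) -> float:
    """Proportion of vowel phones that are unstressed schwa (AH0)."""
    vowels = 0
    schwas = 0
    for phones in pronunciations:
        for phone in phones:
            if is_vowel(phone):
                vowels += 1
                if phone == "AH0":
                    schwas += 1
    if vowels == 0:
        raise ValueError("no vowel phones found")
    return schwas / vowels


def density_for_text(text: str, lexicon: Dict[str, List[str]]) -> float:
    # Assumption: whitespace tokenisation and silent skipping of unknown
    # words; the paper's pipeline handles unknown words with a G2P fallback.
    tokens = text.lower().split()
    return schwa_density(lexicon[t] for t in tokens if t in lexicon)


print(density_for_text("the banana is ripe", TOY_LEXICON))
```

In the toy sentence, three of the six vowel phones are AH0. Which pronunciation variant is chosen for words with several CMUdict entries is a design decision this sketch does not address.
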
Headline findings:
- Schwa density matches or exceeds Flesch-Kincaid as a register
     discriminator on both pre-registered corpora (NLTK multi-source
     and SPGC).
- A function-word ablation (masking the 198 NLTK English stopwords
     before computing schwa density) preserves or amplifies register
     discrimination on all four corpora (eta^2 retention 0.93-1.27),
     ruling out stopword frequency as a confound.
- The ablation operationalises a two-regime finding: schwa density
     functions as a Primary Stylistic Feature on within-prose
     variation (NLTK, SPGC, Brown) and a Secondary Modality Feature on
     speech-versus-writing variation (OANC).
- Joint partial-eta^2 retains 46-53% of the register signal on the
     pre-registered corpora after controlling jointly for syllables
     per word, mean word length, and Latinate ratio.
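
To make the retention figures in the ablation finding easier to read, here is a rough sketch of one way such a ratio could be computed: one-way eta^2 (between-register sum of squares over total sum of squares) on per-text schwa density, before and after stopword masking, with retention taken as the masked value divided by the unmasked value. That ratio definition is an assumption on my part for illustration; the repository scripts are the authority. The function names and the toy numbers are hypothetical and are not results from the paper.

```python
from typing import Dict, List


def eta_squared(groups: Dict[str, List[float]]) -> float:
    """One-way eta^2: between-group sum of squares / total sum of squares."""
    values = [v for scores in groups.values() for v in scores]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("no variance in the data")
    ss_between = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return ss_between / ss_total


def eta_retention(full: Dict[str, List[float]],
                  masked: Dict[str, List[float]]) -> float:
    # Assumed definition: eta^2 after masking divided by eta^2 before.
    # Values above 1 would mean masking sharpened register separation.
    return eta_squared(masked) / eta_squared(full)


# Made-up per-text schwa densities, keyed by register (illustration only).
full = {"news": [0.20, 0.22], "fiction": [0.30, 0.32]}
masked = {"news": [0.10, 0.12], "fiction": [0.20, 0.22]}

print(round(eta_squared(full), 3))
print(round(eta_retention(full, masked), 3))
```

In this toy case masking shifts every text by the same amount, so the register separation is unchanged and retention comes out at about 1.
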
The pre-registration, deviation log, analyser, ablation and
  G2P-fallback scripts, per-corpus feature tables, and
  figure-generation code are all openly available in the repository
  (MIT / CC-BY-4.0).
Comments and criticisms welcome.
Thanks,
Kyle Townsend
Independent
ktownsend@spfk12.org