What the Literature Already Knows About Circumlocution

Recall(From the intro post)

WordBridge's H1 claims a continuously listening model can detect circumlocution onset and produce a correct Top-3 lexical candidate without explicit prompting. Before designing anything, it's worth asking: how much of this has already been done, and in what form?

The short answer: there's a real body of work on circumlocution and aphasic speech — but almost all of it operates on finished transcripts, after the fact. Almost none of it operates on a live audio stream while the speaker is still talking. That gap is where WordBridge's actual novelty has to live.

The corpus everyone starts from: AphasiaBank

AphasiaBank is the de facto standard dataset for this kind of work — open-access transcripts and audio from standardized discourse tasks, with a dedicated cohort of anomic aphasia speakers. Critically, it isn't just raw audio: it comes with sentence-level and word-level annotation codes, including:

Empty speech
Circumlocution
Jargon
Agrammatism / paragrammatism
Perseveration
Word-level dysfluency categories

Definition(Semantic paraphasia)

A word-substitution error in which the produced word is semantically related to the intended target but incorrect, e.g., saying "fork" when the target was "spoon." Distinct from circumlocution, where the target word is not produced at all and the speaker describes it instead.

This annotation layer is what makes AphasiaBank usable for supervised work, and it also marks the boundary of what most existing research touches. The codes are attached to finished utterances, and to the words inside them, by human coders reviewing recordings. Nothing about that pipeline runs in real time.
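To make the "usable for supervised work" point concrete, here is a minimal sketch of pulling coded utterances out of a transcript. The `[+ cir]` and `[+ es]` postcode strings are an assumption about the CHAT-style coding convention and should be checked against the AphasiaBank manual; the sample lines are invented for illustration, and `postcodes` and `select` are hypothetical helper names.

```python
import re

# Assumed shape of an utterance-level postcode, e.g. "[+ cir]".
# Verify the actual label set against the AphasiaBank coding manual.
POSTCODE = re.compile(r"\[\+\s*([a-z]+)\]")

def postcodes(utterance: str) -> set[str]:
    """Return the postcode labels attached to one transcript line."""
    return set(POSTCODE.findall(utterance))

def select(utterances: list[str], label: str) -> list[str]:
    """Keep only the utterances carrying the given postcode label."""
    return [u for u in utterances if label in postcodes(u)]

# Invented sample lines, not real AphasiaBank data.
sample = [
    "*PAR: it's the thing you eat soup with . [+ cir]",
    "*PAR: and then the thing and stuff . [+ es]",
    "*PAR: I went to the store .",
]

circumlocutions = select(sample, "cir")
print(circumlocutions)
```

Note that the filter only works on a line that has already been transcribed and coded, which is exactly the offline constraint described above.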

Target-word identification from descriptions

The closest existing work to WordBridge's core retrieval task is a 2024 EMNLP Findings paper on intended target identification for anomia patients via gradient-based selective augmentation. The setup: given a patient's circumlocutory description (for example, "round thing you put food on" standing in for "plate"), predict the target word.

The paper frames two compounding failure modes:

Unseen terms — the vocabulary needed to identify the target may never appear in the description at all
Semantic paraphasia noise — the description itself may contain substituted, incorrect words that have to be filtered rather than trusted
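A toy sketch of the task and of the first failure mode may help. This is not the paper's method: it is a naive word-overlap ranker over a hypothetical four-word lexicon (`LEXICON`, `tokens`, and `top_k` are all made up for illustration), included only to show why "unseen terms" break simple matching.

```python
def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())

# Hypothetical toy lexicon: target word -> descriptive gloss.
LEXICON = {
    "plate": "round flat dish you put food on",
    "spoon": "utensil you eat soup with",
    "clock": "thing on the wall that shows the time",
    "umbrella": "thing you hold over your head in the rain",
}

def top_k(description: str, k: int = 3) -> list[str]:
    """Rank targets by how many words the description shares with each gloss."""
    query = tokens(description)
    scored = [(len(query & tokens(gloss)), word) for word, gloss in LEXICON.items()]
    # Highest overlap first; ties broken alphabetically so output is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:k]]

# Description shares vocabulary with the gloss: the target ranks first.
print(top_k("round thing you put food on"))

# Description shares no vocabulary with any gloss: the ranking is just the tie-break.
print(top_k("keeps me dry outside"))
```

The second query is a perfectly reasonable description of an umbrella, yet every candidate scores zero and the target drops out of the Top-3.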

Their fix is a synthetic-data augmentation method where gradient values control the quality of generated training examples, and gradient variance is used to decide which "relevant but unseen" terms to inject. They validated the approach first on a general Tip-of-the-Tongue retrieval dataset, then applied it to real AphasiaBank patient data (EMNLP 2024 Findings, pp. 10513–10527).

Remark

This is the right task (text in, target word out), but it still operates on a complete description, after the speaker has finished producing it. WordBridge's H2 (temporal advantage) is specifically about whether a model can produce a usable candidate before the description is complete: at 500 ms, 1 s, and 2 s into the hesitation, not after.
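As a rough sketch of what "before the description is complete" means in data terms, assume each word comes with an end timestamp measured from hesitation onset. The `TimedWord` type, the `prefix_at` helper, and the timing values below are all hypothetical, chosen only to illustrate the three cutoffs.

```python
from dataclasses import dataclass

@dataclass
class TimedWord:
    text: str
    end_ms: int  # when the word finished, in ms after hesitation onset

def prefix_at(words: list[TimedWord], cutoff_ms: int) -> str:
    """Return the part of the description fully spoken by cutoff_ms."""
    return " ".join(w.text for w in words if w.end_ms <= cutoff_ms)

# Invented timings for one circumlocutory description.
description = [
    TimedWord("the", 300),
    TimedWord("round", 700),
    TimedWord("thing", 1100),
    TimedWord("for", 1500),
    TimedWord("food", 1900),
]

for cutoff in (500, 1000, 2000):
    print(cutoff, repr(prefix_at(description, cutoff)))
```

At the 500 ms cutoff the model in this example would have a single function word to work with, which is why text alone is unlikely to carry the early windows.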

LLM-based "linguistic fingerprinting"

A separate line of work uses LLMs as feature extractors over AphasiaBank transcripts, building an 8-dimension profile per speaker:

Dimension	What it captures
Vocabulary diversity	Lemma similarity across the sample
Syntactic structure	Sentence construction patterns
Connective usage	How clauses are linked
Word abstraction level	Concrete vs. abstract vocabulary
Vocabulary complexity	CEFR-style complexity bands
Maze type	Categorized false-start / self-correction patterns
Semantic field variability	Topic/domain drift within a sample
False start analysis	Frequency and structure of abandoned utterances

The authors report that contemporary LLMs show real promise for aphasia assessment and group differentiation — useful for clinical profiling, and notably close to the kind of "longitudinal patient profile" WordBridge's Background Model is supposed to maintain (PMC12437502).

Intuition(Where this fits WordBridge)

This is essentially the Background Model's job description, already validated as a standalone task: build a longitudinal linguistic profile from accumulated transcript data. WordBridge doesn't need to invent this part — it needs to run it continuously and feed its output back into a live session instead of producing a static clinical report.
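One way to picture "run it continuously" is an incremental update instead of a one-off report. The sketch below is an assumption-heavy illustration: it supposes each session yields one numeric score per dimension (the cited work does not necessarily produce scores in this form), and `RunningProfile` and the dimension key are hypothetical names.

```python
from collections import defaultdict

class RunningProfile:
    """Running mean per dimension, updated one session at a time.

    Assumes every session reports the same set of dimensions.
    """

    def __init__(self) -> None:
        self.sessions = 0
        self.mean: dict[str, float] = defaultdict(float)

    def update(self, scores: dict[str, float]) -> None:
        """Fold one session's scores into the running means."""
        self.sessions += 1
        for name, value in scores.items():
            # Incremental mean: no need to store past sessions.
            self.mean[name] += (value - self.mean[name]) / self.sessions

profile = RunningProfile()
profile.update({"vocabulary_diversity": 0.4})
profile.update({"vocabulary_diversity": 0.6})
print(profile.sessions, dict(profile.mean))
```

The point of the design is that the profile is always current and cheap to read from inside a live session, instead of being recomputed from the full transcript history.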

Text prediction and QA as communication aids

A third strand — BERT-based text prediction and question-answering models trained on AphasiaBank transcripts — frames itself explicitly as a communication aid: predicting likely next words/phrases and answering questions for patients (Manir et al. 2024, IEEE Access).

This is the closest thing to an existing "assistive tool" in the literature — and it's exactly the query-initiated pattern the intro post argued is the wrong fit. The model waits for typed or spoken input, then predicts. There's no detection step; the user still has to initiate.

The actual gap

Stacking these up:

What exists	What WordBridge needs
Target-word prediction from a complete circumlocutory description	Target-word prediction from a partial, in-progress description, updated as it develops
Longitudinal linguistic profiling from transcripts	The same profiling, running continuously, feeding a live session
Query-initiated text prediction/QA	Detection that there's a query to answer at all — without being asked
Offline annotation of circumlocution onset by human coders	Real-time onset detection from streaming audio (prosody, hesitation, disfluency)

Warning(No existing benchmark at WordBridge's actual timescale)

H2 wants accuracy figures at 500 ms / 1 s / 2 s windows into a hesitation. Nothing in the literature surveyed here reports numbers at that resolution; everything is evaluated on complete utterances. That means there is no baseline to compare against directly, so one of the first concrete deliverables has to be establishing that baseline, probably by re-segmenting AphasiaBank audio at fixed time offsets from annotated circumlocution onset points.
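A minimal sketch of that re-segmentation step, plus the Top-3 metric a baseline would report per window, might look like this. The 16 kHz sample rate, the onset value, and the function names `windows` and `top3_accuracy` are assumptions for illustration, not properties of AphasiaBank.

```python
def windows(onset_ms: int,
            offsets_ms: tuple[int, ...] = (500, 1000, 2000),
            sample_rate: int = 16000) -> dict[int, tuple[int, int]]:
    """Map each offset to a (start_sample, end_sample) clip starting at onset."""
    start = onset_ms * sample_rate // 1000
    return {
        offset: (start, (onset_ms + offset) * sample_rate // 1000)
        for offset in offsets_ms
    }

def top3_accuracy(predictions: list[list[str]], targets: list[str]) -> float:
    """Fraction of items whose target appears among the first three candidates."""
    if not targets or len(predictions) != len(targets):
        raise ValueError("predictions and targets must be non-empty and aligned")
    hits = sum(target in ranked[:3] for ranked, target in zip(predictions, targets))
    return hits / len(targets)

# Hypothetical onset 12 s into a recording.
clips = windows(onset_ms=12000)
print(clips)

# Invented predictions for two items, scored against their targets.
score = top3_accuracy(
    [["plate", "bowl", "dish"], ["fork", "knife", "cup"]],
    ["plate", "spoon"],
)
print(score)
```

Running the same metric once per offset would give the three-number baseline (500 ms, 1 s, 2 s) that H2 needs something to be compared against.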

Summary

The pieces WordBridge needs largely exist as separate, validated tasks: target-word identification from circumlocutory descriptions, LLM-based longitudinal linguistic profiling, and transformer-based communication aids. None of them operate on streaming audio at sub-second resolution, and none of them are evaluated on the "partial description, time-boxed" framing WordBridge's H1/H2 require. The dataset construction work (re-segmenting AphasiaBank around annotated onset points) isn't just prep — it's a contribution in its own right, since no comparable benchmark currently exists.

DATE	Jun 15, 2026
BY	gitcoder89431
TAGS	#aphasia #nlp #research #aphasiabank