Polysemy, safety and epistemic risks in AI discourse

In last week’s post I traced some connections between an old paper on ambiguity and polysemy that a colleague and I published in 2001, when generative AI and LLMs were not yet on the horizon, and a 2026 paper linking ambiguity and polysemy in modern AI discourse to hype and manipulation, power and ethics.

In our 2001 article, we argued that ‘live’ polysemy is a functional, communicative tool, rather than merely a fixed dictionary phenomenon. We proposed that deliberate and productive ambiguity was a pragmatic and rhetorical strategy used to facilitate social interaction and negotiate meaning in conversation, especially through puns, jokes and, of course, metaphors.

Tracing connections between the old paper and the new made me curious. I wondered whether there were other papers out there exploring AI discourse that directly cite our old paper or are adjacent to it in some way. A search revealed three other papers of this type published between 2022 and 2026. Where the paper discussed in the previous post focused on ethics, these papers focus on the philosophy of language, AI safety and epistemic risks.

The first paper I found was a 2022 philosophical paper that engages with our paper in a computational/AI-adjacent context, namely scientific language. The second one was a 2024 paper that sits squarely in the AI safety/mechanistic interpretability world and quotes our paper on the inherent polysemantic nature of language. The third one was a 2026 AI-related paper on ‘ambiguity collapse’ that mirrors our paper but doesn’t quote it. Where we argued that people exploit polysemy productively in conversation, this paper argues that LLMs destroy that productive polysemy and that this poses epistemic risks.

The papers all circle around the issue/problem/opportunity of words having multiple meanings, both in the lexicon (think of the verb ‘get’, for example, which has 102 meanings according to the Oxford English Dictionary – polysemy) and in language use (think of somebody saying “I missed her”, meaning either that they are sad about her absence, that they mistimed a rendezvous, or that they didn’t hit her – ambiguity).

In the following, I’ll summarise these papers and come to some conclusions regarding AI discourse based on my analysis of the 2026 paper discussed in the previous post and the three papers published between 2022 and 2026 – two decades after our 2001 paper on ‘ambiguities we live by’.

Ambiguity and the philosophy of scientific language

In 2022 Beckett Sterner published a paper in the philosophical journal Synthese entitled “Explaining ambiguity in scientific language”. It sits at the intersection of data science and the philosophy of language and builds on the terrain that our 2001 paper opened up.

The article addresses a debate that is relevant to AI and data science. Efforts to make scientific publications and data intelligible to computers generally assume that accommodating multiple meanings for words (polysemy) undermines reasoning and communication, echoing long-standing debates about literalness and metaphor in philosophical language. Sterner argues that this assumption has been contested by historians, philosophers, and social scientists who have applied qualitative methods to demonstrate the generative and strategic value of productive polysemy.

Sterner also points out that results from linguistics have shown how polysemy can actually improve the efficiency of human communication, quoting a 2012 paper by Steven T. Piantadosi on the communicative function of polysemy in language. That paper is also quoted in the next article I’ll discuss below.

Sterner refers to our 2001 paper specifically for the role of polysemy in humour and irony and goes on to build a “contextual pluralist” typology of when ambiguity is productive versus harmful in scientific contexts. Polysemy does both substantial positive and negative work in science, but its utility is context-sensitive in ways that are often overlooked by prescriptive rules against using polysemic terms, especially in highly computationally intensive settings. This maps quite closely onto our argument that context, not a blanket rule of disambiguation, governs polysemy’s value.

Ambiguity and mechanistic interpretability

In 2024 Matthew A. Clarke, Hardik Bhatnagar and Joseph Bloom published a paper on LessWrong entitled “Compositionality and Ambiguity: Latent Co-occurrence and Interpretable Subspaces”.

Researchers studying AI safety want to understand what is happening inside large language models. One promising tool for this is something called a Sparse Autoencoder (SAE). This is essentially a device that tries to decompose an LLM’s internal activity into simple, interpretable ‘features’, preferably one concept per feature (for example, one feature for “Monday”, another for “France”).
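For readers who like to see the idea in code, here is a minimal toy sketch in Python of what a sparse autoencoder does in this setting. It is not the authors’ implementation; the class name, layer sizes and training note are my own illustrative assumptions.

```python
# Toy sketch of a sparse autoencoder (SAE): it expands a model's hidden
# activation vector into a much larger set of sparse "features" and then
# reconstructs the original activation from those features.
import torch
import torch.nn as nn

class ToySparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 512, n_features: int = 4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)   # activation -> feature strengths
        self.decoder = nn.Linear(n_features, d_model)   # feature strengths -> reconstruction

    def forward(self, activation: torch.Tensor):
        features = torch.relu(self.encoder(activation))  # most entries should end up at zero
        reconstruction = self.decoder(features)
        return features, reconstruction

# Training (not shown) would minimise reconstruction error plus a sparsity
# penalty on `features`, nudging the SAE to explain each activation with only
# a few active features -- ideally one per concept ("Monday", "France", ...).
```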

Ideally, each feature would fire independently. However, Clarke et al. show that some features consistently activate together, forming clusters. Mapping out these clusters, they find two interesting reasons why this might happen. First, meaning is sometimes built from combinations: the model has features for “some of” and “all of”, and when both fire together they represent intermediate quantities like “many of” or “almost all of.” The more one feature dominates the other, the closer the meaning shifts toward that extreme. Second, when a word could mean different things, multiple features activate simultaneously, each representing a different possible meaning. The word “how” is a good example. It can mean “in what way” or “to what degree”, and the mix of active features reflects the model’s uncertainty about which meaning applies.

A major concern in AI safety research is that models can be ‘overconfident’, picking one interpretation of an ambiguous instruction and running with it, rather than recognising that multiple meanings are in play. The finding that clusters of co-occurring features seem to encode uncertainty is potentially important here. It suggests there might be a way to measure how uncertain a model is about the meaning of a given input, by looking at which features are active and in what proportions. That is more informative than simply asking the model whether it is confident and getting a yes or no answer.
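To make that idea concrete, here is a purely hypothetical sketch, assuming one were to treat the activation strengths of co-occurring features as a probability mix over candidate meanings. It is not a measure proposed by Clarke et al.; the function name and the example numbers are invented for illustration.

```python
# Hypothetical "ambiguity score": normalised entropy over the activation
# strengths of a cluster of co-occurring features, each taken to stand for
# one candidate meaning of the input.
import numpy as np

def ambiguity_score(feature_activations: np.ndarray) -> float:
    """0.0 = one feature dominates (one meaning settled on);
    1.0 = activations evenly spread (maximal uncertainty between meanings)."""
    active = feature_activations[feature_activations > 0]
    if active.size <= 1:
        return 0.0
    p = active / active.sum()                     # treat strengths as a probability mix
    entropy = -(p * np.log(p)).sum()
    return float(entropy / np.log(active.size))   # normalise to the range [0, 1]

# Example: "how" as "in what way" vs "to what degree"
print(ambiguity_score(np.array([0.9, 0.1])))  # ~0.47: leaning towards one meaning
print(ambiguity_score(np.array([0.5, 0.5])))  # 1.0: genuinely ambiguous
```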

This echoes the argument we made in our 2001 paper, that hearers and speakers hold multiple meanings simultaneously, and that this co-activation of senses is not a failure of disambiguation but a feature of communication. This is, indeed, the very mechanism of wit, irony, and social bonding. It seems that Clarke et al. found, inside the mathematics of a neural network, that the model does something structurally analogous: it doesn’t pick one meaning and discard the others, it holds them in a weighted blend. Ambiguity is not a bug to be engineered away, but a structural feature of how language works, something that needs to be appreciated to make a model safe and aligned.

Ambiguity collapse and LLMs

In 2026, a month after LaCroix et al. published their arXiv paper discussed in my previous post, Shira Gur-Arieh, Angelina Wang and Sina Fazelpour published their arXiv paper entitled “Ambiguity collapse by LLMs: A taxonomy of epistemic risks”. Going beyond Clarke et al., they point out that LLMs are increasingly used to “make sense of ambiguous, open-textured, value-laden terms”. They then explore “a phenomenon that occurs when an LLM encounters a term that genuinely admits multiple legitimate interpretations, yet produces a singular resolution, in ways that bypass the human practices through which meaning is ordinarily negotiated, contested, and justified”.

Unlike the LaCroix et al. paper, this one doesn’t cite our paper, but its central argument is essentially the mirror image of ours, applied to AI systems. Where we argued that people exploit polysemy productively in conversation, Gur-Arieh et al. argue that LLMs destroy that productive polysemy in a dangerous way.

And where LaCroix et al. introduced the new term ‘glosslighting’, that is, the manipulative exploitation of ambiguity, Gur-Arieh et al. introduce the term ‘ambiguity collapse’ – a sort of disambiguation on steroids.

Drawing on interdisciplinary accounts of ambiguity as a productive epistemic resource, the authors develop a taxonomy of epistemic risks at three levels: process (foreclosing opportunities to deliberate and shape contested terms), output (distorting concepts agents act upon), and ecosystem (reshaping shared vocabularies and how concepts evolve over time).

This is, in a way, the mirror image of the argument we make in our paper: where we tried to show that humans need to hold multiple meanings simultaneously in order to sustain language and social life, Gur-Arieh et al. try to show that LLMs systematically fail to do this, and that this failure may have serious social consequences. Our argument that hearers/interpreters must ‘keep both meanings in mind’ describes exactly what LLMs, by design, cannot do, according to Gur-Arieh et al. This somewhat contradicts the Clarke et al. paper, which suggests that models can hold several candidate meanings in a weighted blend rather than discarding all but one.

Conclusion

In this post and the previous one, I have tried to trace the reception of our 2001 ‘ambiguities we live by’ paper in a new generative AI context. This means looking at how LLMs deal internally with ambiguities and how AI researchers and industry deal externally with ambiguity, that is, how they can use ambiguity in communication but also for deception. This also means examining the ethical and safety implications of ambiguity use in the context of generative AI. What have we learned?

LaCroix et al. use our 2001 paper to argue that some AI producers exploit polysemy strategically upwards (glosslighting), which can lead to hype, deception and manipulation. The issue LaCroix et al. tackle is one of AI ethics.

Gur-Arieh et al. argue that AI systems collapse polysemy downwards, destroying the pragmatic richness that our paper described. Their goal is to design “systems that surface, preserve, and responsibly govern ambiguity”, thus tackling issues of AI safety.

Sterner argues that polysemy’s value for science (and by extension computational data science and AI) is contextually determined. It is neither always good nor always bad. It should therefore not be banned from scientific language, and we should appreciate the role of multiple meanings, including the way “metaphors continue to shape and direct the empirical observations and theories of scientific fields”.

Clarke et al. argue that language models do not always store meanings in neat, separate boxes; instead, related internal signals often appear together in clusters. That matters for ambiguity because a word or phrase with more than one meaning may be represented by a mix of signals, with context helping the model settle on the right one. The main message is that ambiguity is built into how the model organises meaning, that it is not just a problem to be eliminated, and that this has implications for AI safety.

From different angles — AI safety, AI ethics, philosophy of science, epistemic risks, mechanistic interpretability — researchers keep rediscovering the same thing: that the standard assumption of ‘disambiguate as fast as possible, aim for one meaning’ is wrong, both in terms of how humans use language and in terms of how AI systems handle it. Whether scholars are worried about glosslighting (LaCroix et al.), ambiguity collapse (Gur-Arieh et al.), productive scientific ambiguity (Sterner), or the internal geometry of meaning in neural networks (Clarke et al.), it all leads back to our argument that polysemy is live, not dead.

Acknowledgement: I consulted Claude on some of the technical issues.

Image: Vassily Kandinsky, 1923, “Silent Tension”

