Phrase-as-lemma (PAL) constructions treat entire phrases as single words grammatically: "a don't-mess-with-me driver" instead of "a driver who won't let people mess with them." This is one manifestation of a broader phenomenon: we compress developed experience into compact expressions.
(c) 2026 George Georgalis <george@galis.org> unlimited use with this notice
Consider how we condense complex experiences into brief expressions. "A please-don't-dump-me dance" compresses what takes 58 words of psychological description here: "A pattern of behavior exhibited by someone in a romantic relationship who senses their partner's waning interest and responds with increasingly desperate attempts to appear valuable, involving exaggerated enthusiasm, unusual domestic competence displays, and forced casual mentions of how 'fine' they would be if the relationship ended, all performed with barely concealed anxiety manifesting in specific physical tells." The PAL version forces listeners to supply their own embodied knowledge, creating cultural recognition rather than explanation.
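The compression can be made concrete with a rough count. The sketch below is an illustration added here, not part of the PAL analysis itself: it assumes whitespace-separated tokens are an adequate stand-in for "words," so a hyphenated PAL counts as one unit, and the helper names `word_count` and `compression_ratio` are hypothetical.

```python
# Rough illustration only: "words" are approximated as whitespace-separated
# tokens, so a hyphenated phrase-as-lemma counts as a single token.

PAL = "a please-don't-dump-me dance"

EXPANSION = """A pattern of behavior exhibited by someone in a romantic
relationship who senses their partner's waning interest and responds with
increasingly desperate attempts to appear valuable, involving exaggerated
enthusiasm, unusual domestic competence displays, and forced casual mentions
of how 'fine' they would be if the relationship ended, all performed with
barely concealed anxiety manifesting in specific physical tells."""


def word_count(text):
    """Count whitespace-separated tokens (hypothetical helper)."""
    return len(text.split())


def compression_ratio(compact, expanded):
    """How many expanded tokens each compact token stands in for."""
    return word_count(expanded) / word_count(compact)


print("PAL tokens:      ", word_count(PAL))
print("expansion tokens:", word_count(EXPANSION))
print("ratio:           ", round(compression_ratio(PAL, EXPANSION), 1))
```

A token ratio says nothing about the embodied recognition the essay describes; it only measures how much surface text the compact form replaces.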
The compression imperative operates through embodied cognition, the view that our cognitive processes are fundamentally shaped by bodily experience. When we encounter "honey-I'm-home-ing her from the doorway" theatrics, the phrase triggers a cascade of sensorimotor simulations: the voice modulation, the exaggerated door-opening gesture, the performed domesticity. This isn't mere semantic understanding but full-body rehearsal. Mirror neuron activation makes us briefly become a person in the "returning husband" theater, feeling the postural shifts and facial expressions. Extended description---"He announced his arrival with exaggerated domesticity, affecting the manner of a stereotypical 1950s husband"---engages analytical processing but fails to trigger this embodied simulation.
The visceral efficiency becomes clear with "please-don't-dump-me dance." Readers don't just understand desperation; they feel the tightened chest, the forced smile's muscular strain, the hypervigilant scanning for partner reactions. This somatic activation explains why compressed forms spread virally while expanded descriptions remain inert. The body recognizes what the mind must laboriously construct.
Performative expression uses gesture and caricature in the same way. "They were like [exaggerated eye-roll with head tilt]" packs an entire attitudinal stance into embodied elements. This gestural shorthand carries phenomenological instruction: a vis-à-vis gaze accentuates the behavior, and the pause taken to convey the expression reinforces the authentic connection. The listener experiences the caricatured attitude as reality, and that experience of performative interpersonal connection in turn develops the expression. Describing that specific combination of dismissive superiority, performative impatience, and deliberately revealed irritation would require unnaturally long description. The gestural shorthand delivers it instantly; the suspended moment of connection calibrates a cultural exchange and (hopefully!) creates shared recognition through the embodied, shared timing.
Expressive condensing through lexical innovation also appears in common patterns of psychological development. What we could call the "doy effect" shows toddlers creating distinct markers for the varying intensity and context of behaviors they understand but cannot otherwise communicate; they grasp these distinctions intuitively, without needing explanation. Across stages of social awareness, the exclamation "doy!" means "obviously," a simple recognition response. "Doy-ya" adds self-awareness: it is a recognition expression for "obviousness possession." "Doy-iee" marks self-awareness spilling over as an uncontrolled outburst, an overflow of esteem during the stage of becoming identified with oneself but not yet able to control the expression. "Doy-iee" behavior becomes managed as toddlers develop better control over their self-aware impulses. Negating these terms reinforces their meaning rather than reversing it: "no doy-iee" means absolutely, in the only possible way, and so signals emphatic agreement.
With the exception of "D'oh!" (the actual negation of doy, immortalized by Homer Simpson), this linguistic family does not appear in developmental psychology literature (see Documentation of "Doy" Family Expressions). Every generation recognizes these expressions---their absence from academic documentation reveals more about scholarly blind spots than linguistic reality. The "doy" variants demonstrate organic language evolution outside institutional observation, spreading through playground networks and sibling transmission rather than formal instruction.
The progression from "doy" through its variants mirrors patterns found in other linguistic development: simple concepts expanding into complex social tools as speakers mature in their ability to navigate obviousness, dismissiveness, or awareness. The brevity of the "doy effect" expressions allows rapid social calibration while enabling speakers to communicate their actual level of knowledge or impatience clearly and efficiently.
Neural network systems face different constraints yet arrive at similar remediation strategies. Early in large language model development, before explicit programming for such behaviors, machine phrase-keys like "acknowledge-uncertainty-rather-than-hallucinate" and "gracefully-degrade-when-knowledge-insufficient" emerged organically in LLM outputs. These PAL expressions filled vocabulary gaps through the same condensing impulse that drives linguistic innovation in humans. Engineers observed these atomic identifiers (PALs) standing in for complex functional descriptions when output constraints prevented the LLM from crafting a response in expanded form. The phrase-keys bundled entire operational philosophies: how a system should respond when reaching the boundaries of its training, how to maintain utility while admitting limitations.
Machine learning discourse continues generating atomic identifiers that bundle complex operational concepts: "catastrophic-forgetting" encapsulates the phenomenon where neural networks overwrite previously learned information when training on new tasks---a concept requiring paragraphs of technical explanation compressed into two words. "Vanishing-gradient" captures the mathematical decay of error signals as they pass backward through deep network layers. "Attention-is-all-you-need" transcended its origin as a paper title to become conceptual shorthand for the dominance of the transformer architecture.
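The decay of error signals through deep layers can be shown with a few lines of arithmetic. This is a minimal sketch under simplifying assumptions that are not in the essay: every layer is a single sigmoid unit, every weight is 1.0, and every pre-activation is 0.0, which is where the sigmoid derivative reaches its maximum of 0.25. The function names are hypothetical, and real networks will behave less uniformly.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum is 0.25, reached at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def gradient_through_layers(depth, pre_activation=0.0, weight=1.0):
    """Chain-rule product of per-layer factors for a stack of sigmoid units.

    Assumes (for illustration) the same weight and pre-activation at every
    layer, so the error signal shrinks by a constant factor per layer.
    """
    grad = 1.0
    for _ in range(depth):
        grad *= weight * sigmoid_grad(pre_activation)
    return grad


for depth in (1, 5, 10, 20):
    print(f"depth {depth:2d}: gradient factor {gradient_through_layers(depth):.3e}")
```

Even in this best case for the sigmoid, each extra layer multiplies the signal by at most a quarter, which is the geometric decay the compressed term stands for.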
These terms emerged organically to optimize technical discourse. Consider how "few-shot-learning" compresses the entire paradigm of training models to generalize from minimal examples---a revolutionary departure from traditional machine learning's hunger for massive datasets, enabling AI systems to learn new categories from just a handful of instances the way humans do, inverting decades of assumptions about statistical learning requirements. Non-experts struggle to decompress this condensed revolution. Similarly, "zero-shot-transfer" encodes the seemingly impossible feat (from a deterministic standpoint) of applying learning to completely unseen tasks---the ability to perform tasks never encountered in training by leveraging abstract knowledge representations, like teaching a system to recognize zebras by describing them only as "horses with stripes" without showing a single zebra image.
Technical communities develop these compressions through usage pressure. "Backprop" replaced "backward propagation of errors"---the fundamental algorithm where networks adjust their weights by propagating error signals backward through layers, computing gradients that indicate how to modify each parameter to reduce mistakes. The full expansion requires understanding calculus, the chain rule, and computational graphs. "Dropout" elegantly names the counterintuitive strategy of randomly disabling neurons during training to prevent overfitting---deliberately breaking parts of the network to make it more robust, like teaching someone to cook by changing or limiting the ingredients available for their recipes. The cook doesn't memorize the recipes but manages "coherent-adaptive-proprioceptive food preparation" for an optimal outcome. Eventually the cook develops anticipation of the outcome of their own productive activity, enabling proactive optimization (meatloaf) through analysis of redundant pathways, rather than brittle recipe dependencies.
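To give a sense of what the two compressed terms unpack into, here is a deliberately tiny sketch. It assumes a toy model with one hidden unit, y = w2 * tanh(w1 * x), and a squared-error loss; the dropout function uses the common "inverted" scaling, which is one standard variant and not something the essay specifies. All names (`forward`, `loss`, `backward`, `dropout`) are hypothetical.

```python
import math
import random


def forward(w1, w2, x):
    """Toy network: one tanh hidden unit feeding one linear output."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y


def loss(w1, w2, x, target):
    """Squared error, halved so the derivative is simply (y - target)."""
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2


def backward(w1, w2, x, target):
    """Backprop: apply the chain rule from the loss back to each weight."""
    h, y = forward(w1, w2, x)
    dloss_dy = y - target                      # derivative of the loss w.r.t. y
    dloss_dw2 = dloss_dy * h                   # y = w2 * h
    dloss_dh = dloss_dy * w2                   # error signal passed backward
    dloss_dw1 = dloss_dh * (1.0 - h * h) * x   # tanh'(z) = 1 - tanh(z)^2
    return dloss_dw1, dloss_dw2


def dropout(activations, p_drop, rng):
    """Inverted dropout: zero each unit with probability p_drop and rescale
    the survivors so the expected activation is unchanged."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# One gradient-descent step on the toy network.
w1, w2, x, target, lr = 0.5, -0.3, 1.0, 1.0, 0.1
before = loss(w1, w2, x, target)
g1, g2 = backward(w1, w2, x, target)
w1, w2 = w1 - lr * g1, w2 - lr * g2
after = loss(w1, w2, x, target)
print("loss before/after one step:", before, after)

# Randomly silence half of a small layer's activations.
print(dropout([1.0, 2.0, 3.0, 4.0], 0.5, random.Random(0)))
```

Even at this scale, the expanded form needs a forward pass, a loss, and four chain-rule lines, which is the gap that "backprop" closes in one word.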
Each compression enables rapid technical communication while simultaneously serving as a credentialing marker for community membership. The inability to decompress "BERT" (Bidirectional Encoder Representations from Transformers) or "GPT" marks outsider status as surely as mispronouncing "doy-ya." The term "hallucination" in AI contexts underwent semantic appropriation---borrowing from psychology to describe a uniquely computational phenomenon of confident confabulation.
Once compressed, forms resist re-expansion with surprising tenacity. This decompression resistance reveals fundamental properties of cognitive optimization. "Automated Teller Machine" sounds stilted, almost alien, compared to "ATM." Some cultures, recognizing this awkwardness, deliberately reinterpret ATM as "Any Time Machine"---a folk etymology that feels more natural than historical accuracy. This preference proves unyielding to correction; speakers would rather maintain the compressed form's cognitive efficiency than restore etymological fidelity.
Attempting to expand "doy-ya" to "an exclamation indicating that the speaker recognizes something as obvious while simultaneously acknowledging their own awareness of this recognition" destroys its communicative function entirely. The expanded form cannot perform the rapid social calibration that "doy-ya" accomplishes in milliseconds. Similarly, explaining "please-don't-dump-me dance" removes its immediate somatic trigger, transforming embodied recognition into abstract analysis.
This resistance operates through multiple mechanisms. Cognitive crystallization occurs when compressed forms become basic units of thought---we think in "ATMs" not "automated teller machines." Neural pathway efficiency means the brain processes compressed forms through established routes, while expansion requires constructing new pathways. Social coordination costs arise because expansion breaks conversational rhythm; saying "automated teller machine" marks the speaker as pedantic or socially uncalibrated.
The irreversibility extends to novel compressions. Once a community adopts "vibe-check" for assessing situational atmosphere, the expanded "verification of vibrational compatibility with environmental ambiance" becomes not just awkward but functionally impossible. The compression has replaced rather than abbreviated the concept. We see this in computational discourse where "neural net" so thoroughly replaced "artificial neural network" that the full form sounds archaic, even though the technology is barely decades old.
The advantages of condensed forms over extended descriptions become clear when examining processing efficiency. Instant recognition beats sequential construction---"a blame-the-victim attitude" hits consciousness as a complete concept, while its paragraph equivalent requires building understanding piece by piece. Digital communication platforms intensify this pressure. Character limits, attention fragmentation, and the scroll economy create extreme selection for compression. "She's giving main character energy" compresses paragraphs of behavioral description into five words that trigger immediate recognition in digital natives while remaining opaque to others.
Brief expressions spread easily because they function as complete units. "Don't mess with me" energy can attach to new contexts and evolve into related forms, while extended explanations remain contextually bound. These short forms often trigger somatic responses. "Honey-I'm-home-ing her from the doorway" describes behavior while activating muscle memory of performed domesticity. Readers feel the vocal inflection, theatrical timing, and postural changes of someone performing "returning husband" rather than simply arriving.
The relationship between condensing and cognition appears in how these patterns enable contextual transfer. The creative extension from "doy" through its variants demonstrates the ability to take shorthand patterns and innovate within them---applying knowledge from one social domain to increasingly complex interpersonal territories. This transferability matches how we recognize "please-don't-dump-me" behavior across different relationships, contexts, and individuals.
Humans and machines both operate under resource constraints (attention, memory, processing time, communication bandwidth), and we readily accept the reductionism of technomorphization when transposing those constraints from one to the other. Creating brief expressions represents an adaptive response to these constraints, though the specific functions these condensed forms serve reveal different evolutionary pressures. People develop these expressions through organic cultural processes where someone creates "a trickle-down policy" to pack economic critique into brief form, and if it resonates---if others recognize the encoded phenomenon---it spreads. Cultural selection favors condensed forms that efficiently encode shared experiences while maintaining embodied triggers.
The "doy" family exemplifies this democratic emergence. No authority decreed these terms; they evolved through collective adoption because they filled communicative needs that existing vocabulary couldn't address. Meanwhile, computational phrase-keys emerge through systematic functional requirements analysis, where engineers identify operational specifications needing distinct identifiers and generate phrases encoding those specifications. Yet even machine generation shows emergent properties as technical communities develop shared condensing patterns that spread organically across projects and organizations.
The tension between efficiency and precision creates ongoing communicative innovation. We condense for speed and social signaling, but must continually elaborate to maintain understanding. The creative expansion of condensed forms---from "doy" through its variants, from basic PALs to complex cultural references---represents ongoing attempts to preserve communicative fidelity while maintaining efficiency gains. This suggests innovation emerges partly from friction between our desire for efficient communication and our need for precise understanding.
People and machines converge on condensed vocabulary, but the encoding serves different purposes. People optimize for cultural solidarity, embodied recognition, and social positioning. Machine systems optimize for systematic precision, operational specification, and computational organization. The convergence reveals how communicative demands consistently exceed available vocabulary, driving the creation of efficient encoding schemes. The divergence illuminates how distinct communities develop compression strategies suited to their specific communicative ecosystems and operational constraints, whether playground networks spreading "doy" variants or engineering teams crystallizing "backprop."
Condensing represents adaptive optimization rather than linguistic laziness. New primitive forms offer processing speed, social signaling capability, memetic portability, and, for people, embodied trigger effects that extended description cannot match. Decompression resistance ensures these new efficiencies become permanent, crystallizing into fundamental units of thought and communication. The drive to condense appears wherever communication encounters vocabulary gaps, balancing efficiency against precision through ongoing creative tension. This dynamic between condensing and elaboration shapes how communicative systems evolve, whether through cultural development or systematic computational generation.
This essay is based on
Goldberg, Adele E., and Shahar Shirtz. "The English phrase-as-lemma construction: When a phrase masquerades as a word, people play along." Language 101, no. 2 (2025): 291-320. https://dx.doi.org/10.1353/lan.2025.a962899
and inspired by
languagejones. "Linguists just made a breakthrough in defining a 'word' No, really." https://youtu.be/tfnANe2YUwM