Why text messages ruin relationships
The 25-point overconfidence gap: senders think they're understood 88% of the time, but receivers decode the tone correctly only 63% of the time. The misunderstanding is a feature of the medium.
Highlights
Plain text strips away 37% of our intended emotional meaning. If you are sending a difficult message, text is the worst channel you can choose.
There is a 25-point gap in digital communication: senders expect their tone to be understood 88% of the time, but receivers only get it right 63% of the time. The misunderstanding is a feature of the medium.
Emotion does not need to travel fast. It needs to travel well. We need to bring back intentional time and multi-sensory layers to how we message.
About this episode
A deep dive on the math of plain-text communication. We unpack the 2005 Kruger study, why emojis are a "crayon drawing of emotional reality", and the case for treating digital messages as multi-sensory emotional scenes that travel well, not fast, including the .emo file format, temporal messaging, and the augmentation-never-substitution rule for emotional AI.
Transcript
JohnI want you to try something for a second. Um, if it is within reach, just pick up your phone, wake up the screen. Now, take a look at the very last text message you received or, you know, even just picture the last notification that popped up.
AmandaRight, just that little banner sliding down from the top of the glass along with that standard, flat, little ping sound.
JohnExactly. Now, consider the sheer range of what that specific notification could contain. Like, it could literally be a delivery update letting you know that a bulk order of paper towels has just been dropped on your porch. Or on the other hand, it could be a profoundly vulnerable, heartfelt message from someone you love.
AmandaTelling you something they have, you know, never told anyone else before.
JohnYeah. But on the surface, before you actually read the words, those two pieces of information arrive looking exactly identical. I mean, they travel at the exact same speed, they are rendered in the exact same flat text housed in those same gray and blue bubbles, and they trigger the exact same notification sound.
AmandaIt really is sort of a jarring realization when you map it out that way, because we have successfully stripped all the physical and emotional context away from human connection, and we're just leaving the raw data behind.
JohnAnd that tension is the core mission of today's deep dive. We are unpacking a massive stack of research, founding theses, and conceptual designs from an emotional technology company based in New York City called 3.2.1 émotion.
AmandaWhich is such fascinating material to dig into.
JohnIt really is. Uh, so our goal today is to thoroughly explore why plain text is actively a terrible medium for emotional communication. We are going to look at the math behind how plain text secretly strips away roughly 37% of our intended meaning, and how treating our digital messages as multi-sensory emotional scenes, rather than just, you know, high-speed data, might be the only way to actually save our relationships.
AmandaAnd looking at the broader context of these sources, I think this exploration is incredibly urgent. The data paints a very clear, very contradictory picture of our current reality.
JohnI mean, just looking at the sheer state of things right now, we have somehow managed to build the single most connected society in the entire history of the human race. Yeah, like you could reach virtually anyone, anywhere on the globe in half a second.
AmandaBut at the exact same time, according to recent global health reports, 1 in 6 people worldwide experiences persistent, chronic loneliness. We are hyper-connected in a technical sense, but we are completely isolated in a human sense.
JohnSo, how do we even reconcile those two facts?
AmandaWell, the initial instinct is to look at those numbers, the loneliness epidemic, the record highs in youth isolation, and view it as a personal failure. Society tends to tell people, "Oh, you just need to put yourself out there more," or "You need to work on your emotional intelligence."
JohnRight, like it's your own fault you feel alone.
AmandaExactly. But what the data from 3.2.1 émotion and these psychological studies suggest is something entirely different. This isn't your personal failure; it is a massive architectural design failure of the internet itself.
JohnOh, wow. A design failure.
AmandaYeah, because for the last two decades, software developers optimized our communication tools for engagement, for speed, and for scale. We optimized for reaching each other's devices, but we built absolutely no infrastructure for actually feeling each other's humanity.
JohnOkay, let's unpack this, because I want to look closely at that default medium, plain text. Before we can talk about how to fix our communication, we have to understand the mechanics of how it is currently failing us.
AmandaWe have to look at the math.
JohnYeah, let's look at the actual numbers on what gets lost when you send a text message. There is a really foundational psychological study from 2005 that completely reframed how I look at my inbox.
AmandaOh, the "Egocentrism Over E-mail" study. Yeah, it remains a cornerstone of communication psychology because the methodology was so brutally simple.
JohnRight, so walk me through how they actually proved this, because we all kind of instinctively know that texts get misunderstood, but they actually quantified it.
AmandaThey did. So, the researchers essentially had participants send a series of statements to receivers via email or text. Some of these statements were intended to be heavily sarcastic, while others were entirely serious.
JohnOkay, pretty standard.
AmandaRight. But before hitting send, the senders were asked to predict how often the receiver would correctly interpret their intended tone, and the senders were incredibly confident. They expected their intended emotional tone to be correctly decoded about 88% of the time.
JohnBecause in their own head, the tone is obvious.
AmandaExactly. They felt that their sarcasm or their sincerity was blatantly obvious. But then, the researchers looked at the receivers, who were sitting in another room reading these flat words on a screen, and the receivers only decoded that intended tone correctly about 63% of the time.
JohnWow. So basically, we are walking around blindly assuming everyone is picking up on our subtle sarcasm or our genuine empathy, while mathematically, a full third of the time, the person on the other end thinks we are just being jerks.
AmandaYou are looking at a massive 25-point overconfidence gap. When you run the math on a daily scale, it means roughly 37% of your intended emotional meaning simply vanishes in transit.
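A note on the arithmetic: both figures in this exchange appear to come from the same pair of percentages. The sketch below is a minimal illustration in Python that uses only the two numbers quoted in the episode. The function and constant names are made up for this example, and reading the 37% as "100 minus the receivers' 63% accuracy" is an assumption about how the figure was derived.

```python
# Figures quoted in the episode, as percentages of messages.
SENDER_EXPECTED = 88   # how often senders expect their tone to be decoded correctly
RECEIVER_ACTUAL = 63   # how often receivers actually decode it correctly


def overconfidence_gap(expected: int, actual: int) -> int:
    """Percentage-point gap between predicted and actual decoding accuracy."""
    return expected - actual


def miss_rate(actual: int) -> int:
    """Percent of messages whose intended tone is not decoded correctly."""
    return 100 - actual


gap = overconfidence_gap(SENDER_EXPECTED, RECEIVER_ACTUAL)
lost = miss_rate(RECEIVER_ACTUAL)

print(f"Overconfidence gap: {gap} points")   # 88 - 63
print(f"Tone missed: {lost}% of messages")   # 100 - 63
```

Under that reading, the 25-point gap measures how wrong senders are about their own clarity, while the 37% measures how often the tone fails to land at all.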
JohnIt just vaporizes.
AmandaIt vaporizes into the digital ether the moment you hit send. And the mechanism behind this loss connects back to some classic research on human communication. You have likely heard the rough rule of thumb that communication is up to 90% non-verbal.
JohnRight.
AmandaWhen we speak face-to-face, the emotional tone isn't found in the dictionary definition of the words; it is carried by the prosody.
JohnLet's define prosody for a second, because that's a term that gets thrown around a lot in linguistic circles.
AmandaSure. Prosody is basically the musicality of your speech. It's the pitch of your voice, the intonation, um, the slight hesitation before you drop a difficult word.
JohnLike the physical rhythm of it.
AmandaYes. Combine that with the tiny micro-expressions on your face, the openness of your posture, the pace of your breathing. All of these physical data points tell the receiver exactly how to interpret your words. When you strip all that physical context away to send a text message, you are literally starving the receiver of the necessary data to understand you.
JohnSo, it's like playing a highly produced song for someone, but with the volume completely muted and assuming the other person can still feel the bass drop just because you can hear it in your own head.
AmandaThat is a highly accurate way to look at it. And because the human brain absolutely despises a vacuum, the receiver doesn't just read the text as neutral. Their brain actively reconstructs that missing tone from scratch.
JohnOh, so they fill in the blanks themselves.
AmandaExactly. They project their own current mood, their own insecurities, or their own environmental context onto your flat words. If they are having an anxious day, your simple "okay" reads as passive-aggressive. If they are in a rush, your long, thoughtful message reads as demanding. And as the data shows, they usually guess wrong.
JohnI hear that, but I have to interject here. I mean, I don't just send flat, punctuated text. I think most of us realize text is emotionally dangerous, so we try to patch it, right? I throw a crying laughing face or a skull or a red heart onto the end of a text precisely to ensure the tone lands the way I intended it to.
AmandaMmhmm.
JohnDoesn't that solve the 37% gap?
AmandaThe sources actually refer to this exact behavior as applying an "emoji patch." It is a fascinating societal workaround, but fundamentally, you are just slapping a flat yellow annotation onto a flat green medium.
JohnA flat annotation, right.
AmandaThink about the physiological capability of human expression. The human face is driven by dozens of complex muscles capable of producing roughly 10,000 distinct, nuanced expressions.
John10,000.
AmandaYeah. Your face, without a single word, can communicate the profound difference between, you know, "I am pleasantly surprised," "I am deeply moved," and "I am holding back tears because I am utterly lost for words."
JohnAnd my keyboard has what, a few hundred faces?
AmandaThe entire Unicode emoji set has about 3,600 static symbols total, and only a fraction of those are actually faces. The researchers describe emojis as a "crayon drawing" of emotional reality.
JohnA crayon drawing. Wow.
AmandaThey are a crude, low-resolution approximation. They might move our emotional fidelity from 1% up to maybe 5%, which, sure, prevents some basic misunderstandings, but they absolutely do not get us anywhere near the 95% fidelity required for true, deep connection.
JohnA crayon drawing of emotional reality. That perfectly encapsulates the feeling. You get the rough shape of the emotion, but none of the texture. Here's where it gets really interesting, though. If text with emojis is still at the bottom of the barrel, what are our actual options?
AmandaWell, sociologists have established a clear hierarchy of communication channels based on emotional fidelity. In-person interaction is the absolute gold standard for human bonding and accuracy. It provides shared physical space, pheromones, true synchrony.
JohnWhich makes sense.
AmandaRight. Directly below that is video communication, which provides facial expressions and posture. Below video is voice, which retains the crucial elements of prosody and pace. And sitting at the very bottom, with the lowest fidelity and the highest rate of catastrophic misunderstanding, is text.
JohnWhich makes this next piece of research incredibly frustrating. There is a study in our sources from 2021 that looked at how we actively choose our communication mediums. It turns out, when we have to have a hard conversation or share something vulnerable or apologize, we consistently default to text.
AmandaWe do.
JohnWe actively choose the absolute lowest fidelity channel for our highest stakes emotional moments. Why on earth do we do that to ourselves?
AmandaIt really comes down to a fundamental miscalculation of social friction. The researchers found that people consistently overestimate how awkward or uncomfortable a voice note or a phone call will be.
JohnWe just assume it's going to be terrible.
AmandaExactly. We imagine that hearing someone's voice during a tense moment will be overwhelmingly awkward. But the research proves our anxiety is lying to us. Voice interactions consistently produce a much stronger felt connection and significantly fewer misunderstandings than the participants predict.
JohnSo, we're just hiding.
AmandaWe hide behind the screen of text because it feels safe, and it allows us to edit our thoughts perfectly, but we sacrifice the actual connection just to buy that safety.
JohnOkay, so if higher fidelity is always the goal, and adding a face and a voice increases the connection and reduces the awkwardness, shouldn't we just FaceTime for everything? Like if I need to tell you something important, let's just hop on a live video call.
AmandaThat is the exact logical leap many tech companies made during the pandemic, but it ignores the concept of cognitive load. More channels are not a magic fix if they are poorly designed. We have to look at the phenomenon of "Zoom fatigue," which has been documented extensively.
JohnOh, absolutely.
AmandaIf you just force everyone into live, synchronous video grids, you induce massive non-verbal overload.
JohnRight, because I am not just looking at you. I am looking at you, I am looking at the background behind you trying to see what books you have on your shelf, and most exhaustingly, I am staring at a little mirror of my own face the entire time.
AmandaPrecisely. You are dealing with the unnatural stress of prolonged, intense eye contact at close range, the psychological drain of monitoring your own facial expressions in real time, reduced physical mobility because you literally have to stay in frame, and the cognitive strain of interpreting tiny, compressed faces in little digital boxes.
JohnIt's exhausting.
AmandaIt produces intense exhaustion, not connection. Live video just demands too much bandwidth from our nervous systems for casual, continuous communication.
JohnSo, what does this all mean? If live video exhausts us and plain text emotionally starves us, what is the middle ground?
AmandaThe true solution isn't a live video grid, nor is it a faster text message. It is redesigning asynchronous messages, meaning messages you send and receive on your own time, into rich, multi-sensory emotional scenes.
JohnEmotional scenes.
AmandaYeah, and this brings us to the core thesis behind 3.2.1 émotion and their proposed architecture, which is the ".emo" file format. They argue that we need to fundamentally stop treating a message as a string of characters to be quickly parsed and start treating an emotion as an environment that is lived, not read.
JohnI mean, I am looking at the specifications for this .emo file format in the research, and it sounds almost overwhelmingly complex. Walk me through how this actually functions in practice, because this isn't just a text bubble with an audio memo attached to it, right?
AmandaNot at all.
JohnSo how does a phone actually communicate an emotional scene?
AmandaThink of a .emo file as an emotional zip file. It is a dense, multi-layered digital composition that unpacks itself and takes over your entire screen. It utilizes every single output mechanism your device possesses.
JohnOkay.
AmandaYou have the text, certainly, but layered beneath that is what they call a "feelmoji." This isn't a static yellow face. It is a dynamic pixel of emotional language that shifts and breathes to carry intensity and context.
JohnAnd the sources mention haptic rhythm, which I find fascinating. But how does the phone actually know what rhythm to play? Like, how does my device know if I am grieving versus if I am just wildly excited? Do I scroll through a menu and select a grief template?
AmandaIt is actually a combination of active user selection and passive biometric mapping. The software measures the cadence of your typing, the pressure of your fingers on the glass, and pairs that with emotional templates you select.
JohnOh, wow. So, it's reading how you type.
AmandaYes. And the result is a physical pulse, a vibration pattern that beats in the receiver's hand. Grief has a slow, heavy, sinking rhythm. Joy has a light, erratic, elevated flutter. And it doesn't stop at touch.
JohnWhat else is there?
AmandaThe file incorporates light. The screen itself pulses with specific color temperatures, cool blues for melancholy, warm ambers for affection. It seamlessly weaves in your actual voice and image, and even background ambient audio captured at the exact moment of recording.
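A note on structure: the episode does not give a specification for the .emo format, so the following is only a hypothetical sketch of how the layers described here (text, feelmoji, haptic rhythm, light, voice, image, ambient audio) might be grouped into one message object. Every class name, field name, and value below is an assumption made for illustration, not the company's actual design.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HapticPulse:
    """One vibration in a rhythm (hypothetical representation)."""
    duration_ms: int    # how long the vibration lasts
    pause_ms: int       # silence before the next pulse
    intensity: float    # 0.0 (off) to 1.0 (strongest)


@dataclass
class EmoScene:
    """Hypothetical container for the layers named in the episode."""
    text: str
    feelmoji: str                               # placeholder id for the animated glyph
    light_color: str = "neutral"                # e.g. "cool blue", "warm amber"
    haptic_rhythm: list[HapticPulse] = field(default_factory=list)
    voice_clip: Optional[str] = None            # path to a recorded voice file
    image: Optional[str] = None
    ambient_audio: Optional[str] = None

    def layers_present(self) -> list[str]:
        """Names of the layers this scene actually carries."""
        layers = ["text", "feelmoji", "light"]
        if self.haptic_rhythm:
            layers.append("haptics")
        for name in ("voice_clip", "image", "ambient_audio"):
            if getattr(self, name) is not None:
                layers.append(name)
        return layers


# Example: a slow, heavy rhythm with a cool light, as described for grief.
grief = EmoScene(
    text="I miss her today.",
    feelmoji="grief",
    light_color="cool blue",
    haptic_rhythm=[HapticPulse(600, 900, 0.8) for _ in range(3)],
    voice_clip="voice.m4a",
)
print(grief.layers_present())
```

The point of the sketch is only that the text becomes one layer among several, rather than the whole message.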
JohnAnd the mechanism for actually opening this file is completely different from a normal text. These messages do not just slide down into your notifications banner while you are checking your bank balance.
AmandaNo, they demand attention.
JohnRight. To open a .emo message, you have to perform a physical ritual. The receiver might have to physically scratch the screen with their finger, mimicking a lottery ticket, to reveal the scene. Or they might have to press and hold their thumb on the center of the glass for three uninterrupted seconds to unwind it.
AmandaThe ritual is arguably the most important feature. By requiring a continuous physical action, the interface forcefully interrupts the endless, mindless scroll of modern phone usage.
JohnIt stops you in your tracks.
AmandaExactly. It demands your total physical presence. It forces the receiver to pause, center themselves, and implicitly say, "I am here, I have the bandwidth, and I am fully receiving this."
JohnIt totally changes the consumption model. I mean, traditional texting feels like chucking a greasy paper bag of fast food through your car window while you speed through a drive-thru. You consume the information quickly, you barely process the taste, and you just keep driving.
AmandaThat's a great way to put it.
JohnBut a .emo file, acting as an emotional zip file, is a sit-down, multi-course meal. You cannot eat it in the car. You have to pull up a chair, look at the presentation, commit the time, and it demands your complete cognitive bandwidth.
AmandaThat analogy perfectly highlights the difference in processing. And a crucial requirement of a multi-course meal is that it actually takes time. This highlights the second massive architectural flaw in how we currently communicate: the complete collapse of the temporal dimension.
JohnThe temporal dimension, yeah.
AmandaThe default modern messenger has obliterated the concept of time. Everything arrives now, instantly. And because everything arrives instantly, human psychology dictates that everything feels equally urgent. But deep, real emotion does not operate on a fiber-optic timeline.
JohnSo, we are essentially saying that emotion doesn't need to travel fast. It needs to travel well.
AmandaExactly. When you take the time to build an incredibly rich, multi-sensory .emo scene and then fire it off instantly like a rapid-fire text, you actually dilute the impact. Intentional timing is required to give the emotion weight.
JohnOkay, so how do they solve that?
AmandaThis introduces the concept of temporal messaging. Within this new architecture, a message carries its own deliberate delivery time. You can actually send an emotion into the future.
JohnWait, wait. So, practically speaking, I could sit down today, craft a deeply vulnerable message, and lock it so that it physically will not open until my best friend's 40th birthday next year?
AmandaYes, absolutely. Or you could lock a message to a specific GPS location, meaning it will only unpack when your partner physically walks past the specific bench where you got engaged.
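A note on mechanics: the two locks described in this exchange, a delivery date and a GPS location, can be pictured as conditions checked before a message is allowed to open. The sketch below is a hypothetical illustration only. The `DeliveryLock` class, its fields, the 25-meter radius, and the coordinates are all assumptions, and the distance check uses the standard haversine formula rather than anything described in the source.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass
class DeliveryLock:
    """Hypothetical lock: every condition that is set must be satisfied."""
    open_at: Optional[datetime] = None           # earliest moment the message may open
    place: Optional[tuple[float, float]] = None  # (latitude, longitude) of the unlock spot
    radius_m: float = 25.0                       # how close the receiver must be

    def can_open(self, now: datetime,
                 position: Optional[tuple[float, float]] = None) -> bool:
        if self.open_at is not None and now < self.open_at:
            return False  # too early
        if self.place is not None:
            if position is None:
                return False  # location required but unknown
            if distance_m(*self.place, *position) > self.radius_m:
                return False  # receiver is not at the spot
        return True


# A message locked until a future date (illustrative date).
birthday = DeliveryLock(open_at=datetime(2030, 6, 1, tzinfo=timezone.utc))
print(birthday.can_open(datetime(2030, 5, 31, tzinfo=timezone.utc)))  # False
print(birthday.can_open(datetime(2030, 6, 1, tzinfo=timezone.utc)))   # True

# A message locked to a place (illustrative coordinates).
bench = DeliveryLock(place=(40.7812, -73.9665))
now = datetime(2030, 1, 1, tzinfo=timezone.utc)
print(bench.can_open(now, position=(40.7812, -73.9665)))  # True
print(bench.can_open(now, position=(40.7900, -73.9665)))  # False
```

A real system would also need to decide who enforces the lock, for example the sender's device, the receiver's device, or a server, which the episode does not address.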
JohnThat is wild.
AmandaThe psychological grounding for this is robust. There is research on what's called "future-self continuity," which explores how the brain processes time and identity. When researchers forced participants to vividly imagine and communicate with a future version of themselves or a future version of a loved one, the neural regions associated with present-day empathy lit up.
JohnReally?
AmandaYeah, engaging with the future literally rewires and strengthens our present-day emotional bonds and decision-making.
JohnI hear the psychology behind that and, you know, the neuroplasticity makes sense. But realistically, let's look at how we actually live. In a world where we live and die by read receipts, where we expect immediate validation, doesn't making someone wait for a message you have already written feel like artificial friction?
AmandaI can see why you'd ask that.
JohnIt just feels a bit gimmicky to hide a message behind a countdown clock just for the sake of it.
AmandaIt is a completely understandable resistance because we have been conditioned to view friction as a software bug. We think friction is bad. But in human relationships, intentional friction is the very thing that proves value. Think about the historical weight of receiving a handwritten letter in the mail versus receiving an email. The physical letter carries emotional weight specifically because you know the person had to find paper, write it out, buy a stamp, walk to a mailbox, and let it physically travel for days.
JohnOh, I see.
AmandaThe friction proves the care. When you schedule a temporal message, you are communicating an extra, unspoken layer of meaning. "I thought about you on a random Tuesday, and I cared enough to plant this specifically for you to find a month from now." The delay isn't a gimmick. The timing itself is a core component of the message.
JohnI really like that framing: the friction is the message. But, and there's a big point to be made here, if creating these multi-sensory scenes, layering the haptics, choosing the color temperature, pacing the delivery, takes so much time and emotional energy, the obvious temptation is going to be outsourcing that labor. How does 3.2.1 émotion, or honestly anyone building this technology, prevent this from becoming just another automated process fueled by artificial intelligence? Because AI is deeply embedded in everything right now.
AmandaThat is the pivotal ethical and architectural line that the company draws in their founding thesis. Their absolute rule for emotional AI integration is this: augmentation, never substitution.
JohnLet's pause on that specific phrasing: augmentation, never substitution. In practice, on my phone, where is the boundary between augmenting an emotion and substituting it?
AmandaIt is the difference between AI as a mirror and AI as a partner. Augmentation uses AI as an instrument to expand your own capability.
JohnOkay, give me an example.
AmandaFor example, if you are struggling to write a journal entry or a message and you tell the AI you feel bad, the system can analyze your biometrics and typing patterns to help you expand your emotional vocabulary. It might ask, "Do you feel bad, or do you feel dismissed? Do you feel exposed?"
JohnSo, it's coaching you.
AmandaExactly. It highlights your emotional blind spots and prepares you to have a deeper human conversation. Substitution, on the other hand, treats the AI as the author. It is an AI that analyzes your calendar, sees that it is your mother's birthday, and automatically writes and sends a perfectly formatted emotional message to her for you. Or worse, an AI companion that acts as your substitute friend when you are lonely.
JohnI hear the ethical stance on that, but realistically, let's say I mess up. I completely forget my anniversary, I am panicking, and I am notoriously terrible with words. So, I open up a chatbot and I ask it to draft a beautiful, perfectly empathetic paragraph apologizing to my spouse. I copy it, I send it, and my spouse reads it, feels deeply understood, and forgives me. If the outcome is good and the marriage is saved, why is outsourcing that labor practically a bad thing? Didn't the AI successfully facilitate connection?
AmandaThat scenario highlights exactly how we misunderstand the mechanics of connection. The AI did not facilitate a connection; it successfully counterfeited one.
JohnCounterfeited.
AmandaYes. Real, enduring human connection requires two non-negotiable architectural ingredients: vulnerability and reciprocity. When you use a language model to generate that perfect apology, you entirely bypass the vulnerability. You did not wrestle with your own guilt, you did not endure the agonizing struggle of trying to find the right words, and you did not take the personal risk of getting it wrong and having to try again. You just outsourced the heavy emotional labor to an algorithm.
JohnIt's the difference between using a dictionary to look up a word to express yourself versus hiring a ghostwriter to talk to your wife in the dark.
AmandaExactly. One tool extends your humanity; the other entirely erases your presence from the interaction. Your spouse is feeling connected to a probabilistic string of text generated by a server farm, not to you. You weren't even in the room, emotionally speaking. And the long-term data on this substitution is incredibly dark. The sources highlight a watershed longitudinal study published very recently. They followed over 2,000 adults over a 12-month period to observe how AI interaction affects real-world loneliness.
JohnAnd the results are completely counter to what the tech industry promises, right?
AmandaCompletely. They found that individuals who felt emotionally isolated frequently turned to AI chatbots for social companionship. On the surface, it makes sense: they were hurting and sought immediate relief. But the mechanism of that relief is a trap.
JohnWhat happened?
AmandaFour months into the study, the researchers found that the emotional isolation of those individuals had actually increased significantly. The data proves that AI companions provide a shallow, counterfeit relief that actively crowds out real human interaction.
JohnOkay, because it's just easier.
AmandaRight, you get a quick, frictionless hit of validation from the bot, so you lose the motivation to do the hard, awkward work of calling a real friend. It simulates the feeling of being received, but without any of the actual stakes. Ultimately, the hole just gets deeper.
JohnSo, synthesizing all of this for you listening right now, I want you to look at your phone differently today. The next time you open a messaging app to talk to someone you care about, recognize the limitations of the glass in your hand. Understand that the flat, instantaneous text you are typing is structurally designed to fail you.
AmandaIt really is.
JohnIt is mathematically stripping away a full third of your intended meaning before the other person even unlocks their screen. Remember that enduring human connection isn't about data speed or communication efficiency or chasing instant read receipts. It is about actively choosing to give your emotions the multi-sensory weight, the awkward vulnerability, and the intentional time that they actually deserve.
AmandaAnd tying this back to the broader implications of how we live digitally, I want to leave you with one final thought to explore on your own. Think about the digital footprint you will inevitably leave behind at the end of your life.
JohnOh, that's an interesting angle.
AmandaRight now, for the vast majority of us, your legacy is a sterile, highly searchable, perfectly flat transcript of blue and green text boxes. And if we are completely honest, it is mostly full of logistical data. "What time are you coming home?" "Did you remember to buy paper towels?" "Call me back."
JohnIt is entirely logistics masquerading as connection.
AmandaExactly. Now, imagine a different architecture. Imagine leaving behind a private, emotional archive, a meticulously mapped chronological timeline of rich, multi-sensory .emo scenes. Imagine leaving messages that will physically beat in the hand of your loved ones years from now, carrying your exact haptic rhythm.
JohnThat's powerful.
AmandaMessages that radiate the specific color temperature of how you felt, carrying your prosody, your hesitations, and your voice, locked to open on the exact day you intended them to. If this technology is about to completely change how we communicate with the people we love in the present, how is it going to completely reinvent how we are remembered by our descendants in the future?