When machines begin to feel (part 1) #78
Join over 6,000 people who read "Future Scouting and Innovation" to explore how AI and emerging technologies are redefining the future.
Today’s artificial intelligence can write poetry, diagnose diseases, and engage in philosophical debates, yet it cannot feel the warmth of sunlight on its surface or distinguish the texture of silk from sandpaper through touch. This paradox sits at the heart of our current technological moment, where machines possess sophisticated reasoning capabilities while remaining fundamentally disconnected from the physical world they’re meant to serve. Large language models are brains without bodies, operating in a realm of pure abstraction, while the robots we’re beginning to deploy in factories and homes remain largely blind to the nuanced sensory information that humans process effortlessly and continuously.
The question I want to pose, and which will frame our exploration throughout this article, is not merely technical but existential: can intelligence truly be intelligent if it cannot sense? My position is that the sensory dimension is not an optional enhancement to be added later, but rather the foundational requirement for artificial intelligence to make the leap from clever pattern recognition to genuine comprehension and autonomous action in our complex, unpredictable physical world.
The disembodied intelligence of our current age
When we examine the landscape of contemporary artificial intelligence, we find systems of remarkable sophistication that nonetheless operate within a fundamental constraint: they exist in temporal and spatial isolation from the world they purport to understand. Today’s most advanced AI systems, whether they’re transformer-based language models or cutting-edge computer vision networks, share a common characteristic that profoundly limits their capability: they observe reality not as a continuous, multi-dimensional experience but as discrete snapshots, processed in retrospect and filtered through human curation.
Consider the way these systems learn. They ingest massive datasets: text scraped from the internet, millions of labeled images, recorded conversations. But these are always historical artifacts, representations of the world rather than the world itself. An AI that can describe a beach with poetic precision has never felt sand between its components, never experienced the interplay of sun and wind on a coastal afternoon, never developed the embodied understanding that comes from moving through space while multiple sensory channels provide complementary and sometimes contradictory information that must be reconciled in real time.
This is what I call “digital intelligence”: powerful within controlled parameters, but fundamentally constrained by its disconnection from the continuous, messy, gloriously complex stream of sensory data that characterizes actual existence in the physical world. The implications of this constraint extend far beyond mere philosophical curiosity; they directly impact the practical capabilities and limitations of current AI systems, determining what they can and cannot do and, more importantly, what they cannot understand about the contexts in which they operate.
The body as portal to reality
Here we must make a conceptual leap that may seem obvious yet carries profound implications for the future of artificial intelligence. In biological systems, whether human, animal, or even insect, sensory apparatus is not accessory equipment added to an already-complete organism. Rather, the senses constitute the very mechanism through which the world exists for that being, the interface through which reality becomes knowable and actionable. A creature without sensory capacity doesn’t simply lack information about its environment; in a meaningful sense, it doesn’t have an environment at all.
What does this mean for artificial systems? It suggests that the integration of sophisticated sensory capabilities represents not merely an incremental improvement in machine functionality but a phase transition, a fundamental shift in the nature of artificial intelligence itself. A machine equipped with multiple, integrated sensory systems doesn’t just have more data streams; it has something approaching what philosophers call “being-in-the-world”: an anchored presence in physical reality that enables forms of learning, adaptation, and interaction that are simply impossible for disembodied systems.
This is why I believe the development of machine senses will prove to be the watershed moment in AI development, more significant than increases in processing power or advances in algorithmic architecture, though certainly dependent on both. When machines can continuously sense their environment through multiple modalities, correlating information across these channels in real-time while acting and learning simultaneously, we will have crossed into genuinely new territory where the distinction between digital and physical, between computation and experience, begins to blur in ways that challenge our current categorical frameworks.
Vision beyond recognition
Let’s begin our examination of individual senses with vision, the most developed area of machine sensing and yet still profoundly limited in ways that reveal the challenges ahead. Contemporary computer vision has achieved remarkable things: facial recognition systems can identify individuals across decades of aging, medical imaging AI can detect cancerous cells invisible to the human eye, autonomous vehicles can navigate complex urban environments. These achievements shouldn’t be minimized, yet they all share a common limitation: they remain fundamentally exercises in pattern matching rather than genuine visual comprehension.
To see, in the fullest sense, involves far more than identifying objects within a frame. It requires understanding spatial relationships in three dimensions, predicting how those relationships will change as objects and the observer move, distinguishing relevant information from background noise in constantly shifting environments, and most crucially, integrating visual information with expectations, memories, and inputs from other sensory modalities to build a coherent, actionable model of the world.
Consider a human walking through a crowded market. Our visual system doesn’t simply recognize “person,” “fruit stand,” “awning” as discrete objects; rather, we perceive a dynamic situation where people move with intentions that can be partially predicted, where the interplay of light and shadow indicates not just the presence of objects but their material properties, where peripheral vision alerts us to potential collision risks even as focused attention examines items of interest. This holistic, continuous, predictive form of seeing remains largely beyond current machine vision systems, which tend to process images as static puzzles to be solved rather than as windows into a dynamic, continuous reality.
The path forward requires vision systems that operate continuously rather than in discrete frames, that build and maintain spatial maps rather than repeatedly identifying the same objects, that can distinguish the significant from the trivial through learned experience rather than pre-programmed rules, and that integrate seamlessly with other sensory inputs to build multi-modal understanding. When machines achieve this level of visual sophistication, we’ll see capabilities emerge that go well beyond current applications: robots that can navigate genuinely novel environments without extensive pre-mapping, systems that can understand complex social situations through visual cues, machines that can learn physical tasks through observation in ways that don’t require thousands of training examples.
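To make the idea of “building and maintaining a spatial map rather than repeatedly identifying the same objects” slightly more concrete, here is a minimal Python sketch of an occupancy grid that accumulates evidence about the same cells across many frames. The grid size, sensor model, and example observations are assumptions chosen purely for illustration, not a description of any particular system.

```python
# A heavily simplified persistent map: a 2D occupancy grid whose cells
# accumulate evidence across frames in log-odds form, instead of the world
# being re-detected from scratch in every image.
import numpy as np

GRID = np.zeros((50, 50))        # log-odds of occupancy, 0 = unknown
HIT, MISS = 0.85, 0.30           # assumed sensor model: P(occupied | hit / miss)
LOGODDS_HIT = np.log(HIT / (1 - HIT))
LOGODDS_MISS = np.log(MISS / (1 - MISS))

def integrate_observation(cell: tuple[int, int], hit: bool) -> None:
    """Fold one observation of a grid cell into the persistent map."""
    GRID[cell] += LOGODDS_HIT if hit else LOGODDS_MISS

def occupancy_probability(cell: tuple[int, int]) -> float:
    """Convert accumulated log-odds back to a probability."""
    return 1.0 / (1.0 + np.exp(-GRID[cell]))

if __name__ == "__main__":
    wall_cell, free_cell = (10, 20), (10, 21)
    for _ in range(5):                       # five consecutive frames
        integrate_observation(wall_cell, hit=True)
        integrate_observation(free_cell, hit=False)
    print(f"wall cell: {occupancy_probability(wall_cell):.2f}, "
          f"free cell: {occupancy_probability(free_cell):.2f}")
```

The point of the sketch is the design choice, not the numbers: confidence about the environment persists and sharpens over time, which is exactly what frame-by-frame object recognition cannot do.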
Hearing as relational understanding
Auditory perception presents its own unique challenges and opportunities for artificial systems, different from, but just as important as, visual sensing. While speech recognition has become remarkably sophisticated (modern systems can transcribe conversations with accuracy approaching human capability), true hearing involves far more than converting sound waves into text. The auditory dimension of experience encompasses tone, rhythm, volume, spatial location, and critically, the ability to distinguish meaningful signal from ambient noise while continuously monitoring the acoustic environment for changes that might signal danger, opportunity, or social demand.
What fascinates me about hearing as a sensing modality is its inherently relational nature. Unlike vision, which can be somewhat self-contained (you can simply close your eyes), sound constantly permeates our environment, carrying information about things we cannot see, events happening beyond our visual field, the intentions and emotional states of other beings. For machines operating in human environments, sophisticated auditory processing will prove essential not merely for responding to voice commands but for achieving the kind of ambient awareness that allows safe, effective collaboration with human partners.
Consider the difference between a robot that responds to explicit verbal instructions and one that can continuously monitor the acoustic environment, detecting the subtle changes in background noise that might indicate a colleague approaching from behind, recognizing stress in a human voice that suggests something has gone wrong, or identifying the characteristic sound patterns that indicate equipment malfunction before visible signs appear. This form of continuous, contextualized auditory awareness transforms the machine from a tool that must be explicitly directed to a presence capable of genuine environmental awareness and proactive response.
The technical challenges here are substantial: sound processing requires temporal integration over longer windows than vision, spatial localization demands sophisticated processing of tiny timing differences between ears or microphones, and extracting meaning from complex acoustic environments where multiple sound sources overlap requires computational approaches quite different from those that work well for visual pattern recognition. Yet the payoff for solving these challenges will be enormous, enabling machines that can operate as true partners in human environments rather than isolated systems requiring carefully controlled conditions.
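As a small illustration of the timing-difference idea, here is a minimal Python sketch that estimates the bearing of a sound source from two microphone signals via cross-correlation. The microphone spacing, sample rate, and test signals are assumptions chosen for the example; a real system would add filtering, multiple microphone pairs, and tracking over time.

```python
# Estimate the direction of a sound source from the arrival-time difference
# between two microphones, using cross-correlation to find the lag.
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s in air at roughly room temperature
MIC_SPACING = 0.20       # metres between the two microphones (assumed)
SAMPLE_RATE = 48_000     # samples per second (assumed)

def estimate_bearing(left: np.ndarray, right: np.ndarray) -> float:
    """Return the source bearing in degrees (0 = straight ahead, positive = toward the left mic)."""
    # Cross-correlate the two channels and find the lag with maximum overlap.
    corr = np.correlate(left, right, mode="full")
    lag_samples = np.argmax(corr) - (len(right) - 1)
    # lag_samples is negative when the right channel lags the left channel,
    # so flip the sign to get "how many samples later the right mic heard it".
    delay = -lag_samples / SAMPLE_RATE
    # Geometry: delay = spacing * sin(angle) / speed_of_sound.
    sin_angle = np.clip(delay * SPEED_OF_SOUND / MIC_SPACING, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_angle)))

if __name__ == "__main__":
    # Synthetic test: the same noise burst reaches the right mic 10 samples later,
    # i.e. the source sits off to the left.
    rng = np.random.default_rng(0)
    burst = rng.standard_normal(1024)
    left = np.concatenate([burst, np.zeros(10)])
    right = np.concatenate([np.zeros(10), burst])
    print(f"estimated bearing: {estimate_bearing(left, right):.1f} degrees")
```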
The sense that grounds intelligence
If I had to identify the single most transformative sensory capability for artificial systems, the sense whose integration will most dramatically expand machine functionality, it would be sophisticated tactile sensing. Touch is the sense of boundaries and forces, the mechanism through which we understand not just what objects are but how they behave under stress, how they can be manipulated, what they’re made of, and crucially, the immediate feedback that allows learning through physical interaction with the world.
The current generation of robotic systems generally possesses only rudimentary tactile capability: simple pressure sensors that can detect binary states like “gripping” versus “not gripping.” What’s missing is the rich, multi-dimensional tactile information that humans gather continuously through mechanoreceptors distributed across our skin: fine gradations of pressure, temperature, texture, vibration frequency, and the complex integration of these signals that allows us to manipulate objects with extraordinary precision while constantly adjusting our actions based on tactile feedback.
Why does this matter so profoundly? Because without sophisticated touch, robots remain fundamentally limited in their ability to physically interact with their environment in ways that aren’t rigidly pre-programmed. A robot without good tactile sensing must either operate in highly controlled environments where object positions and properties are precisely known, or it must use conservative, slow approaches that compensate for its lack of tactile information through excessive caution. Neither approach scales to the messy, variable conditions of most real-world applications.
Consider the implications of genuine tactile capability across different domains. In surgical robotics, it would enable procedures requiring extraordinary delicacy and real-time adjustment based on tissue response. In manufacturing, it would allow machines to handle variable parts and materials without extensive custom tooling for each variation. In elderly care, it would enable safe, gentle physical assistance rather than the rigid, potentially dangerous movements of current care robots. In maintenance and repair, it would allow machines to work on equipment with variable conditions and improvise solutions rather than following rigid procedures.
The technical challenges of creating artificial touch are considerable: sensor arrays must be dense enough to provide high spatial resolution, responsive enough to detect rapid changes, durable enough to withstand the mechanical stresses of real-world operation, and integrated with control systems capable of processing and responding to tactile information in real-time. Yet progress is being made on all these fronts, and I expect the next decade to see tactile sensing transform from a research curiosity to a standard feature of capable robotic systems.
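To give a flavour of what “processing and responding to tactile information in real time” can look like at its simplest, here is a toy Python control loop that tightens a grip when the pressure image across a hypothetical 8x8 tactile array changes abruptly (as it would during slip) and relaxes it when contact is stable. The sensor values, thresholds, and gripper interface are all invented placeholders, not a real robot API.

```python
# A deliberately simplified tactile grip loop: tighten on suspected slip,
# relax gently when the contact pattern is stable.
import numpy as np

GRID_SHAPE = (8, 8)        # assumed 8x8 tactile array on one fingertip
SLIP_THRESHOLD = 0.15      # assumed mean frame-to-frame change that signals slip
FORCE_STEP = 0.05          # Newtons added per control cycle when slip is detected
MAX_FORCE = 5.0            # safety cap on grip force

def detect_slip(previous: np.ndarray, current: np.ndarray) -> bool:
    """Flag slip when the pressure image changes faster than expected."""
    # Rapid change across many taxels is a common slip signature.
    return float(np.abs(current - previous).mean()) > SLIP_THRESHOLD

def control_step(previous: np.ndarray, current: np.ndarray, force: float) -> float:
    """One cycle of the grip controller: return the updated grip force."""
    if detect_slip(previous, current):
        return min(force + FORCE_STEP, MAX_FORCE)
    # Slowly relax toward a gentler grip while contact remains stable.
    return max(force - 0.2 * FORCE_STEP, 0.0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    force = 1.0
    previous = rng.random(GRID_SHAPE)
    for _ in range(20):
        # Stand-in for a real sensor read: noisy drift around the last frame.
        current = np.clip(previous + rng.normal(0, 0.1, GRID_SHAPE), 0, 1)
        force = control_step(previous, current, force)
        previous = current
    print(f"grip force after 20 cycles: {force:.2f} N")
```

The interesting property is the tightness of the loop: sensing, deciding, and acting happen every few milliseconds, which is precisely what pre-programmed motion sequences cannot offer.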
The invisible sense
Olfaction remains perhaps the most neglected sense in discussions of artificial sensing, yet it possesses unique capabilities that make it invaluable for certain applications. Smell is fundamentally about chemical detection: identifying specific molecules in the environment, but at a level of sensitivity and pattern recognition that far exceeds most analytical chemistry equipment. A trained dog can detect explosives or drugs in concentrations so low they wouldn’t register on most laboratory equipment, and can distinguish between chemically similar compounds that would confuse conventional sensors.
What makes olfaction particularly interesting for machine systems is its predictive power. Many forms of degradation, contamination, or danger produce detectable odors long before they become visible or otherwise apparent. Electrical fires smell distinctive before flames appear. Food spoilage produces characteristic odor patterns long before visible signs emerge. Disease processes can alter body chemistry in ways that produce subtle but detectable changes in scent. In each case, olfactory sensing provides early warning that enables preventive action rather than reactive response.
The applications span multiple domains where current sensing modalities prove inadequate. In healthcare, artificial olfaction could screen for diseases, monitor wound healing, or detect dangerous bacterial infections in vulnerable patients. In industrial settings, it could provide early warning of equipment problems, detect chemical leaks, or monitor process quality in manufacturing. In agriculture, it could assess crop health, detect pest infestations, or determine optimal harvest timing. In environmental monitoring, it could track air quality, detect pollution sources, or monitor ecosystem health through subtle chemical indicators.
The technical challenges of artificial olfaction differ significantly from those of other senses. Rather than processing electromagnetic radiation or mechanical forces, olfactory systems must detect and identify specific molecules, even in complex mixtures where multiple odors overlap. Current approaches using sensor arrays that respond differently to different molecular patterns show promise, but we’re still far from matching biological olfaction in sensitivity, discrimination capability, or the ability to learn new odor patterns through experience rather than pre-programming.
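The sensor-array approach can be illustrated with a toy example: each odour leaves a characteristic response pattern across a handful of partially selective sensors, and an unknown sample is labelled by the closest known “fingerprint”. The fingerprints and sensor values below are invented for the sketch; a real electronic nose would learn them from calibration data and handle far messier mixtures.

```python
# A toy "electronic nose": classify an odour by comparing its response pattern
# across a small sensor array against known reference fingerprints.
import numpy as np

# Assumed reference fingerprints: mean response of a 6-sensor array (arbitrary units).
REFERENCE_ODOURS = {
    "coffee":          np.array([0.9, 0.2, 0.4, 0.7, 0.1, 0.3]),
    "spoiled_food":    np.array([0.2, 0.8, 0.9, 0.1, 0.6, 0.2]),
    "electrical_burn": np.array([0.1, 0.3, 0.2, 0.9, 0.8, 0.7]),
}

def classify(sample: np.ndarray) -> str:
    """Return the reference odour whose fingerprint is closest to the sample."""
    # Normalise so only the pattern across sensors matters, not the concentration.
    sample = sample / np.linalg.norm(sample)
    best_label, best_distance = "", np.inf
    for label, fingerprint in REFERENCE_ODOURS.items():
        reference = fingerprint / np.linalg.norm(fingerprint)
        distance = float(np.linalg.norm(sample - reference))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    # A noisy, diluted reading that should still match "spoiled_food".
    reading = 0.5 * REFERENCE_ODOURS["spoiled_food"] + rng.normal(0, 0.03, 6)
    print(classify(reading))
```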
The underestimated sense of taste
Taste, the sense most closely related to olfaction, might seem the least relevant for machine systems. After all, robots don’t need to enjoy their meals. Yet this dismissal misses the deeper significance of gustatory sensing as a sophisticated chemical analysis system that provides information about molecular composition, purity, and quality through direct interaction with materials.
What we experience as taste is actually a complex integration of chemical sensing (the traditional five taste categories), olfactory input (which provides most of what we call “flavor”), tactile information (texture and temperature), and learned associations that connect taste patterns to nutritional value, safety, and pleasure. For artificial systems, the goal isn’t to recreate subjective gustatory experience but rather to leverage the principle of direct chemical analysis through interaction to assess material properties in ways that other sensing modalities cannot match.
The applications are more significant than might initially appear. In food production, artificial taste could provide quality control that goes beyond simple chemical analysis to assess the complex properties that determine whether a product meets expectations. In pharmaceutical manufacturing, it could verify compound purity and composition. In environmental monitoring, it could analyze water quality or soil composition. In medical diagnostics, it could analyze bodily fluids for signs of disease or metabolic dysfunction. In each case, the key advantage is direct molecular interaction that provides information not easily obtainable through other means.
The technical implementation of artificial taste systems involves chemical sensors, pattern recognition algorithms that can distinguish complex molecular mixtures, and crucially, the ability to learn and refine discrimination through experience rather than requiring explicit programming for every potential chemical combination. While this area remains relatively underdeveloped compared to other sensing modalities, I expect growing interest as the value of sophisticated chemical sensing becomes more widely recognized.
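As a sketch of the “learn and refine through experience” point, the Python snippet below keeps a running average fingerprint per quality class and updates it every time a new labelled sample is analysed, so discrimination sharpens with use rather than being programmed up front. The class names and sensor values are purely illustrative assumptions.

```python
# A minimal incremental model: fold each newly tasted, labelled sample into a
# running mean per class, then label new samples by the nearest learned centroid.
import numpy as np

class IncrementalTasteModel:
    def __init__(self) -> None:
        self.centroids: dict[str, np.ndarray] = {}
        self.counts: dict[str, int] = {}

    def learn(self, label: str, sample: np.ndarray) -> None:
        """Fold one labelled sensor reading into that class's running mean."""
        if label not in self.centroids:
            self.centroids[label] = sample.astype(float)
            self.counts[label] = 1
            return
        self.counts[label] += 1
        self.centroids[label] += (sample - self.centroids[label]) / self.counts[label]

    def predict(self, sample: np.ndarray) -> str:
        """Label a new reading by its nearest learned centroid."""
        return min(self.centroids,
                   key=lambda label: np.linalg.norm(sample - self.centroids[label]))

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    model = IncrementalTasteModel()
    good = np.array([0.8, 0.3, 0.2, 0.6])   # invented "acceptable batch" profile
    off = np.array([0.4, 0.7, 0.6, 0.2])    # invented "off-spec batch" profile
    for _ in range(25):                      # accumulate experience over batches
        model.learn("acceptable", good + rng.normal(0, 0.05, 4))
        model.learn("off_spec", off + rng.normal(0, 0.05, 4))
    print(model.predict(np.array([0.42, 0.68, 0.58, 0.22])))  # expect "off_spec"
```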
Looking ahead to the second part
In this first part, we have explored what it really means for machines to sense the world. We have examined the five artificial senses (vision, hearing, touch, smell, and taste) not as isolated technical features, but as foundational capabilities that ground intelligence in physical reality. Each sense, taken on its own, already expands what machines can perceive and do. Together, they begin to outline a future in which artificial systems are no longer detached observers, but embodied presences embedded in the environments they operate within.
In the second part we will move beyond individual senses and focus on this crucial transition: how multi-sensory integration reshapes machine intelligence, enables new forms of learning and autonomy, and ultimately changes the way machines act, adapt, and relate to humans and their environments.
Don’t forget to subscribe so you don’t miss it!
(Service Announcement)
This newsletter (which now has over 6,000 subscribers and many more readers, as it’s also published online) is free and entirely independent.
It has never accepted sponsors or advertisements, and is made in my spare time.
If you like it, you can contribute by forwarding it to anyone who might be interested, or promoting it on social media.
Many readers, whom I sincerely thank, have become supporters by making a donation.
Thank you so much for your support!




Really solid breakdown of the sensory gap in modern AI. The idea that tactile feedback creates a tight action-perception loop is spot on, but I've noticed in robotics work that even basic force sensing gets weird when latency creeps above 50 ms. The brain compensates for delays we barely notice, but current control systems just oscillate or freeze up. One thing missing here is proprioception, though: knowing where limbs are in space matters as much as knowing what they're touching once these systems start moving autonomously.