Does AI need a “body” to become truly intelligent?
Big Tech is now converging on robotics in a race with China.
Hey Everyone,
Robots are becoming a lot more important to how we make progress in AI. LLMs are opening up new possibilities for embodied AI.
Embodiment Hypothesis is taking New Shapes
There is an idea called the embodiment hypothesis that I really like: intelligence emerges from an agent's interaction with an environment, as a result of sensorimotor activity.
As LLMs give robots a voice and agentic AI comes into being, I believe robots will take on increased importance in AI research and in the U.S. vs. China innovation race. The embodiment hypothesis argues that human-level intelligence can only emerge if an intelligence is able to sense and navigate a physical environment, the same way babies do. Intuitively this almost feels correct, just as a kitten needs to learn the basics by exploring its world.
Robots made their stage debut the day after New Year's 1921; more than one hundred years later, we will likely see fairly intelligent robots by 2030. Decades of science fiction have prepared us for what might soon arrive.
Recently, the ways in which LLMs and robots intersect have taken some surprising twists and turns. A study from the University of California, Berkeley enables robots to navigate based on the next-word-prediction principle behind language models. This approach could pave the way for a new generation of robots that can navigate complex environments with minimal training.
We may be at the dawn of a very different world.
A recent New York Post article suggests robots are "replacing humans in spas and doctors' offices." The real world used to be a hard place for robots to do anything; that's slowly changing in the 2020s.
A brave new world of robotics research suggests robots may be useful for a variety of tasks in society in the 2030s, and especially the 2040s.
Berkeley really is a nexus for a lot of what is going on. In their paper, "Humanoid Locomotion as Next Token Prediction," the researchers treat the complex task of robot locomotion as a sequence prediction problem, similar to predicting the next word in language generation.
The researchers collected a dataset of trajectories from various sources: neural network policies, model-based controllers, human motion capture, and YouTube videos of humans. They then used this dataset to train a transformer policy via autoregressive modeling of observations and actions. The resulting transformer allows a humanoid to walk zero-shot on various terrains around San Francisco.
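To make the idea concrete, here is a minimal sketch of what "locomotion as next-token prediction" could look like in code: observations and actions are discretized into a shared token vocabulary, interleaved into one sequence, and a causal transformer is trained to predict the next token. This is an illustrative PyTorch sketch with made-up sizes and a random stand-in batch, not the Berkeley authors' released code.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 1024    # assumed size of the shared observation/action token vocabulary
CONTEXT_LEN = 256    # assumed context window, in tokens
D_MODEL = 256

class TrajectoryTransformer(nn.Module):
    """Causal transformer that predicts the next token in an interleaved
    [obs_0, act_0, obs_1, act_1, ...] stream, like a language model."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos = nn.Embedding(CONTEXT_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=8, dim_feedforward=4 * D_MODEL,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, tokens):                          # tokens: (batch, seq)
        seq_len = tokens.shape[1]
        x = self.embed(tokens) + self.pos(torch.arange(seq_len, device=tokens.device))
        causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len).to(tokens.device)
        x = self.blocks(x, mask=causal_mask)            # attend only to past tokens
        return self.head(x)                             # logits over the next token

model = TrajectoryTransformer()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# One autoregressive training step: predict token t+1 from tokens 0..t,
# exactly like next-word prediction in a language model. The random batch
# below stands in for real tokenized trajectories.
batch = torch.randint(0, VOCAB_SIZE, (8, CONTEXT_LEN))
logits = model(batch[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, VOCAB_SIZE), batch[:, 1:].reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

At deployment, actions come out the same way a language model generates text: one token at a time, conditioned on the stream of observations the robot has seen so far.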
In NYC, Aescape has created the first commercially available, fully automated AI masseuse, one that's a bit more advanced than your average mall massage chair and works faster than your favorite masseuse.
A Better Masseuse than a Human?
The Aescape massage robot has two heated "hands," modeled after human hands and replicating their shape and strength, that move simultaneously and apply equal pressure, doing twice the work in half the time a human could.
But what are we really heading toward?
What is a “general-purpose humanoid?”
So what happens when commercial AGI, that is, artificial general intelligence (AI that can perform as well as or better than humans on a wide range of cognitive tasks), gets its equivalent in robotics? We'll have robots that can learn and perform a range of tasks in cities, homes, factories, warehouses, restaurants, and so forth.
The idea of some AGI point intersecting with robotics is somewhat speculative, but general-purpose humanoid robots might even be able to perform social tasks out and about in the world. The fact remains, in 2024 we don't know precisely what "commercial" AGI in robots might look like.
Mind you, specialized robots are doing just fine without AGI-uplifted general-purpose humanoid robots:
For a state-of-the-art manicure, stylistas in a hurry can insert a nail color cartridge, much like loading a pod into a Nespresso machine, into a Clockwork machine, which uses AI and 3D technology to outline the nail and apply a professional-looking polish change in less time than it takes some people to choose a nail color.
See the Demo
By the 2040s, robots will likely be deeply involved in senior care, helping shoulder the burden of the demographic winter that is quickly approaching over the next 20 years. My hypothesis is that this might force China to push ahead in certain kinds of robotics, since the demographic winter arrives there faster than in many other places.
Sometime between 2028 and 2045, robots will go from mediocre specialists to highly capable generalists across a range of tasks.
They will get better at learning how to learn, at learning in multi-modal contexts, and at learning from synthetic data.
Ok, Robot
The recent AI boom has led to enormous leaps in language and computer vision capabilities, allowing robotics researchers access to open-source AI models and tools that didn’t exist even three years ago, says Matthias Minderer, a senior computer vision research scientist at Google DeepMind, who was not involved in the project.
How intimate will robots become in our lives? What's clear is that in the next few decades, breakthroughs in robotic learning, mobile manipulation, and locomotion (among others) will shape the role automation plays in our daily lives, and even with fewer people around we'll find ways to populate the planet and anthropomorphize our creations.
In 2019, Meta (then Facebook) unveiled AI Habitat, an open-source simulation platform for training AIs to navigate homes, offices, and other spaces. In October 2023, Meta updated the platform — and this version brings human avatars into the simulated world. In the years ahead I believe companies like Meta, Nvidia, Microsoft and Google will push robotics at the intersection of AI into the future.
“Habitat 3.0 enables virtual collaboration between robots and people, where robots adapt to non-stationary environments and account for the actions, movements, and intents of humans”
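As a toy illustration of what "robots that account for the actions and movements of humans" means in practice, here is a minimal sketch of a robot that replans around a wandering human avatar at every step of a shared simulation. This is not the Habitat API; the grid world, destinations, and greedy policy are all invented for illustration.

```python
import random

GRID = 8                                   # toy 8x8 shared space
robot, human = (0, 0), (4, 4)
robot_goal, human_dest = (7, 7), (0, 7)    # each agent has its own destination

def step_toward(pos, target, blocked):
    """Greedy one-cell move toward target that never enters the blocked cell."""
    x, y = pos
    moves = [(x + dx, y + dy) for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]]
    moves = [m for m in moves
             if 0 <= m[0] < GRID and 0 <= m[1] < GRID and m != blocked]
    return min(moves, key=lambda m: abs(m[0] - target[0]) + abs(m[1] - target[1]))

for t in range(40):
    # the human avatar either pauses or keeps walking toward their destination
    human = random.choice([human, step_toward(human, human_dest, robot)])
    # the robot replans around the human's new position at every step
    robot = step_toward(robot, robot_goal, human)
    if robot == robot_goal:
        break
print(f"after {t + 1} steps: robot at {robot}, human at {human}")
```

The point of platforms like Habitat 3.0 is to let policies practice exactly this kind of adaptation, against far richer human behavior, in simulation before touching the real world.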
In the 2020s, we'll create the "kindergartens," both virtual and physical, where the next stage of robots will be born, if you will.
A world of smarter drones, specialized robots, general-purpose robots, and humanoid robots is coming. What kind of different world will that feel like?
If the embodiment hypothesis is true, this pairing of a company developing some of the most advanced AI brains with one building state-of-the-art robot bodies could be the combination that leads to real human-like intelligence in an artificial being. This is why some evangelists like myself are watching OpenAI's partnership with Figure AI, as well as what China is doing, very closely.
I believe that China's building ecosystem is better suited to arriving at an "AGI of robotics" moment first.
I don’t need a fancy demo to tell me what’s coming. Generative AI might not create as much value or generate much revenue by itself in the mid 2020s, but what it might lead to with robotics, science, healthcare and drug discovery over the long-term is fairly impressive.
I don't know if the embodiment hypothesis is true, but we're going to find out within a generation or two perhaps. Competition between the U.S. and China is likely going to be good for innovation on the whole. Agentic AI and humanoid general-purpose robots will begin to attract more funding, even as pure-play foundational LLM startups go bust, unable to meet the demands of investors.
What do you think of the future of robotics in the age of Generative AI?
I am more in the camp that physical embodiment is not necessary for intelligent systems. However, an agentic approach with world models and "no-free-lunch" virtual experiences seems like a plausible way to create fully intelligent agents that live in simulated environments. (By that I mean a framework where the agent's actions in the environment interact in a pullback or pushout sense, similar to deep reinforcement learning but a bit more abstract: it is the actions that interact, not necessarily the agent itself, which creates an abstract map we can think of as representing the "intuition" that current models seem to lack.) Of course, fully interactive real-world applications require robotics, but I do not see this as a prerequisite for creating artificial intelligence.
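To make the world-model idea in the comment above concrete, here is a minimal sketch of an agent that fits a simple dynamics model from logged transitions and then plans "in imagination," in the spirit of model-based deep RL. The toy 1-D environment and the linear model are illustrative assumptions of mine, not the commenter's pullback/pushout formalism.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden "real" 1-D environment: the state drifts by the action plus noise,
# and reward is the negative distance to the origin.
def real_step(s, a):
    s_next = s + a + rng.normal(0, 0.05)
    return s_next, -abs(s_next)

# 1) Collect transitions from the real environment with random actions.
data, s = [], 3.0
for _ in range(200):
    a = rng.uniform(-1, 1)
    s_next, _ = real_step(s, a)
    data.append((s, a, s_next))
    s = s_next

# 2) Fit a linear world model s' ~= w0*s + w1*a by least squares.
X = np.array([[s, a] for s, a, _ in data])
y = np.array([s_next for _, _, s_next in data])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def imagined_step(s, a):
    return w[0] * s + w[1] * a            # the model's prediction, no real rollout needed

# 3) Plan "in imagination": pick the action whose predicted next state is closest to 0.
def plan(s, candidates=np.linspace(-1, 1, 21)):
    return min(candidates, key=lambda a: abs(imagined_step(s, a)))

s = 3.0
for _ in range(10):
    s, _ = real_step(s, plan(s))          # act in the real world using the planned action
print(f"state after planning with the learned world model: {s:.3f}")
```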
I know little about this area but plan on exploring it more at Risk & Progress soon. Honestly, the embodiment hypothesis does not ring true to me. That is, I don't think you need a "body" so much as you need senses. Even then, human senses are very limited, and look how far we have gotten.