These new kinds of networks are revolutionizing how we think about AI

Did you play make-believe when you were a child? Make-believe just got a lot more interesting.

Growing up, I used to dream of magically making a bridge appear over the river that lay between my house and most of the places where I spent time, so I could go as the crow flies instead of trekking all the way to the nearest bridge. I walked a lot in those days. Today, in my lab, I have miniature robot boats that can assemble themselves into a bridge any time I want.

Back then I was also amazed by the Polaroid, by the chemicals that magically brought color photos to life right before my eyes. Now my daughters live in a world with the equivalent of a digital Polaroid: the smartphone. They swipe it, and the images on the screen respond to their touch.

Typing a sentence on the digital Polaroid can conjure up photos or even movies.

But we don’t live in a 2D space; we live in a 3D world. And I am here to tell you that we can now take the image in the Polaroid and turn it into a 3D object you can interact with, and we can type sentences that become intelligent physical machines.

We can do this because we can blend two worlds we have today: the digital world of artificial intelligence and the physical world of robots, leaping from bits to atoms. We know a lot about AI’s applications in the digital world, and we have seen many controllable but unintelligent robots on the manufacturing floor; these two worlds are about to start interacting in extraordinary ways. By combining artificial intelligence with physical machines, we get something very special: physical intelligence, the marriage of digital AI and physical machines. We want to use AI to create machines from pictures or words, and we want those machines to be controlled by AI. Physical AI will change how we interact with things and how we live our lives; it is coming fast around the corner, and it will take a lot of imagination from all of us to make the most of it.

Let me tell you how physical intelligence is coming to life today.

I direct MIT’s Computer Science and Artificial Intelligence Lab, where I work with brilliant researchers to create the future of computing.

We want to take Polaroid photos and turn them into intelligent machines. For example, we can take this photo of a bunny and transform it into a robot. To do that, our cutting-edge algorithm extracts a 3D tessellation of the object and then slices it into layers that get unfolded and can be printed. We then fold the printed layers, string in some motors and sensors, and get the bunny you see in this video. We can use this approach to make anything from a Polaroid.
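To make the slicing step concrete, here is a minimal sketch of cutting a triangle mesh into planar layer outlines. This is my own illustration of the general technique, with my own data types and helper names; it is not the lab’s actual pipeline, which also handles the unfolding and fold-pattern generation.

```python
# Illustrative sketch only: slice a triangle mesh into horizontal layer
# outlines, the step between a 3D tessellation and printable layers.
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]
Triangle = Tuple[Point, Point, Point]
Segment = Tuple[Point, Point]

def slice_triangle(tri: Triangle, z: float) -> Optional[Segment]:
    """Intersect one triangle with the horizontal plane at height z."""
    pts = []
    for i in range(3):
        a, b = tri[i], tri[(i + 1) % 3]
        if (a[2] - z) * (b[2] - z) < 0:  # edge endpoints straddle the plane
            t = (z - a[2]) / (b[2] - a[2])
            pts.append((a[0] + t * (b[0] - a[0]),
                        a[1] + t * (b[1] - a[1]), z))
    return (pts[0], pts[1]) if len(pts) == 2 else None

def slice_mesh(mesh: List[Triangle], layer_height: float, z_max: float):
    """Yield each layer's outline segments, bottom to top."""
    z = layer_height / 2
    while z < z_max:
        yield [s for tri in mesh if (s := slice_triangle(tri, z)) is not None]
        z += layer_height
```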

We can also type sentences and have them become robots. You have seen how words become photos. For words to become physical machines, we need innovation beyond digital AI’s pattern prediction, the kind performed by transformers and diffusion models, because those models lack a fundamental grasp of physics and common sense. Digital AI solutions do not know that if you carry a glass of water upside down, the water will spill.

My lab at CSAIL developed a solution that uses a physical simulator to test for physical constraints and guide the exploration of generative AI. Let me show you.

We start with a simple language prompt, such as “make me a robot that can walk,” and within an hour we have the design, including shape, materials, actuators, and sensors; the program required to control it; and the fabrication files.

You can even give additional commands, for example “make it thicker,” “add some wings,” or “add another leg.” Within 24 hours, rapid fabrication takes you from idea to controllable physical machine.
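To give a feel for how a simulator can guide generative exploration, here is a hedged sketch of a propose-and-test loop. The parametric design, the mutation step, and the stability rule are all stand-ins of my own invention, not the CSAIL system; a real pipeline would put a generative model and a rigid-body physics simulator in these slots.

```python
# Illustrative propose-and-test loop: a generator proposes designs, a
# physics check scores them, and failing designs are discarded.
import random

def propose(prompt: str, parent: dict | None = None) -> dict:
    """Stand-in generator: in a real system, the prompt would condition a
    generative model; here we just mutate a parametric design."""
    base = parent or {"legs": 2, "leg_len": 0.10, "body_mass": 0.5}
    d = dict(base)
    d["legs"] = max(2, d["legs"] + random.choice((-1, 0, 1)))
    d["leg_len"] = max(0.02, d["leg_len"] + random.uniform(-0.02, 0.02))
    return d

def simulate(design: dict) -> float:
    """Stand-in physics check: score walking ability, zero if unstable.
    A real system would run a rigid-body simulation here."""
    stable = design["legs"] * design["leg_len"] > 0.3 * design["body_mass"]
    return design["legs"] * design["leg_len"] if stable else 0.0

def design_robot(prompt: str, iters: int = 200) -> dict:
    best, best_score = None, 0.0
    for _ in range(iters):
        candidate = propose(prompt, parent=best)
        score = simulate(candidate)  # the simulator guides the search
        if score > best_score:
            best, best_score = candidate, score
    return best

print(design_robot("make me a robot that can walk"))
```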

This is how you turn words into intelligent machines. Physical AI gives us the machine and the software to control the machine. Now, we want to interact with these printed machines.

Here you can see a robot acting as a helpful teammate, adapting to the motion of a human to install a cable on a panel. The human’s intent is detected using electromyography (EMG) sensors, devices that measure the electrical activity produced by skeletal muscles. Using AI, the feedback from these sensors is classified into commands such as “go up” or “move right,” which are communicated to the robot over WiFi to execute.
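Here is a rough sketch of that sensing-to-command chain, assuming a conventional windowed-feature classifier and a UDP link over WiFi; the features, model, command labels, and robot address below are illustrative placeholders, not the lab’s implementation.

```python
# Illustrative EMG pipeline: window the muscle signal, extract features,
# classify into a command, and send the command to the robot over WiFi.
import socket
import numpy as np
from sklearn.linear_model import LogisticRegression

COMMANDS = ["go up", "move right"]  # example command categories

def emg_features(window: np.ndarray) -> np.ndarray:
    """Classic per-channel EMG features: mean absolute value and RMS."""
    return np.concatenate([np.mean(np.abs(window), axis=0),
                           np.sqrt(np.mean(window ** 2, axis=0))])

# Placeholder training data: 100 windows of 200 samples over 4 channels.
rng = np.random.default_rng(0)
X = np.stack([emg_features(w) for w in rng.normal(size=(100, 200, 4))])
y = rng.integers(0, len(COMMANDS), size=100)  # labels would come from trials
clf = LogisticRegression(max_iter=1000).fit(X, y)

def send_command(window: np.ndarray, robot_addr=("192.168.0.42", 9000)):
    """Classify one window of EMG and push the command to the robot."""
    cmd = COMMANDS[int(clf.predict(emg_features(window)[None])[0])]
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(cmd.encode(), robot_addr)
```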

The robot can even learn from humans how to do tasks independently. In our lab we created a kitchen laboratory where we instrument people with sensors and record their pose, muscle activity, and even gaze as they perform kitchen tasks. We then use AI to learn from this data how to get robots to perform those tasks. The end result? Machines that move with greater grace and agility, more like people and less mechanistically.
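One standard way to learn tasks from such recordings is behavior cloning: fit a model that maps the recorded human state to actions, then let the robot query the learned policy at each control step. The sketch below is my own, with placeholder data and an off-the-shelf regressor; the lab’s actual learning method may differ.

```python
# Illustrative behavior cloning from human demonstrations.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
states = rng.normal(size=(5000, 24))   # placeholder pose + muscle + gaze features
actions = rng.normal(size=(5000, 7))   # placeholder 7-DoF arm commands

# Supervised learning: imitate the demonstrated state-to-action mapping.
policy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300).fit(states, actions)

def act(state: np.ndarray) -> np.ndarray:
    """At runtime, the robot queries the learned policy each control step."""
    return policy.predict(state[None])[0]
```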

The AI solutions that allow robots to function as teammates or operate autonomously have to run on computers incorporated into the body of the robot, and only small computers fit. Such hardware systems with small onboard computers are called edge devices. Today’s most powerful generative AI and machine learning solutions require cloud-based server farms; they cannot run on edge devices. Furthermore, because robots operate in the physical world, they are considered safety-critical systems, and any software running on them needs to be safe: the AI brain cannot make mistakes and, unlike today’s digital AI solutions, needs to explain how it makes its decisions.

We are tackling these Physical AI challenges with inspiration from nature, more precisely from a worm called C. elegans, which has inspired a lot of work in neuroscience. In stark contrast to the billions of neurons in the human brain, the worm gets by on 302 neurons, which allow it to find food, move around, and find shelter: everything it needs for a good life.

Inspired by the neural system of the worm, we have created a new class of machine learning models called Liquid Networks. Compared to today’s solutions, they are more compact, more energy-efficient, and more explainable.
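For the mathematically curious: in the liquid time-constant formulation the CSAIL team published (Hasani et al., 2021), each neuron carries a state $x(t)$ whose effective time constant depends on the input, which is where the “liquid” adaptivity comes from. Writing it from memory, the core dynamics are approximately

$$\frac{dx(t)}{dt} = -\left[\frac{1}{\tau} + f\big(x(t), I(t), t, \theta\big)\right] x(t) + f\big(x(t), I(t), t, \theta\big)\, A,$$

where $\tau$ is a base time constant, $I(t)$ is the input, $f$ is a bounded nonlinearity with learned parameters $\theta$, and $A$ is a learned bias. Because $f$ also modulates the decay term, the neuron’s time constant shifts with its input, unlike a standard recurrent unit whose dynamics are fixed.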

Let me show you.

This is our self-driving car, trained using a traditional AI solution, the type that is foundational in many applications today.

In the lower right corner you see the map; in the upper left corner, the camera input stream; and the big box with blinking blue and yellow dots is the decision-making engine, with 100,000 artificial neurons. It is impossible to correlate the activity of these neurons with the car’s behavior. Furthermore, the lower left corner shows where in the image the AI engine looks to figure out what to do: the attention is very noisy, and this AI solution stays in its lane by looking at the bushes and trees on the side of the road. That’s not how most people drive.

Now contrast that with our worm-inspired liquid network for self-driving cars. Our solution drives the car using 19 neurons rather than 100,000, and its attention is clean, focused on the road horizon and the side of the road. This is how you get a self-driving car to pay better attention.

Each of these neurons, though, is more computationally powerful than the neurons we typically use in today’s AI solutions. This neural network architecture paves the way for foundation models that are thousands of times more efficient, and it even offers a potential solution to the AI chip problem.

The ability to turn photos and words into functional machines, coupled with our ability to use liquid networks to create powerful brains for those machines, is enabling the new world of physical intelligence. Physical intelligence will impact how we live in profound ways, granting us the ability to interact with our world in ways that once existed only in the realm of fantasy. It’s up to us to use our imagination to invent these capabilities.

Yet seeking to develop and understand Physical AI is teaching us that we still have so much to learn about technology and about ourselves, raising questions that we need to answer sooner rather than later, before the implications of AI play out without a guiding hand.

After all, we remain responsible for this planet, and everything living on it.

It’s our privilege to be the only species so advanced, so aware, and so capable of building these extraordinary tools.

I remain convinced that we have the ability to harness their power to ensure a better future for all of humanity and the planet.

Thank you.
