[ExI] Boston Dynamics turned its robot dog into a talking tour guide with ChatGPT

Sat Oct 28 19:38:14 UTC 2023

Robots That Can Chat

We created a robot tour guide using Spot integrated with ChatGPT and
other AI models as a proof of concept for the robotics applications of
foundation models.


Making a Robot Tour Guide using Spot’s SDK

A robot tour guide offered us a simple demo to test these concepts: the
robot could walk around, look at objects in the environment, use a VQA
or captioning model to describe them, and then elaborate on those
descriptions using an LLM. Additionally, the LLM could answer
questions from the tour audience and plan what actions the robot
should take next. In this way, the LLM can be thought of as an improv
actor: we provide a broad-strokes script and the LLM fills in the
blanks on the fly.
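
The loop described above (caption what the robot sees, hand that to an
LLM, and let the LLM pick the next action) can be sketched roughly as
follows. This is a minimal illustration, not Boston Dynamics' code and
not Spot's SDK: the action names (go_to, say, ask), the prompt wording,
and the caption_fn and llm_fn callables are all hypothetical stand-ins
for whatever captioning model and LLM are actually used.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical action space; these names are illustrative and are not
# taken from Spot's SDK or from the original demo.
ACTIONS = ("go_to", "say", "ask")


@dataclass
class Action:
    name: str
    argument: str


def build_prompt(locations: Sequence[str], caption: str,
                 question: Optional[str]) -> str:
    """Assemble the 'broad-strokes script' that the LLM fills in."""
    lines = [
        "You are a robot tour guide.",
        "Known locations: " + ", ".join(locations),
        "You currently see: " + caption,
    ]
    if question:
        lines.append("A visitor asks: " + question)
    lines.append(
        "Reply with exactly one line: "
        "go_to: <location>, say: <text>, or ask: <question>."
    )
    return "\n".join(lines)


def parse_action(reply: str, locations: Sequence[str]) -> Action:
    """Map a free-form LLM reply onto the small, fixed action space."""
    name, sep, argument = reply.partition(":")
    name = name.strip().lower()
    argument = argument.strip()
    if not sep or name not in ACTIONS:
        # Unknown or malformed action: just speak the reply.
        return Action("say", reply.strip())
    if name == "go_to" and argument not in locations:
        # Refuse to navigate to a place that is not on the map.
        return Action("say", f"I don't know how to get to {argument}.")
    return Action(name, argument)


def tour_step(caption_fn: Callable[[str], str],
              llm_fn: Callable[[str], str],
              locations: Sequence[str],
              current_location: str,
              question: Optional[str] = None) -> Action:
    """One iteration: look, describe, ask the LLM, choose an action."""
    caption = caption_fn(current_location)  # stand-in for VQA/captioning
    prompt = build_prompt(locations, caption, question)
    return parse_action(llm_fn(prompt), locations)
```

With stub models plugged in, for example
tour_step(lambda loc: "a desk", lambda p: "go_to: IT help desk",
["lobby", "IT help desk"], "lobby"), the step yields a go_to action.
Constraining the reply to a few named actions keeps the robot's
behavior checkable, while the LLM still improvises inside that frame.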

We encountered a few surprises while putting this demo together. For
one, emergent behavior quickly arose from the robot’s very simple
action space alone.

For example, we asked the robot “who is Marc Raibert?”, and it
responded “I don’t know. Let’s go to the IT help desk and ask!”, then
proceeded to ask the staff at the IT help desk who Marc Raibert was.
We didn’t prompt the LLM to ask for help. It drew the association
between the location “IT help desk” and the action of asking for help
independently. Another example: we asked the robot who its “parents”
were, and it went to the “old Spots”, where Spot V1 and BigDog are
displayed in our office, and told us that these were its “elders”.

We’re excited to continue exploring the intersection of artificial
intelligence and robotics. These two technologies are a great match.
Robots provide a fantastic way to “ground” large foundation models in
the real world. By the same token, these models can help provide
cultural context, general commonsense knowledge, and flexibility that
could be useful for many robotics tasks—for example, being able to
assign a task to a robot just by talking to it would help reduce the
learning curve for using these systems.

Yes, combining LLMs with robot devices is the obvious next step.


