Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment

Macdonald, Jacob P.; Mallick, Rohit; Wollaber, Allan B.; Peña, Jaime D.; McNeese, Nathan; Siu, Ho Chit

Download: 3610978.3640671.pdf (3.413 MB)

Publisher with Creative Commons License

Terms of use

Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/

Abstract

The Context-observant LLM-Enabled Autonomous Robots (CLEAR) platform offers a general solution for large language model (LLM)-enabled robot autonomy. CLEAR-controlled robots use natural language to perceive and interact with their environment: contextual description deriving from computer vision and optional human commands prompt intelligent LLM responses that map to robotic actions. By emphasizing prompting, system behavior is programmed without manipulating code, and unlike other LLM-based robot control methods, we do not perform any model fine-tuning. CLEAR employs off-the-shelf pre-trained machine learning models for controlling robots ranging from simulated quadcopters to terrestrial quadrupeds. We provide the open-source CLEAR platform, along with sample implementations for a Unity-based quadcopter and Boston Dynamics Spot® robot. Each LLM used, GPT-3.5, GPT-4, and LLaMA2, exhibited behavioral differences when embodied by CLEAR, contrasting in actuation preference, ability to apply new knowledge, and receptivity to human instruction. GPT-4 demonstrates best performance compared to GPT-3.5 and LLaMA2, showing successful task execution 97% of the time. The CLEAR platform contributes to HRI by increasing the usability of robotics for natural human interaction.
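The loop the abstract describes (vision-derived context plus an optional human command form a prompt, and the LLM's reply is mapped to a robot action) can be illustrated with a minimal sketch. This is not the CLEAR platform's code or API: the action vocabulary, the function names (`build_prompt`, `parse_action`, `step`), and the offline `fake_llm` stand-in are all hypothetical, chosen only to make the idea concrete and runnable without a model.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical action vocabulary; the abstract does not list CLEAR's actual actions.
ACTIONS: Dict[str, str] = {
    "forward": "move forward",
    "left": "turn left",
    "right": "turn right",
    "stop": "hold position",
}


def build_prompt(detections: List[str], command: Optional[str] = None) -> str:
    """Compose a natural-language prompt from vision labels and an optional command."""
    scene = ", ".join(detections) if detections else "nothing notable"
    lines = [
        "You control a robot. Reply with exactly one action word.",
        "Allowed actions: " + ", ".join(ACTIONS) + ".",
        "Camera sees: " + scene + ".",
    ]
    if command:
        lines.append("Human command: " + command)
    return "\n".join(lines)


def parse_action(reply: str) -> str:
    """Map free-form LLM text to the first known action word, defaulting to 'stop'."""
    for word in reply.lower().split():
        token = word.strip(".,!?\"'")
        if token in ACTIONS:
            return token
    return "stop"


def step(llm: Callable[[str], str], detections: List[str],
         command: Optional[str] = None) -> str:
    """One control step: describe the scene, query the model, return an action key."""
    return parse_action(llm(build_prompt(detections, command)))


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model call (assumption, for illustration only)."""
    if "human command: turn left" in prompt.lower():
        return "Left."
    return "I will go forward"


if __name__ == "__main__":
    print(step(fake_llm, ["chair", "door"]))
    print(step(fake_llm, ["chair"], "Turn left at the chair"))
```

In this sketch, changing system behavior means editing the prompt text in `build_prompt` rather than the control logic, which mirrors the abstract's emphasis on prompting over code changes or fine-tuning. Falling back to `stop` on an unrecognized reply is a design choice of the sketch, not something the abstract specifies.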

Description

HRI ’24 Companion, March 11–14, 2024, Boulder, CO, USA

Date issued

2024-03-11

URI

https://hdl.handle.net/1721.1/154055

Department

Lincoln Laboratory

Publisher

ACM

Citation

Macdonald, Jacob P., Mallick, Rohit, Wollaber, Allan B., Peña, Jaime D., McNeese, Nathan et al. 2024. "Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment."

Version: Final published version

ISBN

979-8-4007-0323-2

Collections

MIT Open Access Articles
