Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment
Author(s)
Macdonald, Jacob P.; Mallick, Rohit; Wollaber, Allan B.; Peña, Jaime D.; McNeese, Nathan; Siu, Ho Chit; ...
Publisher with Creative Commons License
Creative Commons Attribution
Terms of use
Abstract
The Context-observant LLM-Enabled Autonomous Robots (CLEAR) platform offers a general solution for large language model (LLM)-enabled robot autonomy. CLEAR-controlled robots use natural language to perceive and interact with their environment: contextual description deriving from computer vision and optional human commands prompt intelligent LLM responses that map to robotic actions. By emphasizing prompting, system behavior is programmed without manipulating code, and unlike other LLM-based robot control methods, we do not perform any model fine-tuning. CLEAR employs off-the-shelf pre-trained machine learning models for controlling robots ranging from simulated quadcopters to terrestrial quadrupeds. We provide the open-source CLEAR platform, along with sample implementations for a Unity-based quadcopter and Boston Dynamics Spot® robot. Each LLM used, GPT-3.5, GPT-4, and LLaMA2, exhibited behavioral differences when embodied by CLEAR, contrasting in actuation preference, ability to apply new knowledge, and receptivity to human instruction. GPT-4 demonstrates best performance compared to GPT-3.5 and LLaMA2, showing successful task execution 97% of the time. The CLEAR platform contributes to HRI by increasing the usability of robotics for natural human interaction.
Description
HRI ’24 Companion, March 11–14, 2024, Boulder, CO, USA
Date issued
2024-03-11
Department
Lincoln Laboratory
Publisher
ACM
Citation
Macdonald, Jacob P., Mallick, Rohit, Wollaber, Allan B., Peña, Jaime D., McNeese, Nathan et al. 2024. "Language, Camera, Autonomy! Prompt-engineered Robot Control for Rapidly Evolving Deployment."
Version: Final published version
ISBN
979-8-4007-0323-2