Framework links Artificial Intelligence language agents with ROS for easier robot control

A new framework combines large language model-based Artificial Intelligence agents with ROS to let non-experts program robots through natural language. It also adds imitation learning, action optimization, and iterative feedback to expand and refine robot skills.

Researchers have introduced a framework that integrates large language model-based Artificial Intelligence agents with the Robot Operating System (ROS), aiming to make robot programming more flexible and accessible through natural language. The design targets a longstanding limitation in robotics, where experts typically must break tasks into atomic actions and manually assemble behaviors. That approach remains effective in controlled settings, but it is less suited to dynamic environments such as homes or healthcare contexts, where capabilities may need to be updated quickly by non-experts.

The system divides responsibilities between experts and non-experts. Experts provide an initial library of pre-trained atomic actions such as picking and navigation, while non-experts interact through a chat interface without needing to write code. The framework centers on four connected parts: an atomic action library, an imitation learning module, an atomic action optimizer, and an Artificial Intelligence agent. Imitation learning allows users to expand the robot’s skill set by physically guiding the robot or demonstrating tasks, and the optimizer uses large language models plus Bayesian optimization to tune parameters in action code. The Artificial Intelligence agent then selects actions from user instructions and text-based environmental observations, supporting single-step execution, multi-step sequencing, custom code, and behavior trees for more complex logic.
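The division of labor described above can be sketched in a few lines of Python. This is a minimal illustration, not the framework's actual code: the `ActionLibrary` class, the `pick` and `navigate` stubs, and the hard-coded plan are all hypothetical names introduced here, and a real deployment would call ROS actions and obtain the plan from the language model rather than from a literal list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical type alias: an atomic action takes keyword arguments
# and returns a short status string.
Action = Callable[..., str]


@dataclass
class ActionLibrary:
    """Registry of expert-provided atomic actions, keyed by name."""
    actions: Dict[str, Action] = field(default_factory=dict)

    def register(self, name: str, fn: Action) -> None:
        self.actions[name] = fn

    def run_sequence(self, plan: List[Tuple[str, dict]]) -> List[str]:
        """Execute a multi-step plan; each step is (action_name, kwargs)."""
        results = []
        for name, kwargs in plan:
            if name not in self.actions:
                raise KeyError(f"unknown action: {name}")
            results.append(self.actions[name](**kwargs))
        return results


# Stub actions standing in for real robot skills (assumptions for illustration).
def pick(obj: str) -> str:
    return f"picked {obj}"


def navigate(target: str) -> str:
    return f"navigated to {target}"


library = ActionLibrary()
library.register("pick", pick)
library.register("navigate", navigate)

# In the described system an agent would derive this plan from a chat prompt;
# here it is hard-coded.
plan = [("navigate", {"target": "kitchen"}), ("pick", {"obj": "cup"})]
log = library.run_sequence(plan)
```

Keeping actions in a name-keyed registry is one plausible way to let new skills, such as those learned from demonstration, be added without changing the sequencing logic.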

Testing across several robots and environments showed the framework could handle both planning and adaptation. In a kitchen setup using a UR5 arm, the robot completed a 12-step coffee-making task from a single natural language prompt, demonstrating strong long-horizon planning without human intervention. Non-experts then added actions such as stirring and pouring through demonstration, enabling a later pasta-cooking task. In tabletop rearrangement experiments, performance dropped when relying only on the language model, but success rates remained consistently high when human corrections were added. The system also reused earlier feedback, applying prior corrections in later trials without being told again.

The framework also worked in remote and unstructured scenarios. An operator in Europe successfully controlled a robot in Asia using natural language, completing pick-and-place tasks despite a 2-3 second delay. In a laboratory setting, the system interpreted textbook-style instructions to conduct a pH test. Bayesian optimization improved air hockey performance from 30% to 52%, and a quadruped robot demonstrated real-time failure recovery in an office environment by resolving issues such as gripper obstructions.

Several reliability challenges remain. Performance was sensitive to prompt wording, and small phrasing changes could cause failures. The model could also be distracted by incidental examples or generate actions not present in the action library, though few-shot prompting reduced that behavior. Even with those limits, the framework showed that natural language control, imitation learning, and feedback-driven adjustment can make robotic systems more usable while still falling short of general-purpose autonomy.
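One simple guard against a model proposing actions that are not in the library is to check each proposed step against the known action names before anything runs. The sketch below is an assumption added for illustration; the article reports only that few-shot prompting reduced the behavior and does not describe such a check. The `split_plan` function and the action names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def split_plan(proposed: Iterable[str], known: Set[str]) -> Tuple[List[str], List[str]]:
    """Separate proposed action names into known and unknown ones, preserving order."""
    valid: List[str] = []
    unknown: List[str] = []
    for name in proposed:
        if name in known:
            valid.append(name)
        else:
            unknown.append(name)
    return valid, unknown


# Hypothetical library contents and a plan containing one invented action.
known_actions = {"pick", "place", "navigate"}
proposed_plan = ["navigate", "pick", "fold_laundry", "place"]

valid, unknown = split_plan(proposed_plan, known_actions)
# Unknown steps could be reported back to the user or the model
# instead of being executed.
```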
