Google Launches Gemini Robotics On-Device AI Model

Google DeepMind has unveiled its latest artificial intelligence model, Gemini Robotics On-Device, which runs entirely on the robot's local hardware. This vision-language-action (VLA) model is designed to improve how robots perform in real-world settings without relying on a data network, which makes it particularly useful for latency-sensitive applications. Currently, access to the model is limited to participants in the company’s trusted tester program.

Gemini Robotics On-Device: A New Era for Robotics

Carolina Parada, Senior Director and Head of Robotics at Google DeepMind, announced the launch of Gemini Robotics On-Device in a recent blog post. This VLA model can be accessed through a Gemini Robotics software development kit (SDK) by those who have enrolled in the tester program. Users can also experiment with the model using Google’s MuJoCo physics simulator, which allows for realistic testing scenarios.

Google has not disclosed the model’s architecture or training methods, as the model is proprietary, but it has highlighted several capabilities. The VLA model is tailored for bi-arm robots and has minimal computational requirements, which lets it run efficiently on local devices. Notably, Google says it can adapt to new tasks with just 50 to 100 demonstrations.

Advanced Capabilities and Performance

Gemini Robotics On-Device is designed to understand and execute natural language instructions, allowing it to perform complex tasks such as unzipping bags and folding clothes. According to internal testing conducted by Google, the AI model demonstrates strong generalization performance while operating entirely locally. It reportedly outperforms other on-device models, particularly in handling challenging out-of-distribution tasks and executing complex multi-step instructions.

The model was initially trained for ALOHA robots but has also been successfully adapted for use with Franka FR3 and Apptronik’s Apollo humanoid robots. All these robots share a bi-arm configuration, which is essential for compatibility with Gemini Robotics On-Device. This adaptability highlights the model’s versatility across different robotic platforms.

Real-World Applications and Future Potential

The capabilities of Gemini Robotics On-Device extend beyond simple tasks. According to Google, the AI model can handle previously unseen objects and scenarios, and it can even execute industrial belt assembly tasks that require a high degree of precision and dexterity.

As the technology continues to evolve, the implications for robotics in various industries could be significant. The ability to operate independently of a data network opens up new possibilities for deployment in remote or latency-sensitive situations. With ongoing testing and development, Gemini Robotics On-Device may pave the way for more advanced robotic solutions in the future, enhancing productivity and efficiency across multiple sectors.

