Nvidia’s New Robot: Learning by Seeing

After nearly two decades establishing itself as a global leader in graphics processing units, Nvidia has spent the past several years expanding into a handful of adjacent markets.

While its established roots in the gaming market still account for most of its revenue, Nvidia is gaining significant traction in autonomous vehicles, datacentres, AI, machine learning and visualisation. In May 2018 the American company unveiled one of its AI department’s most exciting research programs: a robotics system capable of perceiving, programming and replicating human tasks.

Robots performing human tasks such as packaging or assembly are by no means new technology. Historically, however, communicating these tasks has been a complex process of rigid, systematic programming of each required stage. According to the research team behind the project, Nvidia’s system is not only capable of recognising a task from a single demonstration, but also smart enough to understand the context of the task and then replicate it. In practice, this means that if the components of the task do not exactly match those used in the demonstration, the robot can adjust until the completion criteria are met.

Lead researchers Birchfield and Tremblay discuss the project: “For robots to perform useful tasks in real-world settings, it must be easy to communicate the task to the robot; this includes both the desired result and any hints as to the best means to achieve that result. With demonstrations, a user can communicate a task to the robot and provide clues as to how to best perform the task.”

The robot’s perception comes from a video feed of the task being performed. Neural networks digest that feed and pass the result to another network, which deduces an explanation of how to recreate the action. Finally, an execution network interprets this explanation so that the robot can replicate the task.
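To make the three stages more concrete, here is a minimal Python sketch of how such a pipeline might fit together. It is an illustration only, not Nvidia’s implementation: plain functions stand in for the neural networks, the "video" is a list of already-labelled frames, and every name (`perceive`, `generate_plan`, `execute`, `Scene`, `Step`) is hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins: plain functions play the role of the neural
# networks described in the article. Nothing here is Nvidia's actual code.
Scene = Dict[str, str]   # object name -> what it currently rests on
Step = Tuple[str, str]   # (object to move, destination)


def perceive(frames: List[Scene]) -> Scene:
    """Stand-in for the perception networks: reduce a demonstration
    (a list of labelled frames) to the arrangement it ends in."""
    return dict(frames[-1])


def generate_plan(goal: Scene) -> List[Step]:
    """Stand-in for the network that deduces an explanation of the task:
    emit one readable step per object in the goal arrangement."""
    return sorted(goal.items())


def execute(plan: List[Step], world: Scene, max_passes: int = 3) -> Scene:
    """Stand-in for the execution network: apply the steps and keep
    re-checking the world until every step's outcome holds."""
    world = dict(world)  # work on a copy, leave the caller's scene intact
    for _ in range(max_passes):
        pending = [(obj, dest) for obj, dest in plan if world.get(obj) != dest]
        if not pending:  # completion criteria met
            break
        for obj, dest in pending:
            world[obj] = dest
    return world


# A single demonstration: the blue part ends up on the red part.
demo = [
    {"red": "table", "blue": "table"},
    {"red": "table", "blue": "red"},
]
goal = perceive(demo)
plan = generate_plan(goal)

# The robot's starting scene does not match the demonstration's.
start = {"red": "blue", "blue": "table"}
result = execute(plan, start)
assert result == goal
```

The point of the sketch is the separation of concerns: the plan is an explicit, inspectable list of steps, and execution checks the goal state rather than replaying fixed motions, which is why a starting scene that differs from the demonstration can still end in the demonstrated arrangement.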

A breakthrough like this not only significantly broadens the range of industrial applications for AI and automation technology; the ability to program tasks through simple demonstration could also dramatically reduce existing barriers to entry. Furthermore, a robotics system that incorporates a more ‘human’ understanding of a task’s parameters and success criteria opens up opportunities for industrial automation that were previously beyond robotic capabilities.

Dashboard has published much discussion on how future digital oilfields will harness technology such as this to enable more efficient asset management. The potential role of robotics in this shift has also been covered, but only in a limited capacity: for example, robots performing simple tasks with rigid completion parameters. Nvidia’s technology could reshape this by performing a much broader array of jobs with little or no human engagement, further reducing expenses and the margin for human error while performing tasks with mechanical precision.