We present ORBIT, a unified and modular framework for robotics and robot learning, powered by NVIDIA Isaac Sim.
It offers a modular design to easily and efficiently create robotic environments with photo-realistic scenes,
and fast and accurate rigid and soft body simulation. With ORBIT, we provide a suite of benchmark tasks of
varying difficulty: from single-stage cabinet opening and cloth folding to multi-stage tasks such as room
reorganization. The tasks include variations in objects' physical properties and placements, material textures,
and scene lighting. To support working with diverse observation and action spaces, we include various
fixed-arm and mobile manipulators with different controller implementations and physics-based sensors. ORBIT
allows training reinforcement learning policies and collecting large demonstration datasets from hand-crafted or
expert solutions in a matter of minutes by leveraging GPU-based parallelization. In summary, we offer fourteen
robot articulations, three different physics-based sensors, twenty learning environments, wrappers for four
different learning frameworks, and interfaces to help connect to a real robot. With this framework, we aim to
support various research areas, including representation learning, reinforcement learning, imitation learning,
and motion planning. We hope that it helps establish interdisciplinary collaborations between these communities
and that its modularity makes it easy to extend to more tasks and applications in the future.
Video
Robotic Workflows
Reinforcement Learning
We include wrappers for different RL frameworks, such as RSL-RL, RL-Games, and Stable-Baselines3.
These enable users to train policies for their environments with a larger set of RL algorithms and facilitate
algorithmic research.
Using RSL-RL and RL-Games, we can train policies for cabinet opening and end-effector tracking in
minutes, obtaining up to 100K samples per second. Since Stable-Baselines3 is not GPU-optimized, we
obtain 6K-10K samples per second for the same tasks.
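As a rough illustration of how these wrappers are used, the snippet below trains a policy with Stable-Baselines3 PPO on an ORBIT task; the task ID and the commented-out vectorized-environment wrapper are assumptions rather than the exact ORBIT API.

import gym
from stable_baselines3 import PPO

# Hypothetical gym-registered ORBIT task ID; real IDs and constructor
# arguments (e.g., number of parallel environments, headless mode) may differ.
env = gym.make("Isaac-Open-Drawer-Franka-v0")

# ORBIT exposes the GPU-parallel environment to Stable-Baselines3 through a
# numpy-based VecEnv wrapper (wrapper name assumed here):
# env = Sb3VecEnvWrapper(env)

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=5_000_000)
model.save("ppo_open_drawer")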
Imitation Learning
In ORBIT, we also include connections to various peripheral devices, such as a keyboard and a 3D SpaceMouse.
Using these interfaces, it is possible to send SE(2) and SE(3) commands for motion generation on the robot.
Additionally, we provide data collection utilities to store demonstrations collected from peripheral devices
or policies. This data is stored in the data structure from robomimic, which allows training a wide range of
policies through learning from demonstrations.
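For concreteness, the following sketch shows one way collected demonstrations could be written in a robomimic-style HDF5 layout; the group and dataset names here are assumptions and may not match the exact keys ORBIT stores.

import h5py
import numpy as np

def save_demo(path: str, demo_id: int, obs: np.ndarray, actions: np.ndarray):
    """Append one demonstration (observations and actions) to an HDF5 file."""
    with h5py.File(path, "a") as f:
        grp = f.require_group("data").create_group(f"demo_{demo_id}")
        grp.create_dataset("obs/joint_pos", data=obs)  # observation key assumed
        grp.create_dataset("actions", data=actions)
        grp.attrs["num_samples"] = len(actions)

# Example: one 100-step demonstration with 7-DoF observations and commands.
save_demo("demos.hdf5", 0, np.zeros((100, 7)), np.zeros((100, 7)))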
Motion Planning and Control
Motion planning is one of the most well-studied domains in robotics. The traditional Sense-Model-Plan-Act (SMPA)
methodology decomposes the complex problem of reasoning and control into tractable sub-components.
ORBIT supports such paradigms by allowing users to define and evaluate hand-crafted state machines or
motion planners.
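A minimal sketch of such a hand-crafted state machine is shown below; the states and gripper commands are illustrative and not taken from ORBIT's task implementations.

from enum import Enum, auto

class State(Enum):
    APPROACH = auto()
    GRASP = auto()
    OPEN = auto()
    DONE = auto()

def next_command(state: State, gripper_to_handle_dist: float):
    """Return the next state and a symbolic command for a cabinet-opening task."""
    if state is State.APPROACH and gripper_to_handle_dist < 0.01:
        return State.GRASP, "close_gripper"
    if state is State.GRASP:
        return State.OPEN, "pull_handle"
    if state is State.OPEN:
        return State.DONE, "hold"
    return state, "move_to_handle"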
Deployment on a real robot
It is possible to extend the framework to real robots by using the same API. We studied the feasibility of
deploying on a real robot using two different communication protocols: ZeroMQ, a lightweight message
passing protocol, and ROS, a popular middleware for robotics.
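As a sketch of what the ZeroMQ route can look like, the snippet below publishes joint commands from the simulation side; the socket address and message encoding are assumptions, not the exact ORBIT interface.

import zmq
import numpy as np

ctx = zmq.Context()
sock = ctx.socket(zmq.PUB)
sock.bind("tcp://*:5555")  # the robot's control PC would connect as a subscriber

def send_joint_command(q_des: np.ndarray) -> None:
    """Publish a desired joint configuration as raw float64 bytes."""
    sock.send(q_des.astype(np.float64).tobytes())

send_joint_command(np.zeros(7))  # e.g., a 7-DoF Franka command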
Franka Emika Arm connection with ZeroMQ
In the following videos, we show the physical Franka Emika arm being controlled by the same actions as the
simulated arm. The joint commands from ORBIT are sent to a computer running the real-time kernel for the
robot.
To abide by the real-time safety constraints, we use a quintic interpolator to upsample the 60 Hz joint commands from the simulator to 1000 Hz for execution on the robot.
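A minimal sketch of this upsampling step is given below, assuming zero velocity and acceleration at the segment boundaries (the standard quintic blend); the exact interpolator used on the robot may differ.

import numpy as np

def quintic_upsample(q0: np.ndarray, q1: np.ndarray, n: int = 17) -> np.ndarray:
    """Interpolate from q0 to q1 over n steps (1000 Hz / 60 Hz is roughly 17)."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    s = 10 * t**3 - 15 * t**4 + 6 * t**5  # quintic blend, zero vel/acc at both ends
    return q0 + s * (q1 - q0)

# Example: densify one 60 Hz step of a 7-DoF joint command for 1 kHz execution.
fine_trajectory = quintic_upsample(np.zeros(7), np.full(7, 0.1))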
We demonstrate the modular system design by keeping the same "agent" stack but replacing the simulated arm
with the real arm. Additionally, we experiment with two different tools on the arm: a parallel-jaw gripper
and a dexterous hand.
Franka Allegro Lift
Franka Allegro Teleop
Franka Lift
Franka Object Avoidance
Sim-to-real legged locomotion with ROS connection
Additionally, to demonstrate the flexibility of the framework and the ease of deployment, we show the
ANYmal-D robot being controlled by a policy trained in simulation. We use an MLP-based actuator network to
model the series elastic actuators of the robot, which have complex dynamics due to non-linear dissipation and
delays. Additionally, we add randomization to the physics using the Isaac Replicator tool. The trained policy
is then deployed to a real ANYmal-D robot using the ANYbotics ROS stack.
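The sketch below illustrates the structure of such an MLP actuator network, mapping a short history of joint position errors and velocities to a torque prediction; the layer sizes, activation, and history length here are assumptions rather than the deployed model.

import torch
import torch.nn as nn

class ActuatorNet(nn.Module):
    """Toy MLP actuator model: (position-error, velocity) history -> torque."""
    def __init__(self, history: int = 3, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * history, hidden), nn.Softsign(),
            nn.Linear(hidden, hidden), nn.Softsign(),
            nn.Linear(hidden, 1),  # predicted joint torque
        )

    def forward(self, pos_err_hist: torch.Tensor, vel_hist: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([pos_err_hist, vel_hist], dim=-1))

# Example: evaluate the model for 4096 joints/environments in parallel.
net = ActuatorNet()
tau = net(torch.zeros(4096, 3), torch.zeros(4096, 3))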
ANYmal Training in Simulation
Trained Policy in Simulation
Policy Deployed to Real Robot
Sample Tasks
Fixed-Arm Manipulation
Rigid Objects
Open Cabinet
Hockey
Nut and Bolt
Peg in Hole
Deformable Objects
Hoist Flag
Drop Teddy
Pick Bear
Pour Fluid
In-hand Manipulation
Allegro Hand
Shadow Hand
Mobile Manipulator
Mobile Reach
Mobile Cabinet Opening
Benchmarking Simulation Throughput
Physics
In order to benchmark the physics performance, we trained four tasks with different numbers of parallel
environments using RL-Games in ORBIT and in Isaac Gym. We evaluate the total frames per second (FPS)
obtained with an increasing number of concurrent environments. The evaluation is done using an Intel
i7-9800X CPU and an NVIDIA RTX 3090 GPU.
* These numbers were computed using Isaac Sim 2022.1.0.
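For reference, the throughput numbers reported here follow the usual definition of total FPS for a vectorized simulation, as sketched below; the environment interface (reset/step, action dimension) is a generic placeholder rather than ORBIT's exact API.

import time
import numpy as np

def measure_fps(env, num_envs: int, action_dim: int, steps: int = 1000) -> float:
    """Step a vectorized environment with random actions and report total FPS."""
    env.reset()
    start = time.perf_counter()
    for _ in range(steps):
        actions = np.random.uniform(-1.0, 1.0, size=(num_envs, action_dim))
        env.step(actions)
    elapsed = time.perf_counter() - start
    return steps * num_envs / elapsed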
Rendering
In order to benchmark the multi-camera rendering performance, we created a simple scene, which contains
just a robot and a table, and a detailed scene, which contains a variety of different assets, materials, and
light sources. We benchmark these scenes with up to ten RGB cameras at two different resolutions, 320x240 and
640x480, with RTX ray-tracing. The figure above shows the total FPS obtained on the simple and the detailed
scenes, respectively. As expected, the simple scene provides a higher throughput due to less clutter and fewer
light sources. In the current Isaac Sim, the rendering throughput is limited by the virtual memory available
on the GPU. Thus, increasing the number of cameras does not increase the simulation output linearly. The
evaluation is done using an NVIDIA RTX 3090 GPU.
Scene 1: No direct lighting or background assets
Scene 2: Various lighting and background assets
* These numbers were computed using Isaac Sim 2022.1.0.
Citing
If you use Orbit in your research, please cite the following paper:
@article{mittal2023orbit,
  title={ORBIT: A Unified Simulation Framework for Interactive Robot Learning Environments},
  author={Mittal, Mayank and Yu, Calvin and Yu, Qinxi and Liu, Jingzhou and Rudin, Nikita and Hoeller, David and Yuan, Jia Lin and Tehrani, Pooria Poorsarvi and Singh, Ritvik and Guo, Yunrong and others},
  journal={arXiv preprint arXiv:2301.04195},
  year={2023}
}