SIM2REAL: How to Reduce the Reality Gap in Robotics

This page was last edited on 11 March 2026

What if everything a robot learns in simulation could work in reality?

  • Such a breakthrough would mean reducing the time and cost of robot development.
  • We could test millions of dangerous and risky scenarios safely.
  • The robots would learn control in simulation. Then they could enter the real world ready to work. No risk, no mistakes. And they could adapt to new situations.
  • It would be like a flight simulator for robots. Robots would have the guarantee that everything they learn there works immediately in real life.
Simulation vs Reality

Can we do that? How?

We can apply physical and mathematical concepts within the simulation to bring the results closer to the real world. Such concepts involve methods such as:

  • changing the simulation conditions.
  • changing physical parameters.
  • creating a copy of the environment where the robot will operate.

Later, I will explain how to bring all these ideas to life in simulation. Each of these concepts helps the robot bridge the gap to reality. Until then, this is the list of methods and approaches for reducing the gap between simulation and reality.

  1. Domain Randomization (DR): requires a change of the conditions in the simulation (e.g., lighting, textures, friction, noise). A method to create diverse synthetic data, preventing overfitting and improving generalization.
  2. Adversarial Domain Randomization (ADR): an extension of domain randomization that generates difficult conditions (e.g., extreme lighting, dynamics) using adversarial networks.
  3. Randomized Physics Augmentation (RPA): randomizes physical parameters (e.g., mass in [0.5 kg, 2 kg], friction in [0.2, 0.8], inertia). It makes the policy robust to physical variations.
  4. Sim-to-Real Data Augmentation: in this method, transformations (e.g., noise, lighting, distortions) are applied to simulated visual data to mimic real conditions and improve visual perception.
  5. System Identification (SysID): measures the real parameters of the robot accurately (e.g., mass, friction, delays) in order to increase the fidelity of the simulation.
  6. Digital Twins: create a faithful virtual copy of the real environment, synchronized with sensors, to minimize sim-to-real discrepancies.
  7. Domain Adaptation: adapts the model learned in the simulation to real data through techniques such as fine-tuning, adversarial adaptation or feature alignment, using limited real data.

Is it possible for a robot to learn in simulation and work in reality? 

Examples of dexterous manipulation behaviors autonomously learned by Dactyl. Image credit OpenAI

Yes, and there are already examples:

  • Dactyl (OpenAI): a robot hand that manipulates objects. It was trained entirely in simulation with domain randomization.
  • Boston Dynamics Spot: partially simulated training, then tested in the field.
  • Autonomous cars: companies like Waymo and Tesla drive billions of virtual kilometers before sending the car on the road.

However, the transfer from simulation to reality is not ‘instant’. Progress is incremental. Today we can transfer certain simple tasks (walking, manipulation), but not yet all complex tasks.

And why should I be interested in this progress?

  • In industry, we can create faster, cheaper prototypes, reducing the risk of breakdowns and accidents.
  • In education, students can experiment with advanced robotics without expensive labs, just with a laptop.
  • Agriculture is another industry where robots can learn to pick fruit or cut grass without damaging plants or the environment.
  • In healthcare, robot surgeons can be trained in simulation for delicate operations, without risk to patients.
  • We can build faster robots that learn to clean, manipulate objects or help elderly people. All tested in simulations before they reach homes.

These are just some of the industries and fields where close-to-realistic simulation brings major benefits. 

Previously, we have seen the methods to bridge the gap between simulation and reality. We also looked at robots trained in simulation and the benefits across industries. The next step is to dive into the theory behind sim2real.

Note
Before explaining simulation and transfer to reality, I would like to add a realistic note about a claim that is often made in public: “We train in simulation and transfer to the real robot. That’s it: the real robot works, since it was trained with domain randomization!”
This picture is distorted and partially hides the truth. Domain randomization (randomization of physical parameters in simulation) is essentially an elegant hack that masks an unsolved problem. Simulated physics is an approximation, and an RL agent learns to exploit any difference between simulation and reality.
A concrete example is given by researchers who have documented the phenomenon of “physics exploitation”. A robot trained in simulation learns to use artifacts of the physics engine (e.g., impossible slips, inaccurate moments of inertia) as strategies. When it reaches the real world, these artifacts do not exist and the policy collapses.
In other words, the reality gap in robotics and reinforcement learning is not fully solved at this moment.

What is SIM2REAL (Simulation to Reality)?

Let’s imagine an autonomous robot navigating from one point to another on rough terrain.

In the simulation, the road is almost straight. The sun always shines at the same intensity. Nothing around the robot moves.

Now let’s return to reality. The stones on the road have different sizes. The light changes. Rain can appear at any time.

If this contrast makes you feel uncertain or surprised, that is normal. It reflects one of the fundamental problems in robotics and AI: the gap between simulation and reality.

SIM2REAL tries to solve this. The goal is to train robust policies in simulation that also work directly on real robots, without additional fine-tuning.

But here lies the weakness. The performance in the real world is often worse than in the virtual one. This problem is called the reality gap.

What is the Reality Gap?

Even in the best simulations, the robot will never encounter all the surprises of the real world.

It will not see moving rocks, changing lights, dust, or sudden weather conditions.

This mismatch is called the reality gap.

It describes the difference between simulation and reality. Sim2real work is the search for ways to reduce it.

But how do we know if our methods really work? How can we check if the gap has been reduced for our project?

This step is crucial. It is like checking the shopping list after filling the cart. If the products and quantities match, we can go to the checkout.

How do we evaluate the gap between simulation and reality?

How do we measure this difference? And how do we know if we are taking a risk when testing the agent in reality?

The answer lies in a list of parameters. Together, these form a framework for assessing the sim-to-real gap and the associated risks.

List of evaluation parameters for the sim-to-real benchmark

  1. Robustness
    • What does it measure? The model’s ability to maintain performance under varied or unexpected real-world conditions. These conditions differ from the simulation. Examples include changes in lighting, mechanical noise, or physical variations.
    • How is it measured? The model is tested in real-world scenarios under extreme conditions. These are stress tests, such as low light, vibrations, or physical disturbances like actuator noise. The model’s performance, for example success rate or error, is then compared to that in simulation. For instance, the percentage of successful attempts in a real environment with Gaussian noise applied can be measured.
    • Purpose: To evaluate how well the model tolerates sim-to-real discrepancies. It also checks the model’s ability to handle external uncertainties, such as variations in lighting or friction.
    • Why is it important? Robustness is essential for the practical applicability of models in the real world, where conditions are unpredictable.
  2. Success Rate
    • What does it measure? The percentage of attempts in which the model successfully completes a specific task in the real world. This is compared to the simulation. Examples include object grasping or robot navigation. It also includes cumulative reward in reinforcement learning (RL).
    • How is it measured? Successful attempts are counted in a standardized test set. This is done in both real and simulated environments. For example, 80% success in simulation vs. 70% in reality. For RL, the average cumulative reward per episode is calculated. This can be the sum of rewards along a trajectory.
    • Purpose: To quantify the direct performance of sim-to-real transfer for specific tasks.
    • Why is it important? It is an intuitive and direct metric. It reflects practical success.
  3. Generalization Error
    • What does it measure? The difference between the model’s performance in simulation and in the real world. It also considers the correlation between the two. For example, this can be measured via the Sim-vs-Real Correlation Coefficient (SRCC).
    • How is it measured? Metrics such as mean squared error (MSE), mean absolute error (MAE), or SRCC are computed. SRCC is the Pearson or Spearman correlation coefficient between simulated and real results. For instance, in 6D pose estimation, predictions from simulation are compared to real-world results.
    • Purpose: To evaluate how well the model generalizes to real conditions. It also quantifies the predictivity of the sim-to-real transfer.
    • Why is it important? A large error indicates a significant sim-to-real gap. SRCC provides a quantitative perspective on correlation. This is useful in continuous tasks, such as pose estimation.
  4. Training Data Diversity
    • What does it measure? The degree of variation in randomized simulation parameters. Examples include colors, lighting, friction, inertia, and delays. It also considers the inclusion of hybrid data, combining simulation and real data.
    • How is it measured? The range of randomized parameters is evaluated. For example, friction may vary within an interval [0.1, 1.0]. Their impact on real-world performance, such as success rate under varied conditions, is also assessed. Hybrid simulations can be quantified by the proportion of real data included.
    • Purpose: To determine how well domain randomization (DR) prepares the model for real-world variations. This is achieved through data diversification.
    • Why is it important? Increased diversity through DR reduces overfitting and improves generalization.
  5. Simulation Fidelity
    • What does it measure? How accurately the virtual environment simulates real conditions. This includes modeling contacts, friction, inertia, and delays. It also considers the identification of simulation errors.
    • How is it measured? Physical properties, such as the friction coefficient, are compared between simulation and real data. Visual properties, for example PSNR for images, are also compared. Simulation errors are identified through performance discrepancies, such as deviations in physical dynamics.
    • Purpose: To assess simulation accuracy relative to reality. It also helps identify sources of the sim-to-real gap.
    • Why is it important? A faithful simulation reduces the need for excessive domain randomization (DR). Proper modeling of contacts and delays is crucial. Studies show a correlation between rendering quality and successful transfer to the real world.
  6. Zero-Shot Performance
    • What does it measure? The model’s ability to operate in the real world without additional fine-tuning on real data.
    • How is it measured? The model is tested directly in the real world, and performance is measured (e.g., success rate, prediction error) without recalibration.
    • Purpose: To evaluate the effectiveness of DR in preparing the model for unseen real-world scenarios.
    • Why is it important? Zero-shot performance is a key indicator of transfer success without extra training costs.
  7. Computational Cost
    • What does it measure? The resources required to train the model with domain randomization (DR). This includes time, memory, and hardware requirements. Simulation efficiency is also considered.
    • How is it measured? Training time is monitored. GPU and CPU usage are tracked. Memory consumption is measured relative to the degree of randomization and simulation complexity.
    • Purpose: To evaluate the trade-off between DR benefits and computational costs. This is especially important for high-fidelity simulations, which model contacts, friction, and other dynamics.
    • Why is it important? DR increases costs due to multiple variations. Efficient computation is necessary for practical applicability.
  8. Training Stability
    • What does it measure? The consistency and convergence of the model during DR training, despite introduced variations.
    • How is it measured? Loss variation during training or algorithm convergence rate is monitored (e.g., in RL, reward variance).
    • Purpose: To ensure that DR does not destabilize training due to excessive randomizations.
    • Why is it important? Aggressive randomization can lead to instability, affecting performance. Benchmarks must balance diversity with stability.
  9. Cross-Domain Transferability
    • What does it measure? The model’s ability to adapt across multiple real-world domains (e.g., from one robot to another or between different rooms).
    • How is it measured? The model is tested in multiple different real environments (e.g., different robots or environmental conditions) and performance is compared (e.g., success rate).
    • Purpose: To assess model flexibility in varied real-world scenarios.
    • Why is it important? DR facilitates cross-domain transfer by exposing the model to wide variations, essential for diverse practical applications.
  10. Visual Perception Quality
    • What does it measure? The accuracy of interpreting real visual data compared to simulated data, including metrics like precision, recall, and F1-score for classification/detection tasks.
    • How is it measured? Metrics such as precision, recall, and F1-score are used for object detection or segmentation, and position errors are measured for visual estimations (e.g., 6D pose estimation).
    • Purpose: To evaluate how well the model generalizes to real visual conditions influenced by DR (e.g., color/texture randomization).
    • Why is it important? Visual discrepancies are a major source of sim-to-real gap. Precision and recall are key for visual tasks, as highlighted in studies.
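
Two of the metrics above, the success-rate gap and the SRCC, are simple enough to compute by hand. The sketch below is only an illustration: the outcome lists and the per-policy scores are made-up numbers, and the SRCC is taken here as the Pearson correlation between simulated and real scores.

```python
import math

def success_rate(outcomes):
    """Fraction of successful attempts (outcomes is a list of booleans)."""
    return sum(outcomes) / len(outcomes)

def pearson(xs, ys):
    """Pearson correlation between paired simulated and real scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical outcomes of 10 attempts for one policy
sim_outcomes = [True] * 8 + [False] * 2    # 80% success in simulation
real_outcomes = [True] * 7 + [False] * 3   # 70% success in reality
gap = success_rate(sim_outcomes) - success_rate(real_outcomes)

# Hypothetical success rates of five policies, in simulation and in reality
sim_scores = [0.90, 0.80, 0.70, 0.60, 0.50]
real_scores = [0.75, 0.70, 0.55, 0.50, 0.35]
srcc = pearson(sim_scores, real_scores)

print(f"success-rate gap: {gap:.2f}, SRCC: {srcc:.3f}")
```

A small gap together with a correlation close to 1 suggests that improvements measured in simulation are likely to carry over to the real robot.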

The 10 parameters listed above are general metrics for evaluating the sim-to-real gap.

However, depending on the specific application, additional parameters may be defined.

For example, in cloth manipulation or autonomous driving, one might include metrics like 6D pose accuracy, contact quality, sensor anomaly robustness, or performance under extreme conditions, as suggested in recent studies.

Now we know which measurements tell us how the simulation performs and how close it is to reality. The next step is to implement these methods and approaches in simulations. The first method is called Domain Randomization. It forms the basis for most of the work on closing the gap between simulation and reality.

Domain Randomization

The simple idea of domain randomization comes from the need to change the simulation.

Instead of a fixed and predictable world, we create a world full of variations.

The concept is to expose the agent to as many different situations in simulation as possible.

Even extreme situations are included. This makes the agent robust. It also helps it adapt automatically to surprises in the real world.

There are many examples where domain randomization is essential.

  • These include robots handling fragile objects in factories.
  • Drones navigating complex environments with unexpected obstacles.
  • Or autonomous vehicles coping with changing road and light conditions.

In each case, success depends on how well the simulation-trained policy tolerates variations.

That is exactly what Domain Randomization is about.

As an analogy, think of Domain Randomization as training an athlete. You don’t just train them on perfect terrain. You also expose them to wind, rain, sand, and unexpected obstacles. This way, their performance in real competition is less unpredictable.

When to Apply Domain Randomization

DR is especially applicable when:

  • We have a model trained in simulation. Now, we want to use it in the real world.
  • There is high variability in the real environment. Without randomization, the model would ‘overfit’ to the simulation.
  • We do not have enough real data for direct training. Or, the data is expensive to obtain.

Basically, DR is useful for achieving zero-shot performance in reality. The robot does what it learned in the simulation, even if the real environment is not identical.

You can start by changing the physical parameters: masses, frictions, stiffnesses, inertias. You can vary the lighting: intensity, direction, shadows. You can modify the textures and colors of objects, their sizes or positions. We will talk more about these variations in the following sections. We will also discuss the parameters and how you can vary them.

Randomization at a joint of a humanoid robot.
  • Joint: Right Hip Pitch
  • Randomized Parameters:
    • Maximum/Minimum Joint Angle Limit ± 5°
    • Maximum Rotation Speed ± 10%
    • Joint Friction ± 20%

Just like in the real world, the agent will learn to maintain balance and walk even if each joint has slight physical variations.
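
As a minimal sketch, the three randomizations listed for the right hip pitch joint could be sampled like this. The nominal values and the dictionary keys are hypothetical and not tied to any particular simulator. Only the ranges (±5°, ±10%, ±20%) come from the example above.

```python
import random

# Nominal values for the right hip pitch joint (hypothetical numbers)
NOMINAL = {
    "angle_min_deg": -120.0,
    "angle_max_deg": 30.0,
    "max_speed_rad_s": 6.0,
    "friction": 0.10,
}

def randomize_joint(nominal, rng):
    """Return one randomized copy of the joint parameters."""
    return {
        # Angle limits shift by up to +/- 5 degrees
        "angle_min_deg": nominal["angle_min_deg"] + rng.uniform(-5.0, 5.0),
        "angle_max_deg": nominal["angle_max_deg"] + rng.uniform(-5.0, 5.0),
        # Maximum rotation speed scales by +/- 10 %
        "max_speed_rad_s": nominal["max_speed_rad_s"] * rng.uniform(0.9, 1.1),
        # Joint friction scales by +/- 20 %
        "friction": nominal["friction"] * rng.uniform(0.8, 1.2),
    }

rng = random.Random(0)  # fixed seed so the example is reproducible
joint = randomize_joint(NOMINAL, rng)
print(joint)
```

The sampled dictionary would then be written into the simulator's joint description before the episode starts.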

Why do we vary the maximum/minimum joint angle limit ± 5°?

Real joints are not perfectly calibrated, and sensors and motors have small variations.

±5° is a typical value for large joints (hip, shoulder) in a humanoid because:

  • It is large enough for the agent to learn to be robust to positioning errors.
  • It is not too large to lead to impossible or dangerous behaviors.

In practice, this range can be adjusted: small joints (finger, ankle) ±2°, large joints ±5–10°.

Why ±10% for maximum rotation speed?

Real motors do not always deliver exactly the speed set in the simulation.

±10% is a trade-off between variability and stability in training.

More than ±15–20% can quickly destabilize the agent in the simulation if it does not have correction mechanisms.

Why ±20% for friction in the joint?

The internal friction of real joints can vary between identical units or due to wear and temperature.

±20% is a value often used in robust control and RL work for humanoids (e.g., in MuJoCo or PyBullet simulations), because:

  • It is sufficient for the agent to learn to compensate for the variations.
  • It does not destabilize the simulation in an unrealistic way.

How are these percentages chosen in practice?

  • Real measurements: if you have access to the robot, measure the real variations of each joint (sensor noise, friction, motor deviation).
  • Literature: many works use similar values, e.g. ±5° for joint angles, ±10–20% for physical parameters.
  • Trial & Error / Hyperparameter tuning: adjust the ranges depending on how stable the agent is in the simulation and how well its performance transfers to reality.

Randomization Frequency

Another important aspect is how often we apply randomization.

  • Per Episode: most physical parameters such as mass, inertia, or friction are randomized once at the beginning of each training episode. This keeps the simulation stable while still exposing the agent to many different worlds.
  • Per Step: sensor noise, small motor deviations, or observation perturbations are often randomized at every simulation step. This teaches the agent to cope with unpredictable, fast changes.
  • Mixed: in practice, both frequencies are combined. For example, joint angle limits or friction may change per episode, while sensor noise varies per step.

In practice, episodic randomization is the usual choice for stable training, while per-step randomization matters for robustness against real-world sensor noise.
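
The mixed scheme can be sketched as a toy training loop. Everything here is an assumption made for illustration: there is no real simulator or agent, the nominal friction of 0.5 and the noise level are made up, and the "observation" is just a constant angle with noise added.

```python
import random

def run_episode(rng, steps=5, noise_sigma=0.01):
    """One toy episode: physics fixed per episode, sensor noise per step."""
    # Per episode: sampled once, constant for the whole episode
    friction = 0.5 * rng.uniform(0.8, 1.2)
    true_angle = 0.3  # placeholder "true" joint angle in radians
    observations = []
    for _ in range(steps):
        # Per step: fresh Gaussian sensor noise on every observation
        observations.append(true_angle + rng.gauss(0.0, noise_sigma))
    return friction, observations

rng = random.Random(42)
for episode in range(3):
    friction, obs = run_episode(rng)
    print(f"episode {episode}: friction={friction:.3f}, first obs={obs[0]:.4f}")
```

Friction changes only between episodes, while every observation inside an episode carries its own noise sample.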

So far we have discussed how we can apply Domain Randomization to a joint of a humanoid robot. This assumes that the simulation is randomly varied to increase the robustness of the agent. Next we will look at how we apply a natural extension of DR. This is Adversarial Domain Randomization (ADR).


Adversarial Domain Randomization

  • It is an extension of Domain Randomization (DR).
  • Instead of choosing the parameters of the simulator purely randomly, an adversary or algorithm is used that generates extreme or difficult conditions for the agent.
  • The goal is to identify and expose the agent to the most challenging situations. In this way, it will learn to be resilient and robust in the real world.
  • Basically, ADR turns passive randomization into an active process of “challenging” the agent. A method of increasing tolerance to environmental variability.

When do we apply Adversarial Domain Randomization?

  • When we want the agent to be robust to extreme or unexpected real-world conditions.
  • When there is high variability in the environment and plain randomization is not enough.
  • When we want to improve zero-shot performance in rare or difficult situations.
  • When we train models for critical tasks. In reality, their failure can be costly or dangerous (e.g., handling fragile objects, autonomous navigation in complex environments).
Robotic arm that grabs objects on a table
  • ADR introduces extreme conditions:
    • Very bright or very dim lighting (illumination: ±50–100% of normal intensity)
    • Partially obscured, unusually positioned objects
    • High visual noise, e.g. glare, blur (Gaussian noise: σ = 0.05–0.2; blur: 3–7 pixel kernel)

This makes the agent robust to unexpected visual situations in the real world.
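
To make the “challenging” idea concrete, here is a deliberately simplified sketch. The adversary is a weighted sampler, not an adversarial network: it raises the sampling weight of illumination levels at which the agent fails. The function toy_agent_succeeds and its failure probabilities are invented stand-ins for a real policy rollout.

```python
import random

# Candidate illumination scales: 1.0 is nominal, 0.0 and 2.0 are +/- 100 %
BINS = [0.0, 0.5, 1.0, 1.5, 2.0]

def toy_agent_succeeds(illumination, rng):
    """Stand-in for a policy rollout: fails more often far from nominal light."""
    p_success = max(0.0, 1.0 - 0.8 * abs(illumination - 1.0))
    return rng.random() < p_success

def adversarial_sampling(rounds, rng):
    """Shift sampling weight toward illumination bins where the agent fails."""
    weights = [1.0] * len(BINS)
    for _ in range(rounds):
        # Pick a condition with probability proportional to its weight
        i = rng.choices(range(len(BINS)), weights=weights)[0]
        if not toy_agent_succeeds(BINS[i], rng):
            weights[i] += 1.0  # the adversary favors conditions that break the agent
    return weights

weights = adversarial_sampling(2000, random.Random(0))
for scale, weight in zip(BINS, weights):
    print(f"illumination x{scale}: weight {weight:.0f}")
```

In a real setup the agent would keep training on the sampled conditions, so the hard bins would gradually become easier and the adversary would have to move on.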


Randomized Physics Augmentation

This method focuses on the physical parameters of the robot and the environment. It is an extension of Domain Randomization.

Instead of varying visual or perceptual aspects, RPA focuses on:

  • Joint and object masses
  • Friction between joints or with the ground
  • Joint stiffness and damping
  • Inertia and center of mass of the robot or objects

The goal is to create an agent that is robust to unexpected physical variations in the real world. Unlike visual DR, RPA affects the physical behavior of the agent in the simulation.

When do we apply Randomized Physics Augmentation?

  • When we want the agent to be robust to physical variations of the robot or the real environment.
  • When there are uncertainties in the real parameters: different masses, non-constant frictions, worn or imprecise joints.
  • When we want to improve zero-shot performance in physical tasks, such as walking, grasping objects or manipulating loads.
  • It is especially useful for simulations of mobile or humanoid robots, but also for industrial manipulators or drones.
Drone carrying object

The main task of an autonomous drone carrying objects is to maintain trajectory and stability during flight.

Randomized physical parameters:

  • Total drone mass: ±15% of nominal mass (e.g., 1.7–2.3 kg for a 2 kg drone), sampled from a uniform distribution. This includes variations caused by different batteries or objects carried.
  • Aerodynamic damping: ±10% to simulate variations in air currents or turbulence.
  • Motor friction: ±20% to reflect real-world variations in motor performance or wear.
  • Objects carried: mass between 50–200g, centers of mass slightly offset to introduce uncertainty in balance.

Randomization Frequency

Parameters are re-randomized at the beginning of each training episode to expose the agent to different conditions.

Limitations

The random values are restricted so that the simulation remains stable (drones should not fall instantly or flip over at extreme values).
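
A minimal sketch of the per-episode sampling for the drone example could look like this. The parameter names are hypothetical, and the 2 cm bound on the center-of-mass offset is an assumption, since the text only says "slightly offset". Because every value comes from a bounded uniform range, the sampler already respects the stability limits.

```python
import random

def sample_drone_physics(rng, nominal_mass_kg=2.0):
    """Sample one set of physical parameters at the start of an episode."""
    return {
        # Total mass: uniform within +/- 15 % of nominal (1.7 to 2.3 kg for 2.0 kg)
        "mass_kg": nominal_mass_kg * rng.uniform(0.85, 1.15),
        # Scale factors relative to the simulator's nominal values
        "aero_damping_scale": rng.uniform(0.90, 1.10),   # +/- 10 %
        "motor_friction_scale": rng.uniform(0.80, 1.20), # +/- 20 %
        # Payload: 50 to 200 g
        "payload_kg": rng.uniform(0.050, 0.200),
        # Payload center-of-mass offset in x and y (assumed bound: 2 cm)
        "payload_com_offset_m": (
            rng.uniform(-0.02, 0.02),
            rng.uniform(-0.02, 0.02),
        ),
    }

rng = random.Random(7)
for episode in range(3):
    params = sample_drone_physics(rng)  # re-randomized every episode
    print(f"episode {episode}: mass={params['mass_kg']:.2f} kg, "
          f"payload={params['payload_kg'] * 1000:.0f} g")
```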

The benefit of Randomized Physics Augmentation is that the agent learns to maintain trajectory and stability even if the mass, weight distribution, or dynamics of the drone differ from the standard simulation. The result is an agent that is robust to unexpected physical variations, ready for real-world situations.


Sim-to-Real Data Augmentation

It is a method of reducing the sim-to-real gap focused on the visual and perceptual data of the agent.

Instead of modifying the physical parameters of the robot, the simulated data is transformed to make it closer to real conditions.

Transformations can include:

  • Noise added to images or sensors
  • Variations in lighting, shadows, reflections
  • Geometric or camera distortions
  • Modifications of textures, colors, contrasts

The goal is for the model to be robust to visual and environmental conditions different from the simulation.

When do we apply Sim-to-Real Data Augmentation?

  • When we train agents on simulated data, but we want to use them in real conditions.
  • When the simulated images are very clean and uniform, so the model risks over-specializing on these conditions.
  • When we have limited or expensive real data. In this case, augmenting simulated data can partially replace their collection.
  • It is useful for tasks such as object recognition, detection, 6D positioning, visual navigation or robotic manipulation.
Robotic arm that needs to grasp objects on a table, using visual perception from an RGB camera

Randomized visual parameters:

Lighting:

  • Intensity: ±50% of standard value
  • Direction: ±30° of nominal light source position
  • Shadows and reflections simulated

Visual noise:

  • Gaussian noise: σ = 0.05–0.2 applied to all RGB channels
  • Salt-and-pepper noise: 1–3% pixels affected

Geometric distortions:

  • Perspective shift: ±5° on X and Y axes
  • Blur: 3-7 pixel kernel, random at each frame

Object textures and colors:

  • Varied colors ±30% in RGB
  • Different textures for objects and the table, to simulate real variations (glossy, matte, rough)

Partially covered or moved objects:

  • Random positioning ±5–10 cm
  • Partially obscured by other objects or robot arm

Augmentation Frequency

Different transformations are applied to each frame or episode so that the model is exposed to varying visual conditions during training.

As a rule, transformations should not make objects completely unrecognizable. The ranges for noise, blur, or illumination should be realistic for the real environment.
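
Three of the transformations above (lighting intensity, Gaussian noise, salt-and-pepper noise) can be sketched without any image library. The assumptions are that pixel values are normalized to [0, 1], so that σ = 0.05–0.2 is expressed on that scale, and that the image is a plain list of rows of (r, g, b) tuples. Real pipelines would normally use an array or image library instead.

```python
import random

def augment(image, rng):
    """Apply brightness scaling, Gaussian noise and salt-and-pepper noise.

    `image` is a list of rows; each pixel is an (r, g, b) tuple in [0, 1].
    """
    brightness = rng.uniform(0.5, 1.5)     # intensity +/- 50 %
    sigma = rng.uniform(0.05, 0.2)         # Gaussian noise level
    sp_fraction = rng.uniform(0.01, 0.03)  # 1-3 % of pixels
    out = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < sp_fraction:
                # Salt-and-pepper: pixel becomes pure white or pure black
                value = rng.choice((0.0, 1.0))
                new_row.append((value, value, value))
                continue
            # Brightness change plus independent noise per channel, clipped
            new_row.append(tuple(
                min(1.0, max(0.0, c * brightness + rng.gauss(0.0, sigma)))
                for c in pixel
            ))
        out.append(new_row)
    return out

rng = random.Random(1)
image = [[(0.5, 0.5, 0.5)] * 8 for _ in range(8)]  # flat gray test image
augmented = augment(image, rng)
print(augmented[0][0])
```

Calling augment once per frame gives per-frame variation. Sampling brightness, sigma and sp_fraction once per episode and reusing them would give per-episode variation instead.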

The goal of this method is for the agent to learn to recognize objects even if:

  • The illumination changes suddenly
  • The objects are partially hidden
  • The cameras introduce noise or distortion

The result is a model that is robust in zero-shot real-world use, without depending on expensive real data.


System Identification (SysID)

It is a method used to increase the fidelity of the simulation by measuring and incorporating real robot parameters.

Instead of varying the parameters for robustness, SysID involves accurately determining the physical and dynamic characteristics of the real system.

The measured parameters can include:

  • Masses and centers of mass of the robot components
  • Moments of inertia and weight distribution
  • Frictions of joints and motors
  • Delays and dynamics of motors
  • Sensor characteristics: bias, noise, resolution

The goal is for the simulation to be as close to reality as possible, thus reducing the sim-to-real gap.

When do we apply System Identification?

  • When we need high-fidelity simulations, for training or testing robots.
  • When the standard simulation or nominal parameters do not accurately reflect the behavior of the real robot.
  • When we want to combine DR or ADR with high fidelity. That is, the simulation should be realistic, but also include controlled variations for robustness.
  • It is useful for critical robots, humanoids, manipulators or drones, where physical deviations can affect performance or safety.
A humanoid robot that must walk and maintain balance in simulation and reality.

Identified and measured parameters

Segment masses:

  • Torso: 15.2 kg
  • Thigh: 4.5 kg each
  • Calf: 3.2 kg each
  • Foot: 1.5 kg each

Centers of mass (CoM) for each segment:

  • Local X, Y, Z coordinates relative to joints

Moments of inertia:

  • Ixx, Iyy, Izz for each segment, obtained from physical measurements or from CAD models checked against real measurements

Joint and actuator friction:

  • Static and dynamic friction ±5–10%
  • Wear and temperature effects

Delay and actuator dynamics:

  • Delay from command to movement: 15–20 ms
  • Response time at maximum acceleration: 0.1 s

Sensors:

  • Bias and noise for IMU: accelerometer ±0.02 g, gyroscope ±0.01 rad/s
  • Noise for torque or force sensors at the soles of the feet: ±2 N

Measurement methods:

  • Direct physical measurements (scales, rulers, CAD)
  • Controlled experiments: applying known forces and measuring the response
  • Sensor calibration and estimating dynamic parameters through optimization
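
As an illustration of estimating a dynamic parameter through optimization, the sketch below fits a single viscous friction coefficient b by least squares. The linear model torque = b · velocity is an assumption chosen for simplicity (real joints also show static friction), and the "measurements" are synthetic data generated from a known b so that the result can be checked.

```python
import random

def fit_viscous_friction(velocities, torques):
    """Least-squares estimate of b in the model torque = b * velocity."""
    numerator = sum(w * t for w, t in zip(velocities, torques))
    denominator = sum(w * w for w in velocities)
    return numerator / denominator

# Synthetic "measurements": true b = 0.12 N*m*s/rad plus torque sensor noise
rng = random.Random(0)
true_b = 0.12
velocities = [0.5 * k for k in range(1, 21)]                        # rad/s
torques = [true_b * w + rng.gauss(0.0, 0.01) for w in velocities]   # N*m

b_hat = fit_viscous_friction(velocities, torques)
print(f"estimated b = {b_hat:.4f} (true value {true_b})")
```

The closed form comes from minimizing the sum of squared errors between the measured torque and b · velocity, which gives b = Σ(ω·τ) / Σ(ω²).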

Implementation in simulation:

  • We replace the nominal parameters with those measured in the physics engine (Unity, MuJoCo, PyBullet)
  • We adjust mass, inertia, friction, motor dynamics, sensor noise according to SysID values

Applying this method in training reduces the sim-to-real gap through high fidelity. The agent learns on a model close to reality, so its zero-shot behavior in the real world is more predictable.


Digital Twins

A Digital Twin is an accurate replica of a real system or robot, synchronized in real time. The goal is to simulate the real behavior and environment as closely as possible, reducing the sim-to-real gap.

The main component is the connectivity between the real system and the virtual model, so that:

  • Real sensor data is transmitted to the twin
  • The twin updates the simulation state and can test scenarios before the real robot acts

Digital Twins combine physical fidelity, visual fidelity, and real-time synchronization.

When do we apply Digital Twins?

We apply this method when:

  • A high-fidelity simulation is needed for critical robots or industrial applications.
  • We test risky scenarios without damaging the real robot.
  • We train RL agents on environments that reflect real conditions in real time.
  • We synchronize control and monitoring between the real system and the simulation.

It is especially useful for:

  • Industrial robots and manipulators
  • Drones or autonomous vehicles
  • Humanoid robots for testing in complex real-world environments
Robotic arm that sorts fragile objects on a production line, with high precision and without damaging the objects

Data collected from the real robot

Joint positions:

  • Encoder and feedback from motors for each joint (joint angles, θ)
  • Resolution: 0.01°
  • Frequency: 100 Hz

Joint velocities and accelerations:

  • Numerical derivatives from positions or directly from motor sensors
  • Important for trajectory control and deviation detection
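
The numerical derivatives mentioned above can be sketched with a finite difference over encoder samples taken at 100 Hz. The encoder readings are hypothetical values for a joint moving at a constant 5°/s.

```python
def finite_difference(samples, rate_hz):
    """Backward-difference derivative of uniformly sampled values."""
    dt = 1.0 / rate_hz
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]

RATE_HZ = 100.0
# Hypothetical encoder readings in degrees (constant 5 deg/s motion)
angles_deg = [10.00, 10.05, 10.10, 10.15, 10.20]

velocities = finite_difference(angles_deg, RATE_HZ)      # deg/s
accelerations = finite_difference(velocities, RATE_HZ)   # deg/s^2
print([round(v, 3) for v in velocities])
```

Differentiation amplifies noise. With a resolution of 0.01° and a 100 Hz rate, a single quantization step already corresponds to 1°/s, so real pipelines usually filter the signal before differentiating.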

Clamping forces:

  • Torque and pressure sensors in the gripper
  • Measuring range: 0–50 N
  • Sensor noise: ±0.5 N

Motor temperature and aging:

  • Motor temperature to prevent overheating
  • Estimating wear through variations in motor response and current consumption

Digital Twin Implementation

Motor and joint modeling:

  • We use SysID data for each motor: rotor mass, inertia, static and dynamic friction
  • Joint dynamics integrated into the simulator (Unity, MuJoCo or PyBullet) to reflect the real behavior of the arm

Real-time synchronization:

  • The Twin receives a continuous stream of: real-time positions, speeds, forces and motor temperatures
  • The simulator updates the virtual model to accurately reflect the arm’s state at any given moment
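
A minimal sketch of the synchronization step is shown below. It is not tied to any particular simulator: the `Telemetry` and `DigitalTwin` classes, their field names, and the 50 ms staleness limit are assumptions made for illustration. The twin keeps only the newest sample and reports whether its state is fresh enough to be trusted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Telemetry:
    """One sample streamed from the real arm (field names are illustrative)."""
    timestamp_s: float
    joint_angles_deg: List[float]
    joint_velocities_deg_s: List[float]
    clamp_force_n: float
    motor_temps_c: List[float]


class DigitalTwin:
    def __init__(self, max_age_s: float = 0.05):
        self.max_age_s = max_age_s          # assumed staleness limit
        self.latest: Optional[Telemetry] = None

    def update(self, sample: Telemetry) -> bool:
        """Apply a sample; reject it if it is not newer than the current state."""
        if self.latest is not None and sample.timestamp_s <= self.latest.timestamp_s:
            return False
        self.latest = sample
        return True

    def is_synchronized(self, now_s: float) -> bool:
        """True if the twin has a sample no older than max_age_s."""
        return (self.latest is not None
                and (now_s - self.latest.timestamp_s) <= self.max_age_s)


twin = DigitalTwin()
fresh = Telemetry(0.01, [10.0, 20.0], [1.0, 0.0], 12.5, [41.0, 40.5])
late = Telemetry(0.00, [9.9, 20.0], [1.0, 0.0], 12.4, [41.0, 40.5])

accepted_fresh = twin.update(fresh)   # newer than the empty state, so applied
accepted_late = twin.update(late)     # arrives out of order, so rejected
```

In a real setup, the accepted sample would then be written into the simulator state through that simulator's own API.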

Twin testing scenarios:

  • Introducing new objects with different shapes and weights
  • Simulated heavy or fragile loads to test gripping strategies
  • Extreme conditions: partially slippery objects, unexpected positioning or rapid movements on the production line

Feedback loop:

  • Any strategy tested on the twin can be verified before execution on the real robot
  • Adjust control or PID parameters without risk to objects or equipment
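
The feedback loop can be illustrated with a toy example: candidate PID gains are tried on a simplified single-joint model before anything is sent to the real arm. The model (inertia plus viscous damping), the parameter values, the gains, and the 5% overshoot limit are all assumptions for this sketch; a real twin would use the identified SysID parameters and the full simulator.

```python
def step_response_overshoot(kp, ki, kd, inertia=0.05, damping=0.1,
                            target=1.0, dt=0.001, duration=2.0):
    """Simulate a unit step on a single joint and return the relative overshoot.

    Joint model: inertia * theta'' = torque - damping * theta'.
    The derivative term acts on the measured velocity to avoid a setpoint kick.
    """
    theta = omega = integral = 0.0
    peak = 0.0
    for _ in range(round(duration / dt)):
        error = target - theta
        integral += error * dt
        torque = kp * error + ki * integral - kd * omega
        alpha = (torque - damping * omega) / inertia
        omega += alpha * dt          # semi-implicit Euler integration
        theta += omega * dt
        peak = max(peak, theta)
    return max(0.0, (peak - target) / target)


MAX_OVERSHOOT = 0.05  # assumed acceptance limit: 5% of the step size

# Two hypothetical gain sets evaluated on the twin only.
conservative = step_response_overshoot(kp=20.0, ki=0.0, kd=2.0)
aggressive = step_response_overshoot(kp=200.0, ki=0.0, kd=0.5)

safe_to_deploy = {
    "conservative": conservative <= MAX_OVERSHOOT,
    "aggressive": aggressive <= MAX_OVERSHOOT,
}
```

Only gain sets that pass the check on the twin would be forwarded to the real controller.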

Domain Adaptation

Domain Adaptation is a method for reducing the sim-to-real gap focused on adapting a model learned in simulation (source domain) to real data (target domain).

In this method, we do not modify the simulation, but adjust the model or representations to work correctly in the real world. This is the difference between Domain Adaptation and Domain Randomization.

Common techniques:

  • Fine-tuning on a small set of real data
  • Adversarial adaptation: the model learns representations that are invariant between simulation and reality
  • Feature alignment: aligns feature distributions between simulation and reality

When do we apply Domain Adaptation?

When we have a model trained in simulation, but its real-world performance is poor.

When we have limited, expensive, or hard-to-obtain real-world data.

When we want to improve zero-shot transfer by adjusting the learned representations of the model.

It is especially useful for:

  • Visual recognition and object detection
  • Navigation and visual control for robots
  • Manipulation tasks where simulation does not capture all real-world variations
Robotic arm that needs to grasp real objects on a table using visual perception (RGB-D camera)

Step 1: Initial simulation training

  • Simulator: Unity / PyBullet / MuJoCo
  • Example of simulated data: 10,000 images of objects on the table, from different angles and under different illumination
  • Model: CNN + Actor-Critic RL or CNN for object detection

The purpose of this model is to learn to detect objects and determine the correct trajectory of the arm.

Step 2: Collect real data

  • Real data: 500–1,000 images of real objects on the same table
  • Sensors:
    • RGB camera: 1280×720 px
    • Depth camera: 30 fps

Note: the images are labeled with the object position (bounding box or 6D pose).

Step 3: Domain Adaptation

Fine-tuning:

  • The model pre-trained on simulation is additionally trained on the real set, with a reduced learning rate (0.001–0.0001)
  • Light augmentations (noise, shift, blur) are used to prevent overfitting on limited real data
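
As a rough illustration of the light augmentations, the sketch below applies a small horizontal shift and additive Gaussian noise to a grayscale image stored as nested lists with values in [0, 1]. The function name `augment`, the noise level, and the shift range are assumptions; blur is left out for brevity, and a real pipeline would normally use the augmentation tools of its training framework.

```python
import random


def augment(image, rng, noise_std=0.02, max_shift=2):
    """Return a lightly augmented copy of a grayscale image.

    image: list of rows, each a list of floats in [0, 1].
    Applies a horizontal shift of up to max_shift pixels (edges are
    replicated) and additive Gaussian noise, then clips to [0, 1].
    """
    width = len(image[0])
    shift = rng.randint(-max_shift, max_shift)
    augmented = []
    for row in image:
        shifted = [row[min(max(x - shift, 0), width - 1)] for x in range(width)]
        noisy = [min(1.0, max(0.0, v + rng.gauss(0.0, noise_std))) for v in shifted]
        augmented.append(noisy)
    return augmented


rng = random.Random(0)  # fixed seed so the example is repeatable

# Hypothetical 4x6 image with a horizontal brightness gradient.
real_image = [[x / 5.0 for x in range(6)] for _ in range(4)]
variants = [augment(real_image, rng) for _ in range(3)]
```

Each real image yields several slightly different training samples, which is the point of augmenting a set of only 500 to 1,000 images.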

Adversarial adaptation (optional):

  • A discriminator tries to distinguish features extracted from simulated images from features extracted from real images
  • The CNN is trained to produce representations that the discriminator cannot tell apart, so they become invariant between simulation and reality

Feature alignment:

  • Align feature distributions via Maximum Mean Discrepancy (MMD) or Contrastive Loss
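
To make the MMD option concrete, here is a small sketch of the squared MMD between two sets of feature vectors, using a Gaussian (RBF) kernel and the biased estimator. The feature values and the kernel width `sigma` are made up for illustration. In training, this quantity would be computed on batches of CNN features inside an autodiff framework and added to the task loss.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel between two feature vectors."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    """
    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, sigma) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical 2-D features from simulated and real images.
sim_features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
real_shifted = [[1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]   # before alignment
real_aligned = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]   # after perfect alignment

gap_before = mmd_squared(sim_features, real_shifted)
gap_after = mmd_squared(sim_features, real_aligned)
```

A smaller value means the two feature distributions are closer, so minimizing it pushes simulated and real features toward each other.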

Step 4: Testing and Implementation

Test: zero-shot application on new real objects

Metrics:

  • Grasping Accuracy: % of objects correctly grasped
  • Position Errors: average deviation in cm between the target and the actual grasp point
  • Success Rate: number of grasps completed without drops or damage
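
The three metrics could be computed from logged trials along these lines. The `GraspTrial` record and the trial values are invented for illustration, and the success rate is expressed here as a percentage of all trials, which is one possible reading of the definition above.

```python
from dataclasses import dataclass


@dataclass
class GraspTrial:
    """Outcome of one grasp attempt (illustrative fields)."""
    grasped: bool
    position_error_cm: float
    dropped: bool
    damaged: bool


def evaluate(trials):
    """Compute grasping accuracy, mean position error, and success rate."""
    n = len(trials)
    if n == 0:
        raise ValueError("no trials to evaluate")
    grasped = [t for t in trials if t.grasped]
    clean = [t for t in grasped if not (t.dropped or t.damaged)]
    return {
        "grasping_accuracy_pct": 100.0 * len(grasped) / n,
        "mean_position_error_cm": sum(t.position_error_cm for t in trials) / n,
        "success_rate_pct": 100.0 * len(clean) / n,
    }


# Made-up trial log, not measured data.
trials = [
    GraspTrial(grasped=True, position_error_cm=0.4, dropped=False, damaged=False),
    GraspTrial(grasped=True, position_error_cm=0.6, dropped=False, damaged=True),
    GraspTrial(grasped=False, position_error_cm=2.0, dropped=False, damaged=False),
    GraspTrial(grasped=True, position_error_cm=1.0, dropped=False, damaged=False),
]
metrics = evaluate(trials)
```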

Adapted this way, the model carries the performance it reached in simulation over to the real robot.

In addition, it reduces the time and cost of real-world data collection. And it performs robustly even where the simulation does not capture all real-world conditions.


Choosing the right simulator

Which one to choose? IsaacSim, MuJoCo, PyBullet, Unity, or Gazebo Sim?

Before choosing a simulator, we need to define the purpose and details of the application.

  • Robot type: simulators differ in terms of physical fidelity. They also differ in the type of robot they can effectively model (humanoids, manipulators, drones, mobile vehicles).
  • Desired fidelity level: if you need very accurate simulations for advanced control or RL, MuJoCo or IsaacSim offer more accurate physics than PyBullet or Unity. Unity, in turn, allows for more visually rich and complex scenarios.
  • Simulation purpose: quick test, visual prototyping, RL training, or implementation for real-world transfer (sim-to-real).
  • Integration with existing ecosystems: the tools you already use (ROS, Python, ML-Agents, or other frameworks) influence which simulator is the best fit.
  • Hardware resources: simulators like IsaacSim require powerful GPUs for ray tracing or high complexity simulations.
Simulator Comparison Radar Chart

Advantages and disadvantages of each simulator

IsaacSim

Advantages:

  • Realistic physics with NVIDIA PhysX and Omniverse.
  • Advanced support for sensor simulations: RGB cameras, LIDAR, IMU.
  • Very good for sim-to-real simulations, especially for mobile and industrial robots.
  • Integration with ML/DL and Python APIs.

Disadvantages:

  • It requires a powerful GPU.
  • Higher complexity in initial setup.
  • Smaller community compared to MuJoCo or PyBullet.

ROS integration: Native support via ROS 2 and ROSBridge, but not as direct as Gazebo.

MuJoCo

Advantages:

  • Extremely accurate and fast physics, ideal for RL and humanoids.
  • High numerical stability.
  • Support for multiple joints and complex systems.
  • Good for academic/research experiments.

Disadvantages:

  • Simpler visualization. Not focused on visual realism.
  • Limited ROS integration (you can use Python ROS nodes, but not natively).

ROS integration: partial, via Python packages.

PyBullet

Advantages:

  • Free, open-source, and easy to configure.
  • Good for rapid prototyping and RL.
  • Supports humanoids, manipulators, and dynamic objects.
  • Decent visualization.

Disadvantages:

  • Less accurate physics compared to MuJoCo or IsaacSim.
  • Limited support for very visually complex scenarios.

ROS integration: possible via ROS-PyBullet bridge, less robust.

Unity

Advantages:

  • Extremely flexible for visuals and complex interaction.
  • Supports ML-Agents for RL, including multi-agent.
  • Ideal for visual prototyping, educational simulations, or 3D visualization.
  • Excellent GPU hardware support.

Disadvantages:

  • Physics is not as accurate as MuJoCo or IsaacSim.
  • ROS integration requires external plugins.
  • Complex configurations can be heavy on modest computers.

ROS integration: possible via ROS-TCP Connector or plugins.

Gazebo Sim

Advantages:

  • Open-source and natively integrated with ROS.
  • Good for mobile robotics and simple industrial simulations.
  • Support for multiple sensor types.

Disadvantages:

  • Less realistic physics than IsaacSim or MuJoCo.
  • Simpler visualization.
  • Less suitable for RL training with complex graphics.

ROS integration: native and robust.

Conclusion

Sim2Real is not just a technical problem, but an invitation to imagine how we can bring reality into simulation and simulation into reality. The ultimate question is not whether we can close the gap, but how far we can go until simulation and reality become indistinguishable.

