In this tutorial, I will guide you step by step through installing PyTorch, Stable-Baselines3, and Gymnasium on Windows and Linux. It’s exactly how I did it on my personal laptop, where I train RL agents. By the end, you will know how to run training on CPU and GPU, which errors to expect, and how to fix them.
Decision Tree

I created a decision tree to guide you in choosing the correct installation combination based on the hardware and software of your machine. It covers three choices:
- Operating system: Windows or Linux,
- Type of hardware: CPU, NVIDIA GPU, or AMD GPU,
- The appropriate version of PyTorch. This is the foundation for Stable-Baselines3.
This decision tree matters because Stable-Baselines3 is built on PyTorch, and PyTorch ships in many variants depending on the hardware and drivers. If you pick the wrong one, hard-to-diagnose errors will occur. The most common symptoms are:
- torch.cuda.is_available() returns False
- ImportError: DLL load failed
- RuntimeError: CUDA driver version is insufficient
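These symptoms can be told apart with a short diagnostic script. What follows is a minimal sketch, not an official tool: the function name `diagnose_torch` and the wording of the hints are my own. It relies only on `torch.cuda.is_available()` and `torch.version.cuda`, which is `None` for wheels built without CUDA.

```python
def diagnose_torch():
    """Return a small report about the installed PyTorch build."""
    report = {
        "torch_installed": False,
        "cuda_build": None,
        "cuda_available": False,
        "hint": "",
    }
    try:
        import torch
    except (ImportError, OSError) as exc:
        # Covers a missing package as well as "DLL load failed" on Windows.
        report["hint"] = f"PyTorch could not be imported: {exc}"
        return report

    report["torch_installed"] = True
    report["cuda_build"] = torch.version.cuda  # None for wheels built without CUDA
    report["cuda_available"] = torch.cuda.is_available()

    if report["cuda_available"]:
        report["hint"] = "A GPU is visible to PyTorch."
    elif report["cuda_build"] is None:
        report["hint"] = "This wheel was not built with CUDA (for example the CPU-only build)."
    else:
        report["hint"] = "CUDA build installed, but no usable GPU or driver was found; check the NVIDIA driver."
    return report


if __name__ == "__main__":
    for key, value in diagnose_torch().items():
        print(f"{key}: {value}")
```

Run it inside the environment you want to check. The hint only narrows down the cause; it does not replace reading the full error message.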
Installation order matters
PyTorch must be installed first, because SB3 does not work without it.
Gymnasium comes last. The core package is small (pip already pulls it in as a dependency of SB3), but the optional environment families (Atari, MuJoCo, Robotics) have large dependencies and should be installed only when you need them.
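To see where you are in this sequence, you can ask Python which of the three packages are already present. This is a small sketch using only the standard library. The helper names `installed_versions` and `next_to_install` are hypothetical, and the package names are the ones used in the pip commands of this tutorial.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names in the installation order used in this tutorial.
INSTALL_ORDER = ("torch", "stable-baselines3", "gymnasium")


def installed_versions(packages=INSTALL_ORDER):
    """Map each package to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def next_to_install(found):
    """Return the first missing package in installation order, or None."""
    for name in INSTALL_ORDER:
        if found.get(name) is None:
            return name
    return None


if __name__ == "__main__":
    status = installed_versions()
    for name, ver in status.items():
        print(f"{name:20s} {ver or 'not installed'}")
    print("Next step:", next_to_install(status) or "nothing, all three are present")
```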
Hardware dictates the correct PyTorch version
- If you only have a CPU → install the CPU version. It’s the simplest one.
- If you have an NVIDIA GPU → you must choose the version compatible with CUDA (for example cu121 or cu126).
- If you have an AMD GPU → choose the ROCm version (for example rocm7.0/7.1).
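The three branches above can be written down as a tiny lookup. Treat it as a rough sketch under stated assumptions: `guess_hardware` only checks whether the `nvidia-smi` or `rocm-smi` tools are on the PATH, which is a heuristic of mine, and the wheel tags are just the examples from the list. The ROCm tag and its URL pattern in particular are assumptions, so check the selector on pytorch.org for the current ones.

```python
import shutil

BASE_URL = "https://download.pytorch.org/whl"

# Hardware kind -> wheel tag. These mirror the examples in the list above;
# the "amd" entry (tag and URL pattern) is an assumption, not a verified value.
WHEEL_TAGS = {"cpu": "cpu", "nvidia": "cu121", "amd": "rocm7.0"}


def guess_hardware():
    """Very rough heuristic: look for vendor tools on the PATH."""
    if shutil.which("nvidia-smi"):
        return "nvidia"
    if shutil.which("rocm-smi"):
        return "amd"
    return "cpu"


def pip_command(hardware):
    """Build the pip command for the given hardware kind."""
    tag = WHEEL_TAGS[hardware]
    return f"pip install torch torchvision torchaudio --index-url {BASE_URL}/{tag}"


if __name__ == "__main__":
    print(pip_command(guess_hardware()))
```

The script only prints the command, it does not run it, so you can compare it with the official selector before installing anything.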
System Specifications Used in This Tutorial
Below are the exact system configurations on which I performed the installation and testing of PyTorch, Stable-Baselines3, and Gymnasium.
Main Machine (Host): Windows 11 Pro
- OS: Microsoft Windows 11 Pro
- Laptop model: Dell Latitude 5521
- CPU: 11th Gen Intel® Core™ i7-11850H @ 2.50 GHz (8 Cores / 16 Threads)
- Integrated GPU: Intel® UHD Graphics (1 GB)
- Dedicated GPU: NVIDIA GeForce MX450 (2 GB GDDR5, Driver 32.0.15.8108)
- CUDA toolkit: NVIDIA CUDA 13.0
- System type: x64-based architecture (UEFI BIOS mode)
In this tutorial, I executed all CPU and GPU benchmarks for PyTorch and Stable-Baselines3 on this Windows host machine.
The installed CUDA 13.0 toolkit allows the use of PyTorch builds with GPU acceleration for faster training performance.
Secondary Environment: Ubuntu 22.04 LTS (Virtual Machine)
- Virtualization platform: Oracle VirtualBox 7.x
- Guest OS: Ubuntu Jammy 22.04 LTS (64-bit)
- Allocated resources: 8 vCPUs, 9 GB RAM, 50 GB Disk
- Graphics controller: VMSVGA (software rendering, no GPU acceleration)
The Linux VM is used exclusively for testing installation procedures and compatibility.
GPU acceleration (CUDA or ROCm) is not available inside the virtual machine.
Why I Use Conda on Both Windows and Linux
The reason is simple: I use Conda on both systems to keep my environments compatible across platforms.
Conda allows me to create isolated environments with the exact versions of Python, PyTorch, Stable-Baselines3, and Gymnasium I need.
By doing this:
- I avoid dependency conflicts between pip and apt packages on Linux,
- I can activate the same environment name (gymenv) on both systems,
- I can reproduce the same results and code without any modification.
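A quick way to confirm that a script really runs inside the intended environment is to read the `CONDA_DEFAULT_ENV` variable that `conda activate` sets. The sketch below assumes the environment is called gymenv, and the helper names are hypothetical.

```python
import os
import sys


def active_conda_env():
    """Name of the active Conda environment, or None outside Conda."""
    return os.environ.get("CONDA_DEFAULT_ENV")


def check_env(expected):
    """True if the interpreter runs inside the expected Conda environment."""
    return active_conda_env() == expected


if __name__ == "__main__":
    print("Active environment:", active_conda_env())
    print("Interpreter:", sys.executable)
    print("Python:", sys.version.split()[0])
    if not check_env("gymenv"):
        print("Note: 'gymenv' is not active; run `conda activate gymenv` first.")
```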
Info: In this tutorial, you’ll find the steps for installing OpenAI Gymnasium on Windows using Conda: How to Install OpenAI Gymnasium in Windows and Launch Your First Python RL Environment.
Installing PyTorch for Windows (GPU + CUDA) and Linux (CPU-only or VM)
A. PyTorch Installation on Windows (NVIDIA GPU + CUDA)
On my Windows 11 Pro laptop, I have an NVIDIA GeForce MX450 GPU and CUDA Toolkit 13.0 installed.
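Even so, I install the PyTorch build for CUDA 12.1 (cu121).
Why CUDA 12.1 and not 13?
The PyTorch wheels ship with their own CUDA runtime libraries, so the build does not have to match the toolkit installed on the machine. What matters is that the NVIDIA driver is recent enough for the CUDA version of the wheel.
Since NVIDIA drivers are backward-compatible, a driver that supports CUDA 13.0 also runs the cu121 build with GPU acceleration. Newer CUDA builds (for example cu126) are published as well; cu121 is simply the one I used and tested here.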
Step-by-step (Windows):
# Step 1 – Activate the existing environment
conda activate gymenv
# Step 2 – Check Python version
python --version
# should display: Python 3.11.x
# Step 3 – Install PyTorch with GPU support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# Step 4 – Verify installation
python -c "import torch; print('PyTorch version:', torch.__version__); print('CUDA available:', torch.cuda.is_available())"
Below you can see the confirmation from Conda after installing the PyTorch build compatible with CUDA.
This output shows that the GPU-enabled version (cu121) was successfully installed and recognized by the environment.

B. PyTorch Installation on Linux (Virtual Machine)
The Ubuntu 22.04 virtual machine has no GPU access (software rendering only). In this case we will install the CPU-only version of PyTorch.
Install Conda on Your Linux Virtual Machine
If Conda is not installed yet on the Linux machine, this should be the first step.
Install Miniconda (recommended)
Miniconda is a lightweight version of Anaconda — it installs faster and uses less space.
Run these commands one by one in your Linux terminal:
# Download the latest Miniconda installer for Linux
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Run the installer
bash Miniconda3-latest-Linux-x86_64.sh
# After it finishes, activate Conda
source ~/.bashrc
# Now you can test that Conda works
conda --version

Install PyTorch for CPU
Now it’s time to install PyTorch, the engine that will power your neural networks.
Since we are using a virtual machine without GPU, we’ll install the CPU version of PyTorch.
# Check the Python version. You should see something like: Python 3.13.x
python --version
# Create a new environment for Reinforcement Learning
conda create -n rl python=3.13 -y
# Info: if Conda shows a message about its Terms of Service (ToS),
# accept them with the two commands below, then re-run the create command
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
conda create -n rl python=3.13 -y
# Activate the rl environment
conda activate rl
# Install PyTorch for CPU
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
# Verify PyTorch Installation
# Type python and then press Enter
python
# Copy and paste these three lines, then press Enter twice
import torch
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
If you see something like the image below, then congratulations! PyTorch for CPU is installed correctly!

Install Stable-Baselines3
Why install Stable-Baselines3 (SB3) now? Because:
- SB3 is directly based on PyTorch,
- Now that PyTorch is confirmed to be functional, we can install SB3 without any risk of incompatibility.
Stable-Baselines3 is the brain that uses PyTorch as an engine. PyTorch knows how to “think” (numerical calculations), but SB3 knows how to learn. We use it to implement algorithms like PPO, DQN, SAC, or A2C.
So, after we’ve put the engine (PyTorch) in place, now we’re going to install the driver (SB3).
# Before installing SB3, make sure your Conda environment is active
# for Windows
conda activate gymenv
# for Linux
conda activate rl
# Universal command (valid for Windows and Linux)
pip install stable-baselines3
# If you want extra features such as TensorBoard, Atari, etc.:
pip install "stable-baselines3[extra]"
# After installation, simply check:
python -c "import stable_baselines3; print(stable_baselines3.__version__)"

Install Gymnasium
Now that you have PyTorch (the engine) and Stable-Baselines3 (the driver) installed, we are going to install Gymnasium. It is the training track where your reinforcement learning (RL) agent will train.
Installation on Windows and Linux
# Before installing Gymnasium, make sure your Conda environment is active
# for Windows
conda activate gymenv
# for Linux
conda activate rl
# Universal command (valid for Windows and Linux)
# Install the basic Gymnasium package.
# This installs the core environments such as CartPole, MountainCar, Pendulum, etc.
pip install gymnasium
# Verify installation:
# step 1. Type python and then press Enter
python
# step 2. Copy and paste the Python code below, then press Enter
import gymnasium as gym
env = gym.make("CartPole-v1")
observation, info = env.reset()
print("Environment loaded successfully!")
print("Initial observation:", observation)
env.close()
If the code runs without errors and prints similar values, Gymnasium works. The results should look something like the image below:

Verify the Full Reinforcement Learning Setup
Now that everything is installed, let’s test the complete setup by training a small RL agent.
We’ll use Stable-Baselines3 (SB3) together with Gymnasium and PyTorch to solve one of the simplest but most used environments: CartPole-v1.
At this step, our goal is to check that:
- Gymnasium can create and run environments,
- SB3 can communicate with PyTorch,
- Training and logging work correctly.
If this demo runs without errors, our RL setup is working.
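The first point in the list can also be checked on its own, before SB3 is involved, by playing one episode with random actions. This is a minimal sketch. `run_random_episode` is a hypothetical helper that only relies on the standard Gymnasium `reset()` and `step()` return values, and Gymnasium is imported inside `main()` so a missing package gives a readable message instead of a traceback.

```python
def run_random_episode(env, max_steps=500, seed=0):
    """Play one episode with random actions; return (steps, total_reward)."""
    obs, info = env.reset(seed=seed)
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        # terminated: the pole fell or the cart left the track
        # truncated: the environment's own time limit was reached
        if terminated or truncated:
            return step, total_reward
    return max_steps, total_reward


def main():
    try:
        import gymnasium as gym
    except ImportError:
        print("Gymnasium is not installed in this environment.")
        return
    env = gym.make("CartPole-v1")
    steps, total = run_random_episode(env)
    env.close()
    print(f"Random policy lasted {steps} steps, return = {total}")


if __name__ == "__main__":
    main()
```

A random policy usually drops the pole after a few dozen steps, so a short episode here is expected and not a sign of a broken installation.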
Before starting the demo training, we should verify that the training is running on GPU in Windows and CPU in Linux.
PyTorch + SB3 automatically uses the GPU if available. However, some users are unsure whether training actually uses the GPU or just the CPU.
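If you prefer not to rely on the automatic choice, SB3 algorithms accept a `device` argument and expose the result as `model.device`. The sketch below shows one way to make the choice explicit. `pick_device` and `build_ppo` are hypothetical helper names, and the heavy imports sit inside `build_ppo` so the selection logic can be read and tested separately.

```python
def pick_device(cuda_available, prefer_gpu=True):
    """Return the device string to pass to an SB3 algorithm."""
    return "cuda" if (prefer_gpu and cuda_available) else "cpu"


def build_ppo(prefer_gpu=True):
    """Create a PPO model on an explicitly chosen device.

    Needs torch, stable-baselines3 and gymnasium to be installed.
    """
    import gymnasium as gym
    import torch
    from stable_baselines3 import PPO

    device = pick_device(torch.cuda.is_available(), prefer_gpu)
    model = PPO("MlpPolicy", gym.make("CartPole-v1"), device=device, verbose=0)
    print("SB3 model is on:", model.device)
    return model
```

Calling `build_ppo(prefer_gpu=False)` forces CPU training even on a machine with a GPU, which can be the better choice for a small MLP policy such as the one used for CartPole.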
How to check if SB3 uses GPU or CPU for PyTorch
# Make sure your Conda environment is active
# for Windows
conda activate gymenv
# for Linux
conda activate rl
# step 1. Type python and then press Enter
python
# step 2. Copy and paste the Python code below, then press Enter
import torch
print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("Device name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU only")
The results should look like the images below:


Demo training a small RL agent with PyTorch, SB3, and Gymnasium on Windows and Linux
STEP 1: Activate the environment
# Make sure your Conda environment is active
# for Windows
conda activate gymenv
# for Linux
conda activate rl
STEP 2: Create a new Python file
Create a Python file named test_sb3_cartpole.py and copy the code below inside it.
import gymnasium as gym
from stable_baselines3 import PPO
# 1. Create the environment (CartPole)
env = gym.make("CartPole-v1")
# 2. Initialize the agent using the PPO algorithm
model = PPO("MlpPolicy", env, verbose=1)
# 3. Train the agent for 10,000 steps
print("Training started...")
model.learn(total_timesteps=10_000)
print("Training finished!")
# 4. Save the trained model
model.save("ppo_cartpole_test")
# 5. Test the trained model
obs, info = env.reset()
for _ in range(5):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    # Without render_mode in gym.make(), render() only emits a warning and draws nothing
    env.render()
    if terminated or truncated:
        obs, info = env.reset()
env.close()
print("Everything works!")
STEP 3: Run the script
# In your terminal:
python test_sb3_cartpole.py
If everything is installed correctly, you’ll see a training log like this:


Note: For simple environments like CartPole, GPU acceleration provides little or no speed-up. But for vision-based or high-dimensional tasks (Atari, MuJoCo, Robotics), GPU makes a big difference.
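You can measure this on your own machine by timing the same short training run on each available device. The sketch below is only a rough wall-clock comparison, not a rigorous benchmark. `time_call` and `compare_devices` are hypothetical helpers, and a single run of 10,000 steps is noisy, so read the numbers as an indication only.

```python
import time


def time_call(fn):
    """Run fn() once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def compare_devices(total_timesteps=10_000):
    """Train PPO on CartPole once per available device and return the timings.

    Needs torch, stable-baselines3 and gymnasium to be installed.
    """
    import gymnasium as gym
    import torch
    from stable_baselines3 import PPO

    devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
    timings = {}
    for device in devices:
        env = gym.make("CartPole-v1")
        model = PPO("MlpPolicy", env, device=device, verbose=0)
        timings[device] = time_call(lambda: model.learn(total_timesteps=total_timesteps))
        env.close()
    return timings
```

Calling `print(compare_devices())` returns one timing per device, and on a CPU-only machine such as the Linux VM the result contains only the "cpu" entry.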





