
It [the simulation] walks!

A project log for The Unnamed Spider bot

A fast and graceful remote-controlled spider

Jeremy • 11/27/2025 at 03:29 • 0 Comments

Six months ago, I never would have imagined building a robot reinforcement learning framework, but here we are. (psst, it's called Genesis Forge, and you should check it out)

It's been a busy six months since my last post, so let me take you through the training progress.

Foundations - how reinforcement learning works

Before we get too far into it, let's talk a little about how reinforcement learning works.

You start with a simulated robot—joints, actuators, and a flat surface to explore.

Your training script observes the robot's state: joint positions, velocity, height above ground, etc. These observations feed into a reinforcement learning algorithm (typically PPO).
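As a concrete sketch, an observation vector for a robot like this might be assembled as below. The field layout and helper name are illustrative, not the actual training code:

```python
# Hypothetical observation builder for a 24-joint robot (illustrative only).
def build_observation(joint_positions, joint_velocities, base_height, base_orientation):
    """Flatten the robot's state into a single vector for the policy."""
    obs = []
    obs.extend(joint_positions)    # 24 values: current angle of each joint
    obs.extend(joint_velocities)   # 24 values: angular velocity of each joint
    obs.append(base_height)        # 1 value: body height above the ground
    obs.extend(base_orientation)   # 4 values: body orientation quaternion
    return obs
```

Every training step, a vector like this gets handed to the learning algorithm.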

The PPO algorithm responds with actions—usually target positions or torques for each actuator. A robot with 24 actuators would likely get 24 actions per step.

Your script then calculates a reward based on the robot's new state. Did it stay upright? Move forward? These individual rewards combine into a single score that tells PPO whether its actions helped or hurt.
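A reward function along those lines might look like this sketch. The terms and weights here are hypothetical and would need tuning for a real robot:

```python
def compute_reward(forward_velocity, base_height, is_upright, energy_used):
    """Combine individual reward terms into one score per step (illustrative weights)."""
    reward = 0.0
    reward += 2.0 * forward_velocity          # encourage moving forward
    reward += 1.0 if is_upright else -1.0     # encourage staying upright
    reward -= 0.1 * abs(base_height - 0.15)   # keep the body near a target height
    reward -= 0.01 * energy_used              # discourage wasteful flailing
    return reward
```

The final score is what PPO sees: positive when the last action helped, negative when it hurt.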

Through repetition, the robot learns. It starts by flailing randomly—it has limbs but no idea what they're for. Gradually, guided by rewards, it discovers which movements work. Eventually, it lifts itself off the ground and takes its first steps.

The whole process mirrors how you'd train an animal: reward good behavior, ignore bad, and let the learner figure out the details.
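Put together, the observe–act–reward cycle above is just a loop. Here is a minimal toy version with stand-in classes; a real setup would plug in a physics simulator for the environment and a PPO policy for the agent:

```python
import random

# Toy stand-ins that show the shape of the training loop (not real training code).
class ToyEnv:
    def reset(self):
        self.steps = 0
        return [0.0]                           # initial observation

    def step(self, action):
        self.steps += 1
        reward = -abs(action[0])               # toy reward: smaller actions score better
        done = self.steps >= 10                # episode ends after 10 steps
        return [0.0], reward, done

def random_policy(obs):
    # A freshly initialized policy flails randomly, just like the robot does.
    return [random.uniform(-1.0, 1.0)]

env = ToyEnv()
obs = env.reset()
total_reward, done = 0.0, False
while not done:
    action = random_policy(obs)                # policy proposes actuator targets
    obs, reward, done = env.step(action)       # simulator advances one step
    total_reward += reward                     # accumulate the episode's score
```

In real training, PPO periodically updates the policy from the collected experience, so the actions gradually stop being random.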

Humble beginnings - Gymnasium + StableBaselines

I started simple by exporting my robot to MuJoCo XML and training it in the Gymnasium environment using StableBaselines3. This worked well enough — I got the robot to stand up, but it never truly started walking. I realized I'd need to graduate to a more capable system to get much further.

I liked to call this the drunken spider.

Isaac Lab - so much promise...

The more I researched reinforcement learning, the more I encountered people talking about NVIDIA IsaacLab. It's a feature-rich platform that runs training directly on NVIDIA GPUs, enabling thousands of simultaneous training environments. 

I spent months working with it, including countless hours converting my robot model from the intuitive MuJoCo XML to the more complex USD format. I built various environments and tweaked settings, but my spider wasn't learning.

It turned out that my simulator was perpetually unstable, and the robots would randomly explode off the ground.

I scoured the documentation and tried everything that was supposed to help with simulation stability. I opened discussion threads asking for help, which went unanswered. 

Finally, in mid-August, I abandoned IsaacLab.

Genesis: A fresh start

Enter Genesis — a new robotics simulator with its first commit in December 2024. It boasts being cross-platform, 100% Python, and faster than any other simulator around. Within a week of getting familiar with it, my robot was not only standing up but taking its first steps!

Genesis deserves serious credit. Not only is their simulation platform outstanding, but their development team is also incredibly active on GitHub. They respond to questions, fix bugs, and constantly add performance improvements to make training even faster!

Despite all this, I missed some of the IsaacLab ecosystem. Genesis is a simulator optimized for parallel environments, not a training ecosystem. IsaacLab, on the other hand, provides a library of composable blocks that abstract away the implementation details, leaving just the important bits.

Genesis Forge was born

Taking inspiration from IsaacLab, I started building a series of reusable building blocks for Genesis environments and named this framework Genesis Forge. It has become an end-to-end system for creating robotic training environments in the Genesis simulator, with all the boilerplate handled for you. You can focus on what matters, while keeping your code clean and easy to understand.

It even has some enhanced features, like being able to "play" your trained robot with a gamepad controller.

Gamepad controller interface

What's next?

Now that I've trained a spider to walk in a simulator, I guess it's time to build the real thing.

I've already built a single leg, so the first thing I want to do is wire the leg up to the computer and have it mirror one of the legs in the simulator. Then I'll put in a very large motor order and attempt to construct the entire thing.

Stay tuned.
