Favorite Reinforcement Learning Environments

Alternatives to Atari Games or OpenAI Gym for your next RL project


I decided to make a list of some underrated RL environments. As cool as environments like OpenAI’s Dactyl are (and by extension NVIDIA’s Isaac Gym), the ones listed here seem like they’re more useful for producing new breakthroughs.

For the sake of this list, I’m also prioritizing environments where it’s relatively easy for a newcomer to get started. For example, Google released a fascinating paper on using RL for optimizing chip design. The only problem is that the necessary data for training such an algorithm is still proprietary, so it’s probably out of reach unless you’re already working for a large chip-maker.

ML Economist


If you’ve talked with me in person, you’re probably aware of my views on capitalism vs. socialism. Specifically, I’m talking about the views that neither one can be applied universally, and that there’s a time and a place for both. In some ways, you can compare this to a living animal with its fight-or-flight and rest-and-digest responses. Much like a living animal, the decisions that need to be made for economic policies hit the limits of what “dumb” statistics can do fairly quickly. I think it’s in my best interest to make sure algorithmic decision-making is known not just to the high-frequency traders (I suppose I’ve already contributed a bunch to that), but also to the policymakers. Obviously, being able to model the economy further would depend not only on more data and compute power, but also on the ability to understand human politics. Still, the former is a matter of resource investment, and the latter might not be impossible.

Train track management



UPDATE: NetHack has been solved (see the Twitter announcement: https://twitter.com/egrefen/status/1331659533077319685)

Math problems

Musculoskeletal Robots



Reinforcement learning’s impressive accomplishments sit uncomfortably alongside its clear shortcomings: producing agents that perform a task (making coffee) without causing negative side effects (burning the house down to boil water faster) is an open problem.

Watch this video to learn, from the SafeLife team at the Partnership on AI, how you can use your ML expertise to help solve this critical issue!

Participate in the benchmark here: http://wandb.me/safelife

ML 3D mind-teaser Puzzles

This AI Makes Puzzle Solving Look Easy! 🧩


Fusion Plant Simulator

Agence (interactive VR videogame)

RL Agents: SOS!

A new multimedia experience lets audience members help artificially intelligent creatures work together to survive.

What’s new: Agence, an interactive virtual reality (VR) project from Toronto-based Transitional Forms and the National Film Board of Canada, blends audience participation with reinforcement learning to create an experience that’s half film, half video game. The production, which runs on VR, mobile, and desktop platforms, debuted at the 2020 Venice Biennale exhibition of contemporary art. It’s available for download from Steam.

How it works: Five cute, three-legged creatures live atop a tiny, spherical world. They must learn to work together to grow giant flowers for food without throwing the planet off-balance. Players can simply watch them work or play an active role in the story by planting flowers or moving agents around.

Players can let the agents interact under control of a rules-based algorithm or turn on a reinforcement learning (RL) model that drives them to seek rewards, such as bites of fruit, and avoid repeating mistakes, such as falling off the edge of the world.

The agents were pre-trained in a stripped-down version of the game world using a method called proximal policy optimization (PPO), which makes RL less sensitive to step size without the tradeoffs incurred by other approaches. The game’s creators settled on PPO because it was quickest at training the agents to solve the game’s physical challenges, such as learning to balance their weight to keep the world from spinning, technical director Dante Camarena told The Batch.

The developers are collecting data on how users interact with the agents. They’ll use the information to update the training simulation monthly.

Behind the news: Agence director Pietro Gagliano received an Emmy in 2015 for a VR experience in which viewers encountered the Headless Horseman from the Sleepy Hollow TV series.

Why it matters: Agence represents a new type of medium in which the audience members collaborate with AI to create unique, immersive experiences. It offers new possibilities for user input and interactive storytelling that, whether or not Agence itself catches on, seem destined to transform electronic entertainment.

We’re thinking: Video game opponents driven by rules can be challenging, but imagine trying to outsmart the cops in Grand Theft Auto if they could learn from your past heists.
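For readers curious about what "less sensitive to step size" means in practice, here is a minimal sketch of the clipped surrogate objective that PPO is usually described with. This is not Agence's training code. The function name `ppo_clip_objective` and the default `clip_eps=0.2` are my own assumptions for illustration, and the inputs are plain lists of per-action log-probabilities and advantage estimates.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    new_logps / old_logps: log-probabilities of the taken actions under the
    updated policy and under the policy that collected the data.
    advantages: advantage estimates for those actions.
    clip_eps: hypothetical default; limits how far the probability ratio can
    move the objective away from the data-collecting policy.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the updated and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # An action whose probability doubled, with a positive advantage:
    # the clipped term caps the contribution instead of rewarding the jump.
    print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
```

The clipping is what removes the incentive to take a huge policy step: once the ratio leaves the allowed band in the direction the advantage favors, pushing it further no longer improves the objective.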


Cited as:

    title = "Favorite Reinforcement Learning Environments",
    author = "McAteer, Matthew",
    journal = "matthewmcateer.me",
    year = "2020",
    url = "https://matthewmcateer.me/blog/favorite-rl-environments/"

If you notice mistakes and errors in this post, don’t hesitate to contact me at [contact at matthewmcateer dot me] and I will be very happy to correct them right away! Alternatively, you can follow me on Twitter and reach out to me there.

See you in the next post 😄
