Almost ten years ago, in the online phenomenon “Twitch Plays Pokémon,” over a million people convened to play Pokémon Red simultaneously, with each player’s keystrokes registering as commands to a single pixelated avatar. Now, just as Magikarp evolves into Gyarados, advances in technology are raising a new question: Can AI play Pokémon?
For the past few years, Seattle-based software engineer Peter Whidden has been training a reinforcement learning algorithm to navigate the classic first game in the Pokémon series. In that time, the AI has played over 50,000 hours of the game. Whidden posted a 33-minute YouTube video telling the story of the AI’s development, and it received 2.2 million views in nine days.
“What’s so fun to see is how many people are participating in it,” Whidden told TechCrunch. He has uploaded the code he used to his GitHub, along with instructions on how to run and train the AI. “There’s a lot of people who are very interested in actually doing this production and design process,” he said. One fan was even able to apply the code to Pokémon Crystal, another retro Game Boy title.
The AI’s reinforcement learning model is Pavlovian: it gives the AI point-based incentives to level up Pokémon, explore new areas, win battles, and defeat gym leaders. In some cases, these incentives don’t quite line up with progress through the game, but the AI’s failures are still oddly endearing, which is probably why Whidden’s video went viral.
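To make the idea of point-based incentives concrete, here is a minimal sketch of how such a reward could be computed. It is an illustration, not Whidden’s actual code: the `GameState` fields, the `WEIGHTS` values, and the `score` and `step_reward` functions are all hypothetical names and numbers chosen for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GameState:
    """Hypothetical snapshot of the progress signals the article mentions."""
    party_levels: Tuple[int, ...]  # level of each Pokémon in the party
    areas_explored: int            # count of distinct areas seen so far
    battles_won: int
    gym_badges: int


# Illustrative weights only; the real values are not given in the article.
WEIGHTS = {"level": 1.0, "area": 0.5, "battle": 2.0, "badge": 10.0}


def score(state: GameState) -> float:
    """Weighted sum of the progress signals for one game state."""
    return (
        WEIGHTS["level"] * sum(state.party_levels)
        + WEIGHTS["area"] * state.areas_explored
        + WEIGHTS["battle"] * state.battles_won
        + WEIGHTS["badge"] * state.gym_badges
    )


def step_reward(previous: GameState, current: GameState) -> float:
    """Reward for one step: how much the score changed since the last step."""
    return score(current) - score(previous)
```

Under these assumed weights, gaining one level and discovering one new area in the same step would be worth 1.5 points. Because the reward only counts what it is told to count, anything that raises the score is “good” to the AI, whether or not it moves the game forward.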
In one of its attempts, the AI simply stands still and stares at the water in Pallet Town, the first location you visit in the game, never moving. It gets stuck in an area with animated water, grass, and NPCs that wander back and forth. This means that every frame seems like a new experience to the AI, even though it is just standing there motionless without even having gotten its first Pokémon yet. But this AI isn’t in a hurry to “catch ’em all.” It’s just enjoying the beauty of the Kanto region (or maybe it’s taking an ethical stance against forcing these cute little animals to fight… who’s to say?).
“So, for our own purposes, it’s more valuable to just wander around and take in the scenery than to explore other parts of the world,” Whidden explains in the video. “This is a contradiction that we encounter in real life. Curiosity leads us to the most important discoveries, but at the same time it can also make us easily distracted and get us into trouble.”
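A rough sketch shows why animated scenery can fool a curiosity reward. The approach below, which pays out whenever a screen differs enough from every screen seen before, is an assumption made for illustration and is not necessarily how Whidden’s code measures novelty; `NoveltyTracker`, `pixel_difference`, and the threshold value are hypothetical.

```python
from typing import List, Tuple

Frame = Tuple[int, ...]  # a screen flattened into a tuple of pixel values


def pixel_difference(a: Frame, b: Frame) -> int:
    """Number of pixel positions at which two same-sized frames differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


class NoveltyTracker:
    """Pays a reward the first time a sufficiently different frame appears."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.seen: List[Frame] = []

    def reward(self, frame: Frame) -> float:
        # A frame is novel if it differs from every stored frame by more
        # than `threshold` pixels (the first frame is always novel).
        if all(pixel_difference(frame, old) > self.threshold for old in self.seen):
            self.seen.append(frame)
            return 1.0
        return 0.0
```

With a tracker like this, a character standing still next to animated water keeps producing screens that differ by many pixels, so each new animation frame can look like a discovery even though the player has gone nowhere.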
Somehow, the AI continues to tug at our heartstrings. Later, it experiences a traumatic event at the Pokémon Center. The AI’s success is measured in part by the combined level of all Pokémon in its party. However, if the AI goes to the Pokémon Center and presses buttons enough times to deposit a Pokémon in the PC, the sum of all levels drops significantly, sending a strong negative signal to the AI. If it had both a Pidgey and the cryptid nicknamed “AAAAAAAAA” in its party, the level total would be 25, but if it left the Pidgey in the PC, the total would be only 12.
“Although it doesn’t have emotions like humans, a single event with a very high reward value can leave a lasting impact on its behavior,” says Whidden. “In this case, losing a Pokémon once is enough to form a negative association with the entire Pokémon Center, and the AI will completely avoid it in all future games.”
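The arithmetic behind that negative signal is easy to work through. The sketch below assumes the reward is simply the change in the party’s level total; the function name is hypothetical, and the individual levels (13 for Pidgey, 12 for “AAAAAAAAA”) are inferred from the totals of 25 and 12 rather than stated outright.

```python
from typing import Dict


def level_sum_reward(previous_party: Dict[str, int],
                     current_party: Dict[str, int]) -> int:
    """Change in the combined level of the party between two steps."""
    return sum(current_party.values()) - sum(previous_party.values())


# Levels inferred from the totals in the article (25 before, 12 after).
before_deposit = {"Pidgey": 13, "AAAAAAAAA": 12}
after_deposit = {"AAAAAAAAA": 12}

penalty = level_sum_reward(before_deposit, after_deposit)
print(penalty)  # -13
```

A single level-up is worth +1 under this scheme, so one deposit at the PC wipes out the equivalent of thirteen level-ups at once, which helps explain why the experience leaves such a lasting mark.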
Although the AI has the ability to experience trauma and admire the beautiful pixels of Pallet Town, it is still just a computer. It cannot read or interpret in-game dialogue, so in early iterations the program would get stuck at an early crossroads in the game. When you arrive at the second town in Pokémon Red, you are given a package to take back to the Pokémon Professor in Pallet Town. The AI had trouble backtracking to deliver the package, which made it impossible to progress any further. So Whidden skipped ahead, starting each game after the package had been delivered and with Squirtle as the AI’s starter Pokémon, since the early game is generally easier with a Water-type Pokémon.
“In the video, the farthest [the AI] gets is Mt. Moon, between the first and second gym,” Whidden told TechCrunch. Navigating caves in early Pokémon games is notoriously frustrating, even with a real human brain. However, Whidden recently tweaked some of the rewards in the code and tried different learning algorithms, and he said the AI has finally managed to leave the cave and reach Cerulean City.
Other researchers have used reinforcement learning to explore what AI can do in games, as with DeepMind’s AlphaGo, the first computer program to beat a professional Go player. But the reason Whidden’s video has garnered so much attention is that he is so adept at explaining unfamiliar concepts through the familiar medium of Pokémon.