“The work is remarkable,” says AI researcher Matthew Guzdial of the University of Alberta, who previously developed a similar game generator. Genie was trained on 30,000 hours of video mined from the internet, covering hundreds of 2D platform games. The approach is not new: Guzdial’s game generator likewise learned from videos to produce abstract platformers, and Nvidia’s GameGAN was also trained on video data, which let it reproduce games such as Pac-Man.
Those earlier models, however, were trained on video footage paired with input actions, such as button presses on a controller. For instance, a video frame of Mario jumping would be coupled with the corresponding ‘jump’ action. Tagging video footage with input actions is labor-intensive, which limits the amount of training data available. Genie, by contrast, was trained solely on video footage, learning for itself which of eight possible actions would cause the game character’s position to change. This technique turned countless hours of existing online video into usable training data.
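The idea of recovering actions from unlabeled footage can be illustrated with a toy sketch. It is not Genie's method: it assumes the character's position in each frame is already known, whereas Genie works from raw video, and the eight displacement vectors in `CODEBOOK` are hypothetical values chosen for illustration rather than anything learned. Each frame-to-frame movement is simply assigned to the nearest of the eight candidate actions.

```python
from typing import List, Tuple

Vec = Tuple[int, int]

# Hypothetical codebook: eight candidate "actions", each represented by the
# character displacement (dx, dy) it tends to produce between two frames.
# A real system would have to learn such a codebook; these are made up.
CODEBOOK: List[Vec] = [
    (0, 0), (-1, 0), (1, 0), (0, -1),
    (0, 1), (-1, -1), (1, -1), (2, 0),
]


def infer_action(prev_pos: Vec, next_pos: Vec) -> int:
    """Return the index of the codebook entry closest to the observed movement."""
    dx = next_pos[0] - prev_pos[0]
    dy = next_pos[1] - prev_pos[1]

    def squared_distance(code: Vec) -> int:
        return (code[0] - dx) ** 2 + (code[1] - dy) ** 2

    return min(range(len(CODEBOOK)), key=lambda i: squared_distance(CODEBOOK[i]))


def label_video(positions: List[Vec]) -> List[int]:
    """Assign one inferred action index to every consecutive pair of frames."""
    return [infer_action(a, b) for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    # Character moves right, then up (negative y), then stays still.
    print(label_video([(0, 0), (1, 0), (1, -1), (1, -1)]))
```

No human ever tags a frame with ‘jump’ here; the labels fall out of the footage itself, which is the property that makes video-only training attractive.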
Genie’s standout feature is its ability to generate simple games from hand-drawn sketches. It produces each new frame on the fly, in response to the player’s action. For instance, pressing ‘Jump’ prompts Genie to update the current image to show the character jumping. Similarly, pressing ‘Left’ produces an image of the character shifted to the left. The game thus advances frame by frame as the player interacts with it.
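The frame-by-frame loop described above can be sketched in a few lines. This is a minimal illustration, not Genie itself: the hand-written `step` function is a hypothetical stand-in for the learned model, the action names are assumptions, and the "frames" are small text grids rather than images.

```python
from typing import List, Tuple

Frame = List[str]          # a frame is a list of text rows
Pos = Tuple[int, int]      # (column, row) of the character

WIDTH, HEIGHT = 8, 3
GROUND = HEIGHT - 1        # bottom row of the grid


def render(pos: Pos) -> Frame:
    """Draw the character '@' on an otherwise empty grid."""
    return [
        "".join("@" if (cx, cy) == pos else "." for cx in range(WIDTH))
        for cy in range(HEIGHT)
    ]


def step(pos: Pos, action: str) -> Pos:
    """Hypothetical stand-in for the model: map (state, action) to the next state."""
    x, y = pos
    if action == "left":
        x = max(0, x - 1)
    elif action == "right":
        x = min(WIDTH - 1, x + 1)
    # A jump from the ground lifts the character one row; anything else lands it.
    y = GROUND - 1 if action == "jump" and y == GROUND else GROUND
    return (x, y)


def play(actions: List[str], start: Pos = (0, GROUND)) -> List[Frame]:
    """Produce one new frame per player action, each built on the previous state."""
    frames, pos = [], start
    for action in actions:
        pos = step(pos, action)
        frames.append(render(pos))
    return frames


if __name__ == "__main__":
    for frame in play(["right", "jump", "left"]):
        print("\n".join(frame), end="\n\n")
```

The point of the sketch is the shape of the loop: each frame depends only on the previous state and the latest action, so the game exists only as far as the player has played it.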
Future versions of Genie could run faster. Tim Rocktäschel, the research scientist at Google DeepMind who leads the Genie project, says there is no fundamental barrier preventing Genie from reaching 30 frames per second. Genie uses many of the same technologies as modern large language models, a field that has seen significant progress in inference speed.
Genie has also picked up some of the visual quirks common to platform games. One example is parallax, a visual effect in which the foreground moves sideways faster than the background. Genie frequently incorporates this effect into the games it generates.
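For readers unfamiliar with the effect, a conventional game engine would typically produce parallax along the following lines. The layer names and scroll factors are hypothetical values chosen for illustration; Genie is not programmed this way and instead reproduces the effect from having seen it in video.

```python
from typing import Dict

# Hypothetical scroll factors: 1.0 moves in lockstep with the camera,
# smaller values move more slowly and therefore appear farther away.
LAYER_FACTORS: Dict[str, float] = {"sky": 0.25, "hills": 0.5, "foreground": 1.0}


def layer_offsets(camera_x: float,
                  factors: Dict[str, float] = LAYER_FACTORS) -> Dict[str, float]:
    """Return how far each layer has scrolled sideways for a given camera position."""
    return {name: camera_x * factor for name, factor in factors.items()}


if __name__ == "__main__":
    # After the camera has moved 100 pixels, the foreground has shifted
    # four times as far as the sky.
    print(layer_offsets(100.0))
```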
Although Genie remains an internal research project with no plans for public release, Guzdial points out that the Google DeepMind team sees it as a potential game-making tool. That vision aligns with Guzdial’s own research interests. “I’m excited to see what they build,” he says.