In artificial intelligence training, games serve as a useful analogue for real-world tasks. The implication is that an AI agent that plays games could potentially learn more about navigating the real world than anything confined to a single environment.
The idea could extend further, to intelligent agents that are not merely opponents but cooperative partners, playing alongside humans in a variety of games. That vision was shared by a member of the Google DeepMind team that developed the AI agent, SIMA.
SIMA’s training was extensive, drawing on many recordings of people playing video games, both alone and in groups. The training data included the players’ keyboard and mouse movements, along with notes describing their actions in each game. SIMA was trained with a technique known as imitation learning, in which the agent learns to mimic human gameplay.
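To make the idea of imitation learning concrete, here is a minimal sketch of its simplest form, often called behavioural cloning: the agent records which action humans took in each situation and then repeats the most common one. This is not SIMA's implementation. The demonstration log, its field names, the action labels and the functions `fit_policy` and `act` are all hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical demonstration log: (instruction, observation, human_action).
# The field names and values are illustrative, not SIMA's actual data format.
DEMONSTRATIONS = [
    ("turn left", "corridor", "mouse_left"),
    ("turn left", "corridor", "mouse_left"),
    ("turn left", "corridor", "key_a"),
    ("climb ladder", "ladder_ahead", "key_w"),
    ("climb ladder", "ladder_ahead", "key_w"),
    ("open the map", "corridor", "key_m"),
]


def fit_policy(demonstrations):
    """Count how often humans took each action in each (instruction, observation) context."""
    counts = defaultdict(Counter)
    for instruction, observation, action in demonstrations:
        counts[(instruction, observation)][action] += 1
    return counts


def act(policy, instruction, observation, default="no_op"):
    """Imitate the most frequent human action; fall back to `default` for unseen contexts."""
    actions = policy.get((instruction, observation))
    if not actions:
        return default
    return actions.most_common(1)[0][0]


policy = fit_policy(DEMONSTRATIONS)
print(act(policy, "turn left", "corridor"))          # mouse_left
print(act(policy, "climb ladder", "ladder_ahead"))   # key_w
print(act(policy, "open the map", "ladder_ahead"))   # no_op (context never demonstrated)
```

The last call shows the weakness of a plain lookup table: it has nothing to imitate in a situation no human demonstrated. A practical agent would replace the table with a learned model that can generalise across similar situations, which is the kind of transfer the article goes on to describe.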
As a result, SIMA has mastered 600 basic commands, such as “turn left”, “climb ladder” or “open the map”, and can carry out each of them in under 10 seconds on average.
The team found that SIMA performed better when trained on a variety of games than when trained on a single game. This appeared to be because the agent could discern principles shared across games and use them to refine its skills and carry out tasks more effectively.
The agent was also able to play games it had never encountered, a significant accomplishment in AI research. An AI specialist at Queen Mary University of London described this as a sign of exciting potential for future applications.
By learning to carry out tasks from human-supplied examples and guidance, AI systems could become more powerful tools, particularly when trained on larger data sets. SIMA’s training data set is relatively limited, however, which likely accounts for the limitations in the system’s current performance.