Clippy Inbound for Call of Duty? | AI and Games Newsletter 29/05/24
Microsoft's push for Copilot in games has a mixed reception
The AI and Games Newsletter brings concise and informative discussion on artificial intelligence for video games each and every week, plus a summary of all our content released across various channels, from YouTube videos to episodes of our podcast ‘Branching Factor’ and in-person events.
You can subscribe and support AI and Games right here on Substack, with weekly editions appearing in your inbox. The newsletter is also shared with our audience on LinkedIn.
Hello everyone, welcome back to this week’s edition of the AI and Games newsletter. There’s a whole bunch of stuff to talk about this week: further musings on my new project Goal State, reminiscing about 10 years of YouTube, and... now you can chat with AI while playing Minecraft…?
Okay, let’s first dig into the announcements, and then I’ll tell you some stories.
Weekly Announcements
Hot off last week’s announcement of my upcoming Kickstarter for my online course project Goal State, the latest episode of the Branching Factor podcast was a little self-indulgent. I hosted a live Q&A with my YouTube audience while also digging into more of my philosophy towards building this new platform.
A reminder that I’ll be participating in GamesIndustry.biz’s GI Sprint event running from June 17th - July 5th. You can find out more about the event here.
Just this morning I was pleased to see my interview with Ed Nightingale over at Eurogamer go live, putting me on the front page of the site. We had a great chat right after my talk at the London Developer Conference about all things AI and video games. Check it out.
Lastly, I forgot to mention a while back, but a big thank you to everyone who has helped the AI and Games Substack break over 1000 subscribers (which sits nicely alongside the thousands of followers on LinkedIn). It’s great to have an excuse to write more about the world of AI for games each and every week. Plus, a cheeky reminder that you can support my work on AI and Games via paid subscriptions on Substack, which provide insider access to upcoming content, ad-free episodes of Branching Factor, and some more fun stuff coming soon!
‘That Time I Scared a Comms Team…’
The latest video to drop on the main AI and Games YouTube channel was in keeping with the theme of celebrating 10 years of AI and Games. In the first anniversary video, I reminisced about my experiences and how building up AI and Games first as a YouTube series, and then as a broader platform, led to huge changes to my career.
For this episode, my crowdfunding audience originally asked which of my YouTube videos were my personal favourites. But I’m British, and we don’t indulge in such self-confidence - self-flagellation maybe, but heaven forbid we say we’ve done a good job at something. So I decided to do something a little different: I share some of the stories behind the videos, and how certain episodes wound up being made. Plus I give some deeper insight into the challenges I face in making my work stand up to scrutiny, and in maintaining the quality my audience has come to expect.
Microsoft Copilot x Xbox Games?
One of the stories, nay tweets, that gathered a lot of attention last week was from Microsoft, whose ‘Copilot’ account posted an update showcasing an upcoming feature that allows the system to interact with the user while they play Minecraft.
Microsoft AI CEO Mustafa Suleyman later argued that this is “a magical experience: smart, intuitive, natural, and useful”, but the internet at large didn’t see it that way. The example itself is of Copilot giving guidance on how to craft a basic item, as well as how to craft the materials needed.
The response to this unveiling has largely been a collective groan among the general gaming populace. Not just because it is a relatively simple example - one that can be achieved manually by using the in-game recipe book - but because many rightfully pointed out that it suggests their operating system is observing their behaviour, that resources are being wasted on AI that is ultimately pointless, and that, in the worst case, the same result could have been achieved with a Google, nay… Bing, of the task at hand.
Now I won’t sugar-coat it, I don’t think this is a particularly good application of this type of technology. Not only do I think it’s a bad demo, frankly I am also not convinced it’s even real. My personal rule is that until I can replicate consumer-grade AI on my own local device, I assume it is partially, if not entirely, fabricated. And it’s easy to reach that conclusion when you consider the cringe-inducing dialogue of the user, combined with the cadence and style of the ‘Copilot’ voice - which doesn’t really align with most synthesised voice libraries. Plus, speaking as someone who helps create authentic videos for AI companies to showcase their wares, I can always spot a fake UI transition or two. No way Windows 11 runs that smoothly in practice.
But, I wanted to talk about this because while I am wholly unconvinced of this demonstration at this time, it does speak to an interesting and practical application of large language models and similar generative AI technologies in the context of games. I am going to treat the example shown in this tweet as legitimate for the rest of this write-up, so please bear that in mind. While some of the points I raise below are hypothetical, several others sadly are not.
Unpacking the Idea
If you unpack what is happening in this demonstration, there are several key elements that are interesting (I’ve sketched how they might fit together in code just after the list):
The system is recognising which game is being played, and later recognising aspects of the current game state.
The system utilises this as context for a conversation with the user to understand their goals.
It establishes what actions the player should take next, based upon this context, and gives guidance on how to move forward.
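To make those three elements concrete, here is a minimal, purely hypothetical sketch of how such a pipeline might hang together. To be clear, this is not how Copilot is implemented: detect_running_game, read_game_state, and query_llm are stand-in names I’ve invented for illustration, with a stubbed response where a real language model call would sit.

```python
from dataclasses import dataclass

@dataclass
class GameContext:
    title: str           # which game is in focus, e.g. "Minecraft"
    state_summary: str   # a short text summary of the current game state

def detect_running_game() -> str:
    """Step 1: recognise which game the player is running (hypothetical)."""
    return "Minecraft"

def read_game_state(title: str) -> GameContext:
    """Step 1, continued: summarise the current state as text (hypothetical)."""
    return GameContext(title, "Player has oak logs in their inventory, no crafting table yet.")

def query_llm(system_prompt: str, user_message: str) -> str:
    """Steps 2 and 3: stand-in for a call to a large language model."""
    return "Turn your logs into planks, then place four planks in the 2x2 grid to craft a table."

def assist(user_message: str) -> str:
    # Gather game context, then fold it into the prompt alongside the user's question.
    ctx = read_game_state(detect_running_game())
    system_prompt = (
        f"You are an in-game assistant. The player is in {ctx.title}. "
        f"Current state: {ctx.state_summary}. Give short, actionable guidance."
    )
    return query_llm(system_prompt, user_message)

print(assist("How do I make a crafting table?"))
```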
Some of this is exciting and has great potential, but the other side is actually exposing something that many users seldom think about. So let’s dig into that some more.
AI as an Assistant and Narrator
Much of what we have seen thus far on how LLMs can be adopted in games has been predicated largely on the idea of using them to create stories and dialogue - an idea popularised first by the likes of AI Dungeon, but also by a number of companies aiming to create intelligent non-player characters (NPCs) that utilise a suite of state-of-the-art AI systems to enable fully immersive conversation. I’ve discussed these at great length previously on AI and Games, and I’d recommend looking up my writings on the likes of Inworld, Convai, and more recently Ubisoft’s NEO.
But there is a much broader range of applications for LLMs and adjacent technologies within games. What is perhaps the more interesting aspect of this demo is that it is, hypothetically, doing the three things I’ve listed above. Rather than generating story or dialogue, it is working as an assistant, but also as something of a silent commentator.
I say commentator because it is quietly establishing a history and context of the actions the player has made, such that it understands how to provide assistance. It has determined that the player has started a game of Minecraft and then used this as a means to make some suggestions on how to achieve their goals. This is actually quite a powerful idea within the context of games, but again, I think this is a horrible example of it.
Recent work by academic researchers has highlighted the broad range of research happening in areas of game design and development using large language models. This is a topic I will be returning to in much more detail later in the year, but one aspect highlighted in (Gallotta et al., 2024) is that player assistants have existed for many years in the likes of the Civilization and The Sims franchises, but have seldom been capable of achieving this level of bespoke guidance delivered with a conversational personality.
Similarly, they highlight the potential of using an AI as a commentator. There is huge overlap between a (good) assistant and a commentator, given both rely on the ability to summarise information in context and then give advice based upon it. We typically think of automated commentary in the likes of sports games (FIFA, or rather, EA Sports FC), but this extends into other areas, with esports seeing significant research in recent years given popular titles such as Dota 2 and League of Legends record their in-game event data such that matches can be replayed by spectators.
If this Copilot system did work this way, there are more interesting applications that sit on that line between assistant and commentator. For example (I’ve sketched the first of these in code after the list):
Returning to an open-world game (e.g. Far Cry, Assassin’s Creed, Cyberpunk 2077) months after you last played, and asking the game for a reminder of what you were doing when you last played, and advice on what to do next.
Meeting a character in a game whose backstory you’ve completely forgotten, and you ask for a quick summary.
A visually-impaired player querying the system about the state of the local environment, such that they can then make a decision.
Real-time audio captions that give more contextually relevant information.
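As a purely illustrative sketch of that first example, imagine feeding a recorded event log from a save file into a language model prompt to produce a recap for a returning player. The event log format, the quest names, and the query_llm stub are all invented here - no game exposes exactly this API.

```python
from typing import List

def build_recap_prompt(events: List[str]) -> str:
    """Condense recorded gameplay events into a prompt asking for a recap."""
    history = "\n".join(f"- {e}" for e in events[-50:])  # only the most recent events
    return (
        "The player is returning after months away. Based on the events below, "
        "remind them what they were doing and suggest what to do next.\n" + history
    )

def query_llm(prompt: str) -> str:
    """Stand-in for a call to a large language model."""
    return "You were mid-way through the 'Stolen Caravan' quest - head back to the canyon camp."

# A fabricated event log, standing in for whatever the game actually records.
events = [
    "Accepted quest 'Stolen Caravan'",
    "Fast-travelled to Ridge Outpost",
    "Cleared the bandit camp at the canyon",
    "Saved and quit",
]
print(query_llm(build_recap_prompt(events)))
```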
So while I’m generally apathetic towards the demo as it was shown, I am a big advocate of this type of work expanding, given I think it could lead not only to more useful tools and features for developers and players, but also to entirely new approaches to game design.
Big Brother is Watching
Now despite the exciting possibilities, the rather legitimate (and recurring) complaint about this unveiling was the invasion of the user’s privacy. Copilot immediately understands that the player is running Minecraft. Ideally the system would ask the user’s permission before recognising this, but it also leads to a bigger issue: in order to spot you’re playing Minecraft, it would have to look at everything else you’re up to.
It’s long been established that Microsoft Windows, like many popular high-end tech products, tracks the behaviour of its end-users - and speaking from experience, you can only disable so much of what Windows is doing under the hood. But I doubt the average user is aware of the level of oversight Microsoft has over their day-to-day activities. If anything, this only highlights that issue to the common user.
Compared to the early days of user-tracking in Windows 7, users are increasingly savvy to the realities of how their data is utilised by big tech to further its own agendas. For every relatively benign or even positive utilisation of this information, there are several annoying, or outright terrifying, applications. Funnily enough, this demo has arguably done more to highlight that issue than I think Microsoft expected.
Where are the Kinects?
Y’know, it’s funny how we’re now being advertised conversational chat bots that interact with our games, over a decade after Microsoft’s first real foray into this sector. The Xbox One was famous not just for the full integration of the Kinect peripheral into the device, but also for how it contributed to what would become a rather troubled launch for the system.
Now, funnily enough, this idea simply doesn’t work in the Xbox ecosystem, because they got rid of the microphones. The PlayStation controller has a microphone, but I doubt Microsoft will be able to ship a game with Copilot on Sony’s platform! I just find it funny that the gaming ecosystem outside of PCs is largely ill-equipped for these proposed innovations.
Wrapping Up
That’s it for this edition. Thanks once again for checking out the AI and Games newsletter, and I’ll be back with some more ramblings next week! In the meantime, a quick shout-out to our friends over at Villainous Games Studios - who recently appeared on Branching Factor - who have just launched their game Harvest Hunt on Steam. Go check it out!