Postcard from GDC 2005: Tutorial - Machine Learning

This tutorial provided a gentle but thorough overview of machine learning (ML): the techniques and costs involved, and how games can benefit from it. It cited The Sims and Black and White as examples of new game genres made possible by machine learning.

March 8, 2005


By Daniel Sánchez-Crespo Dalmau

Video game artificial intelligence has traditionally focused on simulating interesting behaviors, whether combat tactics, NPC interaction, stealth, or even storytelling elements. In recent years, adventures, role-playing games and strategy games have shown us how sophisticated and rich game AI systems can be. Still, most of these games exhibit only pre-scripted, "staged" behavior, where character learning is a minor component, if it is taken into consideration at all. Adventure game characters will never remember us the second time they see us, and characters in a fighting game will seldom adapt to our fighting style over a series of combats.

Still, as CPU power increases and players' expectations skyrocket, learning is slowly gaining ground in the game AI development community. Proof of this was evident at Monday's full-day tutorial on game learning techniques at the Game Developers Conference in San Francisco, where professors John Laird and Michael Van Lent from the University of Michigan surveyed the different machine learning techniques for a packed room.

The tutorial was divided into digestible portions which, as a whole, provided a gentle but thorough overview of what machine learning (ML) is all about, the techniques and costs involved, and how games can benefit from it. The speakers took turns, with frequent stops to allow questions from the audience, which made a deep, complex subject easier to follow and understand.

For the first portion of the talk, both speakers gave basic information on what ML is, when it should be used, and when it can or should be faked. ML adds a computational cost that only pays off in some scenarios, so it should be used only where it can positively affect the gameplay, not as a hyped piece of technology. Here's a summary that appeared several times during the talk, and that captures the pros and cons of learning in AI systems very well:

Positive side of ML:

•  More interesting, believable behavior due to learning
•  Personalized, re-playable experience
•  New types of games (Black and White and The Sims come to mind)

Negative side of ML:

•  More difficult to predict behavior, less control for designer
•  May take a long time to evolve
•  May get stuck / be unreliable

It was interesting and refreshing to see two major league AI researchers give an unbiased, neutral opinion where ML is not the solution to all problems in the world, but just a new component that should be evaluated and used when needed.

For the longer portion of the talk, both Laird and Van Lent focused on explaining classic machine learning techniques, with emphasis on their applicability for video games. Here's a brief overview of what was covered, and the key ideas:

The first surveyed technique was classic decision trees, expanded with rule induction as the learning method. In decision trees, knowledge is represented in a tree with each node being a test, and each child node being an outcome of the test. By descending from the root to the leaves we select the exact configuration of the system, and thus the associated behavior. For example, a tree may have a root that classifies characters as friendly or hostile. The second-level nodes may classify according to the type of weapon they are wielding, and the leaves may specify the behavior associated with each configuration (such as ENEMY with RANGED WEAPON implies FLEE behavior). Decision trees have been used in games for well over twenty years. What's interesting about them is that a number of algorithms (namely, ID3 and the more recent C4.5) have been devised to build the tree automatically by induction, given enough example cases.

A set of training examples might look something like this:

<ENEMY, RANGED WEAPON> = FLEE
<ENEMY, SWORD> = ATTACK

Running ID3 on such a set builds a tree, so the character automatically learns the decision criteria and selects behaviors according to them. ID3 works recursively on the set of attributes to be used as classifiers (alignment and type of weapon in the example), selecting at each step the attribute that best separates the examples by behavior, that is, the one with the highest information gain.
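To make the induction step concrete, here is a minimal Python sketch of ID3 on a tiny, made-up training set in the spirit of the example above. ID3's usual attribute-selection criterion is information gain (the drop in entropy after a split). The FRIEND rows, the GREET behavior and all function names are assumptions added for illustration; they do not come from the tutorial.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of behavior labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the examples on `attr`."""
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - remainder

def id3(rows, labels, attrs):
    """Build a nested-dict tree; leaves are behavior names."""
    if len(set(labels)) == 1:          # all examples agree: make a leaf
        return labels[0]
    if not attrs:                      # nothing left to test: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    tree = {best: {}}
    for value in sorted(set(row[best] for row in rows)):
        keep = [i for i, row in enumerate(rows) if row[best] == value]
        tree[best][value] = id3([rows[i] for i in keep],
                                [labels[i] for i in keep],
                                [a for a in attrs if a != best])
    return tree

def classify(tree, row):
    """Descend from the root to a leaf (raises KeyError on unseen values)."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][row[attr]]
    return tree

# Hypothetical training set; only the two ENEMY rows mirror the article.
EXAMPLES = [
    ({"alignment": "ENEMY", "weapon": "RANGED"}, "FLEE"),
    ({"alignment": "ENEMY", "weapon": "SWORD"}, "ATTACK"),
    ({"alignment": "FRIEND", "weapon": "RANGED"}, "GREET"),
    ({"alignment": "FRIEND", "weapon": "SWORD"}, "GREET"),
]
ROWS = [row for row, _ in EXAMPLES]
LABELS = [label for _, label in EXAMPLES]
TREE = id3(ROWS, LABELS, ["alignment", "weapon"])
```

On this data, alignment has the higher information gain (1 bit against 0.5 for weapon), so it ends up at the root and the weapon test only appears under ENEMY, which matches the hand-built tree described earlier.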

As a real world example, induction on decision trees was used in Black and White, as the creature learnt to predict the player's reaction. Every time the creature did something, it recorded the player's reaction (slap, reward, etc.), and used this action-player feedback tuple as the input to an induction mechanism to build the decision tree that conditions future action selection. This way the creature ended up doing the kind of actions the player rewarded, and avoided the rest.

The second surveyed technique was the often hyped, but never totally understood, neural network (NN). NNs are too complex to explain in full detail here; suffice it to say that they are similar to decision trees in that both try to build pattern-recognition constructs from a series of discrete examples. Still, where decision trees use a tree metaphor, NNs use a connectionist approach, where decisions are taken by a series of neuron layers. Each neuron is connected to several neurons on the next layer, so by adjusting the sensitivity (the weight) of each connection we can change the response of the network to a given input value. By repeatedly presenting a series of training cases, the NN configures itself (more accurately, it adjusts the weights of the connections between each pair of neurons) so that it responds properly to a given stimulus.

In a game-related example, we could encode a series of relevant values for an FPS bot, such as its health, the health of the player, their respective weapons, their relative distance, etc., and then get the network to generate different behavior identifiers (such as EVADE, SHOOT, HIDE, etc.). With time, we can "train" the network so that it responds appropriately and its behavior is not only efficient, but also adaptive to the player's actions. Games such as the Battlecruiser series and Heavy Gear used NNs for control purposes, although they have not yet been embraced by most developers, who remain concerned about their stability and their actual impact on gameplay.
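As a rough illustration of "training by adjusting connection weights", the sketch below uses the simplest possible network: a single layer with one output neuron per behavior and no hidden layers, trained with the classic perceptron rule. Real game networks would normally add hidden layers and train with backpropagation. The input encoding, the six training cases and all names are assumptions made up for this example.

```python
BEHAVIORS = ["SHOOT", "EVADE", "HIDE"]

# Hypothetical training cases: (bot health, player health, distance),
# each scaled to the range 0..1, paired with the behavior we want.
TRAINING = [
    ((0.9, 0.2, 0.5), "SHOOT"),
    ((0.8, 0.3, 0.5), "SHOOT"),
    ((0.2, 0.9, 0.1), "EVADE"),
    ((0.3, 0.8, 0.2), "EVADE"),
    ((0.2, 0.8, 0.9), "HIDE"),
    ((0.1, 0.7, 0.8), "HIDE"),
]

def activations(weights, inputs):
    """Weighted sum of the inputs for every output neuron."""
    x = list(inputs) + [1.0]                      # constant bias input
    return {b: sum(w * v for w, v in zip(weights[b], x)) for b in BEHAVIORS}

def respond(weights, inputs):
    """The behavior whose output neuron fires most strongly."""
    scores = activations(weights, inputs)
    return max(BEHAVIORS, key=lambda b: scores[b])

def train(cases, max_epochs=1000):
    """Perceptron rule: on a mistake, strengthen the connections to the
    wanted neuron and weaken those to the neuron that fired instead."""
    weights = {b: [0.0] * 4 for b in BEHAVIORS}   # 3 inputs + bias
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, wanted in cases:
            got = respond(weights, inputs)
            if got != wanted:
                mistakes += 1
                for i, v in enumerate(list(inputs) + [1.0]):
                    weights[wanted][i] += v
                    weights[got][i] -= v
        if mistakes == 0:                         # every case answered correctly
            break
    return weights

WEIGHTS = train(TRAINING)
```

Because these made-up cases are linearly separable, the perceptron rule is guaranteed to settle on weights that reproduce every training case; `respond(WEIGHTS, (0.9, 0.2, 0.5))` then returns SHOOT. The same loop could keep running during play, which is where the adaptive behavior mentioned above would come from.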

The discussion then moved to genetic algorithms (GAs), which are useful for generating populations (for example, the enemies in a game) and optimizing them for a purpose (killing the player in the least possible time is a typical example). A GA starts by creating a sample population from random combinations of a "genetic code" (speed, aggressiveness, etc.), and then measures each individual through some kind of fitness test. In our example, we could select the best 5 to 10% of that population. Then, following the genetic metaphor, we would "mate" these top individuals to generate a second wave of better enemies, and so on. Because each wave is bred from the best of the previous one, successive waves tend to approach an optimum asymptotically (though possibly only a local one), so GAs are a great way to train AIs before the game ships. The downside is that it can take a long time and hundreds (if not thousands) of generations for the system to converge on good solutions, so in-game uses of GAs are not feasible in most cases.
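The select-and-mate loop can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the two-gene genome, the stand-in fitness function (a real game would run the enemy against a player or a bot instead), the small random mutation, and the choice to carry the top individuals over unchanged.

```python
import random

def fitness(genome):
    """Hypothetical fitness test: closeness to an arbitrary 'ideal' enemy
    with speed 0.7 and aggressiveness 0.4. Higher is better."""
    speed, aggressiveness = genome
    return -((speed - 0.7) ** 2 + (aggressiveness - 0.4) ** 2)

def mate(parent_a, parent_b, rng, mutation=0.05):
    """Take each gene from one parent at random, then nudge it slightly."""
    child = []
    for gene_a, gene_b in zip(parent_a, parent_b):
        gene = rng.choice((gene_a, gene_b))           # crossover
        gene += rng.uniform(-mutation, mutation)      # mutation
        child.append(min(1.0, max(0.0, gene)))        # keep genes in 0..1
    return tuple(child)

def evolve(generations=50, size=40, elite_fraction=0.1, seed=0):
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(size)]
    history = []                                      # best fitness per wave
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Keep the top 10% and breed the rest of the next wave from them.
        elite = population[: max(2, int(size * elite_fraction))]
        children = [mate(rng.choice(elite), rng.choice(elite), rng)
                    for _ in range(size - len(elite))]
        population = elite + children
    return max(population, key=fitness), history

BEST, HISTORY = evolve()
```

Since the elite survive into the next wave, the best fitness recorded in `HISTORY` can never get worse from one generation to the next. Even in this toy, each generation needs a fitness evaluation for every individual, which hints at why the approach is better suited to pre-release training than to in-game use.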

Another technique covered in the tutorial was Bayesian learning (BL), which has been the subject of many scientific papers in the last few years. BL is based on Bayes' theorem, which relates the probability of one fact given another to the probability of the second given the first, and so lets us update our belief in a fact as evidence arrives. By examining examples and using very simple arithmetic, we can estimate whether two facts are statistically linked (so that if one happens we can predict the other following close by) or unrelated. By repeatedly applying Bayes' theorem, we can accumulate examples in a global structure: either a Bayes classifier (the arithmetic version) or a Bayes network, a graph-like structure that encodes the probabilities of certain fact sequences, like a monster hitting us and then running away. Bayes networks are probably going to be one of those must-have techniques in a few years, as they map a great number of real-world learning scenarios quite well. On the other hand, their memory footprint can be substantial. For those interested in them, the Game Programming Wisdom book is an excellent starting point.
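The "very simple arithmetic" is just counting. The Python sketch below applies Bayes' theorem, P(flee | hit) = P(hit | flee) × P(flee) / P(hit), to a made-up log of monster encounters; the log and the helper name are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical encounter log: (monster_hit_us, monster_fled_afterwards)
ENCOUNTERS = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
    (False, False), (False, False),
]

def probability(events, predicate):
    """Fraction of the events for which the predicate holds."""
    return Fraction(sum(1 for e in events if predicate(e)), len(events))

p_hit = probability(ENCOUNTERS, lambda e: e[0])          # P(hit)  = 2/5
p_flee = probability(ENCOUNTERS, lambda e: e[1])         # P(flee) = 2/5
fled = [e for e in ENCOUNTERS if e[1]]
p_hit_given_flee = probability(fled, lambda e: e[0])     # P(hit | flee) = 3/4

# Bayes' theorem: how likely is the monster to flee once it has hit us?
p_flee_given_hit = p_hit_given_flee * p_flee / p_hit     # 3/4

# If the two facts were unrelated, knowing about the hit would change nothing.
facts_look_linked = p_flee_given_hit != p_flee           # True
```

Here the monster flees in 40% of all encounters but in 75% of those where it has just hit us, so the two facts look linked and a character could reasonably anticipate the retreat. With only ten samples this is a toy; a real system would need far more data before trusting the difference.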

Finally, the latter part of the tutorial dealt with more ethology-inspired techniques, such as reinforcement learning and episodic memories. In reinforcement learning, an agent learns by associating actions with the rewards they produce, and then uses those expected rewards to select its future actions. Both passive (from another agent's actions) and active (from our own experience) reinforcement learning methods were explored, and the popular Q-learning algorithm was used as an example.
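For readers who have not seen Q-learning, here is a minimal tabular sketch in Python. The five-cell corridor world, the reward of 1 in the last cell, and the learning parameters are all assumptions chosen to keep the example tiny; nothing here is taken from the speakers' materials.

```python
import random

N_STATES, GOAL = 5, 4        # corridor cells 0..4; the reward waits in cell 4
LEFT, RIGHT = 0, 1

def step(state, action):
    """Move one cell (walls clamp the move); reward 1.0 on reaching the goal."""
    nxt = min(GOAL, max(0, state + (1 if action == RIGHT else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0)

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]     # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Q-learning is off-policy, so we may explore purely at random
            # and still learn the values of the best actions.
            action = rng.choice((LEFT, RIGHT))
            nxt, reward = step(state, action)
            future = 0.0 if nxt == GOAL else max(q[nxt])
            # Move the estimate toward: reward now + discounted best future.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = nxt
    return q

Q = q_learning()
# Greedy policy: in each non-goal cell, pick the action with the higher value.
POLICY = ["RIGHT" if row[RIGHT] > row[LEFT] else "LEFT" for row in Q[:GOAL]]
```

After training, the learned values decay by the discount factor with each step away from the goal, and the greedy policy heads right in every cell. The reward is the only feedback the agent ever receives, which is the "associating actions and rewards" idea in its barest form.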

Episodic memories, on the other hand, try to build data structures that store memories, such as encounters with other characters, goals completed, etc., so that at a later stage we can query that memory, for example to drive a meaningful conversation with the player, or at least to remember his face. Several data structures were suggested, with emphasis on a SOAR-based implementation that the speakers are working on at the University of Michigan.
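At its very simplest, such a memory is a timestamped log that can be queried later. The Python sketch below is only a toy under that assumption: the class and field names are invented for illustration, and it bears no relation to the SOAR-based implementation mentioned above.

```python
from dataclasses import dataclass, field

@dataclass
class Episode:
    time: int        # game time at which the event happened
    who: str         # the other character involved
    event: str       # what happened, e.g. "MET" or "COMPLETED_QUEST"

@dataclass
class EpisodicMemory:
    episodes: list = field(default_factory=list)

    def record(self, time, who, event):
        """Store a new memory at the end of the log."""
        self.episodes.append(Episode(time, who, event))

    def recall(self, who):
        """All remembered episodes involving `who`, oldest first."""
        return [e for e in self.episodes if e.who == who]

    def has_met(self, who):
        """True if this character appears anywhere in memory."""
        return bool(self.recall(who))

# Hypothetical usage: an NPC that remembers its dealings with the player.
memory = EpisodicMemory()
memory.record(10, "player", "MET")
memory.record(42, "player", "COMPLETED_QUEST")
memory.record(50, "merchant", "MET")
```

A dialogue system could call `memory.has_met("player")` to choose between a first-time greeting and a "welcome back" line, or walk through `memory.recall("player")` to bring up past events in conversation.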

In summary, the tutorial provided a gentle yet thorough introduction to the subject. The materials given will probably be the spark for many game studios to devote more time to researching actual applications of machine learning, as its potential to create new games or improve current genres is very high.

As a positive side note, it is interesting to see how the games industry is becoming more and more aware of the research being carried out in the academic world and how, in the field of AI, these two sectors are coming together to define how future games will behave. Only by blending the best from both sides of the road will current systems be surpassed, so that next-generation gamers can enjoy more complex, interactive behaviors in future video games.
