Sponsored By
Evan Todd, Blogger

September 29, 2015

10 Min Read

Developers hate him!

We'll cover some standard tips and tricks here, but we're not really interested in those. We're looking for the One Weird Trick to rule them all. Hopefully each trick we encounter brings us closer to coding Mecca.

In the beginning

The first video game I ever wrote was called Ninja Wars.

Yes, that is an HTML table of images. I changed the src attribute to move stuff around.

The top of the Javascript file looked like this:


var x = 314;
var y = 8;
var prevy= 1;
var prevx= 1;
var prevsw= 0;
var row= 304;
var endrow= 142;
var sword= 296;
var yrow = 0;
var yendrow = 186;
var I = 0;
var e = 0;
var counter = 0;
var found = 0;
var esword = 26;
var eprevsw = 8;
var bluehealth = 40;
var redhealth = 40;
var n = 0;
var you = 'ninja';
var bullet = 'sword';
var enemy = 'enemy';
var ebullet = 'enemysword';
var besieged = 0;
var siegecount = 0;
var esiegecount = 0;
var ebesieged = 0;
var healthcount = 0;
var player = 0;
var starcount = 0;
var infortress = false;
var prevyou = you;
var einfortress = false;
var prevenemy = enemy;
var previmg = "";
var prevbullet= "";
var eprevbullet= "";
var randnum = 0;
var randnum2 = 0;
var randnum3 = 0;
var randnum4 = 0;
var buildcount = 0;
var characters = new Array(4);
characters = ['ninja','tank','heli','builder'];
var bullets = new Array(3);
bullets = ['sword','bullet','5cal','sword'];
var echaracters = new Array(3);
echaracters = ['enemy','tank2','eheli','ebuilder'];
var ebullets = new Array(3);
ebullets = ['enemysword','bullet2','e5cal','enemysword'];
var health = new Array(4);
health = [40,30,20,10];
var prevorb = 0;
var prevnum = 0;

Hopefully this looks somewhat familiar, and I'm not the only one to start out writing code like this. Regardless, this debacle demonstrates our first trick:

Trick #1: globals are evil

We don't even know why they're evil, we just intuitively know.

edit: I should clarify that I'm no longer against globals. But that's getting ahead of ourselves.

Pretty soon we learn about objects. We can group our variables:


class Ninja
{
    int x, y;
    int previousX, previousY;
    int health = 100;
}

class Sword
{
    int x, y;
    int previousX, previousY;
    int sharpness = 9000;
}

We can even use inheritance to avoid copy-pasting:


class Movable
{
    int x, y;
    int previousX, previousY;
}

class Ninja : public Movable
{
    int health = 100;
}

class Sword : public Movable
{
    int sharpness = 9000;
}

Inheritance is nice! Nice enough to serve as our next trick:

Trick #2: object-oriented programming

Object-oriented is so great that it forms the core of many classic games, including Doom 3. Which, coincidentally, is open source.

Doom 3 loves inheritance. Don't believe me? Here's a small subset of its class hierarchy:


idClass
    idEntity
        idAnimatedEntity
            idWeapon
            idAFEntity_Base
                idAFEntity_ClawFourFingers
                idAFEntity_Vehicle
                    idAFEntity_VehicleFourWheels
                    idAFEntity_VehicleSixWheels
                idAFEntity_Gibbable
                    idAFEntity_WithAttachedHead
                    idActor
                        idPlayer
                        idAI

Imagine you're an employee at id Software. This inheritance hierarchy works great for a few months. Then, one fateful Monday, disaster strikes. The boss comes in and says "Hey, change of plans. The player is now a car."

Look at idPlayer and idAFEntity_VehicleFourWheels in the hierarchy. Big Problem #1: we need to move a lot of code.

Big Problem #2: the boss comes to his senses and calls off the "player is a car" idea. Instead, we're adding turrets to everything. The car is becoming a Halo Warthog, and the player is getting a giant turret mounted on his back.

As lazy programmers, we decide to use inheritance again to avoid copy-pasting. But look at the hierarchy. Where can we put the turret code? The only ancestor shared by idPlayer and idAFEntity_VehicleFourWheels is idAFEntity_Base.

We'll probably put the code in idAFEntity_Base and add a boolean flag calledturret_is_active. We'll only set it true for the car and the player. This works, but the terrible, logical conclusion is that our base classes end up loaded with tons of cruft. Here's the source code for idEntity.

Go ahead, scroll through it. You don't have to read it all.


https://github.com/id-Software/DOOM-3-BFG/blob/9c37079c16015fc58de29d3de366e0d93dc11f8a/neo/d3xp/Entity.h#L163

The point is, that's a lot of code. Notice how every single entity — down to the last piece of physics debris — has a concept of a team, and of getting killed. Clearly not ideal.

If you're a Unity developer, you already know the solution: components! Here's what they look like:

Rather than inheriting functionality, Unity entities are just bags of components. This solves our earlier turret problem easily: just add a turret component to the player and car entities.

Here's what Doom 3 might look like if it used components:


idPlayer
    idTransform
    idHealth
    idAnimatedModel
    idAnimator
    idRigidBody
    idBipedalCharacterController
    idPlayerController
idAFEntity_VehicleFourWheels
    idTransform
    idAnimatedModel
    idRigidBody
    idFourWheelController
...

What have we learned?

Trick #3: in general, favor composition over inheritance

Take a moment to review the tricks we've covered so far: global variables bad, objects good, components better.

You won't believe what happens next!

Let's take a small detour into the world of low-level performance with a very simple question: which function is faster?


double a(double x)
{
    return Math.sqrt(x);
}

static double[] data;
double b(int x)
{
    return data[x];
}

We'll hand-wave a lot of complexity away and just assume that these two functions eventually compile down to one x86 instruction each. Function a will probably compile to sqrtps, and function b might compile to something like lea("load effective address").

sqrtps takes about 14 CPU cycles on a modern Intel processor, according toIntel's manual. What about lea?

The answer is "it's complicated". It depends on where we load data from.

Registers

L1

32KB per core

64B line

4 cycles

L2

256KB per core

64B line

11 cycles

L3

6MB

64B line

40-75 cycles

Main memory

8GB

4KB page

100-300 cycles

That last number is important. 100-300 cycles to hit main memory! This means in any given situation, our biggest bottleneck is probably memory access. And from the looks of it, we can improve this by using L1, L2, and L3 cache more often. How do we do that?

Let's return to Doom 3 for a real-life example. Here's the Doom 3 update loop:


for ( idEntity* ent = activeEntities.Next();
    ent != NULL;
    ent = ent->activeNode.Next() )
{
    if ( g_cinematic.GetBool() && inCinematic && !ent->cinematic )
    {
        ent->GetPhysics()->UpdateTime( time );
        continue;
    }
    timer_singlethink.Clear();
    timer_singlethink.Start();
    RunEntityThink( *ent, cmdMgr );
    timer_singlethink.Stop();
    ms = timer_singlethink.Milliseconds();
    if ( ms >= g_timeentities.GetFloat() )
        Printf( "%d: entity '%s': %.1f ms\n", time, ent->name.c_str(), ms );
    num++;
}

From an object-oriented perspective, this code is pretty clean and generic. I'm assuming RunEntityThink calls some virtual Think() method where we could do just about anything. Very extensible.

Uh oh, here comes the boss again. He has some questions.

  • What's executing? Er... sorry boss, it depends on what entities are active at the time. We don't really know.

  • In what order is it executing? No idea. We add and remove entities to the list at random times during gameplay.

  • How can we parallelize this? Gee boss, that's a tough one. Since the entities execute randomly, they may be accessing each other's state. If we split them up between threads, who knows what might happen.

In short:

But wait, there's more! If we look closely, we see the entities are stored in a linked list. Here's how that might look in memory:

This makes our L1, L2, and L3 cache very sad. When we access the first item in the list, the cache thinks, "hey, I bet the next thing they'll want is nearby", and it pulls in the next 64 bytes after our initial memory access. But then we immediately jump to a completely different location in memory, and the cache has to wipe out those 64 bytes and pull in new data from RAM.

Trick #4: line up data in memory for huge performance gains

Like this:


for (int i = 0; i < rigid_bodies.length; i++)
    rigid_bodies[i].update();

for (int i = 0; i < ai_controllers.length; i++)
    ai_controllers[i].update();

for (int i = 0; i < animated_models.length; i++)
    animated_models[i].update();

// ...

An object-oriented programmer might be infuriated at this. It's not generic enough! But look, now we're iterating over a set of contiguous arrays (not arrays of pointers, mind you). Here's what it looks like in memory:

Everything is all lined up in order. The cache is happy. As a side bonus, this version allows us to answer all those pesky questions the boss had. We know what's executing and in what order, and it looks much easier to parallelize.

edit: Don't get too hung up on cache optimization. There's a lot more to it than just throwing everything in an array. I only bring it up to prove a point, and I'm only qualified to give the basic introduction. Check out the links at the bottom for more info.

Time to wrap this up and get to the point. What's the One Weird Trick? What do tricks 1-4 (and many more) all have in common?

The One Weird Trick: data first, not code first

Why were globals so evil (trick #1)? Because they allowed us to get away with lazy data design.

Why did objects help us (trick #2)? Because they helped us organize our data better.

Why did components help us even more (trick #3)? Because they modeled our data better by matching the structure of reality better.

Even the CPU likes it when we organize our data correctly (trick #4).

No really, what is the trick actually

Let's break it down in practical terms. Here's a representation of a typical Unity-like component-based game design.

Each component is an object. It has some state variables listed at the top, and then some methods to do stuff with those variables.

This is a well-designed object-oriented system, so the variables are private. The only code that can access them is the object's own methods. This is called "encapsulation".

Each object has a certain amount of complexity. But fear not! OOP promises that as long as we keep the state private, that complexity will stay encapsulated within the object, and won't spread to the other objects.

Unfortunately, this is a lie.

Sometimes, we need a function to access two or three objects. We end up either splitting the function between those objects, or writing a bunch of getters and setters so our function can access what it needs. Neither solution is very satisfying.

Here is the truth. Some things cannot be represented as objects very well. I propose an alternate paradigm, one which represents every program perfectly:

Once we separate process from data, things start to make more sense.

Object-oriented helps us write good code because it encourages us to encapsulate complexity (i.e. state). But it forces us to do so in a certain way. Why not instead encapsulate like this, if it makes sense within our problem space?

In summary, design data structures to match your specific problem. Don't shoehorn a single concept into a bunch of separate, encapsulated objects.

Next, write functions that leave the smallest possible footprint on that data. If possible, write pure stateless functions.

That's the trick.

Conclusion

If this struck a chord with you, it's because I stole most of it. Get it straight from the source:

Thanks for reading! Please let me know your thoughts in the comments.

Mirrored on my blog

Read more about:

Featured Blogs

About the Author

Daily news, dev blogs, and stories from Game Developer straight to your inbox

You May Also Like