Trending
Opinion: How will Project 2025 impact game developers?
The Heritage Foundation's manifesto for the possible next administration could do great harm to many, including large portions of the game development community.
Featured Blog | This community-written post highlights the best of what the game industry has to offer. Read more like it on the Game Developer Blogs or learn how to Submit Your Own Blog Post
The usual approach to memory management when developing an STL style Deque is often not very suitable for the demands of game development. This article goes over a different approach used in an internal STL implementation, it's benefits and drawbacks.
In the past I've been responsible for the development of an internal STL implementation, one specifically centred around game development and one focused on getting the best out of the sometimes limited resources game developers have to play with. I've discussed the reasons behind why we went with a unique implementation in the past on my blog, though I always wanted to cover how we implemented the STL deque, which I think is a sometimes misunderstood and unfortunately not a very often used container.
But first, what is a deque.
A good place to start is to say it’s pronounced ‘Deck’ and is short hand for ‘double ended queue’, giving you an idea of what it’s main strengths might be. Like a vector it offers constant time operations when adding or removing from the back, but unlike the vector it also offers the same when adding or removing from the front, making it a perfect container for queue style containers.
And it’s this later ability that makes it vastly different from a vector than the interface might otherwise suggest.
One major missing set of functions is something that you should always be using by default on a vector – reserve() and capacity(). And those missing functions should indicate an underlying difference in the way memory is allocated.
An STL implemented deque usually allocates memory in pools which are able to contain N number of entries in the container. Exceed the number that can fit in a pool and another pool is created, linked to the previous one. This gives it the ability to add values to the front and back of the container, in effect creating a chunky link list structure internally.
And it’s a good idea. By enabling a chunky (or sometimes not so chunky) list structure you have less of a need to preallocate memory, inserting values leans more towards a fixed, or more predictable, model and you lose the spikes that often accompany a vector should you keep extending past the current capacity.
But it’s not perfect. And it’s primary strength can also be it’s primary weakness.
Game developers like static memory. They like to allocate memory and stick with it. They don’t like memory that is allocating left, right and centre and they don’t like to have to guess at whats been allocated and when. Consoles like static memory too, and they like memory to be in the general vicinity of memory they are already looking at.
Coming from this, game developers and consoles generally don’t like link lists (though that’s not to say we don’t use them when we need to). And if we have to use them, it’s good to explicitly use them, or even better to use an intrusive style list instead.
But surely you can be a bit clever. Maybe setting the size of a pool to be how ever many elements you want to add, in other words trying to mimic reserve(). But then you still need to allocate more memory when you add to the front, or the back, or where-ever depending where your particular implementation of a deque drops in the first elements (and since there is nothing guiding the implementation in the standard it could be anywhere).
And some implementations (Visual Studio I am looking squarely at you) have horrific ways of specifying how big these pools should be and simply do not provide the flexibility you need when working with them.
So what did we want and how did we approach implementing our own Deque container?
The first thing we wanted was our reserve and capacity back (along with our custom resize_capacity and other related functions) because that gives programmers vital control over what’s being allocated and when. Granted you could probably get similar behaviour by using a different allocator, but we don’t want to have to use a unique allocator 99% of the time!
As a result (and it’s a good one) that leads us back to having a block of continuous memory allocated within the deque which can make traversing the container much more cache friendly and makes memory management much easier. It also removes the question of “if I remove this element will the memory layout change?”. We’re sticking with vector style memory, which dictates that memory will not be freed unless specifically requested, another feature that ties in with developers needing to be confident of what their containers are doing.
This also allows us to lead quite easily towards a fixed::deque container with the same interface and very similar behaviour which is much harder with the chunky linked list approach.
But obviously a vector has this memory layout, and doesn’t have constant time insertion at the start of the deque. So something needs to be different?
A Ring Buffer is a pretty common data structure in computer science and fits the bill perfectly. Add an element to the back or front, and simply wrap around the available heap until your buffer start/end hit each other. At that point either stop accepting new entries (as a fixed deque would) or allocate additional memory and keep adding (in our case we increase the allocated size by 50% to keep it consistent with the vector).
This allows us to have constant time insertion at the start/end but it does make our iterators a bit more complicated as they need to be aware of their heap and buffer boundaries, but that’s only a few pointers so it’s not to bad (and traversing linked lists using iterators isn’t much different so it’s not much of a trade off).
Obviously going down this route has it’s downsides. Inserting elements one after another without reserving is much slower but developers shouldn’t be doing that anyway and most importantly traversal of the deque is much improved. Benchmarks on the platform we develop for at work show an increase in traversal speed in the order of 100% on some platforms, with the worst being around 30% faster which is nothing to sniff at.
But a fundamental change like this does means that people coming to this Deque when they are intimately familiar with a std::deque will either miss the sometimes small tweaks and use it inefficiently or be thrown by what they see and be less inclined to use it. But decent documentation and mentoring can easily overcome those hurdles.
From the outset memory management and traversal were the areas we were most concerned about when looking at the ftl::deque, and while insertion is certainly slower if you don’t reserve (though no-one should be using an FTL Deque without reserving memory before hand) this is a price I’m quite happy to pay.
This post was originally published on Engineering Game Development on the 30th March 2010.
Read more about:
Featured BlogsYou May Also Like