Sponsored By

Designing and implementing a pool allocator data structure for memory management in games

A Implementation of a Pool allocator data structure in C++ for memory management.

Mateus Gondim Lima, Blogger

July 20, 2018

11 Min Read
Game Developer logo in a gray background | Game Developer

 

This post was originally posted on my blog

In game programming, we are constantly creating objects at runtime, such as meshes, sprites, vectors, textures and so on. Since we usually can not know how many objects we are going to need at runtime, the necessary memory for those objects have to be allocated dynamically.

Dynamic memory allocation, also known as heap allocation, it’s typically done in C++ via malloc() and free(), or with C++’s global new and delete operators. The problem with using the operating system’s heap allocator to allocate and deallocate memory for the objects in our games, is that of performance. The operating system’s heap allocator is very slow, mainly for two reasons. First, a heap allocator is a general purpose facility, so it needs to accept allocations of any size. This requires a lot of management overhead, making malloc() and free() inherently slow. Second, most operating systems do a context-switch from user mode to kernel mode to execute a call to malloc() or free(). These context switches can be extraordinarily expensive. So, one thing is certain, doing a lot of heap allocations at runtime, especially within a loop, will definitely degrade the performance of the game.

This situation leave us with a problem, we need to dynamically allocate memory for all those objects and assets that are created at runtime, but the default heap allocators are extremely slow. What most game engines do, is to implement one or more custom allocators. A custom allocator can have better performance characteristics than the operating system’s heap allocator for two reasons. First, a custom allocator can preallocate a memory block to answer the requests, allowing it to run in user mode and completely avoid the cost to context-switch to kernel mode. The memory block itself can be allocated using malloc() or new, or declared as a global variable. Second, by designing the allocator to meet specific usage patterns, it can avoid the management overhead of a general purpose allocator, making it much more efficient.

Pool Allocator

One very common memory usage pattern in game programming, is the need to allocate lots of small blocks of memory of the same size. This happens, for example, when we need to create multiple objects of the same type, such as matrices, meshes, AABBs, particles and so on. For this type of allocation pattern, a pool allocator is often the best choice.

The way pool allocators work, is by preallocating a large memory block whose size is an exact multiple of the size of each element that will be allocated. For example, suppose we want a pool allocator for creating 3D vectors, its size would be an exact multiple of 12 bytes — that’s 3 elements per vector times four (for 32-bit floats) or eight bytes (for 64-bit floats) per element. Each element of the pool is added to a linked list of free elements; on initialization, this free list contains all the elements of the pool. When a allocation request is made, we simply take the first element of the free list, and return it. Likewise, when a element is freed, we simply store it back on the beginning of the list. Because both allocations and frees require only a few pointer manipulations, those operations have a complexity of O(1).

The linked list of free elements, can be a singly-linked list, that is, we only need to store a single pointer for each free element of the list. One way of doing this, would be to preallocate another memory block for those pointers, a block of size (sizeof(void*) * num_pool_elements) bytes. But a better solution is to realize that those elements on the free list are, by definition, free memory elements, so instead of preallocating another memory block, we can store the ‘next’ pointers of the free list on the pool’s elements that are not being used. This approach, which does not waste any memory, works as long as the pool_element_size >= sizeof(void*).

As example, let’s see how a pool allocator of 5 elements, each with a size of 8 bytes, would manage its memory block and its free list.

                                         Figure 1. Pool allocator on initialization

As we can see from figure 1, on initialization we have all the elements of the pool inside the free list, see how we store each ‘next’ pointer inside the free memory blocks(elements). Also notice that we need to keep a pointer to the first element in the free list.

asdf
                                     Figure 2. Allocator after an allocation request

                             Figure 3. Allocator after two more allocation requests

In figure 2 we can see how the allocator satisfies a memory request. See how we simply return the first element in the free list, and update the head of the list to point to the new first element — the element that was pointed by the previous first element of the list. Figure 3 shows the pool after two more allocations.

                             Figure 4. Allocator after freeing the second element

In figure 4 we can see the pool after freeing the second memory block. As we can see, we move the freed memory block to the start of the list, for this, we need to do two updates. First, we set the head of the list to point to newly freed element. Second, we set the newly freed element’s next pointer, to point to the previous first element of the list.

                              Figure 5. Allocator after freeing the third element

Implementation

 The pool allocator data structure was implemented in C++ on the files Pool_allocator.hpp and Poll_allocator.cpp. In the next sections we’ll go into some implementation details on the initialization, allocation and freeing operations.


namespace mem {
	class Pool_allocator final {
		friend std::ostream & operator<<(std::ostream & os, Pool_allocator & pool);
	public:
		Pool_allocator();
		Pool_allocator(const std::size_t element_sz_bytes, const std::size_t num_elements, const std::size_t aligmnent);

		Pool_allocator(const Pool_allocator &alloc) = delete; //copy constructor
		Pool_allocator & operator=(const Pool_allocator &rhs) = delete; // copy-assignment operator
		Pool_allocator(Pool_allocator &&alloc) = delete;//move constructor
		Pool_allocator & operator=(Pool_allocator && rhs) = delete; // move-assignment operator

		~Pool_allocator();

		void alloc_pool(const std::size_t element_sz_bytes, const std::size_t num_elements, const std::size_t aligmnent);

		void *get_element();
		void  free_element(void *pelement);
		std::size_t	get_element_size() const;
	private:
		void  realease_pool_mem();

		std::size_t  m_pool_sz_bytes;    //the size in bytes of the pool
		std::size_t  m_element_sz_bytes; // size in bytes of each element
		std::size_t  m_alignment;         // the element/pool alignment requirement in bytes
		void*        m_pmemory;           // pointer to the first address of the pool, used to relase all the memory
		void**       m_ppfree_memory_list; // pointer to pointer, used to point to the head of the free list
	};

	std::ostream & operator<<(std::ostream & os, Pool_allocator & pool);
}

                                         Listing 1. Class declaration

In listing 01 we have the class declaration. We keep a double pointer to void for storing the address of the first element in the list (m_ppfree_memory_list) . We also store a pointer to the first memory address of the poll’s memory block(m_pmemory), this is used to deallocate the entire pool. The initialization is done in the function alloc_pool, we can request a pool’s element by calling get_element, and for freeing a pool’s element we use the free_element function.

The initialization function have as parameters the size in bytes of each element, the requested number of elements in the pool, and the alignment requirement. The first “part” of the function, basically checks the parameters for some error conditions, for example, it guarantees that the size of each element is at least as big as the size of (void*), verifies that the element’s size is a multiple of the alignment requirement and so on. The key part of the function is the creation of the pool’s free list, shown in listing 2.


//cast the void* to void**, so the memory is 'interpret' as a buffer of void*
m_ppfree_memory_list = static_cast<void**>(m_pmemory);
// get the address of the end of the memory block
std::uintptr_t end_address = reinterpret_cast<std::uintptr_t>(m_ppfree_memory_list) + (m_element_sz_bytes * num_elements);

for (std::size_t element_cnt = 0; element_cnt < num_elements; ++element_cnt) {
	//get the addresses of the current chunck and the next one
	std::uintptr_t curr_address = reinterpret_cast<std::uintptr_t>(m_ppfree_memory_list) + element_cnt * m_element_sz_bytes;
	std::uintptr_t next_address = curr_address + m_element_sz_bytes;
	//cast the address of the current chunk to a void**
	void **curr_memory = reinterpret_cast<void**>(curr_address);
	if (next_address >= end_address) { // last chunck
		*curr_memory = nullptr;
	}
	else {
		*curr_memory = reinterpret_cast<void*>(next_address);
	}
}

                                     Listing 2. Constructing the free list

To create the free list, we first initialize the head of the list, setting its value to the starting address of the pool’s memory block. Next, using the element’s size in bytes and the number of elements in the pool, we calculate the address of the last element. Finally, knowing the number of elements and the last element’s address, we iterate through each element in the pool, writing the address of the next element in its memory block, with the exception of the last element, whose next pointer value is set to null.


void* Pool_allocator::get_element()
{
    if (m_pmemory == nullptr) {
    	std::cerr << "ERROR " << __FUNCTION__ << ": No memory was allocated to this pool" << std::endl;
        return nullptr;
    }
    if (m_ppfree_memory_list != nullptr) {
  	void* pblock = reinterpret_cast<void*>(m_ppfree_memory_list);
	m_ppfree_memory_list = static_cast<void**>(*m_ppfree_memory_list);

	return pblock;
    }
    else { // out of memory blocks
  	std::cerr << "ERROR " << __FUNCTION__ << ": out of memory blocks" << std::endl;
	return nullptr;
    }
}

                                     Listing 3. The get_element function

The get_element function, shown in listing 3, is used to request a pool’s element. This function contains some code to check for an uninitialized pool, and for an out of memory pool, in both cases it returns a null pointer. The actual code to get a pool’s element is only two lines. First, it stores in a local variable, the address of the first element in the list(the return value), and second, sets the head of the list to point to the next element in the list, i.e, the element pointed by the previous first element.

 


void Pool_allocator::free_element(void* pblock)
{
	if (pblock == nullptr) {
		return;
	}
	
	if (m_pmemory == nullptr) {
		std::cerr << "ERROR " << __FUNCTION__ << ": No memory was allocated to this pool" << std::endl;
		return;
	}

	if (m_ppfree_memory_list == nullptr) { // the free list is empty
		m_ppfree_memory_list = reinterpret_cast<void**>(pblock);
		*m_ppfree_memory_list = nullptr;
	}
	else {
		void** ppreturned_block = m_ppfree_memory_list;
		m_ppfree_memory_list = reinterpret_cast<void**>(pblock);
		*m_ppfree_memory_list = reinterpret_cast<void*>(ppreturned_block);
	}
}

                                     Listing 4. The free_element function

The free_element function, shown in listing 4, implements the deallocation operation. As we can see, the process of moving the element back to the beginning of the list, is implemented in only a few lines broken into two cases. Case one, if the list was previously empty, i.e, the pool was out of memory, we simply set the head of the list to point to this newly added element, and set the new element’s next pointer to null. Case two, if the list was not empty, we set the head of the list to point to the newly added element, and set the new element’s next pointer to the address of the previous first element of the list, i.e, the element that was pointed by the head of the list.

As we can see, the pool allocator data structure is a great tool to have in our toolbox as game developers, it’s fast, the allocation and freeing operations are both O(1) functions, easy to implement, and it bypasses a very common problem in memory management, that of memory fragmentation. In a nutshell, memory fragmentation happens when, after lots of allocations and deallocations of different sizes, the memory becomes fragmented, i.e, free and used blocks of different sizes, interspersed throughout the memory. The problem with a fragmented memory, is that sometimes the allocator is unable to satisfy a request, because there is no free contiguous block of the requested size, only in fragmented smaller chunks, so we basically run out of memory before being actually out of memory. Because the pool allocator only allocates and frees elements of the same size, the allocator can never fail due to a lack of a large enough contiguous free block.

For more information on memory management i recommend [1] .

References

[1] Game Engine Architecture, Second Edition

Read more about:

Blogs
Daily news, dev blogs, and stories from Game Developer straight to your inbox

You May Also Like