Writing an iPhone Game Engine (Part 1- Memory management)

On the iPhone platform, memory is a very precious resources. If they are not handled properly, the application will receive one or two memory warning, and then your application will be killed by the OS. So I decided to write my own memory allocator to preallocate a large chunk of memory so that my game will not be killed by the OS when running, it will either start or not start. This is my first time to write a memory allocator, and it is not as sophisticated as "Ready, Set, Allocate!", but it just works fine enough for me.

In my little engine, a pool allocator is written for memory allocation, with different pre-defined pool size ranging from 8, 16, 32, 64bytes to 1048546bytes. As my target platform is iPhone, the maximum pool size is 1048546bytes which is used only for a few high resolution textures, most of the memory is spent on the the smaller pool size. During the program starts, a large chunk of memory is created and it is divided into different smaller chunks for different pool size as follows:
Notice that the large byte chuck is located in the smaller memory address for proper byte alignment. And within each chunk, it is divided into equally sized block for each particular size allocation:
The memory blocks within each chunk are maintained as a linked list so that when the pool memory allocator need to allocate/deallocate memory, it just need to return a free memory block from the list/add it back to the linked list. The memory within each memory blocks is used to store the 'next pointer' for the next free memory block in the linked list so that we do not need to allocate extra memory to keep track of the linked list (this approach is learnt from Game Engine Architecture):
For each allocation, the allocator need to decided which pool chunk need to be used depends on the size of the allocation. Then within that chunk, a free memory block is returned which is just the head of the linked list within that chunk. For each deallocation, as we divide the memory into different chunk, we know the boundary address of each pool chunk, so by checking the deallocated pointer address, we can determine which pool chunk it belongs to, then we can just add the memory back to that chunk's free memory block linked list. One drawback of this approach is it does not verify whether the deallocated pointer is actually allocated by the user nor it is double freed. To 'partly overcome' this problem, I added limited check to verify the input to each deallocation. First, I would check whether the input is byte aligned, for example if the deallocated pointer is within the 1048546 bytes chunk, then the pointer address must be 1048546 byte aligned. Second, as we partition the memory chunk, we know how many memory blocks is within each memory chunk, we can maintain a current free blocks number which will increase and decrease for each allocation and deallocation. When the program exits and this free block number does not match with the total number of blocks, then memory may either be leaked or double freed. But this only solve the problem partially.

To actually solve the problem and also check for memory leaks. I need to log every allocation and deallocation. Originally, for each allocation, I just store the returned pointer address and location of each allocation(which source file and line number) using the macro __FILE__, __LINE__. But this does not do well enough to track down all memory leak as some files are templated such as btAlignedAllocator.h in the bullet physics library(yes, I use bullet physics in my engine). Using the macro __FILE__, __LINE__ will only log down the allocation in these header file which does not help much for tracking memory leak. Therefore, I also log the callstack of each allocation using the system call backtrace() and backtrace_symbols()(which is available in Unix-like platform). Then I can track down all memory leak easily. However, logging every allocation is a very slow process and it is only enabled in debug build/ enabled when necessary.

In conclusion, my memory allocator still have different things to improve such as verifying the user deallocation; adding some meta-data within each memory block such as the allocation size; And for the thread safety, currently I only use a mutex to protected the memory, I may switch to a lock-free version in the future. Despite these short comings, this allocator works well enough for me as it avoid receiving memory warning from the iOS, avoiding memory fragmentation and help me track down memory leaks.

Reference:
[1] http://g.oswego.edu/dl/html/malloc.html
[2] Ready, Set, Allocate!: http://altdevblogaday.com/2011/04/11/ready-set-allocate-part-1/
[3] Game Engine Architecture: http://www.gameenginebook.com/
[4] http://gcc.gnu.org/onlinedocs/cpp/Standard-Predefined-Macros.html
[5] http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man3/backtrace_symbols.3.html

Writing an iPhone Game Engine (Part 0- Introduction)

This time I would like to talk about my hobby project which is writing an iPhone Game Engine. This project started from August last year and works with 2 artists. Although the game is still not finished, I want to share what I have learnt so far. Let's show some screen shots first:

Fighting against Ships

Exploring the game world
In the game, the player will control a ship to explore the world, discovering new cities and fighting with other ships. The player can also change the ships when the game progress.
I will have several blog posts to talk about the techniques I used in my little engine. The up coming topics will includes:
    - Memory management
    - Tools (Maya plugin and level editor)
    - Scripting (Lua)
    - Streaming system
    - Audio (OpenAL for effect sounds and Apple audio queue for BGM)
    - Performance tuning
After finishing the above topics I will talk about what I have done right and what have done wrong in this project.
In my next post, I will start to talk about the Memory management in my game engine. To end up this introductory post, I would like to show you more screen shots of the game and tools I have developed so far:
Profiling the game
Editor to compose a game entity

Level editor

Mac version of the game for artists to preview the game

SSAO using Line Integrals


Hi everyone, this is my first post in #AltDevBlogADay. Let me introduce myself first, I am Simon Yeung, currently working as a game programmer. I like graphics programming and sometimes write iPhone apps.
This time, I would like to talk about the SSAO implemented in my little demo program. I write this demo because I spent most of my time using openGL and know little about DirectX, so I decided to learn DX by writing this demo. So it is not well optimized.
SSAO, short for Screen Space Ambient Occlusion, is a technique for approximating the indirect shadow casted by surrounding scene geometry which is done in screen space by sampling from the depth buffer.
The SSAO is implemented using the line integrals from "Rendering techniques in Toy Story 3"[1]. Here is my results:
With SSAO

without SSAO

SSAO texture

Their method calculates the volume occluded by other objects inside a sphere at each fragment by sampling from the depth buffer.
From Slide 22 of the paper
The volume of sphere is found by using the equation:
From Slide 51 of the paper
And they use the Voronoi Diagram to associate the ratio of volume occupied by each sample points for their predefined sampling pattern.
But in my implementation, I didn't use the Voronoi Diagram, in stead, I tried to calculate the volume occupied by the depth sample using that equation in the pixel shader. However , due to the perspective projection, that equation no longer holds as the ray will not form a right angle triangle which is not the same as above figure, and resulting the artifact as below(the wall on the right side):
The artifact on the wall on the right side
So, I tried to solve the problem by using ray-sphere intersection to calculate a more accurate line integrals.
For example, when calculate the occlusion volume for the black cross in the above diagram (take 2 depth samples for easy explanation), I need to compute the length L1 and L2 by solving ray-sphere intersection. Also the length O1 and O2 can be computed by sampling from the depth buffer. Therefore, the volume of the sphere can be approximated by L1+L2 and the occlusion volume can be approximated by O1+O2. (I also added a distance attenuation factor to O1 and O2 if the depth difference is too large so that the tank does not occlude the wall in my demo program). And the AO value will be (O1+O2)/(L1+L2)
This eliminate the artifacts:
Solving ray-sphere intersection to eliminate the artifacts
The demo program uses 8 depth samples for each fragment. In order to fake a higher sample count, I also tried to rotate the sample points as suggested by the paper which gives a softer look for the AO:
Rotating the sample points
Then, a bilateral blur is applied to smooth out the noise. Although bilateral blur is not separable, it is faster to divided it into 2 passes (i.e. 1 horizontal and 1 vertical, just like Gaussian blur), with 5 samples for each pass, which gives a softer result:
After bilateral blur
Finally the SSAO texture is blend with the scene:
Applying SSAO to the scene
In conclusion, I finished the SSAO but it is not optimized, there are several places can be improved such as when calculating the line integrals, I made several branches in pixel shader which slow down a lot. Also I rotated the sample points by calculate a rotation angle using the fragment position in pixel shader which can also be optimized using a pre-computed rotation angle texture as the paper suggested. These things can be improved but my main purpose of this demo is make me familiar with DirectX, so I just left out the optimization and left as future enhancement.

References:
[1]: Rendering techniques in Toy Story 3, http://advances.realtimerendering.com/s2010/index.html
[2]: The brick texture is obtained from Crytek's Sponza Model: http://crytek.com/cryengine/cryengine3/downloads
[3]: The tank model is obtained from an XNA demo project:http://create.msdn.com/en-US/education/catalog/?contenttype=0&devarea=0&platform=21&sort=1