On the iPhone platform, memory is a very precious resources. If they are not handled properly, the application will receive one or two memory warning, and then your application will be killed by the OS. So I decided to write my own memory allocator to preallocate a large chunk of memory so that my game will not be killed by the OS when running, it will either start or not start. This is my first time to write a memory allocator, and it is not as sophisticated as "Ready, Set, Allocate!", but it just works fine enough for me.
In my little engine, a pool allocator is written for memory allocation, with different pre-defined pool size ranging from 8, 16, 32, 64bytes to 1048546bytes. As my target platform is iPhone, the maximum pool size is 1048546bytes which is used only for a few high resolution textures, most of the memory is spent on the the smaller pool size. During the program starts, a large chunk of memory is created and it is divided into different smaller chunks for different pool size as follows:
Notice that the large byte chuck is located in the smaller memory address for proper byte alignment. And within each chunk, it is divided into equally sized block for each particular size allocation:
The memory blocks within each chunk are maintained as a linked list so that when the pool memory allocator need to allocate/deallocate memory, it just need to return a free memory block from the list/add it back to the linked list. The memory within each memory blocks is used to store the 'next pointer' for the next free memory block in the linked list so that we do not need to allocate extra memory to keep track of the linked list (this approach is learnt from Game Engine Architecture):
For each allocation, the allocator need to decided which pool chunk need to be used depends on the size of the allocation. Then within that chunk, a free memory block is returned which is just the head of the linked list within that chunk. For each deallocation, as we divide the memory into different chunk, we know the boundary address of each pool chunk, so by checking the deallocated pointer address, we can determine which pool chunk it belongs to, then we can just add the memory back to that chunk's free memory block linked list. One drawback of this approach is it does not verify whether the deallocated pointer is actually allocated by the user nor it is double freed. To 'partly overcome' this problem, I added limited check to verify the input to each deallocation. First, I would check whether the input is byte aligned, for example if the deallocated pointer is within the 1048546 bytes chunk, then the pointer address must be 1048546 byte aligned. Second, as we partition the memory chunk, we know how many memory blocks is within each memory chunk, we can maintain a current free blocks number which will increase and decrease for each allocation and deallocation. When the program exits and this free block number does not match with the total number of blocks, then memory may either be leaked or double freed. But this only solve the problem partially.
To actually solve the problem and also check for memory leaks. I need to log every allocation and deallocation. Originally, for each allocation, I just store the returned pointer address and location of each allocation(which source file and line number) using the macro __FILE__, __LINE__. But this does not do well enough to track down all memory leak as some files are templated such as btAlignedAllocator.h in the bullet physics library(yes, I use bullet physics in my engine). Using the macro __FILE__, __LINE__ will only log down the allocation in these header file which does not help much for tracking memory leak. Therefore, I also log the callstack of each allocation using the system call backtrace() and backtrace_symbols()(which is available in Unix-like platform). Then I can track down all memory leak easily. However, logging every allocation is a very slow process and it is only enabled in debug build/ enabled when necessary.
In conclusion, my memory allocator still have different things to improve such as verifying the user deallocation; adding some meta-data within each memory block such as the allocation size; And for the thread safety, currently I only use a mutex to protected the memory, I may switch to a lock-free version in the future. Despite these short comings, this allocator works well enough for me as it avoid receiving memory warning from the iOS, avoiding memory fragmentation and help me track down memory leaks.
Reference:
[1] http://g.oswego.edu/dl/html/malloc.html
[2] Ready, Set, Allocate!: http://altdevblogaday.com/2011/04/11/ready-set-allocate-part-1/
[3] Game Engine Architecture: http://www.gameenginebook.com/
[4] http://gcc.gnu.org/onlinedocs/cpp/Standard-Predefined-Macros.html
[5] http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man3/backtrace_symbols.3.html
Writing an iPhone Game Engine (Part 0- Introduction)
This time I would like to talk about my hobby project which is writing an iPhone Game Engine. This project started from August last year and works with 2 artists. Although the game is still not finished, I want to share what I have learnt so far. Let's show some screen shots first:
In the game, the player will control a ship to explore the world, discovering new cities and fighting with other ships. The player can also change the ships when the game progress.
I will have several blog posts to talk about the techniques I used in my little engine. The up coming topics will includes:
- Memory management
- Tools (Maya plugin and level editor)
- Scripting (Lua)
- Streaming system
![]() |
| Fighting against Ships |
![]() |
| Exploring the game world |
I will have several blog posts to talk about the techniques I used in my little engine. The up coming topics will includes:
- Memory management
- Tools (Maya plugin and level editor)
- Scripting (Lua)
- Streaming system
- Audio (OpenAL for effect sounds and Apple audio queue for BGM)
- Performance tuning
After finishing the above topics I will talk about what I have done right and what have done wrong in this project.
In my next post, I will start to talk about the Memory management in my game engine. To end up this introductory post, I would like to show you more screen shots of the game and tools I have developed so far:
- Performance tuning
After finishing the above topics I will talk about what I have done right and what have done wrong in this project.
In my next post, I will start to talk about the Memory management in my game engine. To end up this introductory post, I would like to show you more screen shots of the game and tools I have developed so far:
![]() |
| Profiling the game |
![]() |
| Editor to compose a game entity |
![]() |
| Level editor |
![]() |
| Mac version of the game for artists to preview the game |
SSAO using Line Integrals
This time, I would like to talk about the SSAO implemented in my little demo program. I write this demo because I spent most of my time using openGL and know little about DirectX, so I decided to learn DX by writing this demo. So it is not well optimized.
SSAO, short for Screen Space Ambient Occlusion, is a technique for approximating the indirect shadow casted by surrounding scene geometry which is done in screen space by sampling from the depth buffer.
The SSAO is implemented using the line integrals from "Rendering techniques in Toy Story 3"[1]. Here is my results:
| With SSAO |
| without SSAO |
| SSAO texture |
Their method calculates the volume occluded by other objects inside a sphere at each fragment by sampling from the depth buffer.
![]() |
| From Slide 22 of the paper |
![]() |
| From Slide 51 of the paper |
But in my implementation, I didn't use the Voronoi Diagram, in stead, I tried to calculate the volume occupied by the depth sample using that equation in the pixel shader. However , due to the perspective projection, that equation no longer holds as the ray will not form a right angle triangle which is not the same as above figure, and resulting the artifact as below(the wall on the right side):
| The artifact on the wall on the right side |
For example, when calculate the occlusion volume for the black cross in the above diagram (take 2 depth samples for easy explanation), I need to compute the length L1 and L2 by solving ray-sphere intersection. Also the length O1 and O2 can be computed by sampling from the depth buffer. Therefore, the volume of the sphere can be approximated by L1+L2 and the occlusion volume can be approximated by O1+O2. (I also added a distance attenuation factor to O1 and O2 if the depth difference is too large so that the tank does not occlude the wall in my demo program). And the AO value will be (O1+O2)/(L1+L2)
This eliminate the artifacts:
| Solving ray-sphere intersection to eliminate the artifacts |
| Rotating the sample points |
| After bilateral blur |
| Applying SSAO to the scene |
References:
[1]: Rendering techniques in Toy Story 3, http://advances.realtimerendering.com/s2010/index.html
[2]: The brick texture is obtained from Crytek's Sponza Model: http://crytek.com/cryengine/cryengine3/downloads
[3]: The tank model is obtained from an XNA demo project:http://create.msdn.com/en-US/education/catalog/?contenttype=0&devarea=0&platform=21&sort=1
訂閱:
文章 (Atom)











