
Trying out Voxel Cone Tracing...

It has been a long time since my last post, so this is just a quick update on what I have been doing for the past two months: trying to implement sparse voxel octree global illumination using cone tracing. Below are some screenshots showing my current progress:

Scene with direct lighting only
Scene with indirect lighting

Voxelized scene
Building the sparse octree

Injecting direct lighting into the voxelized scene
Another view of the voxelized scene

Creating a mip-map of the direct lighting (mip-map resolution 256x256x256)
mip-map resolution 128x128x128
mip-map resolution 64x64x64

AO computed from the voxelized scene
Another angle to view the scene

Scene lit by both direct and indirect light; notice that the shadowed floor reflects the red and blue curtains
The back of the knight is illuminated by indirect lighting only
Another camera angle
Light buffer showing only indirect lighting
Light buffer with both direct and indirect lighting

The above screenshots were captured at 512x512x512 voxel resolution, with the scene illuminated by one directional light. The indirect illumination flickers when the camera moves if the shadow map resolution is not high enough. Performance is also poor on my GTX460; anisotropic mip-mapping takes up most of the time... I will write more posts about it after tidying everything up, and I hope it can all be finished by the end of this year...


Angle based SSAO

Introduction
SSAO (screen space ambient occlusion) is a common post-processing effect that approximates how much light reaching a given surface is occluded by the surrounding objects. At this year's SIGGRAPH, a few slides in "The Technology behind the Unreal Engine 4 Elemental Demo" describe how they implement SSAO. Their technique can use either the depth buffer alone or the depth buffer together with per-pixel normals. I tried to implement both versions with a slight modification:


Using only the depth buffer
Ambient occlusion is defined as the visibility integral over the hemisphere above a given surface point:
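In its usual textbook form (the notation is assumed here), with $V(p,\omega)$ equal to 1 when direction $\omega$ is unoccluded and 0 otherwise, $n$ the surface normal at $p$ and $\Omega$ the hemisphere around $n$, the integral can be written as:

```latex
A(p) = \frac{1}{\pi} \int_{\Omega} V(p, \omega)\, (n \cdot \omega)\, d\omega
```

With this normalization, $A(p) = 1$ means the point is fully unoccluded.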


To approximate this in screen space, we design our sampling pattern as paired samples:
paired sample pattern
So for each pair of samples, we can approximate how much the shading point is occluded in 2D instead of integrating over the hemisphere:

The AO term for each pair of samples is min( (θleft + θright)/π, 1). Averaging the AO terms of all the sample pairs (6 pairs in my case) gives the following result:
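The pair averaging can be sketched in a few lines of Python. This is only an illustration under assumptions: positions are in view space, each θ is taken as the angle between the direction toward the camera and the direction from the shading point to a sample, and all function names are hypothetical.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def angle_to_view(p, sample, view_dir):
    """Angle between the direction toward the camera and the direction to a sample."""
    to_sample = normalize(tuple(s - q for s, q in zip(sample, p)))
    cos_theta = sum(a * b for a, b in zip(to_sample, view_dir))
    return math.acos(max(-1.0, min(1.0, cos_theta)))

def pair_ao(p, left, right, view_dir):
    """AO term of one sample pair: min((theta_left + theta_right) / pi, 1)."""
    theta_left = angle_to_view(p, left, view_dir)
    theta_right = angle_to_view(p, right, view_dir)
    return min((theta_left + theta_right) / math.pi, 1.0)

def average_ao(p, pairs, view_dir):
    """Average the AO terms of all sample pairs."""
    return sum(pair_ao(p, l, r, view_dir) for l, r in pairs) / len(pairs)

origin = (0.0, 0.0, 0.0)
toward_camera = (0.0, 0.0, 1.0)
flat = pair_ao(origin, (-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), toward_camera)
concave = pair_ao(origin, (-1.0, 0.0, 1.0), (1.0, 0.0, 1.0), toward_camera)
convex = pair_ao(origin, (-1.0, 0.0, -1.0), (1.0, 0.0, -1.0), toward_camera)
```

Under these assumptions a flat surface and a convex edge both give 1 (unoccluded), while a 90 degree crease gives 0.5.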


Dealing with large depth differences
As seen in the above screenshot, there are dark halos around the knight. But the knight should not contribute AO to the castle, as he is too far away. To deal with such large depth differences, I adopted the approach used in Toy Story 3: if one of the paired samples is too far away from the shading point, say the red point in the following figure, it is replaced by the pink point, which lies on the same plane as the other, valid sample of the pair:


We can then interpolate between the red point and the pink point to handle the large depth difference. Now the dark halo is gone:


The above treatment only handles the case where one of the paired samples is far away from the shading point. What if both samples have large depth differences?

Dark halo artifact around the sword
(AO strength is increased in this picture to highlight the artifact)

In this case, the result is the dark halo around the sword in the above screenshot. Remember that we average all the paired samples to compute the final AO value. So to deal with this artifact, we assign a weight to each sample pair and then re-normalize the final result. For each pair, if both samples are within a small depth difference, the pair has a weight of 1. If only one sample is far away, the pair has a weight of 0.5. Finally, if both samples are far away, the weight is 0. This eliminates most (but not all) of the artifacts:
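The weighting and re-normalization can be sketched as follows. The function name and the boolean "far" flags are hypothetical, and returning 1 (unoccluded) when every pair is rejected is an assumption that the text above does not specify.

```python
def weighted_ao(pair_terms, far_flags):
    """pair_terms: AO term per pair; far_flags: (left_is_far, right_is_far) per pair."""
    total, weight_sum = 0.0, 0.0
    for ao, (left_far, right_far) in zip(pair_terms, far_flags):
        # Weight 1 if both samples are close, 0.5 if one is far, 0 if both are far.
        weight = 1.0 - 0.5 * (int(left_far) + int(right_far))
        total += weight * ao
        weight_sum += weight
    # Assumption: with no usable pair at all, treat the pixel as unoccluded.
    return total / weight_sum if weight_sum > 0.0 else 1.0

example = weighted_ao(
    [0.5, 1.0, 0.2],
    [(False, False), (True, False), (True, True)],
)
```

In the example the third pair is ignored entirely, so the result is (0.5 * 1 + 1.0 * 0.5) / 1.5.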

Approximating arc-cos function
In this approach, the AO is calculated from the angles of the paired samples, which requires evaluating the arc-cos function, and that is a bit expensive. We can approximate acos(x) with a linear function: π(1-x)/2.


And the resulting AO looks much darker with this approximation:

computed with the arc-cos function
computed with the linear approximation

Note that the maximum error between the two functions is around 18.946 degrees.


This may affect the AO on curved surfaces with low tessellation. You may need to either increase the bias angle threshold or switch to a more accurate function. So my second attempt was to approximate it with a quadratic function: π(1 - sign(x) * x * x)/2.


This approximation gives a result much closer to the one computed with the arc-cos function.
computed with the arc-cos function
computed with the quadratic approximation

The maximum error of this function is around 9.473 degrees.
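Both error figures can be checked numerically. The sketch below samples the two approximations densely over [-1, 1] and reports the largest deviation from acos in degrees; the function names are made up for this example.

```python
import math

def acos_linear(x):
    """Linear approximation: pi * (1 - x) / 2."""
    return math.pi * (1.0 - x) / 2.0

def acos_quadratic(x):
    """Quadratic approximation: pi * (1 - sign(x) * x * x) / 2."""
    return math.pi * (1.0 - math.copysign(x * x, x)) / 2.0

def max_error_degrees(approx, steps=200000):
    """Largest absolute difference to acos over evenly spaced x in [-1, 1]."""
    worst = 0.0
    for i in range(steps + 1):
        x = -1.0 + 2.0 * i / steps
        worst = max(worst, abs(math.acos(x) - approx(x)))
    return math.degrees(worst)

linear_error = max_error_degrees(acos_linear)
quadratic_error = max_error_degrees(acos_quadratic)
```

The quadratic version costs one extra multiply and a sign transfer, and roughly halves the worst-case error.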


Using per-pixel normal
We can enhance the details of the AO by making use of the per-pixel normal. The normal further restricts the angles used to compute the AO: θleft and θright are clamped to the tangent plane:


And here is the final result:


Conclusion
The AO looks pleasant when taking 12 samples per pixel in total, with 16 rotations in a 4x4 pixel block, at half resolution. I did not apply a bilateral blur to the AO result, but applying one may give a softer look. Also, although approximating the arc-cos function with a linear function is not accurate, it gives a good enough result for me. Finally, more time needs to be spent on generating the sampling pattern in the future, as the pattern I currently use is nearly uniformly distributed (with some jittering).

References
[1] The Technology behind the Unreal Engine 4 Elemental Demo http://advances.realtimerendering.com/s2012/Epic/The%20Technology%20Behind%20the%20Elemental%20Demo%2016x9.pptx
[2] Rendering techniques in Toy Story 3 http://advances.realtimerendering.com/s2010/Ownby,Hall%20and%20Hall%20-%20Toystory3%20(SIGGRAPH%202010%20Advanced%20RealTime%20Rendering%20Course).pdf
[3] Image-Space Horizon-Based Ambient Occlusion http://www.nvidia.com/object/siggraph-2008-HBAO.html
[4] http://www.wolframalpha.com/
[5] The models are exported from UDK and extracted from Infinity Blade using umodel.exe

Shader Generator

Introduction
In the last few weeks, I was busy rewriting my iPhone engine so that it can also run on the Windows platform (so that I can use Visual Studio instead of Xcode~) and, most importantly, so that I can play around with D3D11. During the rewrite, I wanted to improve the process of writing shaders so that I don't need to write similar shaders multiple times for each shader permutation (say, for each surface, I have to write a shader for static mesh, skinned mesh, instanced static mesh... multiplied by the number of render passes), and can instead focus on coding how the surface looks. So I decided to write a shader generator that generates those shaders, similar to the surface shaders in Unity. I chose the surface shader approach instead of a graph-based approach like Unreal Engine's because, being a programmer, I feel more comfortable (and faster) writing code than dragging tree nodes around in a GUI. In the current implementation, the shader generator can only generate vertex and pixel shaders for the light pre-pass renderer, which is the lighting model I used before.

Defining the surface
To have the shader generator produce the target vertex and pixel shaders, we need to define how the surface looks by writing a surface shader. In my version of the surface shader, I need to define 3 functions: a vertex function, a surface function and a lighting function. The vertex function defines vertex properties such as position and texture coordinates.
    VTX_FUNC_OUTPUT vtxFunc(VTX_FUNC_INPUT input)
    {
        VTX_FUNC_OUTPUT output;
        output.position = mul( float4(input.position, 1), worldViewProj );
        output.normal = mul( worldInv, float4(input.normal, 0) ).xyz;
        output.uv0 = input.uv0;
        return output;
    }
The surface function describes how the surface looks by defining its diffuse color, glossiness and surface normal.
    SUF_FUNC_OUTPUT sufFunc(SUF_FUNC_INPUT input)
    {
        SUF_FUNC_OUTPUT output;
        output.normal = input.normal;
        output.diffuse = diffuseTex.Sample( samplerLinear, input.uv0 ).rgb;
        output.glossiness = glossiness;
        return output;
    }
Finally the lighting function will decide which lighting model is used to calculate the reflected color of the surface.
    LIGHT_FUNC_OUTPUT lightFuncLPP(LIGHT_FUNC_INPUT input)
    {
        LIGHT_FUNC_OUTPUT output;
        float4 lightColor = lightBuffer.Sample(samplerLinear, input.pxPos.xy * renderTargetSizeInv.xy );
        output.color = float4(input.diffuse * lightColor.rgb, 1);
        return output;
    }
With the above functions defined, the writer of the surface shader only needs to fill in the output structure of each function, using the input structure together with some auxiliary functions and shader constants provided by the engine.

Generating the shaders
As you can see in the above code snippets, my surface shader just defines ordinary HLSL functions with fixed input and output structures. So to generate the vertex and pixel shaders, we just need to copy these functions into the target shader code, which then invokes them. Taking the above vertex function as an example, the generated vertex shader would look like:
    #include "include.h"
    struct VS_INPUT
    {
        float3 position : POSITION0;
        float3 normal : NORMAL0;
        float2 uv0 : UV0;
    };
    struct VS_OUTPUT
    {
        float4 position : SV_POSITION0;
        float3 normal : NORMAL0;
        float2 uv0 : UV0;
    };
    typedef VS_INPUT VTX_FUNC_INPUT;
    typedef VS_OUTPUT VTX_FUNC_OUTPUT;
    /********************* User Defined Content ********************/
    VTX_FUNC_OUTPUT vtxFunc(VTX_FUNC_INPUT input)
    {
        VTX_FUNC_OUTPUT output;
        output.position = mul( float4(input.position, 1), worldViewProj );
        output.normal = mul( worldInv, float4(input.normal, 0) ).xyz;
        output.uv0 = input.uv0;
        return output;
    }
    /******************** End User Defined Content *****************/
    VS_OUTPUT main(VS_INPUT input)
    {
        return vtxFunc(input);
    }
During code generation, the shader generator needs to figure out which input and output structures must be fed into the user-defined functions. This task is simple and can be accomplished with some string functions.
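As an illustration of the kind of string processing involved, the Python sketch below collects the members accessed on the input and output variables of a user function with a regular expression. It is an assumption about one way this could be done, not the generator's actual code, and members_used is a hypothetical helper.

```python
import re

def members_used(function_source, struct_variable):
    """Return the member names accessed on struct_variable, e.g. input.uv0 -> 'uv0'."""
    pattern = r"\b" + re.escape(struct_variable) + r"\s*\.\s*([A-Za-z_]\w*)"
    return sorted(set(re.findall(pattern, function_source)))

VTX_FUNC = """
VTX_FUNC_OUTPUT vtxFunc(VTX_FUNC_INPUT input)
{
    VTX_FUNC_OUTPUT output;
    output.position = mul( float4(input.position, 1), worldViewProj );
    output.normal = mul( worldInv, float4(input.normal, 0) ).xyz;
    output.uv0 = input.uv0;
    return output;
}
"""

input_members = members_used(VTX_FUNC, "input")
output_members = members_used(VTX_FUNC, "output")
```

From these member lists a generator could emit the matching VS_INPUT and VS_OUTPUT structures, given a table that maps each member name to its type and semantic.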

Simplifying the shader
As I mentioned before, my shader generator generates shaders for the light pre-pass renderer. There are 2 passes in a light pre-pass renderer that need different shader inputs and outputs. For example, in the G-buffer pass the shaders are only interested in the surface normal and not the diffuse color, while the data needed by the second geometry pass is the opposite. However, all the surface information (surface normal and diffuse color) is defined in the surface function inside the surface shader. If we simply generate shaders as in the last section, we will generate some redundant code that cannot be optimized away by the shader compiler. For example, the pixel shader in the G-buffer pass may sample the diffuse texture, which requires the texture coordinates as input from the vertex shader, even though the diffuse color is not actually needed in this pass; the compiler may not be able to figure out that the vertex shader does not need to output the texture coordinates. Of course we can force the writer to add some #if preprocessor directives inside the surface function for a particular render pass to eliminate the useless output, but this complicates the surface shader authoring process: writing a surface shader is about describing how the surface looks and, ideally, the writer should not need to worry about the output of a render pass.

So the problem is to figure out which output data is actually needed in a given pass and to eliminate the outputs that are not. For example, suppose we are generating shaders for the G-buffer pass from this surface function:

    SUF_FUNC_OUTPUT sufFunc(SUF_FUNC_INPUT input)
    {
        SUF_FUNC_OUTPUT output;
        output.normal = input.normal;
        output.diffuse = diffuseTex.Sample( samplerLinear, input.uv0 ).rgb;
        output.glossiness = glossiness;
        return output;
    }
We only want to keep the variables output.normal and output.glossiness. The variable output.diffuse, and the other variables referenced by it (diffuseTex, samplerLinear, input.uv0), are going to be eliminated. To find such variable dependencies, we need to teach the shader generator to understand HLSL grammar, so that it can find all the assignment statements and branching conditions and derive the dependencies from them.
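The elimination itself can be sketched as a backward walk over the assignment statements. The Python below assumes a parser has already reduced each statement to a (target, dependencies, source line) tuple; it ignores branching conditions and is only an illustration with hypothetical names.

```python
def eliminate_unused(statements, required_outputs):
    """statements: list of (target, dependencies, source_line) in program order.
    Returns the statements to keep and every variable they reference."""
    live = set(required_outputs)
    kept, referenced = [], set()
    # Walk backwards so that a kept statement marks its own dependencies as live.
    for target, dependencies, source_line in reversed(statements):
        if target in live:
            live.discard(target)
            live |= set(dependencies)
            referenced |= set(dependencies)
            kept.append(source_line)
    kept.reverse()
    return kept, referenced

statements = [
    ("output.normal", {"input.normal"},
     "output.normal = input.normal;"),
    ("output.diffuse", {"diffuseTex", "samplerLinear", "input.uv0"},
     "output.diffuse = diffuseTex.Sample( samplerLinear, input.uv0 ).rgb;"),
    ("output.glossiness", {"glossiness"},
     "output.glossiness = glossiness;"),
]

# The G-buffer pass only needs the normal and the glossiness.
kept, referenced = eliminate_unused(statements, {"output.normal", "output.glossiness"})
```

Because input.uv0 never ends up in the referenced set, a generator built this way would also know that the texture coordinate can be dropped from the shader input structure.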

To do this, we need to generate an abstract syntax tree from the shader source code. Of course we could write our own LALR parser to achieve this, but I chose to use lex & yacc (or flex & bison) to generate the parse tree. Luckily, we are working on a subset of the HLSL syntax (we only need to define functions and don't need pointers) and HLSL syntax is similar to C, so modifying the ANSI C grammar rules for lex & yacc does the job. Here is my modified grammar rule used to generate the parse tree. By traversing the parse tree, the variable dependencies can be obtained; hence we know which variables need to be eliminated, and we eliminate them by taking out their assignment statements. The compiler will do the rest. Below is the simplified pixel shader generated from the previous example:
    #include "include.h"
    cbuffer _materialParam : register( MATERIAL_CONSTANT_BUFFER_SLOT_0 )
    {
        float glossiness;
    };
    Texture2D diffuseTex : register( MATERIAL_SHADER_RESOURCE_SLOT_0 );
    struct PS_INPUT
    {
        float4 position : SV_POSITION0;
        float3 normal : NORMAL0;
    };
    struct PS_OUTPUT
    {
        float4 gBuffer : SV_Target0;
    };
    struct SUF_FUNC_OUTPUT
    {
        float3 normal;
        float glossiness;
    };
    typedef PS_INPUT SUF_FUNC_INPUT;
    /********************* User Defined Content ********************/
    SUF_FUNC_OUTPUT sufFunc(SUF_FUNC_INPUT input)
    {
        SUF_FUNC_OUTPUT output;
        output.normal = input.normal;
        ;
        output.glossiness = glossiness;
        return output;
    }
    /******************** End User Defined Content *****************/
    PS_OUTPUT main(PS_INPUT input)
    {
        SUF_FUNC_OUTPUT sufOut = sufFunc(input);
        PS_OUTPUT output;
        output.gBuffer = normalToGBuffer(sufOut.normal, sufOut.glossiness);
        return output;
    }

Extending the surface shader syntax
As I use lex & yacc to parse the surface shader, I can extend the surface shader syntax by adding more grammar rules, so that the writer of the surface shader can declare which shader constants and textures their surface function needs; the generator uses this to emit the constant buffer and shader resources in the source code. My surface shader syntax also permits users to define their own structs and functions besides the 3 main functions (vertex, surface and lighting function); these are copied into the generated source code as well. Here is a sample of what my surface shader looks like:

    RenderType{
        opaque;
    };
    ShaderConstant
    {
        float glossiness : ui_slider_0_255_Glossiness;
    };
    TextureResource
    {
        Texture2D diffuseTex;
    };
    VTX_FUNC_OUTPUT vtxFunc(VTX_FUNC_INPUT input)
    {
        VTX_FUNC_OUTPUT output;
        output.position = mul( float4(input.position, 1), worldViewProj );
        output.normal = mul( worldInv, float4(input.normal, 0) ).xyz;
        output.uv0 = input.uv0;
        return output;
    }
    SUF_FUNC_OUTPUT sufFunc(SUF_FUNC_INPUT input)
    {
        SUF_FUNC_OUTPUT output;
        output.normal = input.normal;
        output.diffuse = diffuseTex.Sample( samplerLinear, input.uv0 ).rgb;
        output.glossiness = glossiness;
        return output;
    }
    LIGHT_FUNC_OUTPUT lightFuncLPP(LIGHT_FUNC_INPUT input)
    {
        LIGHT_FUNC_OUTPUT output;
        float4 lightColor = lightBuffer.Sample(samplerLinear, input.pxPos.xy * renderTargetSizeInv.xy );
        output.color = float4(input.diffuse * lightColor.rgb, 1);
        return output;
    }

Conclusions
This post described how I generate vertex and pixel shader source code for different render passes from a surface shader, which saves me from writing similar shaders multiple times and from worrying about the particular shader inputs and outputs of each render pass. Currently, the shader generator can only generate vertex and pixel shaders in HLSL for static meshes in the light pre-pass renderer. The shader generator is still in progress: generating shader source code for the forward pass is not done yet, and domain, hull and geometry shaders are not implemented. GLSL support is also missing, but it could be generated (in theory...) by building a more sophisticated abstract syntax tree while parsing the surface shader grammar, or by defining some new grammar rules in the surface shader (using lex & yacc) to make it easier to generate both HLSL and GLSL source code. But these will be left for the future, as I still need to rewrite my engine and get it running again...

References
[1] Unity - Surface Shader Examples http://docs.unity3d.com/Documentation/Components/SL-SurfaceShaderExamples.html
[2] Lex & Yacc Tutorial http://epaperpress.com/lexandyacc/
[3] ANSI C grammar, Lex specification http://www.lysator.liu.se/c/ANSI-C-grammar-l.html
[4] ANSI C Yacc grammar http://www.lysator.liu.se/c/ANSI-C-grammar-y.html
[5] http://www.ibm.com/developerworks/opensource/library/l-flexbison/index.html
[6] http://www.gamedev.net/topic/200275-yaccbison-locations/


Photon Mapping Part 2

Introduction
Continuing from the previous post, this post describes how the light map is calculated from the photon map. My light map stores the incoming radiance of indirect lighting on a surface, projected onto the Spherical Harmonics (SH) basis. 4 SH coefficients are used for each color channel, so 3 textures are used for the RGB channels (12 coefficients in total).

Baking the light map
To bake the light map, the scene must have a set of unique, non-overlapping texture coordinates (UV), each corresponding to a unique world space position, so that the incoming radiance at a world position can be represented. This set of UVs can be generated inside a modeling package or using UVAtlas. In my simple case, the UVs are mapped manually.
To generate the light map, given a mesh with unique UVs and the light map resolution, we rasterize the mesh (using scan-line or half-space rasterization) into texture space, interpolating the world space position across the triangles. This associates a world space position with each light map texel. Then for each texel, we sample the photon map at the corresponding world space position by performing a final gather step, just as in the previous post for offline rendering. This gives the incoming radiance at that world space position, and hence at the texel in the light map. The data is then projected into SH coefficients and stored in 3 16-bit floating point textures. Below is a light map showing the dominant light color extracted from the SH coefficients:
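The projection step can be sketched for one color channel as a Monte Carlo sum over the first two SH bands. In this Python sketch the radiance callable stands in for the final gather result in a given direction; the uniform sphere sampling, the coefficient order and all names are assumptions made for illustration.

```python
import math
import random

def sh_basis(direction):
    """First two real SH bands (4 values) for a unit direction (x, y, z)."""
    x, y, z = direction
    return (0.282095, 0.488603 * y, 0.488603 * z, 0.488603 * x)

def uniform_sphere_sample(rng):
    """Uniformly distributed unit vector."""
    z = 1.0 - 2.0 * rng.random()
    radius = math.sqrt(max(0.0, 1.0 - z * z))
    phi = 2.0 * math.pi * rng.random()
    return (radius * math.cos(phi), radius * math.sin(phi), z)

def project_to_sh(radiance, sample_count=20000, seed=1):
    """Monte Carlo projection of radiance(direction) onto 4 SH coefficients."""
    rng = random.Random(seed)
    coefficients = [0.0, 0.0, 0.0, 0.0]
    for _ in range(sample_count):
        direction = uniform_sphere_sample(rng)
        value = radiance(direction)
        for i, basis in enumerate(sh_basis(direction)):
            coefficients[i] += value * basis
    # Uniform sphere sampling has pdf 1 / (4 * pi), so each sample weighs 4 * pi / N.
    scale = 4.0 * math.pi / sample_count
    return [c * scale for c in coefficients]

constant_light = project_to_sh(lambda direction: 1.0)
light_from_above = project_to_sh(lambda direction: max(direction[2], 0.0))
```

Running the projection once per color channel yields the 12 coefficients that would go into the three textures.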

The baked light map showing the dominant light color from SH coefficients

Using the light map
After baking the light map, the direct lighting is rendered in the usual way at run-time. A point light is used to approximate the area light of the ray traced version; the difference is most noticeable at the shadow edges.

direct lighting only, real time version
direct lighting only, ray traced version

Then we sample the SH coefficients from the light map to calculate the indirect lighting:
indirect lighting only, real time version
indirect lighting only, ray traced version

Combining the direct and indirect lighting, the final result becomes:
direct + indirect lighting, real time version
direct + indirect lighting, ray traced version

As we store the light map in SH, we can apply a normal map to the mesh to change the reflected radiance.
Rendered with normal map
Indirect lighting with normal map
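Evaluating the stored coefficients with a per-pixel normal can be sketched as below. The sketch uses the standard cosine-lobe convolution factors for the first two SH bands (π and 2π/3) and assumes the coefficients of one color channel are ordered as (Y00, Y1-1, Y10, Y11); the function name is hypothetical and the actual shader may differ.

```python
import math

def irradiance_from_sh(coefficients, normal):
    """Diffuse irradiance for a unit normal from 4 SH radiance coefficients."""
    x, y, z = normal
    c0, c1, c2, c3 = coefficients
    band0 = math.pi * c0 * 0.282095
    band1 = (2.0 * math.pi / 3.0) * 0.488603 * (c1 * y + c2 * z + c3 * x)
    return band0 + band1

# Constant incoming radiance of 1 has only the first coefficient set.
uniform_coefficients = [4.0 * math.pi * 0.282095, 0.0, 0.0, 0.0]
uniform_irradiance = irradiance_from_sh(uniform_coefficients, (0.0, 0.0, 1.0))

# Light coming mostly from +z: normals facing +z receive more than normals facing -z.
directional_coefficients = [1.0, 0.0, 0.5, 0.0]
facing = irradiance_from_sh(directional_coefficients, (0.0, 0.0, 1.0))
away = irradiance_from_sh(directional_coefficients, (0.0, 0.0, -1.0))
```

Since the normal only enters through the three linear terms, perturbing it with a normal map changes the result at almost no extra cost.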
We can also apply some tessellation and add some ambient occlusion (AO) to make the result more interesting:
Rendered with light map, normal map, tessellation and AO
Rendered with light map, normal map, tessellation and AO
Conclusion
This post gives an overview of how to bake a light map of indirect lighting by sampling from the photon map. I use SH to store the incoming radiance, but other data can be stored instead, such as the reflected diffuse radiance of the surface, which reduces texture storage and doesn't require floating point textures. Besides, the SH coefficients can be stored per vertex in the static mesh instead of in a light map. Lastly, by sampling the photon map with final gather rays, light probes for dynamic objects can also be baked using similar methods.

References
March of the Froblins: http://developer.amd.com/samples/demos/pages/froblins.aspx
Lighting and Material of HALO 3: http://www.bungie.net/images/Inside/publications/presentations/lighting_material.zip

Photon Mapping Part 1

Introduction

In this generation of computer graphics, global illumination (GI) is an important technique that calculates indirect lighting within a scene. Photon mapping is one such GI technique, which uses particle tracing to compute images in offline rendering. It is easy to implement, so I chose to learn it, and my target is to bake a light map storing indirect diffuse lighting using the photon map. Photon mapping consists of 2 passes, the photon map pass and the render pass, which are described below.


Photon Map Pass

In this pass, photons are cast into the scene from the position of the light source. Each photon stores a packet of energy. When a photon hits a surface in the scene, it is either reflected (diffusely or specularly), transmitted or absorbed, as determined by Russian roulette.

Photons are traced in the scene to simulate light transport

Each hit event represents incoming energy at that surface and is stored in a k-d tree (known as the photon map) to be looked up in the render pass. Each hit event stores the photon energy, the incoming direction and the hit position.
However, it is more convenient to store radiance than energy in a photon, because with a punctual light source (e.g. a point light) it is hard to compute the emitted energy given the light source radiance. So I use the method described in Physically Based Rendering, where a radiance weight is stored in each photon:



When a photon hits a surface, the probability used in Russian roulette for reflecting it in a new random direction is:



This probability is chosen because a photon has a higher chance of being reflected if it is brighter. If the photon is reflected, its radiance is updated to:


The photon then continues to be traced in the newly reflected direction.
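One common way to implement this Russian roulette step is sketched below. It assumes the survival probability is the ratio of the photon's luminance after and before the bounce, clamped to 1, and that a surviving photon's weight is divided by that probability; the Rec. 709 luminance weights and the function names are choices made for this example.

```python
def luminance(rgb):
    """Rec. 709 luminance of an RGB triple (an assumption for this sketch)."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def bounce_photon(weight, throughput, u):
    """weight: RGB photon weight before the bounce.
    throughput: RGB factor applied by the bounce (BRDF * cosine / pdf).
    u: uniform random number in [0, 1).
    Returns the new weight, or None if the photon is absorbed."""
    new_weight = tuple(w * t for w, t in zip(weight, throughput))
    old_luminance = luminance(weight)
    if old_luminance <= 0.0:
        return None
    continue_probability = min(1.0, luminance(new_weight) / old_luminance)
    if u >= continue_probability:
        return None
    # Dividing by the survival probability keeps the estimate unbiased.
    return tuple(c / continue_probability for c in new_weight)

survived = bounce_photon((1.0, 1.0, 1.0), (0.5, 0.5, 0.5), 0.1)
absorbed = bounce_photon((1.0, 1.0, 1.0), (0.5, 0.5, 0.5), 0.9)
```

With a grey 50% reflective surface, half of the photons survive and each survivor keeps its full weight, so surviving photons stay at similar brightness instead of fading with every bounce.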

Render pass

In the render pass, the direct and indirect lighting are computed separately. The direct lighting is computed using ray tracing.

Direct light only

The indirect lighting is computed by sampling the photon map. When calculating the indirect lighting at a given position (in this case, the pixel being shaded), we locate the N nearest photons in the photon map and estimate the incoming radiance using kernel density estimation. A kernel function needs to satisfy the conditions:

I use Simpson's kernel (also known as Silverman's second order kernel) suggested in the book Physically Based Rendering:
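For reference, the form of this kernel that I recall from the second edition of Physically Based Rendering is k(x) = 3/π (1 - x²)² for x below 1, where x is the photon distance divided by the search radius d. The Python sketch below uses that form (treat it as an assumption), checks numerically that it integrates to 1 over the unit disk, and shows the textbook two-dimensional kernel density estimate; the function names are hypothetical.

```python
import math

def simpson_kernel(x):
    """Kernel value for a normalised distance x = distance / d."""
    if x >= 1.0:
        return 0.0
    s = 1.0 - x * x
    return 3.0 / math.pi * s * s

def density_estimate(distances, d):
    """2D kernel density estimate from the distances of the N nearest photons,
    where d is the distance to the farthest of them."""
    n = len(distances)
    return sum(simpson_kernel(r / d) for r in distances) / (n * d * d)

def integrate_kernel_over_disk(steps=100000):
    """Midpoint-rule integral of the kernel over the unit disk (polar coordinates)."""
    dr = 1.0 / steps
    return sum(
        simpson_kernel((i + 0.5) * dr) * 2.0 * math.pi * (i + 0.5) * dr * dr
        for i in range(steps)
    )

kernel_integral = integrate_kernel_over_disk()
```

Compared with a constant kernel, the weight falls smoothly to zero at the search radius, so photons near the edge of the query disk contribute little.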

Then the density can be computed using a kernel estimator for N samples within a distance d (i.e. the distance to the farthest of the N photons):

Then the reflected radiance at the shading position can be computed with:

However, the result shows some circular artifacts:

Using the photon map directly for indirect diffuse light shows artifacts
To tackle this problem, we can either increase the number of photons to a very high number or perform a final gather step. In the final gather step, we shoot a number of final gather rays from the point being shaded in random directions over the hemisphere above it.

Final gather rays are cast from every shading position

When a final gather ray hits another surface, the photon map is queried just as before, and the reflected radiance from that surface becomes the incoming radiance at the shading point. Using Monte Carlo integration, the reflected radiance at the shading point can be calculated from the final gather ray samples. Here is the final result:

Direct light + Indirect light, with final gather
Indirect light only, with final gather
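The Monte Carlo estimate in the final gather step can be sketched for a diffuse surface as follows. This version uses cosine-weighted hemisphere sampling, which is one common choice and not necessarily the distribution used for the images above; incoming_radiance is a hypothetical stand-in for tracing a gather ray and querying the photon map where it hits.

```python
import math
import random

def cosine_sample_hemisphere(rng):
    """Unit direction around the +z axis with pdf = cos(theta) / pi."""
    u1, u2 = rng.random(), rng.random()
    radius = math.sqrt(u1)
    phi = 2.0 * math.pi * u2
    return (radius * math.cos(phi), radius * math.sin(phi), math.sqrt(max(0.0, 1.0 - u1)))

def final_gather(incoming_radiance, albedo, ray_count=256, seed=1):
    """Estimate the reflected radiance of a diffuse surface whose normal is +z."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(ray_count):
        total += incoming_radiance(cosine_sample_hemisphere(rng))
    # With pdf = cos / pi and a Lambertian BRDF of albedo / pi, the cosine and pi
    # terms cancel, leaving albedo times the average incoming radiance.
    return albedo * total / ray_count

reflected = final_gather(lambda direction: 2.0, albedo=0.5)
```

Cosine-weighted sampling sends more rays near the normal, where incoming light contributes most to a diffuse surface, which reduces noise for the same ray count compared with uniform directions.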
Conclusion

In this post, the steps to implement photon mapping are briefly described. It is a 2-pass approach: the photon map pass builds a photon map, stored as a k-d tree, representing the indirect lighting data, and the render pass uses the photon map to compute the final image. In the next part, I will describe how to use the photon map to bake a light map for real-time applications.



References
A Practical Guide to Global Illumination using Photon Maps: http://nameless.cis.udel.edu/class_data/cg/jensen_photon_mapping_tutorial.pdf
Physically Based Rendering: http://www.pbrt.org/