<p><span style="font-size: large;"><b>Studying "Spectral Primary Decomposition"</b></span></p><p><span style="font-size: large;"><b>Introduction</b></span></p><p>It has been a long time since my last blog post (because of Covid, work, Elden Ring...). So I decided to study <a href="https://graphics.geometrian.com/research/spectral-primaries.html">"Spectral Primary Decomposition for rendering with sRGB Reflectance"</a>, which was used in previous posts, to refresh my memory. It is an efficient technique to up-sample sRGB textures to spectral reflectance by multiplying the sRGB values with 3 precomputed basis functions (a short sketch of this reconstruction is given after the basis-function figures below):</p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgdqIQu-wTpHgRF00MYXJLGRJnS-GJ6PLAzz9M_oMQRC4pxUqAy4BjM53xs26wrD60Iz6_TSBnA7OC3q_FkTWBGN2ZxAAKx05EEL4L2A5GsDVy8A7h8JzqwX5-hb7dl3A4aKKYKBqQBjIvidG-zGIJwrlCZ8ZidW0Qic1qbvC1YCRAppo4ROrXDb94Lyg" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="655" data-original-width="2399" height="175" src="https://blogger.googleusercontent.com/img/a/AVvXsEgdqIQu-wTpHgRF00MYXJLGRJnS-GJ6PLAzz9M_oMQRC4pxUqAy4BjM53xs26wrD60Iz6_TSBnA7OC3q_FkTWBGN2ZxAAKx05EEL4L2A5GsDVy8A7h8JzqwX5-hb7dl3A4aKKYKBqQBjIvidG-zGIJwrlCZ8ZidW0Qic1qbvC1YCRAppo4ROrXDb94Lyg" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Overview of "Spectral Primary Decomposition" from the <a href="https://geometrian.com/data/research/spectral-primaries/SpectralPrimaryDecompositionPoster.pdf"><i>Explanatory Poster</i></a><br /></td></tr></tbody></table>In this post, I would like to find an efficient spectral up-sampling method that also supports wider gamuts (e.g. Display-P3), or to investigate why this technique does not support them.<p></p><p><br /></p><p><span style="font-size: large;"><b>Porting to Octave</b></span></p><p>The paper provides sample source code written in Matlab. Since I do not have a Matlab license, the first thing I needed to do was port the source code to the open source <a href="https://octave.org/">Octave</a> (the ported source code can be found <a href="https://drive.google.com/file/d/16FwiLBZtGC8Mq2-P5tkFHVUwUuSL9jzE/view?usp=share_link">here</a>). 
During the porting process, the <a href="https://www.mathworks.com/help/optim/ug/fmincon.html"><b>fmincon()</b></a> function used for finding the 3 spectral primary basis functions does not work in Octave, so I switched to <a href="https://octave.sourceforge.io/octave/function/sqp.html"><b>sqp()</b></a> instead (and also removed the <a href="https://www.mathworks.com/help/optim/ug/linprog.html"><b>linprog()</b></a> call from the original source code).</p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvcxulgukqaJy9I13XwE_fwWRDgvMV3pha351_iB7xlbhuwA9BHyN2YsW0H3KoKmE4mufhnVlBw8JOwjnI3LyadXPeu6dIgVnaSP6P8PSutsYrAooCbF_mDX8jhyfQH6rSvQSsWO7h5ffJ_FdiEuvf7GAuDT6WTkuI11CF07srnAmxA8v1c-Wg0XkZug/s2140/graph_linear.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1196" data-original-width="2140" height="224" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvcxulgukqaJy9I13XwE_fwWRDgvMV3pha351_iB7xlbhuwA9BHyN2YsW0H3KoKmE4mufhnVlBw8JOwjnI3LyadXPeu6dIgVnaSP6P8PSutsYrAooCbF_mDX8jhyfQH6rSvQSsWO7h5ffJ_FdiEuvf7GAuDT6WTkuI11CF07srnAmxA8v1c-Wg0XkZug/s2140/graph_linear.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Basis Functions generated in Octave<br /></td></tr></tbody></table><p></p><p>The resulting graph is not as smooth as in the original paper, so I decided to try a different initial value for the objective function. I chose a normalized Color Matching Function (CMF):</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjoVLRu0QKnxRdpvGSlNEVbzgWLZ3_2M0VbSMULIdlGSvocC3ZO0zc5EZfqrVZBOoX6C6VKtLUkoNsardOs2Gk3d2H2FpYy5gdbXSbIBbBkzwKK_dLvtuJvW0umzWD_XD7SqowhUXncNjdtfvLog3D6lTIUCeu6molQ_gx62HrldDRLVVkUhykZDGdPrw" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1196" data-original-width="2140" height="224" src="https://blogger.googleusercontent.com/img/a/AVvXsEjoVLRu0QKnxRdpvGSlNEVbzgWLZ3_2M0VbSMULIdlGSvocC3ZO0zc5EZfqrVZBOoX6C6VKtLUkoNsardOs2Gk3d2H2FpYy5gdbXSbIBbBkzwKK_dLvtuJvW0umzWD_XD7SqowhUXncNjdtfvLog3D6lTIUCeu6molQ_gx62HrldDRLVVkUhykZDGdPrw" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Basis Functions generated with normalized CMF initial value<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEh6guL7uvKHKFq-uPxDE9oPYf7qKhH5Awfy8x3R_JHbQ46BuCol7nx5oeFTB-QqNN3huUynOgnMaRX7PQz7N9j82gJaRWaZNd03gv8BkM7fYG4ERffPNhsz_r7ntXBuFI66xVsFeIyid26MH1SFuEEN4LkzcgUSSnYW4-TGEX8PPPUVrUhGmwvFiT86UQ" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="324" data-original-width="503" height="206" src="https://blogger.googleusercontent.com/img/a/AVvXsEh6guL7uvKHKFq-uPxDE9oPYf7qKhH5Awfy8x3R_JHbQ46BuCol7nx5oeFTB-QqNN3huUynOgnMaRX7PQz7N9j82gJaRWaZNd03gv8BkM7fYG4ERffPNhsz_r7ntXBuFI66xVsFeIyid26MH1SFuEEN4LkzcgUSSnYW4-TGEX8PPPUVrUhGmwvFiT86UQ=w320-h206" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Code for generating normalized CMF initial value<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
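<p>For clarity, the up-sampling step itself is just a linear combination of the three precomputed basis functions. Below is a minimal NumPy sketch of this reconstruction; the sampling range and the zero-filled basis arrays are placeholders (the real values come from the optimization above), so treat it as an illustration rather than the paper's code:</p>
<pre>
import numpy as np

# Placeholder sampling: 36 samples over [380, 730] nm at 10 nm steps.
wavelengths = np.arange(380, 740, 10)
basis = np.zeros((36, 3))   # columns: R, G, B basis functions (placeholder)

def upsample_srgb(rgb_linear):
    """Reconstruct a spectral reflectance as a linear combination of
    the three precomputed basis functions."""
    # With non-negative bases that sum to 1 at every wavelength (the
    # paper's constraint), rgb values in [0, 1] yield a physically
    # plausible reflectance in [0, 1].
    return basis @ np.asarray(rgb_linear, dtype=float)
</pre>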
<p></p><p>The resulting curves look smoother with the normalized CMF as the initial value. Also, during the porting process, I switched to the CIE 2006 2° observer CMF instead of the CIE 1931 / CIE 2006 10° observer CMFs used in the original source code.</p><p><br /></p><p><span style="font-size: large;"><b>Working with wider gamut</b></span></p><p>The next step is to change the color primaries from sRGB to Display-P3 (which the original source listed as infeasible). As expected, the result is not good: not only can saturated colors not be up-sampled, but colors within the sRGB gamut are also not similar to the original colors, and saturated red gains an orange tint after up-sampling. (Note that the images below have a Display-P3 color profile attached; a wide gamut monitor is needed to view the saturated colors outside the sRGB gamut.)<br /></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_0_1_sRGB.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="371" data-original-width="800" height="185" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_0_1_sRGB.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Up-sampled saturated sRGB color<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_0_1_P3.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="373" data-original-width="800" height="186" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_0_1_P3.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Up-sampled saturated P3 color</td></tr></tbody></table>
</td>
</tr>
</tbody></table><p>So I tried modifying the objective function <i>opt_fn()</i> used in <a href="https://octave.sourceforge.io/octave/function/sqp.html"><b>sqp()</b></a> to include a weight that minimizes the color difference of the sRGB primaries (a sketch of this weighted objective follows the result figures below):</p><p></p>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjK7cxcz-nwo5ezHfyXMIU7mfdU2qCSVOcc5-_FlFwM5P7Tju8qVW7pC8H8uIHLARJC9qouYND8-DpsQw-YeGhyCv5DysvtqblZq3gJg-jGygNg2VjAAu3894x32MA2JAzQsf9rhFDWAMQo7VOd733p_j6nqT2bQ_rc2duqEWX9bs0YgE3E7rBIqYj-WQ" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1088" data-original-width="1484" height="470" src="https://blogger.googleusercontent.com/img/a/AVvXsEjK7cxcz-nwo5ezHfyXMIU7mfdU2qCSVOcc5-_FlFwM5P7Tju8qVW7pC8H8uIHLARJC9qouYND8-DpsQw-YeGhyCv5DysvtqblZq3gJg-jGygNg2VjAAu3894x32MA2JAzQsf9rhFDWAMQo7VOd733p_j6nqT2bQ_rc2duqEWX9bs0YgE3E7rBIqYj-WQ" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Code snippet of the objective function with sRGB primaries weight<br /></td></tr></tbody></table><p></p><p>The result improves a bit and the up-sampled saturated red has less of an orange tint:</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_0_1_opt_sRGB.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="368" data-original-width="800" height="184" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_0_1_opt_sRGB.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Up-sampled saturated sRGB color with sRGB primaries weight<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_0_1_opt_P3.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="365" data-original-width="800" height="183" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_0_1_opt_P3.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Up-sampled saturated P3 color with sRGB primaries weight</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
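<p>For reference, the shape of the modified objective is roughly the following. This is a Python-flavored sketch of the idea in the screenshot above, not the actual Octave code: the placeholder arrays, the <i>srgb_in_p3</i> conversion and the weight value are all illustrative assumptions:</p>
<pre>
import numpy as np

N = 36
cmf = np.ones((N, 3))        # placeholder: CMF samples (x, y, z columns)
illum = np.ones(N)           # placeholder: D65 SPD samples
rgb_from_xyz = np.eye(3)     # placeholder: XYZ to working-space (P3) RGB
srgb_in_p3 = np.eye(3)       # placeholder: sRGB primaries expressed in P3
W_SRGB = 10.0                # illustrative weight, not the real value

def round_trip(rgb, B):
    """Up-sample rgb with basis matrix B (N x 3), light it with the
    illuminant, integrate against the CMF, and convert back to RGB."""
    xyz = cmf.T @ ((B @ rgb) * illum)
    xyz = xyz / (cmf[:, 1] @ illum)   # normalize: perfect white has Y = 1
    return rgb_from_xyz @ xyz

def opt_fn(B_flat):
    B = B_flat.reshape(N, 3)
    # Base term: round-trip error of the working-space primaries.
    err = sum(np.sum((round_trip(c, B) - c) ** 2) for c in np.eye(3))
    # Extra term: heavily penalize round-trip error of the sRGB primaries.
    err += W_SRGB * sum(np.sum((round_trip(p, B) - p) ** 2)
                        for p in srgb_in_p3)
    return err
</pre>
<p>In Octave this is the function handed to <b>sqp()</b>; a SciPy analogue would be <i>scipy.optimize.minimize(opt_fn, x0, method='SLSQP')</i> with the [0, 1] bound constraints on the basis samples.</p>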
<p></p><p>Up to this point, all the precomputed spectral primary basis functions are within the [0, 1] range (i.e. each basis function does not reflect more light than it receives). I was wondering what would happen if we relaxed this constraint and enforced the limit only after linearly combining all the basis functions. I tried relaxing the range of the individual basis functions to [-0.05, 1.05], [-0.075, 1.075] and [-0.1, 1.1] (details can be found in the visualization website from the <a href="https://drive.google.com/file/d/16FwiLBZtGC8Mq2-P5tkFHVUwUuSL9jzE/view?usp=share_link">modified source code</a>). With the relaxed range, we can get a very similar sRGB color after up-sampling:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_n01_p11_sRGB.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="365" data-original-width="800" height="183" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_n01_p11_sRGB.png" width="400" /></a></div>However, saturated Display-P3 colors still cannot be up-sampled exactly; we can only achieve slightly more saturated colors compared to sRGB:<p></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_n01_p11_P3.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="349" data-original-width="800" height="175" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/upsample_n01_p11_P3.png" width="400" /></a></div><p></p><p>The up-sampled saturated red shows a visible difference from the original color before up-sampling. I tried modifying the objective function to optimize only the red basis function (ignoring the green and blue basis functions), and still could not get an exact up-sampled saturated red under a D65 light source. Maybe it is impossible to produce the most saturated Display-P3 red under a D65 light source without violating the physical constraint.<br /></p><p>Out of curiosity, I plotted the chromaticity diagram of the up-sampled colors (a sketch of this computation follows the diagrams below). The result shows that, using the limited [0, 1] range, the up-sampling process can produce "more colors" (though not accurately; e.g. red is up-sampled to "orange-red"), while using the relaxed constraint reduces the up-sampled color gamut.</p><table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjkQfw1SOgdnpuDad01ev1bHGEBfYBp2fw3qIOxjZF9UoiUCWFynCCrjWhdIAiCFrMIt-RFt2mtXYRO6qKq4oBZ28eWKuMhkwp9RxOA5DtPTDVsrKAgOoMpwJ5G_BA7sCmWt24WlJzgEgDBU4_fdNW_SS515Mn_FlHVZufA4M9vmZLYwySSogYBmLBkww" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1042" data-original-width="1042" height="240" src="https://blogger.googleusercontent.com/img/a/AVvXsEjkQfw1SOgdnpuDad01ev1bHGEBfYBp2fw3qIOxjZF9UoiUCWFynCCrjWhdIAiCFrMIt-RFt2mtXYRO6qKq4oBZ28eWKuMhkwp9RxOA5DtPTDVsrKAgOoMpwJ5G_BA7sCmWt24WlJzgEgDBU4_fdNW_SS515Mn_FlHVZufA4M9vmZLYwySSogYBmLBkww" width="240" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Chromaticity diagram of up-sampled color using limited [0, 1] range<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEg-DuWVpSuRJiyFZfAAMoS8dXqJZpuWmK7u1ayApTaMXg_BRjuM91sG6u3H95PTj4PnCtwD18T886PWuyMB5gIEO8qqQK3is1d6VsD9CuTb91-78jUD_egS3h_cyAjO5pNHiqb2EjVjIbI8v1L5L-ohJrwUghTfdv4S6vHx8WwvRm3WzcLYnfZGa7KTMw" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1042" data-original-width="1042" height="240" src="https://blogger.googleusercontent.com/img/a/AVvXsEg-DuWVpSuRJiyFZfAAMoS8dXqJZpuWmK7u1ayApTaMXg_BRjuM91sG6u3H95PTj4PnCtwD18T886PWuyMB5gIEO8qqQK3is1d6VsD9CuTb91-78jUD_egS3h_cyAjO5pNHiqb2EjVjIbI8v1L5L-ohJrwUghTfdv4S6vHx8WwvRm3WzcLYnfZGa7KTMw" width="240" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Chromaticity diagram of up-sampled color using relaxed [-0.1, 1.1] range</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
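<p>The chromaticity diagrams above are straightforward to reproduce: integrate each up-sampled spectrum against the CMF and project XYZ to xy. A minimal sketch (reusing the same kind of placeholder <i>cmf</i>/<i>illum</i> arrays as before):</p>
<pre>
import numpy as np

def chromaticity(spectrum, cmf, illum):
    """xy chromaticity of a reflectance spectrum under an illuminant:
    (x, y) = (X, Y) / (X + Y + Z)."""
    xyz = cmf.T @ (spectrum * illum)
    return xyz[:2] / np.sum(xyz)
</pre>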
<p></p><p><br /></p><p><span style="font-size: large;"><b>CMF Reference White</b></span></p><p>Up to this point, the calculation for the up-sampled color uses D65 as the reference white. But one day, I saw <a href="https://twitter.com/troy_s/status/1554567074281754624/photo/1">this tweet</a>:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjay0Ag5vOzQK6f2IvvMRi6bQlojCh40G5UqKoHWfcuIzmdGi6t9d8DhMwpT_dnyN8aiN5zUoyJ73NBa3HEcR8PjP6TYCoJo7MKNQTll38TlzjNPo2RDfAAJOp3LlQfpGO2A2LqqPqkOfrfJ-JQwspzrsHqMgGYISQiOyA009ac8LaYwibOxOb89OFU_g" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="296" data-original-width="1751" height="109" src="https://blogger.googleusercontent.com/img/a/AVvXsEjay0Ag5vOzQK6f2IvvMRi6bQlojCh40G5UqKoHWfcuIzmdGi6t9d8DhMwpT_dnyN8aiN5zUoyJ73NBa3HEcR8PjP6TYCoJo7MKNQTll38TlzjNPo2RDfAAJOp3LlQfpGO2A2LqqPqkOfrfJ-JQwspzrsHqMgGYISQiOyA009ac8LaYwibOxOb89OFU_g" width="640" /></a></div><p>The CMF uses an <a href="http://yuhaozhu.com/blog/cmf.html">equal-energy white as its reference white</a>, so I was wondering whether all my calculations were wrong and whether I should add chromatic adaptation after CMF integration.<br /></p><p>So I decided to find spectral reflectance data for a color checker and integrate it with the CMF, to verify whether chromatic adaptation is needed after CMF integration. Using the color checker data found <a href="https://babelcolor.com/colorchecker-2.htm">here</a>, illuminating the grey patches with D65 and integrating the result with the CMF gives the following results (a sketch of this integration follows the figures below): <br /></p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEgKGCBK9rJyA1trb3rSYgFPkAHqhRPXkbf_sStnsi_1_eWWUeFIj42Jgxpr9qQzrW8Vl63oIohEKVGeAjD-uujQ84U2Ijg62SL1XAAmaYTJ3Gwey62L2autr1JJK42XL60jWNDpqEvsfyM-DBgZKHiSpZTfBqYRsj4wHs2nWXkJwebQOEXxf0q2O0Pj9A" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="535" data-original-width="1307" height="164" src="https://blogger.googleusercontent.com/img/a/AVvXsEgKGCBK9rJyA1trb3rSYgFPkAHqhRPXkbf_sStnsi_1_eWWUeFIj42Jgxpr9qQzrW8Vl63oIohEKVGeAjD-uujQ84U2Ijg62SL1XAAmaYTJ3Gwey62L2autr1JJK42XL60jWNDpqEvsfyM-DBgZKHiSpZTfBqYRsj4wHs2nWXkJwebQOEXxf0q2O0Pj9A" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Illuminating grey patches with D65, integrate with CMF without CAT from Illuminant E </td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhZvdrQce7Y9t-E9o-LITLOi6ADJ5vniuFjDc-2lrWr2IhRLOtMB3K_HiqINdTeO7vO5aOuaKto5sV1zXzMf2WX_3IbW3q7cVV5zgXMYmzdQLzWHp9mZmlugjyHQNfZUYYCGbN_XacAwTdhH9KkXRRxpcaxfK3bItUGA10TVLun1syCgJXHyOc1Vy0qxg" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="139" data-original-width="255" height="109" src="https://blogger.googleusercontent.com/img/a/AVvXsEhZvdrQce7Y9t-E9o-LITLOi6ADJ5vniuFjDc-2lrWr2IhRLOtMB3K_HiqINdTeO7vO5aOuaKto5sV1zXzMf2WX_3IbW3q7cVV5zgXMYmzdQLzWHp9mZmlugjyHQNfZUYYCGbN_XacAwTdhH9KkXRRxpcaxfK3bItUGA10TVLun1syCgJXHyOc1Vy0qxg=w200-h109" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">sRGB value of measured Color Checker (2005)<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
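<p>The verification itself is just a per-patch product-and-sum. A sketch of the computation, assuming BabelColor's 380-730 nm, 10 nm sampling (the patch/illuminant/CMF arrays are placeholders, and the gamma encode is a rough approximation of the sRGB transfer function):</p>
<pre>
import numpy as np

N = 36                   # 380-730 nm at 10 nm steps
patch = np.ones(N)       # placeholder: measured grey patch reflectance
d65 = np.ones(N)         # placeholder: D65 SPD samples
cmf = np.ones((N, 3))    # placeholder: CMF samples (x, y, z columns)

# XYZ to linear sRGB (D65) matrix.
XYZ_TO_SRGB = np.array([[ 3.2404542, -1.5371385, -0.4985314],
                        [-0.9692660,  1.8760108,  0.0415560],
                        [ 0.0556434, -0.2040259,  1.0572252]])

# Integrate the lit patch against the CMF, normalized so that a perfect
# reflector has Y = 1, then convert straight to linear sRGB with NO
# chromatic adaptation from Illuminant E.
xyz = cmf.T @ (patch * d65) / (cmf[:, 1] @ d65)
rgb = np.clip(XYZ_TO_SRGB @ xyz, 0.0, 1.0) ** (1.0 / 2.2)  # rough encode
</pre>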
<p></p><p>Our computed sRGB values are very similar to the measured data, so it seems we don't need an extra chromatic adaptation step to adapt the color from the CMF's equal-energy reference white (please let me know if my math is incorrect).<br /></p><p><span style="font-size: large;"><b>Optimizing the up-sampling function with Color Checker data</b></span><br /></p><p>After working with the color checker data, I came up with the idea of modifying the spectral basis objective function to include a weight that biases it towards matching the Neutral 6.5 grey patch spectral reflectance data (a sketch of the extra objective term follows the figures below). With this, we get a decent match for the up-sampled spectral reflectance of the color checker grey patches (i.e. white 9.5, neutral 8, neutral 6.5, neutral 5, neutral 3.5, neutral 2). <br /></p><table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjXLERPJeG-M7nFzPazYq-r7Ypw8_tkI-SjKTDsGLWsyGk_A2cbi1JLOCGhOWVhOYx8aaO1pY3JKEC7RGzTwFcslBOYlWLWNgJzgRFlklXoI1VVcuP9VTLc2RtZKXg_3Vna4bAmwlarxJg9X6NJWtDyaMplfXS5HKUhsXQ5MVdFZuasSjsR1qPJZRUUzQ" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1201" data-original-width="2138" height="180" src="https://blogger.googleusercontent.com/img/a/AVvXsEjXLERPJeG-M7nFzPazYq-r7Ypw8_tkI-SjKTDsGLWsyGk_A2cbi1JLOCGhOWVhOYx8aaO1pY3JKEC7RGzTwFcslBOYlWLWNgJzgRFlklXoI1VVcuP9VTLc2RtZKXg_3Vna4bAmwlarxJg9X6NJWtDyaMplfXS5HKUhsXQ5MVdFZuasSjsR1qPJZRUUzQ" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Spectral Basis computed for Display-P3<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEjW1UTdcDNufWdLK4ej7I_sQ3xt3aqPjImDHt6V2vfnTn2MoBo91CyPX9sBjKtImUG-Ucw99o6u72fK6XwnZ4QxOfNRoEzasd5sOOqM3FyVfcc7Eu8f6LqbeHl7usiXbleRnrY8FYFqf1kD02o_g_y8ZnqG-BSTgQPjg3shXJdUbg3sKC7yUVm0qSercQ" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1201" data-original-width="2138" height="180" src="https://blogger.googleusercontent.com/img/a/AVvXsEjW1UTdcDNufWdLK4ej7I_sQ3xt3aqPjImDHt6V2vfnTn2MoBo91CyPX9sBjKtImUG-Ucw99o6u72fK6XwnZ4QxOfNRoEzasd5sOOqM3FyVfcc7Eu8f6LqbeHl7usiXbleRnrY8FYFqf1kD02o_g_y8ZnqG-BSTgQPjg3shXJdUbg3sKC7yUVm0qSercQ" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Spectral Basis weighted with Neutral 6.5 color checker patch<br /></td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEiaD8JFs6m2naRSAcOMu9xmotwpk4tk0wKAdk2sYzyO-tmBUvStEVoH9xGcR7WIyqvCMb9kbdbfRgLyAPD7mjcIa5YguxERDIvx5MkPG6dDxpmxfLJPf1kPwqfmszZiv9FSqogdeAAyrfOpGwHeODFAyvbgZoowIC_VXz-RyRoFfGUNUcjfT-Q4b_50Lg" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1194" data-original-width="2144" height="178" src="https://blogger.googleusercontent.com/img/a/AVvXsEiaD8JFs6m2naRSAcOMu9xmotwpk4tk0wKAdk2sYzyO-tmBUvStEVoH9xGcR7WIyqvCMb9kbdbfRgLyAPD7mjcIa5YguxERDIvx5MkPG6dDxpmxfLJPf1kPwqfmszZiv9FSqogdeAAyrfOpGwHeODFAyvbgZoowIC_VXz-RyRoFfGUNUcjfT-Q4b_50Lg" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Up-sampled spectral reflectance of color checker grey patches using Spectral Basis computed for Display-P3</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEhvoZSgoDURb_r8Qqd3bNDGh2kDCZDoVy6nuXvE2x9KKFBqH-tIeh-LgGCYfGSkchxIfZ_8IY3EVJtBVMAO02vhBN0tP9EL9fjQycewCZh8lps2_iXCfnyMGevxrfexDy94CQnFGP_RBIZdOo7n-a-0ddrAsV_U5stGGe3IwXwyVSTSE7NewAxtbRyn5g" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1194" data-original-width="2144" height="178" src="https://blogger.googleusercontent.com/img/a/AVvXsEhvoZSgoDURb_r8Qqd3bNDGh2kDCZDoVy6nuXvE2x9KKFBqH-tIeh-LgGCYfGSkchxIfZ_8IY3EVJtBVMAO02vhBN0tP9EL9fjQycewCZh8lps2_iXCfnyMGevxrfexDy94CQnFGP_RBIZdOo7n-a-0ddrAsV_U5stGGe3IwXwyVSTSE7NewAxtbRyn5g" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Up-sampled spectral reflectance of color checker grey patches using Spectral Basis weighted with Neutral 6.5 patch data<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
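<p>In rough form, the extra term added to the objective looks like the following sketch; <i>neutral65</i> and its linear RGB value are placeholders for the measured Neutral 6.5 data, and the weight is an illustrative value, not the one I actually used:</p>
<pre>
import numpy as np

N = 36
neutral65 = np.ones(N)     # placeholder: measured Neutral 6.5 reflectance
neutral65_rgb = np.array([0.5, 0.5, 0.5])   # placeholder: its linear RGB
W_GREY = 5.0               # illustrative weight

def grey_patch_term(B):
    """Extra objective term: distance between the up-sampled spectrum of
    the Neutral 6.5 patch and its measured reflectance."""
    return W_GREY * np.sum((B @ neutral65_rgb - neutral65) ** 2)
</pre>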
<p></p><p>However, the up-sampled white color will have a slight round-trip error:<br /></p><div class="separator" style="clear: both; text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/color_checker_white_weight_6_5.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="123" data-original-width="800" height="99" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/studying_spectral_primary_decomposition/color_checker_white_weight_6_5.png" width="640" /></a></div><p></p><p><br /></p><p><span style="font-size: large;"><b>Conclusion</b></span></p><p>In this post, I ported the original "Spectral Primary Decomposition" source code to Octave and tried to modify it to up-sample Display-P3 colors, but the result is not very good. Also, within a game engine we usually have exposure and tone mapping adjustments, which affect the final pixel color, so I was wondering whether the up-sampling method should take those parameters into account. Doing so, however, would change the meaning of the texture colors so that they no longer match the PBR albedo textures. I will leave this for future investigation.</p><p><br /></p><p><b>References</b></p><p><span style="font-size: x-small;">[1] <a href="https://graphics.geometrian.com/research/spectral-primaries.html">https://graphics.geometrian.com/research/spectral-primaries.html</a></span></p><p><span style="font-size: x-small;">[2] <a href="http://yuhaozhu.com/blog/cmf.html">http://yuhaozhu.com/blog/cmf.html</a></span></p><p><span style="font-size: x-small;">[3] <a href="https://babelcolor.com/colorchecker-2.htm">https://babelcolor.com/colorchecker-2.htm</a></span><br /></p><p><br /></p><p></p><p><span style="font-size: large;"><b>Color Matching Function Comparison</b></span></p><p><b><span style="font-size: large;">Introduction</span></b></p><p>When performing spectral rendering, we need a Color Matching Function (CMF) to convert the spectral radiance to XYZ values, which are then converted to RGB for display. Different people perceive color slightly differently, and <a href="http://files.cie.co.at/873_CIE%20Research%20Strategy%20%28August%202016%29%20-%20Topic%206.pdf">age may also affect how colors are perceived</a>. So the <a href="https://en.wikipedia.org/wiki/International_Commission_on_Illumination">CIE</a> defines several standard observers for an average person. The commonly used CMFs are the CIE 1931 2° Standard Observer and the CIE 1964 10° Standard Observer. Besides these two, other CMFs also exist, such as the <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space#Similar_color_spaces">Judd and Vos modified CIE 1931 2° CMF</a> and the <a href="http://www.cvrl.org/ciexyzpr.htm">CIE 2006 CMF</a>. In this post, I will compare images rendered with different CMFs (as well as some analytical approximations). 
A demo can be downloaded <a href="https://drive.google.com/file/d/1l4RECRtmaFds_0Gwx5ccwQuYR8v7Pt_o/view?usp=sharing">here</a> (the demo renders using wavelengths between [380, 780] nm, which may introduce some error with CMFs that cover a larger range).<br /></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGeFfpdLxVzxBCvYuNLymOYcZNhdIP6JLcj9RUnaknXUZlaGtuQ0cJwt9p5OZ3vFLm5CLrczV-mgx-B_YYqU6utRBoAk4YGFtkV57D_u9Y2eHuW2UvcDEELz91GO30of9nTIoB-HW3_OHQ/s16000/main.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="694" data-original-width="1362" height="326" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjGeFfpdLxVzxBCvYuNLymOYcZNhdIP6JLcj9RUnaknXUZlaGtuQ0cJwt9p5OZ3vFLm5CLrczV-mgx-B_YYqU6utRBoAk4YGFtkV57D_u9Y2eHuW2UvcDEELz91GO30of9nTIoB-HW3_OHQ/w640-h326/main.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Left: rendered with CIE2006 CMF<br />Right: rendered with CIE1931 CMF <br /></td></tr></tbody></table><p></p><p></p><p><b><span style="font-size: large;">CMF Luminance</span></b><br /></p><p>When I was implementing the different CMFs in my renderer, replacing the CMF directly resulted in slightly different brightness in the rendered images:</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigEzQFi54bDLMPLQ72uMPRYPiqxZpWIg2jHjpqMto2dEf-oPaIJfeWRARm17Vb4bY8ZdVMARPEYkOk5ylSRWaRNY1t9I_EzlqzNq9tLA3HjBmR5eEVk-qfd4_kIV8bzLvEW87vHXkr5b7K/s16000/brightness_1931.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigEzQFi54bDLMPLQ72uMPRYPiqxZpWIg2jHjpqMto2dEf-oPaIJfeWRARm17Vb4bY8ZdVMARPEYkOk5ylSRWaRNY1t9I_EzlqzNq9tLA3HjBmR5eEVk-qfd4_kIV8bzLvEW87vHXkr5b7K/w320-h170/brightness_1931.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Rendered with 1931 CMF<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYuuQHand1P9csZdrYloEiyz0YHMY6-IlMAQq2I1b8gTE-rVZWXwo4Q9dsKnyZ4Hx-qFSAiSKf57SCenBDIoct8oj4b3SRnyjFcPKFK-t3B3i-fMFkVLo0Cch3nrm4n1d5uV_9qv7L-2T6/s16000/brightness_1964.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYuuQHand1P9csZdrYloEiyz0YHMY6-IlMAQq2I1b8gTE-rVZWXwo4Q9dsKnyZ4Hx-qFSAiSKf57SCenBDIoct8oj4b3SRnyjFcPKFK-t3B3i-fMFkVLo0Cch3nrm4n1d5uV_9qv7L-2T6/w320-h170/brightness_1964.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Rendered with 1964 CMF<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
<p>This is because the renderer uses photometric units (e.g. lumen, lux...) to define the brightness of the light sources. Since the definition of <a href="https://en.wikipedia.org/wiki/Luminous_energy">luminous energy</a> depends on the luminosity function (usually the <i>y(λ)</i> of the CMF), we need to calculate the intensity of the light source with respect to the chosen CMF (a sketch of this conversion is given after the figures below). Using the correct luminosity function, both rendered images have similar brightness:<br /></p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigEzQFi54bDLMPLQ72uMPRYPiqxZpWIg2jHjpqMto2dEf-oPaIJfeWRARm17Vb4bY8ZdVMARPEYkOk5ylSRWaRNY1t9I_EzlqzNq9tLA3HjBmR5eEVk-qfd4_kIV8bzLvEW87vHXkr5b7K/s16000/brightness_1931.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigEzQFi54bDLMPLQ72uMPRYPiqxZpWIg2jHjpqMto2dEf-oPaIJfeWRARm17Vb4bY8ZdVMARPEYkOk5ylSRWaRNY1t9I_EzlqzNq9tLA3HjBmR5eEVk-qfd4_kIV8bzLvEW87vHXkr5b7K/w320-h170/brightness_1931.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Rendered with 1931 CMF</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPkHAPSwGRqIuaBkJYQPzbHZAgIzNccT4koIw4-GIxJhBijRIGaJonrQsDWgWwpTV9EqFTR6m7SZDjFwPJpEUZfh4OGEhBW78vbAcBzTCzFthV2dY2wap4xGCAsyOpP6EPE25IbjJX5T8l/s16000/brightness_1964_fix.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPkHAPSwGRqIuaBkJYQPzbHZAgIzNccT4koIw4-GIxJhBijRIGaJonrQsDWgWwpTV9EqFTR6m7SZDjFwPJpEUZfh4OGEhBW78vbAcBzTCzFthV2dY2wap4xGCAsyOpP6EPE25IbjJX5T8l/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Rendered with 1964 CMF + luminance adjustment<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
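<p>Concretely, a light defined in lumens has to be converted to spectral radiant power using the <i>y(λ)</i> of whichever CMF is in use, so the conversion factor changes with the CMF. A sketch of the adjustment (683 lm/W is the standard luminous efficacy constant; the SPD and CMF arrays are placeholders):</p>
<pre>
import numpy as np

N = 41                  # e.g. 380-780 nm at 10 nm steps
spd = np.ones(N)        # placeholder: relative SPD of the light source
y_bar = np.ones(N)      # placeholder: y(lambda) of the chosen CMF
d_lambda = 10.0         # sample spacing in nm

def spd_for_lumens(lumens):
    """Scale the relative SPD so the light emits the requested luminous
    flux under the chosen CMF's luminosity function."""
    lm = 683.0 * np.sum(spd * y_bar) * d_lambda
    return spd * (lumens / lm)
</pre>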
<p> </p><p><span style="font-size: large;"><b>CMF White Point</b></span></p><p>When using different CMFs, the white points of the standard illuminants will be <a href="https://en.wikipedia.org/wiki/Standard_illuminant#White_points_of_standard_illuminants">slightly different</a>:</p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPigFFk2zsX_Ham6hj5d8289yRvBGZ5iBXIRJVMBSl5che8P0AqZ6kwDReskARLJfyJnZ07rIdVorpnML-jiePQBIWxIPQfN32cJh2bdYIxYLL4p4WtFwSsHXheiUu3aMRlw1QSdcy_IiF/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="327" data-original-width="1012" height="206" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPigFFk2zsX_Ham6hj5d8289yRvBGZ5iBXIRJVMBSl5che8P0AqZ6kwDReskARLJfyJnZ07rIdVorpnML-jiePQBIWxIPQfN32cJh2bdYIxYLL4p4WtFwSsHXheiUu3aMRlw1QSdcy_IiF/w640-h206/white_point.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">White point from wikipedia<br /></td></tr></tbody></table><p></p><p>Since we are dealing with game textures, where colors are usually defined in sRGB with a D65 white point, we need to find the white point of the D65 illuminant for each CMF tested in this post. Unfortunately, I could not find the D65 white point for the CIE 2006 CMF on the internet, so I calculated it myself (the
calculation steps can be found in the <a href="https://colab.research.google.com/drive/1EIqkm9-Y6Sl0VpiOBSgXM4XlUxutwmGz?usp=sharing">Colab source code</a>): </p><p></p><blockquote><p>CIE 2006 2° : (0.313453, 0.330802) </p><p>CIE 2006 10° : (0.313786, 0.331275) </p></blockquote><p></p><p>But when I rendered some images with and without chromatic adaptation, the results look similar: </p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgw-UObfngj-QPoCHSExwkpX0Zxu215aARfF4cbD3ZklffMRkgshGlrIalA1v-bQsMOjwM97isg16bQrI8Min7ppGL9W8mvuvJQhCqUXVd-2lb9bTLiryGvBN05de7P4itimkX-tcpN3V4S/s16000/whitePt_1964_noCAT.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgw-UObfngj-QPoCHSExwkpX0Zxu215aARfF4cbD3ZklffMRkgshGlrIalA1v-bQsMOjwM97isg16bQrI8Min7ppGL9W8mvuvJQhCqUXVd-2lb9bTLiryGvBN05de7P4itimkX-tcpN3V4S/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">1964 CMF without chromatic adaptation</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBmaoQYdxUZf3WtovCynVUFjCfcBB6saKQPFZK6zLLLrglGMFWmJjERfJ6iegllptPD_9B1q8XjosHwJENvt4Wv-vcXT5urXkjqz0I5h2kzeeF7f7iFcl63Ab26pe7Pm91FV9go-LUIix2/s16000/whitePt_1964_withCAT.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBmaoQYdxUZf3WtovCynVUFjCfcBB6saKQPFZK6zLLLrglGMFWmJjERfJ6iegllptPD_9B1q8XjosHwJENvt4Wv-vcXT5urXkjqz0I5h2kzeeF7f7iFcl63Ab26pe7Pm91FV9go-LUIix2/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">1964 CMF with chromatic adaptation</td></tr></tbody></table>
</td>
</tr>
</tbody></table><p>I searched the internet but could not find any information on whether we need to chromatically adapt the rendered image to account for the different white points of different CMFs... Maybe this is because the difference is so small that applying chromatic adaptation makes no visible difference. </p><p><br /></p><p><span style="font-size: large;"><b>CIE 2006 CMF analytical approximation</b></span><br /></p><p>The popular CIE 1931 and 1964 CMFs have simple analytical approximations, such as those in <a href="http://jcgt.org/published/0002/02/01/paper.pdf">"Simple Analytic Approximations to the CIE XYZ Color Matching Functions"</a> (which will be tested in this post). The newer CIE 2006 CMF lacks such an approximation, so I derived one using similar methods; the curve fitting process can be found in the <a href="https://colab.research.google.com/drive/1EIqkm9-Y6Sl0VpiOBSgXM4XlUxutwmGz?usp=sharing">Colab source code</a> (the general multi-lobe form is sketched after the figures below).<br /></p><p></p><p>2006 2° lobe approximation:</p><p>
</p><table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwbzrG4RDJL80n3FkQ5gC7yoLpgRl-BsV1hJs-KxQmC9zohL_DJA8tAp6_EyeaAhsKOg0JZEgJYE73hIMytOPyXB3MSaC-jRIN1qgLK-CqHs8lkOss149tIvLOVSNDyZP_tNIHfrnaaEkg/s16000/approx_CMF_2006_2.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="269" data-original-width="1004" height="173" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwbzrG4RDJL80n3FkQ5gC7yoLpgRl-BsV1hJs-KxQmC9zohL_DJA8tAp6_EyeaAhsKOg0JZEgJYE73hIMytOPyXB3MSaC-jRIN1qgLK-CqHs8lkOss149tIvLOVSNDyZP_tNIHfrnaaEkg/s16000/approx_CMF_2006_2.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">2006 2° lobe approximation shader source code<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivQzQ31lYUyroNFPFEs_41byqEVgbiZtSzNhWoSP_Z02i1pAtVbggwreFvqCyNiNlLWfLeq9hEPbkkzn0Fk_E49bhQp2A0gBz3r59WNDI6Ag-tsrPtSrYgXjnkfd9NH1UvlPwSzXpgtR_R/s16000/CMF_2006_2.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="249" data-original-width="378" height="132" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivQzQ31lYUyroNFPFEs_41byqEVgbiZtSzNhWoSP_Z02i1pAtVbggwreFvqCyNiNlLWfLeq9hEPbkkzn0Fk_E49bhQp2A0gBz3r59WNDI6Ag-tsrPtSrYgXjnkfd9NH1UvlPwSzXpgtR_R/s16000/CMF_2006_2.png" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">black lines: exact 2006 2° CMF<br />color lines: approximated 2006 2° CMF<br /></td><td class="tr-caption" style="text-align: center;"> </td><td class="tr-caption" style="text-align: center;"><br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table><p>2006 10° lobe approximation:</p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglncfgAB5q9Nmebi3xgPMwo9yV9YJr4n3HtR26a6aSxR-afuoZYR_YM0RjaM26btPD_7OV3pU4wdX-DCjbNOx7OUGOb5qHlnMJcs2gFVmNvN7ehxA0cNQyTA40yQNYAvu3Rm1jdfmDChdp/s16000/approx_CMF_2006_10.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="223" data-original-width="1004" height="142" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglncfgAB5q9Nmebi3xgPMwo9yV9YJr4n3HtR26a6aSxR-afuoZYR_YM0RjaM26btPD_7OV3pU4wdX-DCjbNOx7OUGOb5qHlnMJcs2gFVmNvN7ehxA0cNQyTA40yQNYAvu3Rm1jdfmDChdp/s16000/approx_CMF_2006_10.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">2006 10° lobe approximation shader source code</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEhuyJwCp-UaLpIZYzl5ik2Q8k0CqIDIbXw4O7cu51O4qaJBcCRoShwtFlyGPb3n5Y2DnpGyGZbirjSUTwDyjvdgd2CAM-z_g3fQ28kSdPuvdIvMN-kTFvTPTvMvcTaM9LHtS5Uj6a5yNQ/s16000/CMF_2006_10.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="248" data-original-width="372" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEhuyJwCp-UaLpIZYzl5ik2Q8k0CqIDIbXw4O7cu51O4qaJBcCRoShwtFlyGPb3n5Y2DnpGyGZbirjSUTwDyjvdgd2CAM-z_g3fQ28kSdPuvdIvMN-kTFvTPTvMvcTaM9LHtS5Uj6a5yNQ/s16000/CMF_2006_10.png" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">black lines: exact 2006 10° CMF<br />color lines: approximated 2006 10° CMF</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
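<p>For reference, the fits in the cited paper (and my 2006 fits above) are sums of piecewise Gaussian lobes. Below is the published multi-lobe CIE 1931 fit from Wyman, Sloan and Shirley as a sketch of the general form; my fitted CIE 2006 coefficients are in the screenshots above and in the Colab notebook, not reproduced here:</p>
<pre>
import numpy as np

def lobe(x, mu, s1, s2):
    """Piecewise Gaussian: sigma is s1 below the peak, s2 above it."""
    s = np.where(x >= mu, s2, s1)
    return np.exp(-0.5 * ((x - mu) / s) ** 2)

# Multi-lobe CIE 1931 fit from "Simple Analytic Approximations to the
# CIE XYZ Color Matching Functions" (Wyman, Sloan, Shirley 2013).
def xyz_1931_multi_lobe(lam):
    x = (1.056 * lobe(lam, 599.8, 37.9, 31.0)
         + 0.362 * lobe(lam, 442.0, 16.0, 26.7)
         - 0.065 * lobe(lam, 501.1, 20.4, 26.2))
    y = (0.821 * lobe(lam, 568.8, 46.9, 40.5)
         + 0.286 * lobe(lam, 530.9, 16.3, 31.1))
    z = (1.217 * lobe(lam, 437.0, 11.8, 36.0)
         + 0.681 * lobe(lam, 459.0, 26.0, 13.8))
    return x, y, z
</pre>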
<p><b><span style="font-size: large;">Saturated lights comparison</span></b></p><p>With the above changes to the path tracer, we can render some images for comparison. A scene with several saturated lights using the sRGB colors (1,0,0), (1,1,0), (0,1,0), (0,1,1), (0,0,1), (1,0,1) is tested (the light colors are spectrally up-sampled). 10 different CMFs are used:</p><ul style="text-align: left;"><li>CIE 1931 2° </li><li>CIE 1931 2° with Judd Vos adjustment</li><li>CIE 1931 2° single lobe analytic approximation</li><li>CIE 1931 2° multi lobe analytic approximation</li><li>CIE 1964 10° </li><li>CIE 1964 10° single lobe analytic approximation</li><li>CIE 2006 2°</li><li>CIE 2006 2° lobe analytic approximation</li><li>CIE 2006 10°</li><li>CIE 2006 10° lobe analytic approximation</li></ul><p>Here are the results:</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg55au6DuK5e8YR3YbTASCFbtpcYnyAk6J4909IV6NpRcNj6sPNYYe8DfTKOUiImGNle9vYBfXfMaaHfiNLK-xSJv4iTryXun0-kphv6KIfL4VCammAs4CRS6X1FHQi9p9PC1nAA9RPk1Gk/s16000/saturated_lights_1931.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg55au6DuK5e8YR3YbTASCFbtpcYnyAk6J4909IV6NpRcNj6sPNYYe8DfTKOUiImGNle9vYBfXfMaaHfiNLK-xSJv4iTryXun0-kphv6KIfL4VCammAs4CRS6X1FHQi9p9PC1nAA9RPk1Gk/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1931 2°</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7OKo-HlXea3dDmYcBdflaM-IUvrGYvFEvLMv8Z2AWZsWsOutQyWtYkebRzd-XZamIRJ-pEY_lyuzQL0Q_s_PiUFAZZhhQJI1rqsU53qRxXPg42vBXK_G-JQ521E3-q8zBXzZ0s6qwU3rq/s16000/saturated_lights_1931_jv.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7OKo-HlXea3dDmYcBdflaM-IUvrGYvFEvLMv8Z2AWZsWsOutQyWtYkebRzd-XZamIRJ-pEY_lyuzQL0Q_s_PiUFAZZhhQJI1rqsU53qRxXPg42vBXK_G-JQ521E3-q8zBXzZ0s6qwU3rq/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1931 2° with Judd Vos adjustment</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaOlppOb-kwEd84kMRYmRjT-fMZmeGenQQ_BwYT5VmofjYRDwwoHDBAZd3XZQonSx-64TNUaoUjC2bLcPlm6EKCzcvOyAmGk1-Kczuoe2heE-II1kAMrR2nIRHx2ROguDIeRMlFPX5fgz3/s16000/saturated_lights_1931_single_lobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaOlppOb-kwEd84kMRYmRjT-fMZmeGenQQ_BwYT5VmofjYRDwwoHDBAZd3XZQonSx-64TNUaoUjC2bLcPlm6EKCzcvOyAmGk1-Kczuoe2heE-II1kAMrR2nIRHx2ROguDIeRMlFPX5fgz3/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1931 2° single lobe analytic approximation</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_r76f6QdZXYB1xVLYbaCAyymBtDqBwSjhW-TdcoKRPqSFijQZQ51lnCrDizOYNqNmp7F7wMR8C4OL4fAi4kicGxTGF1bx4BGAapXM7rLoUeaZgbxV0IDExJ0LyA7dDdRXNxxa9eSuC-JO/s16000/saturated_lights_1931_multi_lobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_r76f6QdZXYB1xVLYbaCAyymBtDqBwSjhW-TdcoKRPqSFijQZQ51lnCrDizOYNqNmp7F7wMR8C4OL4fAi4kicGxTGF1bx4BGAapXM7rLoUeaZgbxV0IDExJ0LyA7dDdRXNxxa9eSuC-JO/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1931 2° multi lobe analytic approximation</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjF5wOcEKB1-ZaEIpuYiIGzqZUK8mD5jfWXtbJ28Y7Qe242gsYrvA6h8LJm-IFzVnrZbYpfQ1AaKEhxrwFYyMse7v7I6MLCjf8nz-Xiu2bdDc65DukXjmU4gueKVMEMXgUjRMPnA6eSwhM3/s16000/saturated_lights_1964.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjF5wOcEKB1-ZaEIpuYiIGzqZUK8mD5jfWXtbJ28Y7Qe242gsYrvA6h8LJm-IFzVnrZbYpfQ1AaKEhxrwFYyMse7v7I6MLCjf8nz-Xiu2bdDc65DukXjmU4gueKVMEMXgUjRMPnA6eSwhM3/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1964 10°</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbU6TtSf-JYS6tZ0b4ZLl1oePXAPG-U3QEfEosCV7qEQ5uPPsV9jc1w934mecumEXUTcdwo-9PGE4k9R5hcMph1rbzv8BWPW1ATiuav0vtysMg5Xvwt5lPPZpiV7QoEt2RK6hZOKcccj6q/s16000/saturated_lights_1964_single_lobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbU6TtSf-JYS6tZ0b4ZLl1oePXAPG-U3QEfEosCV7qEQ5uPPsV9jc1w934mecumEXUTcdwo-9PGE4k9R5hcMph1rbzv8BWPW1ATiuav0vtysMg5Xvwt5lPPZpiV7QoEt2RK6hZOKcccj6q/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1964 10° single lobe analytic approximation</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjatB6phdpV33tD4ub_5Pv3j2Q53Kr1jFIElV_H2QL8KJ2J_07SKB0OW5xjFhN-fqfG8y8DyYaQrlXrHMq_LJ_Hy7qYhxgAzGNwhuvUgdQxORXaiagytuZx9ukySH7JVNm-QXZCtkYRNOq2/s16000/saturated_lights_2006_2.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjatB6phdpV33tD4ub_5Pv3j2Q53Kr1jFIElV_H2QL8KJ2J_07SKB0OW5xjFhN-fqfG8y8DyYaQrlXrHMq_LJ_Hy7qYhxgAzGNwhuvUgdQxORXaiagytuZx9ukySH7JVNm-QXZCtkYRNOq2/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 2006 2°</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidQs2cCLNPfq75qKhbHXJdX50gzi9ICLdjelpqdInIA91R4R851G6K-7inDT8WCSxhuAy3t_pHLJ48gmPjOJWyd09wuwh6_UKadMZv1sew4sKv4REmZ64oJxB8fTXT5W6NewUdSCSAq-3h/s16000/saturated_lights_2006_2_lobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEidQs2cCLNPfq75qKhbHXJdX50gzi9ICLdjelpqdInIA91R4R851G6K-7inDT8WCSxhuAy3t_pHLJ48gmPjOJWyd09wuwh6_UKadMZv1sew4sKv4REmZ64oJxB8fTXT5W6NewUdSCSAq-3h/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 2006 2° lobe analytic approximation</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkuczkJI3rR5v3MbETG9h-g7rFeHpGHAhnT3nFRFslp6rpO1NbwVL-DTgp9SQzxXf1kUAUnUkbtjmc2-hMVlA4lkhueCXl23bcpbrOMPg2DYjGzB-UIl2HiBVTMTfg4yD2LWzRGbSko1-m/s16000/saturated_lights_2006_10.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkuczkJI3rR5v3MbETG9h-g7rFeHpGHAhnT3nFRFslp6rpO1NbwVL-DTgp9SQzxXf1kUAUnUkbtjmc2-hMVlA4lkhueCXl23bcpbrOMPg2DYjGzB-UIl2HiBVTMTfg4yD2LWzRGbSko1-m/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 2006 10°</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiIB5I4sQrc7ZQ2iD3qklfpqtWkSt_JTlTtpT9ITVbAM3v9hAjw1OrriGwyTL_-K-KYN7LQcJN-RqghDBVHZPF5ZuC7aaCwZ79Svi94XkkSD13sAVzsF4Fe1H8OlDRwHQ47JsK4v4Eh22u/s16000/saturated_lights_2006_10_lobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjiIB5I4sQrc7ZQ2iD3qklfpqtWkSt_JTlTtpT9ITVbAM3v9hAjw1OrriGwyTL_-K-KYN7LQcJN-RqghDBVHZPF5ZuC7aaCwZ79Svi94XkkSD13sAVzsF4Fe1H8OlDRwHQ47JsK4v4Eh22u/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 2006 10° lobe analytic approximation</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
<p></p><p></p><p>From <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space#Similar_color_spaces">Wikipedia</a>:</p><p></p><blockquote>"<i>The CIE 1931 CMF is known to underestimate the contribution of the shorter blue wavelengths.</i>"</blockquote><p></p><p>So I was expecting some variation in the blue color when using different CMFs. But to my surprise, only the CIE 1931 CMF suffers from the <strike><i><a href="http://www.brucelindbloom.com/index.html?UPLab.html">“Blue Turns Purple” Problem</a></i></strike> (Edited: As pointed out by <a href="https://twitter.com/troy_s/status/1421837070348087298">troy_s on twitter</a>, the reference I provided was wrong; the link talks about a psychophysical effect, while the current issue is a mishandling of light data) which we encountered in <a href="https://simonstechblog.blogspot.com/2020/03/dxr-path-tracer.html">previous</a> <a href="https://simonstechblog.blogspot.com/2021/06/implementing-gamut-mapping.html">posts</a> (i.e. a saturated sRGB blue light renders as purple). Originally, after the previous blog post, I investigated this issue and suspected that the ACES tone mapper caused the color shift (as this issue does not happen when rendering in the narrow sRGB gamut with a Reinhard tone mapper). I thought maybe we could use the OKLab color space to get the hue value before tone mapping, and tone map only the lightness to keep the blue color (a sketch of this hue check is given after the figures below). But when I tried this approach, the hue obtained before tone mapping was still purple, which suggests it may not be the tone mapper causing the issue (or somehow my method of getting the hue from an HDR value is wrong...). Having no idea how to solve the issue, I randomly toggled some debug view modes and accidentally found that some of the purple colors are actually inside my AdobeRGB monitor's display gamut (but outside the sRGB gamut on another monitor...), so the problem is not only caused by out-of-gamut colors producing the purple shift... </p><table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/color_matching_function_comparison/blue_purple.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="2090" data-original-width="3807" height="176" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/color_matching_function_comparison/blue_purple.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The purple color on the wall is within displayable Adobe RGB gamut<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/color_matching_function_comparison/blue_purple_outOfGamut.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="2090" data-original-width="3807" height="176" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/color_matching_function_comparison/blue_purple_outOfGamut.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Highlighting out of gamut pixel with cyan color<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
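<p>For completeness, here is the hue check mentioned above: linear sRGB to OKLab (matrices from Björn Ottosson's OKLab reference implementation), with the hue taken as the angle of (a, b). This is a sketch of my debugging approach, not renderer code:</p>
<pre>
import numpy as np

# Linear sRGB -> LMS and non-linear LMS -> OKLab matrices, from the
# OKLab reference implementation.
SRGB_TO_LMS = np.array([[0.4122214708, 0.5363325363, 0.0514459929],
                        [0.2119034982, 0.6806995451, 0.1073969566],
                        [0.0883024619, 0.2817188376, 0.6299787005]])
LMS_TO_OKLAB = np.array([[0.2104542553,  0.7936177850, -0.0040720468],
                         [1.9779984951, -2.4285922050,  0.4505937099],
                         [0.0259040371,  0.7827717662, -0.8086757660]])

def oklab_hue(rgb_linear):
    """Hue angle (radians) of a linear sRGB color in OKLab; also works
    on HDR values since OKLab has no built-in range limit."""
    lms = SRGB_TO_LMS @ np.asarray(rgb_linear, dtype=float)
    lab = LMS_TO_OKLAB @ np.cbrt(lms)
    return np.arctan2(lab[2], lab[1])   # atan2(b, a)
</pre>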
<p>So I decided to investigate the problem for the spectral renderer first (and ignore the RGB renderer), which is why I tested different CMFs in this blog post. (As a side note, the behavior of the blue-turns-purple problem is a bit different between the RGB and spectral renderers: using a more saturated blue, e.g. (0, 0, 1) in Rec. 2020, can hide the issue in the RGB renderer, while the same more saturated blue in the 1931 CMF spectral renderer still suffers from the problem; the other CMFs do not have this issue.)<br /></p><p> </p><p><span style="font-size: large;"><b>Color Checker comparison</b></span></p><p>Next, we compare a color checker lit by a white light source. Since my spectral renderer needs to maintain compatibility with RGB rendering, and I was too lazy to implement spectral materials using measured spectral reflectance, both the color checker and the light source are up-sampled from sRGB colors.</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirssAxD1u84ThAW4cs2Mgl2LFF5wucByqWCz1gh0wQQ4zgS956hbY0MyreGdMx00P3QlXTMweLKewtULCcMLYVlMNbrJgEJUkEZJl2j6ot8JrCz-62CGhqP6lXhxTTOrc6TUW-0nRM-6dd/s16000/color_checker_1931.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirssAxD1u84ThAW4cs2Mgl2LFF5wucByqWCz1gh0wQQ4zgS956hbY0MyreGdMx00P3QlXTMweLKewtULCcMLYVlMNbrJgEJUkEZJl2j6ot8JrCz-62CGhqP6lXhxTTOrc6TUW-0nRM-6dd/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1931 2°</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmEih2M4jE7W1Iniz3ZqGlD1wUeniiDDt3nDyQzRwBd5K41ulVE83OL7HaGGvPG1P81L2NSb_Vofsm9RLRVpzdnQDstTZA0PmypYGTPcIZyxRr3G7mieOcRRhJpI5QtQ-N39atSPqo1TFL/s16000/color_checker_1931_jv.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmEih2M4jE7W1Iniz3ZqGlD1wUeniiDDt3nDyQzRwBd5K41ulVE83OL7HaGGvPG1P81L2NSb_Vofsm9RLRVpzdnQDstTZA0PmypYGTPcIZyxRr3G7mieOcRRhJpI5QtQ-N39atSPqo1TFL/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1931 2° with Judd Vos adjustment</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmykeOaSo4d4KosMLJl8M98I390HyC3_-Tg-tIFX-739LBCAluRe8ScOYbX5JWXGvEASKUkrrBdOQO8VV61DAM7095Jb_wZmUcpWP_V-cVdEfxLBsfrCyZpXdCl417lGciRBIcTV-4S-45/s16000/color_checker_1931_singleLobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmykeOaSo4d4KosMLJl8M98I390HyC3_-Tg-tIFX-739LBCAluRe8ScOYbX5JWXGvEASKUkrrBdOQO8VV61DAM7095Jb_wZmUcpWP_V-cVdEfxLBsfrCyZpXdCl417lGciRBIcTV-4S-45/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1931 2° single lobe analytic approximation</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbWNlrNYynFiX1hIhQ1cltTws0ReJnN5kXxgGmZQrOqEOYO7PFYke7yyRCiSfvVonuVxDA6bOOwS9zDae0vCNiHzvQ43UKxzkPyQCmFrdbdb7SZnDbztXXT77-0EDv7yTQvWqVuendjIyD/s16000/color_checker_1931_multiLobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbWNlrNYynFiX1hIhQ1cltTws0ReJnN5kXxgGmZQrOqEOYO7PFYke7yyRCiSfvVonuVxDA6bOOwS9zDae0vCNiHzvQ43UKxzkPyQCmFrdbdb7SZnDbztXXT77-0EDv7yTQvWqVuendjIyD/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1931 2° multi lobe analytic approximation</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVf-l-n4te3MLOkIfoIBw3lDkAbPXFqEzHy0CmeDrlxjBMM1KhYjqZwRoiYd7d07QBqA8qIeS0tDPUpLlXnwyVCgmuPcBtLlIsRjQnCxvlbD0zUBGn8AxZPrQkT4d-ga-PuH-y5Zec7xBX/s16000/color_checker_1964.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVf-l-n4te3MLOkIfoIBw3lDkAbPXFqEzHy0CmeDrlxjBMM1KhYjqZwRoiYd7d07QBqA8qIeS0tDPUpLlXnwyVCgmuPcBtLlIsRjQnCxvlbD0zUBGn8AxZPrQkT4d-ga-PuH-y5Zec7xBX/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1964 10°</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5F5-ZXmDA7knmeVZbyq3_h3CfS79VIyI284XsTqrKb9v_oiIpBah0XCuO7eLobL4UPGy3Fyoy9TZvhCEqoL4S2h3nEQsBG65HF5Pme99sjvRBRSzTdHaowIW7ksc_v8ShlKg3XNONxfvV/s16000/color_checker_1964_singleLobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5F5-ZXmDA7knmeVZbyq3_h3CfS79VIyI284XsTqrKb9v_oiIpBah0XCuO7eLobL4UPGy3Fyoy9TZvhCEqoL4S2h3nEQsBG65HF5Pme99sjvRBRSzTdHaowIW7ksc_v8ShlKg3XNONxfvV/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 1964 10° single lobe analytic approximation</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVNkEeK9p-ZVl3wO5QYDIFJfwgTvF_Dk2x-4P8SpN7dsmD0vQSuJPh88P0E1yod1o7QQkwNeoqY9pGttT59-8CXbEP6teDVEHO1HEt-H7k5OerqF9ySq0NxXzAJPJUgPGGuzqnK_Lksa-S/s16000/color_checker_2006_2.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVNkEeK9p-ZVl3wO5QYDIFJfwgTvF_Dk2x-4P8SpN7dsmD0vQSuJPh88P0E1yod1o7QQkwNeoqY9pGttT59-8CXbEP6teDVEHO1HEt-H7k5OerqF9ySq0NxXzAJPJUgPGGuzqnK_Lksa-S/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 2006 2°</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2be9Jlzeai-tkSs4cmr7KGF83dqozsI65I-eTcNR5334NxE4hkOThbkyS01PuXVCnKgzQy8DptBadmdpU8pPILru91SGvymrloq1BBTfmPhvAKXAtnwPpQ7WnjEEmsklZ6wUkfgmTSytb/s16000/color_checker_2006_2_lobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2be9Jlzeai-tkSs4cmr7KGF83dqozsI65I-eTcNR5334NxE4hkOThbkyS01PuXVCnKgzQy8DptBadmdpU8pPILru91SGvymrloq1BBTfmPhvAKXAtnwPpQ7WnjEEmsklZ6wUkfgmTSytb/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 2006 2° lobe analytic approximation</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi72LT1j2UAw-J3nj65LjqfKBLOGSAJj4CY05u5EkhGZKPPRSgCMXuYOWoW3bsBzJ6Suu4zGYofoGkSn_VnusVL46kahNrSwbzD2_A2hO59ZHNPWmtNpwJCLRuqNwP2EBkQ-3HSdp8OXTYy/s16000/color_checker_2006_10.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi72LT1j2UAw-J3nj65LjqfKBLOGSAJj4CY05u5EkhGZKPPRSgCMXuYOWoW3bsBzJ6Suu4zGYofoGkSn_VnusVL46kahNrSwbzD2_A2hO59ZHNPWmtNpwJCLRuqNwP2EBkQ-3HSdp8OXTYy/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 2006 10°</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi480rEzjNal5enHLBSwF_PwJT1HrE5OIRp-LbKUsp1ujrA97LiCOoYQauJRFycZf7mQbtf0nuDKmeXnfqJPRw9PnjeRt4Xegs_FRlCk6UjGDTrsx4RavTggKKNUyv2vy3LSY0cCqtgoFC2/s16000/color_checker_2006_10_lobe.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi480rEzjNal5enHLBSwF_PwJT1HrE5OIRp-LbKUsp1ujrA97LiCOoYQauJRFycZf7mQbtf0nuDKmeXnfqJPRw9PnjeRt4Xegs_FRlCk6UjGDTrsx4RavTggKKNUyv2vy3LSY0cCqtgoFC2/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">CIE 2006 10° lobe analytic approximation</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
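<p>As a side note, the D65 white point for the CIE 2006 CMFs mentioned in the Conclusion below can be computed as in this minimal sketch (Python): integrate the CMF against the illuminant SPD and normalize the luminance. Both arrays are assumed to be resampled onto the same wavelength grid; the exact math I used lives in the Colab source code:</p>
<pre>
# Minimal sketch: compute a white point (e.g. D65 under the CIE 2006 CMFs).
import numpy as np

def white_point_xyz(cmf, spd):
    """cmf: (N, 3) x/y/z-bar values; spd: (N,) illuminant power distribution.
    Returns XYZ normalized so that Y = 1 (the grid spacing cancels out)."""
    XYZ = cmf.T @ spd    # rectangle-rule integration over wavelength
    return XYZ / XYZ[1]
</pre>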
<p></p><p>From the above results, the different CMFs have a similar look except for the blue colors.<br /></p><p></p><p><br /></p><p><span style="font-size: large;"><b>Conclusion</b></span></p><p>In this post, we have compared different CMFs, provided an analytical approximation for the CIE 2006 CMFs and calculated the D65 white point for the CIE 2006 CMFs (the math can be found in the <a href="https://colab.research.google.com/drive/1EIqkm9-Y6Sl0VpiOBSgXM4XlUxutwmGz?usp=sharing">Colab source code</a>). All the CMFs produce similar colors except for blue: CMFs newer than the 1931 CMF can render a saturated blue correctly without turning it into purple. Maybe we should use a newer CMF instead, especially when working with wide gamut colors. Konica Minolta also points out that the <a href="https://sensing.konicaminolta.asia/deficiencies-of-the-cie-1931-color-matching-functions/">CIE 1931 CMF has issues with the wider color gamut of OLED displays</a> (and suggests using the CIE 2015 CMF instead). But sadly, I cannot find the data for the CIE 2015 CMF, so it is not tested in this post.<br /></p><p><br /></p><p><b>Reference</b></p><p>[1] <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space">https://en.wikipedia.org/wiki/CIE_1931_color_space</a></p><p>[2] <a href="http://cvrl.ioo.ucl.ac.uk/">http://cvrl.ioo.ucl.ac.uk/</a></p><p>[3] <a href="http://jcgt.org/published/0002/02/01/paper.pdf">http://jcgt.org/published/0002/02/01/paper.pdf</a></p><p>[4] <a href="https://en.wikipedia.org/wiki/ColorChecker">https://en.wikipedia.org/wiki/ColorChecker</a></p><p>[5] <a href="https://en.wikipedia.org/wiki/Standard_illuminant">https://en.wikipedia.org/wiki/Standard_illuminant</a></p><p>[6] <a href="https://www.rit.edu/cos/colorscience/rc_useful_data.php">https://www.rit.edu/cos/colorscience/rc_useful_data.php</a></p><p>[7] <a href="https://sensing.konicaminolta.asia/deficiencies-of-the-cie-1931-color-matching-functions/">https://sensing.konicaminolta.asia/deficiencies-of-the-cie-1931-color-matching-functions/</a></p><p></p><p></p>Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-88397538796283031352021-06-26T14:42:00.006+08:002021-06-26T15:02:21.081+08:00Implementing Gamut Mapping<p><span style="font-size: large;"><b>Introduction</b></span></p><p>Continuing from the <a href="https://simonstechblog.blogspot.com/2021/05/studying-gamut-clipping.html">previous post</a>: after learning how gamut clipping works, I wanted to see how it behaves in rendered images, so I implemented it in my toy path tracer with clipping to an arbitrary gamut. It can be downloaded <a href="https://drive.google.com/file/d/1R3mGRkG8T8reNRthwXr1XTh1JzMvjFtD/view?usp=sharing">here</a>. Also, the <a href="https://www.shadertoy.com/view/7tlGRS">Shadertoy sample</a> is updated to support clipping to an arbitrary gamut.</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsrG3b734OoWcXUMiWdKgID07sx6taoy7hembiAtDhJx8Ykj-b5LB-wqlJ-WAiFp23om70meGY7ji982AUIByXjANl0KNT8hUXfArIocPMdOgMcKhaWclSwY8xUwqRoQGGqgf_SM3O_aEM/s16000/main_gamut_clip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="694" data-original-width="1362" height="163" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsrG3b734OoWcXUMiWdKgID07sx6taoy7hembiAtDhJx8Ykj-b5LB-wqlJ-WAiFp23om70meGY7ji982AUIByXjANl0KNT8hUXfArIocPMdOgMcKhaWclSwY8xUwqRoQGGqgf_SM3O_aEM/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">With gamut clipping<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxL5cIned761r0PsEjegftd783eTwx0Uetw7tZqp0zgyALsy8MwuYDPMwAL2niM4jlduM8PqjtoT_W0Oc0H-5wjpbAZDDkL2I3UfuRKzsp-6dtep-Qew35abZh3ca8gf3GtiDIjCpi3Or2/s16000/main_no_clip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="694" data-original-width="1362" height="163" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxL5cIned761r0PsEjegftd783eTwx0Uetw7tZqp0zgyALsy8MwuYDPMwAL2niM4jlduM8PqjtoT_W0Oc0H-5wjpbAZDDkL2I3UfuRKzsp-6dtep-Qew35abZh3ca8gf3GtiDIjCpi3Or2/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Without gamut clipping<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
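<p>The comparisons in this post include an "Out of gamut pixels" debug view; a minimal sketch (Python) of one way to produce such a mask is shown here. This is an assumed, generic implementation, not necessarily the exact code in my path tracer:</p>
<pre>
# Minimal sketch: flag pixels with any channel outside [0, 1] after the
# image has been converted to the target gamut's primaries.
import numpy as np

def out_of_gamut_mask(rgb, eps=1e-4):
    """rgb: (H, W, 3) linear values in the target gamut."""
    # a channel c is in gamut iff it stays within 0.5 of 0.5
    return np.any(np.abs(rgb - 0.5) > 0.5 + eps, axis=-1)

def highlight_out_of_gamut(rgb, tint=(0.0, 1.0, 1.0)):
    out = np.clip(rgb, 0.0, 1.0)
    out[out_of_gamut_mask(rgb)] = tint   # tint flagged pixels, e.g. cyan
    return out
</pre>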
<br /><p><span style="font-size: large;"><b>Solving max saturation analytically</b></span><br /></p><p>We need to compute the maximum saturation to perform gamut clipping. In the <a href="https://bottosson.github.io/posts/gamutclipping/">original gamut clipping blog post</a>, the author relies on fitting a polynomial function for the sRGB max saturation. But since my path tracer can output to different color gamuts (e.g. Adobe RGB, P3 D65...), and I was too lazy to write such a curve-fitting function for an arbitrary gamut, I took a look at how the max saturation polynomial function is derived in the original <a href="https://colab.research.google.com/drive/1JdXHhEyjjEE--19ZPH1bZV_LiGQBndzs?usp=sharing">Colab source code</a>:</p><div class="separator" style="clear: both; text-align: center;"><span><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGwtJNn6WvZt6uibre6-nYu6GwjmWnB8miylqvc9OqokTF9KsKUzHRSYtO-EyTGEBFAvnzn0yz7LC2xxFG7LIAwh04oI87lKZkGJLv0vJ-lOi_07AuIduAf-y27hcsJU-J0YJxJKiReMoZ/s16000/solve_max_sat_from_def.png" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="656" data-original-width="2092" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGwtJNn6WvZt6uibre6-nYu6GwjmWnB8miylqvc9OqokTF9KsKUzHRSYtO-EyTGEBFAvnzn0yz7LC2xxFG7LIAwh04oI87lKZkGJLv0vJ-lOi_07AuIduAf-y27hcsJU-J0YJxJKiReMoZ/w640-h200/solve_max_sat_from_def.png" width="640" /></a></span></div><p></p><p>Luckily, optimizing the <b>e_R()</b> / <b>e_G()</b> / <b>e_B()</b> functions to 0 is equivalent to solving the equation <b>to_R()</b> / <b>to_G()</b> / <b>to_B()</b> = 0, which is a cubic function with an analytical solution: </p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgghfB7Pjm-YQdvZmRWX0XFAh4_f4Bp9EGnLiLosWHrB7vHEZIgCjEYrihN3zqw3NWgKCmsvjSsUmRaFzZ6g4maYn8pdJF9lLdlr1K6EYG3dCS8bfnaq-eEUFp7siTfW7F3ZKunQUYrDpfC/s16000/solve_max_sat.png" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="1024" data-original-width="1570" height="418" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgghfB7Pjm-YQdvZmRWX0XFAh4_f4Bp9EGnLiLosWHrB7vHEZIgCjEYrihN3zqw3NWgKCmsvjSsUmRaFzZ6g4maYn8pdJF9lLdlr1K6EYG3dCS8bfnaq-eEUFp7siTfW7F3ZKunQUYrDpfC/w640-h418/solve_max_sat.png" width="640" /></a></div><p></p><p>To calculate the max saturation for an arbitrary gamut, we can first compute the <b>r_dir</b> / <b>g_dir</b> / <b>b_dir</b> for our target gamut, then compute the <b><i>Oklab to target gamut matrix</i></b>, and finally solve the cubic equation to get the maximum saturation. Details can be found in the <a href="https://www.shadertoy.com/view/7tlGRS">Shadertoy sample code</a>.<br /></p><p>But solving this cubic equation analytically has some precision issues at hue values around the blue color, so the <a href="https://www.shadertoy.com/view/7tlGRS">Shadertoy demo</a> performs a step of Halley's method to minimize the issue. If the target clipping gamut is not large (e.g. sRGB, AdobeRGB...), solving the cubic equation with a numerical method (e.g. 1 step of Halley's method + 1 step of Newton's method) from a good initial guess (e.g. I have tried 0.4 in the <a href="https://www.shadertoy.com/view/7tlGRS">Shadertoy demo</a>) may be enough, and is more stable numerically. A minimal sketch of this numerical approach is shown below.</p>
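<p>A minimal sketch (Python), using the sRGB gamut with the red channel as the limiting one as an example: the <b>k_l</b> / <b>k_m</b> / <b>k_s</b> values come from the published OKLab definition, and <b>wl</b> / <b>wm</b> / <b>ws</b> is the red row of the nonlinear-LMS to linear sRGB matrix. For another gamut, the corresponding row of that gamut's matrix would be swapped in; my actual Shadertoy code differs in its details:</p>
<pre>
# Minimal sketch: solve to_R(S) = 0 with Halley's method from a fixed guess.
def max_saturation_red_srgb(a, b, s_init=0.4, iters=2):
    """(a, b): normalized OKLab hue direction, i.e. a*a + b*b == 1."""
    k_l =  0.3963377774 * a + 0.2158037573 * b
    k_m = -0.1055613458 * a - 0.0638541728 * b
    k_s = -0.0894841775 * a - 1.2914855480 * b
    wl, wm, ws = 4.0767416621, -3.3077115913, 0.2309699292
    S = s_init
    for _ in range(iters):
        l_, m_, s_ = 1.0 + S * k_l, 1.0 + S * k_m, 1.0 + S * k_s
        f  = wl * l_**3 + wm * m_**3 + ws * s_**3                     # to_R(S)
        f1 = 3.0 * (wl * k_l * l_**2 + wm * k_m * m_**2 + ws * k_s * s_**2)
        f2 = 6.0 * (wl * k_l**2 * l_ + wm * k_m**2 * m_ + ws * k_s**2 * s_)
        S = S - f * f1 / (f1 * f1 - 0.5 * f * f2)                     # Halley step
    return S
</pre>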
<p><span></span></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinFQMVqwL7SI9xoQVWSDcjrrC0-5e5S0TUd_5OY48g-cGxPTqgwZQym3dzWaJ99x43jnIB449zvAMRwsYvHajKFvK_d8Z4RE79vhPSGJ9Ha8rJQ_b_7_kaaj3jUU1e86s2QXpACs3Decbe/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="473" data-original-width="1611" height="189" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinFQMVqwL7SI9xoQVWSDcjrrC0-5e5S0TUd_5OY48g-cGxPTqgwZQym3dzWaJ99x43jnIB449zvAMRwsYvHajKFvK_d8Z4RE79vhPSGJ9Ha8rJQ_b_7_kaaj3jUU1e86s2QXpACs3Decbe/w640-h189/precision_error.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">The left image shows the precision error when calculating the cusp point at hue 232.58 degrees<br />The right image calculates the cusp point correctly, with less than 1 degree hue difference from the left image<br /></td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><span></span></div><p></p><p> </p><span style="font-size: large;"><b>Solving RGB=1 clipping line with 2 curves only</b></span><p><a href="https://simonstechblog.blogspot.com/2021/05/studying-gamut-clipping.html">From the previous post</a>, we know that each upper clipping line of the valid gamut "triangle" is a line where the Red/Green/Blue value = 1, and at most 2 clipping lines are used:</p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqynmMn9qPBGE0VZR2cE5Au3jvcNOsWGh5d0V486mjx5-UII1BfandT-_gLTf_1t19SyFo2BvArB08tYVA9IpPhoIv7r6uJ0b_7MJyD48IUsVrbURYmnaxCXsD4RgJkAf_EuAPZG2vfPNj/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="470" data-original-width="775" height="243" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqynmMn9qPBGE0VZR2cE5Au3jvcNOsWGh5d0V486mjx5-UII1BfandT-_gLTf_1t19SyFo2BvArB08tYVA9IpPhoIv7r6uJ0b_7MJyD48IUsVrbURYmnaxCXsD4RgJkAf_EuAPZG2vfPNj/w400-h243/rgb_one_clip_line_x2.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">This yellow hue uses 2 upper clipping lines (red and green lines)<br /></td></tr></tbody></table><p></p><p>In the updated <a href="https://www.shadertoy.com/view/7tlGRS">Shadertoy demo</a>, the upper "triangle" clipping method is changed to use 2 clipping lines, chosen depending on the <b>r_dir</b> / <b>g_dir</b> / <b>b_dir</b> (computed during the max saturation step).</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVpU6-pFmo7h7RhWkviKBX5hRrAYTwaCCA01CcKS9m3ecfwXZnNtp6icRxUgCQgW8SciZipm6q3HIWy1ozV7GH0YraIGbDc4qppbY03oNMFXsu68zvTyeJ_-Z1wRKvx0yRt0JZyGMWPCtY/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="108" data-original-width="272" height="79" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVpU6-pFmo7h7RhWkviKBX5hRrAYTwaCCA01CcKS9m3ecfwXZnNtp6icRxUgCQgW8SciZipm6q3HIWy1ozV7GH0YraIGbDc4qppbY03oNMFXsu68zvTyeJ_-Z1wRKvx0yRt0JZyGMWPCtY/w200-h79/use_all_one_clip_lines.png" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Originally clipping code using all 3 lines<br /></td></tr></tbody></table>
</td>
<td>
<div class="separator" style="clear: both; text-align: center;"></div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4VYlNRLtGivhpsipZc3Wxuyv0mb6_CH5a4Fewczgz-5CHGihGUZEHXySQsKW0Wd1XCfIlMIgJfuy51LjOMtwZUjJFrWdZllr91W53r2zW9-5x_1ryQk8rDw2NOlTopF6k8fTXuQ9JqEZ2/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="323" data-original-width="372" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4VYlNRLtGivhpsipZc3Wxuyv0mb6_CH5a4Fewczgz-5CHGihGUZEHXySQsKW0Wd1XCfIlMIgJfuy51LjOMtwZUjJFrWdZllr91W53r2zW9-5x_1ryQk8rDw2NOlTopF6k8fTXuQ9JqEZ2/" width="276" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">updated clipping code using 2 lines depending on hue<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
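<p>As an aside before the next point: when the target gamut's white point differs from Oklab's D65 (as with ACEScg below), a chromatic adaptation transform (CAT) should be applied first. Here is a minimal Bradford CAT sketch (Python), assuming the usual published white point values; the exact values and CAT used in my renderer may differ:</p>
<pre>
# Minimal sketch: Bradford chromatic adaptation matrix between two whites.
import numpy as np

BRADFORD = np.array([[ 0.8951,  0.2664, -0.1614],
                     [-0.7502,  1.7135,  0.0367],
                     [ 0.0389, -0.0685,  1.0296]])

def bradford_cat(src_white_xyz, dst_white_xyz):
    """Returns a 3x3 matrix adapting XYZ from the source white to the destination white."""
    lms_src = BRADFORD @ src_white_xyz
    lms_dst = BRADFORD @ dst_white_xyz
    scale = np.diag(lms_dst / lms_src)   # von Kries scaling in the Bradford LMS space
    return np.linalg.inv(BRADFORD) @ scale @ BRADFORD

aces_d60 = np.array([0.95265, 1.0, 1.00883])  # ACES white (~D60), xy ~ (0.32168, 0.33767)
d65      = np.array([0.95047, 1.0, 1.08883])  # D65 white
M_aces_to_d65 = bradford_cat(aces_d60, d65)
</pre>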
<p></p><p></p><p>And during my implementation, I accidentally found that when performing gamut clipping for the ACEScg color space, if I forget to apply this chromatic adaptation for the different white point (Oklab uses D65 while ACEScg uses roughly D60), all 3 upper clipping lines need to be used:</p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvuTieqAWBfthx2GoQXlamJdYxsLQVclBgkpRAZhUb1hTkWPde22datxoXOZrEfpf5ziMPWnY7etVaUJHhUmYNuz0yzV_Jlgw-blNwL_BYGCd7UtvOrHa2oo6l6noILrEd6TA1-xjDxOnU/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="468" data-original-width="775" height="241" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvuTieqAWBfthx2GoQXlamJdYxsLQVclBgkpRAZhUb1hTkWPde22datxoXOZrEfpf5ziMPWnY7etVaUJHhUmYNuz0yzV_Jlgw-blNwL_BYGCd7UtvOrHa2oo6l6noILrEd6TA1-xjDxOnU/w400-h241/CAT_bug.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">All 3 upper clipping lines are used due to the chromatic adaptation bug<br /></td></tr></tbody></table><p></p><p><span style="font-size: large;"><b>Result</b></span></p><p>Now, let's see how gamut clipping looks in rendered images. All 5 gamut clipping methods from <a href="https://bottosson.github.io/posts/gamutclipping/">Björn Ottosson's blog</a> are implemented:</p><ol style="text-align: left;"><li>Keep lightness constant, only compress chroma (Chroma clipped)<br /></li><li>Projection towards a single point, hue independent (L<span style="font-size: xx-small;">0</span>=0.5 projection)<br /></li><li>Projection towards a single point, hue dependent (L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span> projection)</li><li>Adaptive L<span style="font-size: xx-small;">0</span>, hue independent (Adaptive L<span style="font-size: xx-small;">0</span>=0.5)<br /></li><li>Adaptive L<span style="font-size: xx-small;">0</span>, hue dependent (Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>)</li></ol><p>Let's start with a night scene; the clipping effect is most noticeable in the blue curtain, with a slight change in the green curtain:</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_iNdZvJu1sYAhA5_ra4OhRRR_yCWrxgInCViWXpYcgDGFV-cqpDtDR9OmzY9qhiJEXvrDE-Vn8qThbhjvPe-JGdbJ3ypx-8k74Qzv8P4bJ2l4_Ws2QuJU03zkI7W0a7v7NuLNRgFx8lqB/s16000/scene_0_no_clip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_iNdZvJu1sYAhA5_ra4OhRRR_yCWrxgInCViWXpYcgDGFV-cqpDtDR9OmzY9qhiJEXvrDE-Vn8qThbhjvPe-JGdbJ3ypx-8k74Qzv8P4bJ2l4_Ws2QuJU03zkI7W0a7v7NuLNRgFx8lqB/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Without gamut clipping <br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjZwK8GACAhgXHbM5-0pnHk3XiXJ3xL9uAfkQCVYbtp17vYionMO0JSppk8YtDECXlzyHi6eJL7W1PlaEdHgWPiV5TAfZLjaqDNLZd_5KL9vWTSf-exZWgcHnxst9VOgX14LyHA_lF56f2/s16000/scene_0_keep_lightness.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjZwK8GACAhgXHbM5-0pnHk3XiXJ3xL9uAfkQCVYbtp17vYionMO0JSppk8YtDECXlzyHi6eJL7W1PlaEdHgWPiV5TAfZLjaqDNLZd_5KL9vWTSf-exZWgcHnxst9VOgX14LyHA_lF56f2/w320-h170/scene_0_keep_lightness.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Chroma clipped<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXJ6M04__R-JXy5Zun06ywUWqF0OfIhXDWBfCoLoXM8kyisC9zOx_oB2QVBxoKl0m3oq1IjzIQaKQ24ANavHAmPrNBfyskmuueop_47GHTzDMdK947xc-9lTShpG-q_Hgi-Vg6JggvB4EP/s16000/scene_0_outOfGamut.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXJ6M04__R-JXy5Zun06ywUWqF0OfIhXDWBfCoLoXM8kyisC9zOx_oB2QVBxoKl0m3oq1IjzIQaKQ24ANavHAmPrNBfyskmuueop_47GHTzDMdK947xc-9lTShpG-q_Hgi-Vg6JggvB4EP/w320-h170/scene_0_outOfGamut.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Out of gamut pixels<br /></td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBEv1u6tlEU5_jeepOWAcWHvtqzJzoFgzapq8rDMRN2DVr_gQ1sGA3FpoAErVdCtUH-lKnGU_BFNrqOKq7jyj5ROgNwwN20f8XXKcUTG-sGZ9Tz2Fofr3PgHoR19mLm5O_1QnUxJNcyBrG/s16000/scene_0_singlePtHueIndependent.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBEv1u6tlEU5_jeepOWAcWHvtqzJzoFgzapq8rDMRN2DVr_gQ1sGA3FpoAErVdCtUH-lKnGU_BFNrqOKq7jyj5ROgNwwN20f8XXKcUTG-sGZ9Tz2Fofr3PgHoR19mLm5O_1QnUxJNcyBrG/w320-h170/scene_0_singlePtHueIndependent.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">L<span style="font-size: xx-small;">0</span>=0.5 projection<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGipCJkY4E0o-tuTnDk6-71Mat7Z-Ja7N5psl-C54ImnfAKpLrvNwUD3gk8dvnCzg3qooLxrZetDKUx76PeXxGKo8u5Ke9C5fvbxXBg24-kMV_VtPDIf4frToZzE_2ltHO4I8aUkOcg0py/s16000/scene_0_adaptiveHueIndependent_5_0.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGipCJkY4E0o-tuTnDk6-71Mat7Z-Ja7N5psl-C54ImnfAKpLrvNwUD3gk8dvnCzg3qooLxrZetDKUx76PeXxGKo8u5Ke9C5fvbxXBg24-kMV_VtPDIf4frToZzE_2ltHO4I8aUkOcg0py/w320-h170/scene_0_adaptiveHueIndependent_5_0.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=0.5, α=5.0<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEia611Gm74jiAhxMWhVdb7Ztin2vQwNUN4fQ5gVtFU3ktQFNuO_Lthiep9_atYfed8qvhp9pWJs6w3ui0ODS6oJLEUFzOMNS7lQMe0D7Tjsjli794-WimCjKDRpDPyDumgDGlAdP1gTA3NX/s16000/scene_0_adaptiveHueIndependent_0_05.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEia611Gm74jiAhxMWhVdb7Ztin2vQwNUN4fQ5gVtFU3ktQFNuO_Lthiep9_atYfed8qvhp9pWJs6w3ui0ODS6oJLEUFzOMNS7lQMe0D7Tjsjli794-WimCjKDRpDPyDumgDGlAdP1gTA3NX/w320-h170/scene_0_adaptiveHueIndependent_0_05.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=0.5, α=0.05</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHsa4tOdgAmin9r-q9JWtcLRd8oI5PcBG2Tcqf7Ir-_UpP4NsRhsGAjNGRzcT5E0xE5dZjmA-xAqa-zNwZeE2UMfl9AgVp04Couq7owdTAwWclL0GSk7FDm2b42qiz8Mz2hmyw4NnOU-Rg/s16000/scene_0_singlePtHueDependent.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHsa4tOdgAmin9r-q9JWtcLRd8oI5PcBG2Tcqf7Ir-_UpP4NsRhsGAjNGRzcT5E0xE5dZjmA-xAqa-zNwZeE2UMfl9AgVp04Couq7owdTAwWclL0GSk7FDm2b42qiz8Mz2hmyw4NnOU-Rg/w320-h170/scene_0_singlePtHueDependent.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span> projection<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXqJ5k9wpsncc_465GdQXpP8bzeBuHQZGkWyBEalkSqGfEVHRNpY11UKRGEKoKTtTNBJ1qryyhHg4xdHVYMELtoT9SfvNicPidjn8kvbAevrswtwm7pkS8ljEmXAmRf0pXBXNvyIkjZNVP/s16000/scene_0_adaptiveHueDependent_5_0.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXqJ5k9wpsncc_465GdQXpP8bzeBuHQZGkWyBEalkSqGfEVHRNpY11UKRGEKoKTtTNBJ1qryyhHg4xdHVYMELtoT9SfvNicPidjn8kvbAevrswtwm7pkS8ljEmXAmRf0pXBXNvyIkjZNVP/w320-h170/scene_0_adaptiveHueDependent_5_0.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>, α=5.0</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO5AWwRja6aNPxkUdNWsfyd1gMM0jzF2a800O8OnbE13xmT0ZhE1FFxuWiaDqUpOK4fH6bHg_RmFhNjnHIQdxiVTPvbMu21Q7XLoYdAHYxS9tLy0Jxn0mhqTuJCycnTrkp96yTw21RfGRo/s16000/scene_0_adaptiveHueDependent_0_05.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO5AWwRja6aNPxkUdNWsfyd1gMM0jzF2a800O8OnbE13xmT0ZhE1FFxuWiaDqUpOK4fH6bHg_RmFhNjnHIQdxiVTPvbMu21Q7XLoYdAHYxS9tLy0Jxn0mhqTuJCycnTrkp96yTw21RfGRo/w320-h170/scene_0_adaptiveHueDependent_0_05.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>, α=0.05</td></tr></tbody></table>
</td>
</tr>
</tbody></table><p></p><p>Then the following test scenes all use a light with a saturated color (e.g. a red light with (1, 0, 0) in Rec2020) to generate out-of-gamut colors. With a saturated magenta colored light, gamut clipping can do a pretty good job at showing the details in the out-of-gamut areas (e.g. around the lion's face):</p><table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfE7OnktH5snDy7jA0M0PvQT4VQj1TjdTFsR3UZByfvgAhx7M6vrm8BQRunpEdPxCX0OKi8965PvNB1_tWoY6GWtu4QUv3k6FX-gq3widrvlON1Z_QISyuZ17u-klACCra6EvUx0tTFgD-/s16000/scene_2_noClip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfE7OnktH5snDy7jA0M0PvQT4VQj1TjdTFsR3UZByfvgAhx7M6vrm8BQRunpEdPxCX0OKi8965PvNB1_tWoY6GWtu4QUv3k6FX-gq3widrvlON1Z_QISyuZ17u-klACCra6EvUx0tTFgD-/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Without gamut clipping </td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoK5azf3t-L3pZMidinolLN9zlOADkry9J3bOj-FHiI9CG0RRBNTLzZ2CKsBGQPZ4GzQ12LVJkdoLAeRP31pYpZGQDkYNNCoGw2UNYY0oEn8JK9Vn8KlchoG3rE53d3uIVX1JzjsZVTBww/s16000/scene_2_keepLightness.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoK5azf3t-L3pZMidinolLN9zlOADkry9J3bOj-FHiI9CG0RRBNTLzZ2CKsBGQPZ4GzQ12LVJkdoLAeRP31pYpZGQDkYNNCoGw2UNYY0oEn8JK9Vn8KlchoG3rE53d3uIVX1JzjsZVTBww/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Chroma clipped</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgme1m1ALoYCXN1rgiEZOOIk8wvvSkSo4I6daNIr3hjOPaQllNdLFaU6nYyW-VzhbGDvtbMBuciigfPOWYkVmvq-jRhuNX3HyIwAz9UQ2-rTvntW-ss7xyvqyUo52DzuUAt5uWltmU1lq5E/s16000/scene_2_outOfGamut.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgme1m1ALoYCXN1rgiEZOOIk8wvvSkSo4I6daNIr3hjOPaQllNdLFaU6nYyW-VzhbGDvtbMBuciigfPOWYkVmvq-jRhuNX3HyIwAz9UQ2-rTvntW-ss7xyvqyUo52DzuUAt5uWltmU1lq5E/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Out of gamut pixels</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRjBHVKMiwG0E_cFqBuUblpJEu7uHjeRzWOETBvjLb65S4U-ENBQwXwLwdsACrcwmbvlcaWW3bMfFlyoF1U1Vmnt8KWfU2BVfsqlNRPS-xaT4G8na3w7GwJ2seto96QI254R5bT9Vbz9LJ/s16000/scene_2_singlePointHueIndependent.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRjBHVKMiwG0E_cFqBuUblpJEu7uHjeRzWOETBvjLb65S4U-ENBQwXwLwdsACrcwmbvlcaWW3bMfFlyoF1U1Vmnt8KWfU2BVfsqlNRPS-xaT4G8na3w7GwJ2seto96QI254R5bT9Vbz9LJ/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">L<span style="font-size: xx-small;">0</span>=0.5 projection</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhjjod0vy9fFpsnbTYo5rLKlhf4PGKWRC26-jKK7VO47FtGPgL-DnscIZ6KiMv0XTbWEVcHia1vc3c2tElAshRz3P1ERj7XtBI490we9wVPAejL72R2N6DW5LnyQKJqzFnn849h0rk0-09/s16000/scene_2_adaptiveHueIndependent_5_0.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhhjjod0vy9fFpsnbTYo5rLKlhf4PGKWRC26-jKK7VO47FtGPgL-DnscIZ6KiMv0XTbWEVcHia1vc3c2tElAshRz3P1ERj7XtBI490we9wVPAejL72R2N6DW5LnyQKJqzFnn849h0rk0-09/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=0.5, α=5.0</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMEqUKG1t8oCPFnvGfyhlDgmaRITypNv5p6k_0WwpAbZJVYEQKPWj9h6gGDzvDp6vSigqkzUUYpqHtfAS24M_-5ZId6JVo9_m3aqEINRFeSnN-JJGLuTJLMGJyW0bD_1qAJGzvpOWDK-Zu/s16000/scene_2_adaptiveHueIndependent_0_05.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMEqUKG1t8oCPFnvGfyhlDgmaRITypNv5p6k_0WwpAbZJVYEQKPWj9h6gGDzvDp6vSigqkzUUYpqHtfAS24M_-5ZId6JVo9_m3aqEINRFeSnN-JJGLuTJLMGJyW0bD_1qAJGzvpOWDK-Zu/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=0.5, α=0.05</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQruQmgqzax-813rgrAGlInPhn9PzeNqFqW-2l19d8RI_36oJeBNIu9dzIlNyPSdsPIVnhHxkWBHSY7R4WkCYuNPzJdIO5wjGti7tEUCkzIKqTxCQf2ZqYB0iW_0_LefT99BKZmIzM1e3I/s16000/scene_2_singlePointHueDependent.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQruQmgqzax-813rgrAGlInPhn9PzeNqFqW-2l19d8RI_36oJeBNIu9dzIlNyPSdsPIVnhHxkWBHSY7R4WkCYuNPzJdIO5wjGti7tEUCkzIKqTxCQf2ZqYB0iW_0_LefT99BKZmIzM1e3I/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span> projection</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifpDMH9ZiXfpRCplAGYXZ30AAHhr80JJ0LmSekSYM7-O14iQK9W7-nCevwXlAvet_lMESC6faftkQ1_PsIDOtK1VcURmvQ5kVvg4spW4ccQCLJ9wbj8aT4wKKiDIpiQiVI0_XxUAc6b9G_/s16000/scene_2_adaptiveHueDependent_5_0.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifpDMH9ZiXfpRCplAGYXZ30AAHhr80JJ0LmSekSYM7-O14iQK9W7-nCevwXlAvet_lMESC6faftkQ1_PsIDOtK1VcURmvQ5kVvg4spW4ccQCLJ9wbj8aT4wKKiDIpiQiVI0_XxUAc6b9G_/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>, α=5.0</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2YF-v-EPRgJ1v31q1g_ToZ-GkgYMsGdaa0VWdIA8vE5n44AO_z159SS0BTvZk_Uc2CvKv0UILGb-MGtb009MVKUI8ROxduHfBTdfbehvqs7Sm7jz50ONijuJ1_CbHHM5gS_hUl-yXExzz/s16000/scene_2_adaptiveHueDependent_0_05.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2YF-v-EPRgJ1v31q1g_ToZ-GkgYMsGdaa0VWdIA8vE5n44AO_z159SS0BTvZk_Uc2CvKv0UILGb-MGtb009MVKUI8ROxduHfBTdfbehvqs7Sm7jz50ONijuJ1_CbHHM5gS_hUl-yXExzz/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>, α=0.05</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
<p></p><p>Changing to a saturated green light, the different clipping methods change the perceived lighting, especially the projection-towards-a-single-point methods:</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhx4KAXFJRwd6VV3JFROw45IKukbgvPoc5HrgWT_8rudS8yPl6JTJJnkw8F4RmhlWYr0CB0GLCX0op3HOj8jFrpO7Bzlw-QHO0PrJ39WetzZPBMyX6gIwtGrwcIEE7tLJKj44690PrT5K9-/s16000/scene_4_noClip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhx4KAXFJRwd6VV3JFROw45IKukbgvPoc5HrgWT_8rudS8yPl6JTJJnkw8F4RmhlWYr0CB0GLCX0op3HOj8jFrpO7Bzlw-QHO0PrJ39WetzZPBMyX6gIwtGrwcIEE7tLJKj44690PrT5K9-/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Without gamut clipping </td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiy93YNsVmHa15elqCkAxyMCE3753Zoex0kvSuW5VLcbsaTq7QgJvD1gAUH3vW1_B-kvbixh4nB-3Hy6-oZr2WBVM9jZ_DkAzLLnQ-UOKD_BOtGdKDam0PgNlvp0D8xI2HBkxqd9VjaTaNY/s16000/scene_4_keepLightness.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiy93YNsVmHa15elqCkAxyMCE3753Zoex0kvSuW5VLcbsaTq7QgJvD1gAUH3vW1_B-kvbixh4nB-3Hy6-oZr2WBVM9jZ_DkAzLLnQ-UOKD_BOtGdKDam0PgNlvp0D8xI2HBkxqd9VjaTaNY/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Chroma clipped</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_ub-3UmBbM0IK_fAnLs4hKHH05uwWwqwf-5WAtnXQoIz5sATy5sliSMOSDZxcOuBIEH4jDFGuByM7iseB9tRBu9huv72CvhALB0ZhU8499lK_f195_s6KxnShEQaUVR_bFMo5UY4Jshde/s16000/scene_4_outOfGamut.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_ub-3UmBbM0IK_fAnLs4hKHH05uwWwqwf-5WAtnXQoIz5sATy5sliSMOSDZxcOuBIEH4jDFGuByM7iseB9tRBu9huv72CvhALB0ZhU8499lK_f195_s6KxnShEQaUVR_bFMo5UY4Jshde/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Out of gamut pixels</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3gpU9m3gy1riCub2MzaJIjPsiAEaYMCsA4jvIfKuwhtB6wExEd6PbNMdkK4PhmXXOIjEYwSrgIfghWmuea-tnFSI9lkxbAmZSFlDXII5tpkFOuAf8B0_muG6wtJ41wbqT6BQmKi_XVxqC/s16000/scene_4_singlePointHueIndependent.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3gpU9m3gy1riCub2MzaJIjPsiAEaYMCsA4jvIfKuwhtB6wExEd6PbNMdkK4PhmXXOIjEYwSrgIfghWmuea-tnFSI9lkxbAmZSFlDXII5tpkFOuAf8B0_muG6wtJ41wbqT6BQmKi_XVxqC/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">L<span style="font-size: xx-small;">0</span>=0.5 projection</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1DP9mTR7tVXbn-DcGG3HVXagrgz8saPA8skSf0f__wbb4cX9NjZnESupILq9nDXW35EuSgJS68GMguLi-evaadATQ6xBKpmmAprd_J4YAo485v_KkMYDfW6kozN5gF2eLmUltsOZlvFXn/s16000/scene_4_adaptiveHueIndependent_5_0.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1DP9mTR7tVXbn-DcGG3HVXagrgz8saPA8skSf0f__wbb4cX9NjZnESupILq9nDXW35EuSgJS68GMguLi-evaadATQ6xBKpmmAprd_J4YAo485v_KkMYDfW6kozN5gF2eLmUltsOZlvFXn/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=0.5, α=5.0</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaVDy0yFhGqDmhRP08lE8nziclJUJIxrfRK8Ge8LKJSGAMvnqTobLoLfVHzQXmxpKex4Ke381WJQP-vg8ixSACUe2JdmpkXYnJFmwacohVvhYHUno0IHe1Y6tSFpv_m3mTuTGPFS58B4n8/s16000/scene_4_adaptiveHueIndependent_0_05.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaVDy0yFhGqDmhRP08lE8nziclJUJIxrfRK8Ge8LKJSGAMvnqTobLoLfVHzQXmxpKex4Ke381WJQP-vg8ixSACUe2JdmpkXYnJFmwacohVvhYHUno0IHe1Y6tSFpv_m3mTuTGPFS58B4n8/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=0.5, α=0.05</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjuM1Zc_t3QhZvkc5gpTDI9f5SGAo9kFkEy_h_B1tbfWexwkGnakrey04owVzavO-Vg20KCMTs-WnEH6s3LrmMj3-3_G4VHEu7_n-azqt4sJMdq2Ud6zZ6wDlKjZ4lrRaeSLvPNbRXOsum5/s16000/scene_4_singlePointHueDependent.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjuM1Zc_t3QhZvkc5gpTDI9f5SGAo9kFkEy_h_B1tbfWexwkGnakrey04owVzavO-Vg20KCMTs-WnEH6s3LrmMj3-3_G4VHEu7_n-azqt4sJMdq2Ud6zZ6wDlKjZ4lrRaeSLvPNbRXOsum5/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span> projection</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHzKCIsaErzzFZp0r3SSMIDmGzRH4e3fjrrVloWZl3gz9ONjSdvCfo7HQE0nLUXRR4MlC7oVZbmjgvO54BPCAGmyuMQ829X1FeKswpRjJmiQCIpInkL5eRhrr1wcVsVkyKLUzJmGL58Cq-/s16000/scene_4_adaptiveHueDependent_5_0.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHzKCIsaErzzFZp0r3SSMIDmGzRH4e3fjrrVloWZl3gz9ONjSdvCfo7HQE0nLUXRR4MlC7oVZbmjgvO54BPCAGmyuMQ829X1FeKswpRjJmiQCIpInkL5eRhrr1wcVsVkyKLUzJmGL58Cq-/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>, α=5.0</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi49e5LoDDjbuv_gP9BYaVCxr0zMT-Wt4lOfFLwpR99mXJJt_o0qkd1-Cyo-wAJ8kkXaO8VQDoVv8rzhIE-dSob5M8iNaIwQYUJumCs549S_beoyrHybqIuChiOLE9KES3ZsKhMcujb4O3x/s16000/scene_4_adaptiveHueDependent_0_05.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi49e5LoDDjbuv_gP9BYaVCxr0zMT-Wt4lOfFLwpR99mXJJt_o0qkd1-Cyo-wAJ8kkXaO8VQDoVv8rzhIE-dSob5M8iNaIwQYUJumCs549S_beoyrHybqIuChiOLE9KES3ZsKhMcujb4O3x/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>, α=0.05</td></tr></tbody></table>
</td>
</tr>
</tbody></table><p>With a saturated red light, gamut clipping can greatly reduce the orange/yellow hue shift. This reminds me of the presentations <a href="https://research.activision.com/publications/archives/hdr-in-call-of-duty">"HDR in Call of Duty"</a> and <a href="https://www.ea.com/frostbite/news/high-dynamic-range-color-grading-and-display-in-frostbite">"HDR color grading and display in Frostbite"</a>, which mentioned that some VFX (e.g. fire/explosion) may rely on such a hue shift. I don't know whether that is good or not, but gamut clipping may at least give a closer look between an sRGB display and an HDR display... <br /></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEil6hUtjYvpNC8HWfJdNJADq1-F4P-FL3sTCKzi9zOw9Y8XXWPHhqhzgiwW3iXi97cXvIkCnbCSa9hId-fven2StXQelXjqroC2nwZ4kEZBdYziOl37Z_pGE15EkmirsOnzzwUBGl-QdIVg/s16000/scene_1_noClip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEil6hUtjYvpNC8HWfJdNJADq1-F4P-FL3sTCKzi9zOw9Y8XXWPHhqhzgiwW3iXi97cXvIkCnbCSa9hId-fven2StXQelXjqroC2nwZ4kEZBdYziOl37Z_pGE15EkmirsOnzzwUBGl-QdIVg/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Without gamut clipping </td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpubgmhB8U24E8-2tTIurdG-IYBxfdYhILrAt4zasgmKiNdhcyjzxS2PmFpVBwjzykBrI04Qz2SdT3D9PMcs5XM1nMtWtvuHPi6vB0HI6FHnDf-ln-AjGPNWCaiongEegM4dZS3MAwLgqh/s16000/scene_1_keepLightness.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpubgmhB8U24E8-2tTIurdG-IYBxfdYhILrAt4zasgmKiNdhcyjzxS2PmFpVBwjzykBrI04Qz2SdT3D9PMcs5XM1nMtWtvuHPi6vB0HI6FHnDf-ln-AjGPNWCaiongEegM4dZS3MAwLgqh/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Chroma clipped</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjQOQh9NAfZaHcOndX273KHSx4G6WTbf8lRwlHezdFPhk5LKfJ49Vd3lxoCc4OOKvi6O_5I_4RUdTLlFsscXAV6ATIQQ_JNCHVpVXTXjonMySk_8IS1m7G_UwZsCQxd9hVqOTqYVU2pS2A/s16000/scene_1_outOfGamut.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjQOQh9NAfZaHcOndX273KHSx4G6WTbf8lRwlHezdFPhk5LKfJ49Vd3lxoCc4OOKvi6O_5I_4RUdTLlFsscXAV6ATIQQ_JNCHVpVXTXjonMySk_8IS1m7G_UwZsCQxd9hVqOTqYVU2pS2A/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Out of gamut pixels</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggPRHG8hv1XlHv_fN2hG_q8P-fNHYAB4hBsPj12f2Pwk1f1O5rggrh0ugomI0jore_dQvYOqhgrZg39NC8amrO0h8SDkFgPBQr7_7LPHZ4L_EmS4_Q-HesjFpi1lsHH8d_9G0Sm3Av8Xjq/s16000/scene_1_singlePointHueIndependent.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggPRHG8hv1XlHv_fN2hG_q8P-fNHYAB4hBsPj12f2Pwk1f1O5rggrh0ugomI0jore_dQvYOqhgrZg39NC8amrO0h8SDkFgPBQr7_7LPHZ4L_EmS4_Q-HesjFpi1lsHH8d_9G0Sm3Av8Xjq/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">L<span style="font-size: xx-small;">0</span>=0.5 projection</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhE88u5VeZa3Dq7nqcWHt7nme9k__igIUKIxIjAC9IbM7oh792F841cr6BFmnyTiS8FN_96i0h_RifxPupb5pI6SaEjvbfuMA6TdefYiRb_-_YGomUhgFGiippoZnR83q8JOMMQMYEMqsHz/s16000/scene_1_adaptiveIndependent_5_0.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhE88u5VeZa3Dq7nqcWHt7nme9k__igIUKIxIjAC9IbM7oh792F841cr6BFmnyTiS8FN_96i0h_RifxPupb5pI6SaEjvbfuMA6TdefYiRb_-_YGomUhgFGiippoZnR83q8JOMMQMYEMqsHz/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=0.5, α=5.0</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQI6w9FZs-I0sz4jBsVtSoY7VVKI6dgAwFZPmUyhCcFP4G8BL4ko2V1JXQ-kVU46d4hh9PdBOOgJ-0o8xboWmcjulmB6x5MUpF4AjFnwO0I-zpTCvWEecXvSGCo-_lRrzd-wMo1sT22wxi/s16000/scene_1_adaptiveIndependent_0_05.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQI6w9FZs-I0sz4jBsVtSoY7VVKI6dgAwFZPmUyhCcFP4G8BL4ko2V1JXQ-kVU46d4hh9PdBOOgJ-0o8xboWmcjulmB6x5MUpF4AjFnwO0I-zpTCvWEecXvSGCo-_lRrzd-wMo1sT22wxi/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=0.5, α=0.05</td></tr></tbody></table>
</td>
</tr>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOPWaQGnMJYt5u3VqJCDIDnJpGqchaWg7bQh3GuJHzu-ylcPsAOarCftmT8zGFxvPnvcE8nr1ylAqO1uZG2Edocjpq33c4g5vdW90BVdwvTcjz9xcn-LUpDvpVzPB-XmqCzVmgdIDzKU6M/s16000/scene_1_singlePointHueDependent.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOPWaQGnMJYt5u3VqJCDIDnJpGqchaWg7bQh3GuJHzu-ylcPsAOarCftmT8zGFxvPnvcE8nr1ylAqO1uZG2Edocjpq33c4g5vdW90BVdwvTcjz9xcn-LUpDvpVzPB-XmqCzVmgdIDzKU6M/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span> projection</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRxw0Vcq5EhdDPbDkRIb-RL4-n5DDtNXLVcB17-GJn_mmNs5-yRFSppaoT67qNGKGbcO4l_zNUCc2WE8TtT9_-xwVxuYd-ptBL3HQOBAN3TpAB1_yryPEerf0-l8RxIqI34n4PyPrEJ9UA/s16000/scene_1_adaptiveDependent_5_0.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRxw0Vcq5EhdDPbDkRIb-RL4-n5DDtNXLVcB17-GJn_mmNs5-yRFSppaoT67qNGKGbcO4l_zNUCc2WE8TtT9_-xwVxuYd-ptBL3HQOBAN3TpAB1_yryPEerf0-l8RxIqI34n4PyPrEJ9UA/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>, α=5.0</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiILANuCJaZhXxq-koISB9frfqaewJBKnrM7DmmJ3hE54w0tDHT_oF8iaBROx5eVbkJJ8V2X9VxHIpx7A82zc1ZkKrFS1mwLyh7d-HsCDrLvJoU-gJZni2mNcffLLMNhZEAX-Ld26EpPJ19/s16000/scene_1_adaptiveDependent_0_05.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiILANuCJaZhXxq-koISB9frfqaewJBKnrM7DmmJ3hE54w0tDHT_oF8iaBROx5eVbkJJ8V2X9VxHIpx7A82zc1ZkKrFS1mwLyh7d-HsCDrLvJoU-gJZni2mNcffLLMNhZEAX-Ld26EpPJ19/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Adaptive L<span style="font-size: xx-small;">0</span>=L<span style="font-size: xx-small;">cusp</span>, α=0.05</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
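<p>For reference, below is a minimal Python sketch of how the projection target lightness L<span style="font-size: xx-small;">0</span> used in the captions above can be chosen. The adaptive formula follows <a href="https://bottosson.github.io/posts/gamutclipping/">[1]</a> as I understand it; <b>L_cusp</b> (the cusp lightness of the current hue slice) is assumed to be computed elsewhere, so treat this as a sketch rather than the exact demo code:</p>
<pre>
import math

def projection_target_L0(L, C, mode, alpha=0.05, L_cusp=None):
    # L, C: Oklab lightness/chroma of the out of gamut color
    # returns the lightness of the point (L0, chroma=0) we project towards
    if mode == "chroma_only":   # compress chroma only, keep lightness
        return min(max(L, 0.0), 1.0)
    if mode == "point":         # project towards a fixed point, e.g. L0=0.5
        return 0.5
    if mode == "cusp":          # project towards the cusp lightness
        return L_cusp
    if mode == "adaptive":      # adaptive, hue independent (L0 around 0.5)
        Ld = L - 0.5
        sgn = -1.0 if Ld < 0.0 else 1.0
        e1 = 0.5 + abs(Ld) + alpha * C
        return 0.5 * (1.0 + sgn * (e1 - math.sqrt(e1 * e1 - 2.0 * abs(Ld))))
    raise ValueError("unknown mode: " + mode)
</pre>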
<p></p><p><span>Since gamut clipping can reduce the hue shift of saturated red colors, I was wondering whether it can also fix the hue shift of a blue colored light (in sRGB) showing up as purple, which was described in the <a href="https://simonstechblog.blogspot.com/2020/03/dxr-path-tracer.html">DXR Path Tracer post</a> before. Unfortunately, gamut clipping can't fix this... I guess this may need to be fixed earlier in the pipeline (e.g. in the tone mapper, or with another gamut mapping method)...</span></p><table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9c7CP5h_hbwVAqBkoHeEgfUrlmG_57x6wpdJ-tGldQHnU1LCIUXkVWu4NVU_eUiFBe3Foeb1KK1d-6o9MCHPPMLTEhY3v4X83cqZj30lvb_sSVfYkVf9JkLJXtMztNXde6dDuQoxk_s2Q/s16000/sRGB_blue_noClip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9c7CP5h_hbwVAqBkoHeEgfUrlmG_57x6wpdJ-tGldQHnU1LCIUXkVWu4NVU_eUiFBe3Foeb1KK1d-6o9MCHPPMLTEhY3v4X83cqZj30lvb_sSVfYkVf9JkLJXtMztNXde6dDuQoxk_s2Q/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Without gamut clipping </td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><span></span></div>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4P18g_URVdD6EgflFLuV83qzaMR4yYzgXIjUYFzq6GXPIX-lFr0TynnRqocOiOgbo0-sCX0YSGnjouALnUYZbl5EmoEfY3CFx7BLlgjpVY7q_In2_7j56JkovYQaQbYi2ZJ9qRIo6HGMe/s16000/sRGB_blue_gamutClip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4P18g_URVdD6EgflFLuV83qzaMR4yYzgXIjUYFzq6GXPIX-lFr0TynnRqocOiOgbo0-sCX0YSGnjouALnUYZbl5EmoEfY3CFx7BLlgjpVY7q_In2_7j56JkovYQaQbYi2ZJ9qRIo6HGMe/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">With gamut clipping </td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><span></span></div>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit8aaa-_K5_ahoRGvoVW6PHdsnHZOqNIOMlDWEeIJk59g9OvvqdglNwjqEB0iz0EWApnQ3CjbnOR6-y41pNfExtAYJMU8P7t02We2rP30RLZu5qApju0YmqsOgsdYhR-h3nv3kV8tzV3Ix/s16000/sRGB_blue_outOfGamut.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit8aaa-_K5_ahoRGvoVW6PHdsnHZOqNIOMlDWEeIJk59g9OvvqdglNwjqEB0iz0EWApnQ3CjbnOR6-y41pNfExtAYJMU8P7t02We2rP30RLZu5qApju0YmqsOgsdYhR-h3nv3kV8tzV3Ix/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Out of gamut pixels</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><span></span></div>
</td>
</tr>
</tbody></table>
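<p>As a side note, the "out of gamut pixels" debug view above can be produced by simply flagging any linear sRGB pixel that falls outside the displayable range before clipping. This is only my guess at the criterion (the demo may use a different threshold):</p>
<pre>
def is_out_of_gamut(r, g, b, eps=1e-4):
    # a linear sRGB color is displayable only if all channels lie in [0, 1];
    # eps avoids flagging pixels that are off only by numerical noise
    return any(c < -eps or c > 1.0 + eps for c in (r, g, b))
</pre>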
<p><span>Lastly, a scene without much saturated color, but with overexposure, is tested. Gamut clipping doesn't change the image much:</span></p><table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRbe38sAAHF7U7lhp1BQqsth6WK-xRqMqm-WOf2Tub115lZLPWetl2cp2xj1fi2G8bm-en19MYDCU0nRHnVL7K5ykUbDNzUxE4Qt0Dq_Javy4AJyZXGV8_1u8YApaYpnr-iijpZgsBI8CQ/s16000/ov_noClip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRbe38sAAHF7U7lhp1BQqsth6WK-xRqMqm-WOf2Tub115lZLPWetl2cp2xj1fi2G8bm-en19MYDCU0nRHnVL7K5ykUbDNzUxE4Qt0Dq_Javy4AJyZXGV8_1u8YApaYpnr-iijpZgsBI8CQ/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Without gamut clipping </td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><span></span></div>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0r3SCuqF5SiwmXmRMe155A_3mOc0cw6QKmn3J3s_7k4LKr7ezWclwU5iLUT-Kd0vUaxnoHklr3VWLoP7TEXCXHCzUlwgax3EhvZdUDckizW-I7hJXyQ0diyRmEpr2P8vV1b2Z7CzhzDGh/s16000/ov_gamutClip.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0r3SCuqF5SiwmXmRMe155A_3mOc0cw6QKmn3J3s_7k4LKr7ezWclwU5iLUT-Kd0vUaxnoHklr3VWLoP7TEXCXHCzUlwgax3EhvZdUDckizW-I7hJXyQ0diyRmEpr2P8vV1b2Z7CzhzDGh/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">With gamut clipping </td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><span></span></div>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJnuGfpvZNo3Kmkq1oJGZ_U0YkaNuQV5-AKwypbhPpeCKFgJ-m7ClG_6PmKeXbj4FcjUqbE15AySZOqd6Pv7U92gnJLd8uw2bAxFUwpvHIcb0WgWuTyvTDGklvIOPUL49MnL0W-_d0FSm8/s16000/ov_outOfGamut.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJnuGfpvZNo3Kmkq1oJGZ_U0YkaNuQV5-AKwypbhPpeCKFgJ-m7ClG_6PmKeXbj4FcjUqbE15AySZOqd6Pv7U92gnJLd8uw2bAxFUwpvHIcb0WgWuTyvTDGklvIOPUL49MnL0W-_d0FSm8/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Out of gamut pixels</td></tr></tbody></table><div class="separator" style="clear: both; text-align: center;"><span></span></div>
</td>
</tr>
</tbody></table>
<br /><p><span style="font-size: large;"><b>Conclusion</b></span></p><p><span>In this post, an analytical solution is provided to perform gamut clipping for gamuts other than sRGB, and different gamut clipping methods are tested. "Compress chroma only" looks quite decent, while projection towards a single point may change the perceived lightness of the image (depending on the lighting setup). The adaptive method with a small alpha value (e.g. 0.05) behaves similarly to the compress-chroma-only method, while with a large alpha (e.g. >5.0) it behaves similarly to the projection-towards-single-point method. The demo can be <a href="https://drive.google.com/file/d/1R3mGRkG8T8reNRthwXr1XTh1JzMvjFtD/view?usp=sharing">downloaded</a> to play around with the different gamut clipping methods. Note that the demo relies on a saturated light color to generate out of gamut colors, and all the albedo textures are in sRGB (because the texture spectral up-sampling method only supports sRGB, while the light color uses a different spectral up-sampling method). Also, my demo performs the gamut clipping before blending with the UI, as all the UI is in sRGB color space; in the future, I may need to think about whether the UI should be gamut clipped if wide colors are used...<br /></span></p><p><span style="font-size: medium;"><b>References</b></span></p><p><span style="font-size: x-small;">[1] <a href="https://bottosson.github.io/posts/gamutclipping/">https://bottosson.github.io/posts/gamutclipping/</a></span></p><p><span style="font-size: x-small;">[2] <a href="https://www.ea.com/frostbite/news/high-dynamic-range-color-grading-and-display-in-frostbite">https://www.ea.com/frostbite/news/high-dynamic-range-color-grading-and-display-in-frostbite</a></span></p><p><span style="font-size: x-small;">[3] <a href="https://research.activision.com/publications/archives/hdr-in-call-of-duty">https://research.activision.com/publications/archives/hdr-in-call-of-duty</a><br /></span></p>Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-63535557944682379232021-05-23T18:53:00.002+08:002021-05-26T01:14:52.528+08:00Studying Gamut Clipping<p><span style="font-size: large;"><b>Introduction</b></span></p><p>Recently, I was studying a technique called gamut clipping from <a href="https://bottosson.github.io/posts/gamutclipping/">this blog post</a>. This technique handles out of gamut colors and brings them back into a valid range, which helps reduce hue shift and color clipping. That blog post explains the concept clearly, but I was struggling to understand the sample code the author provided. So this blog post will describe what I have learnt while studying the gamut clipping source code.
Also, I have written a <a href="https://www.shadertoy.com/view/fsBXRV">Shadertoy</a> sample to help me understand the problem.</p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhADMu-4xenLySau8_wrBgJPCY52BFRo-_TLqlY9TFy_XBVmduuA0XcJSwANiBVlIKPzqBVwUUgfwRLggbGhFO7VWrT7Ocotxx_fykCfluDwhndXWY_lst6XtKWisFibAu7zEEroljo1mV0/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="338" data-original-width="600" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhADMu-4xenLySau8_wrBgJPCY52BFRo-_TLqlY9TFy_XBVmduuA0XcJSwANiBVlIKPzqBVwUUgfwRLggbGhFO7VWrT7Ocotxx_fykCfluDwhndXWY_lst6XtKWisFibAu7zEEroljo1mV0/w400-h225/main.gif" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Drawing the valid sRGB colors with Oklab chroma as the horizontal axis and<br />Oklab lightness as the vertical axis, with the hue value displayed at the upper right corner<br /></td></tr></tbody></table><p></p><p><span style="font-size: large;"><b>Overview of the gamut clipping<br /></b></span></p><p>This section briefly describes the steps to perform gamut clipping; feel free to skip it if you have already read the original <a href="https://bottosson.github.io/posts/gamutclipping/">gamut clipping blog post</a>. The technique first converts the out of gamut color (e.g. those pixels with values >1.0 or <0.0 in sRGB) into the <a href="https://bottosson.github.io/posts/oklab/">Oklab color space</a> (which can be expressed with lightness, hue and chroma). Then we project this out of gamut color along a straight line to the "triangular" gamut boundary (like the pictures below). To calculate this intersection, we need to find the cusp coordinates of this particular hue slice. The author uses curve fitting to approximate the cusp coordinates with a polynomial equation. With the cusp coordinates calculated, the intersection point can be found by numerical approximation using <a href="https://en.wikipedia.org/wiki/Halley%27s_method">Halley's method</a>. The whole operation is summarized in the code sketch after the figures below. Finally, the intersection point can be converted back to sRGB color space.</p><p></p>
<table>
<tbody><tr>
<td>
<div style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3mTIlqplE8RCFGVmwk3JGKS1h0gaZGXMiA10vgXimFEMu1tGzj9FuXKWZqkRqpQAF79PL1VoQgKHCmasmA1P4QXt_9mbTPm43MDrkoGHkSjwyNBAvn0jPJM27apaQPxRE_IblreBO0e6i/"><img alt="" data-original-height="424" data-original-width="753" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3mTIlqplE8RCFGVmwk3JGKS1h0gaZGXMiA10vgXimFEMu1tGzj9FuXKWZqkRqpQAF79PL1VoQgKHCmasmA1P4QXt_9mbTPm43MDrkoGHkSjwyNBAvn0jPJM27apaQPxRE_IblreBO0e6i/" width="320" /></a></div>
</td>
<td>
<div style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhl-MEIyKpJThqaktL8KWiuJaOdF1af57w3d97oY-9UhsxJmSxjP4tNf1zD5Hub9wMt-BNu77_2Lkm1-nIBPMNeOnap-onMlGVnbrjirleAaHSu9n3j62mLjVngCgcnaIydmI1UCfWJ6Lc3/"><img alt="" data-original-height="424" data-original-width="753" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhl-MEIyKpJThqaktL8KWiuJaOdF1af57w3d97oY-9UhsxJmSxjP4tNf1zD5Hub9wMt-BNu77_2Lkm1-nIBPMNeOnap-onMlGVnbrjirleAaHSu9n3j62mLjVngCgcnaIydmI1UCfWJ6Lc3/" width="320" /></a></div>
</td>
</tr>
<tr>
<td><span style="font-size: x-small;">Showing an out of gamut color point being projected back to a valid value along a straight line.</span></td>
<td><span style="font-size: x-small;">Displaying all colors including the sRGB-clipped ones; out of gamut colors result in a hue shift (e.g. a blue hue displayed as purple).</span></td>
</tr>
</tbody></table>
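<p>To summarize the steps above in code form, here is a rough Python outline of the whole clipping operation. All helper functions are placeholders for the conversions and approximations described in the rest of this post (this is my sketch, not the author's actual implementation):</p>
<pre>
import math

def gamut_clip(rgb):
    # rgb: linear sRGB color, possibly with channels outside [0, 1]
    if all(0.0 <= c <= 1.0 for c in rgb):
        return rgb                              # already displayable
    L, C, h = linear_srgb_to_oklch(rgb)         # to Oklab lightness/chroma/hue
    a, b = math.cos(h), math.sin(h)             # hue as a point on the unit circle
    cusp = find_cusp(a, b)                      # polynomial cusp approximation
    L0 = choose_projection_target(L, C, cusp)   # where to project towards
    # parameter t in [0, 1] of the projection line where it hits the boundary
    # (simple line-line intersection, refined by Halley's method when needed)
    t = find_gamut_intersection(a, b, L, C, L0, cusp)
    L_clip = L0 * (1.0 - t) + t * L
    C_clip = t * C
    return oklch_to_linear_srgb(L_clip, C_clip, h)
</pre>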
<p></p><p><span style="font-size: large;"><b>Finding maximum saturation</b></span></p><p>To approximate the cusp, instead of taking the hue value and returning the cusp <i><b>(chroma, lightness)</b></i> coordinates directly, the author uses the maximum saturation value to find the cusp coordinates. This is the first thing I didn't understand when reading the source code: why is saturation related to the cusp? To understand this, we can start from the definition of saturation. Both chroma and saturation describe colorfulness, but saturation is "somehow normalized" and "not affected" by the brightness. A chroma/lightness/saturation relationship can be defined <a href="https://en.wikipedia.org/wiki/Colorfulness#CIELUV_and_CIELAB">like this</a>:<br /></p><p></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPEgKsdu4aXz2oE3HqVtsTDIhpRIsMLgxCAP_a0aGETnKfI3xjIWsHmBYeWEeWQw0ZuFlLLN-NYD_lBAFfcXLdX3Zz4bnYzqnYLWsQwPXmPa6WZCfTKTlrMpELHwkxKiuSpiyP_cleKPuQ/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="172" data-original-width="533" height="103" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPEgKsdu4aXz2oE3HqVtsTDIhpRIsMLgxCAP_a0aGETnKfI3xjIWsHmBYeWEeWQw0ZuFlLLN-NYD_lBAFfcXLdX3Zz4bnYzqnYLWsQwPXmPa6WZCfTKTlrMpELHwkxKiuSpiyP_cleKPuQ/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Saturation definition for the CIELAB space from Wikipedia;<br />we use the same definition for the Oklab space too.<br /></td></tr></tbody></table>Reading the author's <a href="https://colab.research.google.com/drive/1JdXHhEyjjEE--19ZPH1bZV_LiGQBndzs?usp=sharing">Colab source code</a>, he approximates the max saturation with a polynomial function using <i><b>a=cos(hue)</b></i> and <i><b>b=sin(hue)</b></i>:<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDu1l3fW2Ua-NdYe-Dn1r5h-Wxpk1L00mf_9qx_evzN_xBjeMCcPOaEG1_kmh1Kvx7ODDRhyPkmqYAQ4xWHU5bGxJPy7Y_fvvIEzf-KZ4r4BaJQ22Msv5MOidfbBrdLiICZMFk0vIzOs18/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="82" data-original-width="1105" height="30" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDu1l3fW2Ua-NdYe-Dn1r5h-Wxpk1L00mf_9qx_evzN_xBjeMCcPOaEG1_kmh1Kvx7ODDRhyPkmqYAQ4xWHU5bGxJPy7Y_fvvIEzf-KZ4r4BaJQ22Msv5MOidfbBrdLiICZMFk0vIzOs18/w400-h30/optimize_func.png" width="400" /></a></div><p></p><p>Then these <i><b>(saturation, hue)</b></i> values are converted to linear sRGB space and minimized for <b>R=0</b>, <b>G=0</b> and <b>B=0</b>, which yields 3 polynomial equations to approximate the maximum saturation.<br /></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrVbjQNK0thtV7sFK_cxVQOBbi6kLEkcGzrG72fxpn0R12X3zS06kwBQSgfD901fkCCPckEfqOYjtB2jmeRn9fk4yoJs2warBTfm5OZD5YOPcO12QeTsTq7lFF-2l4IzYrCiqhVmxetj_I/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="361" data-original-width="1142" height="202" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrVbjQNK0thtV7sFK_cxVQOBbi6kLEkcGzrG72fxpn0R12X3zS06kwBQSgfD901fkCCPckEfqOYjtB2jmeRn9fk4yoJs2warBTfm5OZD5YOPcO12QeTsTq7lFF-2l4IzYrCiqhVmxetj_I/w640-h202/colab_optimize.png" width="640" /></a></div><p></p><p>But why can we obtain the maximum saturation when either <b>R</b>/<b>G</b>/<b>B</b> = 0? To get a rough understanding of it, I decided to open a color picker with an HSL slider to "simulate" how the RGB values change:<br /></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY5Rav5Z336ml63nROTCXnhU9mjo4zYJcolJOeCs58LnTqvYSc0-hGAUBJN9PbQDZxjW_DDG31_Xvmec8xcpBjc4fo7u0LlRcXNTLnqVhwTsz14REIH0GHsODfs9ernDMKWR-n6cwAn_vM/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="528" data-original-width="800" height="211" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY5Rav5Z336ml63nROTCXnhU9mjo4zYJcolJOeCs58LnTqvYSc0-hGAUBJN9PbQDZxjW_DDG31_Xvmec8xcpBjc4fo7u0LlRcXNTLnqVhwTsz14REIH0GHsODfs9ernDMKWR-n6cwAn_vM/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">A color picker at maximum saturation with the hue changing slowly.<br />There will always be a 0 in the RGB values.<br /></td></tr></tbody></table><p></p><p>First I chose the most saturated red color (255, 0, 0) in sRGB, which yields the HSL value (0, 240, 120). Then I changed the hue value slowly to observe how the RGB values change. From the above animated gif, we can see that there is always a 0 in either the R, G or B channel. So I guess the author is using this property, and the Oklab space has a similar property.</p><p>To further understand how the saturation looks for all hue slices in Oklab, I plotted the saturation value; it seems to always have its largest value at the lower "triangle" edge:<br /></p><p></p><div style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3MgiLfybaGVS67Z50WgBEZiXu_hgQzKrMLx0Py72YfuaX7XoFd7mC10NpBWBkHGTkHhA7emA26PbjgC8IxiLnlh_LnOTVjVY9luODrxfv0oYpUjFf5OYcd9lcbhRlukRDymOOEX6cZjcb/"><img alt="" data-original-height="338" data-original-width="600" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3MgiLfybaGVS67Z50WgBEZiXu_hgQzKrMLx0Py72YfuaX7XoFd7mC10NpBWBkHGTkHhA7emA26PbjgC8IxiLnlh_LnOTVjVY9luODrxfv0oYpUjFf5OYcd9lcbhRlukRDymOOEX6cZjcb/" width="320" /></a></div><p></p><p>We can also draw the line for those pixels with <b>R</b>/<b>G</b>/<b>B</b> value = 0:</p><p></p><div style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTpC39IWVvP4AR8Fp6A8QGxVN7yRQ7q1i8mL9TXcI9slMQWOSBpxuw3IOeUKaKkeg3Unr-HLEYeOlyI_75XGG_2chhsIUTKKk9RhxFukEWReCgjBlg8MrhXKjctJhk7nlan3UeMBNN32ed/"><img alt="" data-original-height="338" data-original-width="600" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTpC39IWVvP4AR8Fp6A8QGxVN7yRQ7q1i8mL9TXcI9slMQWOSBpxuw3IOeUKaKkeg3Unr-HLEYeOlyI_75XGG_2chhsIUTKKk9RhxFukEWReCgjBlg8MrhXKjctJhk7nlan3UeMBNN32ed/" width="320" /></a></div><p></p><p>The lower "triangle" edge is actually the "clipping" line when <b>R=0</b> or <b>G=0</b> or <b>B=0</b> (switching between these 3 lines)! That's why the author uses 3 different polynomials to approximate the max saturation. The next problem is how to pick one of the three approximated polynomials. From the above animated gif, we know that the "clipping line" changes when the sRGB value = (1, 0, 0), (0, 1, 0) or (0, 0, 1).
We can see this from the <a href="https://colab.research.google.com/drive/1JdXHhEyjjEE--19ZPH1bZV_LiGQBndzs?usp=sharing">Colab sample code</a>, which finds <b>r_dir</b>/<b>g_dir</b>/<b>b_dir</b> and uses them to pick one of the three approximated polynomials.</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7F48oXo_I2zoNrygVmXSAPuCNGgpjQeQisfRKuNOXe41OGrsxNdDtcC-SA7eZBWryKfwKpdCQn1GS4-v3JmUpom37bfk6UIaUvnz6ExxuaujejQtNC0cKf2AFapldFrosMO_NyhhZ6x9Y/s1391/rgb_dir.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="347" data-original-width="1391" height="160" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj7F48oXo_I2zoNrygVmXSAPuCNGgpjQeQisfRKuNOXe41OGrsxNdDtcC-SA7eZBWryKfwKpdCQn1GS4-v3JmUpom37bfk6UIaUvnz6ExxuaujejQtNC0cKf2AFapldFrosMO_NyhhZ6x9Y/w640-h160/rgb_dir.png" width="640" /></a></div>The <b>r_dir</b>/<b>g_dir</b>/<b>b_dir</b> are the "half vectors" between the "opposite hues", like in the figure below. These direction vectors can be used with a dot product to check which <b>r_h</b>/<b>g_h</b>/<b>b_h</b> range a hue value (in the form <b>cos(hue)</b>, <b>sin(hue)</b>) lies within.<p></p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUXMaLSC-1hi2ar8XZWH4WXDENkb6jNhvpJjhrEubTJilR8yP1phyphenhyphenjDDKcKAiNBmG2B3ox-oa4uh2_swBlLQOgPqxjQYyh9F4auknWZ-i6i1xAgCPdOp8JKEeHlSRtV0XJHqsfvvzrg0LB/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="934" data-original-width="1024" height="292" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUXMaLSC-1hi2ar8XZWH4WXDENkb6jNhvpJjhrEubTJilR8yP1phyphenhyphenjDDKcKAiNBmG2B3ox-oa4uh2_swBlLQOgPqxjQYyh9F4auknWZ-i6i1xAgCPdOp8JKEeHlSRtV0XJHqsfvvzrg0LB/w320-h292/polar.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><span style="font-size: x-small;">Visualizing the r/g/b_dir<b>'</b> vector (before multiplying the scalar constant for fast dot product check).</span></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9QTijxAWsU2te8mSX9f-JqZz2-JSPxoWRO9PK-5U8BXza1WevAOIC9ZmEv6lz4fzDWtX34X7fOkUXPQlCMu3tH_f6hBZxGFZFTPVWeODLL2nbXPzMGKO2Zgs-SQrV-NrhWzPu6BavbLd6/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="525" data-original-width="1007" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9QTijxAWsU2te8mSX9f-JqZz2-JSPxoWRO9PK-5U8BXza1WevAOIC9ZmEv6lz4fzDWtX34X7fOkUXPQlCMu3tH_f6hBZxGFZFTPVWeODLL2nbXPzMGKO2Zgs-SQrV-NrhWzPu6BavbLd6/w400-h209/r_dir_usage.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><span style="font-size: x-small;">The formula for checking which hue range the <i><b>a</b></i>, <i><b>b</b></i> hue value is in, which can be derived from dot product.</span></td></tr></tbody></table>
</td>
</tr>
</tbody></table>
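<p>Putting the two pieces together, the maximum saturation lookup is just a dot product test followed by evaluating the selected polynomial. In the Python sketch below, the selector constants are the ones I read from the reference implementation in [1] (please verify against the original source), and the fitted <b>k</b> coefficient sets are placeholders:</p>
<pre>
# placeholder coefficient sets; the real fitted values come from the
# Colab optimization / reference [1]
K_RED = K_GREEN = K_BLUE = (0.0, 0.0, 0.0, 0.0, 0.0)

def compute_max_saturation(a, b):
    # a = cos(hue), b = sin(hue); pick the polynomial of the channel that
    # reaches 0 first, using the r/g/b_dir dot product test shown above
    if -1.88170328 * a - 0.80936493 * b > 1.0:
        k = K_RED        # red channel hits 0 first
    elif 1.81444104 * a - 1.19445276 * b > 1.0:
        k = K_GREEN      # green channel hits 0 first
    else:
        k = K_BLUE       # blue channel hits 0 first
    return k[0] + k[1] * a + k[2] * b + k[3] * a * a + k[4] * a * b
</pre>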
<p>Another thing I didn't understand: when the author minimizes the RGB values to 0, he raises the error value to the power of 10 in the <b>e_R(x)</b>/<b>e_G(x)</b>/<b>e_B(x)</b> functions.</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO_QMi57Ps06Z36RXSbxl1f1_NxtqOvOOwlYwbmF_lhoLknWK2XwPrJPKfmvhylZ8bZLCazF0nuN9SUaSSu69jgJmscQUd6ViG0eP-sWlWNa0V7S2TT9UGhrLQPIKaVdzZcRPtAIMcFwO7/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="28" data-original-width="506" height="22" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO_QMi57Ps06Z36RXSbxl1f1_NxtqOvOOwlYwbmF_lhoLknWK2XwPrJPKfmvhylZ8bZLCazF0nuN9SUaSSu69jgJmscQUd6ViG0eP-sWlWNa0V7S2TT9UGhrLQPIKaVdzZcRPtAIMcFwO7/w400-h22/pow10.png" width="400" /></a></div><p></p><p>When I tried throwing random values into the initial guess of <b>scipy.optimize.minimize()</b> (e.g. using all 0s or all 1s as the initial guess), the resulting approximated curve was not that good...</p><p>However, when I changed the <b>pow(x, 10)</b> to <b>abs(x)</b>, using random initial guess values gave a more predictable result (though the error is not as small as the author's approximation). Maybe I will use <b>abs(x)</b> instead when optimizing for different color spaces in the future.<br /></p><p><b></b></p><p><span style="font-size: large;"><b>Finding cusp from maximum saturation</b></span></p><p>With the above polynomial approximation, we can find the "lower clipping line" of the gamut "triangle" (i.e. where either the R/G/B value = 0). The cusp must be on this "clipping line"; we only need to identify the lightness value at the cusp. Looking at the author's <b><i>find_cusp()</i></b> function, it seems the cusp will have an R, G or B value equal to 1:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZP0YjZkFzWkQESHNEiDDut91Wl5R5IhJhDe1uDRm-d7TiGIGKANU42ugaUJsY4iIuORlyfSeZ0PwggNRFYoQ6B0xxohsHELAKcQNhqfkw_crgiVbCEb-kKS_e37x5xHaK5DUSHG6i4Ip7/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="423" data-original-width="923" height="294" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZP0YjZkFzWkQESHNEiDDut91Wl5R5IhJhDe1uDRm-d7TiGIGKANU42ugaUJsY4iIuORlyfSeZ0PwggNRFYoQ6B0xxohsHELAKcQNhqfkw_crgiVbCEb-kKS_e37x5xHaK5DUSHG6i4Ip7/w640-h294/find_cusp.png" width="640" /></a></div>To get a better understanding, we can repeat the above color picker "experiment": when changing the hue value at maximum saturation, besides always having a 0 value in <b>R</b>/<b>G</b>/<b>B</b>, there is always a 255 value in <b>R</b>/<b>G</b>/<b>B</b> too!<p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY5Rav5Z336ml63nROTCXnhU9mjo4zYJcolJOeCs58LnTqvYSc0-hGAUBJN9PbQDZxjW_DDG31_Xvmec8xcpBjc4fo7u0LlRcXNTLnqVhwTsz14REIH0GHsODfs9ernDMKWR-n6cwAn_vM/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="528" data-original-width="800" height="211" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiY5Rav5Z336ml63nROTCXnhU9mjo4zYJcolJOeCs58LnTqvYSc0-hGAUBJN9PbQDZxjW_DDG31_Xvmec8xcpBjc4fo7u0LlRcXNTLnqVhwTsz14REIH0GHsODfs9ernDMKWR-n6cwAn_vM/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><span style="font-size: x-small;">The same color picker gif as above;<br />pay attention to the RGB values:<br />in addition to always having a 0 value,<br />there is a 255 value too.</span><br /></td></tr></tbody></table><p>Plotting the line when the <b>R</b>/<b>G</b>/<b>B</b> value = 1 for all hue slices in Oklab space:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5hIYjWqQ-XqaQL0I-tJDqw-PR1QN8nqLe0CU_H1A-8oHM26hV7Bb6J3Q8tW5oNaryTp4XOmX1xEJUjZJvloM1DNmKbEYVXQHiJgwbHXqLS9XImgBxjSI6dNDVwNkSQc7RqX3sHiGmRfVC/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="338" data-original-width="600" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5hIYjWqQ-XqaQL0I-tJDqw-PR1QN8nqLe0CU_H1A-8oHM26hV7Bb6J3Q8tW5oNaryTp4XOmX1xEJUjZJvloM1DNmKbEYVXQHiJgwbHXqLS9XImgBxjSI6dNDVwNkSQc7RqX3sHiGmRfVC/" width="320" /></a></div><p>When the <b>R/G/B</b> value = 1, it is the upper clipping line of the "gamut triangle"! So the cusp is the intersection between the lower clipping line with <b>R/G/B</b>=0 and the upper clipping line with <b>R/G/B</b>=1.<br /></p><p>So, to obtain the maximum lightness and maximum saturation, we can scale the lightness so that when converted back to sRGB, one of the RGB values equals 1 (taking the cube root is needed because of the <a href="https://bottosson.github.io/posts/oklab/">Oklab color space LMS definition</a>).</p><p></p><p><span style="font-size: large;"><b>Finding gamut intersection</b></span></p><p></p><p>With the gamut cusp coordinates, the projection target coordinates and the out of gamut coordinates, we can find the intersection with the gamut boundary during projection. First we need to check whether the intersection is in the upper half or the lower half of the valid gamut "triangle". This can be determined by the formula below, which can be derived by checking whether the projection line is on the left/right side of the cusp line using a cross product.</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwV7VWQI1-aR_6hSxvXhK8B99XQ9VJmEDdZn_ILidahB2hQkxNQ3zhNcPmHmZDhVFhE-EaaNFPQzR3Gk3LjjBsgb40hqUT1eeNROWfeGd-mdkUySVNd_0SecDdwduB96dWYAGBSCu8qUXz/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="89" data-original-width="606" height="59" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwV7VWQI1-aR_6hSxvXhK8B99XQ9VJmEDdZn_ILidahB2hQkxNQ3zhNcPmHmZDhVFhE-EaaNFPQzR3Gk3LjjBsgb40hqUT1eeNROWfeGd-mdkUySVNd_0SecDdwduB96dWYAGBSCu8qUXz/w400-h59/upper_lower_check.png" width="400" /></a></div><p></p><p>If the intersection is in the lower half, it is just a simple line-line intersection. If it is in the upper half, we first approximate the intersection with a line-line intersection, and then refine the answer with Halley's method.
Since we know the upper clipping line is <b>R</b>/<b>G</b>/<b>B</b>=1, we can see the author using this property when applying Halley's method in the following code snippet:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_DIUWD4eTZbL55eNp5yZ5S0xsm-X1RwMqCCh3XZDeO8yrs6-JKEHoxr62iVcVkEOwAaKULvDHXhWZU6aYFmWKhnGiUh2dR0smMAlxEoJYG-teTDVIXoc0eOT-PBnktHMKICg5Y8n2Du3e/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="90" data-original-width="750" height="48" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_DIUWD4eTZbL55eNp5yZ5S0xsm-X1RwMqCCh3XZDeO8yrs6-JKEHoxr62iVcVkEOwAaKULvDHXhWZU6aYFmWKhnGiUh2dR0smMAlxEoJYG-teTDVIXoc0eOT-PBnktHMKICg5Y8n2Du3e/w400-h48/halley_minus1.png" width="400" /></a></div><p></p><p>The author uses all 3 clipping curves and takes the minimum value to select the closest curve that clips the gamut.</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeZfmaCx5Lpi7snZ3vEAyKcOnDDMg-JcZZ6uDKGbmJb62HDfI92H_bock7aYFQwmssK5SwS8FwJjgcBv5JrAO5ICoUipadDi8LtOonoYgbrlBl35eiY9lzuVfHIR3CiLfTy0bZQyXida9n/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="145" data-original-width="345" height="84" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeZfmaCx5Lpi7snZ3vEAyKcOnDDMg-JcZZ6uDKGbmJb62HDfI92H_bock7aYFQwmssK5SwS8FwJjgcBv5JrAO5ICoUipadDi8LtOonoYgbrlBl35eiY9lzuVfHIR3CiLfTy0bZQyXida9n/w200-h84/upper_min_rgb.png" width="200" /></a></div><p></p><p>If a small error is acceptable, we can use 1 clipping curve instead of 3. The upper clipping curve change happens roughly at RGB values (0, 1, 1), (1, 0, 1) and (1, 1, 0). We can use the same method as picking the <b>R</b>/<b>G</b>/<b>B</b>=0 curve during the maximum saturation calculation, which relies on <b>r_dir</b>, <b>g_dir</b> and <b>b_dir</b>: we can compute <b>c_dir</b>, <b>m_dir</b> and <b>y_dir</b> (which correspond to the cyan, magenta and yellow directions). Those coefficients can be found in my <a href="https://www.shadertoy.com/view/fsBXRV">Shadertoy source code</a>. Because the upper clipping curve is not a straight line, we may need 2 clipping lines to compute the correct answer for some hue values:</p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBM9wz01OYXT_UdPdHOT9li2Q74X0ZSx1xZbZXWQZWzXlovR504WB_Dr2VxWkozzAY_9aF0-yFM-6xy72aiGisvxhmgK_Fm0i_Ad6AGXdiDyLPAV1tPSGIp56KQn6tslikenXvnlFSSuIz/s759/one_line_2_curve.png" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="367" data-original-width="759" height="194" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBM9wz01OYXT_UdPdHOT9li2Q74X0ZSx1xZbZXWQZWzXlovR504WB_Dr2VxWkozzAY_9aF0-yFM-6xy72aiGisvxhmgK_Fm0i_Ad6AGXdiDyLPAV1tPSGIp56KQn6tslikenXvnlFSSuIz/w400-h194/one_line_2_curve.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">2 upper clipping lines are needed for this hue slice<br /></td></tr></tbody></table><p></p><p><span style="font-size: large;"><b>Conclusion</b></span></p><p>In this post, I have described the process of learning the gamut clipping technique.
With the help of the <a href="https://www.shadertoy.com/view/fsBXRV">Shadertoy sample</a>, we can see that the gamut boundary consists of the lines with <b>R</b>/<b>G</b>/<b>B</b>=0 and <b>R</b>/<b>G</b>/<b>B</b>=1, and the gamut cusp is the intersection between the <b>R</b>/<b>G</b>/<b>B</b>=0 line and the <b>R</b>/<b>G</b>/<b>B</b>=1 line. But I still have some questions left in my mind: Is it meaningful to use negative chroma values? Does gamut clipping work well in practice, or is a gamut compression method needed? Maybe I will implement gamut clipping in my toy path tracer to see how it looks.</p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjR40oF6Z5kNfW4f8YNvDSyZhjiMaf44Me_B80zMUPQW4sA_-g_XLIDLkJV8LOPc7XWl3G28j9hvrXGu0s5kembBGQcEvOP8WpJMaDDxMMYjOXNzc8ZPNBLwbq9QFOBPrAfVO9eYDyYj5DS/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1080" data-original-width="1920" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjR40oF6Z5kNfW4f8YNvDSyZhjiMaf44Me_B80zMUPQW4sA_-g_XLIDLkJV8LOPc7XWl3G28j9hvrXGu0s5kembBGQcEvOP8WpJMaDDxMMYjOXNzc8ZPNBLwbq9QFOBPrAfVO9eYDyYj5DS/w400-h225/zoom_out.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Zooming out the graph, do the negative values (e.g. negative chroma, negative lightness) have meaning?<br />The G=1 clipping line "wraps around" to negative lightness; does it have meaning too?<br /></td></tr></tbody></table><p></p><p><b>Reference</b></p><p>[1] <a href="https://bottosson.github.io/posts/gamutclipping/">https://bottosson.github.io/posts/gamutclipping/</a> <br /></p><p>[2] <a href="https://bottosson.github.io/posts/oklab/">https://bottosson.github.io/posts/oklab/</a></p><p>[3] <a href="https://en.wikipedia.org/wiki/Colorfulness#CIELUV_and_CIELAB">https://en.wikipedia.org/wiki/Colorfulness#CIELUV_and_CIELAB</a><br /></p><p><br /></p>Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-82557293242095395242021-03-26T01:01:00.015+08:002021-03-26T01:13:35.056+08:00Importance sampling visible wavelength<p><span style="font-size: large;"><b>Introduction</b></span></p><p>It has been half a year since my last post. Due to the pandemic and political environment in Hong Kong, I haven't had much time/mood to work on my hobby path tracer... Recently, I tried to get back to this hobby; maybe it is better to work on some small tasks first. One thing I am not satisfied with in the <a href="http://simonstechblog.blogspot.com/2020/07/spectral-path-tracer.html">previous spectral path tracing post</a> is using 3 different cosh curves (with peaks at <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space#Color_matching_functions">the XYZ color matching functions</a>) to importance sample the visible wavelength instead of 1. So I decided to revise it and find another PDF for taking random wavelength samples.
A demo with the updated importance sampled wavelength can be downloaded <a href="https://drive.google.com/file/d/1RjKkKl8JbrU4V1aVc0BbD6CdjJ0ECzWZ/view?usp=sharing">here</a>, and the python code used for generating the PDF can be viewed <a href="https://colab.research.google.com/drive/1nwkqQNqtO2SeMDgY2W44N4WQXyvLjrT7?usp=sharing">here</a> (inspired by <a href="https://twitter.com/BartWronsk/status/1365171200188510209">Bart Wronski's tweet</a> to use Colab).</p><p></p><span style="font-size: large;"><b>First Failed Try</b></span><p></p><p>The basic idea is to create a function with peak values at the same locations as <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space#Color_matching_functions">the color matching functions</a>. I decided to use the sum of the XYZ color matching curves as the PDF.<br /></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJ90T1CA7-4JTiEG_Y3DkFpye-ii7eaFTxMuwSiP1WQ-zIgderbR5Hc0hwada_vyOz9PW6nQOIVcK9dTFmc3eDiKQ5-VSuuILooBgsUNAGlSc28psRkYiULyByMFraKt1h-YmE1zONf-Zt/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="247" data-original-width="379" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJ90T1CA7-4JTiEG_Y3DkFpye-ii7eaFTxMuwSiP1WQ-zIgderbR5Hc0hwada_vyOz9PW6nQOIVcK9dTFmc3eDiKQ5-VSuuILooBgsUNAGlSc28psRkYiULyByMFraKt1h-YmE1zONf-Zt/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Black line is the sum of the XYZ curves<br /></td></tr></tbody></table><p></p><p>To simplify calculation, an analytical approximation of the XYZ curves can be used. A common approximation can be found <a href="http://jcgt.org/published/0002/02/01/paper.pdf">here</a>, but it seems hard to integrate (due to the λ squared term) to create the CDF. So a rational function is used instead:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqzlB44ip9U88QvdFyJymlkTVYyIFjFFFOWHA9GaeNoZkvTQ062UwvC9RSHmYffxS4soJwoAlQ79r2qFGYgXsNL1VjeThbHzgZZHBsEHpgessY9PUMK61Ja4DqXio2gwy1P4rmyZdFsmdM/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="185" data-original-width="594" height="63" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqzlB44ip9U88QvdFyJymlkTVYyIFjFFFOWHA9GaeNoZkvTQ062UwvC9RSHmYffxS4soJwoAlQ79r2qFGYgXsNL1VjeThbHzgZZHBsEHpgessY9PUMK61Ja4DqXio2gwy1P4rmyZdFsmdM/w200-h63/formula.png" width="200" /></a></div><p></p><p>The X curve is approximated with 1 peak, dropping the small peak at around 450nm; because we want to compute the "sum of XYZ" curve, that missing peak can be compensated by scaling the Z curve. The approximated curves look like:</p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvYEkp_E40KOU5mWBwjyeiQI9BqqiIMYsCbZeNWGKsvcIehFjnX_SCK2MWtFf2sro6_8fvFPws2SjCdmPwpg1JEwE_G4tyG7C1xu-ldYtdlOtaSs_RtT4uXllAmq40DIBo4TNgt5k392ZH/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="248" data-original-width="381" height="208" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvYEkp_E40KOU5mWBwjyeiQI9BqqiIMYsCbZeNWGKsvcIehFjnX_SCK2MWtFf2sro6_8fvFPws2SjCdmPwpg1JEwE_G4tyG7C1xu-ldYtdlOtaSs_RtT4uXllAmq40DIBo4TNgt5k392ZH/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Light colored curves are the approximated function<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2wwd_hauOuyvWRFqpXyCKAzh7Wqi7REAKxQsrCPdV68caM9A7iCLf2KkyKcTSy3O6HQ07ydiCBKhu3AN9UeE_nDs6l8K2U5WnS1WbaLZfcziBcHT5YhIcHTsJloqvao8wjtOXXJPQowWv/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="250" data-original-width="393" height="204" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2wwd_hauOuyvWRFqpXyCKAzh7Wqi7REAKxQsrCPdV68caM9A7iCLf2KkyKcTSy3O6HQ07ydiCBKhu3AN9UeE_nDs6l8K2U5WnS1WbaLZfcziBcHT5YhIcHTsJloqvao8wjtOXXJPQowWv/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Grey color curve is the approximated PDF<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table>The approximated PDF is roughly similar to the sum of the XYZ color matching curves. But I made a mistake: although the rational function can be integrated to create the CDF, I don't know how to compute the inverse of the CDF (which is needed by the inverse method to draw random samples using uniform random numbers). So I need to find another way...<br /><p><b><span style="font-size: large;">Second Try</span></b></p><p>Although I don't know how to find the inverse of the approximated CDF from the previous section, out of curiosity I still wanted to know how that CDF looks, so I plotted the graph:<br /></p><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiruU-hs0sM0S-hlGWJNLU0dCzci04Rv1AsVaAwnAQwsOzrLbG7WsZ7ppeTJ_-KGBSMD0bdVskoaWEPozqckRWWE2BbeRuXXaqOsxy2vxjcaA3dsizyoY5xE196llFIjzJEg267pb465Sme/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="248" data-original-width="380" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiruU-hs0sM0S-hlGWJNLU0dCzci04Rv1AsVaAwnAQwsOzrLbG7WsZ7ppeTJ_-KGBSMD0bdVskoaWEPozqckRWWE2BbeRuXXaqOsxy2vxjcaA3dsizyoY5xE196llFIjzJEg267pb465Sme/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Black line is the original CDF<br />Grey line is the approximated CDF</td></tr></tbody></table><p>It looks like several smoothstep functions added together: 1 base smoothstep curve over the range [380, 780] with 2 smaller smoothstep curves (over roughly [400, 500] and [500, 650]) added on top of the base curve. Maybe I can approximate this CDF with some kind of polynomial function. After some trial and error, an approximated CDF is found (details of the CDF and PDF can be found in the <a href="https://colab.research.google.com/drive/1nwkqQNqtO2SeMDgY2W44N4WQXyvLjrT7?usp=sharing">python code</a>):</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgft-sHufmtt5T79Fd-bufdXYgqBTj-ebJD6HHXfWet8G7dhoS6xWbdZfcCG8n6R1oDZc8cwEcjv20J_o1szuDQt4xuzLOZauxkbGAU5Ve0M6EBG4ow5mQVHT1OQvl40-dIJDzTSolDhUrH/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="283" data-original-width="775" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgft-sHufmtt5T79Fd-bufdXYgqBTj-ebJD6HHXfWet8G7dhoS6xWbdZfcCG8n6R1oDZc8cwEcjv20J_o1szuDQt4xuzLOZauxkbGAU5Ve0M6EBG4ow5mQVHT1OQvl40-dIJDzTSolDhUrH/w640-h234/cdf_smoothStep_func.png" width="640" /></a></div><p></p><p>The above function divides the visible wavelength spectrum into 4 intervals to form a <a href="https://en.wikipedia.org/wiki/Piecewise">piecewise function</a>. Since smoothstep is a cubic function, adding smoothstep functions together still gives a cubic function, which can be inverted and differentiated (see the sketch after the figures below). The approximated "smoothstep CDF/PDF" curves look like:</p><p></p>
<table>
<tbody>
<tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5FWaf-OX5w42wFuqZgiLLUnP15WtSIyeryQzY50zi5p7GIsl6UDK5uzqkGETRV1vS9XtEtDPDoXeKyWdmppEZ66nz6ceB7SAG6seUafw1DZIn58yHFytJ490Yrxxy3AnizRW7nMGOLxKc/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="248" data-original-width="380" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5FWaf-OX5w42wFuqZgiLLUnP15WtSIyeryQzY50zi5p7GIsl6UDK5uzqkGETRV1vS9XtEtDPDoXeKyWdmppEZ66nz6ceB7SAG6seUafw1DZIn58yHFytJ490Yrxxy3AnizRW7nMGOLxKc/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Blue line is the approximated "smoothstep CDF"<br />Black line is the original CDF<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieqYk7_cYaRUMVbzvCBNnVPoKkL_oi3AEiCF-s9NDEpgTX_uSZDKq7Qt7dhU0odhDMqU-S9Y5xe8S8TTNVYQEMSdYg4po0D9a-loI75qy7ZW1WA_a4O3tHO7pZifOQJMhi8Mm6WYuWyAPG/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="250" data-original-width="392" height="204" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieqYk7_cYaRUMVbzvCBNnVPoKkL_oi3AEiCF-s9NDEpgTX_uSZDKq7Qt7dhU0odhDMqU-S9Y5xe8S8TTNVYQEMSdYg4po0D9a-loI75qy7ZW1WA_a4O3tHO7pZifOQJMhi8Mm6WYuWyAPG/" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Blue line is the approximated "smoothstep PDF"<br />Black line is the original PDF</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
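<p>To illustrate the inverse-transform step with a single smoothstep segment: the cubic s(t) = 3t&#178; - 2t&#179; has a closed-form inverse, so drawing a sample only needs one uniform random number. This toy Python sketch uses just one smoothstep over the whole visible range; the actual CDF above is a 4-interval piecewise cubic, where each interval is inverted in the same spirit (a general cubic needs a cubic solver, see reference [3]):</p>
<pre>
import math

def inverse_smoothstep(y):
    # closed-form inverse of s(t) = 3t^2 - 2t^3 on [0, 1]
    return 0.5 - math.sin(math.asin(1.0 - 2.0 * y) / 3.0)

def sample_wavelength_toy(u, lo=380.0, hi=780.0):
    # u: uniform random number in [0, 1); toy single-smoothstep CDF,
    # the real sampler picks one of the 4 piecewise intervals first
    t = inverse_smoothstep(u)
    return lo + t * (hi - lo)

def pdf_toy(wavelength, lo=380.0, hi=780.0):
    # derivative of the smoothstep CDF, used as the Monte Carlo weight
    t = (wavelength - lo) / (hi - lo)
    return (6.0 * t - 6.0 * t * t) / (hi - lo)
</pre>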
Although the "smoothstep CDF" looks smooth, its PDF is not (not <a href="https://en.wikipedia.org/wiki/Smoothness">C1 continuous</a> at around 500nm). But it has 2 peaks value at around 450nm and 600nm, may be let's try to render some images to see how it behaves.<br /><p><span style="font-size: large;"><b>Result</b></span><br /></p><p>The same Sponza scene is rendered with 3 different wavelength sampling functions: uniform, cosh and smoothstep (with stratification of random number disabled, color noise are more noticeable when zoomed in):</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3aXyicpUFHFrLPp5gFMYKwqOS085Q7ah7v5M4Dnxz8c6NNdcAi500gdA3Ev-BpUOx2vEwv0LkHBnNBKDfnFM7U6e6Tab7l3VLD3e9grpjV2ml0JTFc6yFxII3Pmh0WZrEXe0AoirhCv4r/s16000/100_0_uniform.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="106" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3aXyicpUFHFrLPp5gFMYKwqOS085Q7ah7v5M4Dnxz8c6NNdcAi500gdA3Ev-BpUOx2vEwv0LkHBnNBKDfnFM7U6e6Tab7l3VLD3e9grpjV2ml0JTFc6yFxII3Pmh0WZrEXe0AoirhCv4r/w200-h106/100_0_uniform.png" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">uniform weighted sampling<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEib7k88OSykvDmPfSHPA74BF_9JMfn1W-WEKY4ptTjZ4Cp44IT2v-zDJpZ7fLrKUVvddrPbF0e-vmSU7JSY4THtxuuhG_9gBaUUUu3-_VEZAPuvDjlaCrj9RD5YWExqvaLdQVksKO_Nj8Tt/s16000/100_0_cosh.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="106" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEib7k88OSykvDmPfSHPA74BF_9JMfn1W-WEKY4ptTjZ4Cp44IT2v-zDJpZ7fLrKUVvddrPbF0e-vmSU7JSY4THtxuuhG_9gBaUUUu3-_VEZAPuvDjlaCrj9RD5YWExqvaLdQVksKO_Nj8Tt/w200-h106/100_0_cosh.png" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">cosh weighted sampling</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisMCZbgvubYNx0j-6-0N2nQzTd9CYMBR-jKg369Cvb8WgAZmhj6CiPKtx8fkdCu3RdjvZPDDuGRb-hMlKNSVchVw7PFgg15SeaUmh4XiRlmWrjjfrBw74g1U46INqOS5nxFbqV-oEt-ZrA/s16000/100_0_smoothstep.png" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="728" data-original-width="1368" height="106" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisMCZbgvubYNx0j-6-0N2nQzTd9CYMBR-jKg369Cvb8WgAZmhj6CiPKtx8fkdCu3RdjvZPDDuGRb-hMlKNSVchVw7PFgg15SeaUmh4XiRlmWrjjfrBw74g1U46INqOS5nxFbqV-oEt-ZrA/w200-h106/100_0_smoothstep.png" width="200" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">smoothstep weighted sampling</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
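<p>For context, here is a generic sketch of how a sampled wavelength and its PDF enter the spectral estimator (not the demo's exact code): the smaller the variation of the integrand divided by the PDF across samples, the less color noise. It reuses the hypothetical <b>sample_wavelength_toy()</b>/<b>pdf_toy()</b> helpers from the sketch above:</p>
<pre>
import random

def estimate_response(radiance, cmf, n_samples=1024):
    # Monte Carlo estimate of the integral of radiance(l) * cmf(l) over
    # the visible range, weighting each sample by 1/pdf as usual
    total = 0.0
    for _ in range(n_samples):
        l = sample_wavelength_toy(random.random())
        total += radiance(l) * cmf(l) / pdf_toy(l)
    return total / n_samples
</pre>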
<p>Both the cosh and smoothstep wavelength sampling methods show less color noise than the uniform sampling method, with the smoothstep PDF slightly better than the cosh function. It seems the C1 discontinuity of the PDF does not affect rendering very much. A demo can be downloaded <a href="https://drive.google.com/file/d/1RjKkKl8JbrU4V1aVc0BbD6CdjJ0ECzWZ/view?usp=sharing">here</a> to see how it looks in real-time.</p><p><span style="font-size: large;"><b>Conclusion</b></span></p><p>This post described an approximated function to importance sample the visible wavelength using the sum of the color matching functions, which reduces the color noise slightly. The approximated CDF is composed of cubic piecewise functions. The python code used for generating the polynomial coefficients can be found <a href="https://colab.research.google.com/drive/1nwkqQNqtO2SeMDgY2W44N4WQXyvLjrT7?usp=sharing">here</a> (with some unused testing code too, e.g. I tried using 1 linear base function with 2 smaller smoothstep functions added on top, but the result was not much better...). Although the approximated PDF is not C1 continuous, it does not affect the rendering very much. If someone knows more about how the C1 continuity of the PDF affects rendering, please leave a comment below. Thank you. </p><p><span style="font-size: medium;"><b>References</b></span></p><p>[1] <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space#Color_matching_functions">https://en.wikipedia.org/wiki/CIE_1931_color_space#Color_matching_functions</a></p><p>[2] <a href="http://jcgt.org/published/0002/02/01/paper.pdf">Simple Analytic Approximations to the CIE XYZ Color Matching Functions</a><br /></p>[3] <a href="https://stackoverflow.com/questions/13328676/c-solving-cubic-equations">https://stackoverflow.com/questions/13328676/c-solving-cubic-equations</a><br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-91874485747843693132020-09-30T01:47:00.000+08:002020-09-30T01:47:10.072+08:00sRGB/ACEScg Luminance Comparison<p><span style="font-size: large;"><b>Introduction</b></span></p><p>When I was searching for information about rendering in different color spaces, I came across the claim that <a href="https://chrisbrejon.com/cg-cinematography/chapter-1-5-academy-color-encoding-system-aces/">using wider color primaries</a> (e.g. ACEScg instead of sRGB/Rec709) to perform the lighting calculation gives a result closer to spectral rendering. But will this affect the brightness of the bounced light? I decided to find out. (The math is a bit lengthy, please feel free to skip to the result.)<br /></p><p> <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhR4PzrNEos67RkKhhCJYKsrTt3KQXBP_Yo3taHW8EwtcZSb35ALg3TR5Jlro_iN62wB_yWrLwfWxSMBa9V7Wbat-ylcvA-jlg_tH-xZfrcvZKXS-gBakDCqkNuAG6xUS5Z4UjfOs84tYF7/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="1194" data-original-width="2048" height="374" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhR4PzrNEos67RkKhhCJYKsrTt3KQXBP_Yo3taHW8EwtcZSb35ALg3TR5Jlro_iN62wB_yWrLwfWxSMBa9V7Wbat-ylcvA-jlg_tH-xZfrcvZKXS-gBakDCqkNuAG6xUS5Z4UjfOs84tYF7/w640-h374/lum_sRGB.png" width="640" /></a></p><p></p><p><b><span style="font-size: large;">Comparison method</span></b></p><p>To predict the brightness of the rendered image, we can consider the reflected light color after <b><i>n</i></b> bounces. To simplify the problem, we assume all the surfaces are diffuse.
We can derive a formula for the RGB color vector <b><i>c</i></b> after <b><i>n</i></b> light bounces.</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3Q_8kCjRyIGxDWHmD6mCyFcX_gU7gwmTlQoXdekXTW7UaRGH-bDs0XdAJRg-xUUHvQ6ctuZq8xkLTvcJasv7bO-VxlKTVynMNI59do8xFR7ZHnfd7WrMcsmvErbKzt6Oa1byPt3wTcJyt/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="660" data-original-width="1424" height="185" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3Q_8kCjRyIGxDWHmD6mCyFcX_gU7gwmTlQoXdekXTW7UaRGH-bDs0XdAJRg-xUUHvQ6ctuZq8xkLTvcJasv7bO-VxlKTVynMNI59do8xFR7ZHnfd7WrMcsmvErbKzt6Oa1byPt3wTcJyt/w400-h185/diffuse.png" width="400" /></a></div><p></p><p>To calculate lighting in different color spaces, we need to convert the <i>albedo</i> and <i>initial light color</i> to our desired color gamut by multiplying with a matrix <b><i>M</i></b> (for rendering in sRGB/Rec709, <b><i>M</i></b> is the identity matrix).</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3LAuFg2s-6sc-yLGGfMI5OzeMR1A1exHvJ4OzGnCCS3sirxabl_OfLuMXrtjdWtssaiL5cNjYeRenligiKpqguHvHm08X-Z01INpcg0S1LCBfH8b6aM20bNA8UaSNOhwJ1IuGr3qHP-nD/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="232" data-original-width="1669" height="55" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3LAuFg2s-6sc-yLGGfMI5OzeMR1A1exHvJ4OzGnCCS3sirxabl_OfLuMXrtjdWtssaiL5cNjYeRenligiKpqguHvHm08X-Z01INpcg0S1LCBfH8b6aM20bNA8UaSNOhwJ1IuGr3qHP-nD/w400-h55/diffuse_with_gamut_convert.png" width="400" /></a></div><p></p><p>Finally, we can calculate the luminance of the bounced light by computing the dot product between the color vector <b><i>c</i></b> and the luminance vector <b><i>Y</i></b> of the color space (i.e.
<b><i>Y</i></b> is the second row vector of the conversion matrix from a color space to XYZ space, with chromatic adaptation).</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQUKlIJKw64aD5M3RtTbm44q9P-bgTK414zqHbqvTVbkI7E9dcW0QwRr5vcVEk9hU4xv3QNi8XKFh8JycWV2ZCwOIv9kvc3T96Ii9Eh_ZMVZx_iciPOLEINtbOsD5H9_qcEdRMX3x8BMhr/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="707" data-original-width="1375" height="206" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQUKlIJKw64aD5M3RtTbm44q9P-bgTK414zqHbqvTVbkI7E9dcW0QwRr5vcVEk9hU4xv3QNi8XKFh8JycWV2ZCwOIv9kvc3T96Ii9Eh_ZMVZx_iciPOLEINtbOsD5H9_qcEdRMX3x8BMhr/w400-h206/luminance.png" width="400" /></a></div><p>Now we have an equation to compute the brightness of a reflected ray after <b><i>n</i></b> bounces in an arbitrary color space.<br /></p><p></p><br /><p></p><p><span style="font-size: large;"><b>Grey Material Test</b></span></p><p>To further simplify the problem, we assume all the surfaces use the same material:<br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEip-Qyhz2TavxCbedplj9RGWB4P8kaHyVnKzxB7hNDD0h-1xXW9pA8wFosb2gA_DUnuNV5p34ruUSJFjd_rr2x8tRZXnvhovt-kR8KHjXd9q0_j_kLkZbb_RWr8NsB_oBaetV5Y48vN7FVQ/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="404" data-original-width="1220" height="133" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEip-Qyhz2TavxCbedplj9RGWB4P8kaHyVnKzxB7hNDD0h-1xXW9pA8wFosb2gA_DUnuNV5p34ruUSJFjd_rr2x8tRZXnvhovt-kR8KHjXd9q0_j_kLkZbb_RWr8NsB_oBaetV5Y48vN7FVQ/w400-h133/lum_all_d_equal.png" width="400" /></a></div><p></p><p></p><p>Then, assuming all the surfaces are grey in color:<br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCpnJuVNGBQOULRoOQYZwJ5Zr8pFfQkWWwrBbeaXk86lBMZ1PxK0w5z5adZ1k4Wg0OI_8IwjuDeW-vryHfarpnnRNMOsw-7_Nvm80VZSyO3za9ClIrq-PtSWJKu4QD9BIoDFgPZO2zqG6E/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="371" data-original-width="1172" height="126" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCpnJuVNGBQOULRoOQYZwJ5Zr8pFfQkWWwrBbeaXk86lBMZ1PxK0w5z5adZ1k4Wg0OI_8IwjuDeW-vryHfarpnnRNMOsw-7_Nvm80VZSyO3za9ClIrq-PtSWJKu4QD9BIoDFgPZO2zqG6E/w400-h126/lum_grey.png" width="400" /></a></div><p>Now the luminance equation is simpler to understand.</p><p>Substituting the matrix <b><i>M</i></b> and luminance vector <b><i>Y</i></b> for the sRGB color gamut, the equation is very simple:<br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoru9NPu8XCB8DaEKuiD-a2KEyImHsnHZ_tcZcTCPS94LNI4_BSjpi5ne96Mz21glkYLiKeYgQ5GRmLjXifRBTHfgVErqEEPGEYvBzNmNIRsFJfygfjBf2kZs4hMS7aRXgE-CH2zOVMVw0/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="343" data-original-width="1221" height="113" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoru9NPu8XCB8DaEKuiD-a2KEyImHsnHZ_tcZcTCPS94LNI4_BSjpi5ne96Mz21glkYLiKeYgQ5GRmLjXifRBTHfgVErqEEPGEYvBzNmNIRsFJfygfjBf2kZs4hMS7aRXgE-CH2zOVMVw0/w400-h113/lum_grey_sRGB.png" width="400" /></a></div><p></p><p>Then we do the same thing for ACEScg. 
Surprisingly, some of the constants are roughly equal to one, so we can approximate them with one, and the result is then roughly equal to the luminance equation of sRGB.</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCJp5qN5JXTKeBMo_K-Kyq9FM6dNgNzC3WBO6rXNw3SccrXaxhldyS-z4zOePNIg5rZkZWxeX9ZG-Sei3vKpWCDv-DSJfbuiTvNIPsBiPovmqveTokDSDrsQu8ER9ck3gkCQd5NOlEAqko/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="1361" data-original-width="2048" height="426" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCJp5qN5JXTKeBMo_K-Kyq9FM6dNgNzC3WBO6rXNw3SccrXaxhldyS-z4zOePNIg5rZkZWxeX9ZG-Sei3vKpWCDv-DSJfbuiTvNIPsBiPovmqveTokDSDrsQu8ER9ck3gkCQd5NOlEAqko/w640-h426/lum_grey_ACEScg.png" width="640" /></a></div><p>As both equations are roughly equal, the rendered images in sRGB and ACEScg should be similar. Let's try to render images in sRGB and ACEScg to see the result (images are path traced with sRGB and ACEScg primaries, and then displayed in sRGB).
</p><p></p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhR4PzrNEos67RkKhhCJYKsrTt3KQXBP_Yo3taHW8EwtcZSb35ALg3TR5Jlro_iN62wB_yWrLwfWxSMBa9V7Wbat-ylcvA-jlg_tH-xZfrcvZKXS-gBakDCqkNuAG6xUS5Z4UjfOs84tYF7/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1194" data-original-width="2048" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhR4PzrNEos67RkKhhCJYKsrTt3KQXBP_Yo3taHW8EwtcZSb35ALg3TR5Jlro_iN62wB_yWrLwfWxSMBa9V7Wbat-ylcvA-jlg_tH-xZfrcvZKXS-gBakDCqkNuAG6xUS5Z4UjfOs84tYF7/w400-h234/lum_sRGB.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Path traced in sRGB</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTgzMUnTPCn9zvrmrPoiwMiqjpweOSBL50PQym8kBF3kbTi-DfaK1s-9pMcPslhV9D4-_Qdv7I31OOH2zyn3GcmyKvRKk8DTfnnsAqgPnykHTZEJwYo3iuhLChCxBFzNMPsS_P3w-Lz9jE/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1194" data-original-width="2048" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTgzMUnTPCn9zvrmrPoiwMiqjpweOSBL50PQym8kBF3kbTi-DfaK1s-9pMcPslhV9D4-_Qdv7I31OOH2zyn3GcmyKvRKk8DTfnnsAqgPnykHTZEJwYo3iuhLChCxBFzNMPsS_P3w-Lz9jE/w400-h234/lum_ACEScg.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Path traced in ACEScg<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table><br /><div>Both images look very similar! So rendering in different color spaces with a grey material will not change the brightness of the image. At least, the difference is very small after tone mapping to a displayable range.<p><br /></p><p><span style="font-size: large;"><b>Red Material Test</b></span></p><p>Now, let's try using a red material instead of a grey material to see how the luminance changes (where <b><i>k</i></b> is a variable that controls how 'red' the material is):</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipTFU3W5hrLwqqJcpbO0plI11jSoUeRmbgIc35VEByxYUGK2NuVLa2bftn8rwO4qjnADOal0wddxEdJodJOY9ouJbnbduUb_Ya5p-j07xSnkZsssxJQ1JF5HIRJaoi8qRPOxVmDN5fNOpI/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="372" data-original-width="1191" height="125" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipTFU3W5hrLwqqJcpbO0plI11jSoUeRmbgIc35VEByxYUGK2NuVLa2bftn8rwO4qjnADOal0wddxEdJodJOY9ouJbnbduUb_Ya5p-j07xSnkZsssxJQ1JF5HIRJaoi8qRPOxVmDN5fNOpI/w400-h125/lum_red.png" width="400" /></a></div><p></p><p>But the equation is still a bit complex, so we further assume the initial light color is white.<br /></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIBXtuSvSd6KiMjeAfrjSnUNmGloyuaHzVxWd2KcS-dvT-DiWt_33C3gbq-NoGHDRscI-l3PVZF59cAO6cMCriocHR3-hn7E3QBhDF2cTDUVl_xzaVjCFql68pgNR-jTKZxFUVr_MX_2fB/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="376" data-original-width="1110" height="135" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIBXtuSvSd6KiMjeAfrjSnUNmGloyuaHzVxWd2KcS-dvT-DiWt_33C3gbq-NoGHDRscI-l3PVZF59cAO6cMCriocHR3-hn7E3QBhDF2cTDUVl_xzaVjCFql68pgNR-jTKZxFUVr_MX_2fB/w400-h135/lum_red_white_light.png" width="400" /></a></div>Then we perform the same steps as in the last section, substituting <b><i>M</i></b> and <b><i>Y</i></b> into the luminance equation.
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOvNK25b72YZ2jT2h8D_seEZH_ODIcXFFEDRidEFqQIsnUVmoHeSVKxHxC5k6Rc9OQKy3-cmtYiJtyV5FLZIHCQen6QrpiZPbMzHE8JFN6l3QNM1-fxMSrhfkzqLD0tLiIV_8hsE57S9HQ/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="366" data-original-width="980" height="120" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOvNK25b72YZ2jT2h8D_seEZH_ODIcXFFEDRidEFqQIsnUVmoHeSVKxHxC5k6Rc9OQKy3-cmtYiJtyV5FLZIHCQen6QrpiZPbMzHE8JFN6l3QNM1-fxMSrhfkzqLD0tLiIV_8hsE57S9HQ/w320-h120/lum_red_sRGB.png" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">sRGB luminance equation</td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0aQ-stKeSKwDicZ1s0lC08918ZzvAmG7pagpJmP1LQYTJZX2PSjUOaM0CnYWdj8URykxQcdP9Yb0Ew8518Yc1FPyQXwEkKHfDye5L_zdLjth2Dqxct8RnXAD78NO0PNv963htDino9mS2/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="299" data-original-width="1398" height="85" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0aQ-stKeSKwDicZ1s0lC08918ZzvAmG7pagpJmP1LQYTJZX2PSjUOaM0CnYWdj8URykxQcdP9Yb0Ew8518Yc1FPyQXwEkKHfDye5L_zdLjth2Dqxct8RnXAD78NO0PNv963htDino9mS2/w400-h85/lum_red_ACEScg.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;"><br /><br />ACEScg luminance equation</td></tr></tbody></table>
</td>
</tr>
</tbody></table>
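<p>These equations can also be checked numerically. Below is a minimal Python sketch of the whole comparison (my own cross-check, not code from the demo): it uses the standard sRGB and ACEScg/AP1 RGB-to-XYZ matrices, skips the chromatic adaptation step for brevity, and parameterizes the red albedo as (0.5+<b><i>k</i></b>, 0.5-<b><i>k</i></b>, 0.5-<b><i>k</i></b>), which is an assumption for illustration rather than the exact form in the equations above:</p>
<pre>import numpy as np

# Standard RGB -> XYZ matrices: sRGB/Rec709 (D65) and ACEScg/AP1 (D60).
SRGB_TO_XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                        [0.2126729, 0.7151522, 0.0721750],
                        [0.0193339, 0.1191920, 0.9503041]])
AP1_TO_XYZ = np.array([[ 0.6624542, 0.1340042, 0.1561877],
                       [ 0.2722287, 0.6740818, 0.0536895],
                       [-0.0055746, 0.0040607, 1.0103391]])

def bounce_luminance(albedo_srgb, light_srgb, n, rgb_to_xyz):
    # Convert albedo and light into the working space (the matrix M in the
    # equations above), multiply element-wise n times, then dot with the
    # luminance vector Y, i.e. the 2nd row of the space's RGB -> XYZ matrix.
    to_ws = np.linalg.inv(rgb_to_xyz) @ SRGB_TO_XYZ
    a, l = to_ws @ albedo_srgb, to_ws @ light_srgb
    return rgb_to_xyz[1] @ (a**n * l)

k = 0.4                                         # how 'red' the material is
albedo = np.array([0.5 + k, 0.5 - k, 0.5 - k])  # assumed parameterization
light = np.array([1.0, 1.0, 1.0])               # white initial light
for n in (3, 5):
    print(n, bounce_luminance(albedo, light, n, SRGB_TO_XYZ),
             bounce_luminance(albedo, light, n, AP1_TO_XYZ))
</pre>
<p>For a saturated red (<b><i>k</i></b> close to 0.5), the sRGB-space luminance comes out larger, consistent with the graphs below.</p>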
<p></p><p>Unfortunately, both equations are a bit too complex to compare symbolically, having the 2 variables <b><i>k</i></b> and <b><i>n</i></b>... Maybe we can plot some graphs to see how those variables affect the luminance, with the number of light bounces = 3 and 5 (i.e. <i><b>n</b></i>=3 and <b><i>n</i></b>=5, skipping the <b><i>N dot L</i></b> part because both equations have such a term). From the graphs below: when <b><i>k</i></b> increases (i.e. the red color gets more saturated, with RGB values closer to (1, 0, 0)), the luminance difference increases, hence sRGB will have a larger luminance value than ACEScg.</p><p></p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAeYOfmBtlZbczcu6c1wo1M5IrKr2DU_xzaBSu2uSXoW13HHx6Qpqq3IOGHpXHsOibG37Tn4o2t06dQCYNrxDkyjq95HR7_f-QYLF2FSwm_V415tB9DatgzPuWWc5-JnNA4pVVhQsKRWNR/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="940" data-original-width="3185" height="189" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAeYOfmBtlZbczcu6c1wo1M5IrKr2DU_xzaBSu2uSXoW13HHx6Qpqq3IOGHpXHsOibG37Tn4o2t06dQCYNrxDkyjq95HR7_f-QYLF2FSwm_V415tB9DatgzPuWWc5-JnNA4pVVhQsKRWNR/w640-h189/red_plot.png" width="640" /></a></div><br /><p></p><p>Then let's compare the images rendered in sRGB and ACEScg:</p><p></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifXcXJnVtHA_1ARyvak3Whaktu26NAK31j5NwBmbbOpu1vxVdZ3K7MKrsOr4Z40-hH3Hw-XYRc8dOiAP5Rik-bHdhocLeoRYRYRjFkLyg3s5f-ruv8x15AiwX7rmFkPRA45-qF_-BR_-wo/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1194" data-original-width="2048" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifXcXJnVtHA_1ARyvak3Whaktu26NAK31j5NwBmbbOpu1vxVdZ3K7MKrsOr4Z40-hH3Hw-XYRc8dOiAP5Rik-bHdhocLeoRYRYRjFkLyg3s5f-ruv8x15AiwX7rmFkPRA45-qF_-BR_-wo/w400-h234/red_sRGB.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Path traced in sRGB<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkAdkQazqlXoBGXDMNcFF4nBwIFn6zubw6PXlDlNZAR2GIe5MDlxWEL9RIAJ2vxRRPycUHLQhQIRokyGCPp3nTvga9IqeGFp86UShebgXgi9woiMlh0lbdTbEFJP3xQAvuQUp5KKqQYVri/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1194" data-original-width="2048" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkAdkQazqlXoBGXDMNcFF4nBwIFn6zubw6PXlDlNZAR2GIe5MDlxWEL9RIAJ2vxRRPycUHLQhQIRokyGCPp3nTvga9IqeGFp86UShebgXgi9woiMlh0lbdTbEFJP3xQAvuQUp5KKqQYVri/w400-h234/red_ACEScg.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Path traced in ACEScg<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table><p></p><p>The indirectly lit area looks much brighter when rendered in sRGB. This makes sense because, for any red color, its red channel value will be closer to one (while the green/blue values will be closer to 0) when represented in sRGB than when represented in ACEScg. After several multiplications, the reflected light value should be larger when the computation is done in sRGB.<br /></p><p> </p><p><span style="font-size: large;"><b>RGB Material Test</b></span></p><p>How about using differently colored materials this time? Assume 1/3 of the light bounces on surfaces with a red material, 1/3 on a green material, and 1/3 on a blue material. </p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_h2U9I2OC8E7uFOtE3DSJp4gVTNGI0QQ4nRyXNvlCZcKQdSvT9AhOw8XRNs6BMlTvly7DFf7I66Oj1uCMXRHBNU1t-fkkfYzhl0ptOmgq4xaKPojlGKC-JA-KdYTD-y-HMvoXUpYMfXOH/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="1144" data-original-width="1906" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_h2U9I2OC8E7uFOtE3DSJp4gVTNGI0QQ4nRyXNvlCZcKQdSvT9AhOw8XRNs6BMlTvly7DFf7I66Oj1uCMXRHBNU1t-fkkfYzhl0ptOmgq4xaKPojlGKC-JA-KdYTD-y-HMvoXUpYMfXOH/w400-h240/lum_rgb.png" width="400" /></a></div><p></p><p>As in the previous 2 sections, substituting <b><i>M</i></b> and <b><i>Y</i></b>, the luminance equation becomes:</p><p></p><p></p>
<table>
<tbody><tr>
<td>
<div style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqXyjtEme1U1VNObkrh05m0ugaCaEa1EuNsfDGURzB33kmjlBawRBYZcIGJAmjdFFu0PyTB2D1UiZ-_u1Dy2LO-S0oY3R2W09bGdG0hWM2OzVBYOH-Xw4FXWWi3yrKhLvRphuhswzfj_jt/"><img alt="" data-original-height="350" data-original-width="877" height="128" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqXyjtEme1U1VNObkrh05m0ugaCaEa1EuNsfDGURzB33kmjlBawRBYZcIGJAmjdFFu0PyTB2D1UiZ-_u1Dy2LO-S0oY3R2W09bGdG0hWM2OzVBYOH-Xw4FXWWi3yrKhLvRphuhswzfj_jt/" width="320" /></a></div>
</td>
<td>
<div style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxFnJWE9NJ-6KN5pU2OKFMQIW-rOcLcCyTL31s4QZkaWE7KENZDXRIUWUUcYVygUhRmfqGggi1BPAVyBA7kMeFGWc8Qud_ESs2BVJtboVIXkAydOMelcLOOAcNaMa-HZB-0lQZtMUF1iIs/"><img alt="" data-original-height="380" data-original-width="2546" height="96" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxFnJWE9NJ-6KN5pU2OKFMQIW-rOcLcCyTL31s4QZkaWE7KENZDXRIUWUUcYVygUhRmfqGggi1BPAVyBA7kMeFGWc8Qud_ESs2BVJtboVIXkAydOMelcLOOAcNaMa-HZB-0lQZtMUF1iIs/w640-h96/lum_rgb_ACEScg.png" width="640" /></a></div>
</td>
</tr>
<tr>
<td align="center">
sRGB luminance equation
</td>
<td align="center">
ACEScg luminance equation
</td>
</tr>
</tbody></table> </div><div>And then plotting graphs to see how the luminance varies with <b><i>k</i></b> and <b><i>n</i></b>:</div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVYLqog8lqChSuOO7aUEBsxtzStHzl4r0OoGEvU_xeC3j0vDW9X8kb29926FCGhHmj6GZaYCeHRqWpzcqfzVqc7WPLL1iRVGMZAVfSmls09crS1H0S0m63Gv94qEWcDggaqnleOTA_RBiR/" style="margin-left: 1em; margin-right: 1em;"><img alt="" data-original-height="913" data-original-width="3115" height="188" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVYLqog8lqChSuOO7aUEBsxtzStHzl4r0OoGEvU_xeC3j0vDW9X8kb29926FCGhHmj6GZaYCeHRqWpzcqfzVqc7WPLL1iRVGMZAVfSmls09crS1H0S0m63Gv94qEWcDggaqnleOTA_RBiR/w640-h188/rgb_plot.png" width="640" /></a></div><p></p><p>The result is different this time. The sRGB luminance is smaller than the ACEScg luminance, and the difference increases when both <i><b>k</b></i> and <i><b>n</b></i> increase. So the bounced light will be darker when rendered in sRGB.<br /></p><p>Let's try rendering some images to see whether this is true. Although we cannot force the rays to bounce off exactly 1/3 red/green/blue material, I roughly assigned 1/3 of the materials in the scene to red/green/blue.<br /></p>
<table>
<tbody><tr>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipI_gPmJ2EfB8m67uOSfSHRDne9c4NunWOPMtFH3jdD3H-4RmuxMNRJwa0oEzjpCjlqAz53M8QpZ-Bwd0ibMo9OOB3QvnJ1-CYCyNdVOKU2cRAM3o0fKSC4C0ZK2zJGKLemFjDTG32USF2/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1194" data-original-width="2048" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipI_gPmJ2EfB8m67uOSfSHRDne9c4NunWOPMtFH3jdD3H-4RmuxMNRJwa0oEzjpCjlqAz53M8QpZ-Bwd0ibMo9OOB3QvnJ1-CYCyNdVOKU2cRAM3o0fKSC4C0ZK2zJGKLemFjDTG32USF2/w400-h234/color_sRGB.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Path traced in sRGB<br /></td></tr></tbody></table>
</td>
<td>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJnUEwQA28RrRvnYhFOjL1q2eY5cL6fsh1E41qtL67sdbRIYFTFBp3BnMCWkrj6x9ossoHJWa3Y6C-kI039VnAZpnFz_wlix30Hj2S0MAY5B97JaxkiS9Zi7OLsneYsv4HfOCuc7fj8ixr/" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="1194" data-original-width="2048" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJnUEwQA28RrRvnYhFOjL1q2eY5cL6fsh1E41qtL67sdbRIYFTFBp3BnMCWkrj6x9ossoHJWa3Y6C-kI039VnAZpnFz_wlix30Hj2S0MAY5B97JaxkiS9Zi7OLsneYsv4HfOCuc7fj8ixr/w400-h234/color_ACEScg.png" width="400" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Path traced in ACEScg<br /></td></tr></tbody></table>
</td>
</tr>
</tbody></table><br /><div>From the screenshots above, the indirectly lit red material looks darker when rendered in sRGB (especially the curtains on the ground floor), while the differences for the blue and green materials are small. We can reason about the result as in the previous section: for a given red color, when represented in sRGB its red channel value is closer to one, but its blue and green channel values are closer to 0 than when represented in ACEScg (the same holds for blue and green materials). So after several multiplications with differently colored materials, the RGB values in sRGB may become closer to 0 because the different material colors cancel each other out (e.g. when light bounces on red and green albedo surfaces (1, 0, 0) and (0, 1, 0) in sRGB, the reflected light will be zero, while with the same colors represented in ACEScg, the light will not be "zeroed out"), resulting in a darker image.<br /><p><br /></p><p><b><span style="font-size: large;">Conclusion</span></b></p><p>After testing with different assumptions, the brightness of images rendered in sRGB can be darker, roughly equal to, or brighter than images rendered in ACEScg. The brightness difference depends on the materials used in the scene. If the scene uses grey materials only, the brightness will be equal. If the materials have similar colors (e.g. all red materials), the sRGB image will be brighter. If the scene has more color variation, the sRGB image may become darker. It turns out this conclusion could have been reached without such lengthy math. We can think of the same color value represented in sRGB and ACEScg space: is the RGB value closer to 0 or 1 when represented in that color space? Will the RGB values 'cancel' each other out when performing lighting calculations in that color space? I was too slow to figure out this simple answer early on and instead worked through such lengthy math... >.<</p><p><br /></p><p><b>Reference</b></p><p><span style="font-size: x-small;">[1] <a href="https://chrisbrejon.com/cg-cinematography/chapter-1-5-academy-color-encoding-system-aces/">https://chrisbrejon.com/cg-cinematography/chapter-1-5-academy-color-encoding-system-aces/</a><br /></span></p><p><br /></p></div></div>Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-47040588541338864462020-07-11T17:51:00.000+08:002020-07-11T17:51:41.586+08:00Spectral Path Tracer<span style="font-size: large;"><b>Introduction</b></span><br />
Before starting this post, I would like to talk a bit about my homeland, Hong Kong. The Chinese government enacted a new <a href="https://www.bbc.com/news/world-asia-china-52765838">National Security Law</a>, bypassing our local legislative council. We could only read the <a href="https://hongkongfp.com/2020/07/01/in-full-english-translation-of-the-hong-kong-national-security-law/">full text of this law</a> after it was enacted (<span style="font-size: x-small;">with the official English version published 3 days after that</span>). This law destroys our legal system completely: the government can appoint judges they like (<span style="font-size: xx-small;">Article 44</span>), the jury can be removed from a trial (<span style="font-size: xx-small;">Article 46</span>), and trials can be held without media and public presence (<span style="font-size: xx-small;">Article 41</span>). This law is so vague that the government can prosecute anyone they don't like. People were arrested for possessing <a href="https://time.com/5862683/hong-kong-revolution-protest-chant-security-law/">anti-government stickers</a>. We don't even have the right to hate the government (<span style="font-size: xx-small;">Article 29.5</span>). If I promote "Boycott Chinese Products", I may have broken this law already... Also, the personnel of the security office do not need to obey HK law (<span style="font-size: xx-small;">Article 60</span>). This law even applies to foreigners outside HK (<span style="font-size: xx-small;">Article 38</span>). Our voting rights are also deteriorating: more pro-democracy candidates can be disqualified by this law in the upcoming election (<span style="font-size: xx-small;">Article 35</span>)... So, if you are living in a democratic country, please cast a vote if you can.<br />
<br />
Back to the topic of the spectral path tracer. Path tracing in the spectral domain has been added to my toy renderer (alongside tracing in sRGB / ACEScg space). A spectral path tracer traces rays with actual wavelengths of light instead of RGB channels. The result is physically correct, and some effects can only be calculated by spectral rendering (e.g. dispersion, iridescence). Although my toy spectral path tracer does not support such materials, I would like to investigate how spectral rendering affects the bounced light color compared to images rendered in RGB color spaces. The demo can be downloaded <b><a href="https://drive.google.com/file/d/1WFzAl6BPMsgFHUV3_XOC2qLhmKJrZeRv/view?usp=sharing">here</a></b>.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjO1k1bPYu_9KkNUsi-fEo8mUA4dF1nFcVo5BqkiPazEEpePte66V49NlR3aHG1H3QVq2_a-lFRPjxdxI88wVlLsejeUbmhyh8CC6pp1wgkAlTiWXK4VE_H7pwEh6t6y7oGPUWMJPVzMwnN/s1600/spectral_path_trace.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="371" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjO1k1bPYu_9KkNUsi-fEo8mUA4dF1nFcVo5BqkiPazEEpePte66V49NlR3aHG1H3QVq2_a-lFRPjxdxI88wVlLsejeUbmhyh8CC6pp1wgkAlTiWXK4VE_H7pwEh6t6y7oGPUWMJPVzMwnN/s640/spectral_path_trace.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Spectral rendered image</td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Render Loop Modification</span></b><br />
Referencing my previous <a href="https://simonstechblog.blogspot.com/2020/03/dxr-path-tracer.html">DXR Path Tracer post</a>, only a few modifications are needed to support spectral path tracing:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVCVvWMyDF_6-t7PhC6QOLnl3nnENO9e4zwCX2wDWkV_sTKttVdLs_yoSK2fhJszvJRta1YrMofgteOy4iSBUjvTueq3ho2KeWF0LabwaPgpTkgo9SmvXhWxOiJcczvO6qoR2E30NgXu0j/s1600/flowchart.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="761" data-original-width="1600" height="304" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVCVvWMyDF_6-t7PhC6QOLnl3nnENO9e4zwCX2wDWkV_sTKttVdLs_yoSK2fhJszvJRta1YrMofgteOy4iSBUjvTueq3ho2KeWF0LabwaPgpTkgo9SmvXhWxOiJcczvO6qoR2E30NgXu0j/s640/flowchart.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">RGB path tracer render loop in previous post</td></tr>
</tbody></table>
When starting to trace new rays, a wavelength is randomly picked. My first implementation uses <a href="https://cgg.mff.cuni.cz/~wilkie/Website/EGSR_14_files/WNDWH14HWSS.pdf">hero wavelength</a> sampling with 3 wavelength samples per ray. The number 3 is chosen because it makes it convenient to replace the existing code where rays are traced with RGB channels. So the "Light Path Texture" from the previous post is modified to accumulate the energy at those 3 wavelengths during ray traversal. Finally, when the ray is terminated, the resulting energy in the "Light Path Texture" is integrated against the <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space#Color_matching_functions">CIE XYZ color matching functions</a> and stored in the "Lighting Sample Texture" in XYZ space, which is later converted to the display device space (e.g. sRGB/AdobeRGB/Rec2020) as described in the previous post.<br />
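To make this concrete, here is a minimal Python sketch of the spectral bookkeeping described above (an illustration only, not the demo's actual GPU code; the color matching functions use the multi-lobe piecewise-Gaussian fits from reference [5], so the constants below come from that paper):<br />
<pre>import math, random

def hero_wavelengths(n=3, lo=380.0, hi=780.0):
    # Hero wavelength sampling: 1 random wavelength, plus n-1 rotations
    # evenly separated within the visible range.
    span = hi - lo
    hero = lo + random.random() * span
    return [lo + ((hero - lo) + i * span / n) % span for i in range(n)]

def _lobe(x, alpha, mu, s1, s2):
    # Piecewise Gaussian with a different width on each side of the peak.
    s = s1 if x < mu else s2
    return alpha * math.exp(-0.5 * ((x - mu) / s) ** 2)

def cie_xyz(lam):
    # Multi-lobe fits of the CIE 1931 standard observer (reference [5]).
    x = (_lobe(lam, 1.056, 599.8, 37.9, 31.0) + _lobe(lam, 0.362, 442.0, 16.0, 26.7)
         + _lobe(lam, -0.065, 501.1, 20.4, 26.2))
    y = _lobe(lam, 0.821, 568.8, 46.9, 40.5) + _lobe(lam, 0.286, 530.9, 16.3, 31.1)
    z = _lobe(lam, 1.217, 437.0, 11.8, 36.0) + _lobe(lam, 0.681, 459.0, 26.0, 13.8)
    return x, y, z

def splat_xyz(energy, lam, pdf):
    # Monte Carlo contribution of one wavelength sample to the XYZ pixel.
    x, y, z = cie_xyz(lam)
    return energy * x / pdf, energy * y / pdf, energy * z / pdf
</pre>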
<br />
<span style="font-size: large;"><b>Spectral Up Sampling Texture</b></span><br />
One of the problems in spectral rendering is converting textures from color to a <a href="https://en.wikipedia.org/wiki/Spectral_power_distribution">spectral power distribution (SPD)</a>; this process is called spectral up-sampling. Luckily, there are many papers about it. The technique called <a href="https://graphics.geometrian.com/research/spectral-primaries.html">"Spectral Primary Decomposition for Rendering with sRGB Reflectance"</a> is used in the demo to up-sample textures. I chose this method because of its simplicity. It reconstructs the spectrum using a linear combination of the texture color with 3 pre-computed spectral basis functions:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzV1f0SrIzL7riNYXwQad2_8XaikdqgKJndO6JrC9m2L3dCpgigvciu-rZZCZA4awWVN43XTUoD8NMhletKGYQIoTMZnOgmOug1zk0FKe7EAFwxgPgephO22tO8lo2N6nzITpNeZGt2o3Y/s1600/spec_pri.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="56" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzV1f0SrIzL7riNYXwQad2_8XaikdqgKJndO6JrC9m2L3dCpgigvciu-rZZCZA4awWVN43XTUoD8NMhletKGYQIoTMZnOgmOug1zk0FKe7EAFwxgPgephO22tO8lo2N6nzITpNeZGt2o3Y/s400/spec_pri.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"></td></tr>
</tbody></table>
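In code, this reconstruction is just a per-wavelength linear combination; a tiny sketch, where rho_r/rho_g/rho_b stand for the paper's precomputed basis functions (in practice these would be tabulated lookups, not included here):<br />
<pre>def upsample_reflectance(rgb, lam, rho_r, rho_g, rho_b):
    # S(lam) = r*rho_r(lam) + g*rho_g(lam) + b*rho_b(lam), where rgb is the
    # linear sRGB reflectance fetched from the texture.
    r, g, b = rgb
    return r * rho_r(lam) + g * rho_g(lam) + b * rho_b(lam)
</pre>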
But one thing that bothered me is that the meaning of the texture color is a bit different from what the spectral up-sampling method assumes. In PBR rendering, the texture color refers to <a href="https://en.wikipedia.org/wiki/Albedo">albedo</a> (i.e. the ratio of radiosity to the irradiance received by a surface), which is independent of the CIE XYZ observer, while the up-sampling method minimizes a least squares problem for the texture color viewed under illuminant D65 with the CIE standard observer. Maybe the RGB albedo values are computed with an SPD and the XYZ observer functions? I have no idea and may investigate this in the future.<br />
<br />
<span style="font-size: large;"><b>Spectral Up Sampling Light Color and Intensity</b></span><br />
Besides spectral up-sampling the textures, lights also need to be up-sampled. Because the light color can be specified in a wide gamut in the demo, the up-sampling method used in the above section is not enough. The method from <a href="https://rgl.s3.eu-central-1.amazonaws.com/media/papers/Jakob2019Spectral_3.pdf">"A Low-Dimensional Function Space for Efficient Spectral Upsampling"</a> is used to up-sample the light color. This method computes 3 coefficients from the light color (i.e. from RGB to c<span style="font-size: xx-small;">0</span>, c<span style="font-size: xx-small;">1</span>, c<span style="font-size: xx-small;">2</span>), and then the spectral power distribution, <i>f</i>(λ), can be computed as below:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7okprlXYov22uAaNNbHhXmvMVDWejDPGjukNwMiTe5sxyLU77RAJHZppnmjKgT17bVRu3NdVDfecuQHhKwrCv2dIimjIW_BTc0HiaI6pQ_-okAF_F2vBmztbFN2GpUUj6H-iXI8J1s-QY/s1600/light_upsample_func.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="60" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7okprlXYov22uAaNNbHhXmvMVDWejDPGjukNwMiTe5sxyLU77RAJHZppnmjKgT17bVRu3NdVDfecuQHhKwrCv2dIimjIW_BTc0HiaI6pQ_-okAF_F2vBmztbFN2GpUUj6H-iXI8J1s-QY/s320/light_upsample_func.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"></td></tr>
</tbody></table>
Since light is specified by color and intensity, after calculating the SPD coefficients we need to scale the SPD curve so that integrating the scaled SPD against the CIE standard observer ȳ(λ) curve equals the specified luminance intensity:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoTha8zbHfSUO5OaRTOaT_H_fjT3XG1egXdJSM_8zgsbATZx-OsJkMeqIqkTFDaqCzCZNCwGjXvorDsgf-7AqMAgGRx2-5lCFbnaBuz0PpWY9BKoKDOKHOC6tjuHrOC32TvlokJBuecTTQ/s1600/scale_luma.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="72" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoTha8zbHfSUO5OaRTOaT_H_fjT3XG1egXdJSM_8zgsbATZx-OsJkMeqIqkTFDaqCzCZNCwGjXvorDsgf-7AqMAgGRx2-5lCFbnaBuz0PpWY9BKoKDOKHOC6tjuHrOC32TvlokJBuecTTQ/s400/scale_luma.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"></td></tr>
</tbody></table>
The scaling factor K is calculated numerically using the <a href="https://en.wikipedia.org/wiki/Trapezoidal_rule">Trapezoidal rule</a> with a 1<span style="font-size: x-small;">nm</span> wavelength interval, and the ȳ(λ) curve is approximated with the <a href="http://jcgt.org/published/0002/02/01/paper.pdf">multi-lobe approximation in "Simple Analytic Approximations to the CIE XYZ Color Matching Functions"</a>. So the light spectral power distribution is specified by 4 floating point numbers: 3 coefficients + 1 intensity scale.<br />
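A small Python sketch of this scaling (my own illustration; the sigmoid spectrum below is the <i>f</i>(λ) form from the paper, and the wavelength units must match those used when fitting the coefficients):<br />
<pre>import math

def sigmoid_spectrum(lam, c0, c1, c2):
    # f(lam) = S(c0*lam^2 + c1*lam + c2), with S(x) = 1/2 + x/(2*sqrt(1+x^2))
    x = (c0 * lam + c1) * lam + c2
    return 0.5 + x / (2.0 * math.sqrt(1.0 + x * x))

def ybar(lam):
    # Multi-lobe fit of the CIE ybar(lam) observer curve (same constants as
    # the sketch in the render loop section).
    def lobe(alpha, mu, s1, s2):
        s = s1 if lam < mu else s2
        return alpha * math.exp(-0.5 * ((lam - mu) / s) ** 2)
    return lobe(0.821, 568.8, 46.9, 40.5) + lobe(0.286, 530.9, 16.3, 31.1)

def intensity_scale(c0, c1, c2, target_luminance, lo=380, hi=780):
    # K = target / integral(f * ybar), trapezoidal rule with 1nm steps.
    integral = sum(0.5 * (sigmoid_spectrum(l, c0, c1, c2) * ybar(l) +
                          sigmoid_spectrum(l + 1, c0, c1, c2) * ybar(l + 1))
                   for l in range(lo, hi))
    return target_luminance / integral
</pre>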
<br />
In the demo, the original light intensity of the RGB path tracer is modified so that it better matches the intensity of the spectral rendered image. Before the modification, the RGB lighting was done by simply multiplying the light color by the light intensity. Now this value is also divided by the luminance of the color (but this loses some control in the color picker UI...).<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHjAKKsjXAXdKGa5jLUW_ufoGVvrtQxEDPfB5lxfVcUfxbQCMdC3WE-qg-C1Cx8OfbI9DJ6vDBmjzlYVsuaVZY9OBL1dsajZLd6AS1Fl3OAoL0VKXzj2qGYzvmsZDcFVF_3RJ3cMqnNB3w/s1600/intensity_ACES_without_luma.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="115" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHjAKKsjXAXdKGa5jLUW_ufoGVvrtQxEDPfB5lxfVcUfxbQCMdC3WE-qg-C1Cx8OfbI9DJ6vDBmjzlYVsuaVZY9OBL1dsajZLd6AS1Fl3OAoL0VKXzj2qGYzvmsZDcFVF_3RJ3cMqnNB3w/s200/intensity_ACES_without_luma.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">RGB light color multiply with<br />
intensity only</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfwOZsvnbJ_-LnLXZjDoR7w9YoY5lnZfq3nJTG5qwlsG1Rkc9lBQ98P_3zj4LYvehh6_SrrBqZfRIwIYdqIoBdib4zObBncJZQnFl3L2oNCIUxFYkUSpxPtIdfD4JAudjF08mZc5aIRDI4/s1600/intensity_ACES_with_luma.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="115" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfwOZsvnbJ_-LnLXZjDoR7w9YoY5lnZfq3nJTG5qwlsG1Rkc9lBQ98P_3zj4LYvehh6_SrrBqZfRIwIYdqIoBdib4zObBncJZQnFl3L2oNCIUxFYkUSpxPtIdfD4JAudjF08mZc5aIRDI4/s200/intensity_ACES_with_luma.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">RGB light color multiply with intensity<br />
divided by color luminance</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiko_M8p56U1oALSn0m8GBQqUcACGn1VubfhZOScF45mba4nkSF28nwZCSYhpmpU-1WukiMGrvejgrqChEPDB11tLeDr7Ck7sMM66eWoesOGYXKdO1X2dGr0ZeNuj5u5XDw7Fo1CINSr3I/s1600/intensity_spectral.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="115" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiko_M8p56U1oALSn0m8GBQqUcACGn1VubfhZOScF45mba4nkSF28nwZCSYhpmpU-1WukiMGrvejgrqChEPDB11tLeDr7Ck7sMM66eWoesOGYXKdO1X2dGr0ZeNuj5u5XDw7Fo1CINSr3I/s200/intensity_spectral.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Spectral light with scaled SPD curve<br />
<br /></td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
In addition to the luminance scale, we also need to chromatically adapt the light color from illuminant E to illuminant D65/D60 before computing the 3 SPD coefficients, because the coefficients are fitted using illuminant E. Without this, the image will have a reddish appearance (a small adaptation sketch follows the comparison images below). <br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0jzqhKSkeY6011Ue12K6jijRVgeePze2f6hAEU4j_qsXF6aF9VarT2M2nS4xqSFMO6FDki_e776VY1eSJTVFJ7Fn3yjfdqZFWEVfXPlEGlrcLbPFCfBGzjt44C0-_NV4_lb1uCnrHwTaz/s1600/spectral_light_without_CAT.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="186" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0jzqhKSkeY6011Ue12K6jijRVgeePze2f6hAEU4j_qsXF6aF9VarT2M2nS4xqSFMO6FDki_e776VY1eSJTVFJ7Fn3yjfdqZFWEVfXPlEGlrcLbPFCfBGzjt44C0-_NV4_lb1uCnrHwTaz/s320/spectral_light_without_CAT.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Computing light SPD coefficients without chromatic adaption</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiko_M8p56U1oALSn0m8GBQqUcACGn1VubfhZOScF45mba4nkSF28nwZCSYhpmpU-1WukiMGrvejgrqChEPDB11tLeDr7Ck7sMM66eWoesOGYXKdO1X2dGr0ZeNuj5u5XDw7Fo1CINSr3I/s1600/intensity_spectral.png" style="margin-left: auto; margin-right: auto;"><img border="0" height="184" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiko_M8p56U1oALSn0m8GBQqUcACGn1VubfhZOScF45mba4nkSF28nwZCSYhpmpU-1WukiMGrvejgrqChEPDB11tLeDr7Ck7sMM66eWoesOGYXKdO1X2dGr0ZeNuj5u5XDw7Fo1CINSr3I/s320/intensity_spectral.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Computing light SPD coefficients with chromatic adaption</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
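For reference, a minimal sketch of such an adaptation using the Bradford transform (an assumption on my side: the demo may use a different chromatic adaptation transform, and the D60 case would substitute its own white point):<br />
<pre>import numpy as np

# Bradford cone response matrix, and the XYZ white points involved.
BRADFORD = np.array([[ 0.8951,  0.2664, -0.1614],
                     [-0.7502,  1.7135,  0.0367],
                     [ 0.0389, -0.0685,  1.0296]])
WHITE_E   = np.array([1.0, 1.0, 1.0])
WHITE_D65 = np.array([0.95047, 1.0, 1.08883])

def adaptation_matrix(src_white, dst_white):
    # Scale the cone responses by the ratio of destination to source white.
    rho_s, rho_d = BRADFORD @ src_white, BRADFORD @ dst_white
    return np.linalg.inv(BRADFORD) @ np.diag(rho_d / rho_s) @ BRADFORD

# Adapt an XYZ light color between illuminant E and D65 before fitting
# the c0, c1, c2 coefficients (example color only):
xyz_adapted = adaptation_matrix(WHITE_E, WHITE_D65) @ np.array([0.5, 0.5, 0.5])
</pre>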
<br />
<span style="font-size: large;"><b>Importance Sampling Wavelength</b></span><br />
As mentioned at the start of the post, the wavelengths are sampled using hero wavelength sampling, which randomly picks 1 wavelength within the visible spectrum (i.e. 380-780nm in the demo); 2 additional samples are then picked, evenly separated within the visible wavelength range. With this approach, there is high variance in color. Sometimes, with 100 samples per pixel, the color converges to the final color, but more often it requires > 1000 samples per pixel to converge. It really depends on luck...<br />
<table>
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7n3JTNDFy3xxfhVjnVXEKlAJPUwkcF8c9LV-DIjzUUJ9LXSIyJR6Q4FUfi-N-xceAZwlezWqxwvpzPqVOsM9u10N7RDfjwEWKcycDIGnnQ9cOg8y9yUxwhcmPM1XvlBTrw9fD3mrdrYnO/s1600/importance_sample_hero_1.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7n3JTNDFy3xxfhVjnVXEKlAJPUwkcF8c9LV-DIjzUUJ9LXSIyJR6Q4FUfi-N-xceAZwlezWqxwvpzPqVOsM9u10N7RDfjwEWKcycDIGnnQ9cOg8y9yUxwhcmPM1XvlBTrw9fD3mrdrYnO/s200/importance_sample_hero_1.png" width="200" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUJhy7MF8KXTsFWi5W7764IaRRMxc3l6tgGtrPg54FnALWbDyQHg8JIpfBAwS2e62vD1eUCs85xWPk5q6Inmh7PLqhkf08hP_zK0mE-gZaQ9maqoUgGmCWdDSo5DCbh-2Un8JaWyt2Km7y/s1600/importance_sample_hero_2.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUJhy7MF8KXTsFWi5W7764IaRRMxc3l6tgGtrPg54FnALWbDyQHg8JIpfBAwS2e62vD1eUCs85xWPk5q6Inmh7PLqhkf08hP_zK0mE-gZaQ9maqoUgGmCWdDSo5DCbh-2Un8JaWyt2Km7y/s200/importance_sample_hero_2.png" width="200" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhv1sIstNDTfaG3S6t4qCmknrS5iHDDLSQYo22Tu3r8LIHet6DKmLktC5zIg1FHYGBIvgiMErTetfc828IpP4E-Edf0lnbZwwq7DKXowgG9PiQAmNeI04CQFDDPoGRqj9I1RMOOXSXZ1RtC/s1600/importance_sample_hero_3.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhv1sIstNDTfaG3S6t4qCmknrS5iHDDLSQYo22Tu3r8LIHet6DKmLktC5zIg1FHYGBIvgiMErTetfc828IpP4E-Edf0lnbZwwq7DKXowgG9PiQAmNeI04CQFDDPoGRqj9I1RMOOXSXZ1RtC/s200/importance_sample_hero_3.png" width="200" /></a>
</td>
</tr>
<tr>
<td align="center" colspan="3"><span style="font-size: x-small;">3 different spectral rendered images with hero wavelength using 100 samples per pixel,</span><br />
<span style="font-size: x-small;">The color looks a bit different between all 3 images with a noticeable red tint in the middle image.</span></td>
</tr>
</tbody></table>
<br />
To make the render converge faster, let's consider the CIE XYZ standard observer curves below: samples with wavelength >650nm and <420nm have only a small influence on the output image. So I tried to place more samples around the center of the visible wavelength range.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2_DEo4xn8J0K6ONf5isMe4IDyu-Dzoigqj7HMlQ9piOZEyLMsf43v3R59TmlbXq0mSKKUTxdyk8FKlDJnmH_Z89EA_y9lr4FVUvvprALpXFcm77OCCqdrUMhnpRyP4M2iLgquCDf4FrGz/s1600/CIE_XYZ.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="241" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2_DEo4xn8J0K6ONf5isMe4IDyu-Dzoigqj7HMlQ9piOZEyLMsf43v3R59TmlbXq0mSKKUTxdyk8FKlDJnmH_Z89EA_y9lr4FVUvvprALpXFcm77OCCqdrUMhnpRyP4M2iLgquCDf4FrGz/s400/CIE_XYZ.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">CIE 1931 Color Matching Function from Wikipedia</td></tr>
</tbody></table>
My first (failed) attempt was to use a cos-weighted PDF curve like this to randomly pick 3 wavelengths for each ray:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVVoeV2zRsaSom2s1A28ZmyGWKiqfpnfCwFr3puW7tpGU7P7TkM-pIkUUneF3AbcXh1qHyeYEYFTaUaWiT0CR2byZ1xr78QJSePTk_mhJsXxfes695lauB33gbh7Ay_Kif1heq-YMEnHoJ/s1600/cos_graph.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVVoeV2zRsaSom2s1A28ZmyGWKiqfpnfCwFr3puW7tpGU7P7TkM-pIkUUneF3AbcXh1qHyeYEYFTaUaWiT0CR2byZ1xr78QJSePTk_mhJsXxfes695lauB33gbh7Ay_Kif1heq-YMEnHoJ/s320/cos_graph.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"></td></tr>
</tbody></table>
A normalization constant is computed so that the PDF integrates to one, and then the CDF can be computed. To pick a random sample from this PDF, the <a href="https://en.wikipedia.org/wiki/Inverse_transform_sampling">inverse method</a> can be used. To simplify the calculation, the PDF is centered at 0 with width 200 instead of the [380, 780] range. After sampling λ from the inverse CDF, λ is shifted by 580 to make it lie in the [380, 780] range. To find the inverse of the CDF:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkmPWsJ66jAQrj4mFpM0zBLuD9Y5lsm8DYMLuDy4jEtx7ZM3MzXG2HTd862z_MchGRpltE4Sc5mDbV7X89d9IH2aI5bRbDoqrv_Ry76EZTZdUInauhvC0YvhsPeyo7g7gKoUyG_a_Z3l38/s1600/cos_pdf_cdf.png" imageanchor="1"><img border="0" height="150" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkmPWsJ66jAQrj4mFpM0zBLuD9Y5lsm8DYMLuDy4jEtx7ZM3MzXG2HTd862z_MchGRpltE4Sc5mDbV7X89d9IH2aI5bRbDoqrv_Ry76EZTZdUInauhvC0YvhsPeyo7g7gKoUyG_a_Z3l38/s400/cos_pdf_cdf.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Compute inverse CDF of the cos weighted PDF (with w=200)</td></tr>
</tbody></table>
<br />
Unfortunately, this cannot be inverted analytically, as <a href="https://www.quora.com/How-do-you-solve-for-x-in-y-x+sin-x">mentioned here</a>. So <a href="https://en.wikipedia.org/wiki/Newton%27s_method">Newton's method</a> (with 15 iterations) is used, as suggested in <a href="https://www.quora.com/How-do-you-solve-for-x-in-y-x+sin-x">this post</a>. A small sketch of the sampling routine follows, with the rendered results after it:<br />
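(A sketch in Python, using the w=200 convention above; the initial guess and the guard against the zero-density endpoints are my own choices, not necessarily the demo's.)<br />
<pre>import math

def sample_cos_pdf(u, w=200.0, iters=15):
    # PDF p(x) = (1 + cos(pi*x/w)) / (2w) on [-w, w]; its CDF
    # F(x) = (x + w)/(2w) + sin(pi*x/w)/(2*pi) has no closed-form inverse,
    # so solve F(x) = u with Newton's method: x -= (F(x) - u) / p(x).
    x = (2.0 * u - 1.0) * w                  # start from the uniform inverse
    for _ in range(iters):
        F = (x + w) / (2.0 * w) + math.sin(math.pi * x / w) / (2.0 * math.pi)
        p = (1.0 + math.cos(math.pi * x / w)) / (2.0 * w)
        x -= (F - u) / max(p, 1e-7)
        x = min(max(x, -w), w)               # keep the iterate in the support
    return x + 580.0                         # shift into [380, 780] nm
</pre>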
<br />
<table>
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1nnGosc-mAncf_T0RapgfkRzbqakNUCRgtQJcDf2CWeWALInTA1JXSw6S8SPmvanXttMz0Lna67mHJDWe7e4w0O-JujZqmnnRRgn7sRGN25NN-W2PPM0Bw-NuYoHQiv8BXaAKv9FpdIgW/s1600/importance_sample_cos_1.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1nnGosc-mAncf_T0RapgfkRzbqakNUCRgtQJcDf2CWeWALInTA1JXSw6S8SPmvanXttMz0Lna67mHJDWe7e4w0O-JujZqmnnRRgn7sRGN25NN-W2PPM0Bw-NuYoHQiv8BXaAKv9FpdIgW/s200/importance_sample_cos_1.png" width="200" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhA4Uy7-0xna3v85aq6xW6cj9KMsRT3aSYi7LKhfARdhaKxY0B6Fdel6Z6xtHBv9a7JWSKfxM59yqAFkAyejmM-iryyccJkkcogIXeydfT17evH5wGshuoQlZt_cIwfZetd6ywnbqoZ-H6P/s1600/importance_sample_cos_2.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhA4Uy7-0xna3v85aq6xW6cj9KMsRT3aSYi7LKhfARdhaKxY0B6Fdel6Z6xtHBv9a7JWSKfxM59yqAFkAyejmM-iryyccJkkcogIXeydfT17evH5wGshuoQlZt_cIwfZetd6ywnbqoZ-H6P/s200/importance_sample_cos_2.png" width="200" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqM8aAO1GWhsvu0R6sYLJFJPEy0muo7VtqSwkTYrOXTiPhe8bj1gIltEluRVtat5ZNXqi2ZInbTREtX9VN9Nfa0q_nUzGULq5rkyGsb-FbUVx7k9OS79aSEH2CabyY9bfhigIilgAF3y-t/s1600/importance_sample_cos_3.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhqM8aAO1GWhsvu0R6sYLJFJPEy0muo7VtqSwkTYrOXTiPhe8bj1gIltEluRVtat5ZNXqi2ZInbTREtX9VN9Nfa0q_nUzGULq5rkyGsb-FbUVx7k9OS79aSEH2CabyY9bfhigIilgAF3y-t/s200/importance_sample_cos_3.png" width="200" /></a>
</td>
</tr>
<tr>
<td align="center" colspan="3"><span style="font-size: x-small;">3 different spectral rendered images with cos-weighted PDF using 100 samples per pixel,</span><br />
<span style="font-size: x-small;">The color still looks a bit different between all 3 images...
</span></td>
</tr>
</tbody></table>
<span class="q-box qu-userSelect--text" style="box-sizing: border-box; direction: ltr;"><span style="font-style: normal; font-weight: normal;"><br /></span></span>
<span class="q-box qu-userSelect--text" style="box-sizing: border-box; direction: ltr;"><span style="font-style: normal; font-weight: normal;">Sadly, the result is not </span></span>improved, which gives more color variance than the hero wavelength method...<br />
<span class="q-box qu-userSelect--text" style="box-sizing: border-box; direction: ltr;"><span style="font-style: normal; font-weight: normal;"><br /></span></span>
<span class="q-box qu-userSelect--text" style="box-sizing: border-box; direction: ltr;"><span style="font-style: normal; font-weight: normal;">So I google for a while and found another paper: <a href="https://www.researchgate.net/publication/228938842_An_Improved_Technique_for_Full_Spectral_Rendering">"An Improved Technique for Full Spectral Rendering"</a>. It suggests to use the cosh function for PDF, which its CDF can be inverted analytically: </span></span><br />
<table>
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgyqwnCS9cHeVbycKwco3q9PTm71OmANgd8pB3pdPs7qWytEzhBoa4gdW9ISR9e4J50Uot2cCa6Ia2jbyEiRGpoNdV3BTKgYsfhaU3kQqq9s-b2WIurGY8wbUYAMnTrFi6BM_ZE-523ld1X/s1600/cosh_pdf.png" imageanchor="1" style="clear: left; display: inline !important; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="65" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgyqwnCS9cHeVbycKwco3q9PTm71OmANgd8pB3pdPs7qWytEzhBoa4gdW9ISR9e4J50Uot2cCa6Ia2jbyEiRGpoNdV3BTKgYsfhaU3kQqq9s-b2WIurGY8wbUYAMnTrFi6BM_ZE-523ld1X/s320/cosh_pdf.png" width="320" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyJ9DICNxACXeI3MBnvtpyzgbRhXuzttgaFTn-UYcYiUwPr_XNhCkD-uQZy8HTiMy5bAKPWQDnWF2qVotUGBwfzygcm5mPkT-wPKyP1w7lEvAmNb5rwFhJ5vTcWsaJ4HgSEve1UtNTVWWF/s1600/cosh_graph.png" imageanchor="1"><img border="0" height="156" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyJ9DICNxACXeI3MBnvtpyzgbRhXuzttgaFTn-UYcYiUwPr_XNhCkD-uQZy8HTiMy5bAKPWQDnWF2qVotUGBwfzygcm5mPkT-wPKyP1w7lEvAmNb5rwFhJ5vTcWsaJ4HgSEve1UtNTVWWF/s320/cosh_graph.png" width="320" /></a>
</td>
</tr>
</tbody></table>
The paper only suggests using that PDF curve with center B = 538nm and A = 0.0072. Since this shape is similar to my cos-weighted PDF, the color convergence rate is similar (so I just skipped capturing screenshots for this case)... But what if we use 3 of these curves with their centers lying around the peaks of the XYZ standard observer curves? To find out, I computed the normalization constant within the [380, 780]nm range, and then the CDF and inverse CDF:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaBKY8xh4JgxiZpgUSibpfhdLP9Lr_JnYTTcvByPP4vCnL0KBRa06YSsoJsfHYoMxBFvIHYcZWsbg22HTUJrHaY0MD-20cph6Tn3oMO-1t2AMeW59qeGKdlK0oq_ATaCccRSmPgmur-yhn/s1600/cosh_pdf_cdf.png" imageanchor="1"><img border="0" height="232" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgaBKY8xh4JgxiZpgUSibpfhdLP9Lr_JnYTTcvByPP4vCnL0KBRa06YSsoJsfHYoMxBFvIHYcZWsbg22HTUJrHaY0MD-20cph6Tn3oMO-1t2AMeW59qeGKdlK0oq_ATaCccRSmPgmur-yhn/s640/cosh_pdf_cdf.png" width="640" /></a><br />
<br />
By using 3 different PDFs to sample the wavelengths (<span style="font-size: x-small;">A<span style="font-size: xx-small;">0</span>=0.0078, B<span style="font-size: xx-small;">0</span>= 600, A<span style="font-size: xx-small;">1</span>= 0.0072, B<span style="font-size: xx-small;">1</span>= 535, A<span style="font-size: xx-small;">2</span>= 0.0062, B<span style="font-size: xx-small;">2</span>= 445 are used in the demo</span>), the image converges much faster than with hero wavelength sampling. About 100 SPP is often enough to get a color similar to the converged image. <br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4ejefdmIRzmd9wt9T_zlhitJLh2zUH0QrrKwcVKU-xzRqlhPDCrayQSTW6LPp__Y95ONROTk4qng6mWImcfonA1c9-fYxxZ0-ulkWvR1cCMI7eVfCYVmug4eK26WkqomblaERHeBR2whM/s1600/importance_sample_cosh_XYZ.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="186" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4ejefdmIRzmd9wt9T_zlhitJLh2zUH0QrrKwcVKU-xzRqlhPDCrayQSTW6LPp__Y95ONROTk4qng6mWImcfonA1c9-fYxxZ0-ulkWvR1cCMI7eVfCYVmug4eK26WkqomblaERHeBR2whM/s320/importance_sample_cosh_XYZ.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Rendered with 3 different cosh-curves PDF using 100SPP</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinoDZ7tr3sCiPLsR9mIpI7FTMHRMrW_9xTsTG1UL5yyuYDn4Y6Z2Oa-Y-AX3OJ0cP798l47-A-uqV9wNf6l2cf7cWNM-siKDVvWpCGa4NaCf_z9UT_pcKb-EE7t9Cxq-Nsj0mV9k962XyR/s1600/importance_sample_cosh_XYZ_converged.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="186" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinoDZ7tr3sCiPLsR9mIpI7FTMHRMrW_9xTsTG1UL5yyuYDn4Y6Z2Oa-Y-AX3OJ0cP798l47-A-uqV9wNf6l2cf7cWNM-siKDVvWpCGa4NaCf_z9UT_pcKb-EE7t9Cxq-Nsj0mV9k962XyR/s320/importance_sample_cosh_XYZ_converged.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Converged spectral rendered image.</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
Another problem with the color variance of hero wavelength sampling is camera movement. Since my demo is an interactive path tracer, when the camera moves, the path tracer re-generates the wavelength samples, which changes the color greatly every frame:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZPePANwvl_CMKTeBDcCggo5SMe2YxaZX_PA3TFxq1BnWlURowZp2ToGZVZny8AGubeN5rdGFsSTYXAU-jvF-DB8vvdxtiOGhvkwv6dfNOcygnVno6Wq6QP7jMiomy5HWSfqhLNTDz6IY7/s1600/camera_move_hero.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="480" data-original-width="854" height="223" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZPePANwvl_CMKTeBDcCggo5SMe2YxaZX_PA3TFxq1BnWlURowZp2ToGZVZny8AGubeN5rdGFsSTYXAU-jvF-DB8vvdxtiOGhvkwv6dfNOcygnVno6Wq6QP7jMiomy5HWSfqhLNTDz6IY7/s400/camera_move_hero.gif" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Camera movement with hero wavelength sampling</td></tr>
</tbody></table>
<br />
To give a better color preview during the first few samples of the path tracing, the random numbers are stratified into 9 regions so that the first ray picks 3 random wavelengths lying around 600nm, 535nm and 445nm when substituted into the inverse CDFs of the cosh-weighted curves, which gives some red, green and blue color.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJ6Ud56ouDse77NJ7eqwNUW_a0tbxRZE9ZlixwFVzWHX0647W_bq7v45VSAYvyvUTcJj0IjktbV8Nu6uAywqqrJo7DYOsFC69esXX4JYM4nf3ZygsQGA4o8a6UqyLpvs3LIsdgOS_8gD2H/s1600/stratifiedRand.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="120" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJ6Ud56ouDse77NJ7eqwNUW_a0tbxRZE9ZlixwFVzWHX0647W_bq7v45VSAYvyvUTcJj0IjktbV8Nu6uAywqqrJo7DYOsFC69esXX4JYM4nf3ZygsQGA4o8a6UqyLpvs3LIsdgOS_8gD2H/s640/stratifiedRand.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Code to generate stratified random numbers P0, P1, P2 within [0, 1] range.</td></tr>
</tbody></table>
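A rough Python equivalent of the idea in the screenshot above (the stratum visit order here is my own assumption, not necessarily the demo's exact code):<br />
<pre>import random

def stratified_p3(sample_index):
    # Split [0, 1] into 9 strata and visit them middle-out, so sample 0
    # draws P0, P1, P2 near 0.5; the inverse CDFs of the 3 cosh curves then
    # map them to wavelengths near 600nm, 535nm and 445nm (some R, G, B).
    order = [4, 3, 5, 2, 6, 1, 7, 0, 8]
    stratum = order[sample_index % 9]
    return [(stratum + random.random()) / 9.0 for _ in range(3)]
</pre>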
<br />
With these stratified random numbers, color variation is reduced during camera movement:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2XxpHBHHYSZiF4XknAD-qh9hKUlDm92gXJKj6qWQ-K_JVDjYFQ2OAucCRCZvXnf1rfUYohBvnlG4xiXTzTOnUBUBzdP1lqlmCOh88NNM8INbOLE7ruXkhqMGfLVUEAXmsJaXoTwzrqgPr/s1600/camera_move_cosh.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="480" data-original-width="854" height="223" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2XxpHBHHYSZiF4XknAD-qh9hKUlDm92gXJKj6qWQ-K_JVDjYFQ2OAucCRCZvXnf1rfUYohBvnlG4xiXTzTOnUBUBzdP1lqlmCOh88NNM8INbOLE7ruXkhqMGfLVUEAXmsJaXoTwzrqgPr/s400/camera_move_cosh.gif" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Camera movement with stratified random numbers.</td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Conclusion</span></b><br />
In this post, I have described how a basic spectral path tracer can be implemented. The spectral rendered image is a bit different from the RGB rendered image (the RGB rendered image is a bit more reddish than the spectral traced one). This may be due to the spectral up-sampling method used, or to not using a D65 light source. However, the bounced light intensity is not much different between tracing in spectral and ACEScg space. In the future I would like to try different light sources such as illuminant E/D/F to see how they affect the color. I would also like to have a technique to spectrally up-sample albedo in a wide color gamut instead of sRGB only.<br />
<br />
<b>References</b><br />
<span style="font-size: x-small;">[1] <a href="https://cgg.mff.cuni.cz/~wilkie/Website/EGSR_14_files/WNDWH14HWSS.pdf">https://cgg.mff.cuni.cz/~wilkie/Website/EGSR_14_files/WNDWH14HWSS.pdf</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space">https://en.wikipedia.org/wiki/CIE_1931_color_space</a></span><br />
<span style="font-size: x-small;">[3] <a href="https://graphics.geometrian.com/research/spectral-primaries.html">https://graphics.geometrian.com/research/spectral-primaries.html</a></span><br />
<span style="font-size: x-small;">[4] <a href="https://rgl.s3.eu-central-1.amazonaws.com/media/papers/Jakob2019Spectral_3.pdf">https://rgl.s3.eu-central-1.amazonaws.com/media/papers/Jakob2019Spectral_3.pdf</a></span><br />
<span style="font-size: x-small;">[5] <a href="http://jcgt.org/published/0002/02/01/paper.pdf">http://jcgt.org/published/0002/02/01/paper.pdf</a> </span><br />
<span style="font-size: x-small;">[6] <a href="https://www.researchgate.net/publication/228938842_An_Improved_Technique_for_Full_Spectral_Rendering">https://www.researchgate.net/publication/228938842_An_Improved_Technique_for_Full_Spectral_Rendering</a> </span><br />
<br />
<br />
<br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-42695078911315570362020-04-22T03:30:00.001+08:002020-04-22T03:30:58.722+08:00HDR Display<span style="font-size: large;"><b>Introduction</b></span><br />
Continuing with the DXR Path Tracer from the last post, I updated the demo to support HDR display. It also has various small features added, such as per-monitor high DPI support, a path trace resolution scale (e.g. path trace at 1080p and bilinearly upscale to 4K) and dithering of the tone mapped output to reduce color banding (integer back buffer format only). The updated demo can be downloaded <a href="https://drive.google.com/file/d/1v6uxc9Yyz-HmorJTHouZkxOQLMObjoAI/view?usp=sharing"><b>here</b></a>.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgn5GIICqvt2EPPcZ-KrluoyfFNcNo2WXX7S5QymVrza_6ifA5pcia6lFtUD6sy3J3sGQp-769BRvWPrbXGrvPsbaNlV9kokaK_6QWsFx4qeUV3mh_yj92NfQIg7-H1LFUVtEFPKwsu0UTa/s1600/SDR_HDR_1.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="352" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgn5GIICqvt2EPPcZ-KrluoyfFNcNo2WXX7S5QymVrza_6ifA5pcia6lFtUD6sy3J3sGQp-769BRvWPrbXGrvPsbaNlV9kokaK_6QWsFx4qeUV3mh_yj92NfQIg7-H1LFUVtEFPKwsu0UTa/s640/SDR_HDR_1.JPG" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A photo of my HDR TV to show the difference between SDR and HDR</td></tr>
</tbody></table>
<br />
<span style="font-size: large;"><b>HDR Color Spaces</b></span><br />
There are 2 swapchain format/color space combinations that can be chosen to output HDR images on an HDR capable monitor/TV (a setup sketch follows the list):<br />
<blockquote class="tr_bq">
1. Rec 2020 color space<span style="font-size: x-small;"> (DXGI_FORMAT_R10G10B10A2_UNORM + DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020)</span><br />
2. scRGB color space <span style="font-size: x-small;">(DXGI_FORMAT_R16G16B16A16_FLOAT + DXGI_COLOR_SPACE_RGB_FULL_G10_NONE_P709)</span></blockquote>
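As a minimal sketch (not the demo's exact code), selecting one of these combinations looks roughly like this, assuming the swapchain was already created with the matching back buffer format:
<pre>
#include &lt;dxgi1_6.h&gt;

// Pick one of the two pairs above: R16G16B16A16_FLOAT for scRGB,
// R10G10B10A2_UNORM for HDR10 (error handling trimmed).
void SetHdrColorSpace(IDXGISwapChain4* swapChain, bool useScRGB)
{
    DXGI_COLOR_SPACE_TYPE colorSpace = useScRGB
        ? DXGI_COLOR_SPACE_RGB_FULL_G10_NONE_P709       // linear scRGB
        : DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020;   // HDR10 / PQ

    // Verify the display path supports the color space before setting it.
    UINT support = 0;
    if (SUCCEEDED(swapChain-&gt;CheckColorSpaceSupport(colorSpace, &amp;support)) &amp;&amp;
        (support &amp; DXGI_SWAP_CHAIN_COLOR_SPACE_SUPPORT_FLAG_PRESENT))
    {
        swapChain-&gt;SetColorSpace1(colorSpace);
    }
}
</pre>
<br />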
Rec2020 color space is the common <a href="https://en.wikipedia.org/wiki/High-dynamic-range_video#HDR10">HDR10 format with PQ EOTF</a>. But it is recommended to <a href="https://youtu.be/OvLuQliiJlg?t=1752">use scRGB color space on Windows</a>. scRGB uses the same color primaries as Rec709, and supports wide color using negative values. I was confused about using negative values to represent a color (as well as an intensity) at first. Because a color gamut is usually displayed as a <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space#CIE_xy_chromaticity_diagram_and_the_CIE_xyY_color_space">CIE xy chromaticity diagram</a>, I used to think of a given RGB value as a color interpolated inside the Red/Green/Blue gamut triangle using barycentric coordinates.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN42uI80Zvj8txadM2Gd4ue0_bdRhEucEaYEKHIKm3SkCxMES48id1qEXYVnJcDNOxj9N5WvvFmJZOJqvt2-Jtv3ZUXG9agziiHQ5S7n6RaWj6a2gpuQ_PWRt4mKG7H10wkOerDAbOT7SV/s1600/gamut_test.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN42uI80Zvj8txadM2Gd4ue0_bdRhEucEaYEKHIKm3SkCxMES48id1qEXYVnJcDNOxj9N5WvvFmJZOJqvt2-Jtv3ZUXG9agziiHQ5S7n6RaWj6a2gpuQ_PWRt4mKG7H10wkOerDAbOT7SV/s320/gamut_test.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A debug chromaticity diagram showing Rec2020 color gamut. <br />
Using a HDR backbuffer will show less clipped color</td></tr>
</tbody></table>
<span class="st">Although, it makes sense to represent a wide color using negative numbers using </span><span class="st">barycentric </span>interpolation, I was confused how it to represent the intensity at the same time. It is because the chromaticity diagram skipped the luminance information. So, instead of thinking color inside the horseshoe-shaped diagram, it is easier for me to think about the color in 3D <a href="https://en.wikipedia.org/wiki/CIE_1931_color_space">XYZ color space</a>. The <span class="st">Red/Green/Blue color primaries of a gamut is 3 basis vectors in the XYZ color space. A </span>linear combination of RGB values with the color primaries basis vectors can represent a color and intensity. Thinking in this way make me feel more comfortable. So the path traced lighting values are fed into <a href="https://github.com/ampas/aces-dev/tree/master/transforms/ctl/outputTransforms">ACES HDR tone mapper</a>, transformed into Rec709 color space, and then divided by 80 when using scRGB color space(scRGB requires value of 1 to represent 80 nit).<br />
<br />
<span style="font-size: large;"><b>HDR10 metadata</b></span><br />
I have also played around with <a href="https://docs.microsoft.com/en-us/windows/win32/api/dxgi1_5/ns-dxgi1_5-dxgi_hdr_metadata_hdr10">HDR10 metadata</a> to see how it affects the image. But most of the data does not affect the image on my <a href="https://www.samsung.com/hk_en/tvs/puhd-mu7300/UA49MU7300JXZK/">Samsung UA49MU7300JXKZ</a> TV. The only data that has an effect on the image is "Max Mastering Luminance", which can affect the image brightness: setting it to a small value will make the image darker despite outputting a very bright 1000 nit image. Also, the HDR10 metadata only works in Full Screen Exclusive mode, or in borderless windowed mode with the <a href="https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-setwindowpos">HWND_TOPMOST Z order</a> (I guess the <a href="https://devblogs.microsoft.com/directx/demystifying-full-screen-optimizations/">full screen optimization</a> gets enabled); using a borderless window with the HWND_TOP Z order won't work (but this mode is easier to alt-tab on a single monitor setup...). Besides, entering Full Screen Exclusive mode may fail when calling <a href="https://docs.microsoft.com/en-us/windows/win32/api/dxgi/nf-dxgi-idxgiswapchain-setfullscreenstate">SetFullscreenState()</a> if the display is not connected to the adapter used for rendering. I didn't notice this until I started to work on a laptop which uses the RTX graphics card for ray tracing while the laptop monitor is connected to the Intel graphics card. Looks like some hard work needs to be done to support Full Screen Exclusive mode properly (e.g. create a D3D device/command queue/swapchain for the Intel graphics card and copy the ray traced image from the RTX card to the Intel card for the full screen swapchain). Unfortunately, my demo does not support multi-adapter, so the HDR10 metadata may not work on such a setup (I am outputting to the external HDR TV using the RTX graphics card, so it doesn't create much of a problem for me...). A sketch of setting the metadata follows below.<br />
<br />
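Below is a sketch of filling and setting the metadata; the mastering values are illustrative, and the unit scaling follows Microsoft's D3D12HDR sample (chromaticities in units of 0.00002, mastering luminances in units of 0.0001 nit):
<pre>
#include &lt;dxgi1_6.h&gt;

void SetHdr10Metadata(IDXGISwapChain4* swapChain)
{
    DXGI_HDR_METADATA_HDR10 meta = {};
    meta.RedPrimary[0]   = UINT16(0.708f  * 50000.0f);  // Rec2020 red
    meta.RedPrimary[1]   = UINT16(0.292f  * 50000.0f);
    meta.GreenPrimary[0] = UINT16(0.170f  * 50000.0f);  // Rec2020 green
    meta.GreenPrimary[1] = UINT16(0.797f  * 50000.0f);
    meta.BluePrimary[0]  = UINT16(0.131f  * 50000.0f);  // Rec2020 blue
    meta.BluePrimary[1]  = UINT16(0.046f  * 50000.0f);
    meta.WhitePoint[0]   = UINT16(0.3127f * 50000.0f);  // D65 white point
    meta.WhitePoint[1]   = UINT16(0.3290f * 50000.0f);
    meta.MaxMasteringLuminance     = UINT(1000.0f * 10000.0f); // 1000 nits
    meta.MinMasteringLuminance     = UINT(0.001f  * 10000.0f); // 0.001 nit
    meta.MaxContentLightLevel      = 1000;  // nits
    meta.MaxFrameAverageLightLevel = 200;   // nits
    swapChain-&gt;SetHDRMetaData(DXGI_HDR_METADATA_TYPE_HDR10,
                              sizeof(meta), &amp;meta);
}
</pre>
<br />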
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVC2P8g3tl6VT7vcgMeK7i9vAtpaB5y1V0klltHmLqBDBuLrLuvZJvsiEZGpKo3gomhhKRxQxqT3HKfxSh_wZo8poAr0vo5rU5kLyJb8hk4R5fkxi-BDQ2h4Jtux9-WD5ef34jjZP27uG5/s1600/TV_setting.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVC2P8g3tl6VT7vcgMeK7i9vAtpaB5y1V0klltHmLqBDBuLrLuvZJvsiEZGpKo3gomhhKRxQxqT3HKfxSh_wZo8poAr0vo5rU5kLyJb8hk4R5fkxi-BDQ2h4Jtux9-WD5ef34jjZP27uG5/s320/TV_setting.JPG" width="283" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Capability of my HDR TV</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIfnIfWZUzzotgHO0oCiCueZlBR5hEWORZS3StU3ZLR5h1A1D7Ckn8yWcFuRJJGVOHydv-QLHuiQJQrsxKyLaOjjjNClnMQ3wjwoUXINfRiMhMm5ZFL7aUBsRX4Te2lQfErxsAo6ttjja8/s1600/HDR10metadata.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="302" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIfnIfWZUzzotgHO0oCiCueZlBR5hEWORZS3StU3ZLR5h1A1D7Ckn8yWcFuRJJGVOHydv-QLHuiQJQrsxKyLaOjjjNClnMQ3wjwoUXINfRiMhMm5ZFL7aUBsRX4Te2lQfErxsAo6ttjja8/s320/HDR10metadata.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">UI showing all the adjustable HDR10 meta in the demo</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
<b><span style="font-size: large;">UI</span></b><br />
Blending the SDR UI with the HDR image is handled in 2 steps: blend the color, then the brightness. All the UI is rendered into an off-screen buffer (in 8-bit Rec709 color space) and later blended with the ACES tone mapped image. Looking at the <a href="https://github.com/ampas/aces-dev/blob/master/transforms/ctl/lib/ACESlib.OutputTransforms.ctl">ACES tone mapping function snippet below</a>, the lighting value is mapped to a normalized range in AP1 color space as an intermediate step (red part in the code snippet). So in the demo, the UI in the off-screen buffer is converted to AP1 color space and blended with the tone mapped image at this step; a sketch of this color blend follows the snippet. (I have also tried blending in XYZ space and the result is similar in the demo.)<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbQQGv4knlWBzi0rGl6TavsPiu8Owv_Qb9bK5za5kqR-VBqOKQv08wSg_FRjIg859bh8sFi-4_YEZCHuNV43iV31_GBF_iHsWQFnlmxo26kr3bmT8cInNCVbPthca9kHeOggWD2ZSh75OB/s1600/HDR_tonemap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="342" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbQQGv4knlWBzi0rGl6TavsPiu8Owv_Qb9bK5za5kqR-VBqOKQv08wSg_FRjIg859bh8sFi-4_YEZCHuNV43iV31_GBF_iHsWQFnlmxo26kr3bmT8cInNCVbPthca9kHeOggWD2ZSh75OB/s400/HDR_tonemap.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">ACES tone map function snippet</td></tr>
</tbody></table>
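A sketch of the color blend step (the Rec709-to-AP1 matrix values are the commonly quoted Bradford-adapted ones, and the helper names are mine, not the demo's):
<pre>
#include &lt;cmath&gt;

struct float3 { float x, y, z; };

// Inverse sRGB EOTF: decode the 8-bit UI buffer to linear Rec709.
static float SRGBToLinear(float c)
{
    return (c &lt;= 0.04045f) ? c / 12.92f
                           : std::pow((c + 0.055f) / 1.055f, 2.4f);
}

// Linear Rec709 -&gt; ACEScg(AP1), Bradford-adapted D65 -&gt; D60
// (commonly quoted values, approximate).
static float3 Rec709ToAP1(float3 c)
{
    return { 0.6131f * c.x + 0.3395f * c.y + 0.0474f * c.z,
             0.0702f * c.x + 0.9164f * c.y + 0.0134f * c.z,
             0.0206f * c.x + 0.1096f * c.y + 0.8698f * c.z };
}

// Alpha blend the UI over the normalized tone mapped AP1 value
// (the red part of the snippet above).
float3 BlendUIOverAP1(float3 toneMappedAP1, float3 uiSRGB, float uiAlpha)
{
    float3 ui = Rec709ToAP1({ SRGBToLinear(uiSRGB.x),
                              SRGBToLinear(uiSRGB.y),
                              SRGBToLinear(uiSRGB.z) });
    return { ui.x * uiAlpha + toneMappedAP1.x * (1.0f - uiAlpha),
             ui.y * uiAlpha + toneMappedAP1.y * (1.0f - uiAlpha),
             ui.z * uiAlpha + toneMappedAP1.z * (1.0f - uiAlpha) };
}
</pre>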
<br />
Then the UI blended image can be transformed into the target color space (e.g. scRGB or Rec2020, purple part in the above code snippet). When converting the normalized color data to HDR data, the ACES tone mapper interpolates the RGB values between Y_MIN and Y_MAX (i.e. blue part in the above code snippet). During this brightness interpolation, the demo adjusts Y_MAX (e.g. 1000 to 4000 nits) toward the user defined UI brightness (e.g. 80 to 300 nits) depending on the UI alpha, using the following formula:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBq7ivzA6LvEFJbwoqu6nI59TqkuFzT72I5RdGz6TA2oE4leITiJ1zBimDlQ9_B6k6_067-EadcE-tAYl9VFDTvTC5IROrLFZSGSb7Gu01FldOfnToOieECmSnor4oATokhQ4W_RXkFyUG/s1600/Y_max.png" imageanchor="1"><img border="0" height="40" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBq7ivzA6LvEFJbwoqu6nI59TqkuFzT72I5RdGz6TA2oE4leITiJ1zBimDlQ9_B6k6_067-EadcE-tAYl9VFDTvTC5IROrLFZSGSb7Gu01FldOfnToOieECmSnor4oATokhQ4W_RXkFyUG/s400/Y_max.png" width="400" /></a><br />
<br />
In the demo, BlendPow defaults to 5. Although the result is not perfect (e.g. the UI alpha may not look like it blends linearly, depending on the background luminance), it works well enough to avoid bleed-through from a bright background when the UI alpha &gt; 0.5:<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5amV69Bi7IXyiBPyDU8ndWaH35nZ-Hy9vfusBtGIiX3GOed-PRQ6fmXIZGJocEQUes8LuAp5HFYHeSIa3jN6Kc6y9bWBuyltLTjVOWb_XuftD06pKl39MVVpq4SbYh0eXVWQUYXrBTl9O/s1600/UI_blend_dark.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="111" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5amV69Bi7IXyiBPyDU8ndWaH35nZ-Hy9vfusBtGIiX3GOed-PRQ6fmXIZGJocEQUes8LuAp5HFYHeSIa3jN6Kc6y9bWBuyltLTjVOWb_XuftD06pKl39MVVpq4SbYh0eXVWQUYXrBTl9O/s200/UI_blend_dark.JPG" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Background Luminance: 0-10 nit</td></tr>
</tbody></table>
</td><td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhJBnrInMLXJ0Fx6JYCUSLxvQ-F3E3cLFdPhvBO-iESAJ11QDRSiYE1338xplOTxUG76Q2d5Vjy9_peZC21vyKxuItpWwRIH1_5WHvDKIdXYNNpoMLwCxjteNblC6tGl2ZJOjgX8Dy0EK2/s1600/UI_blend_normal.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="112" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhJBnrInMLXJ0Fx6JYCUSLxvQ-F3E3cLFdPhvBO-iESAJ11QDRSiYE1338xplOTxUG76Q2d5Vjy9_peZC21vyKxuItpWwRIH1_5WHvDKIdXYNNpoMLwCxjteNblC6tGl2ZJOjgX8Dy0EK2/s200/UI_blend_normal.JPG" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Background Luminance: 2 - 250 nit</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsWS1sRxX9JXTvnDqfHzNbL-Mhnujpu2pmwkVgWKav-72thnOZ53zOYIGtC19mn_rkXknKOTH05n2g7bwrX4B2KqysIB-BbT_s3akOd35_D9_SWeYOQ75vK-_i6Ibn9goXGbSirxPUZ-J6/s1600/UI_blend_bright.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="112" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgsWS1sRxX9JXTvnDqfHzNbL-Mhnujpu2pmwkVgWKav-72thnOZ53zOYIGtC19mn_rkXknKOTH05n2g7bwrX4B2KqysIB-BbT_s3akOd35_D9_SWeYOQ75vK-_i6Ibn9goXGbSirxPUZ-J6/s200/UI_blend_bright.JPG" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Background Luminance: 1000 nit</td></tr>
</tbody></table>
</td>
</tr>
<tr>
<td align="center" colspan="3"><span style="font-size: x-small;">Photos showing UI blending with HDR background, from dark to bright.</span></td>
</tr>
</tbody></table>
<br />
However, the above blending formula has an artifact when Y_MAX is smaller than the UI brightness (this may not happen in practice and only shows up in some debug view modes). In this case, the background may look too bright after blending with the UI. To fix this, inverting the BlendPow exponent helps to minimize the artifact:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYJTzJXGlWP1MkHyXvUv-ibO-MNIyKeR9q_-lc9ubKfVgmCzP7aou8MuJBjcpXA4ITDYds3v17s3RJluoJwYIfTe3FXgXANZXuvABUfGCI4NkWrKxB5CrZtMj1ZFY7-W8HmMhWWVXNgpS-/s1600/Y_max2.png" imageanchor="1"><img border="0" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYJTzJXGlWP1MkHyXvUv-ibO-MNIyKeR9q_-lc9ubKfVgmCzP7aou8MuJBjcpXA4ITDYds3v17s3RJluoJwYIfTe3FXgXANZXuvABUfGCI4NkWrKxB5CrZtMj1ZFY7-W8HmMhWWVXNgpS-/s400/Y_max2.png" width="400" /></a><br />
<br />
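Since the exact expressions only live in the demo, the following is just a hypothetical reconstruction of the two formulas pictured above, showing the general shape of the Y_MAX adjustment and the inverted-exponent fix:
<pre>
#include &lt;cmath&gt;

// Hypothetical reconstruction. Y_max: tone mapper peak (e.g. 1000-4000
// nits). uiNits: user defined UI brightness (e.g. 80-300 nits).
// blendPow: 5 by default. uiAlpha in [0, 1].
float AdjustYMax(float Y_max, float uiNits, float uiAlpha, float blendPow)
{
    // Invert the exponent when the peak is below the UI brightness, so a
    // mostly transparent UI keeps the darker background (the fix above).
    float p = (Y_max &gt;= uiNits) ? 1.0f / blendPow : blendPow;
    float w = std::pow(uiAlpha, p);   // bias the blend toward the UI
    return Y_max + (uiNits - Y_max) * w;
}
</pre>
<br />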
Below is an extreme example showing the artifact with Y_MAX set to 1 nit and the UI set to 80 nits:<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrUUqaD424APBQaZpRNp982vMcHphveRppF7kjmYdllqi8sS-BSXt0_xutOViIqHGBEBu8H441YC1YrYvaXZlDeEZ8bRFqr9OGr916cBfSfcZR9eVsQ9uZVIZUc59SDy14rumcY315SHIA/s1600/gradient_artifact.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrUUqaD424APBQaZpRNp982vMcHphveRppF7kjmYdllqi8sS-BSXt0_xutOViIqHGBEBu8H441YC1YrYvaXZlDeEZ8bRFqr9OGr916cBfSfcZR9eVsQ9uZVIZUc59SDy14rumcY315SHIA/s320/gradient_artifact.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Without the fix, the artifact is very noticeable at the upper right corner</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW5BX-iccdGYFF_-byHU1gTY1tn0I1FSJxBnZ3x9wSCoPmPyJUtuyoLT4GdN-jXGFLJmQLmIMS_hNQ_8ixiEbDfEYlqsN-CUvGMT93Kyf1FB-OJIG3prhKT5LtXPFWb9BD5_JEidvyJwu7/s1600/gradient_fix.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW5BX-iccdGYFF_-byHU1gTY1tn0I1FSJxBnZ3x9wSCoPmPyJUtuyoLT4GdN-jXGFLJmQLmIMS_hNQ_8ixiEbDfEYlqsN-CUvGMT93Kyf1FB-OJIG3prhKT5LtXPFWb9BD5_JEidvyJwu7/s320/gradient_fix.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">With the fix, the background is darkened properly</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
Looking back at the blue part of the ACES tone mapping function in the above code snippet, I was wondering whether the image would look different given that the Y_MIN and Y_MAX values are interpolated in the display color space (e.g. whether the image looks different when interpolating in scRGB vs Rec2020). To compare whether these 2 colors are the same, we need both colors in the same space, so let's convert the interpolated RGB value back to XYZ space:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiQfEh0UrtuoO5Jd4BELUqgml-9UzASczNKCj9AKtxxgsiGGee_e9G8mVLRa6P6aJLcJ7u8DLz84nbmzfcCS0XM3jgtXAGJNDfp5P-BMcCWcEEfV3IwyaNydF8dKLWvaJTgEMf6UrCXclK/s1600/lerp_compare.png" imageanchor="1"><img border="0" height="310" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiiQfEh0UrtuoO5Jd4BELUqgml-9UzASczNKCj9AKtxxgsiGGee_e9G8mVLRa6P6aJLcJ7u8DLz84nbmzfcCS0XM3jgtXAGJNDfp5P-BMcCWcEEfV3IwyaNydF8dKLWvaJTgEMf6UrCXclK/s320/lerp_compare.png" width="320" /></a><br />
<br />
So interpolating in different display color spaces does make a difference as long as Y_MIN != 0. But the ACES tone mapper defaults to the STRETCH_BLACK option, which effectively sets Y_MIN = 0, so there should be no difference when interpolating values in different color spaces. Out of curiosity, I tried disabling the STRETCH_BLACK option to see whether the image would look different when switching between the scRGB and Rec2020 back buffers with Y_MIN &gt; 0, but the images still look the same even with a large Y_MIN... I am not sure why this happens; maybe the difference is too small to be noticeable... In the demo, I take this into account, treating Y_MIN as a value in the XYZ color space and interpolating like this:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQk7s788YuTzRWrOVjcT1pJBQA_BLCABuocK-ewqevRJG34b6BZ59L38AaOamkyluI4K-Q9ed3ZwOShpfqLo3bSGMzXPtr8ul8qlPeo5s7PQWgWl_v7reF9e5FOEGj8sdfhxwMDZCXq-Xh/s1600/lerp_Y_min_max.png" imageanchor="1"><img border="0" height="48" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQk7s788YuTzRWrOVjcT1pJBQA_BLCABuocK-ewqevRJG34b6BZ59L38AaOamkyluI4K-Q9ed3ZwOShpfqLo3bSGMzXPtr8ul8qlPeo5s7PQWgWl_v7reF9e5FOEGj8sdfhxwMDZCXq-Xh/s200/lerp_Y_min_max.png" width="200" /></a><br />
<br />
<span style="font-size: large;"><b>Debug View Mode</b></span><br />
To help debugging, several debug view modes are added. A "Luminance Range" view mode displays the luminance value of a pixel before tone mapping (i.e. the physical light luminance entering the virtual camera) and after tone mapping (i.e. the output light luminance the monitor should display); a small sketch follows the screenshots:<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjeQEB36TjobxxhcEbJ4L3Rs5MEyI1XDuqS3IHkh9sARZK1IJX1ypSxtpxl9UCES87l3XXZ2KhxOFgZtf4dVDAwWXGBB0UIzTdsmGEKeyp5BU8744hBNOmCwX8xfG-7nF4VJgnfnJPMB2k/s1600/lum_range_abs.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjeQEB36TjobxxhcEbJ4L3Rs5MEyI1XDuqS3IHkh9sARZK1IJX1ypSxtpxl9UCES87l3XXZ2KhxOFgZtf4dVDAwWXGBB0UIzTdsmGEKeyp5BU8744hBNOmCwX8xfG-7nF4VJgnfnJPMB2k/s320/lum_range_abs.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Showing pixel luminance before tone mapping.</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAM6SWQDPl0e_VCw5jLv2s1XRNzIeU87IwfLjYjBL5kFgZ7WG3yQNQhPcHeHNfCFY9ju3-bTPy-uK_slebAjQsaUmYTQFiCXY-Fk1HAF2XxY4dTESrcBs-ScEMXakl7Q-uyJAkn8HrOSAi/s1600/lum_range_toneMapped.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAM6SWQDPl0e_VCw5jLv2s1XRNzIeU87IwfLjYjBL5kFgZ7WG3yQNQhPcHeHNfCFY9ju3-bTPy-uK_slebAjQsaUmYTQFiCXY-Fk1HAF2XxY4dTESrcBs-ScEMXakl7Q-uyJAkn8HrOSAi/s320/lum_range_toneMapped.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Showing pixel luminance after tone mapping.</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
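The luminance value visualized by this view is just the relative luminance of the linear RGB value (in nits before tone mapping, display nits after); assuming Rec709 primaries, it is the standard weighted sum (other rendering spaces would use the Y row of their own RGB-to-XYZ matrix):
<pre>
// Relative luminance of a linear Rec709 RGB value (Rec709 luma weights).
float RelativeLuminance(float r, float g, float b)
{
    return 0.2126f * r + 0.7152f * g + 0.0722f * b;
}
</pre>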
<br />
A "Gamut Clip Test" mode to high light pixel that fall outside Rec709 gamut, i.e. those color can only be viewed with a HDR display or wide color monitor (e.g. AdobeRGB / P3 monitor).<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIpf489V0x1_6R3VsFsa5ubJ5lypHgGps4B7d_CDlSR2ZY_XmFaGRdVWvBbUXvEtEAJsO9-q1ZKK64lyulupf5gftCvMqy1HDa9sqdRwxuGb6GI6-LwjEWzLUfNQ0NthDwObRWuhl9PgN9/s1600/gamut_clip.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjIpf489V0x1_6R3VsFsa5ubJ5lypHgGps4B7d_CDlSR2ZY_XmFaGRdVWvBbUXvEtEAJsO9-q1ZKK64lyulupf5gftCvMqy1HDa9sqdRwxuGb6GI6-LwjEWzLUfNQ0NthDwObRWuhl9PgN9/s320/gamut_clip.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Highlight clipped pixel with cyan color</td></tr>
</tbody></table>
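A minimal sketch of how such a test can be done, assuming the pre-tone-mapped color is in ACEScg: transform to linear Rec709 and flag any negative component (the matrix constants are the commonly quoted approximate values):
<pre>
struct float3 { float x, y, z; };

// ACEScg(AP1) -&gt; linear Rec709, Bradford-adapted D60 -&gt; D65
// (commonly quoted values, approximate).
static float3 AP1ToRec709(float3 c)
{
    return {  1.7050f * c.x - 0.6218f * c.y - 0.0833f * c.z,
             -0.1303f * c.x + 1.1408f * c.y - 0.0105f * c.z,
             -0.0240f * c.x - 0.1290f * c.y + 1.1530f * c.z };
}

// A pixel gets the cyan highlight when its Rec709 representation needs a
// negative component, i.e. the color lies outside the Rec709 triangle.
bool IsOutsideRec709(float3 ap1)
{
    float3 c = AP1ToRec709(ap1);
    return c.x &lt; 0.0f || c.y &lt; 0.0f || c.z &lt; 0.0f;
}
</pre>
<br />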
A "SDR HDR Split" mode to compare the SDR/HDR images. But this mode can only be a rough preview of how SDR will look like on HDR back buffer. It is because SDR images are usually displayed brighter than 80 nit, the SDR part need to be brighten up (say to 100-200 nit) to make it look similar to using a real SDR back buffer. Also, in this view mode, because the HDR back buffer expect a pixel value in nit (either linear or PQ encoded), I don't apply any gamma curve to the SDR portion of the image, which may also result in differences to a real SDR version too.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJOXoPJNlk7LIS05E5aETXtxggWbNU-JQAd4YeVwPAHcakQ42i-vn8cJ3ctkL_Gb2L1MCcOQ2ZacjBLtBF_TBlieOJEfdC4fU3GHQlmo1qDSUn-neItIkqB_Z_kLmdgAA8eHTjy98TQ0qS/s1600/SDR_HDR_2.JPG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="221" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJOXoPJNlk7LIS05E5aETXtxggWbNU-JQAd4YeVwPAHcakQ42i-vn8cJ3ctkL_Gb2L1MCcOQ2ZacjBLtBF_TBlieOJEfdC4fU3GHQlmo1qDSUn-neItIkqB_Z_kLmdgAA8eHTjy98TQ0qS/s400/SDR_HDR_2.JPG" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A photo showing both SDR and HDR at the same time.<br />
Bloom is exaggerated to show the clipped color in SDR</td></tr>
</tbody></table>
<b><span style="font-size: large;">Conclusion</span></b><br />
In this post, I have talked about the color spaces used for outputting to an HDR display. No matter which color space format is used (e.g. scRGB / Rec2020), the displayed image should be identical if transformed correctly (except for some precision differences). Also, I have tried to play around with the HDR10 metadata, but most of the metadata does not change the image on my TV... I guess how the metadata is interpreted is device dependent. Lastly, the SDR UI is composited with the HDR image by first blending the color and then the brightness. A simple blending formula is enough for the demo. A more complicated algorithm could be explored in the future, say brightening the UI depending on the brightness of a blurred HDR background (e.g. maybe storing the background luminance in the alpha channel and blurring it together with the bloom pass?). A demo can be downloaded <a href="https://drive.google.com/file/d/1v6uxc9Yyz-HmorJTHouZkxOQLMObjoAI/view?usp=sharing"><b>here</b></a> to test on your HDR display (some of the options are hidden if not connected to an HDR display).<br />
<br />
<b>References</b><br />
<span style="font-size: x-small;">[1] <a href="https://channel9.msdn.com/Events/Build/2017/P4061">https://channel9.msdn.com/Events/Build/2017/P4061</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://www.pyromuffin.com/2018/07/how-to-render-to-hdr-displays-on.html">https://www.pyromuffin.com/2018/07/how-to-render-to-hdr-displays-on.html</a></span><br />
<span style="font-size: x-small;">[3] <a href="https://www.gdcvault.com/play/1024803/Advances-in-the-HDR-Ecosystem">https://www.gdcvault.com/play/1024803/Advances-in-the-HDR-Ecosystem</a></span><br />
<span style="font-size: x-small;">[4] <a href="https://www.gdcvault.com/play/1026443/Not-So-Little-Light-Bringing">https://www.gdcvault.com/play/1026443/Not-So-Little-Light-Bringing</a></span><br />
<span style="font-size: x-small;">[5] <a href="https://developer.nvidia.com/implementing-hdr-rise-tomb-raider">https://developer.nvidia.com/implementing-hdr-rise-tomb-raider</a></span><br />
<span style="font-size: x-small;">[6] <a href="https://www.asawicki.info/news_1703_programming_hdr_monitor_support_in_direct3d">https://www.asawicki.info/news_1703_programming_hdr_monitor_support_in_direct3d</a></span><br />
<span style="font-size: x-small;">[7] </span><span style="font-size: x-small;"><a href="https://onedrive.live.com/?authkey=%21AFU3moSbUzyUgaE&cid=A4B88088C01D9E9A&id=A4B88088C01D9E9A%21170&parId=A4B88088C01D9E9A%21106&o=OneUp">https://onedrive.live.com/?authkey=%21AFU3moSbUzyUgaE&cid=A4B88088C01D9E9A&id=A4B88088C01D9E9A%21170&parId=A4B88088C01D9E9A%21106&o=OneUp</a></span><br />
<span style="font-size: x-small;">[8] <a href="https://www.shadertoy.com/view/4tV3WW">https://www.shadertoy.com/view/4tV3WW</a></span><br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-29962553837133857522020-03-14T16:47:00.000+08:002020-04-18T17:32:38.356+08:00DXR Path Tracer<b><span style="font-size: large;">Introduction</span></b><br />
Can't believe it has been half a year since my last DXR AO post. It was a <a href="https://www.bbc.com/news/world-asia-china-49317695">hard time in Hong Kong</a> last year, but <a href="https://www.scmp.com/news/hong-kong/hong-kong-economy/article/3044121/tourist-arrivals-take-sharpest-plunge-november">thanks to the social unrest</a> and the <a href="https://www.hongkongfp.com/2020/02/02/no-choice-hong-kong-medical-workers-agree-strike-mainland-border-closures/">medical workers' strike</a>, the Wuhan Coronavirus has not spread widely in the local community (but there are still new cases every day...). Due to the virus, it is better to stay at home, so I continued to work on my path tracer. This new path tracer is unbiased, terminating rays with Russian Roulette. During path tracing, physical light units are used. Also, rendering can be done in a wide color space. Finally, the path traced result is tone mapped and output to an sRGB/wide color gamut depending on the display device. A demo can be downloaded <b><a href="https://drive.google.com/file/d/16UBQza1VTBVGncsbxokM7B1R1sKRUdUH/view?usp=sharing">here</a></b> (Please use the latest graphics driver, as I have encountered a device removed hang on my laptop RTX2060 with an old driver, but not on my desktop GTX1060... If the crash/hang still happens, please let me know. Thank you.).<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgcvCr4ZiMgNzwdXbrp9jEILvTFcdRH_19g_oqTME0MJZneb5cGUFc-7MYvk0jI4fjqhriIzuCZNl0NfIU2NeKQnxIHSSyVGqkPpsTFO59cfzI845i5EQYH_6mLkYrRw1NvJ9w3Sa9V6l_/s1600/path_trace_scr_shot.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="374" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgcvCr4ZiMgNzwdXbrp9jEILvTFcdRH_19g_oqTME0MJZneb5cGUFc-7MYvk0jI4fjqhriIzuCZNl0NfIU2NeKQnxIHSSyVGqkPpsTFO59cfzI845i5EQYH_6mLkYrRw1NvJ9w3Sa9V6l_/s640/path_trace_scr_shot.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Path Traced Sponza scene.</td></tr>
</tbody></table>
<br />
<br />
<b><span style="font-size: large;">Render Loop</span></b><br />
At the start of the demo (or after any camera movement/lighting changes), a structured buffer, <i>Ray Buffer</i>, is initialized with 1 ray per pixel using the camera transform.<br />
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghYrRZr-2F10FwM541K_9PXT1Hglep40yGAkrcmrAAK-6gfH_zKT2iX2reZRUSfK1GMF4VCFLLfeyYGLHvUU1dTLGZLNWzrjyUdCnYB_op_6UGohNj3GFocf276BUnN4Eyvzer1TVz5Y3_/s1600/rayData.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="100" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghYrRZr-2F10FwM541K_9PXT1Hglep40yGAkrcmrAAK-6gfH_zKT2iX2reZRUSfK1GMF4VCFLLfeyYGLHvUU1dTLGZLNWzrjyUdCnYB_op_6UGohNj3GFocf276BUnN4Eyvzer1TVz5Y3_/s400/rayData.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The struct stored in <i>Ray Buffer</i>, not tightly packed for easier understanding.</td></tr>
</tbody></table>
</div>
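As a hypothetical sketch of one element (the demo's real struct is the one pictured above, so the field names here are only guesses based on what the passes below need):
<pre>
#include &lt;cstdint&gt;

// Hypothetical Ray Buffer element layout, not the demo's actual struct.
struct RayData
{
    float    origin[3];      // ray origin in world space
    float    direction[3];   // unit ray direction
    float    throughput[3];  // accumulated path weight, used by Russian Roulette
    uint32_t pixelIndex;     // which Lighting Path Texture texel to accumulate into
};
</pre>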
<div>
Then a ray generation shader is dispatched to read the <i>Ray Buffer</i> and trace rays into the scene. Lighting is calculated, and new <i>Ray Buffer</i> elements are generated if the rays are not terminated, continuing the path tracing in the next frame. Below is a simplified flow of the rendering operations executed every frame (with render passes on the left and resources on the right):</div>
<div>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2_lCPQ8GmiGDm3TBK3DT_nDCTpI9GII95A5bg5mKLz6sDQ1chWeYL6aW1rqQrMr1VDM-EOlrvdK4D-nwsqgTlM4P8FLS6m9FawwGaEwcwUCIZuV_AGhBCHL5wM9ERITmpdVx36aFh6Skc/s1600/flowchart.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="304" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2_lCPQ8GmiGDm3TBK3DT_nDCTpI9GII95A5bg5mKLz6sDQ1chWeYL6aW1rqQrMr1VDM-EOlrvdK4D-nwsqgTlM4P8FLS6m9FawwGaEwcwUCIZuV_AGhBCHL5wM9ERITmpdVx36aFh6Skc/s640/flowchart.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A simplified path tracing flow executed every frame</td></tr>
</tbody></table>
Let's start with the usage of the resources in the above flow chart:<br />
<ul>
<li><i>Ray Buffers</i> are structured buffers storing the RayData struct. A ray will be traced for each element and if the ray is not terminated by Russian Roulette, it will be stored back to the <i>Ray Buffer</i> for the next frame.</li>
<li><i>Lighting Path Texture</i> is used for accumulating the lighting result while a ray is traversing along the path from the camera. It can be thought of as an intermediate result because the path is not fully traversed within a single frame, but across several frames.</li>
<li><i>Progress Buffer</i> is an 8-byte buffer, with 4 bytes storing the current path depth and the other 4 bytes storing the total accumulated sample count.</li>
<li><i>Lighting Sample Texture</i> is used for accumulating the lighting results of all the terminated rays (i.e. accumulating every terminated ray's result from <i>Lighting Path Texture</i>).</li>
</ul>
As for the operations done in each render pass:<br />
<ol>
<li>The Ray Tracing Pass dispatches a ray generation shader, sampling the <i>Ray Buffer</i> according to DispatchRaysIndex() and then calling TraceRay() to calculate the lighting result inside the closest hit shader, randomly choosing diffuse or specular lighting (another shadow ray is traced towards the light source during the lighting calculation). The lighting result is added to the Lighting Path Texture and non-terminated rays are stored back into the <i>Ray Buffer</i> for the next frame.</li>
<li>Check whether all rays are terminated by using a D3D12 predicate on the counter buffer of the <i>Ray Buffer</i> (i.e. all rays terminated when counter == 0; see the predication sketch after this list). Then a different shader/operation is executed depending on whether the <i>Ray Buffer</i> is empty.</li>
<li>When there are still rays not terminated, increase the path depth in <i>Progress Buffer</i>.</li>
<li>When all rays are terminated, increase the sample count and set path depth to 0 in <i>Progress Buffer</i>.</li>
<li>Accumulate the current path lighting result in <i>Lighting Path Texture</i> to <i>Lighting Sample Texture</i>. Clear the <i>Lighting Path Texture</i> to 0 (it is cleared via a compute shader instead of a command list clear, as the predicate does not work on the clear API, <a href="https://docs.microsoft.com/en-us/windows/win32/direct3d12/predication">despite the spec saying it should</a>...)</li>
<li>Regenerate the rays in the <i>Ray Buffer</i> with 1 ray per pixel using the camera transform, for path tracing new lighting samples in the next few frames. </li>
<li>Display the current lighting result to the back buffer.</li>
</ol>
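The predication in step 2 can be sketched like this (the setup is assumed: the <i>Ray Buffer</i>'s UAV counter has been copied into an 8-byte buffer in the D3D12_RESOURCE_STATE_PREDICATION state):
<pre>
#include &lt;d3d12.h&gt;

// EQUAL_ZERO makes the enclosed commands execute only when the counter
// value is 0, i.e. when all rays have terminated.
void PredicateOnEmptyRayBuffer(ID3D12GraphicsCommandList* cmdList,
                               ID3D12Resource* counterReadback)
{
    cmdList-&gt;SetPredication(counterReadback, 0,
                            D3D12_PREDICATION_OP_EQUAL_ZERO);
    // ... record the "all rays terminated" passes (steps 4-6) here ...
    cmdList-&gt;SetPredication(nullptr, 0, D3D12_PREDICATION_OP_EQUAL_ZERO);

    // The opposite branch (step 3) can be recorded in the same way under
    // D3D12_PREDICATION_OP_NOT_EQUAL_ZERO.
}
</pre>
<br />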
<table>
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhr6fohvk8xaVOmv81i0ZIxRhH8NGRql4wClncwDcAKE8f8D7cGfD0DVBFO5UKKMSWCW-kZoHO2BgKPWAF0z_36nB_lH1ohxsf4NayQbCC7PU_ptxL6F6stRKUIAMRsyx07TJ_ANEkYNpb6/s1600/path_trace_result_1.png" imageanchor="1"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhr6fohvk8xaVOmv81i0ZIxRhH8NGRql4wClncwDcAKE8f8D7cGfD0DVBFO5UKKMSWCW-kZoHO2BgKPWAF0z_36nB_lH1ohxsf4NayQbCC7PU_ptxL6F6stRKUIAMRsyx07TJ_ANEkYNpb6/s320/path_trace_result_1.png" width="320" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivA9W4hOw05n0wX1POkwZF7ptMDyNHh8OQTJrWMUqZVSfqVpyJP-X_sQFJ6MBfAuP4rtxcOcr6KEvriKE2fD-2Jwuu0hNZ5ke1IGpJ628j3ViO7VNdAC3URtm5BFop8Yy3fUpkf4V2YBRs/s1600/path_trace_result_2.png" imageanchor="1"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivA9W4hOw05n0wX1POkwZF7ptMDyNHh8OQTJrWMUqZVSfqVpyJP-X_sQFJ6MBfAuP4rtxcOcr6KEvriKE2fD-2Jwuu0hNZ5ke1IGpJ628j3ViO7VNdAC3URtm5BFop8Yy3fUpkf4V2YBRs/s320/path_trace_result_2.png" width="320" /></a>
</td>
</tr>
<tr>
<th colspan="2"><span style="font-size: x-small;"><span style="font-weight: normal;">Path traced images</span></span></th>
</tr>
</tbody></table>
<br />
With the core operations described above, at most 2 rays per pixel can be launched to maintain an interactive frame rate on my GTX1060. On more powerful machines (i.e. RTX cards with hardware accelerated ray tracing), step 1 doesn't need to terminate at the first closest hit, but can bounce a few more times before storing back to the <i>Ray Buffer</i> (a "#Bounce/Frame" option is added to increase the number of bounces per frame for RTX cards).<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUd7Uqv6vcABRChDuwSWy3F3W42nsAjZeeTrAr9mB0jvUoLOxJ0cZqtcncU6y0SV9K4NViruTZXyuGP82TTr7jNtd2M8IrWbE9yHdrYM2nz602bdXe7rHD6d8-PCyV7YtfAPlLjClJ6f8L/s1600/numBouncePerFrame.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUd7Uqv6vcABRChDuwSWy3F3W42nsAjZeeTrAr9mB0jvUoLOxJ0cZqtcncU6y0SV9K4NViruTZXyuGP82TTr7jNtd2M8IrWbE9yHdrYM2nz602bdXe7rHD6d8-PCyV7YtfAPlLjClJ6f8L/s1600/numBouncePerFrame.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Number of bounce per frame option to adjust performance</td></tr>
</tbody></table>
The current approach described above has 2 drawbacks. First, on the CPU we don't know how many rays are left in the <i>Ray Buffer</i>, so DispatchRays() is called with the maximum number of rays (i.e. viewport width * height) and terminates early within the ray generation shader. This can be fixed in the future with <a href="https://devblogs.microsoft.com/directx/dxr-1-1/#executeindirect">DXR Tier 1.1 using ExecuteIndirect()</a>. The second drawback is that the performance is not constant across frames, because the number of rays to be traced decreases every frame and then resets, so the frame rate fluctuates.<br />
<br />
<b><span style="font-size: large;">ACES tone mapping</span></b><br />
After calculating the HDR lighting value, we need to perform a tone map pass to map the lighting value to a displayable range. ACES tone mapping is chosen due to its popularity in recent years. ACES has a few tone mapping curves (they call them RRT + ODT) for <a href="https://github.com/ampas/aces-dev/tree/master/transforms/ctl/odt">different displays with different color gamuts and viewing conditions</a>. Some common display types are <a href="https://github.com/ampas/aces-dev/blob/master/transforms/ctl/odt/sRGB/ODT.Academy.sRGB_100nits_dim.ctl">sRGB_100nit</a> and <a href="https://github.com/ampas/aces-dev/blob/master/transforms/ctl/odt/rec709/ODT.Academy.Rec709_100nits_dim.ctl">Rec709_100nit</a>. The input of the RRT+ODT function expects RGB values in the ACES2065-1 (AP0) gamut with a white point around (but not exactly) D60. So we need to convert our lighting value (L_sRGB) to the AP0 gamut by multiplying a few transformation matrices:</div>
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4Dr60y1M512Bpe0-nG3Ppk_gogN3D1PQFt1vqK4Fu8IkiFbMnxBmknx0Mmi5OWFapcRti6CGEKiSv9rGgpi5uoeHQI31Qqwqx9oZWw9I-t0SBgaUjpV32Yg80VKCkwRx-MdjzUe8ODmNS/s1600/conv_2_AP0.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="53" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4Dr60y1M512Bpe0-nG3Ppk_gogN3D1PQFt1vqK4Fu8IkiFbMnxBmknx0Mmi5OWFapcRti6CGEKiSv9rGgpi5uoeHQI31Qqwqx9oZWw9I-t0SBgaUjpV32Yg80VKCkwRx-MdjzUe8ODmNS/s320/conv_2_AP0.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">operation to convert RGB value from sRGB to AP0</td></tr>
</tbody></table>
The above steps mean first transforming the sRGB values to XYZ color space with a D65 white point (gamut transformation matrices can be calculated using the formula from <a href="http://www.brucelindbloom.com/index.html?Eqn_RGB_XYZ_Matrix.html">here</a>), then applying a Chromatic Adaptation Transform (CAT) due to the different white points between sRGB and AP0 (the matrix can be calculated using the formula from <a href="http://www.brucelindbloom.com/index.html?Eqn_ChromAdapt.html">here</a>). Finally, the XYZ value can be transformed to the AP0 gamut. All these matrices can be combined into 1 matrix-vector multiplication as an optimization, as sketched below. Then this value can be fed into the ACES RRT+ODT to compute the back buffer value for display.</div>
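Collapsed into a single matrix-vector multiply, the chain looks like this (the constants are the commonly quoted linear sRGB to AP0 values with the Bradford CAT already folded in; treat them as approximate):
<pre>
struct float3 { float x, y, z; };

// Linear sRGB (D65) -&gt; ACES2065-1/AP0 (~D60): sRGB-&gt;XYZ, CAT and
// XYZ-&gt;AP0 pre-multiplied into one matrix.
float3 SRGBToAP0(float3 c)
{
    return { 0.4397f * c.x + 0.3830f * c.y + 0.1773f * c.z,
             0.0898f * c.x + 0.8134f * c.y + 0.0968f * c.z,
             0.0175f * c.x + 0.1115f * c.y + 0.8707f * c.z };
}
</pre>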
<div>
<br />
So we only need to select the appropriate ODT for the target display device. But unfortunately, not every common display gamut is provided; for example, my recently bought RTX laptop comes with a 100% AdobeRGB color gamut monitor, and ACES does not provide a suitable ODT to display the image in the AdobeRGB color space. Using the common sRGB ODT, the image will look too saturated. So I added a "Remap display color gamut" option in the demo:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8xGcIOdat-2r1zAFhX5qIbhYhGr-QnIqxpj434jkN7oKbcQLZQf8uCR9XOsQJiQHz7bIeBkgoFp8iZGJfiy49sVJLlsDyOm2L9GgPw2_yGGlqoe_XhjRhe90oUGznj1Q5FaeflKWv4KMh/s1600/remap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="152" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8xGcIOdat-2r1zAFhX5qIbhYhGr-QnIqxpj434jkN7oKbcQLZQf8uCR9XOsQJiQHz7bIeBkgoFp8iZGJfiy49sVJLlsDyOm2L9GgPw2_yGGlqoe_XhjRhe90oUGznj1Q5FaeflKWv4KMh/s320/remap.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Remapping option to display the path traced result according to display color primaies</td></tr>
</tbody></table>
The Remap display color gamut option performs the following steps on the output of the RRT+ODT (a sketch follows the list):<br />
<ol>
<li>Apply EOTF function to the ODT output to get linear lighting value.</li>
<li>Transform the resulting RGB value from step 1 to the target display color gamut RGB value (e.g. the AdobeRGB gamut on my laptop display), with a Chromatic Adaptation Transform applied.</li>
<li>Apply OETF function to the output of step 2 for display.</li>
</ol>
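A sketch of these 3 steps for an AdobeRGB panel (sRGB and AdobeRGB both use a D65 white point, so no CAT is actually needed in this particular pair, and the pow(2.2) calls only approximate the real sRGB EOTF and AdobeRGB OETF):
<pre>
#include &lt;cmath&gt;

struct float3 { float x, y, z; };

float3 RemapDisplayGamut(float3 odtOutput)   // sRGB ODT output in [0, 1]
{
    // 1. EOTF: decode to linear light (2.2 approximates the sRGB curve)
    float3 lin = { std::pow(odtOutput.x, 2.2f),
                   std::pow(odtOutput.y, 2.2f),
                   std::pow(odtOutput.z, 2.2f) };
    // 2. rotate into the display primaries (sRGB -&gt; AdobeRGB, both D65)
    float3 disp = { 0.7152f * lin.x + 0.2848f * lin.y,
                    lin.y,
                    0.0412f * lin.y + 0.9588f * lin.z };
    // 3. OETF: re-encode for scan-out (AdobeRGB gamma is close to 2.2)
    return { std::pow(disp.x, 1.0f / 2.2f),
             std::pow(disp.y, 1.0f / 2.2f),
             std::pow(disp.z, 1.0f / 2.2f) };
}
</pre>
<br />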
By doing the above remapping, I can get similar results between my AdobeRGB laptop monitor and my sRGB desktop monitor. One drawback is that, although we can query the color primaries of the display, they are not always accurate. For example, on my laptop I can switch to a regular sRGB view mode, but <a href="https://docs.microsoft.com/en-us/windows/win32/api/dxgi1_6/nf-dxgi1_6-idxgioutput6-getdesc1">IDXGIOutput6::GetDesc1()</a> still returns the AdobeRGB color primaries. I have also tried some other monitors; they have color primaries greater than sRGB, but not exactly the AdobeRGB or P3 primaries, and they also have different view modes such as AdobeRGB or sRGB. So I just leave the gamut remapping function optional in the demo and let the user choose their remap color primaries.<br />
<br />
Also, digging deeper into the ACES ODT source code, the 3 ODTs used in the demo share much common code and only differ in the color space transform function / OETF at the end of the ODT. In the future, I may refactor the RRT+ODT code, remove the remap display gamut function and directly transform the ACES ODT output in XYZ space to the display gamut queried by <a href="https://docs.microsoft.com/en-us/windows/win32/api/dxgi1_6/nf-dxgi1_6-idxgioutput6-getdesc1">IDXGIOutput6::GetDesc1()</a> (or a user selected gamut).<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoWkEZaiw29mssi20YhrV3U6DGZd_fHkxfhn1HUTBMEW8-h50mqpe0Ae3ra1t0DP2Y0qREa9xnPD6u5EremzFeV-IrUnDGAoqoPRvicju5_jjiKSJiYzC6CKlgrDjbRv1_CCWqtAEvmbMe/s1600/ACES_ODT.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="275" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoWkEZaiw29mssi20YhrV3U6DGZd_fHkxfhn1HUTBMEW8-h50mqpe0Ae3ra1t0DP2Y0qREa9xnPD6u5EremzFeV-IrUnDGAoqoPRvicju5_jjiKSJiYzC6CKlgrDjbRv1_CCWqtAEvmbMe/s400/ACES_ODT.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">An ODT from ACES, the blue part is the same for all the 3 ODT used in the demo.<br />
The orange part is different depends on display, which can be replaced by display <br />
primaries returned from IDXGIOutput6::GetDesc1(), so the "Remap display color <br />
gamut" in the demo can be removed in the future. </td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">WCG rendering</span></b><br />
Equipped with the knowledge of transforming between color spaces, I decided to try rendering in a Wide Color Gamut instead. Games like <a href="http://www.polyphony.co.jp/publications/sa2018/">GT Sport</a> already render in wide color (Rec2020). Performing the lighting calculation in a wide color gamut can result in more accurate lighting than rendering in the sRGB color space (even when displaying on an sRGB monitor).<br />
<br />
<table>
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIZYE-WFFnuChf2h527lbAsBRlRd70bkTpQvk5LWKUDZlwq0H89YqA-d2h9k7HsMwJDzzdSm3IWMa7dTBXRKXmGuPHq_TrHTdmaJykFlmnA6gc0WmVPG60H41qlxafCEHBP6olXKeazr1_/s1600/render_sRGB.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIZYE-WFFnuChf2h527lbAsBRlRd70bkTpQvk5LWKUDZlwq0H89YqA-d2h9k7HsMwJDzzdSm3IWMa7dTBXRKXmGuPHq_TrHTdmaJykFlmnA6gc0WmVPG60H41qlxafCEHBP6olXKeazr1_/s200/render_sRGB.png" width="200" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSrAkCoudVzQa0Q3is3YWoIllKvvCHZadkCDWlfPQNK87f674SzDF2JX_dplqwx9f7rcNqDAQ1On18bWUZlTvGSZnYpEBHAhg4jxmqLb-XD4DJ4XzOwjori5qFg4FShXqNaltk-PYZJVlH/s1600/render_ACEScg.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSrAkCoudVzQa0Q3is3YWoIllKvvCHZadkCDWlfPQNK87f674SzDF2JX_dplqwx9f7rcNqDAQ1On18bWUZlTvGSZnYpEBHAhg4jxmqLb-XD4DJ4XzOwjori5qFg4FShXqNaltk-PYZJVlH/s200/render_ACEScg.png" width="200" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjh_y1ru9NfAVTXsOeg3CH4i1JxWujCOmPGmVINRlhEnwrfMI0Qf8qVBcVY_UZeTygPVRbXI7rpyEUsgXk1_NXCBx3JZyN-yFVhWksZ7zpmdqB5q1IybxTVdRnJn5k4X7oqcnc-37neRIdA/s1600/render_rec2020.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjh_y1ru9NfAVTXsOeg3CH4i1JxWujCOmPGmVINRlhEnwrfMI0Qf8qVBcVY_UZeTygPVRbXI7rpyEUsgXk1_NXCBx3JZyN-yFVhWksZ7zpmdqB5q1IybxTVdRnJn5k4X7oqcnc-37neRIdA/s200/render_rec2020.png" width="200" /></a>
</td>
</tr>
<tr>
</tr>
<tr>
<th colspan="3"><span style="font-size: x-small;"><span style="font-weight: normal;">Path Traced result rendered in different color space. Left:sRGB, Center:ACEScg, Right:Rec2020</span></span></th>
</tr>
</tbody></table>
<br />
In the demo, it can path trace in the sRGB, ACEScg or Rec2020 color space. Inside the closest hit shader, the albedo texture is read and transformed from sRGB into the chosen rendering color space (a conversion sketch follows this paragraph). Also, the light color is converted to the chosen rendering color space and then multiplied with the intensity. Finally, inside the tone mapping pass, the result of the lighting calculation is transformed to the AP0 color space and fed into the ACES RRT+ODT for display. You may notice some differences between rendering in sRGB and in a wide color gamut (e.g. ACEScg and Rec2020). If you have a wide color gamut monitor (e.g. AdobeRGB or DCI-P3), you can try the Rec2020 ODT with the "Remap display color gamut" option on (described in the last section). This produces less color clamping and displays more saturated colors. But under normal lighting conditions, the difference is not that big; we need to set up specific lighting, such as a local sphere light with a saturated color, before those wide colors show up. I guess this is because both the albedo texture and the light color are in sRGB space; content may need to be adjusted in order to take advantage of a wide color display.<br />
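As an example of that conversion, here is the sRGB to Rec2020 case (both share the D65 white point, so it is a pure gamut rotation; the constants are the standard BT.2087 matrix). The ACEScg case is analogous, but its matrix also folds in a D65-to-D60 CAT:
<pre>
struct float3 { float x, y, z; };

float3 SRGBToRec2020(float3 c)   // linear sRGB in, linear Rec2020 out
{
    return { 0.6274f * c.x + 0.3293f * c.y + 0.0433f * c.z,
             0.0691f * c.x + 0.9195f * c.y + 0.0114f * c.z,
             0.0164f * c.x + 0.0880f * c.y + 0.8956f * c.z };
}
</pre>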
<span style="font-size: x-small;">
</span><span style="font-size: x-small;">
</span>
<br />
<table>
<tbody>
<tr>
<td><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/dxr_path_tracer/images/WCG_render_sRGB.png" imageanchor="1"><img border="0" height="185" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/dxr_path_tracer/images/WCG_render_sRGB.png" width="320" /></a>
</td>
<td><a href="https://raw.githubusercontent.com/simon-yeunglm/blog/master/dxr_path_tracer/images/WCG_render_adobeRGB.png" imageanchor="1"><img border="0" height="185" src="https://raw.githubusercontent.com/simon-yeunglm/blog/master/dxr_path_tracer/images/WCG_render_adobeRGB.png" width="320" /></a>
</td>
</tr>
<tr>
<th colspan="2"><span style="font-size: x-small;"><span style="font-weight: normal;">Wide Color path traced image, saved with different profile. Left:saved with sRGB profile Right: saved with AdobeRGB profile. </span></span><br />
<span style="font-size: x-small;"><span style="font-weight: normal;">The right image shows more saturated color when viewed on a color managed browser with a wide color display (e.g. iPhone monitor), </span></span><br />
<span style="font-size: x-small;"><span style="font-weight: normal;">otherwise 2 images may look the same.</span></span></th></tr>
</tbody></table>
<br />
Also, please note that this kind of wide color support is different from the Windows 10 HDR/WCG settings. On my laptop, Windows reports No for both HDR and WCG, but it does have an AdobeRGB monitor capable of displaying wide color; we just need to correctly transform the images using the monitor color gamut.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjc98np1vTdaqQELfNYNPFFziriI1LTx2CDLyQNWHSRTkdQxb74jc73TLSUQDiL-FzS6U7xbd28Ia5VinaY3KLmC3XqYTLdokf26VxowToU4j5RUeNnG0CxhOC6yxJwTV6bR6CRigwgGr3p/s1600/win_settings.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="83" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjc98np1vTdaqQELfNYNPFFziriI1LTx2CDLyQNWHSRTkdQxb74jc73TLSUQDiL-FzS6U7xbd28Ia5VinaY3KLmC3XqYTLdokf26VxowToU4j5RUeNnG0CxhOC6yxJwTV6bR6CRigwgGr3p/s400/win_settings.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">My laptop has an AdobeRGB monitor, but Windows 10 Display capabilities report No for WCG.</td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">ACES ODT blue light artifact</span></b><br />
So far everything looks good when rendering in a wide color space. Colors get desaturated when they are over-exposed. But there is still an issue when using a strong blue light...<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOg8OPYfTddQyxKZSqF7IeT_oeXOQIoihiZIbeI534oCCeo5eFMWlj9Oq43Zh6zws1KySMzeNacXbKPwA5i16cvQPV_zSuLc7kx_J3SlzRSgs9gAyJaWjb7A40C2LJ1jajXcMcZdNNmM5Z/s1600/blue_nofix.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgOg8OPYfTddQyxKZSqF7IeT_oeXOQIoihiZIbeI534oCCeo5eFMWlj9Oq43Zh6zws1KySMzeNacXbKPwA5i16cvQPV_zSuLc7kx_J3SlzRSgs9gAyJaWjb7A40C2LJ1jajXcMcZdNNmM5Z/s320/blue_nofix.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Using a strong sRGB blue light will introduce hue shift...</td></tr>
</tbody></table>
<br />
It is because pure blue (0, 0, 255) in sRGB space is not saturated enough when transformed to a wide color gamut (e.g. ACEScg/Rec2020). Looking inside the ACES dev repo, it has a <a href="https://github.com/ampas/aces-dev/blob/master/transforms/ctl/lmt/LMT.Academy.BlueLightArtifactFix.ctl">blue light artifact fix LMT</a> for this issue. It works by de-saturating the blue color a bit to lessen the hue shift (an illustrative sketch follows). So in the demo, I provided a "Blue Correction" parameter to adjust the blue de-saturation strength (as a side note, UE4 also uses the ACES tone mapper and comes with a <a href="https://docs.unrealengine.com/en-US/API/Runtime/Engine/Engine/FPostProcessSettings/BlueCorrection/index.html">blue correction parameter</a> in the post process settings).<br />
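The real fix is the 3x3 matrix inside the linked LMT; purely as an illustrative stand-in (not the ACES matrix), the de-saturation can be thought of as bleeding a bit of the blue channel into red and green, scaled by the correction strength:
<pre>
struct float3 { float x, y, z; };

// Illustrative only: lessen the blue hue shift by adding a fraction of B
// into R and G (more into G, roughly the direction the ACES LMT takes);
// strength = 0 is a no-op, 1 is full correction.
float3 BlueCorrect(float3 c, float strength)
{
    return { c.x + strength * 0.08f * c.z,
             c.y + strength * 0.16f * c.z,
             c.z };
}
</pre>
<br />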
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS52j4tTUxlKbM_B2EM2wUJD6eB8wxaIZ7YA-AYDMuSz-xdVnlRJFntB3dpBJSNAaBTLs1Ex6L6aaWbnPWx4oLlIhfV6SkTmSsGBESAzHoiNXKSTY53hjXFpO0hCQfgjSlVB-2fmyXCfYn/s1600/blue_fix.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgS52j4tTUxlKbM_B2EM2wUJD6eB8wxaIZ7YA-AYDMuSz-xdVnlRJFntB3dpBJSNAaBTLs1Ex6L6aaWbnPWx4oLlIhfV6SkTmSsGBESAzHoiNXKSTY53hjXFpO0hCQfgjSlVB-2fmyXCfYn/s320/blue_fix.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">desaturating blue color to fix the hue shift</td></tr>
</tbody></table>
<br />
But I do like the saturated blue color, and seeing the blue light artifact fix LMT de-saturate it makes me sad. Below is a comparison with/without the blue light fix LMT:<br />
<table>
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi53NUAnOWkLqtzhqyRk4CeNpgsyvtepTuPK0-d3Ofxkiku7m1SSMhqYONVagLRde_h2Z58DzLHon_Pkg672P9pcsBYAy6DLmrugz5ymsArL62EIKbzG-V5e54iwWccDua6LvMlR6_bmIcs/s1600/blue_normal_nofix.png" imageanchor="1"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi53NUAnOWkLqtzhqyRk4CeNpgsyvtepTuPK0-d3Ofxkiku7m1SSMhqYONVagLRde_h2Z58DzLHon_Pkg672P9pcsBYAy6DLmrugz5ymsArL62EIKbzG-V5e54iwWccDua6LvMlR6_bmIcs/s320/blue_normal_nofix.png" width="320" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZOq5zxiKC1PVR0pPVnAfUIiZttxpSa6ra_wA_ydF2MDO_JjoZN90YSdIuD_FWt531IGIRuKI_LE9Nh3WHOsvq8WenfXOxTYoMSoco5yNFU99G9qtiKSTPnGnaWKQ58URgnQGKDwlXfIOX/s1600/blue_normal_withFix.png" imageanchor="1"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZOq5zxiKC1PVR0pPVnAfUIiZttxpSa6ra_wA_ydF2MDO_JjoZN90YSdIuD_FWt531IGIRuKI_LE9Nh3WHOsvq8WenfXOxTYoMSoco5yNFU99G9qtiKSTPnGnaWKQ58URgnQGKDwlXfIOX/s320/blue_normal_withFix.png" width="320" /></a>
</td>
</tr>
<tr>
<th colspan="3"><span style="font-size: x-small;"><span style="font-weight: normal;">Left: without blue light fix LMT, Right: with blue light fix LMT
</span></span></th>
</tr>
</tbody></table>
<br />
So maybe we can work around the problem the other way: instead of making the blue color less saturated, we can make the light color more saturated. So I added a light "Color Picker Space" combo box to specify the color space of the picked RGB light value, so that a more saturated blue light color can be chosen. By picking an extremely saturated blue (0, 0, 255) RGB value in ACEScg color space, we can get rid of the purple color:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhI_2wWQVHgWL-l_j1Wj3fmYKHfegJeeB0j3W-xMS7Oa3BhXQoTQS_lFLBwy9YwFtHHtKu-9krQ2e4LiQQ77YCDqugLA7-XM23JvJZA4_-cwgBxVD3LcsG24Z4zQn9LZD-JLRWfwu2mM98n/s1600/blue_ACEScg.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhI_2wWQVHgWL-l_j1Wj3fmYKHfegJeeB0j3W-xMS7Oa3BhXQoTQS_lFLBwy9YwFtHHtKu-9krQ2e4LiQQ77YCDqugLA7-XM23JvJZA4_-cwgBxVD3LcsG24Z4zQn9LZD-JLRWfwu2mM98n/s320/blue_ACEScg.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Using a saturated blue light in ACEScg space, without the blue light fix LMT</td></tr>
</tbody></table>
<br />
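Under the hood, the "Color Picker Space" option is just a change of interpretation followed by a matrix multiply. Below is a minimal HLSL sketch, assuming ACEScg is the rendering space and a hypothetical PICKER_SPACE_SRGB enum value; the matrix is the usual sRGB(D65) to ACEScg(AP1, D60) transform derived with Bruce Lindbloom's method, so treat the values as approximate:<br />
<blockquote class="tr_bq">
// sRGB -> ACEScg (Bradford adapted D65 -> D60)<br />
static const float3x3 sRGB_2_ACEScg =<br />
{<br />
0.613097, 0.339523, 0.047379,<br />
0.070194, 0.916354, 0.013452,<br />
0.020616, 0.109570, 0.869815<br />
};<br />
// a picked (0, 0, 1) stays (0, 0, 1) in ACEScg mode: far more saturated than sRGB blue<br />
float3 lightColor = (pickerSpace == PICKER_SPACE_SRGB) ? mul(sRGB_2_ACEScg, pickedRGB) : pickedRGB;<br />
</blockquote>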
<b><span style="font-size: large;">Bloom</span></b><br />
Lastly, a bloom pass is added before tone mapping. Bloom pixels are extracted using a threshold: any lighting value that exceeds the maximum luminance of the current exposure value. The <a href="https://seblagarde.files.wordpress.com/2015/07/course_notes_moving_frostbite_to_pbr_v32.pdf">maximum luminance</a> is calculated with:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg892JWe-Ae3RWrQ70S1aKB6GYCN0AyZNJJIar23PEokwghs91FTkDZNEI3TfcF9NV6t0BFWpa44H0AnqQi8UjYQLggdTnMVEDXO7xNymVyrwL-PDjMjIWt3BWxCNsq_up_SrRnj9MEvmBG/s1600/max_lum.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="44" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg892JWe-Ae3RWrQ70S1aKB6GYCN0AyZNJJIar23PEokwghs91FTkDZNEI3TfcF9NV6t0BFWpa44H0AnqQi8UjYQLggdTnMVEDXO7xNymVyrwL-PDjMjIWt3BWxCNsq_up_SrRnj9MEvmBG/s320/max_lum.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">max luminance calculated using EV100</td></tr>
</tbody></table>
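With the Frostbite constants (q = 0.65, ISO S = 100), the formula above boils down to a single line of HLSL (a sketch; the variable names are mine):<br />
<blockquote class="tr_bq">
// maxLum = 78 / (q * S) * 2^EV100 = 1.2 * 2^EV100<br />
float maxLuminance = 1.2 * exp2(ev100);<br />
</blockquote>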
But simply subtracting the threshold from the RGB lighting value will introduce a hue shift in the bloom color. So the RGB lighting value is transformed to HSV space, the threshold is subtracted from V, and the result is transformed back to RGB space (we keep all the RGB values in the rendering space, without converting the lighting value from ACEScg/Rec2020 to sRGB during the HSV conversion, as there is not much difference between the bloom results). Given an image with HDR values:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGlwenWhvwzqyKfhfBxYduhQkyF5renAXo2QL82Nn2rvgDGA9iWlQY1d_AaMft4i8qBUjVgGmzZ5xXMbDEl6AoVSW-cpB_UPjZQbJ2oejWPSH8MIxqJ8dOmb8u9oKG0AlTCeSDWLN0Tfur/s1600/bloom.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGlwenWhvwzqyKfhfBxYduhQkyF5renAXo2QL82Nn2rvgDGA9iWlQY1d_AaMft4i8qBUjVgGmzZ5xXMbDEl6AoVSW-cpB_UPjZQbJ2oejWPSH8MIxqJ8dOmb8u9oKG0AlTCeSDWLN0Tfur/s320/bloom.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Input image for the bloom pass</td></tr>
</tbody></table>
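Concretely, the extraction step might look like the sketch below, assuming RGBtoHSV()/HSVtoRGB() helper functions (hypothetical names; as noted above, the conversion stays in the rendering space):<br />
<blockquote class="tr_bq">
float3 hsv = RGBtoHSV(lighting.rgb);<br />
hsv.z = max(hsv.z - maxLuminance, 0.0); // subtract the threshold from V only<br />
float3 bloom = HSVtoRGB(hsv); // hue and saturation are preserved<br />
</blockquote>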
The difference between thresholding in HSV space and in RGB space:<br />
<table>
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjG1Gw9W0ddO22sDsny6Oman1ceOMxIbfWFOJ1nxW6G1WL3-8MQyJADL7PqXKoeTBhqFTeslyLWDW1fQsjKHN535_Ph27q0oCo_2WqON3tP_F3WHEeC4gL4c6oWZ4eRRE12E1xnyAI2qUfb/s1600/bloom_HSV.png" imageanchor="1"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjG1Gw9W0ddO22sDsny6Oman1ceOMxIbfWFOJ1nxW6G1WL3-8MQyJADL7PqXKoeTBhqFTeslyLWDW1fQsjKHN535_Ph27q0oCo_2WqON3tP_F3WHEeC4gL4c6oWZ4eRRE12E1xnyAI2qUfb/s320/bloom_HSV.png" width="320" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEIM0bCpqVAMgk0l_ksN4gWXJehVze4c41WsUqAn1foTVr9futg8OMV5ULvCSLgOC9NO2dalEl6H82wtnn6HaoVNZ88hhyphenhyphenC3IDyoanwkks9DPoJ3Q42gfhMcN_VCX9M0NEEhVIvuosPWu2/s1600/bloom_RGB.png" imageanchor="1"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEIM0bCpqVAMgk0l_ksN4gWXJehVze4c41WsUqAn1foTVr9futg8OMV5ULvCSLgOC9NO2dalEl6H82wtnn6HaoVNZ88hhyphenhyphenC3IDyoanwkks9DPoJ3Q42gfhMcN_VCX9M0NEEhVIvuosPWu2/s320/bloom_RGB.png" width="320" /></a>
</td>
</tr>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8oLJutht0oprlCvO5_n5RIgXeLJ1CjTLXUJ1CFR0keobg0E0Ic7rnRhoKMs48FJlvulleHQtTtglxWXaQ-19a0B-ikSM8Rx0JqOUzQ7tdB2Cck8qQ0j4_9T8lQGqeXUL9jSqGyidq7vNR/s1600/bloom_HSV_debug.png" imageanchor="1"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8oLJutht0oprlCvO5_n5RIgXeLJ1CjTLXUJ1CFR0keobg0E0Ic7rnRhoKMs48FJlvulleHQtTtglxWXaQ-19a0B-ikSM8Rx0JqOUzQ7tdB2Cck8qQ0j4_9T8lQGqeXUL9jSqGyidq7vNR/s320/bloom_HSV_debug.png" width="320" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKuyBhpdRFwJkdc8v5ohpzyK9nQ2e3-cvTTZ2ltrABVzHZStEnPHJ7_3wZ5jDIpuFvwCqBl70-pd8gpHLRuWap2Vt-l4VUOQSuQDAwnYerhi4QAF3aisbYNAqkPQdNwiF0U9PlUPzeL135/s1600/bloom_RGB_debug.png" imageanchor="1"><img border="0" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgKuyBhpdRFwJkdc8v5ohpzyK9nQ2e3-cvTTZ2ltrABVzHZStEnPHJ7_3wZ5jDIpuFvwCqBl70-pd8gpHLRuWap2Vt-l4VUOQSuQDAwnYerhi4QAF3aisbYNAqkPQdNwiF0U9PlUPzeL135/s320/bloom_RGB_debug.png" width="320" /></a>
</td>
</tr>
<tr>
<th colspan="2"><span style="font-size: x-small;"><span style="font-weight: normal;">Left column: bloom in HSV space. Right column: bloom in RGB space.</span></span><br />
<span style="font-size: x-small;"><span style="font-weight: normal;">Upper row: Lighted scene combined with bloom.</span></span><br />
<span style="font-size: x-small;"><span style="font-weight: normal;">Lower row: Debug images showing only the bloom component.</span></span>
</th>
</tr>
</tbody></table>
<br />
The bloom calculated in HSV space introduces less saturated colors. The effect is exaggerated when the image is over-exposed:<br />
<table>
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjblPQZU8BrmFdAMGAsN1bQEIco6ngIL_V4ixDQxonOMWvxaq83DU25OupSav5x0yRMo-nlGOjfto6gXL91h6FjUUuXc0NMJyG9Fen8jyVnpLj53K6wrxSpSskGmhcM6p1OVZkgbsvZ9Fym/s1600/bloom_overExp.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjblPQZU8BrmFdAMGAsN1bQEIco6ngIL_V4ixDQxonOMWvxaq83DU25OupSav5x0yRMo-nlGOjfto6gXL91h6FjUUuXc0NMJyG9Fen8jyVnpLj53K6wrxSpSskGmhcM6p1OVZkgbsvZ9Fym/s200/bloom_overExp.png" width="200" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJ4A5zT0-PXleAWZ8-DFfFrAqTtEsm-w9sOl_HPhSOX-B90bD093G4ZQ7TXslwrH8AS_xZ82PgwZPWUQbD0ue3nw5jEKTgAD0oFagy-nJPd4P5aQxj2Hv3vA4YehvukS_53mvRPPXFJjzZ/s1600/bloom_overExp_HSV.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgJ4A5zT0-PXleAWZ8-DFfFrAqTtEsm-w9sOl_HPhSOX-B90bD093G4ZQ7TXslwrH8AS_xZ82PgwZPWUQbD0ue3nw5jEKTgAD0oFagy-nJPd4P5aQxj2Hv3vA4YehvukS_53mvRPPXFJjzZ/s200/bloom_overExp_HSV.png" width="200" /></a>
</td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiybAokfBrNROB_LGDZAqVUl1M45y0mgMHyy6u2Oir-75YMy49fwz-wsYtSAqcBH0oHHR-XGABdUOHyFEDdpto1fkVSqyxpsz_qhoITyywEDI6RWPv2T0hEAJL9Z9md-D4A2IuJ3tn7Y6x0/s1600/bloom_overExp_RGB.png" imageanchor="1"><img border="0" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiybAokfBrNROB_LGDZAqVUl1M45y0mgMHyy6u2Oir-75YMy49fwz-wsYtSAqcBH0oHHR-XGABdUOHyFEDdpto1fkVSqyxpsz_qhoITyywEDI6RWPv2T0hEAJL9Z9md-D4A2IuJ3tn7Y6x0/s200/bloom_overExp_RGB.png" width="200" /></a>
</td>
</tr>
<tr>
<th colspan="3"><span style="font-size: x-small;"><span style="font-weight: normal;">Left:Bloom input image. </span></span><br />
<span style="font-size: x-small;"><span style="font-weight: normal;">Center:Bloom in HSV space. </span></span><br />
<span style="font-size: x-small;"><span style="font-weight: normal;">Right: Bloom in RGB space.</span></span></th>
</tr>
</tbody></table>
<b><span style="font-size: large;">Conclusion</span></b><br />
In this post, the core algorithm of my DXR path tracer is described, together with some color space conversions. There is much more to be done in the future, like supporting dynamic geometry during ray tracing, adding a denoiser for the path-traced output, implementing hybrid rasterization/ray tracing rendering, and spectral rendering to compute a ground-truth reference. Also, this is my first time writing code for color space management. Currently, in the demo, the 3D lighting can be displayed correctly using the monitor gamut, but the UI is not color-managed properly. 4K and HDR also need to be supported.<br />
<br />
<b>References</b><br />
<span style="font-size: x-small;">[1] </span><a href="https://seblagarde.files.wordpress.com/2015/07/course_notes_moving_frostbite_to_pbr_v32.pdf">https://seblagarde.files.wordpress.com/2015/07/course_notes_moving_frostbite_to_pbr_v32.pdf</a><br />
<span style="font-size: x-small;">[2] </span><a href="https://microsoft.github.io/DirectX-Specs/d3d/Raytracing.html#addressing-calculations-within-shader-tables">https://microsoft.github.io/DirectX-Specs/d3d/Raytracing.html#addressing-calculations-within-shader-tables</a><br />
<span style="font-size: x-small;">[2] </span><a href="https://github.com/ampas/aces-dev">https://github.com/ampas/aces-dev</a><br />
<span style="font-size: x-small;">[3] </span><a href="http://www.brucelindbloom.com/index.html?Eqn_RGB_XYZ_Matrix.html">http://www.brucelindbloom.com/index.html?Eqn_RGB_XYZ_Matrix.html</a><br />
<span style="font-size: x-small;">[4] </span><a href="http://www.brucelindbloom.com/index.html?Eqn_ChromAdapt.html">http://www.brucelindbloom.com/index.html?Eqn_ChromAdapt.html</a><br />
<span style="font-size: x-small;">[5] </span><a href="http://www.polyphony.co.jp/publications/sa2018/">http://www.polyphony.co.jp/publications/sa2018/</a><br />
<span style="font-size: x-small;"><br /></span>
<span style="font-size: x-small;"><br /></span>
<br />
<br />
<br />
<br />
<br />
<br />
<br /></div>
Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-17506575114406170812020-01-20T01:56:00.002+08:002022-11-28T16:04:43.553+08:00Note on sampling GGX Distribution of Visible Normals<b>Introduction</b><br />
<div>
After writing an AO demo in the last post, I started to write a progressive path tracer, but my progress was very slow due to the social unrest in the past few months (<a href="https://www.washingtonpost.com/graphics/2019/world/hong-kong-protests-excessive-force/">here are some related news about what has happened</a>). In the past weeks, the situation has calmed down a bit, so I continued writing my path tracer and started adding specular lighting. While implementing Eric Heitz's <a href="http://www.jcgt.org/published/0007/04/01/paper.pdf">"Sampling the GGX Distribution of Visible Normals"</a> technique, I was confused about why taking a random sample on a disk and then projecting it onto the hemisphere yields exactly the GGX distribution of visible normals (VNDF). I couldn't find a proof in the paper, so in this post I will try to verify that their PDFs are equal. (Originally, I planned to write this post after finishing my path tracer demo. But I worry that the situation here in Hong Kong will get worse again and I won't be able to write, so I decided to write it down first; I hope it won't get too boring with only math equations.)<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5UwCR-kP9pb_AQUjwd6is6iGQhX19Yuiy5kL3Xn8qa3o8wphUQgCc1RVo_GHuRa7fqyUJQLM6zKTtvHp1j_dxUiZKRLnT8Cgldqw-aNbRF8fhqEvAD2HF1j7ZXI7XMhdBjFY2Stpb12Lp/s1600/spec.png" style="margin-left: auto; margin-right: auto;"><img border="0" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5UwCR-kP9pb_AQUjwd6is6iGQhX19Yuiy5kL3Xn8qa3o8wphUQgCc1RVo_GHuRa7fqyUJQLM6zKTtvHp1j_dxUiZKRLnT8Cgldqw-aNbRF8fhqEvAD2HF1j7ZXI7XMhdBjFY2Stpb12Lp/s400/spec.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">My work in progress path tracer, using GGX material only</td></tr>
</tbody></table>
<br />
<b>Quick summary of sampling the GGX VNDF technique</b><br />
For those who are not familiar with the GGX VNDF technique, I will briefly describe it. It is an importance sampling technique to sample a random normal vector from the GGX distribution. That normal vector is then used for generating a reflection vector, usually for the next reflected ray during path tracing.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGBEqXjEzWHPM_h9ddoHSMuovYAio_057eoxGEs2UvkPBtZVsodU3CqoDuSLYpzXbgYvftMLMhCNYIBxcJUEPmgv_P2Zt30HMlfNch6Wy3r9qJ7hMe5lygRhURmO8MwedpC8WH0_3Q5ZiO/s1600/ndf.png" style="margin-left: auto; margin-right: auto;"><img border="0" height="90" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgGBEqXjEzWHPM_h9ddoHSMuovYAio_057eoxGEs2UvkPBtZVsodU3CqoDuSLYpzXbgYvftMLMhCNYIBxcJUEPmgv_P2Zt30HMlfNch6Wy3r9qJ7hMe5lygRhURmO8MwedpC8WH0_3Q5ZiO/s320/ndf.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Traditional importance sampling scheme use D(N) to sample a normal vector</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUiLKyP1l4XqmuNdWC3tRRjKK7m8ylj0iFtPYLvtZw9xmb2ArbnOjC6ihAaCldGeIRQUnEdvzAWRKUqbnCeUUfvgYXnBUi_Obhx5g4HdsVkvXbcM6vLqkoBlhKRj0vYH-hUasbrD6FcK8t/s1600/vndf.png" style="margin-left: auto; margin-right: auto;"><img border="0" height="110" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUiLKyP1l4XqmuNdWC3tRRjKK7m8ylj0iFtPYLvtZw9xmb2ArbnOjC6ihAaCldGeIRQUnEdvzAWRKUqbnCeUUfvgYXnBUi_Obhx5g4HdsVkvXbcM6vLqkoBlhKRj0vYH-hUasbrD6FcK8t/s320/vndf.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">VNDF technique use the visible normal to importance sample a vector, taking the view direction into account</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
Given a view vector to a GGX surface with arbitrary roughness, the steps to sample a normal vector are (an HLSL transcription of the paper's listing follows the illustration below):<br />
<ol>
<li>Transform the view vector to the GGX hemisphere configuration space (i.e. from arbitrary roughness to the roughness = 1 config) using the GGX stretch-invariance property.</li>
<li>Sample a random point on the projected disk along the transformed view direction.</li>
<li>Re-project the sampled point onto the hemisphere along the view direction. This will be our desired normal vector.</li>
<li>Transform the normal vector back to the original GGX roughness space from the hemisphere configuration.</li>
</ol>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiuS3TSHZFnrSmJ_6K9s84WPAt6ibL_t5TnU1hWEZvBFz38YfYK6v18exu_Ap9yPW7arYg27-BL9wlZjyNuya1UJV6My1wvBYggD1qiHrxCOeR9NfmYMjbAdDq98HKCw9s81jvAw3PpsK8/s1600/eric_pic.png" style="margin-left: auto; margin-right: auto;"><img border="0" height="152" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhiuS3TSHZFnrSmJ_6K9s84WPAt6ibL_t5TnU1hWEZvBFz38YfYK6v18exu_Ap9yPW7arYg27-BL9wlZjyNuya1UJV6My1wvBYggD1qiHrxCOeR9NfmYMjbAdDq98HKCw9s81jvAw3PpsK8/s640/eric_pic.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">VNDF sampling technique illustration from Eric Heitz's paper</td></tr>
</tbody></table>
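For reference, the paper condenses these steps into a short routine; below is my HLSL transcription of the listing given in the paper (Ve is the tangent-space view direction, U1/U2 are uniform random numbers), so treat it as a sketch:<br />
<blockquote class="tr_bq">
float3 SampleGGXVNDF(float3 Ve, float alpha_x, float alpha_y, float U1, float U2)<br />
{<br />
// step 1: transform the view direction to the hemisphere configuration<br />
float3 Vh = normalize(float3(alpha_x * Ve.x, alpha_y * Ve.y, Ve.z));<br />
// build an orthonormal basis around Vh<br />
float lensq = Vh.x * Vh.x + Vh.y * Vh.y;<br />
float3 T1 = lensq > 0.0 ? float3(-Vh.y, Vh.x, 0.0) * rsqrt(lensq) : float3(1.0, 0.0, 0.0);<br />
float3 T2 = cross(Vh, T1);<br />
// step 2: sample a point on the disk projected along the view direction<br />
float r = sqrt(U1);<br />
float phi = 2.0 * 3.14159265 * U2;<br />
float t1 = r * cos(phi);<br />
float t2 = r * sin(phi);<br />
float s = 0.5 * (1.0 + Vh.z);<br />
t2 = (1.0 - s) * sqrt(1.0 - t1 * t1) + s * t2;<br />
// step 3: re-project onto the hemisphere<br />
float3 Nh = t1 * T1 + t2 * T2 + sqrt(max(0.0, 1.0 - t1 * t1 - t2 * t2)) * Vh;<br />
// step 4: transform the normal back to the ellipsoid configuration<br />
return normalize(float3(alpha_x * Nh.x, alpha_y * Nh.y, max(0.0, Nh.z)));<br />
}<br />
</blockquote>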
My confusion mainly comes from steps 2 and 3, in the hemisphere configuration: why does this method of generating a normal vector match the GGX VNDF exactly...</div>
<div>
<br /></div>
<div>
<b>GGX NDF </b><b>definition</b></div>
<div>
Before digging deep into the problem, let's start with the definition of the GGX NDF. The paper states that <a href="http://www.jcgt.org/published/0007/04/01/paper.pdf">the GGX distribution uses only the upper part of the ellipsoid</a>, and when alpha/roughness equals 1, <a href="https://hal.archives-ouvertes.fr/hal-01509746/document">the GGX distribution is a uniform hemisphere</a>. According to the definition (with alpha = 1):<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEhqXyzfJ3K0hu6RMCfmY8tdRPvLHRImJiG1mvx5vcrWs6moBogsRaMyyZSA8P57Et_Bs9e1Aq5JWzkYDDHNmKfiNmFXMPZaHTikexq0NK8ZvfWJO_7qRu0MPyGNBC53SbaZ9yREC6ysIi/s1600/ndf_sim.png"><img border="0" height="192" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEhqXyzfJ3K0hu6RMCfmY8tdRPvLHRImJiG1mvx5vcrWs6moBogsRaMyyZSA8P57Et_Bs9e1Aq5JWzkYDDHNmKfiNmFXMPZaHTikexq0NK8ZvfWJO_7qRu0MPyGNBC53SbaZ9yREC6ysIi/s400/ndf_sim.png" width="400" /></a><br />
<br />
So its PDF will be:<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTOJRlUrQHjAEMmbeMC0xQbAZkvHe5DAj9By4UzkbbSDpwpFGx-qlMD5xM-UnndeSSxFT073eiKbTWb4P4fw3pZ6JZPO-rJn45V4ZCwxHyUArmHaxDvki5gQoIxsbp2snSQBicOjBaYL_D/s1600/ndf_pdf.png"><img border="0" height="66" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTOJRlUrQHjAEMmbeMC0xQbAZkvHe5DAj9By4UzkbbSDpwpFGx-qlMD5xM-UnndeSSxFT073eiKbTWb4P4fw3pZ6JZPO-rJn45V4ZCwxHyUArmHaxDvki5gQoIxsbp2snSQBicOjBaYL_D/s200/ndf_pdf.png" width="200" /></a><br />
So, sampling a normal vector from the GGX distribution (with alpha = 1) equals sampling a vector with a cosine-weighted distribution.</div>
<div>
<br />
<b>GGX VNDF definition</b><br />
The definition of the VNDF depends on the shadowing function, and we are using the Smith shadowing function (with alpha = 1):<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXl4O8GTNQ2PZFZ910TulomQjN0eIi9n9C2v_9x4KdpMGcXmGmA3w375cUqJ74GKlN3C80hSRE1SKCoV0WO1ybIf5E4IhK-hUg_vC0xVM1qSAa14Ga-jZul7w8QUSuUHMguy2DlasxHjLS/s1600/g1.png"><img border="0" height="528" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXl4O8GTNQ2PZFZ910TulomQjN0eIi9n9C2v_9x4KdpMGcXmGmA3w375cUqJ74GKlN3C80hSRE1SKCoV0WO1ybIf5E4IhK-hUg_vC0xVM1qSAa14Ga-jZul7w8QUSuUHMguy2DlasxHjLS/s640/g1.png" width="640" /></a><br />
<br />
Therefore the VNDF equals to:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzvN1qtTwUTC5dOhFMSg46GzVoUNpROfFSQEE8q1pFnx49Wvonh1CsiLd_G74TKnhYOeFSCU9iJKf4V3d327L-zjhajeOmmsDfFu3_ibrDr_A5aoh7BkvvTaKaFKHpc67thJPzcaYpqyJP/s1600/vndf_sim.png"><img border="0" height="174" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzvN1qtTwUTC5dOhFMSg46GzVoUNpROfFSQEE8q1pFnx49Wvonh1CsiLd_G74TKnhYOeFSCU9iJKf4V3d327L-zjhajeOmmsDfFu3_ibrDr_A5aoh7BkvvTaKaFKHpc67thJPzcaYpqyJP/s320/vndf_sim.png" width="320" /></a><br />
<br />
<b>GGX VNDF specific case</b></div>
<div>
With both the GGX NDF and VNDF definitions, we can start investigating the problem. I decided to start with something simple first: the specific case where the view direction equals the surface normal (i.e. V = Z).<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJcVWYkXPDAwawDhYpg9LXBDSoASI6US9OqNxOD5ZyiGUA3wy8l6glxH1oRvLD6PvZl68lDAMzzf6jjDQOqTj9daXnS5rA8o8LE4y3XwHbi6Y-0FDkJ9x4-yAZm5d5Hxs7zm2Y58tiu2Tk/s1600/vndf_z.png"><img border="0" height="96" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJcVWYkXPDAwawDhYpg9LXBDSoASI6US9OqNxOD5ZyiGUA3wy8l6glxH1oRvLD6PvZl68lDAMzzf6jjDQOqTj9daXnS5rA8o8LE4y3XwHbi6Y-0FDkJ9x4-yAZm5d5Hxs7zm2Y58tiu2Tk/s320/vndf_z.png" width="320" /></a><br />
<br />
After simplification in this V=Z case, the PDF of Dz(N) is also cosine-weighted, which equals the PDF of the traditional GGX NDF sampling method.<br />
<br />
Now take a look at the sampling scheme in Eric Heitz's method. The method starts with uniform sampling of a unit disk, which has a <a href="http://www.pbr-book.org/3ed-2018/Monte_Carlo_Integration/2D_Sampling_with_Multidimensional_Transformations.html">PDF = 1/π</a>; the point is then projected onto the hemisphere along the view direction, which adds a cosine term to the PDF (i.e. Z.N/π) according to <a href="http://www.pbr-book.org/3ed-2018/Monte_Carlo_Integration/2D_Sampling_with_Multidimensional_Transformations.html">Malley's method</a> (where the cosine term comes from the Jacobian of the transform). Therefore, the VNDF and Eric Heitz's method are the same in this specific case: both have a cosine-weighted PDF.</div>
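As a side note, this V=Z special case is just the classic cosine-weighted hemisphere sampling. A minimal HLSL sketch of Malley's method (u1/u2 are uniform random numbers in [0, 1)):<br />
<blockquote class="tr_bq">
// sample a unit disk uniformly, then project up to the hemisphere around +Z<br />
float r = sqrt(u1);<br />
float phi = 2.0 * 3.14159265 * u2;<br />
float3 n = float3(r * cos(phi), r * sin(phi), sqrt(max(0.0, 1.0 - u1))); // 1 - r^2 = 1 - u1<br />
// n is cosine-weighted: PDF = n.z / π<br />
</blockquote>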
<div>
<br /></div>
<div>
<b>GGX VNDF general case</b></div>
<div>
To verify that Eric Heitz's sampling scheme matches the PDF of the GGX VNDF for all possible viewing directions, we need to calculate the PDF of his method and track how the PDF changes with each transformation. From the paper we have this vertical mapping:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggjRuc4E8f6nN0L8BD0kN0UhSxMV1zBj1anoR-CiJNpn8Sm4iWtU1KE11tl9fIoB562FsFJyoUDKO3WV4vqi2fQQL07qDtOXWa8FNNKf8Oj4JP5DYrx079vB2iEgJujpdtBI_lGa0JPXB2/s1600/eric_eqt.png" style="margin-left: auto; margin-right: auto;"><img border="0" height="278" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggjRuc4E8f6nN0L8BD0kN0UhSxMV1zBj1anoR-CiJNpn8Sm4iWtU1KE11tl9fIoB562FsFJyoUDKO3WV4vqi2fQQL07qDtOXWa8FNNKf8Oj4JP5DYrx079vB2iEgJujpdtBI_lGa0JPXB2/s640/eric_eqt.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Transformation of randomly sampled point from Eric Heitz's paper</td></tr>
</tbody></table>
We know the PDF of sampling a unit disk is 1/π (i.e. <i>P(t1, t2)</i> = 1/π); we need to calculate <i>P(t1, t2')</i>:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/a/AVvXsEjpinAhey9Tqbnl6F11mg-L-l5xNE4rDVm_oSH05tHVQibGjtiquO-N3zL5hf3JRcqb64x8oghHafzQ3B8MKN6lZbnl41S1I-cPf8j9M5K6WDGHsjlDZAyIbITYTIx7r8S14Ng_6iCfz-NaC0WGQ5hx_2wgXV123ibEyCLViPES-dWfVZiQ5cRqORuBOQ"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/a/AVvXsEjpinAhey9Tqbnl6F11mg-L-l5xNE4rDVm_oSH05tHVQibGjtiquO-N3zL5hf3JRcqb64x8oghHafzQ3B8MKN6lZbnl41S1I-cPf8j9M5K6WDGHsjlDZAyIbITYTIx7r8S14Ng_6iCfz-NaC0WGQ5hx_2wgXV123ibEyCLViPES-dWfVZiQ5cRqORuBOQ=w249-h320" width="249" /></a><br />
</div><div></div><div></div>
<div>
<div><i><span style="font-size: x-small;">(Edited on 28/11/2022: Thanks for Brian Collins pointing out, dt<span style="font-size: xx-small;">2</span>'/dt<span style="font-size: xx-small;">1</span> was calculated incorrectly before)</span></i><br /></div><div>The next step of the algorithm is to re-project the disc to the hemisphere along the view direction, which produce our target importance sampled normal, so by Malley's method again (but this time along the view direction instead of surface normal), we can add a V.N Jacobian term to the above PDF <i>P(t1,t2')</i>:</div>
<div>
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-JFP4bJM2gZ0bd2tEnvC0AJONmmJI8_bokVXQgGPN3achZywjfm_gi3i1raYhDMzFq_8AdAl0jncvcqv7Q-_fgT_HkVLY4dLEwpM3dhaFkRAAh3f1HF6MtP7Qtv02Lfazg5Ks4FsTq-13/s1600/pdf_n.png"><img border="0" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-JFP4bJM2gZ0bd2tEnvC0AJONmmJI8_bokVXQgGPN3achZywjfm_gi3i1raYhDMzFq_8AdAl0jncvcqv7Q-_fgT_HkVLY4dLEwpM3dhaFkRAAh3f1HF6MtP7Qtv02Lfazg5Ks4FsTq-13/s400/pdf_n.png" width="400" /></a><br />
<br />
The resulting PDF equals the GGX VNDF definition exactly. This resolves my question of why Eric Heitz's sampling scheme is an exact sampling routine for the GGX VNDF.</div>
<div>
<br /></div>
<div>
<b>Conclusion</b></div>
<div>
This post describes my learning process for the paper <a href="http://www.jcgt.org/published/0007/04/01/paper.pdf">"Sampling the GGX Distribution of Visible Normals"</a> and resolves the part that confused me most: why "taking a random sample on a disk and then projecting it onto the hemisphere equals the GGX VNDF". If anybody knows a simpler proof of how these 2 equations are equal, or if you discover any mistake, please let me know in the comments. Thank you.<br />
<br />
<b><span style="font-size: x-small;">References</span></b><br />
<span style="font-size: x-small;">[1] <a href="http://www.jcgt.org/published/0007/04/01/paper.pdf">http://www.jcgt.org/published/0007/04/01/paper.pdf</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://hal.archives-ouvertes.fr/hal-01509746/document">https://hal.archives-ouvertes.fr/hal-01509746/document</a></span><br />
<span style="font-size: x-small;">[3] <a href="https://agraphicsguy.wordpress.com/2015/11/01/sampling-microfacet-brdf/">https://agraphicsguy.wordpress.com/2015/11/01/sampling-microfacet-brdf/</a></span><br />
<span style="font-size: x-small;">[4] <a href="https://schuttejoe.github.io/post/ggximportancesamplingpart1/">https://schuttejoe.github.io/post/ggximportancesamplingpart1/</a></span><br />
<span style="font-size: x-small;">[5] <a href="https://schuttejoe.github.io/post/ggximportancesamplingpart2/">https://schuttejoe.github.io/post/ggximportancesamplingpart2/</a></span><br />
<span style="font-size: x-small;">[6] <a href="http://www.pbr-book.org/3ed-2018/Monte_Carlo_Integration/2D_Sampling_with_Multidimensional_Transformations.html">http://www.pbr-book.org/3ed-2018/Monte_Carlo_Integration/2D_Sampling_with_Multidimensional_Transformations.html</a></span><br />
<br /></div>
<div>
<br /></div>
</div>
<div>
<br /></div>
Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-65178407024616576002019-09-30T23:54:00.001+08:002020-05-03T02:15:47.306+08:00DXR AO<b><span style="font-size: large;">Introduction</span></b><br />
<span style="font-size: x-small;">(Edit 3/5/2020: An updated version of the demo can be downloaded <a href="https://drive.google.com/file/d/1acRKuJyt5cxPxmx9UKfc9YH_wJubqMrt/view?usp=sharing"><b>here</b></a>, which support high DPI monitor and some bug fixes)</span><br />
It has been 2 months since my last post. For the past few months, the <a href="https://www.bbc.com/news/world-asia-china-49317695">situation here in Hong Kong</a> has been very bad. Our basic human rights are deteriorating. Absurd things happen, such as <a href="https://www.scmp.com/news/hong-kong/law-and-crime/article/3019524/least-10-injured-baton-wielding-mob-suspected-triad">suspected cooperation</a> <a href="https://www.youtube.com/watch?v=16CiwPChpr0&has_verified=1">between police</a> <a href="https://www.nytimes.com/2019/07/22/world/asia/hong-kong-protest-mob-attack-yuen-long.html">and triad</a>, <a href="https://www.hongkongfp.com/2019/08/12/hong-kong-police-shoot-projectiles-close-range-tai-koo-protester-suffers-ruptured-eye-tst/">as well as</a> <a href="https://www.hongkongfp.com/2019/09/20/broken-bones-internal-bleeding-hong-kong-police-used-reckless-indiscriminate-tactics-protests-says-amnesty/">the police</a> <a href="https://www.scmp.com/news/hong-kong/law-and-crime/article/3025241/chaos-hong-kongs-mtr-network-police-chase-protesters">brutality</a> (<a href="https://www.hongkongfp.com/2019/09/30/hong-kong-riot-police-target-journalists-sunday-unrest-reporter-shot-eye-projectile/">including shooting directly at the journalists</a>). I really don't know what can be done... Maybe you could spare a few minutes to sign <a href="https://petitions.whitehouse.gov/petition/please-impose-sanctions-hong-kong-police-until-all-have-committed-state-terrorism-crimes-are-brought-justice">some</a> <a href="https://petitions.whitehouse.gov/petition/global-magnitsky-act-pro-beijing-officials-hong-kong-executive-council">of</a> <a href="https://petitions.whitehouse.gov/petition/hongkongers-urge-congress-pass-protect-hong-kong-act">these</a> <a href="https://petitions.whitehouse.gov/petition/please-send-us-armed-force-hong-kong-rescue-citizens-massacre-carried-out-hong-kong-police-force">petitions</a>? Although such petitions may not be very useful, at least <a href="https://petitions.whitehouse.gov/petition/please-pass-bill-hong-kong-human-rights-and-democracy-act?fbclid=IwAR3bFTyr_xiIgHO9-9ICShIg1BQP1RgVbEEFBv7edIaNgOdNXUrCKah-dv4">after signing</a> some of them, the US Congress <a href="https://www.hongkongfp.com/2019/09/26/us-bill-punish-hong-kong-officials-strengthened-passes-congressional-committees-says-student-lobbying-leader/">is discussing</a> the Hong Kong Human Rights and Democracy Act now. I would sincerely appreciate your help. Thank you very much!<br />
<br />
Back to today's topic: after setting up my D3D12 rendering framework, I started to learn <a href="https://microsoft.github.io/DirectX-Specs/d3d/Raytracing.html">DirectX ray-tracing (DXR)</a>. I decided to write an ambient occlusion demo first because it is easier than writing a full path tracer, since I do not need to handle material information or lighting data. The demo can be downloaded from <a href="https://drive.google.com/file/d/1Qj1OJIK397ZRNxyF4BjKa6PvVO7Zt_OV/view?usp=sharing">here</a> (requires a DXR-compatible graphics card and driver, with Windows 10 build version 1809 or newer).<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghcH47ThtS8hkpMwIaPNG5RVpt8JjWh9pKtrBfqbbS31C9i8FU9ZUzabmZaG3ONjdRgxwpp3OtRG30pGKRfBzBx1u9PZm39xW_F0Go6h9Thg5V1n99rLC21VxqUDQ8wTn1kg8GpuQ37POp/s1600/ao.png" imageanchor="1"><img border="0" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghcH47ThtS8hkpMwIaPNG5RVpt8JjWh9pKtrBfqbbS31C9i8FU9ZUzabmZaG3ONjdRgxwpp3OtRG30pGKRfBzBx1u9PZm39xW_F0Go6h9Thg5V1n99rLC21VxqUDQ8wTn1kg8GpuQ37POp/s640/ao.png" width="640" /></a><br />
<br />
<b><span style="font-size: large;">Rendering pipeline</span></b><br />
In this demo, a G-buffer with normal and depth data is rendered first. Then a velocity buffer is generated using the current and previous frame camera transforms, stored in RG16Snorm format. Rays are then traced from the world position reconstructed from the depth buffer, using a cosine-weighted distribution. To avoid ray-geometry self-intersection, the ray origin is shifted towards the camera a bit. After that, a temporal and a spatial filter are applied to smooth out the noisy AO image, and an optional bilateral blur pass can be applied for a final clean-up.<br />
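The occlusion rays themselves need no material evaluation, so the ray generation shader stays small. Below is a rough HLSL sketch of the ray setup (the names, the payload struct, and the 0.01 offset are my assumptions, not the demo's exact code):<br />
<blockquote class="tr_bq">
RayDesc ray;<br />
ray.Origin = worldPos + normalize(cameraPos - worldPos) * 0.01; // shift towards the camera<br />
ray.Direction = sampleDir; // cosine-weighted direction around the G-buffer normal<br />
ray.TMin = 0.0;<br />
ray.TMax = aoMaxDistance;<br />
AOPayload payload; // user-defined payload struct holding a single float 'hit'<br />
payload.hit = 1.0;<br />
// occlusion-only query: accept the first hit and skip the closest-hit shader<br />
TraceRay(sceneTLAS, RAY_FLAG_ACCEPT_FIRST_HIT_AND_END_SEARCH | RAY_FLAG_SKIP_CLOSEST_HIT_SHADER, 0xFF, 0, 1, 0, ray, payload);<br />
float visibility = 1.0 - payload.hit; // the miss shader sets payload.hit = 0<br />
</blockquote>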
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiINkHDXU1ckAYxpx37_vvQMIBtxb46Zg29qpq9j7zwXljiuU1oMFIRTmp1xXDUZwuilE7rVka1a2sOsd1XKuGnn2ciTkuaCVskXAwGMNYuu4HgTHbIequq3kUpH9Bd1Kpx9Iy5Y9Q95qmp/s1600/passes.png" imageanchor="1"><img border="0" height="96" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiINkHDXU1ckAYxpx37_vvQMIBtxb46Zg29qpq9j7zwXljiuU1oMFIRTmp1xXDUZwuilE7rVka1a2sOsd1XKuGnn2ciTkuaCVskXAwGMNYuu4HgTHbIequq3kUpH9Bd1Kpx9Iy5Y9Q95qmp/s400/passes.png" width="400" /></a><br />
<br />
<span style="font-size: large;"><b>Temporal Filter</b></span><br />
With the noisy image generated from the ray tracing pass, we can reuse the previous frame's ray-traced data to smooth out the image. In the demo, the velocity buffer is used to get the pixel location in the previous frame (with an additional depth check between the current frame depth value and the re-projected previous frame depth value). As we are calculating ambient occlusion using Monte Carlo integration:<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjoOZPc2Hq5CuxOKhktwcOOe_bUzOwpMsZhgGu8XtYUwvIdy8odXIU-1thTatGfYGRkN07UfjcdUxITypBs6JaQcR8rtLB-pWqAgoRwxeOrFui7G19ymJJRimu3exVwjUaOaqCKnsbKOJJ8/s1600/AO_formula.png" imageanchor="1"><img border="0" height="148" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjoOZPc2Hq5CuxOKhktwcOOe_bUzOwpMsZhgGu8XtYUwvIdy8odXIU-1thTatGfYGRkN07UfjcdUxITypBs6JaQcR8rtLB-pWqAgoRwxeOrFui7G19ymJJRimu3exVwjUaOaqCKnsbKOJJ8/s320/AO_formula.png" width="320" /></a><br />
<br />
We can split the Monte Carlo integration over multiple frames and store the AO result in an RG16Unorm texture, where the red channel stores the accumulated AO result and the green channel stores the total sample count N (the sample count is clamped to 255 to avoid overflow). So after a new frame is rendered, we can accumulate the AO Monte Carlo integration with the following equation:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnVQoYKuEftiE7tE5em68J_guQEXcgOBbQpMi7xSuYhkqdZOg4PKOaVex4dB7GsY931Q_rjKMyDF0y4dvRwnqPN3_k-PEn6Ul80eKBatvI14L1bbKrHyVRbudq7YdKe9OKFTmMADkpBc-i/s1600/AO_formula_accum.png" imageanchor="1"><img border="0" height="93" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnVQoYKuEftiE7tE5em68J_guQEXcgOBbQpMi7xSuYhkqdZOg4PKOaVex4dB7GsY931Q_rjKMyDF0y4dvRwnqPN3_k-PEn6Ul80eKBatvI14L1bbKrHyVRbudq7YdKe9OKFTmMADkpBc-i/s400/AO_formula_accum.png" width="400" /></a><br />
<br />
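In HLSL, this running average could look like the sketch below (the RG16Unorm packing follows the description above; the texture and variable names are assumptions):<br />
<blockquote class="tr_bq">
float2 history = historyTex[prevPx].rg; // r: accumulated AO, g: sample count / 255<br />
float N = history.g * 255.0;<br />
float ao = (history.r * N + newAO) / (N + 1.0); // fold one more sample into the average<br />
aoOutput[px] = float2(ao, min(N + 1.0, 255.0) / 255.0); // clamp the count to avoid overflow<br />
</blockquote>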
We also reduce the sample count by the delta depth difference between the current and previous frame depth buffer values (i.e. when the camera zooms out/in) to "fade out" the accumulated history faster and reduce ghosting artifacts.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixoTbgvyhYJhBzwb6x3T25iaLXP7VeNYfXQyF5-YIYUwAzzUWhoU3Lxsj1L6swfU_kA2wQmvprVlvzx12oqjV6NJClok5kNSlgAwsRvPZIEm8gBtdV1Hrd5YfolvrCMpNU_jMahIoGpnIX/s1600/ao_noise.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEixoTbgvyhYJhBzwb6x3T25iaLXP7VeNYfXQyF5-YIYUwAzzUWhoU3Lxsj1L6swfU_kA2wQmvprVlvzx12oqjV6NJClok5kNSlgAwsRvPZIEm8gBtdV1Hrd5YfolvrCMpNU_jMahIoGpnIX/s320/ao_noise.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">AO image traced at 1 ray per pixel</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdRFju5G6_qBycK0_lkq3rxiYKWoeIB6rnEG49FkpcG8M8KmhAeLE_LjqdM8jWLXSXRdZoy0KDQp_pMuKhCczovug5CqjKlZYAjZhEkl_GOonXw5PEPQ_GD2Bbf1w4VmDEvLcQeohh2J9C/s1600/ao_temporal.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhdRFju5G6_qBycK0_lkq3rxiYKWoeIB6rnEG49FkpcG8M8KmhAeLE_LjqdM8jWLXSXRdZoy0KDQp_pMuKhCczovug5CqjKlZYAjZhEkl_GOonXw5PEPQ_GD2Bbf1w4VmDEvLcQeohh2J9C/s320/ao_temporal.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">AO image with accumulated samples over multiple frames</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
But this re-projection temporal filter has a shortcoming: it fails very often at geometry edges (especially when done at half resolution). So in the demo, when re-projection fails, it shifts 1 pixel and performs the re-projection again to accumulate more samples.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjc2qLEOJacBCRJeq7rDqvM9U5ggzr8AzUr8MneA9M_8ThSUS_1Q8w30b54oOnFT_o-AQlGgjqFyIHDZTGB88VlM_GiMN-T7bfT8QYytZolVmHwdP3r-MjQMstv2MLZnkZBSAikbTa3OuEg/s1600/edge_before.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjc2qLEOJacBCRJeq7rDqvM9U5ggzr8AzUr8MneA9M_8ThSUS_1Q8w30b54oOnFT_o-AQlGgjqFyIHDZTGB88VlM_GiMN-T7bfT8QYytZolVmHwdP3r-MjQMstv2MLZnkZBSAikbTa3OuEg/s320/edge_before.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Many edge pixels failed the re-projection test</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKPUtlPk13GCc_brG8w9qR16iPZ79UXVLqm3Z-pYKKdqcYjA22tnTyImJeSu7YIUxtdo9_zMBZvcQq3RM9E50CEj1lggJTO69WTt0JAWV7zGBrf3QqqyMRBHCerWUHplzcO8F1f9LAPviC/s1600/edge_after.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKPUtlPk13GCc_brG8w9qR16iPZ79UXVLqm3Z-pYKKdqcYjA22tnTyImJeSu7YIUxtdo9_zMBZvcQq3RM9E50CEj1lggJTO69WTt0JAWV7zGBrf3QqqyMRBHCerWUHplzcO8F1f9LAPviC/s320/edge_after.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">With 1 pixel shifted, many edge pixels can be re-projected</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
As the result is biased, I also reduce the sample count by a factor of 0.75 to make the correct ray-traced result "blend in" faster.<br />
<br />
<span style="font-size: large;"><b>Spatial Filter</b></span><br />
To increase the sample count for the Monte Carlo integration, we can reuse the ray-traced data in neighboring pixels. We search a 5x5 grid and reuse a neighbor's data if it lies on the same surface, by comparing delta depth values (i.e. ddx and ddy generated from the depth buffer). As the delta depth value is re-generated from the depth buffer, some artifacts may be seen on triangle edges.<br />
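A sketch of that same-surface test inside the 5x5 loop (the depth-plane prediction follows the description above, while the tolerance value is an assumption):<br />
<blockquote class="tr_bq">
// predict the neighbor's depth from the center pixel's depth plane<br />
float predictedDepth = centerDepth + dot(float2(dx, dy), float2(ddxDepth, ddyDepth));<br />
if (abs(neighborDepth - predictedDepth) < 0.01 * centerDepth) // same-surface test<br />
{<br />
sumAO += neighborAO * neighborCount; // weight by the neighbor's sample count<br />
sumCount += neighborCount;<br />
}<br />
</blockquote>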
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_pGze6QUFnObd0PK_a_jKLIbFpefXouEDJ-ggqAvFrYcIXoC2pqZIclf8F_Ed0JWz2UsUMrmSCfSDFWMvMUbY4KIhuCPS224Gm1xGWl7m6sRq5hCfufGlnY6dtp9xUKsDrFhfqvXSaOeY/s1600/ao_spatial.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_pGze6QUFnObd0PK_a_jKLIbFpefXouEDJ-ggqAvFrYcIXoC2pqZIclf8F_Ed0JWz2UsUMrmSCfSDFWMvMUbY4KIhuCPS224Gm1xGWl7m6sRq5hCfufGlnY6dtp9xUKsDrFhfqvXSaOeY/s320/ao_spatial.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">noisy AO image applied with a spatial filter</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7e5qcCLedElL6BxnB100Qwi-pRU0581KZy-rbUW0gAl2-19Ay1Nhk_wYoKCGevFmOqWuSbRc74NfATDgqkf3JeZI4JMsjhDXKw2LeMEEgQkbSiER3Qsq1F3JpsslNz7l0dkcbAC4Ve3dG/s1600/tri_edge_artifact.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7e5qcCLedElL6BxnB100Qwi-pRU0581KZy-rbUW0gAl2-19Ay1Nhk_wYoKCGevFmOqWuSbRc74NfATDgqkf3JeZI4JMsjhDXKw2LeMEEgQkbSiER3Qsq1F3JpsslNz7l0dkcbAC4Ve3dG/s320/tri_edge_artifact.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">artifact shown at the triangle edge by re-constructed delta depth</td></tr>
</tbody></table>
</td>
</tr>
<tr>
</tr>
</tbody></table>
To save some performance, besides using half-resolution rendering, we can also choose to interleave the ray casts so that only 1 in every 4 pixels is traced, and ray cast the remaining pixels in the next few frames.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheSOphDT9Q_Mj7vmfwnpKkMktYYhOINQsvrGc8ehqcyePAypcU4C6P0jZ3ioab7AXvPafxYNNgKF-fl_zSD8-5qIfVIcwtc93fxX1WlrpDG94GfjXetS_-rEgWQ1XG5yC8uOtNhUSp8rai/s1600/interleaved.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheSOphDT9Q_Mj7vmfwnpKkMktYYhOINQsvrGc8ehqcyePAypcU4C6P0jZ3ioab7AXvPafxYNNgKF-fl_zSD8-5qIfVIcwtc93fxX1WlrpDG94GfjXetS_-rEgWQ1XG5yC8uOtNhUSp8rai/s400/interleaved.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Rays are traced only at the red pixels<br />
to save performance</td></tr>
</tbody></table>
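The interleave pattern can be expressed by rotating one active pixel within each 2x2 tile every frame; a small sketch (the tile size and rotation order are assumptions):<br />
<blockquote class="tr_bq">
uint2 cell = px & 1; // position within the 2x2 tile<br />
uint cellIndex = cell.y * 2 + cell.x;<br />
bool traceThisFrame = (cellIndex == (frameIndex & 3)); // 1 ray per 4 pixels per frame<br />
</blockquote>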
For those pixels without any ray-traced data during interleaved rendering, we use the spatial filter to fill in the missing data. The same-surface depth check in the spatial filter can be bypassed when the sample count (stored in the green channel during the temporal filter) is low, because it is better to have some "wrong" neighbor data than no data for the pixel. This also helps to remove the edge artifact shown before.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8eap5EBT-JS0fQ6V7h8b0ggHFES9wZDdVYnCT4zoE5hNySzxMJDkX4pamxBLHxzofRD2hWnwZ85FFY2f0ysN7QJtch4HV0R4tBYhKwsBGb0sWuuoxcdCCJNdLt3Sgt5Kgx0h8AwIfJVt0/s1600/spatial_interleaved_before.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8eap5EBT-JS0fQ6V7h8b0ggHFES9wZDdVYnCT4zoE5hNySzxMJDkX4pamxBLHxzofRD2hWnwZ85FFY2f0ysN7QJtch4HV0R4tBYhKwsBGb0sWuuoxcdCCJNdLt3Sgt5Kgx0h8AwIfJVt0/s320/spatial_interleaved_before.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Rays are traced at interleaved pattern, <br />
leaving many 'holes' in the image</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj1wPu-N-B42EfbcYbliJLz9z4_wYlU4ASTwh2jYod6iRC6KSqI9TKhcyvrSrksDM-xvjIQEcqIJsFEP4O6pss22zKaB6I8Dwy1Rys3i-nBh7f7U42RUn7w8l99mk91itOnCtRcJmFPsS3p/s1600/spatial_interleaved_after.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj1wPu-N-B42EfbcYbliJLz9z4_wYlU4ASTwh2jYod6iRC6KSqI9TKhcyvrSrksDM-xvjIQEcqIJsFEP4O6pss22zKaB6I8Dwy1Rys3i-nBh7f7U42RUn7w8l99mk91itOnCtRcJmFPsS3p/s320/spatial_interleaved_after.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Spatial filter will fill in those 'holes' <br />
during interleaved rendering</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
Also, when the ray casts are interleaved between pixels, we need to pay attention to the temporal filter too: we may re-project to a previous frame pixel which has no sample data. In this case, we snap the re-projected UV to the pixel that cast an interleaved ray in the previous frame.<br />
<br />
<span style="font-size: large;"><b>Bilateral Blur</b></span><br />
To clean up the remaining noise from the temporal and spatial filters, a bilateral blur is applied; we can get a wider blur by using the <a href="https://jo.dreggn.org/home/2010_atrous.pdf">edge-aware À-Trous algorithm</a>. The blur radius is adjusted according to the sample count (stored in the green channel in the temporal filter), so when we have already cast many ray samples, we can reduce the blur radius to get a sharper image.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHqFqSdD-yFGpS0kd0SvE0AB89Kyvq4BTNoTtEs847iZW9fytw4gJ3XZTkFeqyWKDPU2YYHThRN8UsdEULEErkDhsBoes8Pb-r60fcW98yBaJ1MveT2tnnf8NhiEcQHkMPiBBTUUeXK2pQ/s1600/ao_blur.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHqFqSdD-yFGpS0kd0SvE0AB89Kyvq4BTNoTtEs847iZW9fytw4gJ3XZTkFeqyWKDPU2YYHThRN8UsdEULEErkDhsBoes8Pb-r60fcW98yBaJ1MveT2tnnf8NhiEcQHkMPiBBTUUeXK2pQ/s400/ao_blur.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Applying an additional bilateral blur to smooth out remaining noise</td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Random Ray Direction</span></b><br />
When choosing the random ray cast directions, we want the chosen directions to have a more significant effect. Since we have a spatial filter that reuses neighboring pixels' data, we can try to cast rays such that the angles between the ray directions of neighboring pixels are as large as possible, while covering as much of the hemisphere as possible.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5LVL5Qf5iGWbcPDNZ7QJKy8E21rNdFFCvNEbVOsuf_LUZYqag2UT9gv44r6FWhcqW9bcnxijOP-VqBXK5Fpwrl_WprmagIMU0EawMtpYjyM1UJdpq5oAnsDemGRRopgPjKQL_gDmYSUa8/s1600/ray_dir.png" imageanchor="1"><img border="0" height="142" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5LVL5Qf5iGWbcPDNZ7QJKy8E21rNdFFCvNEbVOsuf_LUZYqag2UT9gv44r6FWhcqW9bcnxijOP-VqBXK5Fpwrl_WprmagIMU0EawMtpYjyM1UJdpq5oAnsDemGRRopgPjKQL_gDmYSUa8/s320/ray_dir.png" width="320" /></a><br />
<br />
It looks like we can use some kind of <a href="http://momentsingraphics.de/BlueNoise.html">blue noise texture</a> so that the ray directions are well distributed. Let's take a look at how the cosine-weighted random ray direction is generated:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfd1-qare77nEBxucXMATedb9f3_5tAdP1fZ318vQvXQxVK61qN7D2cT98v_bBGlzkOvQNmkCualRLChQoutiuNAYUPhlEzhp5WUbiTFI2jZ41ci4GVNkKrQv2TiYTWuO0_mE48bRUD6BI/s1600/ray_dir_formula.png" imageanchor="1"><img border="0" height="88" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfd1-qare77nEBxucXMATedb9f3_5tAdP1fZ318vQvXQxVK61qN7D2cT98v_bBGlzkOvQNmkCualRLChQoutiuNAYUPhlEzhp5WUbiTFI2jZ41ci4GVNkKrQv2TiYTWuO0_mE48bRUD6BI/s400/ray_dir_formula.png" width="400" /></a><br />
<br />
From the above equation, the random variable ϕ directly corresponds to the random ray direction on the tangent plane, with a linear relationship between the angle ϕ and the random variable ξ<span style="font-size: xx-small;">2</span>. Since we generate random numbers using a <a href="http://www.reedbeta.com/blog/quick-and-easy-gpu-random-numbers-in-d3d11/">wang hash</a>, which is white noise, maybe we can stratify the random range and use blue noise to pick the desired stratum, turning the result into a blue-noise-like pattern. For example, given a random number in [0, 1), we may stratify it into 4 ranges: [0, 0.25), [0.25, 0.5), [0.5, 0.75), [0.75, 1). Then we use the screen space pixel coordinates to sample a tileable blue noise texture, and according to the value of the blue noise, we scale the white noise random number into 1 of the 4 stratified ranges. Below is some sample code of how the stratification is done:<br />
<blockquote class="tr_bq">
<span style="font-family: inherit;">int BLUE_NOISE_TEX_SIZE= 64;</span><br />
<span style="font-family: inherit;">int STRATIFIED_SIZE= 16;</span><br />
<span style="font-family: inherit;">float4 noise= blueNoiseTex[pxPos % BLUE_NOISE_TEX_SIZE];</span><br />
<span style="font-family: inherit;">uint2 noise_quantized= noise.xy * (255.0 * STRATIFIED_SIZE / 256.0);</span><br />
<span style="font-family: inherit;">float2 r= wang_hash(pxPos); // random white noise in range [0, 1)</span><br />
<span style="font-family: inherit;">r = mad(r, 1.0/STRATIFIED_SIZE, noise_quantized * (1.0/STRATIFIED_SIZE));</span></blockquote>
With the blue-noise-adjusted ray directions, the ray-traced AO image looks less noisy visually:<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0xzpG8VJjWqEZkS_8b2CNpbHoiSYijPm6juSWtebKH5H8A6uK_VOPff-DW1ArVML8y5MAu6z1bw4kNszbuUhBb0mcnHpROw4OJuCFHMOOTLQrI3Y52M9Z5VAlkaF8JkBiVoyIuxYbsSJ0/s1600/noise_white.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh0xzpG8VJjWqEZkS_8b2CNpbHoiSYijPm6juSWtebKH5H8A6uK_VOPff-DW1ArVML8y5MAu6z1bw4kNszbuUhBb0mcnHpROw4OJuCFHMOOTLQrI3Y52M9Z5VAlkaF8JkBiVoyIuxYbsSJ0/s320/noise_white.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Rays are traced using white noise</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0C6yP1EbtGh7A-9WbIDFdHy_WZpJhNJ_URX5iqBJdrmIFv9lxZPjfXS4AGUFgWzEgYorw8VTIgZR-yigrB_csox6TVF2zyJjW1MOtlfAsH54q_VRls_WBVeNl72KxbSADNj5IFjrLEhdm/s1600/noise_blue.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0C6yP1EbtGh7A-9WbIDFdHy_WZpJhNJ_URX5iqBJdrmIFv9lxZPjfXS4AGUFgWzEgYorw8VTIgZR-yigrB_csox6TVF2zyJjW1MOtlfAsH54q_VRls_WBVeNl72KxbSADNj5IFjrLEhdm/s320/noise_blue.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Rays are traced using blue noise</td></tr>
</tbody></table>
</td>
</tr>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggqWSDwSBfBm1zKJrUO3n6aqMvv9wb_hX5F5_XOV9rUzXN2skAURv862x4m0jTSZ_xALSpFLo5lLqAkcsYEmKW9sbI0KOaILhPIEYPkPS4ZVqSbPUwu7cLsx1rNL5kmjGFmzOD6ufrnXz3/s1600/noise_white_blur.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggqWSDwSBfBm1zKJrUO3n6aqMvv9wb_hX5F5_XOV9rUzXN2skAURv862x4m0jTSZ_xALSpFLo5lLqAkcsYEmKW9sbI0KOaILhPIEYPkPS4ZVqSbPUwu7cLsx1rNL5kmjGFmzOD6ufrnXz3/s320/noise_white_blur.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Blurred white noise AO image</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXIf8iYjasidUG9FSKtMi5w09BjVrRoYWXAwYVPEeFW_mV_03HS6qEYqrRiemV6rMB2Zr4a072YhSMidkIHWepimDontPIBCklRRrwKtHmIqIe1Tp9oj-kXFV15wq3RbabX5b2pUw4aQhyphenhyphen/s1600/noise_blue_blur.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="180" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXIf8iYjasidUG9FSKtMi5w09BjVrRoYWXAwYVPEeFW_mV_03HS6qEYqrRiemV6rMB2Zr4a072YhSMidkIHWepimDontPIBCklRRrwKtHmIqIe1Tp9oj-kXFV15wq3RbabX5b2pUw4aQhyphenhyphen/s320/noise_blue_blur.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Blurred blue noise AO image</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Ray Binning</span></b><br />
In the demo, ray binning is <a href="http://advances.realtimerendering.com/s2019/Benyoub-DXR%20Ray%20tracing-%20SIGGRAPH2019-final.pdf">also</a> <a href="https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s91023-it-just-works-ray-traced-reflections-in-battlefield-v.pdf">implemented</a>, but the performance improvement is not significant. Ray binning only shows a large performance gain when the ray tracing distance is large (e.g. > 10m) and both half-resolution and interleaved rendering are turned off. I have only run the demo on my GTX 1060; the situation may be different on an RTX graphics card (so this is something to investigate in the future). Also, the demo may show a slight difference when toggling ray binning on/off, due to the precision of the RGBA16Float format used to store the ray directions (the difference vanishes after accumulating more samples over multiple frames with the temporal filter).<br />
<br />
<span style="font-size: large;"><b>Conclusion</b></span><br />
In this post, I have described how DXR is used to compute ray-traced AO in real time, using a combination of temporal and spatial filters. Those filters are important to increase the total sample count of the Monte Carlo integration and to get a noise-free, stable image. The demo can be downloaded from <a href="https://drive.google.com/file/d/1Qj1OJIK397ZRNxyF4BjKa6PvVO7Zt_OV/view?usp=sharing">here</a>. There is still plenty of stuff to improve, such as having a <a href="https://cg.ivd.kit.edu/publications/2017/svgf/svgf_preprint.pdf">better filter</a>: currently, when the AO distance is large and both half-resolution and interleaved rendering are turned on (i.e. 1 ray per 16 pixels), the image is too noisy and not temporally stable during camera movement. Maybe I will need to improve this when writing a path tracer in the future.<br />
<br />
<b>References</b><br />
<span style="font-size: x-small;">[1] DirectX Raytracing (DXR) Functional Spec <a href="https://microsoft.github.io/DirectX-Specs/d3d/Raytracing.html">https://microsoft.github.io/DirectX-Specs/d3d/Raytracing.html</a></span><br />
<span style="font-size: x-small;">[2]<span style="font-size: x-small;"><span style="font-family: inherit;"> <span style="left: 140.473px; top: 205.127px; transform: scaleX(0.981098);">Edge-Avoiding À-Trous Wavelet Transform for fast Global</span><span style="left: 365.062px; top: 238.337px; transform: scaleX(1.04612);">Illumination Filtering <a href="https://jo.dreggn.org/home/2010_atrous.pdf">https://jo.dreggn.org/home/2010_atrous.pdf</a></span></span></span></span><br />
<span style="font-size: x-small;"><span style="font-family: inherit;"><span style="left: 365.062px; top: 238.337px; transform: scaleX(1.04612);"><span style="font-size: x-small;"><span style="font-family: inherit;"><span style="font-size: x-small; left: 365.062px; top: 238.337px; transform: scalex(1.04612);">[3] Free blue noise textures <a href="http://momentsingraphics.de/BlueNoise.html">http://momentsingraphics.de/BlueNoise.html</a></span></span></span></span></span></span><br />
<span style="font-size: x-small;"><span style="font-family: inherit;"><span style="font-size: x-small; left: 365.062px; top: 238.337px; transform: scalex(1.04612);">[4] Quick And Easy GPU Random Numbers In D3D11 <a href="http://www.reedbeta.com/blog/quick-and-easy-gpu-random-numbers-in-d3d11/">http://www.reedbeta.com/blog/quick-and-easy-gpu-random-numbers-in-d3d11/</a></span></span></span><br />
<span style="font-size: x-small;"><span style="font-size: x-small;"><span style="font-family: inherit;"><span style="left: 365.062px; top: 238.337px; transform: scaleX(1.04612);">[5] Leveraging Real-Time Ray Tracing to build a Hybrid Game Engine <a href="http://advances.realtimerendering.com/s2019/Benyoub-DXR%20Ray%20tracing-%20SIGGRAPH2019-final.pdf">http://advances.realtimerendering.com/s2019/Benyoub-DXR%20Ray%20tracing-%20SIGGRAPH2019-final.pdf</a></span></span></span></span><br />
<span style="font-size: x-small;">[6] ”It Just Works”: Ray-Traced Reflections in 'Battlefield V' <a href="https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s91023-it-just-works-ray-traced-reflections-in-battlefield-v.pdf">https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s91023-it-just-works-ray-traced-reflections-in-battlefield-v.pdf</a></span>Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-72430125059732412982019-07-14T13:53:00.000+08:002019-07-14T13:53:02.509+08:00Reflection and Serialization<b><span style="font-size: large;">Introduction</span></b><br />
Reflection and serialization are a convenient way to save/load data. After reading <a href="https://www.gdcvault.com/play/1026345/The-Future-of-Scene-Description">"The Future of Scene Description on 'God of War'"</a>, I decided to try to write something like the "Compile-time Type Information" described in the presentation (but a much simpler one with fewer features). All I need is something to save/load C style structs (something like the D3D DESC structures, e.g. <a href="https://docs.microsoft.com/en-us/windows/win32/api/d3d12/ns-d3d12-d3d12_shader_resource_view_desc">D3D12_SHADER_RESOURCE_VIEW_DESC</a>) in my toy engine.<br />
<br />
<b><span style="font-size: large;">Reflection</span></b><br />
A reflection system is needed to describe how structs are defined before writing a serialization system. <a href="https://preshing.com/20180116/a-primitive-reflection-system-in-cpp-part-1/">This site</a> has much information about this topic. I use a similar approach to describe the C struct data with some macros. We define the following 2 data types to describe all possible structs that need to be reflected/serialized in my toy engine (with some variables omitted for easier understanding).<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPcwmrFTwJ_SOQObAGrS10xeHwIOhAxpkmRuvDenOYrkTf75-U-qjp14GW5fbXguOPw9wBJ6qA-GAKSruNoF6Ywo91OW8f2i4XmQXG8CfEYzoc-DLznbDgyJshk7x1hEJaFgL-MyEXpbRP/s1600/TypeInfo.png" imageanchor="1"><img border="0" height="100" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjPcwmrFTwJ_SOQObAGrS10xeHwIOhAxpkmRuvDenOYrkTf75-U-qjp14GW5fbXguOPw9wBJ6qA-GAKSruNoF6Ywo91OW8f2i4XmQXG8CfEYzoc-DLznbDgyJshk7x1hEJaFgL-MyEXpbRP/s640/TypeInfo.png" width="640" /></a><br />
<br />
As you can guess from their names, TypeInfo is used to describe the C struct that needs to be reflected/serialized, and TypeInfoMember is responsible for describing the member variables inside the struct. We can use some macro tricks to reflect a struct (more can be found in the <a href="https://preshing.com/20180116/a-primitive-reflection-system-in-cpp-part-1/">reference</a>):<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwRYve4PKNlcEewNJNhGeBEDAJnDGyeZFFEOIlH3nFp__Hus53BIwUdORn0j-2FpJMuEfAYiCrDytTkqx35Lwor6Y1ESZvkyX3yYLlI8pjQKW37CHf_lDgKwakL9haJFqVi4Zgp6Bu0QZo/s1600/reflect_macro.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwRYve4PKNlcEewNJNhGeBEDAJnDGyeZFFEOIlH3nFp__Hus53BIwUdORn0j-2FpJMuEfAYiCrDytTkqx35Lwor6Y1ESZvkyX3yYLlI8pjQKW37CHf_lDgKwakL9haJFqVi4Zgp6Bu0QZo/s400/reflect_macro.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">struct reflection example</td></tr>
</tbody></table>
The above example reflects 3 variables inside struct vec3: <i>x</i>, <i>y</i>, <i>z</i>. The trick behind those macros is to use <a href="https://en.cppreference.com/w/cpp/language/sizeof">sizeof()</a>, <a href="https://en.cppreference.com/w/cpp/language/alignof">alignof()</a>, <a href="https://en.cppreference.com/w/cpp/types/offsetof">offsetof()</a> and the <a href="https://en.cppreference.com/w/cpp/language/type_alias">using keyword</a>. The sample implementation can be found below:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNGHrQ9y5EZx4jB33N7TU_ct36S7IdHZM5bqe4hL2Fl8OLPFWKQxPFqwyp1ILIAX2SbYc0Z7z85xMNHK55IQkgxgGpb9qWB1VTRNfdn3MhAsSZ4fqGh_s1V2aj_IZePWTgD_Ct866QluLC/s1600/reflect_macro_def.png" imageanchor="1"><img border="0" height="204" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjNGHrQ9y5EZx4jB33N7TU_ct36S7IdHZM5bqe4hL2Fl8OLPFWKQxPFqwyp1ILIAX2SbYc0Z7z85xMNHK55IQkgxgGpb9qWB1VTRNfdn3MhAsSZ4fqGh_s1V2aj_IZePWTgD_Ct866QluLC/s640/reflect_macro_def.png" width="640" /></a><br />
<br />
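Since the implementation above is only shown as an image, here is a minimal text sketch of the same idea. The field and macro names are my own reconstruction, not the exact engine code (the engine version also uses the using keyword so the macros don't need the type name passed explicitly):<br />
<pre>
#include &lt;cstddef&gt; // offsetof

struct TypeInfoMember
{
    const char* mName;   // member variable name
    size_t      mOffset; // byte offset from the start of the struct
    size_t      mSize;   // sizeof the member
};

struct TypeInfo
{
    const char*           mName;      // struct name
    size_t                mSize;      // sizeof the struct
    size_t                mAlignment; // alignof the struct
    const TypeInfoMember* mMembers;
    size_t                mMemberCount;
};

// Reflect a member by stringizing its name and using offsetof()/sizeof().
#define REFLECT_MEMBER(type, member) \
    { #member, offsetof(type, member), sizeof(type::member) }

struct vec3 { float x, y, z; };

static const TypeInfoMember gVec3Members[] =
{
    REFLECT_MEMBER(vec3, x),
    REFLECT_MEMBER(vec3, y),
    REFLECT_MEMBER(vec3, z),
};

static const TypeInfo gVec3TypeInfo =
{
    "vec3", sizeof(vec3), alignof(vec3),
    gVec3Members, sizeof(gVec3Members) / sizeof(gVec3Members[0])
};
</pre>
<br />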
This approach has one disadvantage: we cannot use <a href="https://en.cppreference.com/w/cpp/language/bit_field">bit fields</a> to specify how many bits are used in a variable (offsetof() does not work on bit fields). Also, bit field order seems to be <a href="https://stackoverflow.com/questions/1490092/c-c-force-bit-field-order-and-alignment">compiler dependent</a>. So I just don't use bit fields in the structs that need to be reflected.<br />
<br />
It also has another disadvantage: it is error-prone to reflect each variable manually. So I have written a C struct header parser (using Flex & Bison) to generate the reflection source code. For those C struct files that need auto-generated reflection data, instead of naming the source file with the extension .h, we name it with another file extension (e.g. .hds) and use a Visual Studio <a href="https://simonstechblog.blogspot.com/2019/06/msbuild-custom-build-tools-notes.html">custom MSBuild file</a> to execute my header parser. To make Visual Studio syntax highlight this custom file type, we need to associate the file extension with the C/C++ syntax by navigating to<br />
<blockquote class="tr_bq">
"Tools" -> "Options" -> "Text Editor" -> "File Extension"</blockquote>
and add the appropriate association:<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUOQMOejgtNaGGXkDn5cxi5VhYlACFEW2l50WTA2rTvIeH1iWHFVAF8ROaSXU2OJzJ3z6n7L5gvpr_mT8zzLgpe8NKyn6Ma4vEka4VJmzHXFSzQ_s2CfoE3gwbSs1lvwT-Aj0BPi_ydrAU/s1600/file_extension.png" imageanchor="1"><img border="0" height="232" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUOQMOejgtNaGGXkDn5cxi5VhYlACFEW2l50WTA2rTvIeH1iWHFVAF8ROaSXU2OJzJ3z6n7L5gvpr_mT8zzLgpe8NKyn6Ma4vEka4VJmzHXFSzQ_s2CfoE3gwbSs1lvwT-Aj0BPi_ydrAU/s400/file_extension.png" width="400" /></a><br />
<br />
But one thing I cannot figure out is auto-complete when typing "#include" for a custom file extension. It looks like Visual Studio only filters for a couple of extensions (e.g. .h, .inl, ...) and cannot recognize my new file type... If someone knows how to do it, please leave a comment below. Thank you.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4IRxiIglfAGTVvxeVStxUys4TdsTa8mIKRawqzXFITa9paprQJUSYuBtDCsTj5Aqt-oDIJqNSLZcB0jpNN71EUS0uQ3BcpGXBiUCnV4IolEggGX3OOBoHbTBCL0TPDNW5BpXhZ_YMNdct/s1600/missing_auto_complete.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="128" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4IRxiIglfAGTVvxeVStxUys4TdsTa8mIKRawqzXFITa9paprQJUSYuBtDCsTj5Aqt-oDIJqNSLZcB0jpNN71EUS0uQ3BcpGXBiUCnV4IolEggGX3OOBoHbTBCL0TPDNW5BpXhZ_YMNdct/s640/missing_auto_complete.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">MSVC auto-complete filter for .h file only and cannot discover the new type .hds</td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Serialization</span></b><br />
With the reflection data available, we know how large a struct is, how many variables it has and their byte offsets from the start of the struct, so we can serialize our C struct data. We define the serialization format with a data header and a number of data chunks as in the following figure:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7u2hNhTBv756f4TkbJoLd-Dgk_TlzgdWkUwXoMcl6cczti8Q_jvcNnyyKvozMd8pJ7lTmaPkd_lz7Lr9Hvo940zixQH9r6rY62YtbFjlOGDTdNhFQrK6c21Lxu_Zifq6qgswIIG0IECDs/s1600/format.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7u2hNhTBv756f4TkbJoLd-Dgk_TlzgdWkUwXoMcl6cczti8Q_jvcNnyyKvozMd8pJ7lTmaPkd_lz7Lr9Hvo940zixQH9r6rY62YtbFjlOGDTdNhFQrK6c21Lxu_Zifq6qgswIIG0IECDs/s320/format.png" width="148" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Memory layout of a serialized struct</td></tr>
</tbody></table>
<br />
<b>Data Header</b><br />
The data header contains all the TypeInfo used in the struct that gets serialized, as well as the architecture information (i.e. x86 or x64). During de-serialization, we can compare the runtime TypeInfo against the serialized TypeInfo to check whether the struct has any layout/type change (to speed up the comparison, we generate a hash value for every TypeInfo from the content of the file that defines the struct). If a layout/type change is detected, we de-serialize the struct variables one by one (and may perform data conversion if necessary, e.g. int to float); otherwise, we de-serialize the data in chunks.<br />
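As a rough illustration, the de-serialization entry point might look like the sketch below; all names (SerializedHeader, ByteReader, the two helper functions, the assumed mHash field) are hypothetical:<br />
<pre>
// Hypothetical fast/slow path decision during de-serialization.
bool Deserialize(void* dst, const TypeInfo&amp; runtimeInfo,
                 const SerializedHeader&amp; header, ByteReader&amp; reader)
{
    // Hash of the file content that defines the struct.
    if (header.typeInfoHash == runtimeInfo.mHash)
        return DeserializeChunks(dst, runtimeInfo, reader); // layout unchanged

    // Layout/type changed: match members one by one and convert types
    // (e.g. int -> float) where necessary.
    return DeserializeMemberByMember(dst, runtimeInfo, header, reader);
}
</pre>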
<br />
<b>Data Chunk</b><br />
The values of a C struct are stored in data chunks. There are 6 types of data chunks: RawBytes, size_t, String, Struct, PointerSimple and PointerComplex. There are 2 reasons to divide the chunks into different types: First, we want the serialized data to be usable across architectures (e.g. serialized on x86, de-serialized on x64), where some data types have different sizes depending on the architecture (e.g. size_t, pointers). Second, we want to support serializing pointers (with some restrictions). Below is a simple C struct that illustrates how the data is divided into chunks:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-uMQD_Mzbt6aqHUvOJjS7-dXPy9bCliPdJ0QNGEWDcq8XhCGdNAOTuUZ-eV_emPXeJBbT-tPMClBa-wU4KPJTXe6j8pYwlEj4ivIJjnzBiSifHQak-bQ9egOuMcRAO6brXsZSYvlT4IV7/s1600/sample.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="101" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg-uMQD_Mzbt6aqHUvOJjS7-dXPy9bCliPdJ0QNGEWDcq8XhCGdNAOTuUZ-eV_emPXeJBbT-tPMClBa-wU4KPJTXe6j8pYwlEj4ivIJjnzBiSifHQak-bQ9egOuMcRAO6brXsZSYvlT4IV7/s320/sample.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">This Sample struct get serialized into 3 data chunks</td></tr>
</tbody></table>
<br />
<b>RawBytes chunk</b><br />
RawBytes chunk is a chunk that contains a group of values whose sizes are architecture independent. Referring to the above Sample struct, the variables <i>val_int</i> and <i>val_float</i> are grouped into a single RawBytes chunk so that at run time, those values can be de-serialized by a single call to memcpy().<br />
<br />
<b>size_t chunk</b><br />
size_t chunk is a chunk that contains a single size_t value, which gets serialized as a 64 bit integer to avoid data loss (loading a value that is too large on the x86 architecture will cause a warning). Usually this type will not be used; I just added it in case I need to serialize this type for a third party library.<br />
<br />
<b>String chunk</b><br />
String chunk is used for storing the string value of a <i>char*</i>; the serializer can determine the length of the string (by looking for '\0') and serialize it appropriately.<br />
<br />
<b>Struct chunk</b><br />
Struct chunk is used when we serialize a struct that contains another struct which has some architecture dependent variables. With this chunk type, we can serialize/de-serialize recursively.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2aBAdFMv1m5_OjRNZONzheF6qPFW0Qm9Yxl7Xmn7ykP457I4mLnLnVJ19R2D0amSxxTMK0KH4WPmqBYpMChUX5LWjo2MzEIA_J8zUtoH_Whzpt2mezOyqZvIKe6KIqKPYMD5xRL0yX6Bg/s1600/chunk_struct.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="178" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2aBAdFMv1m5_OjRNZONzheF6qPFW0Qm9Yxl7Xmn7ykP457I4mLnLnVJ19R2D0amSxxTMK0KH4WPmqBYpMChUX5LWjo2MzEIA_J8zUtoH_Whzpt2mezOyqZvIKe6KIqKPYMD5xRL0yX6Bg/s320/chunk_struct.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The ComplexSample struct contains a Complex struct that has some architecture dependent values,<br />which cannot be collapsed into a RawBytes chunk, so it get serialized as a struct chunk instead.</td></tr>
</tbody></table>
<br />
<b>PointerSimple chunk</b><br />
PointerSimple chunk stores a pointer variable where the size of the data referenced by the pointer does not depend on the architecture, so it can be de-serialized by a single memcpy() similar to the RawBytes chunk. To determine the length of a pointer (sometimes a pointer is used like an array), my C struct header parser recognizes some special macros which define the length of the pointer (these macros expand to nothing when parsed by the normal Visual Studio C/C++ compiler). Usually the length of the pointer depends on another variable within the same struct, so with the special macro, we can define the length of the pointer like below:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoKc2GZA4qgQyT12o50Iz1cBHqRDk5UCkBlx3boc7j3vXuOpqex2ktKfd7_o4NV6Vggv78UIvpKu4jIZqdDcb33yArHVuFBK99koWwadOzwyJDYR8r6l6KDmscv47zxg6ZcQO0Rw0IyuWq/s1600/pointer_simple.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgoKc2GZA4qgQyT12o50Iz1cBHqRDk5UCkBlx3boc7j3vXuOpqex2ktKfd7_o4NV6Vggv78UIvpKu4jIZqdDcb33yArHVuFBK99koWwadOzwyJDYR8r6l6KDmscv47zxg6ZcQO0Rw0IyuWq/s400/pointer_simple.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The DESC_ARRAY_SIZE() macro tells the serializer that <br />the size depends on the variable <i>num</i> within the same struct</td></tr>
</tbody></table>
<br />
When serializing the above struct, the serializer will look up the value of the variable <i>num</i> to determine the length of the pointer variable <i>data</i>, so that we know how many bytes need to be serialized for <i>data</i>.<br />
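A minimal sketch of how such a macro could be declared so that the normal compiler ignores it (the macro name matches the figure above; the exact definition is my assumption):<br />
<pre>
// Expands to nothing for the C/C++ compiler; only the header parser reads it.
#define DESC_ARRAY_SIZE(countVar)

struct Sample
{
    int num;
    DESC_ARRAY_SIZE(num) float* data; // 'data' points to 'num' floats
};
</pre>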
<br />
But using this macro is not enough to cover all my use cases; for example, when serializing <a href="https://docs.microsoft.com/en-us/windows/win32/api/d3d12/ns-d3d12-d3d12_subresource_data">D3D12_SUBRESOURCE_DATA</a> for a 3D texture, the length of the <i>pData</i> variable cannot simply be calculated from <i>RowPitch</i> and <i>SlicePitch</i>:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXoowlPuz6PgsmMkcuIJlKF6NyzijMrFKWkOUnESk3ILWRIbEyT6zkrAUu0NU591n3OanbldRlosqMmpy4ySdvQqdyluf8JQLQOiQCT97mnoZkIp8P0NiJSfXZcT2cNUWz7v_Sq4_PvwJH/s1600/tex3D.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="210" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXoowlPuz6PgsmMkcuIJlKF6NyzijMrFKWkOUnESk3ILWRIbEyT6zkrAUu0NU591n3OanbldRlosqMmpy4ySdvQqdyluf8JQLQOiQCT97mnoZkIp8P0NiJSfXZcT2cNUWz7v_Sq4_PvwJH/s400/tex3D.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A sample struct to serialize a 3D texture, which the length of <br />D3D12_SUBRESOURCE_DATA::pData depends on the depth of the resources</td></tr>
</tbody></table>
<br />
The length can only be determined with access to the struct Texture3DDesc, which has the depth information. To tackle this, my serializer can register custom pointer length calculation callbacks (e.g. registered for the D3D12_SUBRESOURCE_DATA::pData variable inside the Texture3DDesc struct). The serializer keeps track of a stack of struct types that are currently being serialized, so that the callback can be triggered appropriately.<br />
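A hedged sketch of what such a registration could look like; the SerializeContext API, the registration method and the Texture3DDesc field names are assumptions for illustration only:<br />
<pre>
// The callback is found by walking the stack of structs currently being
// serialized, so pData can be sized using the outer Texture3DDesc.
serializer.RegisterPointerLengthCallback(
    "Texture3DDesc", "D3D12_SUBRESOURCE_DATA::pData",
    [](const SerializeContext&amp; ctx) -&gt; size_t
    {
        const Texture3DDesc&amp; desc = ctx.OuterStruct&lt;Texture3DDesc&gt;();
        const D3D12_SUBRESOURCE_DATA&amp; sub =
            ctx.CurrentStruct&lt;D3D12_SUBRESOURCE_DATA&gt;();
        return sub.SlicePitch * desc.depth; // bytes referenced by pData
    });
</pre>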
<br />
Finally, if a pointer variable has neither a length macro nor a registered length calculation callback, we assume the pointer has a length of 1 (or 0 if nullptr).<br />
<br />
<b>PointerComplex chunk</b><br />
PointerComplex chunk stores a pointer variable where the referenced data is architecture dependent, similar to the Struct chunk type. It uses the same pointer length calculation method as the PointerSimple chunk type.<br />
<br />
<b>Serialize union</b><br />
We can also serialize structs with union values that depend on another integer/enum variable, similar to <a href="https://docs.microsoft.com/en-us/windows/win32/api/d3d12/ns-d3d12-d3d12_shader_resource_view_desc">D3D12_SHADER_RESOURCE_VIEW_DESC</a>. We utilize the same macro approach used for pointer length calculation. For example:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZA3I5DTuFV6aamVEtVq-vhtHOutcYGb7f0OPa1JTMABaItH2CpZPLRZGzPhDtzzO40xPSLcUFC40HTkcdrCBSKZGx9xRFzVem2GqmKKc5XWTrU-R4ErqcBg7nJD3nzMKdm42rrlYaJdgM/s1600/union.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="230" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZA3I5DTuFV6aamVEtVq-vhtHOutcYGb7f0OPa1JTMABaItH2CpZPLRZGzPhDtzzO40xPSLcUFC40HTkcdrCBSKZGx9xRFzVem2GqmKKc5XWTrU-R4ErqcBg7nJD3nzMKdm42rrlYaJdgM/s400/union.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A sample to serialize variables inside union</td></tr>
</tbody></table>
In the above example, the DESC_UNION() macro adds information about when the variable needs to be serialized. During serialization, we check the value of the variable <i>type</i>: if <i>type == ValType::Double</i>, we serialize <i>val_double</i>; else if <i>type == ValType::Integer</i>, we serialize <i>val_integer</i>.<br />
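A sketch of the idea in text form, with the DESC_UNION() definition being my assumption (like DESC_ARRAY_SIZE(), it expands to nothing for the compiler and is only read by the header parser):<br />
<pre>
#define DESC_UNION(condition)

enum class ValType { Double, Integer };

struct UnionSample
{
    ValType type;
    union
    {
        DESC_UNION(type == ValType::Double)  double val_double;
        DESC_UNION(type == ValType::Integer) int    val_integer;
    };
};
</pre>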
<br />
<b><span style="font-size: large;">Conclusion</span></b><br />
This post has described how a simple reflection system for C structs is implemented, using a macro based approach assisted with a code generator. Based on the reflection data, we can implement a serialization system to save/load C structs using compile time type information. This system is simple, but it does not support complicated features like C++ class inheritance. It is mainly for serializing C style structs, which is enough for my current needs.<br />
<b><br /></b>
<b>References</b><br />
<span style="font-size: x-small;">[1] <a href="https://preshing.com/20180116/a-primitive-reflection-system-in-cpp-part-1/">https://preshing.com/20180116/a-primitive-reflection-system-in-cpp-part-1/</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://www.gdcvault.com/play/1026345/The-Future-of-Scene-Description">https://www.gdcvault.com/play/1026345/The-Future-of-Scene-Description</a></span><br />
<span style="font-size: x-small;">[3] <a href="https://blog.molecular-matters.com/2015/12/11/getting-the-type-of-a-template-argument-as-string-without-rtti/">https://blog.molecular-matters.com/2015/12/11/getting-the-type-of-a-template-argument-as-string-without-rtti/</a></span><br />
<br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-27379548335882045482019-07-07T20:53:00.001+08:002019-07-07T20:53:26.338+08:00Render Graph<b><span style="font-size: large;">Introduction</span></b><br />
Render graph is a directed acyclic graph that can be used to specify the dependencies between render passes. It is a convenient way to manage rendering, especially when using a low level API such as D3D12. There are many great resources that talk about it, such as <a href="https://www.gdcvault.com/play/1024612/FrameGraph-Extensible-Rendering-Architecture-in">this</a> and <a href="https://ourmachinery.com/post/high-level-rendering-using-render-graphs/">this</a>. In this post I will talk about how the render graph is set up, render pass reordering, as well as resource barrier management.<br />
<br />
<b><span style="font-size: large;">Render Graph set up</span></b><br />
For a simplified view of a render graph, we can treat each node inside the graph as a single render pass. For example, we can have a graph for a simple deferred renderer like this:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGUyLG6mojDXBTqX00RyZbb6rMQ5BMOCyd45MQIymZaPqTJ6d-4fsTNfP70JKssmhSD-8dT2gcFZmWV-TnCOvsKs31U7As65Eyl3qVA1drKIdYgP4L3-9W6zYzPVTPd4yzo8of1K9nwmhQ/s1600/defer_render_pass.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="176" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGUyLG6mojDXBTqX00RyZbb6rMQ5BMOCyd45MQIymZaPqTJ6d-4fsTNfP70JKssmhSD-8dT2gcFZmWV-TnCOvsKs31U7As65Eyl3qVA1drKIdYgP4L3-9W6zYzPVTPd4yzo8of1K9nwmhQ/s320/defer_render_pass.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Render passes dependency within a render graph</td></tr>
</tbody></table>
By having such a graph, we can derive the dependencies of the render passes, remove unused render passes, as well as reorder them. In my toy graphics engine, I use a simple scheme to reorder render passes. Taking the below render graph as an example, the render passes are added in the following order:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBxKsZVlHwslvLLQrnU1rnZMG9R663PY_-adfNaUGk7r2ZdqjgzWADhHR5cyLbdKvpOJ3zYgN2R6772IxQQtGe1WeRvrUFVM5lMXTXC1amyCsWSjT336Kwy_xesMUZVClPmI8RMksTeh4y/s1600/graph_eg.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBxKsZVlHwslvLLQrnU1rnZMG9R663PY_-adfNaUGk7r2ZdqjgzWADhHR5cyLbdKvpOJ3zYgN2R6772IxQQtGe1WeRvrUFVM5lMXTXC1amyCsWSjT336Kwy_xesMUZVClPmI8RMksTeh4y/s320/graph_eg.png" width="205" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A render graph example</td></tr>
</tbody></table>
We can group it into several dependency levels like this:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnlrbjjaZw0tpztioI3qqR-fwqMTD_r6xBX9lFxL-HqFQYO_ow1h8ONq4ERE6E_QNTtjkEBNUc9H_PAj1fIlVrsFfANRc2JVa9KScFbznSsMKovZEvZ0zc4hIPUpVclzTHL66Bm3Yyqr8B/s1600/graph_eg_lv.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="245" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnlrbjjaZw0tpztioI3qqR-fwqMTD_r6xBX9lFxL-HqFQYO_ow1h8ONq4ERE6E_QNTtjkEBNUc9H_PAj1fIlVrsFfANRc2JVa9KScFbznSsMKovZEvZ0zc4hIPUpVclzTHL66Bm3Yyqr8B/s400/graph_eg_lv.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">split render passes into several dependency levels</td></tr>
</tbody></table>
Within each level, the passes are independent and can be reordered freely, so the render passes are enqueued into the command list in the following order:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWakFsL_J1LuLFbca69lPa8nPp-1TniXGBuxrsfM7szP3ng6KPSiopo6UskYQsbi76FF85Ljrezsf2_f8LldMPbBzNMT0CtBr8RtXwObV77KoqAiPtQqLktfRvmmVpo6csbAc-mh17WTMw/s1600/graph_eg_order.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="56" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWakFsL_J1LuLFbca69lPa8nPp-1TniXGBuxrsfM7szP3ng6KPSiopo6UskYQsbi76FF85Ljrezsf2_f8LldMPbBzNMT0CtBr8RtXwObV77KoqAiPtQqLktfRvmmVpo6csbAc-mh17WTMw/s640/graph_eg_order.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Reordered render passes</td></tr>
</tbody></table>
Between each dependency level, we batch resource barriers to transition the resources to the correct states.<br />
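A minimal sketch of the level assignment, with hypothetical types: a pass's dependency level is one more than the deepest pass it depends on, computed by visiting the passes in topological order:<br />
<pre>
#include &lt;algorithm&gt;
#include &lt;vector&gt;

struct RenderPass
{
    std::vector&lt;const RenderPass*&gt; dependencies;
    int level = 0;
};

// 'topologicallySorted' must list every pass after all of its dependencies.
void ComputeDependencyLevels(std::vector&lt;RenderPass*&gt;&amp; topologicallySorted)
{
    for (RenderPass* pass : topologicallySorted)
        for (const RenderPass* dep : pass-&gt;dependencies)
            pass-&gt;level = std::max(pass-&gt;level, dep-&gt;level + 1);
}
</pre>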
<br />
<b><span style="font-size: large;">Transient Resources</span></b><br />
The above is just a simplified view of the graph. In fact, each render pass consists of a number of inputs and outputs. Every input/output is a graphics resource (e.g. texture), and render passes are connected through such resources within a render graph.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHkYlA6sh-bjwvtCNNuRu9q1E_NIJInKZoUpx_MyV65WSEzPloJY8LKMIagbEsWxL6IyD9A1-ofguSR_ztLuBao3FB3d7AiLAL5pZkbfuE0iBu23iqxRT5OBA2vaKGLUHOfq0MNR9IQvAs/s1600/defer_render_resource.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="388" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHkYlA6sh-bjwvtCNNuRu9q1E_NIJInKZoUpx_MyV65WSEzPloJY8LKMIagbEsWxL6IyD9A1-ofguSR_ztLuBao3FB3d7AiLAL5pZkbfuE0iBu23iqxRT5OBA2vaKGLUHOfq0MNR9IQvAs/s640/defer_render_resource.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Render graph connecting render passes and resources</td></tr>
</tbody></table>
As you can see in the above example, there are many transient resources (e.g. depth buffer, shadow map, etc.). We handle such transient resources by using a texture pool: a texture is reused after it is no longer needed by previous passes (<a href="https://docs.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12device-createplacedresource">placed resources</a> are not used for simplicity). When building a render graph, we compute the lifetime of every transient resource (i.e. the dependency levels where the resource starts/ends being used). So we can free a transient resource when execution goes beyond its last dependency level and reuse it for later render passes. To specify a render pass input/output in my engine, I only need to specify its size/format and don't need to worry about resource creation; the transient resource pool will create the textures as well as the required resource views (e.g. SRV/DSV/RTV).<br />
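A small sketch of the lifetime bookkeeping under these assumptions (the engine's actual types may differ): each transient resource records the first/last dependency level that touches it, and once execution passes the last level the backing texture returns to the pool.<br />
<pre>
#include &lt;algorithm&gt;
#include &lt;climits&gt;

// Hypothetical per-resource lifetime, measured in dependency levels.
struct TransientLifetime
{
    int firstLevel = INT_MAX;
    int lastLevel  = -1;
};

// Called for every (pass, resource) usage while building the graph.
void AddUsage(TransientLifetime&amp; life, int passDependencyLevel)
{
    life.firstLevel = std::min(life.firstLevel, passDependencyLevel);
    life.lastLevel  = std::max(life.lastLevel,  passDependencyLevel);
}
</pre>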
<br />
<b><span style="font-size: large;">Conclusion</span></b><br />
In this post, I have described how render passes are reordered inside the render graph, when barriers are inserted, and how transient resources are handled. But I have not implemented parallel recording of command lists and async compute yet. It really takes much more effort to use D3D12 than D3D11. I think the current state of my hobby graphics engine is good enough to use. Looks like I can start learning DXR after spending lots of effort on basic D3D12 set up code. =]<br />
<div>
<br />
<b>References</b><br />
<span style="font-size: x-small;">[1] <a href="https://www.gdcvault.com/play/1024612/FrameGraph-Extensible-Rendering-Architecture-in">https://www.gdcvault.com/play/1024612/FrameGraph-Extensible-Rendering-Architecture-in</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://ourmachinery.com/post/high-level-rendering-using-render-graphs/">https://ourmachinery.com/post/high-level-rendering-using-render-graphs/</a></span><br />
<br />
<br />
<br /></div>
Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-53743086507577322942019-07-02T23:59:00.000+08:002019-07-02T23:59:01.051+08:00D3D12 Constant Buffer Management<b><span style="font-size: large;">Introduction</span></b><br />
D3D12 does not have an explicit constant buffer API object (unlike <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d11/overviews-direct3d-11-resources-buffers-intro#constant-buffer">D3D11</a>). All we have in D3D12 is <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/nf-d3d12-id3d12device-createconstantbufferview">ID3D12Resource</a>, which needs to be sub-divided into smaller regions with <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/ns-d3d12-d3d12_constant_buffer_view_desc">Constant Buffer Views</a>. And it is our job to handle the constant buffer lifetime and avoid updating constant buffer values while the GPU is still using them. This post will describe how I handle this topic.<br />
<br />
<b><span style="font-size: large;">Constant buffer pool</span></b><br />
We allocate a large ID3D12Resource and treat it as an object pool by sub-dividing it into many small constant buffers (let's call it a constant buffer pool). Since constant buffers are required to be 256 bytes aligned (I can only find this requirement in the <a href="https://docs.microsoft.com/en-us/previous-versions//dn899216(v=vs.85)">previous documentation</a>, while the updated documentation only mentions such a requirement in <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/upload-and-readback-of-texture-data">Uploading Texture Data Through Buffers</a>, which is under a section about textures...), I defined 3 fixed size pools: 256/512/1024 bytes. These 3 sizes are enough for my needs as most constant buffers are small (in <a href="https://simonstechblog.blogspot.com/2017/11/seal-guardian-announced.html">Seal Guardian</a>, the largest constant buffer size is 560 bytes, while large data like the skinning matrix palette is uploaded via texture).<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPkrqA-u1pkd60zpsG7pEYVPjLg7GKi_P2ZV5pgVX7oaQy9kLdoEoUUUG9CYa6EKJPFImCszpJ-ZsB2WHVgYlAHpqMN5tVKOx564WK97TAPevRFkrwJe6OH2gDfMieD1SGAaFFs3V0o0DN/s1600/cb_pool.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="310" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPkrqA-u1pkd60zpsG7pEYVPjLg7GKi_P2ZV5pgVX7oaQy9kLdoEoUUUG9CYa6EKJPFImCszpJ-ZsB2WHVgYlAHpqMN5tVKOx564WK97TAPevRFkrwJe6OH2gDfMieD1SGAaFFs3V0o0DN/s400/cb_pool.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">3 constant buffer pools with different size</td></tr>
</tbody></table>
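A tiny sketch of the pool selection under this scheme (the helper names are mine, not the engine's): round the requested size up to the 256-byte alignment requirement, then pick the smallest pool that fits.<br />
<pre>
size_t AlignTo256(size_t size) { return (size + 255) &amp; ~size_t(255); }

int SelectPoolIndex(size_t constantBufferSize)
{
    const size_t aligned = AlignTo256(constantBufferSize);
    if (aligned &lt;= 256)  return 0; // 256 bytes pool
    if (aligned &lt;= 512)  return 1; // 512 bytes pool
    if (aligned &lt;= 1024) return 2; // 1024 bytes pool
    return -1; // larger constant buffers are not needed in my engine
}
</pre>
<br />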
In the <a href="https://simonstechblog.blogspot.com/2019/06/d3d12-descriptor-heap-management.html">last post</a>, a non shader visible descriptor heap manager is used to handle non shader visible descriptors. But in fact, that is only used for SRV/DSV/RTV descriptors; constant buffer views are managed with another scheme. As described above, when we create an ID3D12Resource for a constant buffer pool, we also create a non shader visible ID3D12DescriptorHeap with a size large enough to hold descriptors pointing to all the constant buffers inside the pool.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9AdziNXpXq0v17t3JkwQvpJVxWhkWPBwYCWTWRwfemy3kWBqo-vF80j7BNAAs4nZpay9I4mTUFd9P68VQbP1mC8IfepiINbHMHOvxXZhFOYrd0iwfjeFoJ9QKkz3l0gjITy4cflApSwyZ/s1600/res_pair.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9AdziNXpXq0v17t3JkwQvpJVxWhkWPBwYCWTWRwfemy3kWBqo-vF80j7BNAAs4nZpay9I4mTUFd9P68VQbP1mC8IfepiINbHMHOvxXZhFOYrd0iwfjeFoJ9QKkz3l0gjITy4cflApSwyZ/s320/res_pair.png" width="280" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">ID3D12Resource and ID3D12DescriptorHeap are created in pair</td></tr>
</tbody></table>
We also split the constant buffer pools based on their usage: static/dynamic. So there are 6 constant buffer pools in total inside my toy engine (static 256/512/1024 bytes pools + dynamic 256/512/1024 bytes pools).<br />
<br />
<b><span style="font-size: large;">Dynamic constant buffer</span></b><br />
Constant buffers can be updated dynamically. Each constant buffer contains a CPU side copy of its constant values. When it is bound before a draw call, those values will be copied to the dynamic constant buffer pool (created in an <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/ne-d3d12-d3d12_heap_type">upload heap</a>). A piece of memory for the constant buffer values is allocated from the pool in a ring buffer fashion. If the pool is full (i.e. the ring buffer wraps around too fast and all the constant buffers are still in use by the GPU), we create a larger pool, and the existing pool is deleted after all related GPU commands finish execution.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEij7BdRLDDXb6u_xS1WPuSqk40C2NWLVtY6AlFpY6SPIU8PyIcbYMk2LLpa3IW8_Z_6JIn40PBOzKxAxEAwfX71WcluMxPOe7Hb2p_jWRX2Vm1Hh-yPiOrg2X3Ta6KUqhn9fypxGzWH_3dx/s1600/dyna_pool_resize.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="243" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEij7BdRLDDXb6u_xS1WPuSqk40C2NWLVtY6AlFpY6SPIU8PyIcbYMk2LLpa3IW8_Z_6JIn40PBOzKxAxEAwfX71WcluMxPOe7Hb2p_jWRX2Vm1Hh-yPiOrg2X3Ta6KUqhn9fypxGzWH_3dx/s400/dyna_pool_resize.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Resizing dynamic constant buffer pool, the previous pool<br />
will be deleted after executing related GPU commands</td></tr>
</tbody></table>
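A minimal sketch of the ring buffer suballocation, assuming a hypothetical fence check to detect the full case (offsets stay 256-byte aligned because the pool only hands out 256/512/1024 byte blocks):<br />
<pre>
struct RingState { size_t head = 0; size_t poolSize = 0; };

// Assumed fence query: does the GPU still read this range of the pool?
extern bool RangeStillInUseByGpu(size_t offset, size_t size);

bool RingAllocate(RingState&amp; ring, size_t size, size_t&amp; outOffset)
{
    size_t head = ring.head;
    if (head + size &gt; ring.poolSize)
        head = 0; // wrap around to the start of the pool
    if (RangeStillInUseByGpu(head, size))
        return false; // pool is full: create a larger pool instead
    outOffset = head;
    ring.head = head + size;
    return true;
}
</pre>
<br />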
To avoid copying the same constant buffer values to the pool when a constant buffer is bound multiple times, we keep 2 integer values for every dynamic constant buffer: a "last upload frame index" and a "value version". The last upload frame index is the frame index at which the CPU constant buffer values were copied to the dynamic pool. The value version is an integer which is monotonically increased every time the constant buffer values get modified. By checking these 2 integers, we can avoid duplicated copies of a constant buffer in the dynamic pool and re-use the previously copied values.<br />
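A sketch of this check; the member and helper names below are hypothetical:<br />
<pre>
#include &lt;cstdint&gt;

struct DynamicCB
{
    uint64_t lastUploadFrame = ~0ull; // frame the values were last copied
    uint32_t uploadedVersion = 0;     // value version that was copied
    uint32_t valueVersion    = 1;     // bumped on every value modification
    size_t   offsetInPool    = 0;     // where the last copy lives
};

extern size_t CopyValuesToDynamicPool(const DynamicCB&amp; cb); // assumed helper

size_t BindDynamicCB(DynamicCB&amp; cb, uint64_t frameIndex)
{
    if (cb.lastUploadFrame != frameIndex || cb.uploadedVersion != cb.valueVersion)
    {
        cb.offsetInPool    = CopyValuesToDynamicPool(cb);
        cb.lastUploadFrame = frameIndex;
        cb.uploadedVersion = cb.valueVersion;
    }
    return cb.offsetInPool; // reuse the copy made earlier this frame
}
</pre>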
<br />
<b><span style="font-size: large;">Static constant buffer</span></b><br />
A static constant buffer will have a static descriptor handle, as described in the <a href="https://simonstechblog.blogspot.com/2019/06/d3d12-descriptor-heap-management.html">last post</a>. The static constant buffer pools are created in the <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/ne-d3d12-d3d12_heap_type">default heap</a>. The pool is managed in a free-list fashion, as opposed to the ring buffer of the dynamic pool. Also, when the pool is full, we still create an extra pool for new constant buffer allocation requests. But different from the dynamic pools, previous pools will not be deleted when new pools get created.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisFO8krJU0wCEvUwIOH7C8MoaK9BFCUlimG_iAGhPgAbfojJvB0GAyCgBIWjdVcp2EXCbpH1CsrYYVQLCT97JKIyyTeICwjpJz6nnn80p-_pm-priXWcMJdI3Tr5aON5uTrqlc4OqzKVnX/s1600/stat_poo_resizepng.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="177" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisFO8krJU0wCEvUwIOH7C8MoaK9BFCUlimG_iAGhPgAbfojJvB0GAyCgBIWjdVcp2EXCbpH1CsrYYVQLCT97JKIyyTeICwjpJz6nnn80p-_pm-priXWcMJdI3Tr5aON5uTrqlc4OqzKVnX/s640/stat_poo_resizepng.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Creating more static constant buffer pool if existing pools are full</td></tr>
</tbody></table>
To upload static constant buffer values to the GPU (since static pools are created in the default heap), we use the dynamic constant buffer pool instead of creating another upload heap. Every frame, we gather all newly created static constant buffers; then, before we start rendering the frame, we copy all their CPU constant buffer values to the dynamic constant buffer pool and schedule <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/nf-d3d12-id3d12graphicscommandlist-copybufferregion">ID3D12GraphicsCommandList::CopyBufferRegion()</a> calls to copy those values from the upload heap to the default heap. By grouping all the static constant buffer uploads, we can reduce the number of <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/ns-d3d12-d3d12_resource_barrier">D3D12_RESOURCE_BARRIER</a>s needed to transition between the <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/ne-d3d12-d3d12_resource_states">D3D12_RESOURCE_STATE_COPY_DEST and D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER</a> states.<br />
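A sketch of the batched upload; the D3D12 calls are real, while staticPool, dynamicPool and PendingUpload are assumed names for illustration:<br />
<pre>
#include &lt;algorithm&gt;
#include &lt;d3d12.h&gt;
#include &lt;vector&gt;

struct PendingUpload { UINT64 dstOffset, srcOffset, numBytes; }; // assumed

void UploadStaticConstantBuffers(ID3D12GraphicsCommandList* cmdList,
                                 ID3D12Resource* staticPool,  // default heap
                                 ID3D12Resource* dynamicPool, // upload heap
                                 const std::vector&lt;PendingUpload&gt;&amp; pendingUploads)
{
    D3D12_RESOURCE_BARRIER barrier = {};
    barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier.Transition.pResource   = staticPool;
    barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier.Transition.StateBefore = D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER;
    barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_COPY_DEST;
    cmdList-&gt;ResourceBarrier(1, &amp;barrier); // one barrier batch per frame

    for (const PendingUpload&amp; u : pendingUploads) // gathered this frame
        cmdList-&gt;CopyBufferRegion(staticPool, u.dstOffset,
                                  dynamicPool, u.srcOffset, u.numBytes);

    std::swap(barrier.Transition.StateBefore, barrier.Transition.StateAfter);
    cmdList-&gt;ResourceBarrier(1, &amp;barrier); // back to the constant buffer state
}
</pre>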
<br />
<b><span style="font-size: large;">Conclusion</span></b><br />
In this post, I have described how constant buffers are managed in my toy engine. It uses a number of pools of different sizes, managed in a ring buffer fashion for dynamic constant buffers and in a free-list fashion for static constant buffers. Uploads of static constant buffer contents are grouped together to reduce barrier usage. However, I currently only split the usage into static/dynamic. In the future, I would like to investigate the performance of adding another usage type for constant buffers that are updated every frame but used frequently in many draw calls (e.g. write once, read many within a frame), placing those resources in the default heap instead of the current dynamic upload heap.<br />
<br />
<b>Reference</b><br />
<span style="font-size: x-small;">[1] <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/large-buffers">https://docs.microsoft.com/en-us/windows/desktop/direct3d12/large-buffers</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://www.gamedev.net/forums/topic/679285-d3d12-how-to-correctly-update-constant-buffers-in-different-scenarios/">https://www.gamedev.net/forums/topic/679285-d3d12-how-to-correctly-update-constant-buffers-in-different-scenarios/</a></span><br />
<br />
<br />
<br />
<br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-78874721328099291272019-06-29T16:03:00.001+08:002019-06-29T16:03:33.853+08:00D3D12 Descriptor Heap Management<b><span style="font-size: large;">Introduction</span></b><br />
Continuing from the <a href="https://simonstechblog.blogspot.com/2019/06/d3d12-root-signature-management.html">last post</a>, we described how root signatures are managed to bind resources. But the root signature is just one part of resource binding; we also need descriptors to <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/resource-binding">bind resources</a>. <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/descriptors-overview">Descriptors</a> are small blocks of memory describing an object (CBV/SRV/UAV/Sampler) to the GPU. They are stored in <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/descriptor-heaps">descriptor heaps</a>, and they may be <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/shader-visible-descriptor-heaps">shader visible</a> or <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/non-shader-visible-descriptor-heaps">non shader visible</a>. In this post, I will talk about how descriptors are managed for resource binding in my toy graphics engine.<br />
<br />
<b><span style="font-size: large;">Non shader visible heap</span></b><br />
Let's start with the non shader visible heap management. We can treat a descriptor as a pointer to a GPU resource (e.g. texture). A descriptor heap is a piece of memory used for storing descriptors, and the size of a single descriptor can be queried by <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/nf-d3d12-id3d12device-getdescriptorhandleincrementsize">ID3D12Device::GetDescriptorHandleIncrementSize()</a>. So we treat a descriptor heap as an object pool, and every descriptor within the same heap can be referenced by an index.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhO3Iy5uzY8CINXGGlNOmcnOgdjFbVVv0DSVZfZ6HFo3nKGdXzamxyv4esCkRvxiPn1DG1AYLWQIlPZXnKYbW4ZWi-PlzT75kzJQgZK_FdktvG1q6VfxjJL_IYkIe_PGyadajT1-R4lLsju/s1600/non_shader_vis_heap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhO3Iy5uzY8CINXGGlNOmcnOgdjFbVVv0DSVZfZ6HFo3nKGdXzamxyv4esCkRvxiPn1DG1AYLWQIlPZXnKYbW4ZWi-PlzT75kzJQgZK_FdktvG1q6VfxjJL_IYkIe_PGyadajT1-R4lLsju/s320/non_shader_vis_heap.png" width="198" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Non shader visible descriptor heap containing N descriptors</td></tr>
</tbody></table>
Since we don't know how many descriptors are needed in advance and we may have many non shader visible heaps, a non shader visible heap manager is created for allocating descriptors from the descriptor heap(s). This manager contains at least 1 descriptor heap. When a descriptor allocation request is made to the manager, it will first look for a free descriptor in the existing descriptor heap(s); if none is found, a new descriptor heap will be created to handle the request.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnJAUZEeZA_aNyMZR1nsbbWrFEAua43z5G0ErucrFwAAvqkkZoSZ-vvEWYIngiGzF8uf-Tzk-082D8FRoT1SWQTDT3ovJzyIbN_kSyedgF7g1f2W5xsjRmtK4jEn3imUVVI2xZh2PGS4lR/s1600/heap_mgr.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnJAUZEeZA_aNyMZR1nsbbWrFEAua43z5G0ErucrFwAAvqkkZoSZ-vvEWYIngiGzF8uf-Tzk-082D8FRoT1SWQTDT3ovJzyIbN_kSyedgF7g1f2W5xsjRmtK4jEn3imUVVI2xZh2PGS4lR/s400/heap_mgr.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Descriptor heap manager handles descriptor allocation request, create descriptor heap if necessary</td></tr>
</tbody></table>
So within the graphics engine, we use a "non shader visible descriptor handle" to reference a D3D12 descriptor; it stores the heap index and descriptor index with respect to a descriptor heap manager. All the textures created in the engine will have a "non shader visible descriptor handle" for resource binding (more on this later).<br />
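A sketch of such a handle and how it could resolve to a D3D12 CPU descriptor (the handle layout is my assumption; the D3D12 calls are real):<br />
<pre>
#include &lt;cstdint&gt;
#include &lt;d3d12.h&gt;

// Hypothetical handle: which heap in the manager, and which slot inside it.
struct NonShaderVisibleDescriptorHandle
{
    uint16_t heapIndex;
    uint16_t descriptorIndex;
};

// 'incrementSize' is queried once via
// ID3D12Device::GetDescriptorHandleIncrementSize().
D3D12_CPU_DESCRIPTOR_HANDLE ResolveHandle(
    const NonShaderVisibleDescriptorHandle&amp; h,
    ID3D12DescriptorHeap* const* heaps, UINT incrementSize)
{
    D3D12_CPU_DESCRIPTOR_HANDLE cpu =
        heaps[h.heapIndex]-&gt;GetCPUDescriptorHandleForHeapStart();
    cpu.ptr += SIZE_T(h.descriptorIndex) * incrementSize;
    return cpu;
}
</pre>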
<br />
<b><span style="font-size: large;">Shader visible heap</span></b><br />
Next, we will talk about the shader visible heap management. The shader visible heap is responsible for binding resources that get used in shaders. It is recommended that only 1 heap is used for all frames so that asynchronous compute and graphics workloads can run in parallel (<a href="https://developer.nvidia.com/dx12-dos-and-donts">on NVidia hardware</a>). So we just create 1 large shader visible heap at the start of the program and don't bother to resize/allocate a larger heap when the heap is full (we just assert in this case). This single large shader visible descriptor heap is divided into 2 regions: static / dynamic.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipyy9rOPzMLxs65SrWMRLFHnrVKEb_ei4m1J1LajMZRFScEmhyphenhyphenYEhnvVKA8oznk3DtZOfDpmgTDafsrS9LxVag7Mt6AhGbzV7eAa4O82_9UGFhlk4Z2xJARKscddOrngZKC3KYggBN5KbJ/s1600/shader_vsi_heap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipyy9rOPzMLxs65SrWMRLFHnrVKEb_ei4m1J1LajMZRFScEmhyphenhyphenYEhnvVKA8oznk3DtZOfDpmgTDafsrS9LxVag7Mt6AhGbzV7eAa4O82_9UGFhlk4Z2xJARKscddOrngZKC3KYggBN5KbJ/s320/shader_vsi_heap.png" width="198" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A single large shader visible descriptor heap, divided into 2 regions</td></tr>
</tbody></table>
<br />
<b>Dynamic descriptor</b><br />
Dynamic descriptors are used for transient resources whose descriptor tables cannot be reused often. During resource binding (e.g. texture), their non shader visible descriptors will be copied to the shader visible heap via <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/nf-d3d12-id3d12device-copydescriptors">ID3D12Device::CopyDescriptors()</a>, where the copy destination (i.e. dynamic shader visible descriptors) is allocated in a ring buffer fashion. (Note the copy operation has a restriction that the <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/copying-descriptors">copy source</a> must be in a non shader visible heap; that's why we allocate a "non shader visible descriptor handle" for every texture.)<br />
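A sketch of this path; the D3D12 calls are real, while the ring allocation helper and variable names are assumptions:<br />
<pre>
#include &lt;d3d12.h&gt;
#include &lt;vector&gt;

// Assumed ring buffer allocator: returns the first free slot in the dynamic
// region of the shader visible heap.
extern UINT AllocateDynamicDescriptors(UINT count);

void BindDynamicTable(ID3D12Device* device, ID3D12GraphicsCommandList* cmdList,
                      ID3D12DescriptorHeap* shaderVisibleHeap, UINT incrementSize,
                      UINT rootParameterIndex,
                      const std::vector&lt;D3D12_CPU_DESCRIPTOR_HANDLE&gt;&amp; srcHandles)
{
    UINT numDescriptors = (UINT)srcHandles.size();
    UINT start = AllocateDynamicDescriptors(numDescriptors);

    D3D12_CPU_DESCRIPTOR_HANDLE dst =
        shaderVisibleHeap-&gt;GetCPUDescriptorHandleForHeapStart();
    dst.ptr += SIZE_T(start) * incrementSize;

    // srcHandles are non shader visible descriptors of the bound resources.
    // Null source range sizes means every source range has size 1.
    device-&gt;CopyDescriptors(1, &amp;dst, &amp;numDescriptors,
                            numDescriptors, srcHandles.data(), nullptr,
                            D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);

    D3D12_GPU_DESCRIPTOR_HANDLE table =
        shaderVisibleHeap-&gt;GetGPUDescriptorHandleForHeapStart();
    table.ptr += UINT64(start) * incrementSize;
    cmdList-&gt;SetGraphicsRootDescriptorTable(rootParameterIndex, table);
}
</pre>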
<br />
<b>Static descriptor</b><br />
Static descriptors are used for resources which can be grouped together into a descriptor table so that it can be reused over multiple frames. For example, the set of textures inside a material will not change very often, so those textures can be grouped into a descriptor table. My current approach manages the static region of the shader visible heap in a "stack" based fashion: instead of a stack of individual descriptors, we have a stack of groups of descriptors, and often 1 static descriptor group will be created during level load.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc3c4WdIO5qt2X8CtIRU4iRhQxIJ3BtHPnbC2vyZYKPGte94EJsI4p3WDP1IqX_8InhLmOaVHsjeaEiwdRk_wy8UCdobVxoGokvYEGgw62pyS514U_78I_VlDAq7-MpjYT1ttPtH6lzT37/s1600/static_heap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc3c4WdIO5qt2X8CtIRU4iRhQxIJ3BtHPnbC2vyZYKPGte94EJsI4p3WDP1IqX_8InhLmOaVHsjeaEiwdRk_wy8UCdobVxoGokvYEGgw62pyS514U_78I_VlDAq7-MpjYT1ttPtH6lzT37/s320/static_heap.png" width="148" /></a></td></tr>
<tr><td class="tr-caption" style="font-size: 12.8px;">static descriptors are packed into group during level load</td></tr>
</tbody></table>
Inside a group of static descriptors, the descriptors are sorted such that all constant buffer descriptors appear before texture descriptors. Also, null descriptors may need to be added to respect the <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/hardware-support">Hardware Tiers restrictions</a>. To identify a static descriptor in the shader visible heap, we use the stack group index together with the descriptor index within the group.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhG7Mpx4qcKUY7LzmKDS1DgmT9CsIemdVNXlL4C4k3O5aKJxSi2HQGN4OJidHyHZtXUwF90L4o3au1vWO64mJ4RpxgHhW02hFn-nmTzasq5Eyn2fDu7EVmyWI8ha5V71mpYB7ihBkxNPG7V/s1600/static_gp.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhG7Mpx4qcKUY7LzmKDS1DgmT9CsIemdVNXlL4C4k3O5aKJxSi2HQGN4OJidHyHZtXUwF90L4o3au1vWO64mJ4RpxgHhW02hFn-nmTzasq5Eyn2fDu7EVmyWI8ha5V71mpYB7ihBkxNPG7V/s320/static_gp.png" width="166" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">descriptor are ordered by type, with necessary padding</td></tr>
</tbody></table>
<div>
Each "static resource"(e.g. constant buffer/texture) will have a "static descriptor handle" beside the "non shader visible descriptor handle". We can check whether those resources are within the same descriptor table by comparing the stack group index and descriptor index to see whether they are in consecutive order. With such information, we can create a resource binding API similar to D3D11 (e.g. <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d11/nf-d3d11-id3d11devicecontext-pssetshaderresources">ID3D11DeviceContext::PSSetShaderResources()</a> ), if all the resources in the API call are in the same descriptor table, we can use the static descriptor to bind the descriptor table directly, otherwise, we switch to use the dynamic descriptor approach described in previous section to create a continuous descriptor table. (I have also think of instead of using similar binding API as D3D11, may be I can create a so call "descriptor table" object explicitly, say during material loading and grouping material textures into a descriptor table, so that resources binding can skip the consecutive descriptor index check described above. But currently I just stick with a simple solution first...)<br />
<br />
As mentioned before, static descriptor groups are allocated in a "stack" based approach. But my current implementation is not strictly "last in - first out": we can remove a group in between, which leaves a "hole" in the static shader visible heap region and results in fragmentation.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAvorK-tLzkZQ3gQRt6A-UrKCkXAsjfKnBVTvQlsNPfJN13vbuOnCg0eZ9Ivlmv3CvEZxBfrYt-nq4rrcX07ivUbaSSaM2zmn2cbLOeCoomI2aEKkKlkUDPGzBXG-ESjxiZ1c8vPDY9Q6U/s1600/fragmentation.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjAvorK-tLzkZQ3gQRt6A-UrKCkXAsjfKnBVTvQlsNPfJN13vbuOnCg0eZ9Ivlmv3CvEZxBfrYt-nq4rrcX07ivUbaSSaM2zmn2cbLOeCoomI2aEKkKlkUDPGzBXG-ESjxiZ1c8vPDY9Q6U/s400/fragmentation.png" width="148" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fragmented static descriptor heap region</td></tr>
</tbody></table>
In theory, we can defragment this heap region by moving descriptor groups to unused space (this works because we use indices to reference descriptors inside a heap instead of D3D12_GPU_DESCRIPTOR_HANDLE addresses directly), and during defragmentation, we may switch to dynamic descriptors temporarily to avoid overwriting the static heap region while GPU commands are still using it. But currently, I have not implemented the defragmentation yet because I only have one simple level (i.e. only 1 static descriptor group) now...<br />
<br />
<b><span style="font-size: large;">Conclusion</span></b><br />
In this post, I have described how the descriptor heaps are managed for resource binding. To sum up, the shader visible descriptor heap is divided into 2 regions: static/dynamic. The static region is managed in a "stack" based approach. During level loading, all the static CBV/SRV descriptors are stored within a static descriptor stack group, which is one big contiguous descriptor table; this increases the chance of reusing the descriptor table. In addition to this optional static descriptor, every resource must have a non shader visible descriptor handle. The non shader visible descriptor handle is used when a static descriptor table cannot be used during resource binding; it gets copied to the shader visible heap to form a new descriptor table. With this kind of heap management, we can create a resource binding API similar to D3D11 which calls the underlying D3D12 API using descriptors.<br />
<br />
<b>References</b><br />
<span style="font-size: x-small;">[1] <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/resource-binding">https://docs.microsoft.com/en-us/windows/desktop/direct3d12/resource-binding</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://www.gamedev.net/forums/topic/686440-d3d12-descriptor-heap-strategies/">https://www.gamedev.net/forums/topic/686440-d3d12-descriptor-heap-strategies/</a></span><br />
<br />
<br /></div>
Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-90302739463313536512019-06-22T17:38:00.001+08:002019-06-22T17:38:22.465+08:00D3D12 Root Signature Management<b> <span style="font-size: large;">Introduction</span></b><br />
Continuing from the <a href="https://simonstechblog.blogspot.com/2019/06/msbuild-custom-build-tools-notes.html">last post</a> about writing my new toy D3D12 graphics engine, we have compiled some shaders and extracted some reflection data from the shader source. The next problem is to bind resources (e.g. constant buffers / textures) to the shaders. D3D12 uses <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/root-signatures">root signatures</a> together with root parameters to achieve this task. In this post, I will describe how my toy engine creates root signatures automatically based on shader resource usage.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQkSrLifMzETziYHnqPu6ERbQ_zCtAhJCuUHzvQrP-bDu4jydefdihJKXqxmH2nBkls-C3ZX51xabEXvyl8YakhK_-XZANdTS1K_AkIqfw896PR5BJVRp1B8QhvZ4aFcxye8GB3kabS11m/s1600/d3d12_demo_shadowmap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="243" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQkSrLifMzETziYHnqPu6ERbQ_zCtAhJCuUHzvQrP-bDu4jydefdihJKXqxmH2nBkls-C3ZX51xabEXvyl8YakhK_-XZANdTS1K_AkIqfw896PR5BJVRp1B8QhvZ4aFcxye8GB3kabS11m/s640/d3d12_demo_shadowmap.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><div style="text-align: left;">
Left: new D3D12 graphics engine (with only basic diffuse material)</div>
<div style="text-align: left;">
Right: previous D3D11 rendering (with PBR material, GI...)</div>
<div style="text-align: left;">
Still a long way to go to catch up with the previous renderer... </div>
</td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Resource binding model</span></b><br />
In D3D12, shader resource binding relies on the <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/creating-a-root-signature">root parameter index</a>. But when iterating on shader code, we may modify some resource bindings (e.g. add a texture variable / remove a constant buffer); the root signature may then change, which changes the root parameter indices. Every call like <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/nf-d3d12-id3d12graphicscommandlist-setgraphicsrootdescriptortable">SetGraphicsRootDescriptorTable()</a> would need to be updated with the new root parameter index, which is tedious and error-prone... The resource binding model in D3D11 (e.g. <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d11/nf-d3d11-id3d11devicecontext-pssetshaderresources">PSSetShaderResources()</a>, <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d11/nf-d3d11-id3d11devicecontext-pssetconstantbuffers">PSSetConstantBuffers()</a>) doesn't have this problem, as the API defines a set of fixed slots to bind to. So I would prefer to work with a similar binding model in my toy engine.<br />
<br />
So, I defined a couple of slots for resource binding as follows (a bit different from D3D11):<br />
<blockquote class="tr_bq">
Engine_PerDraw_CBV<br />
Engine_PerView_CBV<br />
Engine_PerFrame_CBV<br />
Engine_PerDraw_SRV_VS_ONLY<br />
Engine_PerDraw_SRV_PS_ONLY<br />
Engine_PerDraw_SRV_ALL<br />
Engine_PerView_SRV_VS_ONLY<br />
Engine_PerView_SRV_PS_ONLY<br />
Engine_PerView_SRV_ALL<br />
Engine_PerFrame_SRV_VS_ONLY<br />
Engine_PerFrame_SRV_PS_ONLY<br />
Engine_PerFrame_SRV_ALL<br />
Shader_PerDraw_CBV<br />
Shader_PerDraw_SRV_VS_ONLY<br />
Shader_PerDraw_SRV_PS_ONLY<br />
Shader_PerDraw_SRV_ALL<br />
Shader_PerDraw_UAV</blockquote>
<div>
Instead of having fixed slots per shader stage as in D3D11, my toy engine's slots can be summarized into 3 categories:</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiPJMzlLOVRUAhOe2APv0tFdaph41UQ6y3Sphv7U6srrQK10v-bhevHLz_7AdH-wg3CRUS7TlQljnad9f82F_WeCN2Vppa4u1GB8BZk0FNxUsFU5CK7Og2HOdUgHR-jLw26nL8F6AJdD9H/s1600/root_slot.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="37" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiPJMzlLOVRUAhOe2APv0tFdaph41UQ6y3Sphv7U6srrQK10v-bhevHLz_7AdH-wg3CRUS7TlQljnad9f82F_WeCN2Vppa4u1GB8BZk0FNxUsFU5CK7Og2HOdUgHR-jLw26nL8F6AJdD9H/s320/root_slot.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Resource binding slot categories</td></tr>
</tbody></table>
<b><br /></b>
<b>Slot category "Resource Type"</b><br />
As described by its name (CBV/SRV/UAV), this slot binds the corresponding resource type: constant buffer view / shader resource view / unordered access view.<br />
The SRV type is further sub-divided into VS_ONLY / PS_ONLY / ALL sub-categories, which refer to the <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/ne-d3d12-d3d12_shader_visibility">shader visibility</a>. According to <a href="https://developer.nvidia.com/dx12-dos-and-donts">Nvidia's Do's and Don'ts</a>, limiting the shader visibility can improve performance.<br />
For the CBV type, the shader visibility is deduced from the shader reflection data during root signature and PSO creation.<br />
<br />
<b>Slot category "Change frequency"</b><br />
Resources are encouraged to be bound based on their update frequency, so this slot category is divided into 3 types: Per Frame / Per View / Per Draw.<br />
For the Per Frame/View types, the root parameter type is a <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/ne-d3d12-d3d12_root_parameter_type">descriptor table</a>,<br />
while the Per Draw CBV type uses a <a href="https://docs.microsoft.com/en-us/windows/desktop/api/d3d12/ns-d3d12-d3d12_root_descriptor">root descriptor</a>.<br />
The Per Draw SRV type still uses a descriptor table instead of root descriptors because, for example, it is common to have only 1 constant buffer for the material of a mesh while binding multiple textures for the same material; using a descriptor table here helps keep the root signature small.<br />
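As an illustration, here is a rough C++ sketch of how these categories can translate into D3D12 root parameters (the descriptor count and register space numbers are placeholders; the real values come from shader reflection and the slot definitions):<br />
<pre>
#include <d3d12.h>

void BuildExampleRootParameters()
{
    // Per Frame/View slot -> descriptor table of SRVs
    D3D12_DESCRIPTOR_RANGE srvRange = {};
    srvRange.RangeType          = D3D12_DESCRIPTOR_RANGE_TYPE_SRV;
    srvRange.NumDescriptors     = 4;   // placeholder, deduced from shader reflection
    srvRange.BaseShaderRegister = 0;
    srvRange.RegisterSpace      = 8;   // placeholder, the space assigned to this slot

    D3D12_ROOT_PARAMETER perViewSrvTable = {};
    perViewSrvTable.ParameterType    = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE;
    perViewSrvTable.DescriptorTable.NumDescriptorRanges = 1;
    perViewSrvTable.DescriptorTable.pDescriptorRanges   = &srvRange;
    perViewSrvTable.ShaderVisibility = D3D12_SHADER_VISIBILITY_PIXEL;   // a PS_ONLY slot

    // Per Draw CBV slot -> root descriptor
    D3D12_ROOT_PARAMETER perDrawCbv = {};
    perDrawCbv.ParameterType             = D3D12_ROOT_PARAMETER_TYPE_CBV;
    perDrawCbv.Descriptor.ShaderRegister = 0;
    perDrawCbv.Descriptor.RegisterSpace  = 16;  // placeholder
    perDrawCbv.ShaderVisibility          = D3D12_SHADER_VISIBILITY_ALL; // deduced from reflection

    // ... both would then go into D3D12_ROOT_SIGNATURE_DESC::pParameters
}
</pre>
<br />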
<br />
<b>Slot category "Usage"</b><br />
This category sub-divides slots into different usage patterns: Engine/Shader.<br />
Engine slots typically bind things like mesh transform constants, camera transforms, etc.<br />
Shader slots are used for shader specific stuff, e.g. material constants.<br />
I just couldn't find a more appropriate name for this category and simply went with Engine/Shader. Maybe it would be better to call them Group 0/1/2/3... in case I have different usage patterns in the future, but I won't bother with that for now...<br />
<br />
<div>
<div>
<b><span style="font-size: large;">Shader Reflection</span></b><br />
In the <a href="https://simonstechblog.blogspot.com/2019/06/msbuild-custom-build-tools-notes.html">last post</a>, I mentioned that shader reflection data is exported during shader compilation. This is important for root signature creation: from the reflection data, we know which constant buffer/texture slots are used. When creating a <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/managing-graphics-pipeline-state-in-direct3d-12">pipeline state object (PSO)</a> from shaders, we can deduce all the resource slots used in the PSO (as well as the shader visibility for constant buffers) and then create an appropriate root signature with each resource slot mapped to the corresponding root parameter index (let's call this mapping data the "root signature info").<br />
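As an illustration, here is a sketch of gathering the used slots from one shader stage via the D3D12 shader reflection interface (SlotUsage is a hypothetical bookkeeping struct; with dxc, the ID3D12ShaderReflection would be obtained from the compiled blob):<br />
<pre>
#include <d3d12shader.h>   // ID3D12ShaderReflection, D3D12_SHADER_INPUT_BIND_DESC

// Hypothetical struct recording which engine slot uses which register.
struct SlotUsage
{
    void MarkCBV(UINT space, UINT reg);
    void MarkSRV(UINT space, UINT reg);
    void MarkUAV(UINT space, UINT reg);
};

void GatherSlotUsage(ID3D12ShaderReflection* reflection, SlotUsage* usage)
{
    D3D12_SHADER_DESC shaderDesc = {};
    reflection->GetDesc(&shaderDesc);

    // Walk every resource bound by this shader and record its register space
    // (the space identifies the engine slot, see the HLSL example below).
    for (UINT i = 0; i < shaderDesc.BoundResources; ++i)
    {
        D3D12_SHADER_INPUT_BIND_DESC bind = {};
        reflection->GetResourceBindingDesc(i, &bind);

        switch (bind.Type)
        {
        case D3D_SIT_CBUFFER:     usage->MarkCBV(bind.Space, bind.BindPoint); break;
        case D3D_SIT_TEXTURE:
        case D3D_SIT_STRUCTURED:  usage->MarkSRV(bind.Space, bind.BindPoint); break;
        case D3D_SIT_UAV_RWTYPED: usage->MarkUAV(bind.Space, bind.BindPoint); break;
        default: break;
        }
    }
}
</pre>
<br />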
<br />
To specify the resource slot in shader code, we make use of the <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/resource-binding-in-hlsl">register space</a> introduced in Shader Model 5.1: the space defines which slot a constant buffer/texture belongs to. For example:<br />
<blockquote class="tr_bq">
#define ENGINE_PER_DRAW_SRV_ALL space5 // all shaders must have the same slot-space definition<br />
Texture2D shadowMap : register(t0, ENGINE_PER_DRAW_SRV_ALL);</blockquote>
With the above information, on the CPU side we can bind a resource to a specific slot using the root parameter index stored inside the "root signature info", similar to the D3D11 API.<br />
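For example, a hypothetical slot-based binding function built on top of the "root signature info" could look like this (names are illustrative, not the engine's real API):<br />
<pre>
#include <d3d12.h>

enum EngineSlot { ENGINE_PER_DRAW_SRV_PS_ONLY /*, ... the other 16 slots above */ };

struct RootSignatureInfo            // hypothetical layout of the "root signature info"
{
    UINT slotToRootParam[17];       // resource slot -> root parameter index
};

void SetPerDrawSRV_PS(ID3D12GraphicsCommandList* cmdList,
                      const RootSignatureInfo& info,
                      D3D12_GPU_DESCRIPTOR_HANDLE srvTable)
{
    // look up which root parameter index this slot got at PSO creation time
    cmdList->SetGraphicsRootDescriptorTable(
        info.slotToRootParam[ENGINE_PER_DRAW_SRV_PS_ONLY], srvTable);
}
</pre>
<br />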
<br />
<b><span style="font-size: large;">Conclusion</span></b><br />
<div>
In this post, we have described how a root signature can be automatically created and used behind a slot based API. First, the root signature is created (or re-used/shared) during the creation of a pipeline state object (PSO), based on its shader reflection data. We also create a "root signature info" that stores the mapping between resource slots and root parameter indices, kept together with the root signature and PSO. We can then use this "root signature info" to bind resources to the shaders.<br />
<br />
As this is my first time writing a graphics engine with D3D12, I am not sure whether this resource binding model is the best. I have also thought of another naming scheme for the resource slots: instead of naming them with PerDraw / PerView, would it be better to name them explicitly as RootDescriptor / DescriptorTable? Maybe I will change my mind after I gain more experience in the future...</div>
</div>
<div>
<br />
<b>Reference</b><br />
[1] <a href="https://docs.microsoft.com/en-us/windows/desktop/direct3d12/root-signatures">https://docs.microsoft.com/en-us/windows/desktop/direct3d12/root-signatures</a><br />
[2] <a href="https://developer.nvidia.com/dx12-dos-and-donts">https://developer.nvidia.com/dx12-dos-and-donts</a><br />
<br />
<br />
<br />
<br /></div>
</div>
Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-72579384101748316902019-06-14T17:11:00.003+08:002019-06-22T12:45:52.372+08:00MSBuild custom build tools notes<b><span style="font-size: large;">Introduction</span></b><br />
Recently, I have been trying to re-write my graphics code to use D3D12 (instead of the D3D11 used in Seal Guardian), and I need a convenient way to compile shader code. While tidying up the <a href="https://docs.microsoft.com/en-us/cpp/build/understanding-custom-build-steps-and-build-events?view=vs-2019">MSBuild custom build steps</a> files used in Seal Guardian for my new toy graphics engine, I regretted not writing a blog post about custom MSBuild back then. As I remember, such information was hard to find at the time and I had to look at some of the CUDA custom build files to guess how it works. So this post is just my personal notes about custom MSBuild, and I don't guarantee all the information is 100% correct. I have uploaded an example project to compile shader files <a href="https://github.com/simon-yeunglm/MSBuild">here</a>. Interested readers may also check out this <a href="http://www.reedbeta.com/blog/custom-toolchain-with-msbuild/">excellent post</a> about MSBuild written by Nathan Reed.<br />
<b><span style="font-size: large;"><br /></span></b>
<b><span style="font-size: large;">Custom build steps set up</span></b><br />
MSBuild needs a .targets file to describe how the compilers (e.g. dxc/fxc used for shader compilation) are invoked. In the uploaded <a href="https://github.com/simon-yeunglm/MSBuild">example project</a>, we have 3 main targets: DXC, JSON, BIN.<br />
<br />
- DXC target: as described by its name, invokes dxc.exe to compile HLSL files.<br />
- JSON target: invokes shaderCompiler.exe, our internal tool written using Flex & Bison, to parse the shader source code and output some meta data, like texture/constant buffer usage for root signature management.<br />
- BIN target: a task that depends on the DXC and JSON tasks; it invokes dataBuilder.exe, our internal tool for data serialization/deserialization into our binary format, combining the outputs from the DXC and JSON tasks.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiy-4YwSqr5PiPhgq87GshQkzw_RoUf-KjJkOGbwG5bk14Z_BQKKJzn2zbkNT4vP0xoEF6R7MAUK3418RJfhjFzz_rQeI80M4PmVVd4grRgIO7Pn-rtLH9T3nmdaehicQh_-oxTnpMbhvDT/s1600/targets.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="328" data-original-width="577" height="181" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiy-4YwSqr5PiPhgq87GshQkzw_RoUf-KjJkOGbwG5bk14Z_BQKKJzn2zbkNT4vP0xoEF6R7MAUK3418RJfhjFzz_rQeI80M4PmVVd4grRgIO7Pn-rtLH9T3nmdaehicQh_-oxTnpMbhvDT/s320/targets.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">target dependency</td></tr>
</tbody></table>
<br />
Although MSBuild can set up target dependencies, it looks like independent targets are not executed in parallel. In Seal Guardian, when compiling the surface shaders which generate the shader permutations for lighting, this results in a long compilation time. In the end, I had to create another exe that launches multiple threads to speed up shader compilation. Maybe I was setting up MSBuild incorrectly; if anyone knows how to parallelize it, please let me know in the comments below. Thank you!<br />
<br />
<b><span style="font-size: large;">Incremental Builds</span></b><br />
MSBuild uses .tlog files to track file modifications to avoid unnecessary compilation (they also affect which files get deleted when cleaning the project). There are 2 tlog files (read.1.tlog and write.1.tlog): one tracks whether the source files are modified, and the other tracks whether the output file is up to date. We can simply use the <a href="https://docs.microsoft.com/en-us/visualstudio/msbuild/writelinestofile-task?view=vs-2019">WriteLinesToFile</a> task to mark such a dependency, e.g.<br />
<br />
<div style="text-align: center;">
<WriteLinesToFile File="$(TLogLocation)$(ProjectName).write.1.tlog" Lines="$(TLog_writelines)" /></div>
<br />
But doing only this will make the tlog file grow larger and larger after every compilation. So it is better to read the tlog file content into a <a href="https://docs.microsoft.com/en-us/visualstudio/msbuild/propertygroup-element-msbuild?view=vs-2019">PropertyGroup</a> and check whether the file already contains the text we would like to write, using a <a href="https://docs.microsoft.com/en-us/visualstudio/msbuild/target-element-msbuild?view=vs-2019">"Condition"</a> inside the <a href="https://docs.microsoft.com/en-us/visualstudio/msbuild/writelinestofile-task?view=vs-2019">WriteLinesToFile</a> task. For details, please take a look at the <a href="https://github.com/simon-yeunglm/MSBuild">example project</a>.<br />
<br />
Also, as a side note, do not include the <a href="https://docs.microsoft.com/en-us/visualstudio/msbuild/msbuild-reserved-and-well-known-properties?view=vs-2019">$(MSBuildProjectFile)</a> property in the "Inputs" element inside the "Target" task. I did this accidentally and it caused the whole project to recompile all the shaders every time a shader file was added to / removed from the project. This is unnecessary as most of the shader files are independent.<br />
<br />
<b><span style="font-size: large;">Output files</span></b><br />
Like every Visual Studio project, our example project has Debug and Release configurations. After executing the BIN task described above, we also use a <a href="https://docs.microsoft.com/en-us/visualstudio/msbuild/copy-task?view=vs-2019">Copy task</a> to copy the compiled shaders from the Debug/Release <a href="https://docs.microsoft.com/en-us/cpp/build/reference/common-macros-for-build-commands-and-properties?view=vs-2019">$(OutDir)</a> directory to our content directory. We can also use the <a href="https://docs.microsoft.com/en-us/visualstudio/msbuild/property-functions?view=vs-2019#msbuild-makerelative">Property Function MakeRelative()</a> to maintain the directory hierarchy in the output directory. This is another reason why I use a Copy task instead of pointing $(OutDir) at the content directory: I cannot get a nested Property Function working inside the .props file (or maybe I did something wrong? I don't know...)...<br />
<br />
Also, besides output files, another note is about the output log. If we want to write something to the output console of Visual Studio from a custom exe (e.g. shaderCompiler.exe/dataBuilder.exe in the example project), the text must be in a specific format like the following (I cannot find documentation of the exact format; this is just guessed from similar messages emitted by Visual Studio...):<br />
<br />
<div style="text-align: center;">
1>C:/shaderFileName.hlsl(123): error : error message</div>
<br />
otherwise, the message will not get displayed in the output window.<br />
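For example, a custom C++ tool could emit such a line like this (a small sketch; the function name is just for illustration):<br />
<pre>
#include <cstdio>

// Emit a diagnostic in the format Visual Studio's output window parses:
//   file(line): severity : message
// Double-clicking the line in the output window then jumps to the location.
void ReportShaderError(const char* file, int line, const char* message)
{
    // e.g. "C:/shaderFileName.hlsl(123): error : error message"
    fprintf(stdout, "%s(%d): error : %s\n", file, line, message);
}
</pre>
<br />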
<br />
<br />
<b><span style="font-size: large;">Example project</span></b><br />
An example project is uploaded <a href="https://github.com/simon-yeunglm/MSBuild">here</a>. It compiles vertex/pixel shaders with dxc.exe, outputs JSON meta data to $(IntDir), then combines those data and writes them to $(OutDir). Finally, those files are copied to the asset directory with the corresponding relative path to the source directory. Please note that the shaderCompiler.exe used for outputting meta-data is an internal tool, which has some custom restrictions on the HLSL grammar for my convenience when creating root signatures. It is used just as an example to illustrate how to set up a custom MSBuild tool; feel free to replace/modify those files to suit your own needs. Thank you.<br />
<br />
<span style="font-size: xx-small;"><b>Reference</b><br /><span style="font-size: x-small;">[1] </span></span><a href="https://docs.microsoft.com/en-us/cpp/build/understanding-custom-build-steps-and-build-events?view=vs-2019">https://docs.microsoft.com/en-us/cpp/build/understanding-custom-build-steps-and-build-events?view=vs-2019</a><br />
<span style="font-size: x-small;">[2] </span><a href="http://www.reedbeta.com/blog/custom-toolchain-with-msbuild/">http://www.reedbeta.com/blog/custom-toolchain-with-msbuild/</a><br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-77209557183729231052018-10-07T18:04:00.002+08:002018-10-07T18:04:29.188+08:00Testing Hot-Reload DLL on Windows<b><span style="font-size: large;">Introduction</span></b><br />
After finishing the game <a href="https://store.steampowered.com/app/741620/Seal_Guardian/">Seal Guardian</a> and taking some rest, I recently got back to refactoring its engine code. In this game, the engine has the ability to hot-reload every asset type, from textures, shaders and game levels to Lua scripts, but it lacks the ability to hot-reload C/C++ files. So I decided to spend some time finding <a href="https://github.com/RuntimeCompiledCPlusPlus/RuntimeCompiledCPlusPlus/wiki/Alternatives">resources</a> about hot-reloading C/C++. It turns out hot-reloading C/C++ is not that trivial on Windows, as the PDB <a href="https://ourmachinery.com/post/dll-hot-reloading-in-theory-and-practice/">file</a> <a href="https://ourmachinery.com/post/little-machines-working-together-part-2/">is</a> <a href="https://blog.molecular-matters.com/2017/05/09/deleting-pdb-files-locked-by-visual-studio/">locked</a>. I found <a href="https://github.com/fungos/cr">this approach</a> of patching the PDB path inside the DLL interesting, so I gave it a try; the sample program is uploaded <a href="https://github.com/simon-yeunglm/HotReload">here</a> (only tested with Visual Studio Community 2017).<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdqt_jmAgRvLSOsflonEo8kOyedFqjsMAlhLQWCS1Zs5x5etMBzqNsslp1PQc_UdiYweswtV0jBjM5o6NsfTMYmE97_GYSvhj0KP-FLfQBeHMDLFYObjBboKT3ekBMcktubsZzr-5CGV4w/s1600/hot_reload.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="450" data-original-width="800" height="360" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdqt_jmAgRvLSOsflonEo8kOyedFqjsMAlhLQWCS1Zs5x5etMBzqNsslp1PQc_UdiYweswtV0jBjM5o6NsfTMYmE97_GYSvhj0KP-FLfQBeHMDLFYObjBboKT3ekBMcktubsZzr-5CGV4w/s640/hot_reload.gif" width="640" /></a></div>
<br />
<span id="goog_433972302"></span><br />
<b><span style="font-size: large;">First try</span></b><br />
Because the PDB file path is hard coded inside the DLL file, the approach used by <a href="https://github.com/fungos/cr">cr.h</a> is to parse the DLL file to find the <a href="http://www.debuginfo.com/articles/debuginfomatch.html">PDB file path</a> and replace it with another new file path, following the <a href="https://msdn.microsoft.com/en-us/library/ms809762.aspx">Portable Executable format</a>.<br />
<br />
So I tried something similar, but unlike cr.h, which generates a new DLL/PDB file name every time the DLL gets re-compiled, I use a fixed temporary name (I don't want many random files inside the binary directory after several hot reloads...). For example, when Visual Studio generates the files:<br />
<ul>
<li>abc.dll</li>
<li>abc.pdb</li>
</ul>
the sample program will detect that abc.dll is updated and will generate 2 new files:<br />
<ul>
<li>ab_.dll</li>
<li>ab_.pdb</li>
</ul>
where ab_.dll has a patched PDB path pointing to the newly copied ab_.pdb, and the program loads ab_.dll instead.<br />
<br />
The reason I don't choose a more meaningful name like abc_tmp.dll is that I worry a file name longer than the original may mess up the offset values stored inside the DLL. So I just replace the last character with an underscore.<br />
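As an illustration, a naive C++ sketch of this same-length patch trick (a robust implementation should walk the PE debug directory as cr.h does, instead of a plain byte search):<br />
<pre>
#include <algorithm>
#include <string>
#include <vector>

// Find the embedded PDB path in the copied DLL bytes and overwrite it with
// the new path. Keeping both paths the same length keeps all the offsets
// stored inside the DLL valid.
bool PatchPdbPath(std::vector<char>& dllBytes,
                  const std::string& oldPath,   // e.g. "C:/bin/abc.pdb"
                  const std::string& newPath)   // e.g. "C:/bin/ab_.pdb"
{
    if (oldPath.size() != newPath.size())
        return false;

    auto it = std::search(dllBytes.begin(), dllBytes.end(),
                          oldPath.begin(), oldPath.end());
    if (it == dllBytes.end())
        return false;

    std::copy(newPath.begin(), newPath.end(), it);
    return true;
}
</pre>
<br />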
<br />
This approach works: every time I start without the debugger by pressing Ctrl+F5 in Visual Studio, then edit some code and re-build the solution by pressing F7, the DLL gets hot-reloaded. When the sample program exits, the ab_.dll and ab_.pdb files get deleted.<br />
<br />
However, when the program quits with a debugger attached, it can't delete the ab_.pdb file...<br />
<br />
<b><span style="font-size: large;">Second try</span></b><br />
We know that the Visual Studio debugger is locking the PDB file. What if, when we detect that a <a href="https://msdn.microsoft.com/en-us/library/windows/desktop/ms680345(v=vs.85).aspx">debugger is attached</a>, we detach it programmatically before the program exits? Luckily the <a href="https://docs.microsoft.com/en-us/dotnet/api/envdte?view=visualstudiosdk-2017">EnvDTE COM library</a> can help with this task, and someone has written sample code to do <a href="https://handmade.network/forums/wip/t/1479-sample_code_to_programmatically_attach_visual_studio_to_a_process">this</a> (although that sample code says we need to change the "VisualStudio.DTE" string to the installed version like "VisualStudio.DTE.14.0", I have tested it with Visual Studio Community 2017 and it works without modification). So, by detaching the debugger programmatically, we can delete the temporary PDB file when the program exits.<br />
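A minimal C++ sketch of the detach call, assuming the EnvDTE type library import commonly used in automation samples (the linked sample additionally matches the correct DTE instance for the current process; this sketch just grabs the first one in the running object table):<br />
<pre>
#include <windows.h>
#include <oleauto.h>
#import "libid:80cc9f66-e7d8-4ddd-85b6-d9e6cd0e93e2" version("8.0") lcid("0") raw_interfaces_only named_guids
// ^ assumed EnvDTE type library id, as used in typical VS automation samples

void DetachDebugger()
{
    CoInitialize(nullptr);
    CLSID clsid;
    if (SUCCEEDED(CLSIDFromProgID(L"VisualStudio.DTE", &clsid)))
    {
        IUnknown* unk = nullptr;
        if (SUCCEEDED(GetActiveObject(clsid, nullptr, &unk)))
        {
            EnvDTE::_DTE* dte = nullptr;
            if (SUCCEEDED(unk->QueryInterface(&dte)))
            {
                EnvDTE::Debugger* debugger = nullptr;
                if (SUCCEEDED(dte->get_Debugger(&debugger)) && debugger)
                {
                    debugger->DetachAll();   // unlocks the PDB files
                    debugger->Release();
                }
                dte->Release();
            }
            unk->Release();
        }
    }
    CoUninitialize();
}
</pre>
<br />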
<br />
<b><span style="font-size: large;">Third try</span></b><br />
Now that we can detach the debugger programmatically, why not try re-attaching it after every hot reload? With the re-attach code written, I tried running the program by pressing F5 (Start Debugging) and then pressing F7 to re-compile the solution. A dialog popped up:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgrtP1beoGcSp7wewauUjryHOYnI2gH7m-dyW6QOwPqlYo9nDbjGMDVCGFJem2_tLRg8xFkEXfHQ8wPvLFM1_AbHvLYEziCAXS2uqpAWP_DZ414Rn1HEUZuJMIRONdcjxQvPO06nLVhwfg/s1600/stop_debugging.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="152" data-original-width="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgrtP1beoGcSp7wewauUjryHOYnI2gH7m-dyW6QOwPqlYo9nDbjGMDVCGFJem2_tLRg8xFkEXfHQ8wPvLFM1_AbHvLYEziCAXS2uqpAWP_DZ414Rn1HEUZuJMIRONdcjxQvPO06nLVhwfg/s1600/stop_debugging.png" /></a></div>
<br />
And I happily pressed 'Yes' hoping the hot-reload would work. Do you know what happened? The debugger stopped, but the application also quit... It looks like this approach can only work when using Ctrl+F5 (Start without debugger)... I searched the web for how to stop the debugger from killing the app when it stops, but I could only find people suggesting to detach the debugger instead. So I worked around this problem by detaching the debugger and re-attaching it during program start, to avoid the debugger killing the app when it stops.<br />
<br />
So, the hot-reload function is almost working now: just press F5 to start and F7+Enter to re-compile. But sometimes the debugger fails to re-attach to the reloaded app. After spending some time investigating the issue, it turns out the EnvDTE::Process::Item() function may fail to find the reloaded app process, returning the error code RPC_E_CALL_REJECTED. I don't know why this happens; maybe the process is busy reloading the new DLL. So the final workaround is to wait a bit, let the process finish its work, and re-try several times.<br />
<br />
<b><span style="font-size: large;">Fourth try</span></b><br />
We know that detaching the debugger will unlock the PDB. What if we just detach the debugger to unlock the PDB and only copy the newly compiled DLL, without patching a new PDB path? Unfortunately, it fails, saying that the .vcxproj file is locked...<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg502VRDq3FuI-Brs37m_tLgxoukRU9XbLlvwkb4wf8SD8nTZMHQ-K8sCi_butgqcT6YEYSQXpRsic83MNOKuB5KoNbUvItvSL4DfiWfdZPuPMrsendUcC4zb67JVsTZFaqT8ns1ltCnxfT/s1600/lock_vcxproj.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="117" data-original-width="821" height="89" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg502VRDq3FuI-Brs37m_tLgxoukRU9XbLlvwkb4wf8SD8nTZMHQ-K8sCi_butgqcT6YEYSQXpRsic83MNOKuB5KoNbUvItvSL4DfiWfdZPuPMrsendUcC4zb67JVsTZFaqT8ns1ltCnxfT/s640/lock_vcxproj.png" width="640" /></a></div>
<br />
So I can only revert back to the "Third try" approach...<br />
<br />
<b><span style="font-size: large;">Last try</span></b><br />
We finally have a workable approach to reload the DLL; how about the executable itself? So I tried the <a href="https://docs.microsoft.com/en-us/visualstudio/debugger/edit-and-continue-visual-cpp?view=vs-2017">"edit and continue"</a> function in Visual Studio. And it works! But only once... because after edit and continue, stopping the debugger makes Visual Studio kill the app... And when manually detaching the debugger from Visual Studio, it fails with:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3X8hTbwR50g51lE_uGMGzCe8UnesNZwWWW7X8ZY-wWaok7vauRP9q_R7-yagmv190Qpsg8uVtRaLBDlQ6xucZPXWffswXRHjLRWGAtExHQKykI5FPkzJyKE1C5_kYFKCfF3fzWtuGkC71/s1600/detach_fail.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="159" data-original-width="406" height="124" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh3X8hTbwR50g51lE_uGMGzCe8UnesNZwWWW7X8ZY-wWaok7vauRP9q_R7-yagmv190Qpsg8uVtRaLBDlQ6xucZPXWffswXRHjLRWGAtExHQKykI5FPkzJyKE1C5_kYFKCfF3fzWtuGkC71/s320/detach_fail.png" width="320" /></a></div>
<br />
So, "edit and continue" function does not compatible with my hot-reload method which relies on detaching the debugger...<br />
<br />
<b><span style="font-size: large;">Conclusion</span></b><br />
In this post, I have described the methods I tried when writing hot-reloadable DLL code on Windows. The steps are as follows:<br />
<br />
When the program loads a DLL:<br />
<blockquote class="tr_bq">
1. Copy its associated PDB file.<br />
2. Copy the target DLL file and patch its hard coded PDB path to the path of the PDB copied in step 1.<br />
3. Load the DLL copied in step 2 instead.</blockquote>
After editing some code:<br />
<blockquote class="tr_bq">
4. Detach the debugger so the DLL can be re-compiled from Visual Studio.<br />
5. Unload the copied DLL.<br />
6. Repeat the above steps 1 to 3.<br />
7. Re-attach the debugger.</blockquote>
From the programmer's perspective, the steps are:<br />
<blockquote class="tr_bq">
1. In Visual Studio, press F5 to compile and run the program with debugger.<br />
2. Edit some code, then press F7 to re-build the solution.<br />
3. Press enter to confirm the "Do you want to stop debugging?" dialog.<br />
4. The program will reload the new DLL and re-attach the debugger automatically after compilation.</blockquote>
You can try the above workflow by downloading the <a href="https://github.com/simon-yeunglm/HotReload">sample code</a>. I have only tested it with Visual Studio Community 2017 and it may not work with other versions of Visual Studio. This method is far from perfect; if anyone knows a better method that doesn't require workarounds, please let me know. Thank you very much!<br />
<br />
<b>Reference</b><br />
<span style="font-size: x-small;">[1] <a href="https://github.com/RuntimeCompiledCPlusPlus/RuntimeCompiledCPlusPlus/wiki/Alternatives">https://github.com/RuntimeCompiledCPlusPlus/RuntimeCompiledCPlusPlus/wiki/Alternatives</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://ourmachinery.com/post/dll-hot-reloading-in-theory-and-practice/">https://ourmachinery.com/post/dll-hot-reloading-in-theory-and-practice/</a></span><br />
<span style="font-size: x-small;">[3] <a href="https://ourmachinery.com/post/little-machines-working-together-part-2/">https://ourmachinery.com/post/little-machines-working-together-part-2/</a></span><br />
<span style="font-size: x-small;">[4] <a href="https://blog.molecular-matters.com/2017/05/09/deleting-pdb-files-locked-by-visual-studio/">https://blog.molecular-matters.com/2017/05/09/deleting-pdb-files-locked-by-visual-studio/</a></span><br />
<span style="font-size: x-small;">[5] <a href="https://github.com/fungos/cr">https://github.com/fungos/cr</a></span><br />
<span style="font-size: x-small;">[6] <a href="http://www.debuginfo.com/articles/debuginfomatch.html">http://www.debuginfo.com/articles/debuginfomatch.html</a></span><br />
<span style="font-size: x-small;">[7] <a href="https://msdn.microsoft.com/en-us/library/ms809762.aspx">https://msdn.microsoft.com/en-us/library/ms809762.aspx</a></span><br />
<span style="font-size: x-small;">[8] <a href="https://handmade.network/forums/wip/t/1479-sample_code_to_programmatically_attach_visual_studio_to_a_process">https://handmade.network/forums/wip/t/1479-sample_code_to_programmatically_attach_visual_studio_to_a_process</a></span><br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-17540395233239195822018-06-11T02:12:00.000+08:002018-06-11T02:12:33.349+08:00Simple GPU Path Tracer<span style="font-size: large;"><b>Introduction</b></span><br />
Path tracing has been getting more popular in recent years. And because the algorithm is easy to run in parallel, making the path tracer run on the GPU can greatly reduce the rendering time. This post is just my personal notes about learning the basics of path tracing and making myself familiar with the D3D12 API. The source code can be downloaded <a href="https://github.com/simon-yeunglm/PathTracer">here</a>. And for those who don't want to compile from source, the executable can be downloaded <a href="https://drive.google.com/file/d/1b3QiAn6mtunOHfad8NhGBcdfzH_VQxMv/view?usp=sharing">here</a>.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBeft3XKyfmaH80f6dXhlp3yI8j9Nkj6Lxk4202heQtwpKleMCQOf1qYyiYpcqbDB7x03SojNovJPs5jWKjOlPrFH_oUc9W5ZS4M8PZUMmyAVws7sI_tLSQnAjwDuNn3fxcxMWzjQRPleD/s1600/traced_result.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBeft3XKyfmaH80f6dXhlp3yI8j9Nkj6Lxk4202heQtwpKleMCQOf1qYyiYpcqbDB7x03SojNovJPs5jWKjOlPrFH_oUc9W5ZS4M8PZUMmyAVws7sI_tLSQnAjwDuNn3fxcxMWzjQRPleD/s320/traced_result.png" width="320" /></a></div>
<br />
<b><span style="font-size: large;">Rendering Equation</span></b><br />
Like other rendering algorithms, path tracing solves the rendering equation:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjS1wBDV8krXLkY_K7jCPKEWA1wKqzMUVX4Sch1EAX1x2jg3hDKiLnkxR29UTM14MKnD5WTRnjq66wPMWp0hkd5oNAo1N3_HT1ZaHm2mMkeu84f524P8qGZyLnidcj7WCSTY3a-mBkq9e5w/s1600/render_eqt.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="84" data-original-width="1186" height="43" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjS1wBDV8krXLkY_K7jCPKEWA1wKqzMUVX4Sch1EAX1x2jg3hDKiLnkxR29UTM14MKnD5WTRnjq66wPMWp0hkd5oNAo1N3_HT1ZaHm2mMkeu84f524P8qGZyLnidcj7WCSTY3a-mBkq9e5w/s640/render_eqt.png" width="640" /></a></div>
<br />
To solve this integral, Monte Carlo integration can be used: we shoot many rays through each pixel from the camera position.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfsaQYBEsyss-LGHrfEUj3pGAjldxmsKYWQolQfeZV-2FNLavuiMvcT5ygkPBcftiNXSeHAvDJ1uvW1bFyE6c6bMjntXZWR8Q-Na7SUoAv0J2om6xAj_nYNgm2FkVqtZvAa8vE9KxLBLZI/s1600/render_eqt_monte_carlo.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="140" data-original-width="1402" height="60" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfsaQYBEsyss-LGHrfEUj3pGAjldxmsKYWQolQfeZV-2FNLavuiMvcT5ygkPBcftiNXSeHAvDJ1uvW1bFyE6c6bMjntXZWR8Q-Na7SUoAv0J2om6xAj_nYNgm2FkVqtZvAa8vE9KxLBLZI/s640/render_eqt_monte_carlo.png" width="640" /></a></div>
<br />
During path tracing, when a ray hits a surface, we accumulate its light emission as well as the reflected light of that surface, i.e. we compute the rendering equation. But we take only one sample in the Monte Carlo integration, so only 1 random ray is generated according to the surface normal, which simplifies the equation to:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi04WIPcwM4LsUjKcoGkD6H6xwMabBf0zis-2Tz1gjjjwvDZN3Mfqh_D_kwMqMW9wyjYtqvUdJE9xPydgoOWkgLTanTdxBMVEc22AoGGkP6jStByF9vrso7prBzUQfVIISHcZlBscgaXSun/s1600/render_eqt_1_sample.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="146" data-original-width="533" height="87" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi04WIPcwM4LsUjKcoGkD6H6xwMabBf0zis-2Tz1gjjjwvDZN3Mfqh_D_kwMqMW9wyjYtqvUdJE9xPydgoOWkgLTanTdxBMVEc22AoGGkP6jStByF9vrso7prBzUQfVIISHcZlBscgaXSun/s320/render_eqt_1_sample.png" width="320" /></a></div>
<br />
Since we shoot many rays within a single pixel, we still get an un-biased result. Expanding the recursive path tracing rendering equation, we can derive the following:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimE9AwO_MW_GnIba53RVFvruduX4jDMGnBoEazNzSKV_YaBaNRKC0Z9dqGOoL_z_oFgyznSiOF2efT7h7naq8tEVN9NYxsC4JOb6XmKdSycilYZB8eVI7GMXsepAG6IuBk2sVbBCRNqg-L/s1600/render_eqt_expand.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="782" data-original-width="1102" height="283" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEimE9AwO_MW_GnIba53RVFvruduX4jDMGnBoEazNzSKV_YaBaNRKC0Z9dqGOoL_z_oFgyznSiOF2efT7h7naq8tEVN9NYxsC4JOb6XmKdSycilYZB8eVI7GMXsepAG6IuBk2sVbBCRNqg-L/s400/render_eqt_expand.png" width="400" /></a></div>
<br />
<b style="font-size: x-large;">GPU random number</b><br />
To compute the Monte Carlo integration, we need to generate random numbers on the GPU. The <a href="http://reedbeta.com/blog/quick-and-easy-gpu-random-numbers-in-d3d11/">wang_hash</a> is used here due to its simple implementation.<br />
<ol>
<li>uint wang_hash(uint seed)</li>
<li>{</li>
<li> seed = (seed ^ 61) ^ (seed >> 16);</li>
<li> seed *= 9;</li>
<li> seed = seed ^ (seed >> 4);</li>
<li> seed *= 0x27d4eb2d;</li>
<li> seed = seed ^ (seed >> 15);</li>
<li> return seed;</li>
<li>}</li>
</ol>
We use the pixel index as the input for the wang_hash function.<br />
<blockquote class="tr_bq">
seed = px_pos.y * viewportSize.x + px_pos.x</blockquote>
However, there is a visible pattern in the random noise texture generated this way (although it does not affect the final render result much...):<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbrW-b5G0LDVDa2e99D-_1_15aGSmheRhY_QFVACmt4aq0MNkXh3SJqUhDi6ruTw4oUQC13VH8ULAPkCne9mrNWv-FTt20SU37JCymNqxThzpTbaz1x6jFxjHZu_DbaQ417iEYK-qnd5aC/s1600/noise_no_fix.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbrW-b5G0LDVDa2e99D-_1_15aGSmheRhY_QFVACmt4aq0MNkXh3SJqUhDi6ruTw4oUQC13VH8ULAPkCne9mrNWv-FTt20SU37JCymNqxThzpTbaz1x6jFxjHZu_DbaQ417iEYK-qnd5aC/s320/noise_no_fix.png" width="320" /></a></div>
<br />
<br />
Luckily, to fix this, we can simply multiply the pixel index by a constant (100 in this case), which eliminates the visible pattern in the random texture.<br />
<blockquote class="tr_bq">
seed = (px_pos.y * viewportSize.x + px_pos.x) * 100 </blockquote>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmEPYx0BnyH_rpFqapvnM-UHZNgXcfvig5Cyqx3UvctW-OxJn7jc8DCE6joXRuyyR3eZC9QNV4OIs5DAYiDcavNzVd-E48QrYAefvLxtDYwK0Ck-P18CUpcItnjmm4UZFov4S2uXhJ498E/s1600/noise_with_fix.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmEPYx0BnyH_rpFqapvnM-UHZNgXcfvig5Cyqx3UvctW-OxJn7jc8DCE6joXRuyyR3eZC9QNV4OIs5DAYiDcavNzVd-E48QrYAefvLxtDYwK0Ck-P18CUpcItnjmm4UZFov4S2uXhJ498E/s320/noise_with_fix.png" width="320" /></a></div>
<br />
To generate multiple random numbers within the same pixel, we add a constant to the random seed after each call to the wang_hash function. Any constant larger than 0 (e.g. 10) is good enough for this simple path tracer.<br />
<ol>
<li>float rand(inout uint seed)</li>
<li>{</li>
<li> float r= wang_hash(seed) * (1.0 / 4294967296.0);</li>
<li> seed+= 10;</li>
<li> return r;</li>
<li>}</li>
</ol>
<b><span style="font-size: large;">Scene Storage</span></b><br />
To trace rays on the GPU, I upload all the scene data (e.g. triangles, materials, lights...) into several structured buffers and constant buffers. Due to my laziness and the announcement of <a href="https://blogs.msdn.microsoft.com/directx/2018/03/19/announcing-microsoft-directx-raytracing/">DirectX Raytracing</a>, I did not implement any ray tracing acceleration structure like a BVH; I just store the triangles in one big buffer.<br />
<br />
<b><span style="font-size: large;">Tracing Rays</span></b><br />
Using the rendering equation derived above, we can start writing code to shoot rays from the camera. In each frame, for each pixel, we trace one ray and reflect it multiple times to compute the rendering equation. We can then additively blend the path traced results over multiple frames to get a progressive path tracer, using the following blend factor:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijNk87blMx4S0TKk0v2GRneX8TUL1bZVT3IraYsNYUaAoQRBg1r583JxeliKEUHjoppPVOECLcQILQzWjfYVyFh0Z66xt4vgVKxROJ3W234c5OfnqexyivIrd7pZRILiTRrvMWp3YDdRIY/s1600/blend_factor.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="140" data-original-width="1403" height="60" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijNk87blMx4S0TKk0v2GRneX8TUL1bZVT3IraYsNYUaAoQRBg1r583JxeliKEUHjoppPVOECLcQILQzWjfYVyFh0Z66xt4vgVKxROJ3W234c5OfnqexyivIrd7pZRILiTRrvMWp3YDdRIY/s640/blend_factor.png" width="640" /></a></div>
<br />
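For reference, a minimal C++ sketch of this progressive accumulation, assuming the usual running average where frame N is blended with weight 1/(N+1) (the demo's exact blend factor is the one shown in the image above):<br />
<pre>
// Running-average accumulation, per color channel. Assumes frameIdx starts
// at 0, so frame N is blended over the history with weight 1/(N+1).
float accumulate(float history, float newSample, unsigned int frameIdx)
{
    float blend = 1.0f / float(frameIdx + 1);
    return history * (1.0f - blend) + newSample * blend;
}
</pre>
<br />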
To generate the random reflected direction at a surface the ray hits, we simply uniformly sample a direction on the hemisphere around the surface normal:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgezk3Bpd1AYWuVDYlbQXMoPKD63TVlPGePQrK-tkztb77iQb-JUNZbHwMsolxK7WArVmd5w75KXoyfruwtRMqybeL-GgxhcztSFdKPqwzEUUZU99lFsEy8O4r5NcRfsg0hxiUZhuKMAkJa/s1600/sample_uniform.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="393" data-original-width="1038" height="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgezk3Bpd1AYWuVDYlbQXMoPKD63TVlPGePQrK-tkztb77iQb-JUNZbHwMsolxK7WArVmd5w75KXoyfruwtRMqybeL-GgxhcztSFdKPqwzEUUZU99lFsEy8O4r5NcRfsg0hxiUZhuKMAkJa/s400/sample_uniform.png" width="400" /></a></div>
<br />
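As a concrete reference, here is a CPU-side C++ sketch of uniform hemisphere sampling (pdf = 1/2&pi;) around the surface normal, mirroring what the HLSL shader would do; the float3 helpers are minimal stand-ins for the HLSL intrinsics:<br />
<pre>
#include <cmath>

struct float3 { float x, y, z; };

static float3 mul(float3 v, float s)  { return { v.x*s, v.y*s, v.z*s }; }
static float3 add3(float3 a, float3 b, float3 c)
{ return { a.x+b.x+c.x, a.y+b.y+c.y, a.z+b.z+c.z }; }
static float3 cross(float3 a, float3 b)
{ return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; }
static float3 normalize(float3 v)
{ float l = sqrtf(v.x*v.x + v.y*v.y + v.z*v.z); return mul(v, 1.0f/l); }

// Uniformly sample a direction on the hemisphere around normal n
// (pdf = 1/(2*pi)); u1/u2 are two uniform random numbers in [0,1) from the
// rand() function shown earlier.
float3 sampleHemisphereUniform(float3 n, float u1, float u2)
{
    float cosTheta = u1;   // uniform in cos(theta) gives uniform solid angle
    float sinTheta = sqrtf(fmaxf(0.0f, 1.0f - cosTheta*cosTheta));
    float phi      = 6.28318530f * u2;
    // local direction with +z along the normal
    float3 local = { sinTheta*cosf(phi), sinTheta*sinf(phi), cosTheta };
    // build an orthonormal basis around n and transform to world space
    float3 up = (fabsf(n.z) < 0.999f) ? float3{0,0,1} : float3{1,0,0};
    float3 t  = normalize(cross(up, n));
    float3 b  = cross(n, t);
    return add3(mul(t, local.x), mul(b, local.y), mul(n, local.z));
}
</pre>
<br />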
Here is the result of the path tracer when using uniform random directions and an emissive light material. The result is quite noisy:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9JWF1lzEGuCesOfXm66P2-zZC4CmaY85TUTi76tB9aIMDHJINK4-9DunXMRgOYdGs64zlS_XeSHS2Y93QBZ8t-bDw9hrKF_tInQiWuybawwCXGUss17otsjN_m8leM7ulrR4GLBg0317n/s1600/implicit_uniform_64.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9JWF1lzEGuCesOfXm66P2-zZC4CmaY85TUTi76tB9aIMDHJINK4-9DunXMRgOYdGs64zlS_XeSHS2Y93QBZ8t-bDw9hrKF_tInQiWuybawwCXGUss17otsjN_m8leM7ulrR4GLBg0317n/s320/implicit_uniform_64.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Uniform implicit light sampling, 64 sample per pixel</td></tr>
</tbody></table>
<br />
To reduce the noise, we can weight the randomly reflected rays with a cosine factor, similar to a Lambert diffuse surface:<br />
<table>
<tbody>
<tr>
<td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkNyXpjrn2RMLgkSfCxkX002KUdioBYuwdnIp1ADamhbZ_DLsZCBUTakjraQISs8ObIFBibnpd2Vc3qPv2pQvt2O3IKPtPLcg_2BF22jfHloUB5kkayHgNJU7jG7lag9UAkrVkvAMQ08aK/s1600/sample_cos.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="398" data-original-width="1052" height="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkNyXpjrn2RMLgkSfCxkX002KUdioBYuwdnIp1ADamhbZ_DLsZCBUTakjraQISs8ObIFBibnpd2Vc3qPv2pQvt2O3IKPtPLcg_2BF22jfHloUB5kkayHgNJU7jG7lag9UAkrVkvAMQ08aK/s400/sample_cos.png" width="400" /></a></div>
<br /></td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVYV8Pls2dEcYsLESZRxndF4g7kwH85x-mvwuqmklqHWMSCMar-9VjsyKKahyphenhyphena-_v31yaIaJWvCSc-OEMMUfZdNUdAQ-hUGxaNtJrtrnFiOrY7uZuU1wo3_7AQRgcg_BXETO0KjWk8PK1L/s1600/implicit_cos_64.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVYV8Pls2dEcYsLESZRxndF4g7kwH85x-mvwuqmklqHWMSCMar-9VjsyKKahyphenhyphena-_v31yaIaJWvCSc-OEMMUfZdNUdAQ-hUGxaNtJrtrnFiOrY7uZuU1wo3_7AQRgcg_BXETO0KjWk8PK1L/s320/implicit_cos_64.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Cos weighted implicit light sampling, 64 sample per pixel</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
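Assuming the cosine weighting is realized as cosine-weighted hemisphere sampling (pdf = cos&theta;/&pi;, so the pdf cancels the cosine term of the rendering equation), only the &theta; distribution changes from the uniform sketch above (float3 helpers reused from there):<br />
<pre>
// Cosine-weighted hemisphere sampling (pdf = cos(theta)/pi).
float3 sampleHemisphereCosine(float3 n, float u1, float u2)
{
    float cosTheta = sqrtf(u1);                       // instead of cosTheta = u1
    float sinTheta = sqrtf(fmaxf(0.0f, 1.0f - u1));
    float phi      = 6.28318530f * u2;
    float3 local = { sinTheta*cosf(phi), sinTheta*sinf(phi), cosTheta };
    float3 up = (fabsf(n.z) < 0.999f) ? float3{0,0,1} : float3{1,0,0};
    float3 t  = normalize(cross(up, n));
    float3 b  = cross(n, t);
    return add3(mul(t, local.x), mul(b, local.y), mul(n, local.z));
}
</pre>
<br />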
The result is still a bit noisy. Because the light source in our scene is not very large, the probability of a randomly reflected ray hitting the light source is quite low. To improve this, we can explicitly sample the light source for every ray that hits a surface.<br />
<br />
To sample a rectangular light source, we can randomly choose a point over its surface area, and the corresponding probability density will be:<br />
<blockquote class="tr_bq">
1/area of light</blockquote>
Since our light sampling is over the area domain instead of the direction domain as stated in the above equation, the rendering equation needs to be multiplied by the Jacobian that relates solid angle to area, i.e.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghDX4ndW7T21UnrZNAZZ0ItbBw73TGWpIsNlapHishSW6xTLrhVvf5Pxs5hyZ53ju4DiUAT-xTwGtwhyXEZKybNkSG1_HmjNIxLCDRiyNH8QIt7s0mwXGQ8Ce0m7DXs5qR0h_tc6im1Emt/s1600/jacobian.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="242" data-original-width="1116" height="137" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghDX4ndW7T21UnrZNAZZ0ItbBw73TGWpIsNlapHishSW6xTLrhVvf5Pxs5hyZ53ju4DiUAT-xTwGtwhyXEZKybNkSG1_HmjNIxLCDRiyNH8QIt7s0mwXGQ8Ce0m7DXs5qR0h_tc6im1Emt/s640/jacobian.png" width="640" /></a></div>
<br />
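Here is a sketch of this explicit light sampling for a rectangular light, converting the area pdf (1/area) into a solid-angle pdf with the Jacobian above; the LightQuad struct is illustrative and the float3 helpers are reused from the sampling sketch earlier:<br />
<pre>
// Illustrative rectangular light: corner plus two edge vectors.
struct LightQuad { float3 corner, edgeU, edgeV, normal; float area; };

// Returns the direction from shadingPos to the light sample and writes the
// solid-angle pdf: pdf_w = dist^2 / (cos(theta_l) * area).
float3 sampleAreaLight(const LightQuad& L, float3 shadingPos,
                       float u1, float u2, float& pdfSolidAngle)
{
    // pick a uniform point on the rectangle: corner + u1*edgeU + u2*edgeV
    float3 p = add3(L.corner, mul(L.edgeU, u1), mul(L.edgeV, u2));
    float3 d = { p.x - shadingPos.x, p.y - shadingPos.y, p.z - shadingPos.z };
    float dist2 = d.x*d.x + d.y*d.y + d.z*d.z;
    float3 wi = mul(d, 1.0f / sqrtf(dist2));
    // cos(theta_l): angle between the light normal and the direction back to the surface
    float cosL = fmaxf(0.0f, -(wi.x*L.normal.x + wi.y*L.normal.y + wi.z*L.normal.z));
    pdfSolidAngle = (cosL > 0.0f) ? dist2 / (cosL * L.area) : 0.0f;
    return wi;
}
</pre>
<br />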
With the same number of samples per pixel, the result is much less noisy:<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTK8MPk550xNSBMkZTHOi5ZFf87r99GTtKxqDLqaa2RNJfLEgH2cp5RdgXD22-0SWDW4NJsj2vpbcdAHb9dioSLnvXtAz7ZzWMclHYSeE8zJmioAvcO2IxvOTd4wI05ZxikZki2lJ92xkm/s1600/explicit_uniform_64.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTK8MPk550xNSBMkZTHOi5ZFf87r99GTtKxqDLqaa2RNJfLEgH2cp5RdgXD22-0SWDW4NJsj2vpbcdAHb9dioSLnvXtAz7ZzWMclHYSeE8zJmioAvcO2IxvOTd4wI05ZxikZki2lJ92xkm/s320/explicit_uniform_64.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-size: 12.8px;">Uniform explicit light sampling, 64 sample per pixel</span></td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDltvhMRd2bIVePO4KUZcXTS6OoU4o6Pf2Z0zT0j5t6Gullgh_4N8u0dn6Krj8Tzbi1vTufBqAywVvVfNkONRT1DDTQxt1SAFFAWkaZnjTk8uU-84vPI4DjECLWZBqx_luMG1_k6-V8agM/s1600/explicit_cos_64.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgDltvhMRd2bIVePO4KUZcXTS6OoU4o6Pf2Z0zT0j5t6Gullgh_4N8u0dn6Krj8Tzbi1vTufBqAywVvVfNkONRT1DDTQxt1SAFFAWkaZnjTk8uU-84vPI4DjECLWZBqx_luMG1_k6-V8agM/s320/explicit_cos_64.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-size: 12.8px;">Cos weighted explicit light sampling, 64 sample per pixel</span></td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<div>
<span style="font-size: large;"><b><br />Simple de-noise</b></span><br />
As we have seen above, the path traced result is a bit noisy even with 64 samples per pixel. The result is even worse for the first frame:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiFIYmvjRMoACawn8INwDc8POpsVP2Y8jd4SwxZwSMrZKJz8vPlEZJ_U9iZ8EQFRulybOhwKdAFP89dfIt0lF6JldkP_2x2Igj0vYuVSU0JbZRWw-ixoLMjyHyKFcbfp0nbjKjQgTHTYcSL/s1600/traced_result_1_frame.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiFIYmvjRMoACawn8INwDc8POpsVP2Y8jd4SwxZwSMrZKJz8vPlEZJ_U9iZ8EQFRulybOhwKdAFP89dfIt0lF6JldkP_2x2Igj0vYuVSU0JbZRWw-ixoLMjyHyKFcbfp0nbjKjQgTHTYcSL/s320/traced_result_1_frame.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">first frame path traced result</td></tr>
</tbody></table>
There are some very bright dots, and it looks bad during camera motion. So I added a simple de-noise pass, which just blurs over a lot of pixels that are located on the same surface (it really needs a lot of pixels to make the result look good, which costs some performance...).<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW5tfdkoLXNNYlU5PukSid7SQU3lQGvWQ2_Z89joEqg6P8rg8pM3JAjQ17MOsscTfqPW5DoTQEP6UpfEjXvLtzYBzv86hsOdCnudDVduuB3UM0D6iqfvTap8OUmwG9iDY9Bx2b9N90kLkO/s1600/traced_result_1_frame_blur.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW5tfdkoLXNNYlU5PukSid7SQU3lQGvWQ2_Z89joEqg6P8rg8pM3JAjQ17MOsscTfqPW5DoTQEP6UpfEjXvLtzYBzv86hsOdCnudDVduuB3UM0D6iqfvTap8OUmwG9iDY9Bx2b9N90kLkO/s320/traced_result_1_frame_blur.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Blurred first frame path traced result</td></tr>
</tbody></table>
To identify which surface a pixel belongs to, we store this data in the alpha channel of the path tracing texture using the following formula:<br />
<blockquote class="tr_bq">
dot(surface_normal, float3(1, 10, 100)) + (mesh_idx + 1) * 1000</blockquote>
This works because the scene only contains a small number of meshes and the mesh normals are the same across each surface in this simple scene.<br />
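For illustration, the surface id computation and the kind of comparison a de-noise blur tap could use (reusing the float3 struct from the sampling sketches; the tolerance value is a hypothetical choice, not taken from the demo):<br />
<pre>
// Same formula as above: pack the surface normal and mesh index into one float.
float surfaceId(float3 n, int meshIdx)
{
    return (n.x * 1.0f + n.y * 10.0f + n.z * 100.0f) + float(meshIdx + 1) * 1000.0f;
}

// A blur tap only contributes when it lies on the same surface as the center pixel.
bool sameSurface(float idCenter, float idTap)
{
    const float eps = 0.5f;   // hypothetical tolerance
    return fabsf(idCenter - idTap) < eps;
}
</pre>
<br />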
<br />
<b><span style="font-size: large;">Random Notes...</span></b><br />
During the implementation, I encountered various bugs/artifacts which I think are interesting.<br />
<br />
The first one is about the simple de-noise pass: it may bleed the light source color to neighboring pixels far away, even though we have per pixel mesh index data.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibsJN4VQtVX584XAEVYp-f-ARPGvPxJJ6TgjK6I-JiC5gdDhdfhFZN4VeKwG5_RDLvotoJf7rLg9QkzYEVwt1IgApKgAlMDT-QIcz-fG2ino9WH7XFbAoJmpewJJtzs61V43ilgWO1JZQE/s1600/blur_bug.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEibsJN4VQtVX584XAEVYp-f-ARPGvPxJJ6TgjK6I-JiC5gdDhdfhFZN4VeKwG5_RDLvotoJf7rLg9QkzYEVwt1IgApKgAlMDT-QIcz-fG2ino9WH7XFbAoJmpewJJtzs61V43ilgWO1JZQE/s320/blur_bug.png" width="320" /></a></div>
<br />
This is because we only store a single mesh index per pixel, but we jitter the ray shot from the camera within a single pixel every frame, so some of the light color gets blended onto the light geometry edges. It becomes very noticeable because the light source has a much higher radiance than the light reflected from the ceiling geometry.<br />
<br />
To fix this, I simply do not jitter the rays that directly hit the light geometry from the camera, so this fix can only be applied to explicit light sampling.<br />
<br />
<br />
<br />
The second one is about quantization when using a 16-bit floating point texture. The path tracing texture sometimes shows quantized results after several hundred frames of additive blending, when the single sample per pixel path traced result is very noisy.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTCNt8G7kQjOCMPfIOI3HQTGroZ9DTHZKOHZQ_ocTYoM1uLeEnHl-QnDK6ePApOT6-rXiBaC8S6LkTe2WtgdZ-oC1tyCaPYpPL0gq9M71zamfPQvsexiPDK1im4EGN9L7H4oHnx42tE8Zo/s1600/quantized.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTCNt8G7kQjOCMPfIOI3HQTGroZ9DTHZKOHZQ_ocTYoM1uLeEnHl-QnDK6ePApOT6-rXiBaC8S6LkTe2WtgdZ-oC1tyCaPYpPL0gq9M71zamfPQvsexiPDK1im4EGN9L7H4oHnx42tE8Zo/s200/quantized.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Quantized implicit light sampling</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUG-ZNjAh7NJoWrh9OXZBF5J3BUJK_QpDLFgzrXZB9WxFBtCl8FSDRRDqO46lZYiPYWxIMkaEV-UBbIXYzf0A9luvhR90Z5AkiKErLDVtuSZ1_oRWjmfiy697fR327QXkvvox-N5f5Qp5V/s1600/quantized_1_frame.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjUG-ZNjAh7NJoWrh9OXZBF5J3BUJK_QpDLFgzrXZB9WxFBtCl8FSDRRDqO46lZYiPYWxIMkaEV-UBbIXYzf0A9luvhR90Z5AkiKErLDVtuSZ1_oRWjmfiy697fR327QXkvvox-N5f5Qp5V/s200/quantized_1_frame.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Path traced result in first frame</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrRX-a0cKiEjA-xc_zPp3eh3S9wKOcx2-PKuzjcFp-pZrYv02ts1PCGwun_as1tYpZCtJyDIp-0j8j0p2d_5-V8KcTRdrmnMIljkTfr87fHIphuLAd-IktnACt9BUhiAHsM4OIkTFqVS1_/s1600/quantized_1_frame_blur.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrRX-a0cKiEjA-xc_zPp3eh3S9wKOcx2-PKuzjcFp-pZrYv02ts1PCGwun_as1tYpZCtJyDIp-0j8j0p2d_5-V8KcTRdrmnMIljkTfr87fHIphuLAd-IktnACt9BUhiAHsM4OIkTFqVS1_/s200/quantized_1_frame_blur.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">simple de-noised first frame result</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
To work around this, a 32-bit floating point texture needs to be used, but this has a performance impact (especially for my simple de-noise pass...).<br />
<br />
<br />
<br />
The last one is the bright fireflies artifact when using a very large light source (as big as the ceiling). This may sound counter-intuitive, and the implicit light path traced result (i.e. not sampling the light source directly) does not have those fireflies...</div>
<div>
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmyDmM5oxz-944waoK0BN5pwytDOq8miemft9-unILpmkjxCxiZ24Tn5nfUYNFZ4RBzliAWSQ1GJcuHONti5p0DnOnHCXjyPYhn07tX_c6QhnrtCyfGArHraE9F8-lYkDfCDmX7uaiS5te/s1600/large_light_explicit.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmyDmM5oxz-944waoK0BN5pwytDOq8miemft9-unILpmkjxCxiZ24Tn5nfUYNFZ4RBzliAWSQ1GJcuHONti5p0DnOnHCXjyPYhn07tX_c6QhnrtCyfGArHraE9F8-lYkDfCDmX7uaiS5te/s320/large_light_explicit.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Explicit light sample result</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbLhXUW2SJN1expKF4taftIWeEGl9tW29UR-y1sNAM2l79jwedMji7BKLKVWFRrV6semmMBReCSDa0ThrlY4WxtK1ieKSTfyDquxRLG9k2Gv2amjx5rLFg_r28nrwQv92F_2rB1dUhg3d1/s1600/large_light_implicit.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbLhXUW2SJN1expKF4taftIWeEGl9tW29UR-y1sNAM2l79jwedMji7BKLKVWFRrV6semmMBReCSDa0ThrlY4WxtK1ieKSTfyDquxRLG9k2Gv2amjx5rLFg_r28nrwQv92F_2rB1dUhg3d1/s320/large_light_implicit.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Implicit light sample result</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
But it turns out this artifact is not related to the size of the light source; it is caused by the light being too close to the reflecting geometry. To visualize it, we may look at how the light gets bounced:<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgw_MLBGeU0oDGWrsBvdbSSK0cQRjc4EXMzdc3MDKWjCIxRe4KxB0HvXWE87S30UVlx3cClH2LBYc5HqZw0TbetCps0KyBLVMumlONDl39uOhhjMKbMhOPgXlUbhqlA8KIX6PQD_CC29Jz2/s1600/depth_1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgw_MLBGeU0oDGWrsBvdbSSK0cQRjc4EXMzdc3MDKWjCIxRe4KxB0HvXWE87S30UVlx3cClH2LBYc5HqZw0TbetCps0KyBLVMumlONDl39uOhhjMKbMhOPgXlUbhqlA8KIX6PQD_CC29Jz2/s320/depth_1.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">path trace depth = 1</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiC8h4zj1bzTn5bqNfeCn0r4LzVp5JFeLtm2bTw-pCZtnSZ8PDSOafgJ6pgohxyZ9FuTaV5pm0y_jRjdDlq-J-kaTqOYzSWLuerMLzRNZpWxE-DjIys40f45uHwH2GIeCGGhFK245gnjIoU/s1600/depth_2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiC8h4zj1bzTn5bqNfeCn0r4LzVp5JFeLtm2bTw-pCZtnSZ8PDSOafgJ6pgohxyZ9FuTaV5pm0y_jRjdDlq-J-kaTqOYzSWLuerMLzRNZpWxE-DjIys40f45uHwH2GIeCGGhFK245gnjIoU/s320/depth_2.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">path trace depth = 2</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
The fireflies start to appear in the first bounce, located near the light source, and then get propagated with the reflected light rays. Those large values are generated by the Jacobian of the explicit light sampling transform, whose denominator is the squared distance between the light and the surface.<br />
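<br />
To make those large values concrete, below is a minimal, self-contained C++ sketch of one explicit light sample (next event estimation), assuming a Lambertian surface, uniform area sampling on the light and an unoccluded light point; all names are illustrative, not taken from the actual path tracer code:<br />
<pre>
#include &lt;cmath&gt;

struct V3 { float x, y, z; };
static V3 sub(V3 a, V3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static V3 mul(V3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float dot(V3 a, V3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

const float kPi = 3.14159265f;

// One explicit light sample: 'lightPos' is a uniformly picked point on the
// light with normal 'lightNrm' and total area 'lightArea'. Visibility
// (the shadow ray) is assumed to have passed already.
V3 explicitLightSample(V3 shadePos, V3 shadeNrm, V3 albedo,
                       V3 lightPos, V3 lightNrm, V3 emission, float lightArea)
{
    V3 toLight = sub(lightPos, shadePos);
    float dist2 = dot(toLight, toLight);            // squared distance
    V3 wi = mul(toLight, 1.0f / std::sqrt(dist2));
    float cosSurf  = std::fmax(dot(shadeNrm, wi), 0.0f);
    float cosLight = std::fmax(dot(lightNrm, mul(wi, -1.0f)), 0.0f);
    // Jacobian of the area-to-solid-angle change of variables: when the
    // sampled light point is very close to the surface, dist2 goes to 0
    // and G explodes, producing a firefly.
    float G = cosSurf * cosLight / dist2;
    // Lambertian BRDF = albedo / pi; pdf of the sample = 1 / lightArea.
    V3 f = {albedo.x * emission.x, albedo.y * emission.y, albedo.z * emission.z};
    return mul(f, (1.0f / kPi) * G * lightArea);
}
</pre>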
<br />
After a brief search on the internet, fixing this requires either radiance clamping, bi-directional path tracing, or a greatly increased sample count. Here is the result with over 75000 samples per pixel, which still contains some fireflies (a small sketch of the clamping follows the image)...<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbQuxPfJOcasZgN21irDfPubd07zqtjlLUT5R_xLwzObBgD-BMtZoL6RZ-59fcDUB1Kmm1BXLAJOS4jtKqRYuUs9-eReIokNhfjFsdGylMO3rkPOorJ93JVoXXfFdWnu9JYgyw6z0xQoFj/s1600/large_light_explicit_75000.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="512" data-original-width="512" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbQuxPfJOcasZgN21irDfPubd07zqtjlLUT5R_xLwzObBgD-BMtZoL6RZ-59fcDUB1Kmm1BXLAJOS4jtKqRYuUs9-eReIokNhfjFsdGylMO3rkPOorJ93JVoXXfFdWnu9JYgyw6z0xQoFj/s320/large_light_explicit_75000.png" width="320" /></a></div>
<br />
<span style="font-size: large;"><b>Conclusion</b></span><br />
In this post, we discussed the steps to implement a simple GPU path tracer. The most basic path tracer simply shoots a large number of rays per pixel and reflects each ray multiple times until it hits a light source. With explicit light sampling, we can greatly reduce noise.<br />
<br />
This path tracer is just my personal toy project, which only has Lambert diffuse reflection with a single light. It is my first time using the D3D12 API and the code is not well optimized, so the source code is for reference only; if you find any bugs, please let me know. Thank you.<br />
<br />
<b>Reference</b><br />
<span style="font-size: x-small;">[1] Physically Based Rendering <a href="http://www.pbrt.org/">http://www.pbrt.org/</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://www.slideshare.net/jeannekamikaze/introduction-to-path-tracing">https://www.slideshare.net/jeannekamikaze/introduction-to-path-tracing</a></span><br />
<span style="font-size: x-small;">[3] <a href="https://www.slideshare.net/takahiroharada/introduction-to-bidirectional-path-tracing-bdpt-implementation-using-opencl-cedec-2015">https://www.slideshare.net/takahiroharada/introduction-to-bidirectional-path-tracing-bdpt-implementation-using-opencl-cedec-2015</a></span><br />
<span style="font-size: x-small;">[4] <a href="http://reedbeta.com/blog/quick-and-easy-gpu-random-numbers-in-d3d11/">http://reedbeta.com/blog/quick-and-easy-gpu-random-numbers-in-d3d11/</a></span><br />
<br />
</div>
<span style="font-size: x-large;"><b>Render Passes in "Seal Guardian"</b></span><br />
<span style="font-size: large;"><b>Introduction</b></span><br />
<a href="http://www.whitebudgie.games/seal_guardian.html">"Seal Guardian"</a> uses a forward renderer to render the scene. Because we need to support mobile platform, we don't have too many effect in it. But still it consists of a few render passes to compose an image.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhb0XQ6v68gZKnabI2DYOX5EOZjowkeMeuwLjd6FKB5wbpJ9X2AkMElEHdoUWptVC_QOOI_VEcKvzQiRtj8ciXKH8qrBJBTupAJj-T2-gSvS0RP6KXRdU6XvKIPjUQFPH9RnBCP20jU-gZX/s1600/main_game.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="782" data-original-width="1336" height="233" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhb0XQ6v68gZKnabI2DYOX5EOZjowkeMeuwLjd6FKB5wbpJ9X2AkMElEHdoUWptVC_QOOI_VEcKvzQiRtj8ciXKH8qrBJBTupAJj-T2-gSvS0RP6KXRdU6XvKIPjUQFPH9RnBCP20jU-gZX/s400/main_game.png" width="400" /></a></div>
<br />
<span style="font-size: large;"><b>Shadow Map Pass</b></span><br />
To calculate the dynamic shadow of the scene, we need to render the depth of the meshes from the light's point of view. We render them into a 1024x1024 shadow map.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpXW2c_Rw4OeSGpzj8oUHDeAF-B2lYoA-fWU84Rb-20tHe4CM8SN8bTcqo2NdqnZGKR0RRUHGb3WTbtOwBNJQksWSxq8I8g9ONogfhQVBq4yRstr9LV2H7-O6QP6EUqxuNpWkuDOU9V9f_/s1600/pass_0_shadow_map.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="233" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpXW2c_Rw4OeSGpzj8oUHDeAF-B2lYoA-fWU84Rb-20tHe4CM8SN8bTcqo2NdqnZGKR0RRUHGb3WTbtOwBNJQksWSxq8I8g9ONogfhQVBq4yRstr9LV2H7-O6QP6EUqxuNpWkuDOU9V9f_/s400/pass_0_shadow_map.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Standard shadow map</td></tr>
</tbody></table>
<br />
Then we use the Exponential Shadow Map (ESM) method: the depth is exponentially warped and blurred into a 512x512 shadow map.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMi6G1k4xesp-k8gWO0h8GFuecBmJhIhoXYQIP318Ke0QVH3GPGZmXrA3wTfSY4C0yh9hda-s_UMdfB_gQMDptISe9mVITTWueZ89Fh34nGQSxXTf4AEJh1ak1V2KGAeu9aEI7ve-1dxQ1/s1600/pass_0_ESM.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="233" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMi6G1k4xesp-k8gWO0h8GFuecBmJhIhoXYQIP318Ke0QVH3GPGZmXrA3wTfSY4C0yh9hda-s_UMdfB_gQMDptISe9mVITTWueZ89Fh34nGQSxXTf4AEJh1ak1V2KGAeu9aEI7ve-1dxQ1/s400/pass_0_ESM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">ESM blurred shadow map</td></tr>
</tbody></table>
<br />
(Note that this pass may be skipped according to the current performance setting. A small sketch of the ESM math follows.)<br />
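<br />
For reference, the idea of ESM is to store an exponentially warped depth exp(c*d) in the shadow map so that the map can be blurred and filtered linearly; the shadow test then becomes a single multiply. A minimal scalar sketch (the constant c is a hand-tuned trade-off between softness and light bleeding):<br />
<pre>
#include &lt;algorithm&gt;
#include &lt;cmath&gt;

// Written into the shadow map (and then blurred, e.g. while
// downsampling 1024x1024 to 512x512): the warped occluder depth.
float esmStore(float occluderDepth, float c)
{
    return std::exp(c * occluderDepth);
}

// Shadow test at shading time: exp(c*d_occluder) * exp(-c*d_receiver).
// Values of 1 or above mean fully lit; smaller values fall off softly.
float esmShadow(float blurredExpDepth, float receiverDepth, float c)
{
    return std::min(blurredExpDepth * std::exp(-c * receiverDepth), 1.0f);
}
</pre>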
<br />
<span style="font-size: large;"><b>Opaque Geometry Pass</b></span><br />
In this pass, we render the scene meshes into an RGBA8 render target. We compute all the lighting, including direct lighting, indirect lighting (light map or SH probe) and tone mapping, in this single pass. This is because on iOS fewer render passes may give better performance, so we choose to combine all the calculations into a single pass.<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3EmcTtIVIlN7QhRaFTBBG5XJpk8WO4giwcqR3L6GPlRYJG0NxHovfO6QgVXtcygN2xEJJ45pLCzk3TRcDJqKBlJOw1EFuUO4UgCOgopBVn56MkCYYE9dV0CSM0LZh2E2HVnjimPIh_O_e/s1600/pass_1_opaque_1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3EmcTtIVIlN7QhRaFTBBG5XJpk8WO4giwcqR3L6GPlRYJG0NxHovfO6QgVXtcygN2xEJJ45pLCzk3TRcDJqKBlJOw1EFuUO4UgCOgopBVn56MkCYYE9dV0CSM0LZh2E2HVnjimPIh_O_e/s320/pass_1_opaque_1.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Tonemapped opaque scene color</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgj6oaht1xVp-6NqXFSXgCO4IQ4fdM-bmZSrK_od6_v8vfQJID1er3-iFoYR-d58Kxem9tn4uoUH9Ei8OGTCghEkF2Gvm1LaDkGG8Favco5QeG_2kqQ49DadvRkIKcoz8ypsZ2__BCjZ4aV/s1600/pass_1_opaque_1_depth.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgj6oaht1xVp-6NqXFSXgCO4IQ4fdM-bmZSrK_od6_v8vfQJID1er3-iFoYR-d58Kxem9tn4uoUH9Ei8OGTCghEkF2Gvm1LaDkGG8Favco5QeG_2kqQ49DadvRkIKcoz8ypsZ2__BCjZ4aV/s320/pass_1_opaque_1_depth.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Opaque geometry depth bufer</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<br />
To reduce the impact of overdraw, we pre-compute a visibility set to avoid drawing occluded meshes (we may talk about it in a future post). Also, since we want to add a bloom pass to enhance the effect of bright pixels, we compute a bloom value in this pass according to the pre-tone-mapped value and store it in the alpha channel.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggg5le8UPS0e3McrreCF8t967SMGkvv8lb2u5ml5oOYFm1YUW78BCW5fInEeXqppGmUaRQNkSSmHpZ8ejJJ52RZXTMU85ZOqTOuV7BEsSJWhk0cVVmhh9Yc3tYiIifC3KxKBaue3NMtXc7/s1600/gbuffer.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="126" data-original-width="576" height="69" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggg5le8UPS0e3McrreCF8t967SMGkvv8lb2u5ml5oOYFm1YUW78BCW5fInEeXqppGmUaRQNkSSmHpZ8ejJJ52RZXTMU85ZOqTOuV7BEsSJWhk0cVVmhh9Yc3tYiIifC3KxKBaue3NMtXc7/s320/gbuffer.png" width="320" /></a></div>
<br />
<span style="font-size: large;"><b>Transparent Geometry Pass</b></span><br />
In this pass, we render transparent meshes and particles. We blend the post-tonemapped color with the opaque geometry for performance reasons. Also, because we store the bloom intensity in the alpha channel, we want the transparent geometry to affect the bloom result as well. We solve this with 2 different methods depending on which platform the game runs on.<br />
<br />
On iOS, we render the mesh directly to the render target of the opaque geometry pass with a shader similar to the opaque pass, outputting the tonemapped scene color in RGB and the bloom intensity in A. To blend those 4 values over the opaque values, we use the <a href="https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_shader_framebuffer_fetch.txt">EXT_shader_framebuffer_fetch</a> OpenGL extension. So the blending happens at the end of the transparent geometry shader, and we use the simple blending formula below with the opacity of the mesh (because we want to keep it consistent with the other platforms):<br />
<blockquote class="tr_bq">
<span style="white-space: pre-wrap;">RGB= mesh color * mesh alpha </span><span style="white-space: pre-wrap;">+ dest color * (1 - mesh alpha)</span><span style="white-space: pre-wrap;"><br />A = mesh bloom intensity </span><span style="white-space: pre-wrap;">* mesh alpha </span><span style="white-space: pre-wrap;">+ dest </span><span style="white-space: pre-wrap;">bloom intensity </span><span style="white-space: pre-wrap;"> * (1 - mesh alpha)</span></blockquote>
<span style="white-space: pre-wrap;">On Windows and Mac, the </span><span style="white-space: pre-wrap;">EXT_shader_framebuffer_fetch does not exist. We render all the transparent meshes into a separate RGBA8 render target. We compute the scene color and bloom intensity similar to opaque pass, but before writing to the render target, we decompose the RGB scene color into luma and chroma and store the chroma value in checkerboard pattern similar to <a href="http://www.crytek.com/download/fmx2013_c3_art_tech_donzallaz_sousa.pdf">this paper(slide 104)</a>. So we can store luma+chroma in RG channel, bloom intensity in B channel and opacity of mesh in the A channel of the render target. </span><br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicsy1EnG5kC1wChv5ZBtX3zCaA_E5VoV50fPtPH4yzNaxKxj1jI3U_xx-BQCASCTFYnFWogGR55zuNvAYj3UbEilOdhYPk5gI7VEWlDTiEl4tA7yUBiNJyOYoGJLNPEJv0LD6cGCdQG-4x/s1600/pass_2_transparent.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="233" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicsy1EnG5kC1wChv5ZBtX3zCaA_E5VoV50fPtPH4yzNaxKxj1jI3U_xx-BQCASCTFYnFWogGR55zuNvAYj3UbEilOdhYPk5gI7VEWlDTiEl4tA7yUBiNJyOYoGJLNPEJv0LD6cGCdQG-4x/s400/pass_2_transparent.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Transparent render target on Windows platform</td></tr>
</tbody></table>
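<br />
Below is a minimal sketch of that checkerboard packing, using YCoCg as an example luma/chroma decomposition (the exact color space used in the game may differ):<br />
<pre>
// Encode one pixel of the transparent render target: returns luma
// (stored every pixel) plus one chroma component, alternating Co/Cg
// in a checkerboard pattern.
struct LumaChroma { float luma; float chroma; };

LumaChroma encodeCheckerboard(float r, float g, float b, int px, int py)
{
    float y  =  0.25f * r + 0.5f * g + 0.25f * b;  // luma
    float co =  0.5f  * r - 0.5f * b;              // orange chroma
    float cg = -0.25f * r + 0.5f * g - 0.25f * b;  // green chroma
    bool even = ((px + py) &amp; 1) == 0;              // checkerboard parity
    return { y, even ? co : cg };
}
// At resolve time, the missing chroma component of a pixel is
// reconstructed from its horizontal neighbours of the other parity.
</pre>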
<span style="white-space: pre-wrap;"><br /></span>
<span style="white-space: pre-wrap;">Finally, we can blend this transparent texture over the opaque geometry pass render target.</span><br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXgnflHMEFSr5AM6W3Qji_c2brUKvXxzTIhOW_f3RhNKOi2YbDUgn7H1pAxckhneGXolqYXK9jQwOk40mGq6mTKhyngc-OF5t6RnhDIB3uS2ajE1lwuGD1ZEqEUPpv7LtSiEA-ZBZG89cj/s1600/pass_2_transparent_compose.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="233" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXgnflHMEFSr5AM6W3Qji_c2brUKvXxzTIhOW_f3RhNKOi2YbDUgn7H1pAxckhneGXolqYXK9jQwOk40mGq6mTKhyngc-OF5t6RnhDIB3uS2ajE1lwuGD1ZEqEUPpv7LtSiEA-ZBZG89cj/s400/pass_2_transparent_compose.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Composed opaque and transform geometry</td></tr>
</tbody></table>
<br />
<span style="font-size: large;"><b>Post Process Pass</b></span><br />
After those geometry passes, we can blend in the bloom filter. We run several blur passes on those bright pixels and additively blend the result over the previous render pass output to enhance the bright effect.<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUW_j_AGZOKHL1ZOfitOgC4HkM-Cio6zgt-MIE4RcXIaTH5bYKhqhHrQDqiOF205nuwHclNY2U7tNvTkowUmzaQwW6nU52IFhRBm06cCV-P0aP1PE3WiM5Kb2hDDDE8HiCMiA5eFLsmFDd/s1600/pass_3_bloom.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUW_j_AGZOKHL1ZOfitOgC4HkM-Cio6zgt-MIE4RcXIaTH5bYKhqhHrQDqiOF205nuwHclNY2U7tNvTkowUmzaQwW6nU52IFhRBm06cCV-P0aP1PE3WiM5Kb2hDDDE8HiCMiA5eFLsmFDd/s320/pass_3_bloom.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Blurred bright pixels</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBXnGm4OOzWOCFbo5-ZoeHuRZOM8CW37W16rFxphhzwY1thft7Q8BwfMZ1-uQV223kGO6ud1rjTcoECuFD3s50sZ7BZO4ce85XwPm48E_Xmnoim2iPbKJ9osrkAw8Xx3fn8syJkm9-SwNR/s1600/pass_3_bloom_compose.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="187" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgBXnGm4OOzWOCFbo5-ZoeHuRZOM8CW37W16rFxphhzwY1thft7Q8BwfMZ1-uQV223kGO6ud1rjTcoECuFD3s50sZ7BZO4ce85XwPm48E_Xmnoim2iPbKJ9osrkAw8Xx3fn8syJkm9-SwNR/s320/pass_3_bloom_compose.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Additive blended bloom texture with scene color</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<br />
Then we compute a simplified (but not very accurate, due to the lack of a velocity buffer) temporal anti-aliasing using the color and depth buffers of the current frame and the previous 2 frames. One thing we didn't mention is that, while rendering the opaque and transparent meshes, we jitter the camera projection by half a pixel, alternating between odd and even frames, similar to the figure below, so that we have sub-pixel information for anti-aliasing (a small sketch of the jitter follows the figures).<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjm6KanAVxdPCv1HySaBnv0YxEEAboK7Kuu1OZxOcqfyuk7JFQ6RCS7uin2LXl21B398ZFDnNMhQE56g2o_QyjOznjCm18i-SLbqVQG_myIhCYXceWrAMByXls-5KyH_TFwqwbOdYgesm66/s1600/AA_grid.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="408" data-original-width="569" height="228" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjm6KanAVxdPCv1HySaBnv0YxEEAboK7Kuu1OZxOcqfyuk7JFQ6RCS7uin2LXl21B398ZFDnNMhQE56g2o_QyjOznjCm18i-SLbqVQG_myIhCYXceWrAMByXls-5KyH_TFwqwbOdYgesm66/s320/AA_grid.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Temporal AA jitter pattern</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigL4hfMIFi4UWJ7oWryMf0A2yUseWdN-3eIwjVcHvD3UXK1b2trTdQqJwv5yLaIkc7azzL8HlDNBhQdAYGo2kkfnCnWI2ILY_K6QK5HXTkUOT_IblPnW78puoAaM_2Zy9GY-Ke3znCScto/s1600/pass_3_AA.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="231" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigL4hfMIFi4UWJ7oWryMf0A2yUseWdN-3eIwjVcHvD3UXK1b2trTdQqJwv5yLaIkc7azzL8HlDNBhQdAYGo2kkfnCnWI2ILY_K6QK5HXTkUOT_IblPnW78puoAaM_2Zy9GY-Ke3znCScto/s400/pass_3_AA.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Temporal anti-aliased image</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
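<br />
A minimal sketch of that projection jitter, written independently of matrix conventions by offsetting the clip-space position before the perspective divide (the exact offsets are illustrative):<br />
<pre>
// Sub-pixel camera jitter for temporal AA: alternate between two
// offsets a half pixel apart on odd and even frames, expressed in
// NDC units (one pixel spans 2/width in NDC).
struct Float2 { float x, y; };

Float2 taaJitterNdc(unsigned frameIndex, float width, float height)
{
    float p = (frameIndex &amp; 1u) ? 0.25f : -0.25f;  // +/- quarter pixel
    return { p * 2.0f / width, p * 2.0f / height };
}

// In the vertex shader (pseudocode, after the projection transform):
//   clipPos.x += jitter.x * clipPos.w;
//   clipPos.y += jitter.y * clipPos.w;
</pre>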
<br />
<span style="font-size: large;"><b>Conclusion</b></span><br />
In this post, we broke down the render passes in "Seal Guardian", which consist mainly of 4 parts: shadow map, opaque geometry, transparent geometry and post process passes. By using fewer render passes, we can achieve a constant 60FPS in most cases (if the target framerate is not met, we may skip some render passes such as temporal AA and shadows).<br />
<br />
Lastly, <a href="http://www.whitebudgie.games/seal_guardian.html">"Seal Guardian"</a> has already been released on <a href="http://store.steampowered.com/app/741620/Seal_Guardian/">Steam</a> / <a href="https://itunes.apple.com/us/app/seal-guardian/id1307643357?ls=1&mt=12">Mac App Store</a> / <a href="https://itunes.apple.com/us/app/seal-guardian/id1307585597?ls=1&mt=8">iOS App Store</a>. If you want to support us to develop games with custom tech, then buying a copy of the game on any platform will help. Thank you very much.<br />
<br />
<b>References</b><br />
<span style="font-size: x-small;">[1] The Art and Technology behind Crysis 3 <a href="http://www.crytek.com/download/fmx2013_c3_art_tech_donzallaz_sousa.pdf">http://www.crytek.com/download/fmx2013_c3_art_tech_donzallaz_sousa.pdf</a></span><br />
<br /><span style="font-size: x-large;"><b>Shadow in "Seal Guardian"</b></span><br />
<span style="font-size: large;"><b>Introduction</b></span><br />
<a href="http://www.whitebudgie.games/seal_guardian.html">"Seal Guardian"</a> uses a mix of static and dynamic shadow systems to support long range shadow to cover the whole level. "Seal Guardian" only use a single directional for the whole level, so part of the shadow information can be pre-computed. It mainly consists of 3 parts: baked static shadow on static meshes stored along with the <a href="http://simonstechblog.blogspot.hk/2017/11/light-map-in-seal-guardian.html">light map</a>, baked static shadow for dynamic objects stored along with the irradiance volume and dynamic shadow with optional ESM soft shadow.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAWsV8pm_atapWf4Gx-RX1uh72as0W6VAlKFPjk-O99d_XVnb-rLVvA7t7KzGFUwYjOQSFbAY8fWEFLvj3gNOTweYg5ii8XXY6TSdFicTdcWrk3ycsE4HvcudLZI-VFP-jFz-fcas6-vtq/s1600/main.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="956" data-original-width="1427" height="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAWsV8pm_atapWf4Gx-RX1uh72as0W6VAlKFPjk-O99d_XVnb-rLVvA7t7KzGFUwYjOQSFbAY8fWEFLvj3gNOTweYg5ii8XXY6TSdFicTdcWrk3ycsE4HvcudLZI-VFP-jFz-fcas6-vtq/s400/main.png" width="400" /></a></div>
<span style="font-size: large;"><b><br /></b></span>
<span style="font-size: large;"><b>Static shadow for static objects</b></span><br />
During the baking process of the light map, we also compute static shadow information. We first render a shadow map for the whole level into a big render target (e.g. 8192x8192); then, for each texel of the light map, we can compare its world position against the shadow map to check whether that texel is in shadow. But since we are using a 1024x1024 light map for the whole scene, storing the shadow term directly would not have enough resolution. So we use a <a href="http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf">distance field representation</a>[1] to reduce the storage size, similar to the <a href="https://docs.unrealengine.com/udk/Three/DistanceFieldShadows.html">UDK</a>[2]. To bake the distance field representation of the shadow term, instead of comparing a single depth value at the texel world position as before, we compare several values within a 0.5m x 0.5m grid oriented along the normal at that position, similar to the figure below:<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkYoA4qTuaVHhsleWL_BS0izn_Nr8KhuRh1UaNrB7oH5BayycV1ww8nC_ijy0wPJKEAvDM6foCdYiv73q463pVeUqqjLqJ8XBg8Pu54xd_BgU6nKc5JgJKfxBLRvKXZXEnGj2Zdpt7rJwx/s1600/grid.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="895" data-original-width="1338" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjkYoA4qTuaVHhsleWL_BS0izn_Nr8KhuRh1UaNrB7oH5BayycV1ww8nC_ijy0wPJKEAvDM6foCdYiv73q463pVeUqqjLqJ8XBg8Pu54xd_BgU6nKc5JgJKfxBLRvKXZXEnGj2Zdpt7rJwx/s320/grid.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Blue dots indicate the positions for sampling shadow map <br />
to compute distance field value for the texel at red dot position.<br />
(The gird is perpendicular to the red vertex normal of the texel.)</td></tr>
</tbody></table>
<br />
By doing this, we can gather the shadow information around the baking texel to compute the distance field. We chose this method instead of computing the distance field from a large baked shadow texture because we want the shadow distance field to be consistently computed in world space no matter what the mesh UV is, and this also avoids UV seams. This method may cause potential problems for concave meshes, but so far, for all levels in "Seal Guardian", it has not been a big problem. (The run-time lookup is sketched after the figure below.)<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggVmLYA1fwaEB-kZB4hv4U_MppM6m6tiLxtjD0A5RB6MSy0WMfl9ShmgknRjYApxgW5IJA8ms7E2ae63LCpsoUpD-Jvk8Q1xnT3FQuX7L4rXtWM563MYmSfmkjZqi6NlTxlLc03yCnOaet/s1600/static_shadow_on_static.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="956" data-original-width="1427" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggVmLYA1fwaEB-kZB4hv4U_MppM6m6tiLxtjD0A5RB6MSy0WMfl9ShmgknRjYApxgW5IJA8ms7E2ae63LCpsoUpD-Jvk8Q1xnT3FQuX7L4rXtWM563MYmSfmkjZqi6NlTxlLc03yCnOaet/s320/static_shadow_on_static.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Static shadow only</td></tr>
</tbody></table>
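<br />
For reference, sampling such a distance field at run time is very cheap: remap the bilinearly filtered distance value around the 0.5 iso-value, exactly as in the alpha-tested magnification paper[1]. A minimal sketch (the softness constant is an assumption):<br />
<pre>
#include &lt;algorithm&gt;

static float smoothstepf(float e0, float e1, float x)
{
    float t = std::clamp((x - e0) / (e1 - e0), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// 'sdf' is the bilinearly sampled distance field value in [0,1],
// where 0.5 is the shadow boundary; 'softness' widens the penumbra.
float distanceFieldShadow(float sdf, float softness /* e.g. 0.05f */)
{
    return smoothstepf(0.5f - softness, 0.5f + softness, sdf);
}
</pre>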
<br />
<span style="font-size: large;"><b>Static shadow for dynamic objects</b></span><br />
For dynamic objects to receive baked shadow, we bake shadow information and store it along with the irradiance volume. For each irradiance probe location, we compare it against the whole-scene shadow map and get a binary shadow value. At runtime, we interpolate this binary shadow value using the position of the dynamic object and the probe locations to get a smooth transition of the shadow value, just like interpolating the SH coefficients of the irradiance volume (see the sketch after the figures below).<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwPP9eyMQZpiRIxyo8G23eqBdzlMWvRii9M9BQA93XRwjNOzh2FijR8MqbXEcaOXjagkutorqRRauyLVrRlmQWB5OTsmaQKaVXe9Z6Hsqvq5R4SWzlATKkdIFRSFNkJWUFrqHyt3gWPcTA/s1600/static_shadow_on_dyna.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="956" data-original-width="1427" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwPP9eyMQZpiRIxyo8G23eqBdzlMWvRii9M9BQA93XRwjNOzh2FijR8MqbXEcaOXjagkutorqRRauyLVrRlmQWB5OTsmaQKaVXe9Z6Hsqvq5R4SWzlATKkdIFRSFNkJWUFrqHyt3gWPcTA/s320/static_shadow_on_dyna.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Circled objects does not have light map UV, so they are treated the same as dynamic objects and shadowed with the shadow value stored along with irradiance volume</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCIjQYdFX9cqSZiM-mIG7WK7lyJt1c9tiCkHEgme96an9FQr9oFm4cIYykutmqM4odzJqkXAgCjW7jdJzkm3LBFxBYEEt7Y1gLptiqBBFffgFNPKTiup3cX1XmVi3kZdEDl-dS7zDOQs3J/s1600/volume_shadow_term.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="956" data-original-width="1427" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgCIjQYdFX9cqSZiM-mIG7WK7lyJt1c9tiCkHEgme96an9FQr9oFm4cIYykutmqM4odzJqkXAgCjW7jdJzkm3LBFxBYEEt7Y1gLptiqBBFffgFNPKTiup3cX1XmVi3kZdEDl-dS7zDOQs3J/s320/volume_shadow_term.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Each small sphere is a sampling location for storing the SH coefficients and shadow value of the irradiance for dynamic objects.</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
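A minimal sketch of that interpolation, assuming the probes sit on a regular grid so that the eight binary values surrounding the object can be blended trilinearly with the same weights as the SH coefficients:<br />
<pre>
// Trilinearly blend the 8 binary shadow values around a dynamic object.
// fx, fy, fz are the fractional coordinates of the object inside its
// probe cell; s is indexed as s[z][y][x], 0 = shadowed, 1 = lit.
float probeShadow(const float s[2][2][2], float fx, float fy, float fz)
{
    float x00 = s[0][0][0] + (s[0][0][1] - s[0][0][0]) * fx;
    float x10 = s[0][1][0] + (s[0][1][1] - s[0][1][0]) * fx;
    float x01 = s[1][0][0] + (s[1][0][1] - s[1][0][0]) * fx;
    float x11 = s[1][1][0] + (s[1][1][1] - s[1][1][0]) * fx;
    float y0  = x00 + (x10 - x00) * fy;
    float y1  = x01 + (x11 - x01) * fy;
    return y0 + (y1 - y0) * fz;    // smooth shadow value in [0,1]
}
</pre>
<br />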
<span style="font-size: large;"><b>Dynamic Shadow</b></span><br />
We use a standard shadow mapping algorithm with <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.146.177&rep=rep1&type=pdf">exponential shadow maps (ESM)</a>[3] to support dynamic shadow in <a href="http://www.whitebudgie.games/seal_guardian.html">"Seal Guardian"</a>. However, because we need to support a variety of hardware (from iOS and Mac to PC) and minimise code complexity, we chose not to use cascaded shadow maps. Instead we use a single shadow map to support dynamic shadow over a very short distance (e.g. 30m-60m) and rely on baked shadow to cover the remaining part of the scene.
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhZPe-451uaV34KpqkC7vQ77GqeR-BDXKz-gSuVRD4F0JCODfgu3tLKoLyhptHZSq4GGa5gbmdKcOHjXoSi7nn5JicDe00KNNX2u5e83EKbAoSbwKpjd9IFWJ9vb5cdlUaN6PK51Bn8eyH/s1600/mixed_shadow.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="956" data-original-width="1427" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjhZPe-451uaV34KpqkC7vQ77GqeR-BDXKz-gSuVRD4F0JCODfgu3tLKoLyhptHZSq4GGa5gbmdKcOHjXoSi7nn5JicDe00KNNX2u5e83EKbAoSbwKpjd9IFWJ9vb5cdlUaN6PK51Bn8eyH/s320/mixed_shadow.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Dynamic shadow mixed with static shadow</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEib6l-clqD3ZTEpxN_BM_o61ThywCtzhYGVaSJ9UH5fYIPcayw6sp6vZt3NocgYWVP3P_-_WHIlJ9ak1rWMXJmovebAIGtSghbmL0GJVbQDaba7EmiBnycLV1ZNyuW2Ck5R0YYVwx2qxS4Q/s1600/dyna_shadow.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="956" data-original-width="1427" height="214" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEib6l-clqD3ZTEpxN_BM_o61ThywCtzhYGVaSJ9UH5fYIPcayw6sp6vZt3NocgYWVP3P_-_WHIlJ9ak1rWMXJmovebAIGtSghbmL0GJVbQDaba7EmiBnycLV1ZNyuW2Ck5R0YYVwx2qxS4Q/s320/dyna_shadow.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Dynamic shadow only</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
<b style="font-size: x-large;">Shadow Quality Settings</b><br />
With the above systems, we can make a few shadow quality settings:<br />
<ol>
<li>mix of static shadow with dynamic ESM shadow</li>
<li>mix of static shadow with dynamic hard shadow</li>
<li>static shadow only</li>
</ol>
On the iOS platform, we choose the shadow quality depending on the device capability.
Besides, as we are using a forward renderer, objects outside the dynamic shadow distance can use the static-shadow-only shader to save a bit of performance.<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5fFMvgkTbFX8YzKTjnWaYN-fMS4AHiRW4_tn6gkvYQ5batZ6RMC8auSLaXi7UGJeheM_7_E4QGRcrUOVsvzqcsM2yelQ_MgduwHkd1uDqX7cqOATZkM6uP_pRP9sVhGjorgXOMFUS4umW/s1600/quality_soft.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg5fFMvgkTbFX8YzKTjnWaYN-fMS4AHiRW4_tn6gkvYQ5batZ6RMC8auSLaXi7UGJeheM_7_E4QGRcrUOVsvzqcsM2yelQ_MgduwHkd1uDqX7cqOATZkM6uP_pRP9sVhGjorgXOMFUS4umW/s200/quality_soft.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Soft Shadow</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZlznJuXj7tMVwZsfzVaTinK6JjtFfXt6aHdAdYVSYivXh7lij7sFXFB79tAyEjX_I4BaZrvYo69dNenyUc8uMk7KzswTmHmmuPB49PXhxo5TssnzGUICgm2drKOvatHKjfQ83SGYFcjAa/s1600/quality_hard.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZlznJuXj7tMVwZsfzVaTinK6JjtFfXt6aHdAdYVSYivXh7lij7sFXFB79tAyEjX_I4BaZrvYo69dNenyUc8uMk7KzswTmHmmuPB49PXhxo5TssnzGUICgm2drKOvatHKjfQ83SGYFcjAa/s200/quality_hard.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Hard Shadow</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiFOCWPm6vXX0IspETTM8XzijWeLrjn1VGv2BtQ5pGCGExzsOLb32PXiLNuweSbdby78ZDr4F2Ig3xwMLyke4XsGPfGqxvVRYqYmCgKmugFPvuX8rjQnQuYQ_Q0iRP8WvMmcnMVKhBt7K_/s1600/quality_static.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="782" data-original-width="1336" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgiFOCWPm6vXX0IspETTM8XzijWeLrjn1VGv2BtQ5pGCGExzsOLb32PXiLNuweSbdby78ZDr4F2Ig3xwMLyke4XsGPfGqxvVRYqYmCgKmugFPvuX8rjQnQuYQ_Q0iRP8WvMmcnMVKhBt7K_/s200/quality_static.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">No Shadow</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
<b style="font-size: x-large;">Conclusion</b><br />
We have briefly described the shadow system in <a href="http://www.whitebudgie.games/seal_guardian.html">"Seal Guardian"</a>, which uses distance field shadow maps for static mesh shadows, interpolated static shadow values for dynamic objects and ESM dynamic shadow for a short distance. A few shadow quality settings can also be generated with very little coding effort.<br />
<br />
Lastly, if you are interested in "Seal Guardian", feel free to <a href="http://www.whitebudgie.games/seal_guardian.html">check it out</a>; its <a href="http://store.steampowered.com/app/741620/Seal_Guardian/">Steam store page</a> is live now. It will be released on 8<sup>th</sup> Dec, 2017 on iOS/Mac/PC. Thank you.<br />
<br />
<b>References</b><br />
<span style="font-size: x-small;">[1] <a href="http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf">http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf</a></span><br />
<span style="font-size: x-small;">[2] <a href="https://docs.unrealengine.com/udk/Three/DistanceFieldShadows.html">https://docs.unrealengine.com/udk/Three/DistanceFieldShadows.html</a></span><br />
<span style="font-size: x-small;">[3] <a href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.146.177&rep=rep1&type=pdf">http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.146.177&rep=rep1&type=pdf</a></span><br />
<br />
<br />
<br /><span style="font-size: x-large;"><b>Light Map in "Seal Guardian"</b></span><br />
<b><span style="font-size: large;">Introduction</span></b><br />
Light mapping is a common technique used in games for storing lighting data. <a href="http://www.whitebudgie.games/seal_guardian.html">"Seal Guardian"</a> uses light maps in order to support a large variety of hardware from iOS and Mac to PC because of their low run-time cost. There are many methods to bake a light map, such as photon mapping and radiosity. Our baking method is similar to the <a href="https://www.siggraph.org/education/materials/HyperGraph/radiosity/overview_2.htm">radiosity hemicube</a>[1], but we render a full cube map for each light map texel to store the incoming lighting data instead.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOW_AnRguD7H2gCMTzgXDya9RBWq6ZpV-fFSVpsyAXYk4Vl7091ZLPvueNWZQUefqVLQsv-cYaO2BR3_Zf43MdUp1MlhZQJOafMzvAvi6vdDPcHMwLcoA4UM_jeLpKXpr92U3mzJofT11G/s1600/scene_with_lightmap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="868" data-original-width="1600" height="172" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOW_AnRguD7H2gCMTzgXDya9RBWq6ZpV-fFSVpsyAXYk4Vl7091ZLPvueNWZQUefqVLQsv-cYaO2BR3_Zf43MdUp1MlhZQJOafMzvAvi6vdDPcHMwLcoA4UM_jeLpKXpr92U3mzJofT11G/s320/scene_with_lightmap.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Scene with light map</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEha9ZN3ic46myPGIfo2SnSXyGAW0uL_9SlFwc2WWYPnqoIbKqglkwzE4D2c-aQOIIGsG4jqQjD3Zvw6KrE4sQ1rnTBObtjJnbLWQ9DDvMtvOX6c9oynvmR1ZgqJah_FEAqiwi1ttihl65xL/s1600/scene_no_lightmap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="868" data-original-width="1600" height="172" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEha9ZN3ic46myPGIfo2SnSXyGAW0uL_9SlFwc2WWYPnqoIbKqglkwzE4D2c-aQOIIGsG4jqQjD3Zvw6KrE4sQ1rnTBObtjJnbLWQ9DDvMtvOX6c9oynvmR1ZgqJah_FEAqiwi1ttihl65xL/s320/scene_no_lightmap.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Scene without light map</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
<div>
<br /></div>
<b><span style="font-size: large;">Light Map Atlas</span></b><br />
In each level, the light map is built for all static meshes with a second, unique UV set. We gather all those static meshes and pack them into a large light map atlas using this <a href="http://blackpawn.com/texts/lightmaps/">method</a>[2]; other methods could be chosen, we just picked a simple one.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0OCTCF3FMONMG9CVoH6vQ04z0fvMQCY2dSrbaAK8LKpkHrzy7hEyWJax98-hP4qWqtfQhJ-YuMXVfXSE4w3zGdYd-hsciQV08qpHtC_G8yJMzlPJG0KqdJDqX9SzJgZFuwz2VCTzTu6iu/s1600/atlas.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="514" data-original-width="514" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0OCTCF3FMONMG9CVoH6vQ04z0fvMQCY2dSrbaAK8LKpkHrzy7hEyWJax98-hP4qWqtfQhJ-YuMXVfXSE4w3zGdYd-hsciQV08qpHtC_G8yJMzlPJG0KqdJDqX9SzJgZFuwz2VCTzTu6iu/s320/atlas.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Packing a single large light map atlas for all static mesh in the scene</td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Compute Light Map Texel Position</span></b><br />
Then we render all the meshes into an RGBA32Float world position render target using the light map atlas layout created before (by a vertex shader which transforms the mesh vertex from its 3D world position to its unique 2D light map UV). Then we read back the render target and store all the written texels, which correspond to the world position of each light map texel. Those positions will be used for rendering cube maps for radiosity (a small sketch of the vertex transform follows the figure below).<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheYtL3v_4ZrYE2tfOzZ7JlpyquGQfO95Pqxo7t1RfH2koV9gKVrDTXC-p1zTOi3dKugYowcthmY5XQQu4SUnmdXQ3uZ8r9zShVLbj_evgigsFeFYxB5OVNLsaE74XGDTmPwzCzBU5rWaDJ/s1600/lightmap_density.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="868" data-original-width="1600" height="216" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheYtL3v_4ZrYE2tfOzZ7JlpyquGQfO95Pqxo7t1RfH2koV9gKVrDTXC-p1zTOi3dKugYowcthmY5XQQu4SUnmdXQ3uZ8r9zShVLbj_evgigsFeFYxB5OVNLsaE74XGDTmPwzCzBU5rWaDJ/s400/lightmap_density.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Each square represent a single light map texel,<br />
we query back those texel world space position to render cube map for radiosity</td></tr>
</tbody></table>
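<br />
A minimal sketch of that vertex transform: the unique 2D light map UV becomes the output position, while the 3D world position is passed through as the color payload (the Y flip depends on the API convention):<br />
<pre>
struct BakeVertexOut {
    float clipX, clipY;           // position inside the atlas render target
    float worldX, worldY, worldZ; // payload written to the RGBA32Float target
};

// u, v in [0,1] is the second, unique light map UV of the static mesh.
BakeVertexOut bakeVertex(float u, float v, float wx, float wy, float wz)
{
    BakeVertexOut o;
    o.clipX = u * 2.0f - 1.0f;    // map [0,1] UV to [-1,1] NDC
    o.clipY = 1.0f - v * 2.0f;    // Y flip, depending on API convention
    o.worldX = wx; o.worldY = wy; o.worldZ = wz;
    return o;
}
</pre>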
<br />
<b><span style="font-size: large;">Radiosity Baking</span></b><br />
As mentioned before, we use a method similar to the hemicube, but render a full cube map instead: we render a cube map at each light map texel with all the post-processing effects/tone mapping turned off, just storing the lighting data. Because our light map is intended to store the incoming indirect static lighting for each texel, we convert the incoming lighting cube map rendered at each texel to 2nd order spherical harmonics coefficients (i.e. 4 coefficients for each of the RGB channels); the conversion method can be found in <a href="http://www.ppsloan.org/publications/StupidSH36.pdf">"Stupid Spherical Harmonics (SH) Tricks"</a>[3]. So we need 1 RGBA32Float (or RGBA16Float) cube map and 3 temporary RGBA32Float (or RGBA16Float) textures for each radiosity iteration (a small sketch of the projection follows the figures below).<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhB65Png3ZN5VME4XLEbM0e0MzaxoqG7_dDlnVQOb11Qg7V8P82jEEkAQwP9j1q8B7m7Gwu9GXpDjTcawgIfvHAuGxdR_bwA9zmkS64YEnxSJ_1KbTBQy9l2oPOzfps4yecrJ1TiL1CwIDx/s1600/bake_pass_0.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="671" data-original-width="1024" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhB65Png3ZN5VME4XLEbM0e0MzaxoqG7_dDlnVQOb11Qg7V8P82jEEkAQwP9j1q8B7m7Gwu9GXpDjTcawgIfvHAuGxdR_bwA9zmkS64YEnxSJ_1KbTBQy9l2oPOzfps4yecrJ1TiL1CwIDx/s320/bake_pass_0.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">No light map, direct lighting and emissive materials only </td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEir4BJRmFmC5qEPs_yOK0Kok2MUgKCv-iNYuTEk4RVsXc2e9h0DbhdC1NgnU9k2h72xtehX8j_W37kOIXX0ApD3u7QNcylO8LjVfiM7FV0-rYKeh0rldlXrCMbZuZIfLD6rU9OL_yJelWu0/s1600/bake_pass_0_lighting.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="671" data-original-width="1024" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEir4BJRmFmC5qEPs_yOK0Kok2MUgKCv-iNYuTEk4RVsXc2e9h0DbhdC1NgnU9k2h72xtehX8j_W37kOIXX0ApD3u7QNcylO8LjVfiM7FV0-rYKeh0rldlXrCMbZuZIfLD6rU9OL_yJelWu0/s320/bake_pass_0_lighting.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Lighting without albedo texture</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
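<br />
A minimal sketch of that cube-map-to-SH projection for one color channel (2nd order SH, 4 coefficients); the per-texel solid angle weight compensates for the uneven texel distribution over a cube face. The face direction mapping is abbreviated here:<br />
<pre>
#include &lt;cmath&gt;

// Unnormalized direction of a texel at (u, v) in [-1,1] on a cube face.
// Only the +X face is shown; the other 5 faces are analogous.
static void faceDir(int face, float u, float v, float d[3])
{
    if (face == 0) { d[0] = 1.0f; d[1] = -v; d[2] = -u; }
    // ... faces 1..5 omitted for brevity ...
}

// texels: face-major, 6 * size * size values of one channel.
void projectToSH(const float* texels, int size, float coeff[4])
{
    coeff[0] = coeff[1] = coeff[2] = coeff[3] = 0.0f;
    for (int face = 0; face != 6; ++face)
    for (int y = 0; y != size; ++y)
    for (int x = 0; x != size; ++x) {
        float u = (2.0f * (x + 0.5f)) / size - 1.0f;
        float v = (2.0f * (y + 0.5f)) / size - 1.0f;
        // solid angle of this texel: du*dv / (1 + u^2 + v^2)^1.5
        float duv = 2.0f / size;
        float w = duv * duv / std::pow(1.0f + u * u + v * v, 1.5f);
        float d[3]; faceDir(face, u, v, d);
        float inv = 1.0f / std::sqrt(d[0]*d[0] + d[1]*d[1] + d[2]*d[2]);
        float L = texels[(face * size + y) * size + x] * w;
        coeff[0] += L * 0.282095f;                 // Y00
        coeff[1] += L * 0.488603f * d[1] * inv;    // Y1-1 (y)
        coeff[2] += L * 0.488603f * d[2] * inv;    // Y10  (z)
        coeff[3] += L * 0.488603f * d[0] * inv;    // Y11  (x)
    }
}
</pre>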
<br />
<div>
<b><span style="font-size: large;">Radiosity pass 1</span></b><br />
In the first pass, we render all the meshes without any analytical light source (e.g. the directional light) into the cube map. Only the emissive materials such as the sky and the static lights placed in the scene get rendered, injecting the initial lighting into the radiosity iterations. We support sphere and box shaped static lights, which get rendered into the cube map just like an emissive mesh. Once the cube map render is completed, we convert the cube map to SH coefficients and store the values. After all the texels are rendered, we have an incoming lighting light map from emissive meshes and static light sources ready for the next pass.<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzK5_5GDoRzlFHSA1872vy5JY0oBso3JRzVkltryeRxqLMENkmsSIu6FZf_BUlBtVhRYed59lcZaCTdU-f9EmglzgiW91UDvkV4YXT3eV8wChiKTdaM8Aq57kvh9uJIkiwlf4rnCvMRpxo/s1600/bake_pass_1.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="671" data-original-width="1024" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzK5_5GDoRzlFHSA1872vy5JY0oBso3JRzVkltryeRxqLMENkmsSIu6FZf_BUlBtVhRYed59lcZaCTdU-f9EmglzgiW91UDvkV4YXT3eV8wChiKTdaM8Aq57kvh9uJIkiwlf4rnCvMRpxo/s320/bake_pass_1.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Light map baked with the emissive sky material</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMfauMAkHYrulpTI7xh5uPKa-XFeFVKNloi7T2QiYDiP7Y5Jtjiia8utrZVVevHD1OoQF8bRnC5PjpdjmGDM1tzgZ7OdagplyVDoxWJvNRfLONOvj7ai_9p2JB7rTGQYdr6h5K6NkG-00b/s1600/bake_pass_1_lighting.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="671" data-original-width="1024" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiMfauMAkHYrulpTI7xh5uPKa-XFeFVKNloi7T2QiYDiP7Y5Jtjiia8utrZVVevHD1OoQF8bRnC5PjpdjmGDM1tzgZ7OdagplyVDoxWJvNRfLONOvj7ai_9p2JB7rTGQYdr6h5K6NkG-00b/s320/bake_pass_1_lighting.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Lighting with light map using the emissive sky</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Radiosity pass 2</span></b><br />
In the second pass, we render all the meshes with only the analytical light sources and the SH light map from the previous pass into a new cube map to calculate the first-bounce incoming lighting, then convert the cube map to SH coefficients. After all the texels are rendered and converted to an SH light map, we sum this SH light map with the previous pass's SH light map to accumulate the lighting of passes 1 and 2 into another 3 accumulated SH light maps for our final storage (these accumulated light maps are not used in the radiosity iteration, they are just the final radiosity output).<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnlItwOGz7KBkU2miR1CWedUaLaoX2jbAlwYgqwP5WR-RzhQYBRyDizi_jm-woETk9QvNP1MUXtynOjcKsXtvhURHIO40u9hBjGN-YxcmdoyHEIrhkVXjRQ8BsQGxEsBVAZTu0VaC4djyN/s1600/bake_pass_2_indirect.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="671" data-original-width="1024" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnlItwOGz7KBkU2miR1CWedUaLaoX2jbAlwYgqwP5WR-RzhQYBRyDizi_jm-woETk9QvNP1MUXtynOjcKsXtvhURHIO40u9hBjGN-YxcmdoyHEIrhkVXjRQ8BsQGxEsBVAZTu0VaC4djyN/s320/bake_pass_2_indirect.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Light map baked with direct lighting and emissive material</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmzQvUoSItgQuOCq64ax6WQe5HV0RWCbNuH_ImnEUiBb_bzT6mJybHvIBsp0i3mZTP2ZVvjEdzsmiROUvgkJiaLedocbjocldguIV5juAnNe4LYtqG5rE-Tp2XkZOKzhHwNf9hgzY2Tuwj/s1600/bake_pass_2_indirect_lighting.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="671" data-original-width="1024" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmzQvUoSItgQuOCq64ax6WQe5HV0RWCbNuH_ImnEUiBb_bzT6mJybHvIBsp0i3mZTP2ZVvjEdzsmiROUvgkJiaLedocbjocldguIV5juAnNe4LYtqG5rE-Tp2XkZOKzhHwNf9hgzY2Tuwj/s320/bake_pass_2_indirect_lighting.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Lighting with light map using direct lighting and emissive sky</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
</div>
<br />
<b><span style="font-size: large;">Radiosity pass >= 3</span></b><br />
For the subsequent passes, we can use the SH light map from the previous iteration to render the cube maps and repeat the convert-to-SH and accumulate-SH-lighting steps to get the incoming indirect lighting for each light map texel.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_5eYNzqxtIbAHFRdkyLk2IUJvvbnTFPXFFDFBhIwC4DAuGcTModn_D9cIiTR7tC46g_QEKrC7tyYUSxgSWEJ5TIBK2uWKaPzp4RBR9AXDBvO9D6_uRSOfRE47DV5fak_ZtBSYq-iB3DnC/s1600/bake_pass_3.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="671" data-original-width="1024" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_5eYNzqxtIbAHFRdkyLk2IUJvvbnTFPXFFDFBhIwC4DAuGcTModn_D9cIiTR7tC46g_QEKrC7tyYUSxgSWEJ5TIBK2uWKaPzp4RBR9AXDBvO9D6_uRSOfRE47DV5fak_ZtBSYq-iB3DnC/s320/bake_pass_3.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Final baked result, showing both direct and indrect lighting</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9hJIDaNtg6zRQ37LEPeNbxj4tvDtt8MEQ1BrRZG3H1G0p5kouYaoOY43k9AblM6QkLPNEaqM0phGePu2xFtmi5f3NwR8CIq59vpwi4QdL9seunoFtZSnn1BcSdv8oR-Vg07uLiwKv0TrK/s1600/bake_pass_3_lighting.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="671" data-original-width="1024" height="209" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9hJIDaNtg6zRQ37LEPeNbxj4tvDtt8MEQ1BrRZG3H1G0p5kouYaoOY43k9AblM6QkLPNEaqM0phGePu2xFtmi5f3NwR8CIq59vpwi4QdL9seunoFtZSnn1BcSdv8oR-Vg07uLiwKv0TrK/s320/bake_pass_3_lighting.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Lighting using light map, without albedo texture </td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Storage Format</span></b><br />
To store the light map data for runtime and reduce memory usage (3 SH light maps in float format, i.e. 12 values per texel, is too much data to store...), we decompose the incoming lighting color data into luma and chroma. Only the luma data is stored in SH format. For the chroma, we integrate the SH RGB incoming lighting with an SH cosine transfer function along the static mesh normal direction, which gives the reflected Lambertian lighting, and we compute an average chroma value from it. This preserves the directional variation of the indirect lighting, keeps an average color of the incoming lighting, and reduces the light map storage to 6 values per texel. To further reduce memory usage, we clamp the incoming SH luma values to a predefined range so that they can be stored in an 8-bit texture. However, block compression like DXT results in artifacts, so we simply store the light map data in 2 RGBA8 textures. <br />
<br />
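For illustration, decoding such a light map at run-time could look like the sketch below; the channel layout, the remap range and the luma/chroma reconstruction here are assumptions for the example, not the exact format shipped in "Seal Guardian":<br />
<blockquote class="tr_bq">
<span style="font-size: xx-small;">Texture2D lightMapSH; // RGBA8: the 4 SH luma coefficients, clamped to [shMin, shMax]<br />Texture2D lightMapChroma; // RGBA8: the average chroma stored in .rg<br />SamplerState samplerLinearClamp;<br /><br />float3 DecodeLightMap(float2 uv, float3 N, float shMin, float shMax)<br />{<br /> float4 shLuma = lightMapSH.Sample(samplerLinearClamp, uv);<br /> shLuma = shLuma * (shMax - shMin) + shMin; // undo the 8-bit range clamp<br /> // evaluate the luma SH along the surface normal<br /> float4 basis = float4(0.282095, 0.488603 * N.y, 0.488603 * N.z, 0.488603 * N.x);<br /> float luma = max(dot(shLuma, basis), 0.0);<br /> float2 chroma = lightMapChroma.Sample(samplerLinearClamp, uv).rg;<br /> // hypothetical chroma-offset reconstruction; the actual encoding is not described here<br /> return float3(luma + chroma.x, luma, luma + chroma.y);<br />}</span></blockquote>
<br />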
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgctE0C6k79ZTQEKeiZB7GX1eEC8aUy-KVKAoBHLbqY9iGXc1CMrImPuzkTJZbY8a4LtAWzxfhms4PA09ChxJKsY3hgwqnfAwqMDBcV9F8QPRZFncfK4wE-TTPQKk9eCCR1rkidUyL-rRXO/s1600/final_lightmap.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="514" data-original-width="1027" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgctE0C6k79ZTQEKeiZB7GX1eEC8aUy-KVKAoBHLbqY9iGXc1CMrImPuzkTJZbY8a4LtAWzxfhms4PA09ChxJKsY3hgwqnfAwqMDBcV9F8QPRZFncfK4wE-TTPQKk9eCCR1rkidUyL-rRXO/s640/final_lightmap.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Final light map used in run-time, storing SH luma and average chroma</td></tr>
</tbody></table>
<br />
<b><span style="font-size: large;">Conclusion</span></b><br />
In this post, we have briefly outlined how the light maps are created in "Seal Guardian". The baker is based on a modified radiosity hemicube method, using SH as an intermediate representation for baking and reducing the storage size by splitting the lighting data into luma and chroma. We skipped some of the baking details, like padding the lighting data for each UV shell in each radiosity iteration to avoid light leaking from empty light map texels. Also, "Seal Guardian" is rendered using PBR, which means we have metallic materials that don't work well with radiosity. Instead of converting metallic surfaces to diffuse materials, we pre-filter all the environment probes in each radiosity pass to get the lighting for metallic materials. In the future, we would like to improve the light map baking, such as reducing the baking time, fixing the compression problem (we may try BC6H, but would need to find another compression method for iOS...), or using a smaller texture size for the chroma light map than for the luma SH light map texture...<br />
<br />
Lastly, if you are interested in "Seal Guardian", feel free to <a href="http://www.whitebudgie.games/seal_guardian.html">check it out</a>; its <a href="http://store.steampowered.com/app/741620/Seal_Guardian/">Steam store page</a> is live now. It will be released on 8<sup>th</sup> Dec, 2017 on iOS/Mac/PC. Thank you.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFtlx-0fLAOT5q0-zOOie2mDCXynVLK61FeYPOfkzMKlNfcpn5781sMyEW5Hk3t6eJifHnefOFeoieN7uiyJeO_wg87AlAyX4sVVPt_1laqgkaDNYIXCnwuluznMhxIZhZ6XS-o64d_Oj6/s1600/metallic_wall_direct.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="476" data-original-width="1024" height="148" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFtlx-0fLAOT5q0-zOOie2mDCXynVLK61FeYPOfkzMKlNfcpn5781sMyEW5Hk3t6eJifHnefOFeoieN7uiyJeO_wg87AlAyX4sVVPt_1laqgkaDNYIXCnwuluznMhxIZhZ6XS-o64d_Oj6/s320/metallic_wall_direct.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The yellow light bounce on the floor is done by the yellow metallic wall with pre-filtering the environment map in each radiosity pass</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPe_bMvHiaCKj7qg84jjy2LeiVFvee2sxAl5KSJKQf9MGASUgjpNgoqpHQ0gmhLJcbpycZqzOVHcllbtWNg-fLoDAX7_LQpl-c_2_mMW1cTsYrA3p0VRgi2Kvzg2vhZX4kjy_-DrnQISt6/s1600/metallic_wall_indirect.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="476" data-original-width="1024" height="148" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPe_bMvHiaCKj7qg84jjy2LeiVFvee2sxAl5KSJKQf9MGASUgjpNgoqpHQ0gmhLJcbpycZqzOVHcllbtWNg-fLoDAX7_LQpl-c_2_mMW1cTsYrA3p0VRgi2Kvzg2vhZX4kjy_-DrnQISt6/s320/metallic_wall_indirect.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Showing only the indirect lighting<br />
<br /></td></tr>
</tbody></table>
</td><td><br /></td></tr>
</tbody></table>
<br /><b>References</b><br />
<span style="font-size: x-small;">[1] <a href="https://www.siggraph.org/education/materials/HyperGraph/radiosity/overview_2.htm">https://www.siggraph.org/education/materials/HyperGraph/radiosity/overview_2.htm</a></span><br />
<span style="font-size: x-small;">[2] <a href="http://blackpawn.com/texts/lightmaps/">http://blackpawn.com/texts/lightmaps/</a></span><br />
<span style="font-size: x-small;">[3] <a href="http://www.ppsloan.org/publications/StupidSH36.pdf">http://www.ppsloan.org/publications/StupidSH36.pdf</a></span><br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com1tag:blogger.com,1999:blog-7659461179709896430.post-30857311232070502102017-11-23T08:18:00.000+08:002017-11-23T08:18:42.499+08:00"Seal Guardian" announced!!!<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcgmCmEA4Aad7ctamwjFI6p13kEbcAK_Rp44aOtkybXmkCjub-MtnTniq5kA977T1xPUUJxCTNl5aqI8wP0XqvNyUxY6XVZcQP9tb2UgDuOp25D0LdZK668Ol3yeR3KHSK-FkAhdT5_ORh/s1600/main_capsule_616_353.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="353" data-original-width="616" height="366" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcgmCmEA4Aad7ctamwjFI6p13kEbcAK_Rp44aOtkybXmkCjub-MtnTniq5kA977T1xPUUJxCTNl5aqI8wP0XqvNyUxY6XVZcQP9tb2UgDuOp25D0LdZK668Ol3yeR3KHSK-FkAhdT5_ORh/s640/main_capsule_616_353.png" width="640" /></a></div>
<br />
Finally, <a href="http://www.whitebudgie.games/seal_guardian.html">"Seal Guardian"</a> is announced!!!<br />
<br />
It has been a very long time since my last post; I was busy making the game "Seal Guardian".<br />
<br />
"Seal Guardian" is a hard core hack and slash action game, powered by the engine described in this blog. It took me more than 5 years to code the engine(with some help of open source libraries like Bullet physics, Lua and DLMalloc.)/gameplay and creating all the visual artwork from modelling, texturing, skinning, rigging and animation... The game will be available on 8<sup>th</sup> December, 2017 on iOS/Mac/PC via iOS App Store/Mac App Store/<a href="http://store.steampowered.com/app/741620/Seal_Guardian/">Steam</a>.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/Ek2DMv7V2yM/0.jpg" frameborder="0" height="270" src="https://www.youtube.com/embed/Ek2DMv7V2yM?feature=player_embedded" width="480"></iframe></div>
<br />
Finishing a game takes lots of effort, patience and time, especially when making the whole game on your own. It contains lots of fun tasks like rendering and gameplay, but even more boring tasks like the game menu, localisation, and UI (e.g. handling all the input UI for mouse/keyboard, touch screen, and different gamepad types like PS4/XBox/MFi, in different languages, at different resolutions), making the website, preparing online store artwork like the app icon, trailer video and screenshots (where different stores have different resolution requirements... :S), and opening a bank account (which may take more than a month for a small indie game company). I hope I can share these in future blog posts.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvYetX1Qwz4FSC3uSbeTVwe17tJ4FJf9Cx79mDOKaqwMHeEx9rQscu2onkihyW7xnnowHKC_xLrzDqWjqMXMHOB973LUHA6cbRv0UgKQ_LLPS8C-XecdOyzw6Xj0LsXqQc309YRj7wQtuY/s1600/shot_2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="900" data-original-width="1600" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvYetX1Qwz4FSC3uSbeTVwe17tJ4FJf9Cx79mDOKaqwMHeEx9rQscu2onkihyW7xnnowHKC_xLrzDqWjqMXMHOB973LUHA6cbRv0UgKQ_LLPS8C-XecdOyzw6Xj0LsXqQc309YRj7wQtuY/s400/shot_2.png" width="400" /></a></div>
<br />
While waiting for the game's release on 8<sup>th</sup> December, 2017, and before sharing its postmortem, I will write some more blog posts about the engine tech used in "Seal Guardian", e.g. the light map baking process and its storage format, how the static shadows are baked, the visibility system, the cross platform rendering pipeline... So, stay tuned!<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgilCsBMFHrPd2icJ9cNQfgB2njvUz-dulFx4caBNm9etyLrQHjDtquQ8gHOW_1hXkiKvKijMefKRzrnR7szJ-9vZ2jUQSVh31_wgmlK6x34ZsorTyP-urpLjBFPmKGZzGykpue1V0OTEd-/s1600/shot_3.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="900" data-original-width="1600" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgilCsBMFHrPd2icJ9cNQfgB2njvUz-dulFx4caBNm9etyLrQHjDtquQ8gHOW_1hXkiKvKijMefKRzrnR7szJ-9vZ2jUQSVh31_wgmlK6x34ZsorTyP-urpLjBFPmKGZzGykpue1V0OTEd-/s400/shot_3.png" width="400" /></a></div>
<br />
In the meantime, feel free to visit the "Seal Guardian" <a href="http://store.steampowered.com/app/741620/Seal_Guardian/">Steam store page</a> and share it if you like.<br />
Thank you very much!!! =]<br />
<br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com0tag:blogger.com,1999:blog-7659461179709896430.post-68441989085193248792015-02-07T08:50:00.000+08:002015-02-08T03:20:02.003+08:00Pre-Integrated Skin Shading<b><span class="Apple-style-span" style="font-size: large;">Introduction</span></b><br />
Recently, I was implementing skin shading in my engine. I chose the pre-integrated skin shading technique because it has a low performance cost and does not require an extra pass. The idea is to pre-bake the scattering effect over a ring into a texture, for different curvatures, to be looked up at run-time. More information can be found in <a href="http://www.amazon.com/GPU-Pro-2-Wolfgang-Engel/dp/1568817185">GPU Pro 2</a>, <a href="http://advances.realtimerendering.com/s2011/Penner%20-%20Pre-Integrated%20Skin%20Rendering%20(Siggraph%202011%20Advances%20in%20Real-Time%20Rendering%20Course).pptx">the SIGGRAPH slides</a>, and also in <a href="http://blog.selfshadow.com/publications/s2013-shading-course/rad/s2013_pbs_rad_notes.pdf">the presentation on the game "The Order: 1886"</a>. Here is the result implemented in my engine (all screenshots are rendered with filmic tone mapping):<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-kpxF6FACUp4Iw0UspXBmPWht0iNthcUQG7Zzzxt8lpEryrcWljc8-3unrzN1-Du8eQKtlE6B3P0BmQaNhmhKfjG5WgThyJbjjRDzRKVzgAbvmn9KOQLxbBhPE4SyBz_7MAbhYzXZnAs6/s1600/skin_result.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-kpxF6FACUp4Iw0UspXBmPWht0iNthcUQG7Zzzxt8lpEryrcWljc8-3unrzN1-Du8eQKtlE6B3P0BmQaNhmhKfjG5WgThyJbjjRDzRKVzgAbvmn9KOQLxbBhPE4SyBz_7MAbhYzXZnAs6/s1600/skin_result.png" height="243" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The head on the left is lit with Oren-Nayar shading<br />
The head on the right is lit with pre-integrated skin</td></tr>
</tbody></table>
<br />
<b><span class="Apple-style-span" style="font-size: large;">Curve Approximation for Direct Lighting</span></b><br />
In my engine, iOS is one of my target platforms, which only has <a href="https://developer.apple.com/library/ios/documentation/DeviceInformation/Reference/iOSDeviceCompatibility/OpenGLESPlatforms/OpenGLESPlatforms.html#//apple_ref/doc/uid/TP40013599-CH106-SW1">8 texture units</a> available with the OpenGL ES 2.0 API. This is not enough for the pre-integrated skin look up texture, because my engine has already used some slots for the light map, shadow map, IBL... So I need to find an approximation to the look up texture.<br />
<br />
Unfortunately, I don't have a Mathematica license at home, so I thought maybe I could fit the curve manually by inspecting its shape. I started by plotting the graph of the equation:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizQ0P2P6yPsUJAAKB5VWOCJwBzSoS0a6ekoRGPuWR8CABM3dzyjV-ma4jAwEXWDbv_9xt-nPmwqDCZ3a_hOMspiH8mAKU8GuUoXrTmUgK2tgoDMmOggzlkjE4p413HngOb9mYhKsxuct0e/s1600/forumla_direct.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizQ0P2P6yPsUJAAKB5VWOCJwBzSoS0a6ekoRGPuWR8CABM3dzyjV-ma4jAwEXWDbv_9xt-nPmwqDCZ3a_hOMspiH8mAKU8GuUoXrTmUgK2tgoDMmOggzlkjE4p413HngOb9mYhKsxuct0e/s1600/forumla_direct.png" height="73" width="320" /></a></div>
Here is the shape of the red channel diffusion curve, plotted against <i>N.L</i> (normalized to [0, 1]) and <i>r</i> (from 2 to 16):<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNK0EAQgprTv9MZ7xSSOzIczkC7_HT4PMY4UwVcZ5dBBQuYsIRnWAUBNKYx96AHMagol9SGGyXPwbh1e7UwGkOY5Rs-Deh9HowjDhGDsRC37bmFrRf0M8LCgDbQgtYxJB2HWEwi9VLHNw8/s1600/direct_R.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNK0EAQgprTv9MZ7xSSOzIczkC7_HT4PMY4UwVcZ5dBBQuYsIRnWAUBNKYx96AHMagol9SGGyXPwbh1e7UwGkOY5Rs-Deh9HowjDhGDsRC37bmFrRf0M8LCgDbQgtYxJB2HWEwi9VLHNw8/s1600/direct_R.png" height="189" width="200" /></a></div>
My idea for approximating the curve is to find some simple curves first and then interpolate between them like this:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhT-y-fw7zyoI3kcos76mjo4ZZXiYeWoTt2Sj5TmHHiK7VjFLrK20ZOBnmgr4G_lm_v3-GErbznfM2CkwBoxFVM7XLe8C-DttMUv4RvQ3b2o7N_B1ZeyjU0YmVxTmuHgVzvDUIKjld3TBYO/s1600/approx_idea.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhT-y-fw7zyoI3kcos76mjo4ZZXiYeWoTt2Sj5TmHHiK7VjFLrK20ZOBnmgr4G_lm_v3-GErbznfM2CkwBoxFVM7XLe8C-DttMUv4RvQ3b2o7N_B1ZeyjU0YmVxTmuHgVzvDUIKjld3TBYO/s1600/approx_idea.png" height="145" width="200" /></a></div>
For the light blue line in the above figure, a single straight line gives a close enough approximation:<br />
<div style="text-align: center;">
<i>curve1= saturate(1.95 * NdotL -0.96)</i></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_YSu0l5sC6UmPIezDC7O4sh7Y9ybJBqfHbt7o2ufbrWsJkDW8hAWWcjX5iVykncadKnj6SUwhtndLMJRzG9_SntNRzvUkxuI4oWSfSFfG3w2qaYKwAei62vrnwfHwc0_6Sx800SMUMlzK/s1600/direct_curve1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_YSu0l5sC6UmPIezDC7O4sh7Y9ybJBqfHbt7o2ufbrWsJkDW8hAWWcjX5iVykncadKnj6SUwhtndLMJRzG9_SntNRzvUkxuI4oWSfSFfG3w2qaYKwAei62vrnwfHwc0_6Sx800SMUMlzK/s1600/direct_curve1.png" height="189" width="200" /></a></div>
To approximate the dark blue line, I divide it into 2 parts: a linear part and a quadratic part:<br />
<div style="text-align: center;">
<i>curve0_linear= saturate(1.75* NdotL -0.76)</i></div>
<div style="text-align: center;">
<i>curve0_quadratic= 0.65*(NdotL^ 2) + 0.045</i></div>
<table align="center">
<tbody>
<tr>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWpN5L1CiZyERtmOyCUAzqZSEQWu5Vqa0E0HpBc_po5BSlLfxrcWAV3umK1h_E8EdQ3AINNyqMu0mlR4iVzFkkt21ZN15lfQ9COdQWTyZLVgWWUjOngWvU53kz52n2x0hYjLwPRrGUwD6V/s1600/curve0_upper.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWpN5L1CiZyERtmOyCUAzqZSEQWu5Vqa0E0HpBc_po5BSlLfxrcWAV3umK1h_E8EdQ3AINNyqMu0mlR4iVzFkkt21ZN15lfQ9COdQWTyZLVgWWUjOngWvU53kz52n2x0hYjLwPRrGUwD6V/s1600/curve0_upper.png" height="200" width="217" /></a></td>
<td><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVia3VMtY8w0FfstGEXbWXHTLhqCOARwmgSCkEw03PCtWAZrhYsnE_PsVz52bLQbZmTJeFjKf8FbtBTpgGfSTvt0QI6a8RSIo8ErjuRnMBOsqppQDQlxLyqKosmZ_jgq-HTSGiQZMbLm0s/s1600/curve0_lower.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVia3VMtY8w0FfstGEXbWXHTLhqCOARwmgSCkEw03PCtWAZrhYsnE_PsVz52bLQbZmTJeFjKf8FbtBTpgGfSTvt0QI6a8RSIo8ErjuRnMBOsqppQDQlxLyqKosmZ_jgq-HTSGiQZMbLm0s/s1600/curve0_lower.png" height="200" width="196" /></a> </td></tr>
</tbody></table>
Blending the linear and quadratic curves gives a curve that is similar to the original function:<br />
<div style="text-align: center;">
<i>curve0= lerp(curve0_quadratic, curve0_linear, NdotL^2)</i></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3BxeA3JnoPojPHlowrLt2IH6BnewPgB4EkVarJPGtlrovMjfukHWo5AgKg42bx8K0ycczJ3t3YbY80XKTULfVu3I7DC5Si3iDQjWTq2Vead0owhJWu_qeYP6qTepxoLLMWE5KKLPdz6fF/s1600/direct_curve0.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3BxeA3JnoPojPHlowrLt2IH6BnewPgB4EkVarJPGtlrovMjfukHWo5AgKg42bx8K0ycczJ3t3YbY80XKTULfVu3I7DC5Si3iDQjWTq2Vead0owhJWu_qeYP6qTepxoLLMWE5KKLPdz6fF/s1600/direct_curve0.png" height="176" width="200" /></a></div>
Now we have 2 curves that are similar to our original function at both ends. By mixing them together, we can get something close to the original function like this:<br />
<div style="text-align: center;">
<i>curve= lerp(curve0, curve1, 1 - (1 - curvature)^4)</i></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLs8fyOHCb_1hmBWLk2X6EUmMPb9oOBVTcd4Zlb1NEYk9ytzLLK9x1KNH0wXsGb5qQL-_A8jTh3mFRpjGjygpWsztjdh49C6hD1_gxWXN7ivOUyfgOVxsTb9PaDnzvHejRzLghxxdcEnat/s1600/func_compare.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLs8fyOHCb_1hmBWLk2X6EUmMPb9oOBVTcd4Zlb1NEYk9ytzLLK9x1KNH0wXsGb5qQL-_A8jTh3mFRpjGjygpWsztjdh49C6hD1_gxWXN7ivOUyfgOVxsTb9PaDnzvHejRzLghxxdcEnat/s1600/func_compare.png" height="195" width="200" /></a></div>
By repeating the above steps for the blue and green channels, we can shade the pre-integrated skin without the look up texture. Here is the result:<br />
<table align="center" border="0" cellspacing="0">
<tbody>
<tr>
<td height="106" width="103"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWDt5JJbUW8AkpI9MtWIwJIRkZZv1US5sRM2xVSlLLdfyGKJ7jAgR4N6vTDpPtXX7d9F-Lcr2e4LEOslSYvYydXfyaNiiFoa0xSE8mptr7iMyNu93dneA46ChUEPDbdIsH2r3TKdmPwrrp/s1600/direct_exact_r2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWDt5JJbUW8AkpI9MtWIwJIRkZZv1US5sRM2xVSlLLdfyGKJ7jAgR4N6vTDpPtXX7d9F-Lcr2e4LEOslSYvYydXfyaNiiFoa0xSE8mptr7iMyNu93dneA46ChUEPDbdIsH2r3TKdmPwrrp/s1600/direct_exact_r2.png" height="106" width="103" /></a></div>
</td>
<td height="106" width="103"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXQdQkkmDFQsGX3iKnxM8hKggZR6NDtJz-xiqpGaQK44fNOP80UdtFW_bqIrfvql1z-wZPVVhnFsjKhSmonIlo9O0cd-G0G3s4lw59Khdz626I8zn793YlIthV9tW1vbviZ4hPciqvE_E6/s1600/direct_exact_r4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXQdQkkmDFQsGX3iKnxM8hKggZR6NDtJz-xiqpGaQK44fNOP80UdtFW_bqIrfvql1z-wZPVVhnFsjKhSmonIlo9O0cd-G0G3s4lw59Khdz626I8zn793YlIthV9tW1vbviZ4hPciqvE_E6/s1600/direct_exact_r4.png" height="106" width="103" /></a></div>
</td>
<td height="106" width="103"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjva5O0OkQyUn6_3PAaKuVelEWM5o3SwyVOm3NbqP9BpNRDLAVuKk_xH0OoyE20sTbG7t9NukF3qXr40y1W_Ni2NQp0kmgwIMavXLi6XGnkLQPZEWFEDZ0Lu_3019Sl0h0PHVfepGsV-yaL/s1600/direct_exact_r8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjva5O0OkQyUn6_3PAaKuVelEWM5o3SwyVOm3NbqP9BpNRDLAVuKk_xH0OoyE20sTbG7t9NukF3qXr40y1W_Ni2NQp0kmgwIMavXLi6XGnkLQPZEWFEDZ0Lu_3019Sl0h0PHVfepGsV-yaL/s1600/direct_exact_r8.png" height="106" width="103" /></a></div>
</td>
<td height="106" width="103"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHkVPnUImbO6Q-FjwY5wUVcSt-nMNDA9elWrI1z5ylKyt-re2fNE1tbXMRZn4z_lQ-JTjcgVZLLfGU79nIATQEL4m67cCy5jYt0aijKyWKNtT9ypnNNlx7Z8nXXuaBJ3_7IwZoSVYx4N9n/s1600/direct_exact_r16.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHkVPnUImbO6Q-FjwY5wUVcSt-nMNDA9elWrI1z5ylKyt-re2fNE1tbXMRZn4z_lQ-JTjcgVZLLfGU79nIATQEL4m67cCy5jYt0aijKyWKNtT9ypnNNlx7Z8nXXuaBJ3_7IwZoSVYx4N9n/s1600/direct_exact_r16.png" height="106" width="103" /></a></div>
</td>
</tr>
<tr>
<td height="106" width="103"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgx5ra4R4UMTYN9GzxqNneQgQwxAbA-ExoQr45CKvP5OEO471qjSpI0eHLIqI69ML97MjTPbjGcZBA5U2kMoFzw0HTUKwE94LyWXK3fQmaZ6Mum6OC9UHPmElllxcAZZgZWDaQGDpR1svwM/s1600/direct_approx_r2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgx5ra4R4UMTYN9GzxqNneQgQwxAbA-ExoQr45CKvP5OEO471qjSpI0eHLIqI69ML97MjTPbjGcZBA5U2kMoFzw0HTUKwE94LyWXK3fQmaZ6Mum6OC9UHPmElllxcAZZgZWDaQGDpR1svwM/s1600/direct_approx_r2.png" height="106" width="103" /></a></div>
</td>
<td height="106" width="103"><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiq7_LE1IZ9mgpRZSGMfAjW05rxCDMonqMIF6aUDlxfpYAJSTBgqckXZ5GTmvfcSuEeShNTO8tewaNfvzWD6knfKyv8SDEYzsZuWF61_BpdwL45jQB-f3sCOTfasP1N_1t0p71h4eAnEKQf/s1600/direct_approx_r4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiq7_LE1IZ9mgpRZSGMfAjW05rxCDMonqMIF6aUDlxfpYAJSTBgqckXZ5GTmvfcSuEeShNTO8tewaNfvzWD6knfKyv8SDEYzsZuWF61_BpdwL45jQB-f3sCOTfasP1N_1t0p71h4eAnEKQf/s1600/direct_approx_r4.png" height="106" width="103" /></a></div>
</td>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJZWOb5Ouexc3ZVzR2WQoU3Z54pwyvYLRexcK1Uq_EML23VHAMJ_i2EEr_wi8fyR1tPODf7kc8RYDRbrycjPnRFjvhOg-XhhLJnSu7GjdGlBvjWsOC1kJGfJmRlOZzBD6kIlJaMIjkJdyj/s1600/direct_approx_r8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiJZWOb5Ouexc3ZVzR2WQoU3Z54pwyvYLRexcK1Uq_EML23VHAMJ_i2EEr_wi8fyR1tPODf7kc8RYDRbrycjPnRFjvhOg-XhhLJnSu7GjdGlBvjWsOC1kJGfJmRlOZzBD6kIlJaMIjkJdyj/s1600/direct_approx_r8.png" height="106" width="103" /></a>
</td>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5quZydBFVOMjfuJlp1cnc5-HpLeQCN-Wmn_QVNy3HKGeOxJmfLh4dZ_v0JKTelGzNisy8SU5Wyp8tjylaP_dlWNTTtuBqABWCXPsQ_hOdWfL-XZhEKRgs9BZ5hTQrztfPUYK7XG48R4-F/s1600/direct_approx_r16.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5quZydBFVOMjfuJlp1cnc5-HpLeQCN-Wmn_QVNy3HKGeOxJmfLh4dZ_v0JKTelGzNisy8SU5Wyp8tjylaP_dlWNTTtuBqABWCXPsQ_hOdWfL-XZhEKRgs9BZ5hTQrztfPUYK7XG48R4-F/s1600/direct_approx_r16.png" height="106" width="103" /></a>
</td>
</tr>
<tr>
<td align="center" colspan="4"><div style="text-align: left;">
<div style="text-align: left;">
Lit with a single directional light<br />
From left to right: <i>r </i>= 2, 4, 8, 16</div>
</div>
<div style="text-align: left;">
<div style="text-align: left;">
Upper row: shaded with look up texture</div>
</div>
<div style="text-align: left;">
<div style="text-align: left;">
Lower row: shaded with approximated function</div>
</div>
</td></tr>
</tbody></table>
<br />
This is how it looks when applied to a human head model:<br />
<table align="center" border="0" cellspacing="0">
<tbody>
<tr>
<td height="167" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnRavH8j8rxFOKFU2xnhY5hTM2szT_pfSJNSBgN3vuxLBdVu_t1Tu8XaiPayH_fRFBr0IDWYPEZoqVsgf2TsIuCtuq-P3x0lENDa48AH8_akF53EUk7JDAR7Ks0-I0gjEa7pmbQ3p7xSGY/s1600/direct_shading_exact.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnRavH8j8rxFOKFU2xnhY5hTM2szT_pfSJNSBgN3vuxLBdVu_t1Tu8XaiPayH_fRFBr0IDWYPEZoqVsgf2TsIuCtuq-P3x0lENDa48AH8_akF53EUk7JDAR7Ks0-I0gjEa7pmbQ3p7xSGY/s1600/direct_shading_exact.png" height="167" width="200" /></a>
</td>
<td height="167" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlIsRO8RnHq6Mq87OYRRss0Eezz4jl9lYDIXfPu24gP2D37SCOO2R5dHtDphl-24oPFc6MXc5f8GInCfydgMhkU7dKBdbUkXughgM6i4MTEMjp7Y2RMWyaQkXoQ0njkD3KbqVgxhemdH2K/s1600/direct_shading_approx.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjlIsRO8RnHq6Mq87OYRRss0Eezz4jl9lYDIXfPu24gP2D37SCOO2R5dHtDphl-24oPFc6MXc5f8GInCfydgMhkU7dKBdbUkXughgM6i4MTEMjp7Y2RMWyaQkXoQ0njkD3KbqVgxhemdH2K/s1600/direct_shading_approx.png" height="167" width="200" /></a>
</td>
<td height="167" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcSHb8WO-HjTSKhyP_CMTWldHaQci4q6hZaMr9KYumSRtL0j1r5_x7Hlc8j_IfZ3id-0o1u0kFsDCm8nXzMoPNxJJDN5RWylYFGk5jwYmOeUVqmYZ6GJ-o29EnohdV-_M4DTxZxbtceqlX/s1600/direct_shading_lambert.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcSHb8WO-HjTSKhyP_CMTWldHaQci4q6hZaMr9KYumSRtL0j1r5_x7Hlc8j_IfZ3id-0o1u0kFsDCm8nXzMoPNxJJDN5RWylYFGk5jwYmOeUVqmYZ6GJ-o29EnohdV-_M4DTxZxbtceqlX/s1600/direct_shading_lambert.png" height="167" width="200" /></a>
</td>
</tr>
<tr>
<td height="167" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiF6iF4YI8h7TYd3y2eblINrJV9g99e_xblb33kvs6YwHNhDUXsjT0yR_Kpo0ZQ0LbWLw2EM6jIZgvymzzIyYWS_NHmur8slifxlWx6-sQNz4DweeYb0ug78OOB_ujYthQfpxyCjtJDA3dw/s1600/direct_lighting_exact.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiF6iF4YI8h7TYd3y2eblINrJV9g99e_xblb33kvs6YwHNhDUXsjT0yR_Kpo0ZQ0LbWLw2EM6jIZgvymzzIyYWS_NHmur8slifxlWx6-sQNz4DweeYb0ug78OOB_ujYthQfpxyCjtJDA3dw/s1600/direct_lighting_exact.png" height="167" width="200" /></a>
</td>
<td height="167" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQEUyUQJkU9qICkF6e_xJIA1yEGSzyrgADwpnsc7adTpdHJ-KC7_7eB6oet0P4-ASXROp82_MEV_Z8W-7uE5ITDQ6aMS4ijKHeRcOoZDjZpsfiCuuPasjaubIYELik-njrviHsLRdosUjp/s1600/direct_lighting_approx.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQEUyUQJkU9qICkF6e_xJIA1yEGSzyrgADwpnsc7adTpdHJ-KC7_7eB6oet0P4-ASXROp82_MEV_Z8W-7uE5ITDQ6aMS4ijKHeRcOoZDjZpsfiCuuPasjaubIYELik-njrviHsLRdosUjp/s1600/direct_lighting_approx.png" height="167" width="200" /></a>
</td>
<td height="167" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIdu4gQSKQ_1-clLF3K_mEcgDUrHKH72W4OGBxuQJA5i1x46PfZgPpLnoneVSKyiVS4RoWDAxJUi6NUR2DfmYeBT_-3wAZ5eL76op0a3X8cBqCkiG92vRs8ZxhNIoRiEBb5Jq6MCsfirD6/s1600/direct_lighting_lambert.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIdu4gQSKQ_1-clLF3K_mEcgDUrHKH72W4OGBxuQJA5i1x46PfZgPpLnoneVSKyiVS4RoWDAxJUi6NUR2DfmYeBT_-3wAZ5eL76op0a3X8cBqCkiG92vRs8ZxhNIoRiEBb5Jq6MCsfirD6/s1600/direct_lighting_lambert.png" height="167" width="200" /></a>
</td>
</tr>
<tr>
<td align="center" colspan="3"><div style="text-align: left;">
<div style="text-align: left;">
<span class="Apple-style-span">From left to right: shaded with look up texture, approximated </span>function<span class="Apple-style-span">, lambert shader</span></div>
</div>
<div style="text-align: left;">
<div style="text-align: left;">
Upper row: shaded with albedo texture applied</div>
</div>
<div style="text-align: left;">
<div style="text-align: left;">
Lower row: showing only lighting result</div>
</div>
</td></tr>
</tbody></table>
<br />
For your reference, here is the approximated function I used for the RGB channels:<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">NdotL = mad(NdotL, 0.5, 0.5); // map to 0 to 1 range<br />float curva = (1.0/mad(curvature, 0.5 - 0.0625, 0.0625) - 2.0) / (16.0 - 2.0); // curvature is within [0, 1] remap to normalized <i>r</i> from 2 to 16<br />float oneMinusCurva = 1.0 - curva;<br />float3 curve0;<br />{<br /> float3 rangeMin = float3(0.0, 0.3, 0.3);<br /> float3 rangeMax = float3(1.0, 0.7, 0.7);<br /> float3 offset = float3(0.0, 0.06, 0.06);<br /> float3 t = saturate( mad(NdotL, 1.0 / (rangeMax - rangeMin), (offset + rangeMin) / (rangeMin - rangeMax) ) );<br /> float3 lowerLine = (t * t) * float3(0.65, 0.5, 0.9);<br /> lowerLine.r += 0.045;<br /> lowerLine.b *= t.b;<br /> float3 m = float3(1.75, 2.0, 1.97);<br /> float3 upperLine = mad(NdotL, m, float3(0.99, 0.99, 0.99) -m );<br /> upperLine = saturate(upperLine);<br /> float3 lerpMin = float3(0.0, 0.35, 0.35);<br /> float3 lerpMax = float3(1.0, 0.7 , 0.6 );<br /> float3 lerpT = saturate( mad(NdotL, 1.0/(lerpMax-lerpMin), lerpMin/ (lerpMin - lerpMax) ));<br /> curve0 = lerp(lowerLine, upperLine, lerpT * lerpT);<br />}<br />float3 curve1;<br />{<br /> float3 m = float3(1.95, 2.0, 2.0);<br /> float3 upperLine = mad( NdotL, m, float3(0.99, 0.99, 1.0) - m);<br /> curve1 = saturate(upperLine);<br />}<br />float oneMinusCurva2 = oneMinusCurva * oneMinusCurva;<br />float3 brdf = lerp(curve0, curve1, mad(oneMinusCurva2, -1.0 * oneMinusCurva2, 1.0) );</span></blockquote>
<span class="Apple-style-span" style="font-size: large; font-weight: bold;"><br /></span>
<span class="Apple-style-span" style="font-size: large; font-weight: bold;">Curve Approximation for Indirect Lighting</span><br />
In my engine, the indirect lighting is stored in spherical harmonics up to order 2. So the pre-integrated skin BRDF needs to be projected into spherical harmonics coefficients, which can be stored in a look up texture just like <a href="http://blog.selfshadow.com/publications/s2013-shading-course/rad/s2013_pbs_rad_notes.pdf">the presentation of "The Order: 1886"</a> described. One thing to note: the integral range in equation (19) from the paper should go up to π instead of π/2, because to project the coefficients we need to integrate over the whole sphere domain, and the value of D(<i>θ</i>, <i>r</i>) in the lower hemisphere may be non-zero for small <i>r</i> due to sub-surface scattering, unlike the clamped cos(θ). So I compute the spherical harmonics coefficients using this:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-X-FbqRivLaQqD8SBTkfBypil6H_ngGe0mr8N_iYLYuJi72m2uiBb__XIgpx3dZh6x1gfsamNezNg1Zda_c-ntMBD0A5SG-YO30X6pPal9H5eCFZUmGpf00EC_TI07wVdEmGFJPRli0dq/s1600/forumla_SH.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-X-FbqRivLaQqD8SBTkfBypil6H_ngGe0mr8N_iYLYuJi72m2uiBb__XIgpx3dZh6x1gfsamNezNg1Zda_c-ntMBD0A5SG-YO30X6pPal9H5eCFZUmGpf00EC_TI07wVdEmGFJPRli0dq/s1600/forumla_SH.png" height="40" width="320" /></a></div>
To make the indirect lighting work on my target platform, an approximate function for this indirect lighting look up texture also needs to be found. Using similar methods to those described above, and with some trial and error, here is my result:<br />
<table align="center" border="0" cellspacing="0">
<tbody>
<tr>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhW9s7kJYSwmxfrb1yGHyjvDiEc33j5Y-svab_troQQ21oXSic388NPXh8i2F-Qo5APXoPJRzKBzQw4kFYu6l5GTeXBUYuW9yaOkYkXYLk15oOqRhUHwNrf4Y-3Mgxps1mNmqj9Vr9Sggy4/s1600/sh_exact_r2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhW9s7kJYSwmxfrb1yGHyjvDiEc33j5Y-svab_troQQ21oXSic388NPXh8i2F-Qo5APXoPJRzKBzQw4kFYu6l5GTeXBUYuW9yaOkYkXYLk15oOqRhUHwNrf4Y-3Mgxps1mNmqj9Vr9Sggy4/s1600/sh_exact_r2.png" height="106" width="103" /></a>
</td>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKR94gMsiNjwpMCauk_S0FepW7rrEAwDdDnZutnRB6mI-nDFO9GE_0p7ZPaKexpBjjGj7imssRWGfhWwp1MfQ_O1xS8TaDGdE7frjXz2rgu3i9uxVjiKvRW_NHBPAx0w5BNIK0RjU_kwtI/s1600/sh_exact_r4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKR94gMsiNjwpMCauk_S0FepW7rrEAwDdDnZutnRB6mI-nDFO9GE_0p7ZPaKexpBjjGj7imssRWGfhWwp1MfQ_O1xS8TaDGdE7frjXz2rgu3i9uxVjiKvRW_NHBPAx0w5BNIK0RjU_kwtI/s1600/sh_exact_r4.png" height="106" width="103" /></a>
</td>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipo8TzEmPG6bNjhsfHY0GQLqXeLqNmmVfzlB0m_KFbJ6YfC-fvH2KqJXhGfJ1wgP6iCZLNsPRDKE71fU-U0X2hxsyROSmfd2iGnxZrs-hOzDU_vfvE4jbH4doUmHAcCFfZbIe_mlV2IZiu/s1600/sh_exact_r8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEipo8TzEmPG6bNjhsfHY0GQLqXeLqNmmVfzlB0m_KFbJ6YfC-fvH2KqJXhGfJ1wgP6iCZLNsPRDKE71fU-U0X2hxsyROSmfd2iGnxZrs-hOzDU_vfvE4jbH4doUmHAcCFfZbIe_mlV2IZiu/s1600/sh_exact_r8.png" height="106" width="103" /></a>
</td>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVfiQuzaEY_uBhDj8XkciA0RefEM95S9QmlGRPePB531zPHO3COkVR_lEAMF5TyzWRKQeTX_7J7ip6yvw7YT93hlbtH9ZlGVwVmLudZGh1vzqX5swYzmqgVf68eEkmqhb0CGMyCROK-AzZ/s1600/sh_exact_r16.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVfiQuzaEY_uBhDj8XkciA0RefEM95S9QmlGRPePB531zPHO3COkVR_lEAMF5TyzWRKQeTX_7J7ip6yvw7YT93hlbtH9ZlGVwVmLudZGh1vzqX5swYzmqgVf68eEkmqhb0CGMyCROK-AzZ/s1600/sh_exact_r16.png" height="106" width="103" /></a>
</td>
</tr>
<tr>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCWAdTLaY2U2RYzyW7sVssDWg3FuGFD2RmOE9BLzqSOtm550YsBLlCmmvv9BrIwZtZ-2Y44K0cQckTDZ2mvNp73wmUmbX_lyKxECEwaB6XOs_EU5V576Hwbe8TL5lvuN-LxEoWNeDbyhFa/s1600/sh_approx_r2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCWAdTLaY2U2RYzyW7sVssDWg3FuGFD2RmOE9BLzqSOtm550YsBLlCmmvv9BrIwZtZ-2Y44K0cQckTDZ2mvNp73wmUmbX_lyKxECEwaB6XOs_EU5V576Hwbe8TL5lvuN-LxEoWNeDbyhFa/s1600/sh_approx_r2.png" height="106" width="103" /></a>
</td>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHrlZPTqgrnksb8SWN5Rl2JizVJpRwaqzWx1_NNXITeJhzv3quCraGy1Z2vCCASvnehvvmtrrFb2g7Bp_5XMjdmv7H3W1I1LmUnLAGguMYUDzs2oTeHQujRYzozXMZ1KRPO-3jo0unooNQ/s1600/sh_approx_r4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHrlZPTqgrnksb8SWN5Rl2JizVJpRwaqzWx1_NNXITeJhzv3quCraGy1Z2vCCASvnehvvmtrrFb2g7Bp_5XMjdmv7H3W1I1LmUnLAGguMYUDzs2oTeHQujRYzozXMZ1KRPO-3jo0unooNQ/s1600/sh_approx_r4.png" height="106" width="103" /></a>
</td>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSprl7gOElkXvaoeAZgqiX1aRZz6ubZApgXfuEWvcTGev0nKsITYq4yNyL8hHZwbRc7WNX_IhIiCrrd8gmZWk9ChZRd3F0UjBG0_juH2rxlsDAXZxhuUyiHGd4uOaIBC37OXgVRKOGdAo-/s1600/sh_approx_r8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSprl7gOElkXvaoeAZgqiX1aRZz6ubZApgXfuEWvcTGev0nKsITYq4yNyL8hHZwbRc7WNX_IhIiCrrd8gmZWk9ChZRd3F0UjBG0_juH2rxlsDAXZxhuUyiHGd4uOaIBC37OXgVRKOGdAo-/s1600/sh_approx_r8.png" height="106" width="103" /></a>
</td>
<td height="106" width="103"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhffNbN7RWbqm8_KN1JhibkcmI-76aLV4E_NQgShdc74P4jpNZWwqt5viqkPETKaXR1wqrCGSsnhR5dqD22ckdVzOp4yFTHwLY5NrCP09U__Ss1fEnh1_YMasmhyb7GSvIA_r8z4vh3Q4Km/s1600/sh_approx_r16.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhffNbN7RWbqm8_KN1JhibkcmI-76aLV4E_NQgShdc74P4jpNZWwqt5viqkPETKaXR1wqrCGSsnhR5dqD22ckdVzOp4yFTHwLY5NrCP09U__Ss1fEnh1_YMasmhyb7GSvIA_r8z4vh3Q4Km/s1600/sh_approx_r16.png" height="106" width="103" /></a></td></tr>
<tr><td align="center" colspan="4"><div style="text-align: left;">
Lit with both a directional light and indirect lighting via the BRDF projected into SH<br />
From left to right: <i>r</i> = 2, 4, 8, 16</div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
Upper row: shaded with look up texture</div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
Lower row: shaded with approximated function</div>
</td>
</tr>
</tbody></table>
<br />
And applying it to the human head model, this time the approximation is not as close and loses a bit of red color:<br />
<table align="center" border="0" cellspacing="0">
<tbody>
<tr>
<td height="177" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4Etqh5zNEUvRBb2G4Ga2h9GeQK-NIrz06Bc8afqvuLnjZ0YIfXdTUhFfSgF4BBLoDfZIsMZ1K2mu6PCT2ekYQgOSlBE-GAr-kKQMyo_PVgkDPJrcGOpE_3VsRwRqqTPwQoCYWWiq8U3sg/s1600/SH_shading_exact.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4Etqh5zNEUvRBb2G4Ga2h9GeQK-NIrz06Bc8afqvuLnjZ0YIfXdTUhFfSgF4BBLoDfZIsMZ1K2mu6PCT2ekYQgOSlBE-GAr-kKQMyo_PVgkDPJrcGOpE_3VsRwRqqTPwQoCYWWiq8U3sg/s1600/SH_shading_exact.png" height="177" width="200" /></a>
</td>
<td height="177" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmAd_Tnu3fHCSsoehQY5t3HWTWV3Hh1N_dKCBeyGCQ-9oHaCTvOk8FyhfnlQY5WmAAlkMzW_tnEjVh8o_LMnLEsqb0Q1Y2DcyOJY8DQspjimswQ-lAiIcaD6BOqbZ5eBJ4vc_Bs72Aimg_/s1600/SH_shading_approx.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmAd_Tnu3fHCSsoehQY5t3HWTWV3Hh1N_dKCBeyGCQ-9oHaCTvOk8FyhfnlQY5WmAAlkMzW_tnEjVh8o_LMnLEsqb0Q1Y2DcyOJY8DQspjimswQ-lAiIcaD6BOqbZ5eBJ4vc_Bs72Aimg_/s1600/SH_shading_approx.png" height="177" width="200" /></a>
</td>
<td height="177" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgC0t9UGpc_OZBpQL9ejV-xJBfEWXsgXLB8iLEaZFAMAAEF_nnpF-Kg-bwrJ-YHQeflEemgi21H7P583Cri2gSgC2UBdrWMhxVJnm2vcI0rJHCthHkv3jTY-KbPVJJJeJ5sRf-Zvqly8I5-/s1600/SH_shading_lambert.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgC0t9UGpc_OZBpQL9ejV-xJBfEWXsgXLB8iLEaZFAMAAEF_nnpF-Kg-bwrJ-YHQeflEemgi21H7P583Cri2gSgC2UBdrWMhxVJnm2vcI0rJHCthHkv3jTY-KbPVJJJeJ5sRf-Zvqly8I5-/s1600/SH_shading_lambert.png" height="177" width="200" /></a>
</td>
</tr>
<tr>
<td height="177" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJ-ImZ4LPSF-JAe4IChzCP6ovgTiBLJJKfV7OgyVSb5rMtvpQ86gvYM5osAu74rsAVGWSaB4BDbGXbZHfCVNPBVw57Faa5-8i9gTCDBxUEXIBGcc-tmctgLrC5VxWzAVd-Lss9pMnDNzDl/s1600/lighting_exact.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJ-ImZ4LPSF-JAe4IChzCP6ovgTiBLJJKfV7OgyVSb5rMtvpQ86gvYM5osAu74rsAVGWSaB4BDbGXbZHfCVNPBVw57Faa5-8i9gTCDBxUEXIBGcc-tmctgLrC5VxWzAVd-Lss9pMnDNzDl/s1600/lighting_exact.png" height="177" width="200" /></a>
</td>
<td height="177" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhaig9BFbIOJ6z1tWJjvbtJfq3rT4UZXRKdgxGBKvtYsOU5c2-YjnoJVeDomBok3Nim-5nXCceeTMQc4r7rTOpPMYjfIyorFQUd5r17qh3s62j0FyLWB9ZiQiHex32AKiUFdi9JpLD8T_VB/s1600/SH_lighting_approx.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhaig9BFbIOJ6z1tWJjvbtJfq3rT4UZXRKdgxGBKvtYsOU5c2-YjnoJVeDomBok3Nim-5nXCceeTMQc4r7rTOpPMYjfIyorFQUd5r17qh3s62j0FyLWB9ZiQiHex32AKiUFdi9JpLD8T_VB/s1600/SH_lighting_approx.png" height="177" width="200" /></a>
</td>
<td height="177" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW7z9o57yGHmUfW1-2txz1Habkl22EwDlOG7fmHbobqEF2jFDccB9mNo17mA0FxVf0Sm5AqIlHPdzsUEVBvxQ_dO3gLEpOL7AUNsa_2cKtPqnl1KsPiBRuLTObLpIZvHUNm8aMThXbMVT7/s1600/SH_lighting_lambert.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW7z9o57yGHmUfW1-2txz1Habkl22EwDlOG7fmHbobqEF2jFDccB9mNo17mA0FxVf0Sm5AqIlHPdzsUEVBvxQ_dO3gLEpOL7AUNsa_2cKtPqnl1KsPiBRuLTObLpIZvHUNm8aMThXbMVT7/s1600/SH_lighting_lambert.png" height="177" width="200" /></a>
</td>
</tr>
<tr>
<td align="center" colspan="3"><div style="text-align: left;">
<span class="Apple-style-span">From left to right: shaded with look up texture, approximated </span>function<span class="Apple-style-span">, lambert shader</span></div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
Upper row: shaded with albedo texture applied</div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
Lower row: showing only lighting result</div>
</td></tr>
</tbody></table>
<br />
And some code for your reference, where zh0 and zh1 are the zonal harmonics coefficients:<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">float curva = (1.0/mad(curvature, 0.5 - 0.0625, 0.0625) - 2.0) / (16.0 - 2.0); // curvature is within [0, 1] remap to <i>r</i> distance 2 to 16<br />float oneMinusCurva = 1.0 - curva;<br />// ZH0<br />{<br /> float2 remappedCurva = 1.0 - saturate(curva * float2(3.0, 2.7) );<br /> remappedCurva *= remappedCurva;<br /> remappedCurva *= remappedCurva;<br /> float3 multiplier = float3(1.0/mad(curva, 3.2, 0.4), remappedCurva.x, remappedCurva.y);<br /> zh0 = mad(multiplier, float3( 0.061659, 0.00991683, 0.003783), float3(0.868938, 0.885506, 0.885400));<br />}<br />// ZH1<br />{<br /> float remappedCurva = 1.0 - saturate(curva * 2.7);</span><span class="Apple-style-span" style="font-size: xx-small;"> </span><span class="Apple-style-span" style="font-size: xx-small;"><br /> float3 lowerLine = mad(float3(0.197573092, 0.0117447875, 0.0040980375), (1.0f - remappedCurva * remappedCurva * remappedCurva), float3(0.7672169, 1.009236, 1.017741));<br /> float3 upperLine = float3(1.018366, 1.022107, 1.022232);<br /> zh1 = lerp(upperLine, lowerLine, oneMinusCurva * oneMinusCurva);<br />}</span></blockquote>
<b><span class="Apple-style-span" style="font-size: large;">Result</span></b><br />
Putting both the direct and indirect lighting calculations together, here is the result with a simple GGX specular, lit by 1 directional light, SH projected indirect lighting and a pre-filtered IBL.<br />
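In shader terms, the combination might look like the sketch below; treating zh0/zh1 as per-band multipliers on the SH-evaluated indirect lighting is my assumption about how they are applied, and all names are placeholders (specular and IBL omitted):<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">float3 SkinLighting(float3 N, float3 lightColor, float3 brdf, float4 shR, float4 shG, float4 shB, float3 zh0, float3 zh1)<br />{<br /> float3 direct = lightColor * brdf; // brdf from the direct lighting fit<br /> float4 basis = float4(0.282095, 0.488603 * N.y, 0.488603 * N.z, 0.488603 * N.x);<br /> float3 indirect;<br /> indirect.r = dot(shR * float4(zh0.r, zh1.rrr), basis);<br /> indirect.g = dot(shG * float4(zh0.g, zh1.ggg), basis);<br /> indirect.b = dot(shB * float4(zh0.b, zh1.bbb), basis);<br /> return direct + max(indirect, 0.0);<br />}</span></blockquote>
<br />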
<table align="center" border="0" cellspacing="0">
<tbody>
<tr>
<td height="113" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzGII6bjBXGVTGA0qbTYcYRocWpHVmcBLlgFe3re6Lu_1EArOzvw3Eh9cqcmXtlKhYXXqrWEOwP4u1lP4-KU_9xDTFKszE2A_sG5jdJjzYkQfv5lJzey1mkh2j3UgVrUXIUE-XozwF9-1m/s1600/scene0_shading.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzGII6bjBXGVTGA0qbTYcYRocWpHVmcBLlgFe3re6Lu_1EArOzvw3Eh9cqcmXtlKhYXXqrWEOwP4u1lP4-KU_9xDTFKszE2A_sG5jdJjzYkQfv5lJzey1mkh2j3UgVrUXIUE-XozwF9-1m/s1600/scene0_shading.png" height="112" width="200" /></a>
</td>
<td height="113" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicBqGJN6gZmtkflZfBKEx4qgEM7HsXEe4bmnFpBbuqt5gLgDFstHCDkW4Hp4P-87oMGkWre_aZLGN-2ApluTy9MnOcP5QI9LZcRWWuKR3NAyk4EGbSDFEMr-xThxI7rYJNEiP5N8i6-O2c/s1600/scene0_shading_direct.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicBqGJN6gZmtkflZfBKEx4qgEM7HsXEe4bmnFpBbuqt5gLgDFstHCDkW4Hp4P-87oMGkWre_aZLGN-2ApluTy9MnOcP5QI9LZcRWWuKR3NAyk4EGbSDFEMr-xThxI7rYJNEiP5N8i6-O2c/s1600/scene0_shading_direct.png" height="113" width="200" /></a>
</td>
<td height="113" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwgODdAiSG5dkayQl2gEeGgfsmsvtKUmjBWcTfiSaTlWe-Qq7O4valddD8lTpX725Xdj1MgMbDX07r3sh193112MoVd6Qq3PRHckCw8ZvCEcYJT8NUOSofQsR8i9tD2ZWp21MlNAjdODGe/s1600/scene0_shading_SH.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwgODdAiSG5dkayQl2gEeGgfsmsvtKUmjBWcTfiSaTlWe-Qq7O4valddD8lTpX725Xdj1MgMbDX07r3sh193112MoVd6Qq3PRHckCw8ZvCEcYJT8NUOSofQsR8i9tD2ZWp21MlNAjdODGe/s1600/scene0_shading_SH.png" height="113" width="200" /></a>
</td>
</tr>
<tr>
<td height="113" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg84pOOXnepUvjlvUU5VLtkVZSFRxy8AOhTWxhkjsPbXANDLPaKJsnFp5_VlEauQ9wTiFZ7bIVOoBy0u9ZRrB_0ILIYeky9JOD7NyL8fh5zqYwmNHr_LFwtXekHmj5Kt6SMNC4NcR7Yi14L/s1600/scene0_lighting.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg84pOOXnepUvjlvUU5VLtkVZSFRxy8AOhTWxhkjsPbXANDLPaKJsnFp5_VlEauQ9wTiFZ7bIVOoBy0u9ZRrB_0ILIYeky9JOD7NyL8fh5zqYwmNHr_LFwtXekHmj5Kt6SMNC4NcR7Yi14L/s1600/scene0_lighting.png" height="113" width="200" /></a>
</td>
<td height="113" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggh278JKYynmGI2YSBTeR_YqD9EsyWy7wFYU-nZV-ihDkNLHKoSf9XRufs1TuAZCTCroTyqviX1rfGmxsYqT6qv5eV8luWuKBanBI6arY-3DIVB6LjEeLOrZBr0dqUXRwxdAnH-RrNKOzs/s1600/scene0_lighting_direct.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEggh278JKYynmGI2YSBTeR_YqD9EsyWy7wFYU-nZV-ihDkNLHKoSf9XRufs1TuAZCTCroTyqviX1rfGmxsYqT6qv5eV8luWuKBanBI6arY-3DIVB6LjEeLOrZBr0dqUXRwxdAnH-RrNKOzs/s1600/scene0_lighting_direct.png" height="113" width="200" /></a>
</td>
<td height="113" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhsbTzgUZHt-DYD-4ReTDnVO9R8fTDZpGqg-pLK_UA8DniA-YjsRMK6ooWPMu9xNl3Ysa8O-yd7lQ8yMeDKMaU58NLhnCgQVK5epWzHRnN3Woys80jO3HQkN-QDfBEAkELlCKPU4tk7Y0P5/s1600/scene0_lighting_SH.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhsbTzgUZHt-DYD-4ReTDnVO9R8fTDZpGqg-pLK_UA8DniA-YjsRMK6ooWPMu9xNl3Ysa8O-yd7lQ8yMeDKMaU58NLhnCgQVK5epWzHRnN3Woys80jO3HQkN-QDfBEAkELlCKPU4tk7Y0P5/s1600/scene0_lighting_SH.png" height="113" width="200" /></a>
</td>
</tr>
<tr>
<td align="center" colspan="3"><div style="text-align: left;">
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">
<span class="Apple-style-span">Heads from left to right: shaded with lambert shader, approximated </span>function<span class="Apple-style-span">, look up texture</span></div>
</div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
</div>
Images from left to right: full shading, direct lighting only, indirect lighting only</div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
Upper row: shaded with albedo texture applied</div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
Lower row: showing only lighting result</div>
</td></tr>
</tbody></table>
<br />
With another lighting environment:<br />
<table align="center" border="0" cellspacing="0">
<tbody>
<tr>
<td height="85" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEir17PH5cySwmrr8LpiAs-eRZl1VeYfwzFYmOIQLbIo7WmkCr17rH1QpK9dG4PZri6NlKMqiXIcTeL2RFIoIuB1DoCJwARLOWabrYNYQ3k4FrqWa3wRy7a0dyWLk5M8bCaC7161SLo1g4yE/s1600/scene1_shading.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEir17PH5cySwmrr8LpiAs-eRZl1VeYfwzFYmOIQLbIo7WmkCr17rH1QpK9dG4PZri6NlKMqiXIcTeL2RFIoIuB1DoCJwARLOWabrYNYQ3k4FrqWa3wRy7a0dyWLk5M8bCaC7161SLo1g4yE/s1600/scene1_shading.png" height="85" width="200" /></a>
</td>
<td height="85" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMODyLDWLc1UxJZJF_EzBDQmXxE7mym2oYLhDBAI8DgmfqrBdo0WodFMbTY48LGlDdQVSwAm6eDwSZf2LO_fwnjZ6jWr1T3TOyXMGLgHfqWOsPz3GN9sM8PAIJQF4iAUJNT1CAnxV4P2Ui/s1600/scene1_shading_direct.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMODyLDWLc1UxJZJF_EzBDQmXxE7mym2oYLhDBAI8DgmfqrBdo0WodFMbTY48LGlDdQVSwAm6eDwSZf2LO_fwnjZ6jWr1T3TOyXMGLgHfqWOsPz3GN9sM8PAIJQF4iAUJNT1CAnxV4P2Ui/s1600/scene1_shading_direct.png" height="85" width="200" /></a>
</td>
<td height="85" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLvJbww7GOF8eZVhZygqQA-d9O8LhvH0NgwCnDZxJlgEqjFhjI_V1rOkXhSdFqk1h2GOkF_MSOrpQc9GZVvLu1GZU12ypt1gZdJnKD5V0zeET6z6o4TUUHDdDIhTAw_TkKyoOpFtHJUYCA/s1600/scene1_shading_SH.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhLvJbww7GOF8eZVhZygqQA-d9O8LhvH0NgwCnDZxJlgEqjFhjI_V1rOkXhSdFqk1h2GOkF_MSOrpQc9GZVvLu1GZU12ypt1gZdJnKD5V0zeET6z6o4TUUHDdDIhTAw_TkKyoOpFtHJUYCA/s1600/scene1_shading_SH.png" height="85" width="200" /></a>
</td>
</tr>
<tr>
<td height="85" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEcBTP1ezdrmDc02mNqIudC6vXQQJ2dcy-UQHPWXedxSsjK_cVZv7JCrtWJURwsh8ND1UosotepALH33gtQWSFdLM7-mNGlnOBEDC7PsygEn65fD6hyhB9z4GBYYdgvWfe65PCZL5Voglb/s1600/scene1_lighting.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEcBTP1ezdrmDc02mNqIudC6vXQQJ2dcy-UQHPWXedxSsjK_cVZv7JCrtWJURwsh8ND1UosotepALH33gtQWSFdLM7-mNGlnOBEDC7PsygEn65fD6hyhB9z4GBYYdgvWfe65PCZL5Voglb/s1600/scene1_lighting.png" height="85" width="200" /></a>
</td>
<td height="85" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjKiXe5Xt0NgyZJD_Vps-nX5TEAK1qb21v1TUwV9rvLTrDK1Yl1xAC6tlhVBpNPn8m7HcMk7_caIjhodvT8vQ576TOof4P64kiTMyUgQKryFKfssUA0ogbZ1y6_mSCdqWsQUZfOVMJDG_VP/s1600/scene1_lighting_direct.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjKiXe5Xt0NgyZJD_Vps-nX5TEAK1qb21v1TUwV9rvLTrDK1Yl1xAC6tlhVBpNPn8m7HcMk7_caIjhodvT8vQ576TOof4P64kiTMyUgQKryFKfssUA0ogbZ1y6_mSCdqWsQUZfOVMJDG_VP/s1600/scene1_lighting_direct.png" height="85" width="200" /></a>
</td>
<td height="85" width="200"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEioU4sQkGFcnTo-5UKwSo9-J5a9Tb0naiNZRCtywi3BbvXjZsEURheRdP_oYDVMaQ0OTnQflBFHdrin5DwD0gTJRFQqkXv39hlyyAb06BoC9ydrfukkid0dS4yEMJ6xouxJ0UyMaTVLRMKw/s1600/scene1_lighting_SH.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEioU4sQkGFcnTo-5UKwSo9-J5a9Tb0naiNZRCtywi3BbvXjZsEURheRdP_oYDVMaQ0OTnQflBFHdrin5DwD0gTJRFQqkXv39hlyyAb06BoC9ydrfukkid0dS4yEMJ6xouxJ0UyMaTVLRMKw/s1600/scene1_lighting_SH.png" height="85" width="200" /></a>
</td>
</tr>
<tr>
<td align="center" colspan="3"><div style="text-align: left;">
Heads from left to right: shaded with look up texture, approximated function<span class="Apple-style-span">, lambert shader</span></div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
<div style="text-align: left;">
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;">
Images from left to right: full shading, direct lighting only, indirect lighting only</div>
</div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
</div>
Upper row: shaded with albedo texture applied</div>
<div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-align: left;">
Lower row: showing only lighting result</div>
</td></tr>
</tbody></table>
<br />
I have uploaded the <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEio3VS8w6hPQ-qsr2Uupzgs3yybesq9TcN-zML33tjCmj-QWkn82KoazJ8KGthwJwS4UiwOSZ-35QNOGn7d5TEeGDm3uo3C4qD-W0GCkP9XY8K_V-YqI22_-c4De292EtFhaQdxUjhelLYF/s1600/head_curvature.png">curvature map for the human head here</a>, the <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjpipX9-frF9HZfjg9ZT7O6Ax3NEJItAoygkjCGKuQd_81Tba9QumVf7X8iX2x0PWqmLi8inWnLr6GAHqxm9M8fC1CBfOhXLiZMnN6BZKQ8rdIH3TT_YPELlYdVZt5xrnFSvhgwsV9coSe/s1600/lut_direct.png">look up texture for the direct lighting here</a> and the <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjLKQ8o4QJTN6W2AxjWZgHLz9xoU0bXlzJcwrUQTuVJmu811PdNCuBs3a-ACcBVPhWQkRZrFLu7Cq6x36xLW_kp93tNq4xw1-5jhzOnz4bh19c__q24L8f_MEZy9KxKXZNqm3iAetLfFyxU/s1600/lut_SH.png">indirect lighting look up texture here</a>. The textures need to be loaded without sRGB conversion. For the indirect lighting texture, I have scaled the values so that they fit into an 8-bit texture within the 0 to 1 range. So a sample use of the look up textures looks like:<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">float3 brdf= directBRDF.Sample(samplerLinearClamp, float2(mad(NdotL, 0.5, 0.5), oneOverR)).rgb;<br />float3 zh0= indirectBRDF_ZH.Sample(samplerLinearClamp, float2(oneOverR, 0.25)).rgb;<br />float3 zh1= indirectBRDF_ZH.Sample(samplerLinearClamp, float2(oneOverR, 0.75)).rgb;<br />float remapMin= 0.75;<br />float remapMax= 1.05;<br />zh0= zh0 * (remapMax - remapMin) + remapMin;<br />zh1= zh1 * (remapMax - remapMin) + remapMin;</span></blockquote>
<b><span class="Apple-style-span" style="font-size: large;">Conclusion</span></b><br />
In this post, I described a way to find an approximated function for the pre-integrated skin diffusion profile, which gives a similar result to the look-up texture for the direct lighting while losing a bit of red color in the indirect lighting. The downside of fitting the curve manually is that whenever the function changes a bit, say changing the function input from radial distance to curvature (i.e. from <i>r</i> to 1/<i>r</i>), all the approximated functions need to be redone (or the conversion needs to be done at run-time, just like in my code snippet above...). Also, the shadow scattering described in the original paper is not implemented, so some artifacts may be seen at direct shadow boundaries. Overall, the skin shading result is improved compared to shading with Lambert or Oren-Nayar under an environment with a strong directional light source.<br />
<br />
<b>Reference</b><br />
<span class="Apple-style-span" style="font-size: x-small;">[1] SIGGRAPH 2011- Pre-Integrated Skin Shading</span><br />
<span class="Apple-style-span" style="font-size: x-small;"><a href="http://advances.realtimerendering.com/s2011/Penner%20-%20Pre-Integrated%20Skin%20Rendering%20(Siggraph%202011%20Advances%20in%20Real-Time%20Rendering%20Course).pptx">http://advances.realtimerendering.com/s2011/Penner%20-%20Pre-Integrated%20Skin%20Rendering%20(Siggraph%202011%20Advances%20in%20Real-Time%20Rendering%20Course).pptx</a></span><br />
<span class="Apple-style-span" style="font-size: x-small;">[2] GPU Pro 2- </span><span class="Apple-style-span" style="font-size: x-small;">Pre-Integrated Skin Shading</span><span class="Apple-style-span" style="font-size: x-small;"> <a href="http://www.amazon.com/GPU-Pro-2-Wolfgang-Engel/dp/1568817185">http://www.amazon.com/GPU-Pro-2-Wolfgang-Engel/dp/1568817185</a></span><br />
<span class="Apple-style-span" style="font-size: x-small;">[3] Crafting a Next-Gen Material Pipeline for The Order: 1886 <a href="http://blog.selfshadow.com/publications/s2013-shading-course/rad/s2013_pbs_rad_notes.pdf">http://blog.selfshadow.com/publications/s2013-shading-course/rad/s2013_pbs_rad_notes.pdf</a></span><br />
<span class="Apple-style-span" style="font-size: x-small;">[4] GPU Gem 3- Advanced Techniques for Realistic Real-Time Skin Rendering </span><span class="Apple-style-span" style="font-size: x-small;"><a href="http://http.developer.nvidia.com/GPUGems3/gpugems3_ch14.html">http://http.developer.nvidia.com/GPUGems3/gpugems3_ch14.html</a></span><br />
<span class="Apple-style-span" style="font-size: x-small;">[5] Mathematica and Skin Rendering <a href="http://c0de517e.blogspot.jp/2011/09/mathematica-and-skin-rendering.html">http://c0de517e.blogspot.jp/2011/09/mathematica-and-skin-rendering.html</a></span><br />
<span class="Apple-style-span" style="font-size: x-small;">[6] Addendum to Mathematica and Skin Rendering <a href="http://c0de517e.blogspot.jp/2011/09/mathematica-and-skin-rendering.html">http://c0de517e.blogspot.jp/2011/09/mathematica-and-skin-rendering.html</a></span><br />
<span class="Apple-style-span" style="font-size: x-small;">[7] 3D head model by Infinite-Realities <a href="http://graphics.cs.williams.edu/data/meshes/head.zip">http://graphics.cs.williams.edu/data/meshes/head.zip</a></span><br />
<br />Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com10tag:blogger.com,1999:blog-7659461179709896430.post-57542720822775778792014-10-01T17:28:00.001+08:002015-02-08T03:19:40.359+08:00Recent Update 2014<b><span class="Apple-style-span" style="font-size: large;">Overview</span></b><br />
It has been a long time since my last post. Life has not been that good here in <a href="http://www.scmp.com/news/hong-kong/article/1603350/police-fire-tear-gas-and-baton-charge-thousands-occupy-central">Hong Kong</a>, but I have still been working on my engine in my spare time. This post shows some of the stuff I have been working on in the past few months.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMbkf4nlqvJ225nhP8O4qhyphenhyphen4AMJhBtgrCeFBuRncP4U5ux1F-gSlobfkdaYacqqcMihWV1UGq9U8EB5Jekfm78Vr7HOE4FCCunqgNT6RJofuXBek_cWBBad5aecfQAGBPXDYlRq0dfOIng/s1600/overview.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhMbkf4nlqvJ225nhP8O4qhyphenhyphen4AMJhBtgrCeFBuRncP4U5ux1F-gSlobfkdaYacqqcMihWV1UGq9U8EB5Jekfm78Vr7HOE4FCCunqgNT6RJofuXBek_cWBBad5aecfQAGBPXDYlRq0dfOIng/s1600/overview.png" height="328" width="640" /></a></div>
<br />
<b><span class="Apple-style-span" style="font-size: large;">Shading</span></b><br />
On the graphics side, the engine has switched to physically based shaders, using the GGX model for specular and the Oren-Nayar model for diffuse shading. The GGX model is implemented according to <a href="http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_notes_v2.pdf">this paper from Unreal Engine 4[1]</a>, while the Oren-Nayar model is implemented according to <a href="http://research.tri-ace.com/Data/s2012_beyond_CourseNotes.pdf">this paper from tri-Ace[2]</a>.<br />
<br />
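For illustration, here is a minimal HLSL sketch of these terms as published in [1] (a sketch of the standard formulas, not my engine's exact shader code; <i>NdotH</i>, <i>NdotV</i>, <i>NdotL</i>, <i>VdotH</i>, <i>roughness</i> and <i>specColor</i> are assumed to be computed beforehand):<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">// GGX / Trowbridge-Reitz NDF with alpha = roughness^2, as in the UE4 notes[1]<br />float a= roughness * roughness;<br />float a2= a * a;<br />float d= NdotH * NdotH * (a2 - 1.0) + 1.0;<br />float D= a2 / (3.14159265 * d * d);<br />// Schlick-GGX visibility with k = (roughness + 1)^2 / 8 for analytic lights<br />float k= (roughness + 1.0) * (roughness + 1.0) / 8.0;<br />float G= (NdotV / (NdotV * (1.0 - k) + k)) * (NdotL / (NdotL * (1.0 - k) + k));<br />// Schlick Fresnel approximation<br />float3 F= specColor + (1.0 - specColor) * pow(1.0 - VdotH, 5.0);<br />// Cook-Torrance microfacet specular, to be multiplied by NdotL and the light color<br />float3 specular= D * G * F * (1.0 / (4.0 * NdotL * NdotV));</span></blockquote>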
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZj83p7JUXm3evjCWlnTuQVoyC_Q5o5jivNz1zW8247JGbT1uPjjJOtERIX9W3J2SbY76v-pLi7L-VRaUzDkxKTROA_ChAloZUkIiQcl-T84uC4MqWDiwyAgtw4x7vsrxEn7ZTjEtdevsp/s1600/pbr.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZj83p7JUXm3evjCWlnTuQVoyC_Q5o5jivNz1zW8247JGbT1uPjjJOtERIX9W3J2SbY76v-pLi7L-VRaUzDkxKTROA_ChAloZUkIiQcl-T84uC4MqWDiwyAgtw4x7vsrxEn7ZTjEtdevsp/s1600/pbr.png" height="205" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Meshes shaded with physically based shaders with different roughness</td></tr>
</tbody></table>
The Oren-Nayar model is chosen over the Lambert model because it takes roughness into account. During shading, meshes with a roughness map but without a normal map show a bit more detail under the Oren-Nayar model.<br />
<br />
<table>
<tbody>
<tr>
<td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcFhyvIvKz3KvP6VEkyVcCJO34ls8jfBHj0ldz8gb_0ajIHHhhs3FSQDGYCXE14UCnXqWOEDPwX2CgBX4EemAarBcSkbUHW-IpG1D5z69XDZJWvzf0YLI5CuLTk1oFXKMH2jmqX10TKLAN/s1600/rougness_001.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjcFhyvIvKz3KvP6VEkyVcCJO34ls8jfBHj0ldz8gb_0ajIHHhhs3FSQDGYCXE14UCnXqWOEDPwX2CgBX4EemAarBcSkbUHW-IpG1D5z69XDZJWvzf0YLI5CuLTk1oFXKMH2jmqX10TKLAN/s1600/rougness_001.png" height="64" width="124" /></a></div>
</td>
<td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUdcU1doG7hQU31yoIpr3zhS2yuKpUzkju483iVNWlyNly2xsmx3my0jjnE3riYJHvHTMzrFUTwnCgBL1M-Q-DKoVcfJnHhNkSp-tiSiiQn4zD3YMQv2UP5iqnN-bt5LsyTw_cCY8oygEW/s1600/rougness_002.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUdcU1doG7hQU31yoIpr3zhS2yuKpUzkju483iVNWlyNly2xsmx3my0jjnE3riYJHvHTMzrFUTwnCgBL1M-Q-DKoVcfJnHhNkSp-tiSiiQn4zD3YMQv2UP5iqnN-bt5LsyTw_cCY8oygEW/s1600/rougness_002.png" height="64" width="124" /></a></div>
</td>
<td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRfAamByiMJyyQFK6h4Dv-xtZwrMPuoBs0b6Jfhu8muHc87Tv9Aoer67wOad_qxGr2FUsac60m31EQv0HR6GNLIYC6AvWCSNZyrjimf-HyRMUbOI6jSaxnwHTzRKGqzL4UCjI8Gr_OxbG1/s1600/rougness_003.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRfAamByiMJyyQFK6h4Dv-xtZwrMPuoBs0b6Jfhu8muHc87Tv9Aoer67wOad_qxGr2FUsac60m31EQv0HR6GNLIYC6AvWCSNZyrjimf-HyRMUbOI6jSaxnwHTzRKGqzL4UCjI8Gr_OxbG1/s1600/rougness_003.png" height="64" width="124" /></a></div>
</td>
<td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2gpR-36Un-XcB1d53Czt-g6h2s673BDl3VGZU1SeYO0Vhcmj_PweBVwzJLODFqU6nl9btyvvjQZIgqZ0_HCAdQT6DsHGSo4b-D8qwXPE1vClMAr9p7MKVx9piVDzIWMjccwba-ygXDDqr/s1600/rougness_004.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2gpR-36Un-XcB1d53Czt-g6h2s673BDl3VGZU1SeYO0Vhcmj_PweBVwzJLODFqU6nl9btyvvjQZIgqZ0_HCAdQT6DsHGSo4b-D8qwXPE1vClMAr9p7MKVx9piVDzIWMjccwba-ygXDDqr/s1600/rougness_004.png" height="64" width="124" /></a></div>
</td>
<td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTN_itXd0PSu1J-MlvZhZOE44odki9Qvl1lO3F4mtLuKpfjaJjuZ7TcMmHlo5gYa_IaqCmYE-0bjbgYnRTOcEld7yIyCBNpS3wr4J9H0WvRY35dCsWg8dGqmfPT7MplqSEYpxobz1qQo_A/s1600/rougness_005.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgTN_itXd0PSu1J-MlvZhZOE44odki9Qvl1lO3F4mtLuKpfjaJjuZ7TcMmHlo5gYa_IaqCmYE-0bjbgYnRTOcEld7yIyCBNpS3wr4J9H0WvRY35dCsWg8dGqmfPT7MplqSEYpxobz1qQo_A/s1600/rougness_005.png" height="64" width="124" /></a></div>
</td>
</tr>
<tr>
<td colspan="5"><div style="text-align: center;">
A mesh shaded without a normal map under indirect lighting:</div>
<div style="text-align: center;">
From left to right: final result, lighting only, normal, roughness, albedo </div>
</td>
</tr>
</tbody></table>
<br />
Also, a <a href="http://en.wikipedia.org/wiki/Radiosity_(computer_graphics)">radiosity[3]</a> baking tool was written to calculate the indirect diffuse lighting for static meshes by rendering cube-maps at every light map texel position. The cube-maps are then projected into Spherical Harmonics (SH) during the radiosity iterations, and the results are stored as SH luma plus an average chroma over the hemisphere around the vertex normal (this chroma format doesn't play well with texture compression though, so another storage representation may be needed in the future...).<br />
<br />
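As a rough sketch of how this storage could be decoded at shading time (the variable names are hypothetical, and the band 1 coefficients are assumed to be stored in the same order as the normal components):<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">// shLuma.x = band 0 luminance coefficient, shLuma.yzw = band 1 coefficients<br />// the constants are the usual band 0 / band 1 irradiance convolution weights<br />float luma= max(0.0, shLuma.x * 0.886227 + dot(shLuma.yzw, normal) * 1.023328);<br />// recover the color by modulating with the average chroma stored alongside<br />float3 indirectDiffuse= luma * avgChroma;</span></blockquote>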
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIgcZDpZhxhyphenhyphenC95hn91mArETjN3TAn-5Vloo649LHYBo_vxyzTneb9elLdWNgcddOFNXJggcCR3s05ZPjpqHQ8xg8lOfU3bvvrjNLmvekwj4-wfg5N9TGPXFc7wungedWSpfgdStzoS8_c/s1600/GI_001.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIgcZDpZhxhyphenhyphenC95hn91mArETjN3TAn-5Vloo649LHYBo_vxyzTneb9elLdWNgcddOFNXJggcCR3s05ZPjpqHQ8xg8lOfU3bvvrjNLmvekwj4-wfg5N9TGPXFc7wungedWSpfgdStzoS8_c/s1600/GI_001.png" height="102" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Direct + Indirect lighting</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghpN5VR8fc6PALjjdynq_ZjMdNRgld8ieogziAEtk64h0Jp01f7NV1Br0f_JmPXiKAilkAZnN08heq6-U3l89d2lYJLZI2d-X8pqgwGSbK6Ejmg07TvwgMqZD6Ci6LVPQ0RcBp8wvk4SMq/s1600/GI_002.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghpN5VR8fc6PALjjdynq_ZjMdNRgld8ieogziAEtk64h0Jp01f7NV1Br0f_JmPXiKAilkAZnN08heq6-U3l89d2lYJLZI2d-X8pqgwGSbK6Ejmg07TvwgMqZD6Ci6LVPQ0RcBp8wvk4SMq/s1600/GI_002.png" height="102" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Direct lighting only</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjof7XabV321bqB3HeTvAklL_2VwbeLVtIjgJn_x3Oi2s0XRj3j0jqzpm3jH6Ha2_ahQMO5nwdEOWFr6g4ORFlz0pwEMcYBp6oRk2jg2_1Bu7PdhQ4dSroNEuweJel1RAT-HN8FLirsWMZ4/s1600/GI_003.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjof7XabV321bqB3HeTvAklL_2VwbeLVtIjgJn_x3Oi2s0XRj3j0jqzpm3jH6Ha2_ahQMO5nwdEOWFr6g4ORFlz0pwEMcYBp6oRk2jg2_1Bu7PdhQ4dSroNEuweJel1RAT-HN8FLirsWMZ4/s1600/GI_003.png" height="102" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Lighting only</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
The indirect specular lighting is baked by capturing reflection probes and pre-filtering them with the GGX model according to the <a href="http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_notes_v2.pdf">Unreal Engine 4 paper[1]</a>. Currently only the closest reflection probe is used, without parallax correction.<br />
<br />
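At runtime the probe could be looked up in the usual split-sum fashion described in [1]; a minimal sketch (where <i>numMipLevels</i>, <i>envBRDFTex</i> and the sampler names are hypothetical):<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">// pick the mip level that was pre-filtered with the matching GGX roughness<br />float3 r= reflect(-viewDir, normal);<br />float mip= roughness * (numMipLevels - 1.0);<br />float3 prefiltered= probeCube.SampleLevel(samplerTrilinearClamp, r, mip).rgb;<br />// envBRDF holds the pre-integrated scale/bias of the specular color[1]<br />float2 envBRDF= envBRDFTex.Sample(samplerLinearClamp, float2(NdotV, roughness)).rg;<br />float3 indirectSpecular= prefiltered * (specColor * envBRDF.x + envBRDF.y);</span></blockquote>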
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigyZi-2FQy1agzQp0wnh58R3jVu0NEdcwT5EvDOgZR-B014xKBl_BT8qCCGPJtNwJnLbDQ1FR06mmvllj-EwQ2TRw5ywlJ3rEhToUR1NZUFgnPkL4Z4dpdo_2frbUPru_VtoPfaH41owDf/s1600/reflectionProbe_001.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigyZi-2FQy1agzQp0wnh58R3jVu0NEdcwT5EvDOgZR-B014xKBl_BT8qCCGPJtNwJnLbDQ1FR06mmvllj-EwQ2TRw5ywlJ3rEhToUR1NZUFgnPkL4Z4dpdo_2frbUPru_VtoPfaH41owDf/s1600/reflectionProbe_001.png" height="164" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Reflection probes placed within the scene</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEij4B8SWQF9R-VlbMADC2ykM4gDsZVl0F5wE930E3Gt7ii6Dj_dTHxlmbTZ_6GD6sovi20vpISTLYZQQX8zGCEvg2BwJF0yPFx3n63Ikx18RTnGQc4TXhGeSYMjEpoUm8ykVkkxiUlmJKXu/s1600/reflectionProbe_002.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEij4B8SWQF9R-VlbMADC2ykM4gDsZVl0F5wE930E3Gt7ii6Dj_dTHxlmbTZ_6GD6sovi20vpISTLYZQQX8zGCEvg2BwJF0yPFx3n63Ikx18RTnGQc4TXhGeSYMjEpoUm8ykVkkxiUlmJKXu/s1600/reflectionProbe_002.png" height="166" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Shading with pre-filtered cube-map</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
The indirect diffuse lighting for dynamic meshes is baked into <a href="http://developer.amd.com/wordpress/media/2012/10/Tatarchuk_Irradiance_Volumes.pdf">Irradiance Volumes[4]</a>, stored as SH coefficients. The SH coefficients are modified according to roughness as described in the <a href="http://research.tri-ace.com/Data/s2012_beyond_CourseNotes.pdf">tri-Ace paper[2]</a> (this is also done for the SH luma stored in the light map for static meshes).<br />
<br />
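A hedged sketch of what the runtime evaluation might look like (assuming 4 SH coefficients per color channel in the hypothetical <i>shR</i>/<i>shG</i>/<i>shB</i>, with <i>band1Scale</i> being the roughness-dependent attenuation factor derived from the tri-Ace paper[2]):<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">// band 0 + band 1 SH irradiance evaluation along the interpolated normal n<br />float4 basis= float4(0.886227, 1.023328 * n);<br />// attenuate band 1 for rough surfaces so the SH lighting gets "blurred"<br />basis.yzw= basis.yzw * band1Scale;<br />float3 irradiance= max(0.0, float3(dot(shR, basis), dot(shG, basis), dot(shB, basis)));</span></blockquote>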
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHFIuuHKYy_PtQbB5_j-OPKz4_VpWe84oV5QJW2vyz2W8epQn8can2Be4_U3L07sX479zAlOCsX19AOEPh_iSvJ-8Pk765gSdSv6NR2r8d5WJplwXsqWsTQSnxbBbH5RR8fIlcZ22gSnz2/s1600/lightProbe_002.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHFIuuHKYy_PtQbB5_j-OPKz4_VpWe84oV5QJW2vyz2W8epQn8can2Be4_U3L07sX479zAlOCsX19AOEPh_iSvJ-8Pk765gSdSv6NR2r8d5WJplwXsqWsTQSnxbBbH5RR8fIlcZ22gSnz2/s1600/lightProbe_002.png" height="172" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Irradiance Volume samples</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2tXXeWDsZO9SwFXeM8_0kHmtxyEolBybuA-0cH6kv4CAsoZIGV2q6k8UDO3WGxpgdt3VgVycJM8b_GyP-0u3WHgADComhEtSulkADz-BBn7M8ODMdYgXJWojNoNhLF74NlSSSQlqB00fg/s1600/lightProbe_001.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2tXXeWDsZO9SwFXeM8_0kHmtxyEolBybuA-0cH6kv4CAsoZIGV2q6k8UDO3WGxpgdt3VgVycJM8b_GyP-0u3WHgADComhEtSulkADz-BBn7M8ODMdYgXJWojNoNhLF74NlSSSQlqB00fg/s1600/lightProbe_001.png" height="164" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Dragon shaded with Irradiance Volume</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
<b><span class="Apple-style-span" style="font-size: large;">Shadow</span></b><br />
The shadow system is updated to use a mix of baked and real-time shadows. The light map baker described above calculates a shadow term for the main directional light at each light map texel, stored using a signed distance field representation<a href="http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf">[5]</a><a href="http://udn.epicgames.com/Three/DistanceFieldShadows.html">[6]</a> (also, an additional binary shadow term is calculated at each Irradiance Volume sample position for dynamic objects).<br />
<br />
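Runtime reconstruction of the baked term is cheap; below is a hedged sketch in the spirit of [5][6] (the 0.5 iso-value convention and the <i>penumbraWidth</i> parameter are assumptions, not my exact implementation):<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">// the light map channel stores a signed distance to the shadow boundary,<br />// remapped so that 0.5 lies exactly on the boundary<br />float sd= shadowDistanceField.Sample(samplerLinearClamp, lightMapUV).r;<br />// a smooth transition around the iso-value gives an anti-aliased, adjustable penumbra<br />float bakedShadow= smoothstep(0.5 - penumbraWidth, 0.5 + penumbraWidth, sd);</span></blockquote>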
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbMesdRLUGJg_3feYDjbIWWX1KfeX3AyoNV__K3RXAJ2ECx8FAgJJgAF2mxNJyo-Yy6tcaPhHwSJ9D8dW0oZQRQOVTKNyWOwOXhFULB5X8addE9s9hjdf6B9PXCRy94TPBLK3H7erbxVT1/s1600/shadow_001.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbMesdRLUGJg_3feYDjbIWWX1KfeX3AyoNV__K3RXAJ2ECx8FAgJJgAF2mxNJyo-Yy6tcaPhHwSJ9D8dW0oZQRQOVTKNyWOwOXhFULB5X8addE9s9hjdf6B9PXCRy94TPBLK3H7erbxVT1/s1600/shadow_001.png" height="127" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Real-time shadow only</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_K3r0aTXbakjaHKWN9M3JA_9vv1sco6hkK65LBDm08xUBQQLWV3HXaG0iUueLfYGfyuixhsTtNakmhC0zbZY3gSska04xkEqWKgkrPlQudcV_Udm2Jj1y9eHSLFOwMUw14W9tcxOn7OcJ/s1600/shadow_002.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_K3r0aTXbakjaHKWN9M3JA_9vv1sco6hkK65LBDm08xUBQQLWV3HXaG0iUueLfYGfyuixhsTtNakmhC0zbZY3gSska04xkEqWKgkrPlQudcV_Udm2Jj1y9eHSLFOwMUw14W9tcxOn7OcJ/s1600/shadow_002.png" height="127" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">baked shadow only</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBNeXmy5LDLe3clj7uqQYto4PnlpZdOEOVMKrLYyNcF72iSitt2Opbjq9NkOfXzKP98dRP1AAkzvmnFG2zr-r_mkCkhf7Y0SpCcbIh5yAzjYiCd9IQqT303BWFiDVd2MrzO9hDqM_HFLwQ/s1600/shadow_003.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBNeXmy5LDLe3clj7uqQYto4PnlpZdOEOVMKrLYyNcF72iSitt2Opbjq9NkOfXzKP98dRP1AAkzvmnFG2zr-r_mkCkhf7Y0SpCcbIh5yAzjYiCd9IQqT303BWFiDVd2MrzO9hDqM_HFLwQ/s1600/shadow_003.png" height="127" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Baked + real-time shadow</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
For the dynamic soft shadow, an <a href="http://research.edm.uhasselt.be/tmertens/papers/gi_08_esm.pdf">Exponential Shadow Map (ESM)[7]</a> is used. However, the contact shadows look too soft, so the shadow term calculated by ESM is raised to the power of 4 (i.e. <i><b>s'</b></i>= <i><b>s</b></i><sup>4</sup>, where <i>s</i> is the shadow attenuation result from ESM) to make them darker.<br />
<br />
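As a sketch (assuming the usual ESM formulation where the shadow map stores exp(<i>c</i>·depth) so the depth test turns into a multiplication[7]; variable names are hypothetical):<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">// ESM: the pre-filtered shadow map stores exp(esmConstant * occluderDepth)<br />float expOccluder= esmShadowMap.Sample(samplerLinearClamp, shadowUV).r;<br />float s= saturate(expOccluder * exp(-esmConstant * receiverDepth));<br />// darken the over-soft contact shadow as described above: s' = s^4<br />s= s * s * s * s;</span></blockquote>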
<b><span class="Apple-style-span" style="font-size: large;">Potential Visibility Set</span></b><br />
A potential visibility set (PVS) baker was written to calculate which meshes are visible from each visibility cell, used for culling the scene at runtime. A brute-force approach is used for baking: sample positions are taken on a given mesh (e.g. vertex positions + light map texel positions), and the scene is rendered from those positions to check whether the visibility cells are visible (and hence whether the mesh is potentially visible from those cells).<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoqGXLxiQpeEts5fVtTKPzwzFp3GAoYD_mpEGyotld-9s4HMasxkeWtGi7eIsqIBmimmGnytnkPYTxnDI9LylGNVV4GbHfvgLO069LaHSrS8H14B01hTs7_2zUupbtmXLW-Kzptt_6sAvG/s1600/pvs_002.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoqGXLxiQpeEts5fVtTKPzwzFp3GAoYD_mpEGyotld-9s4HMasxkeWtGi7eIsqIBmimmGnytnkPYTxnDI9LylGNVV4GbHfvgLO069LaHSrS8H14B01hTs7_2zUupbtmXLW-Kzptt_6sAvG/s1600/pvs_002.png" height="112" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Render without PVS culling</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbzm8p0ORy-U3ypZESqJ5aBZInB23-ZGyPpAtek8jQsdPTMcMhfSCJ0tE-1aR1_uA9YcM-U4bb3TchUGZt-wV4iHI95C14omOUaB1Knnffz15LxGnMjWcUyFh2qUKGMQ2DHpf5gtYXDtuc/s1600/pvs_003.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbzm8p0ORy-U3ypZESqJ5aBZInB23-ZGyPpAtek8jQsdPTMcMhfSCJ0tE-1aR1_uA9YcM-U4bb3TchUGZt-wV4iHI95C14omOUaB1Knnffz15LxGnMjWcUyFh2qUKGMQ2DHpf5gtYXDtuc/s1600/pvs_003.png" height="112" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Render with PVS culling</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMk8ouoMSeKVklce5DdsTj0BOyzq4DzcxYsOWsenY_HracS7JXWKmcQVjwJYmMKSoJesor9GLEzq0s1dMH0usTxB1-XPVU8Bb7CbZbfNVQMA1g16JMBrOKUdHdjPNrJUKioHhVmXv1odB5/s1600/pvs_001.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMk8ouoMSeKVklce5DdsTj0BOyzq4DzcxYsOWsenY_HracS7JXWKmcQVjwJYmMKSoJesor9GLEzq0s1dMH0usTxB1-XPVU8Bb7CbZbfNVQMA1g16JMBrOKUdHdjPNrJUKioHhVmXv1odB5/s1600/pvs_001.png" height="112" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Final rendered result</td></tr>
</tbody></table>
</td>
</tr>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjh-8nk0kIcNcNVMu0UvDG9kSVxw7ikXao8pTvGxnt0zjCNzu3FEeD-SE49Tm5wor6gm2nRo8e6HTStJTlKZFVffD9w5LXMi9GSYkfXC4F3f__LTMekT0ryb3tAZdsthq7Ao5YrCTg-UkOj/s1600/pvs_004.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjh-8nk0kIcNcNVMu0UvDG9kSVxw7ikXao8pTvGxnt0zjCNzu3FEeD-SE49Tm5wor6gm2nRo8e6HTStJTlKZFVffD9w5LXMi9GSYkfXC4F3f__LTMekT0ryb3tAZdsthq7Ao5YrCTg-UkOj/s1600/pvs_004.png" height="112" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Another view for the same camera render without PVS culling</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfg23yogpKJNjD7abYVcPiOErBo1qgDY5a_UAL0pm4b3AdarTwOvvuaIyMA-XjT4N2Z-AGr8S3EJSk-Hz5pi2v_F00GRJOKBluFLwv6CfhHCd2Sb15aVBnPrda550H65lHOvrFDE_3Ji23/s1600/pvs_005.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfg23yogpKJNjD7abYVcPiOErBo1qgDY5a_UAL0pm4b3AdarTwOvvuaIyMA-XjT4N2Z-AGr8S3EJSk-Hz5pi2v_F00GRJOKBluFLwv6CfhHCd2Sb15aVBnPrda550H65lHOvrFDE_3Ji23/s1600/pvs_005.png" height="112" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Another view for the same camera render with PVS culling</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjqbmrbSHu-tooz0h1AS4Itk0klcjTcb7Lt3rccq4GwnmswK4hPKLwCmCK235c-aaoY4WfNBBo1TM2J36BGeLMcKRqSeXMJn7xlCpR41DK9VpDxSqEPwgJgO4MtD7Vx-Bva5EGOJ2w7nQRT/s1600/pvs_006.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjqbmrbSHu-tooz0h1AS4Itk0klcjTcb7Lt3rccq4GwnmswK4hPKLwCmCK235c-aaoY4WfNBBo1TM2J36BGeLMcKRqSeXMJn7xlCpR41DK9VpDxSqEPwgJgO4MtD7Vx-Bva5EGOJ2w7nQRT/s1600/pvs_006.png" height="112" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Visibility Cells are placed within the scene for possible camera location</td></tr>
</tbody></table>
</td>
</tr>
</tbody></table>
<br />
<b><span class="Apple-style-span" style="font-size: large;">Particles</span></b><br />
A basic particle system is implemented which can receive both static lighting and static shadows. The particles are shaded on the CPU using the Irradiance Volumes described above, and the lighting can be calculated either per vertex, per particle or per emitter.<br />
<br />
<table>
<tbody>
<tr>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzhuQLNXscKTCzVhzdNO11iSNOjkvYQs2Y3dYz7uiI3rJbNCEKxgUxD_qpFrY9RuFDmi-UY5wYYBGxQ2uHxEFMquc2JxMVuHwuaeP-JFd_mPvZIFkJECvwaWARh8bHeHYxhNHdBmQ0CbtB/s1600/particle_001.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgzhuQLNXscKTCzVhzdNO11iSNOjkvYQs2Y3dYz7uiI3rJbNCEKxgUxD_qpFrY9RuFDmi-UY5wYYBGxQ2uHxEFMquc2JxMVuHwuaeP-JFd_mPvZIFkJECvwaWARh8bHeHYxhNHdBmQ0CbtB/s1600/particle_001.png" height="164" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Particles with lighting and shadow enabled</td></tr>
</tbody></table>
</td>
<td><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNZQKGEjCmiBDfH1mcq3eXiMHgjWMRVJJUMCqUcqqLACiE2vW_Z5a4670QKGJw2-m1K-XPr89yW4NHGFKmvJQidUseNtyQ8HSmO3o3T0z_0nchwRmdhARSbeKXCtFiaP6pX9LmanDEHaNk/s1600/particle_002.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNZQKGEjCmiBDfH1mcq3eXiMHgjWMRVJJUMCqUcqqLACiE2vW_Z5a4670QKGJw2-m1K-XPr89yW4NHGFKmvJQidUseNtyQ8HSmO3o3T0z_0nchwRmdhARSbeKXCtFiaP6pX9LmanDEHaNk/s1600/particle_002.png" height="164" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Self-shadow disabled</td></tr>
</tbody></table>
</td></tr>
</tbody></table>
The particles can also receive self-shadowing using <a href="http://sebastien.hillaire.free.fr/index.php?option=com_content&view=article&id=61&Itemid=72">Fourier Opacity Mapping[8]</a>. An opacity map is computed on the CPU side for the main directional light, assuming each particle is sphere-shaped; a shadow term can then be computed from it during shading.<br />
<br />
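For reference, a hedged sketch of the transmittance reconstruction with a single Fourier term, following the formulation in [8] (<i>d</i> is the depth of the shaded particle normalized to [0, 1] along the light ray, and <i>a0</i>/<i>a1</i>/<i>b1</i> are the accumulated Fourier coefficients from the opacity map; not my exact implementation):<br />
<blockquote class="tr_bq">
<span class="Apple-style-span" style="font-size: xx-small;">// optical depth reconstructed from the truncated Fourier series of the opacity<br />float TwoPi= 6.28318531;<br />float tau= a0 * d * 0.5 + (a1 / TwoPi) * sin(TwoPi * d) + (b1 / TwoPi) * (1.0 - cos(TwoPi * d));<br />// the self-shadow term is the transmittance up to the particle being shaded<br />float shadowTerm= exp(-tau);</span></blockquote>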
<b><span class="Apple-style-span" style="font-size: large;">Cross platform support</span></b><br />
The engine runtime supports 3 different platforms: Windows, Mac and iOS (the editors are Windows only). On Windows, the engine mainly runs on D3D11. An extra OpenGL wrapper was also written for Windows to ease the porting to iOS and Mac, because it is easier to debug OpenGL on Windows with tools like Nsight.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsQFjk-gLfnfnXW5MVYbd4ERtIDAWf0gPKM3YtvzELpKfW9RfdTZ_WSbNrvGSlOpzElJhTozBw_v6lkYgFGw8-LsN4cnCAZdvGjIgoUStdsbeSeYmV1x1I6eCyi5cqnycQn6sKp2Gofhxc/s1600/cross_platform.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsQFjk-gLfnfnXW5MVYbd4ERtIDAWf0gPKM3YtvzELpKfW9RfdTZ_WSbNrvGSlOpzElJhTozBw_v6lkYgFGw8-LsN4cnCAZdvGjIgoUStdsbeSeYmV1x1I6eCyi5cqnycQn6sKp2Gofhxc/s1600/cross_platform.jpg" height="251" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The engine runs on Windows, Mac, iOS platforms</td></tr>
</tbody></table>
<br />
To write cross-platform shaders, the shaders are written in my own shading language, which is similar to HLSL. They are then parsed by a shader parser generated with <a href="http://aquamentus.com/flex_bison.html">Flex and Bison[9]</a> into a syntax tree, which is used to output the actual HLSL and GLSL source code.<br />
<br />
<b><span class="Apple-style-span" style="font-size: large;">Final words</span></b><br />
I hope you enjoy the above screenshots, and I hope to find some time to describe these features in detail in future posts.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvW955EJYVP99W9ul771DXNlt9G3x3eM5fT9D3CK9AqgSM_5cG842KowAMGjZaTBICwbD8kFK27KivalgCA6FE0MyxALqiqavBPn9-Hj4xZh4oJEaicIrUHtArMhN30C-dw9T11rCygtmf/s1600/editor.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvW955EJYVP99W9ul771DXNlt9G3x3eM5fT9D3CK9AqgSM_5cG842KowAMGjZaTBICwbD8kFK27KivalgCA6FE0MyxALqiqavBPn9-Hj4xZh4oJEaicIrUHtArMhN30C-dw9T11rCygtmf/s1600/editor.png" height="186" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">There are some other tasks implemented in the past few months,<br />
such as various editors, audio coding as well as asset hot-reload</td></tr>
</tbody></table>
<b>Reference</b><br />
<span style="font-size: x-small;">[1] Real Shading in Unreal Engine 4 <a href="http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_notes_v2.pdf">http://blog.selfshadow.com/publications/s2013-shading-course/karis/s2013_pbs_epic_notes_v2.pdf</a></span><br />
<span style="font-size: x-small;">[2] Beyond a Simple Physically Based Blinn-Phong Model in Real-time <a href="http://research.tri-ace.com/Data/s2012_beyond_CourseNotes.pdf">http://research.tri-ace.com/Data/s2012_beyond_CourseNotes.pdf</a></span><br />
<span style="font-size: x-small;">[3] Radiosity <a href="http://en.wikipedia.org/wiki/Radiosity_(computer_graphics)">http://en.wikipedia.org/wiki/Radiosity_(computer_graphics)</a></span><br />
<span style="font-size: x-small;">[4] Irradiance Volumes for Games <a href="http://developer.amd.com/wordpress/media/2012/10/Tatarchuk_Irradiance_Volumes.pdf">http://developer.amd.com/wordpress/media/2012/10/Tatarchuk_Irradiance_Volumes.pdf</a></span><br />
<span style="font-size: x-small;">[5] Improved Alpha-Tested Magnification for Vector Textures and Special Effects <a href="http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf">http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf</a></span><br />
<span style="font-size: x-small;">[6] Distance Field Shadows <a href="http://udn.epicgames.com/Three/DistanceFieldShadows.html">http://udn.epicgames.com/Three/DistanceFieldShadows.html</a></span><br />
<span style="font-size: x-small;">[7] Exponential Shadow Maps <a href="http://research.edm.uhasselt.be/tmertens/papers/gi_08_esm.pdf">http://research.edm.uhasselt.be/tmertens/papers/gi_08_esm.pdf</a></span><br />
<span style="font-size: x-small;">[8] Fourier Opacity Mapping <a href="http://sebastien.hillaire.free.fr/index.php?option=com_content&view=article&id=61&Itemid=72">http://sebastien.hillaire.free.fr/index.php?option=com_content&view=article&id=61&Itemid=72</a></span><br />
<span style="font-size: x-small;">[9] Flex and Bison <a href="http://aquamentus.com/flex_bison.html">http://aquamentus.com/flex_bison.html</a></span><br />
<br />
Simonhttp://www.blogger.com/profile/16505698282735255970noreply@blogger.com5