Spectral Path Tracer

Introduction
Before starting this post, I would like to talk a bit about my homeland, Hong Kong. The Chinese government enacted a new National Security Law, bypassing our local legislative council. We could only read the full text of this law after it was enacted (with the official English version published 3 days after that). This law destroys our legal system completely: the government can appoint judges they like (Article 44), the jury can be removed from a trial (Article 46), and trials can be held without media and public presence (Article 41). This law is so vague that the government can prosecute anyone they don't like. People were arrested for possessing anti-government stickers. We don't even have the right to hate the government (Article 29.5). If I promote "Boycott Chinese Products", I may have broken this law already... Also, the personnel of the security office do not need to obey HK law (Article 60). This law even applies to foreigners outside HK (Article 38). Our voting rights are also deteriorating; more pro-democracy candidates can be disqualified by this law in the upcoming election (Article 35)... So, if you are living in a democratic country, please cast a vote if you can.

Back to the topic of the spectral path tracer. Path tracing in the spectral domain has been added to my toy renderer (alongside tracing in sRGB / ACEScg space). A spectral path tracer traces rays with actual wavelengths of light instead of RGB channels. The result is physically correct, and some effects can only be computed by spectral rendering (e.g. dispersion, iridescence). Although my toy spectral path tracer does not support such materials, I would like to investigate how spectral rendering affects the bounced light color compared to images rendered with RGB color spaces. The demo can be downloaded here.
Spectral rendered image

Render Loop Modification
Referencing my previous DXR Path Tracer post, only a few modifications need to be made to support spectral path tracing:

RGB path tracer render loop in previous post
When starting to trace new rays, a wavelength is randomly picked. My first implementation uses hero wavelength sampling with 3 wavelength samples per ray. The number 3 is chosen because it is convenient to replace existing code where rays are traced with RGB channels. So the "Light Path Texture" in the previous post is modified to accumulate the energy at those 3 wavelengths during ray traversal. Finally, when the ray is terminated, the resulting energy in the "Light Path Texture" will be integrated with the CIE XYZ color matching functions and stored in the "Lighting Sample Texture" in XYZ space, which will later be converted to the display device space (e.g. sRGB/AdobeRGB/Rec2020) as described in the previous post.
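For illustration, the final resolve step might look roughly like this (my own sketch, not the demo's shader; the color matching functions here use the analytic multi-lobe fits from the paper mentioned in the light up sampling section below, and pdf is whatever density the wavelengths were sampled with):

#include <cmath>

struct Xyz { float x, y, z; };

// piecewise Gaussian used by the analytic CIE color matching function fits
static float Gauss(float w, float mu, float t1, float t2)
{
    const float t = (w - mu) * ((w < mu) ? t1 : t2);
    return std::exp(-0.5f * t * t);
}

// multi-lobe fits of the CIE 1931 color matching functions
static float XBar(float w)
{
    return 0.362f * Gauss(w, 442.0f, 0.0624f, 0.0374f)
         + 1.056f * Gauss(w, 599.8f, 0.0264f, 0.0323f)
         - 0.065f * Gauss(w, 501.1f, 0.0490f, 0.0382f);
}
static float YBar(float w)
{
    return 0.821f * Gauss(w, 568.8f, 0.0213f, 0.0247f)
         + 0.286f * Gauss(w, 530.9f, 0.0613f, 0.0322f);
}
static float ZBar(float w)
{
    return 1.217f * Gauss(w, 437.0f, 0.0845f, 0.0278f)
         + 0.681f * Gauss(w, 459.0f, 0.0385f, 0.0725f);
}

// energy[i]: radiance accumulated in the Light Path Texture at wavelength lambda[i]
// pdf[i]   : density with which lambda[i] was sampled (see the wavelength sampling section)
Xyz ResolveWavelengthSamples(const float lambda[3], const float energy[3], const float pdf[3])
{
    Xyz xyz = {0.0f, 0.0f, 0.0f};
    for (int i = 0; i < 3; ++i)
    {
        // standard Monte Carlo estimate of the spectral integral, averaged over the 3 samples
        const float w = energy[i] / (pdf[i] * 3.0f);
        xyz.x += w * XBar(lambda[i]);
        xyz.y += w * YBar(lambda[i]);
        xyz.z += w * ZBar(lambda[i]);
    }
    return xyz; // written to the Lighting Sample Texture in XYZ space
}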

Spectral Up Sampling Texture
One of the problems in spectral rendering is converting a texture from color to a spectral power distribution (SPD); this process is called spectral up sampling. Luckily, there are many papers about it. The technique from "Spectral Primary Decomposition for Rendering with sRGB Reflectance" is used in the demo to up sample textures. I chose this method because of its simplicity. It reconstructs the spectrum using a linear combination of the texture color with 3 pre-computed spectral basis functions:
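In other words, with (R, G, B) being the texture color and r̄(λ), ḡ(λ), b̄(λ) the three precomputed basis functions (my notation, the paper may name them differently):

S(λ) ≈ R·r̄(λ) + G·ḡ(λ) + B·b̄(λ)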
But one thing that bothered me is that the meaning of the texture color is a bit different from what the spectral up sampling method assumes. In PBR rendering, texture color refers to albedo (i.e. the ratio of radiosity to the irradiance received by a surface), which is independent of the CIE XYZ observer. The up sampling method, on the other hand, minimizes a least squares problem for the texture color viewed under illuminant D65 with the CIE standard observer. Maybe the RGB albedo values are computed with an SPD and the XYZ observer functions? I have no idea and may investigate this in the future.

Spectral Up Sampling Light Color and Intensity
Besides spectrally up sampling the texture, the light also needs to be up sampled. Because the light color can be specified in wide color in the demo, the up sampling method used in the above section is not enough. The method from "A Low-Dimensional Function Space for Efficient Spectral Upsampling" is used to up sample the light color. This method computes 3 coefficients from the light color (i.e. from RGB to c0, c1, c2), and then the spectral power distribution S(λ) can be computed as below:
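As far as I understand the paper, the reconstructed SPD is a sigmoid of a quadratic polynomial in the wavelength:

S(λ) = 1/2 + x / (2·√(1 + x²)),  where x = c0·λ² + c1·λ + c2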
Since light is specified by color and intensity, after calculating the SPD coefficients, we need to scale the SPD curve so that when integrating the scaled SPD with the CIE standard observer ȳ(λ) curve, the result equals the specified luminance intensity:
The scaling factor K is calculated numerically using the trapezoidal rule with a 1nm wavelength interval, and the ȳ(λ) curve is approximated with the multi-lobe approximation in "Simple Analytic Approximations to the CIE XYZ Color Matching Functions". So, the light spectral power distribution is specified by 4 floating point numbers: 3 coefficients + 1 intensity scale.
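A rough sketch of this normalization (my own code, not the demo's; the ȳ(λ) coefficients are the published multi-lobe fit, and the 683 lm/W constant is omitted since the demo's luminance units are not specified):

#include <cmath>

// piecewise Gaussian used by the multi-lobe fit
static double Gauss(double x, double mu, double t1, double t2)
{
    const double t = (x - mu) * ((x < mu) ? t1 : t2);
    return std::exp(-0.5 * t * t);
}

// multi-lobe approximation of the CIE 1931 ybar(lambda) curve
static double YBar(double lambda)
{
    return 0.821 * Gauss(lambda, 568.8, 0.0213, 0.0247)
         + 0.286 * Gauss(lambda, 530.9, 0.0613, 0.0322);
}

// S: the light SPD (e.g. the sigmoid fit above), targetLuminance: the user-specified intensity
template <typename Spd>
double ComputeLuminanceScaleK(const Spd& S, double targetLuminance)
{
    double integral = 0.0;
    for (double l = 380.0; l < 780.0; l += 1.0) // trapezoidal rule, 1nm interval
        integral += 0.5 * (S(l) * YBar(l) + S(l + 1.0) * YBar(l + 1.0));
    return targetLuminance / integral;
}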

In the demo, the original light intensity of the RGB path tracer is modified so that it better matches the intensity of the spectral rendered image. Before the modification, the RGB lighting was done by simply multiplying the light color with the light intensity. But now, this value is also divided by the luminance of the color (though this loses some control in the color picker UI...).
RGB light color multiplied with intensity only
RGB light color multiplied with intensity, divided by color luminance
Spectral light with scaled SPD curve


In addition to the luminance scale, we also need to chromatically adapt the light color from illuminant E to illuminant D65/D60 before computing the 3 SPD coefficients, because the coefficients are fitted using illuminant E. Without doing this, the image will have a reddish appearance.
Computing light SPD coefficients without chromatic adaptation
Computing light SPD coefficients with chromatic adaptation

Importance Sampling Wavelength
As mentioned at the start of the post, the wavelengths are sampled using hero wavelength sampling, which randomly picks 1 wavelength within the visible spectrum (i.e. 380-780nm in the demo), and then 2 additional samples are picked, evenly separated within the visible wavelength range. With this approach, there is a high variance in color. Sometimes the color converges to the final color with 100 samples per pixel, but more often it requires > 1000 samples per pixel to converge. It really depends on luck...
3 different spectral rendered images with hero wavelength sampling using 100 samples per pixel.
The color looks a bit different between all 3 images with a noticeable red tint in the middle image.
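For reference, a minimal sketch of the hero wavelength rotation described above (my own code, not the demo's):

#include <cmath>

void SampleHeroWavelengths(float u /* uniform random in [0,1) */, float lambda[3])
{
    const float lambdaMin = 380.0f, lambdaMax = 780.0f;
    const float range = lambdaMax - lambdaMin; // 400 nm
    const float hero  = lambdaMin + u * range;
    for (int i = 0; i < 3; ++i)
    {
        // rotate by i/3 of the range and wrap around the visible spectrum
        lambda[i] = lambdaMin + std::fmod(hero - lambdaMin + i * range / 3.0f, range);
    }
    // the pdf of each wavelength is simply 1 / range
}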

To make the render converge faster, let's consider the CIE XYZ standard observer curves below: samples with wavelength >650nm or <420nm will have only a little influence on the output image. So I tried to place more samples around the center of the visible wavelength range.
CIE 1931 Color Matching Function from Wikipedia
My first (failed) attempt was to use a cos-weighted PDF curve like this to randomly pick 3 wavelengths for each ray:

A normalization constant is computed so that the PDF integrates to one, and then the CDF can be computed. To pick a random sample from this PDF, the inversion method can be used. To simplify the calculation, the PDF is centered at 0 with width 200 instead of the [380, 780] range. After sampling λ from the inverse of the CDF, the λ is shifted by 580 so that it lies within the [380, 780] range. To find the inverse of the CDF:
Compute inverse CDF of the cos weighted PDF (with w=200)

Unfortunately, this cannot be inverted analytically, as mentioned here. So Newton's method (with 15 iterations) is used as suggested in this post, which gives the following result:

3 different spectral rendered images with the cos-weighted PDF using 100 samples per pixel.
The color still looks a bit different between all 3 images...

Sadly, the result is not improved; it gives even more color variance than the hero wavelength method...

So I googled for a while and found another paper: "An Improved Technique for Full Spectral Rendering". It suggests using a cosh-based function for the PDF, whose CDF can be inverted analytically:
The paper only suggests using that PDF curve with center B = 538nm and A = 0.0072. Since this shape is similar to my cos-weighted PDF, the color convergence rate is similar (so I just skipped capturing screenshots for this case)... But what if we use 3 such curves with their centers lying around the peaks of the XYZ standard observer curves? To find out, I computed the normalization constant within the [380, 780]nm range, and then the CDF and inverse CDF:
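In code, that works out to something like this (my own sketch; I assume the pdf has the form p(λ) ∝ 1/cosh²(A·(λ − B)) normalized over [380, 780]nm, which matches the paper as far as I can tell):

#include <cmath>

struct CoshPdf
{
    float A; // controls the width of the curve
    float B; // center wavelength in nm

    static constexpr float lambdaMin = 380.0f;
    static constexpr float lambdaMax = 780.0f;

    float tMin() const { return std::tanh(A * (lambdaMin - B)); }
    float tMax() const { return std::tanh(A * (lambdaMax - B)); }

    // pdf normalized over [lambdaMin, lambdaMax]
    float Pdf(float lambda) const
    {
        const float c = std::cosh(A * (lambda - B));
        return A / ((tMax() - tMin()) * c * c);
    }

    // analytic inverse CDF: maps a uniform random number u in [0,1) to a wavelength
    float SampleWavelength(float u) const
    {
        return B + std::atanh(tMin() + u * (tMax() - tMin())) / A;
    }
};

// e.g. the three curves used in the demo:
// CoshPdf red{0.0078f, 600.0f}, green{0.0072f, 535.0f}, blue{0.0062f, 445.0f};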



By using 3 different PDFs to sample the wavelengths (A0 = 0.0078, B0 = 600, A1 = 0.0072, B1 = 535, A2 = 0.0062, B2 = 445 are used in the demo), the image converges much faster than with hero wavelength sampling. About 100 SPP is often enough to get a color similar to the converged image.
Rendered with 3 different cosh-curve PDFs using 100 SPP
Converged spectral rendered image.

Another problem with the color variance of hero wavelength sampling is camera movement. Since my demo is an interactive path tracer, when the camera moves, the path tracer re-generates a wavelength sample, which changes the color greatly every frame:
Camera movement with hero wavelength sampling

To give a better preview of the color during the first few samples of the path tracing, the random numbers are stratified into 9 regions so that the first ray will pick 3 random wavelengths lying around 600nm, 535nm and 445nm when substituted into the inverse CDFs of the cosh-weighted curves, which gives some red, green and blue color.

Code to generate stratified random numbers P0, P1, P2 within [0, 1] range.
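For illustration, one possible stratification that matches this behavior (a hypothetical sketch, not necessarily the demo's exact code):

// [0, 1) is split into 9 strata; the very first sample lands in the central stratum, i.e.
// u ≈ 0.5, which the three inverse CDFs map to wavelengths near 600, 535 and 445 nm.
void StratifiedRandom(unsigned sampleIndex, const float xi[3] /* uniform in [0,1) */, float p[3])
{
    // sample 0 uses stratum 4 (the middle of 9); later samples cycle through the rest
    const unsigned stratum = (sampleIndex + 4u) % 9u;
    for (int k = 0; k < 3; ++k)
        p[k] = (static_cast<float>(stratum) + xi[k]) / 9.0f;
}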

With these stratified random numbers, color variation is reduced during camera movement:

Camera movement with stratified random numbers.

Conclusion
In this post, I have described how a basic spectral path tracer can be implemented. The spectral rendered image is a bit different from the RGB rendered image (the RGB rendered image is a bit more reddish compared to the spectrally traced one). This may be due to the spectral up sampling method used, or to not using a D65 light source. However, the bounced light intensity is not much different between tracing in spectral and ACEScg space. In the future, I would like to try using different light sources such as illuminant E/D/F to see how they affect the color. I would also like to have a technique for spectrally up sampling albedo with a wide color gamut instead of sRGB only.

References
[1] https://cgg.mff.cuni.cz/~wilkie/Website/EGSR_14_files/WNDWH14HWSS.pdf
[2] https://en.wikipedia.org/wiki/CIE_1931_color_space
[3] https://graphics.geometrian.com/research/spectral-primaries.html
[4] https://rgl.s3.eu-central-1.amazonaws.com/media/papers/Jakob2019Spectral_3.pdf
[5] http://jcgt.org/published/0002/02/01/paper.pdf
[6] https://www.researchgate.net/publication/228938842_An_Improved_Technique_for_Full_Spectral_Rendering




HDR Display

Introduction
Continuing from the DXR Path Tracer in the last post, I updated the demo to support HDR display. It also has various small feature updates, such as per-monitor high DPI support, path trace resolution scaling (e.g. path trace at 1080p and bilinearly upscale to 4K) and dithering of the tone mapped output to reduce color banding (integer back buffer format only). The updated demo can be downloaded here.
A photo of my HDR TV to show the difference between SDR and HDR

HDR Color Spaces
There are 2 swapchain format/color space combinations that can be chosen to output HDR images on an HDR-capable monitor/TV:
1. Rec 2020 color space (DXGI_FORMAT_R10G10B10A2_UNORM + DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020)
2. scRGB color space (DXGI_FORMAT_R16G16B16A16_FLOAT + DXGI_COLOR_SPACE_RGB_FULL_G10_NONE_P709)
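Either option can be hooked up with the standard DXGI calls; a minimal sketch (my own code, not the demo's) looks like this:

#include <windows.h>
#include <dxgi1_6.h>

// returns true if the requested color space is supported and was set on the swap chain;
// the matching back buffer format must be chosen at swap chain creation time
static bool TrySetColorSpace(IDXGISwapChain4* swapChain, DXGI_COLOR_SPACE_TYPE colorSpace)
{
    UINT support = 0;
    if (FAILED(swapChain->CheckColorSpaceSupport(colorSpace, &support)) ||
        !(support & DXGI_SWAP_CHAIN_COLOR_SPACE_SUPPORT_FLAG_PRESENT))
        return false;
    return SUCCEEDED(swapChain->SetColorSpace1(colorSpace));
}

bool EnableHdr10(IDXGISwapChain4* sc)  // R10G10B10A2_UNORM back buffer
{
    return TrySetColorSpace(sc, DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020);
}

bool EnableScRgb(IDXGISwapChain4* sc)  // R16G16B16A16_FLOAT back buffer
{
    return TrySetColorSpace(sc, DXGI_COLOR_SPACE_RGB_FULL_G10_NONE_P709);
}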
Rec2020 color space is the common HDR10 format with the PQ EOTF. But it is recommended to use the scRGB color space on Windows. scRGB uses the same color primaries as Rec709, and supports wide color using negative values. I was confused about using negative values to represent a color (as well as intensity) at first. Because a color gamut is usually displayed as a CIE xy chromaticity diagram, I used to think of a given RGB value as a color interpolated inside the Red/Green/Blue gamut triangle using barycentric coordinates.
A debug chromaticity diagram showing Rec2020 color gamut.
Using an HDR back buffer will show less clipped color
Although it makes sense to represent a wide color using negative numbers with barycentric interpolation, I was confused about how it represents the intensity at the same time. It is because the chromaticity diagram skips the luminance information. So, instead of thinking of a color inside the horseshoe-shaped diagram, it is easier for me to think about the color in the 3D XYZ color space. The Red/Green/Blue color primaries of a gamut are 3 basis vectors in the XYZ color space. A linear combination of RGB values with the color primary basis vectors can represent both a color and an intensity. Thinking this way makes me feel more comfortable. So the path traced lighting values are fed into the ACES HDR tone mapper, transformed into Rec709 color space, and then divided by 80 when using the scRGB color space (scRGB requires a value of 1 to represent 80 nits).
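As a small illustration (my own code, not the demo's shader), the per-channel output encodings of the two back buffer options look roughly like this: scRGB takes linear Rec709-primaries values where 1.0 == 80 nits, while HDR10 takes Rec2020-primaries values encoded with the ST.2084 (PQ) curve normalized to 10000 nits.

#include <algorithm>
#include <cmath>

// scRGB: just divide the luminance in nits by 80
float NitsToScRgb(float nits) { return nits / 80.0f; }

// HDR10: PQ (ST.2084) encode, applied per channel on Rec2020 values expressed in nits
float NitsToPq(float nits)
{
    const float m1 = 2610.0f / 16384.0f;
    const float m2 = 2523.0f / 4096.0f * 128.0f;
    const float c1 = 3424.0f / 4096.0f;
    const float c2 = 2413.0f / 4096.0f * 32.0f;
    const float c3 = 2392.0f / 4096.0f * 32.0f;

    const float y  = std::clamp(nits / 10000.0f, 0.0f, 1.0f);
    const float yp = std::pow(y, m1);
    return std::pow((c1 + c2 * yp) / (1.0f + c3 * yp), m2);
}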

HDR10 metadata
I have also played around with the HDR10 metadata to see how it affects the image. But most of the data does not affect the image on my Samsung UA49MU7300JXKZ TV. The only field that has an effect on the image is "Max Mastering Luminance", which affects the image brightness. Setting it to a small value will make the image darker despite outputting a very bright 1000 nit image. Also, the HDR10 metadata only works in Full Screen Exclusive mode, or in borderless windowed mode with the HWND_TOPMOST Z order (I guess the full screen optimization gets enabled); using a borderless window with the HWND_TOP Z order won't work (but this mode is easier to alt-tab on a single monitor setup...). Besides, entering Full Screen Exclusive mode may fail when calling SetFullscreenState() if the display is not connected to the adapter that is used for rendering. I didn't notice this until I started to work on a laptop which uses the RTX graphics card for ray tracing while the laptop monitor is connected to the Intel graphics card. Looks like some hard work needs to be done in order to support Full Screen Exclusive mode properly (e.g. create a D3D device/command queue/swapchain for the Intel graphics card and copy the ray traced image from the RTX graphics card to the Intel graphics card for the full screen swapchain). But unfortunately, my demo does not support multi-adapter, so the HDR10 metadata may not work on such a setup (I am outputting to the external HDR TV using the RTX graphics card, so that doesn't create much of a problem for me...)...

Capability of my HDR TV
UI showing all the adjustable HDR10 metadata in the demo

UI
Blending the SDR UI with the HDR image is handled in 2 steps: blend the color, then the brightness. All the UI is rendered into an off-screen buffer (in 8-bit Rec709 color space), and then later blended with the ACES tone mapped image. Take a look at the ACES tone mapping function snippet below: the lighting value will be mapped to a normalized range in AP1 color space as an intermediate step (red part in the code snippet below). So in the demo, the UI in the off-screen buffer will be converted to AP1 color space and then blended with the tone mapped image at this step. (I have also tried blending in XYZ space and the result is similar in the demo.)

ACES tone map function snippet

Then the UI blended image can be transformed into the target color space (e.g. scRGB or Rec2020, purple part in the above code snippet). When converting the normalized color data to HDR data, the ACES tone mapper interpolates the RGB values between Y_MIN and Y_MAX (i.e. blue part in the above code snippet). During this brightness interpolation, the demo adjusts Y_MAX (e.g. 1000 to 4000 nits) to the user-defined UI brightness (e.g. 80 to 300 nits) depending on the UI alpha, using the following formula:



In the demo, BlendPow is set to 5 as the default value. Although the result is not perfect (e.g. the UI alpha may not look like it blends linearly, depending on the background luminance), it works well enough to avoid bleed-through from a bright background with UI alpha > 0.5:

Background Luminance: 0-10 nit
Background Luminance: 2 - 250 nit
Background Luminance: 1000 nit
Photos showing UI blending with HDR background, from dark to bright.

However, the above blending formula has an artifact when Y_MAX is smaller than the UI brightness (this should rarely happen, and only in some debug view modes). In this case, the background may look too bright after blending with the UI. To fix this, inverting the BlendPow helps to minimize the artifact:



Below is an extreme example showing the artifact, with Y_MAX set to 1 nit and the UI set to 80 nits:
Without the fix, the artifact is very noticeable at the upper right corner
With the fix, the background is darkened properly
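Putting the two cases together, one plausible reading of the adjustment (purely illustrative and just my own guess at the formulas shown above, not the demo's actual code):

#include <cmath>

float AdjustYMax(float yMax, float uiBrightness, float uiAlpha, float blendPow /* = 5 */)
{
    // normal case: a mostly-opaque UI pulls the peak brightness down toward the UI level;
    // "inverted BlendPow" case: avoid brightening a very dark background too aggressively
    const float e = (yMax >= uiBrightness) ? blendPow : 1.0f / blendPow;
    const float t = 1.0f - std::pow(1.0f - uiAlpha, e);
    return yMax + (uiBrightness - yMax) * t; // lerp(yMax, uiBrightness, t)
}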
Looking back at the blue part of the ACES tone mapping function in the above code snippet, I was wondering whether the image would look different since the Y_MIN and Y_MAX values are interpolated in the display color space (e.g. will the image look different when interpolating in scRGB vs Rec2020?). To compare whether these 2 colors are the same, we need to have both colors in the same space. So let's convert the interpolated RGB value back to XYZ space:



So, interpolating in different display color spaces does make some difference as long as Y_MIN != 0. But the ACES tone mapper defaults to the STRETCH_BLACK option, which effectively sets Y_MIN = 0, so there should be no difference when interpolating values in different color spaces. Out of curiosity, I have tried disabling the STRETCH_BLACK option to see whether the image would look different when switching between the scRGB and Rec2020 back buffers with Y_MIN > 0, but the images still look the same even with a large Y_MIN... I am not sure why this happens; maybe the difference is too small to be noticeable... In the demo, I take this into account and treat the Y_MIN as a value in the XYZ color space and interpolate the value like this:



Debug View Mode
To help debugging, several debug view modes are added. A "Luminance Range" view mode is used to display the luminance value of a pixel before tone mapping (i.e. the physical luminance entering the virtual camera) and after tone mapping (i.e. the output luminance the monitor should display):
Showing pixel luminance before tone mapping.
Showing pixel luminance after tone mapping.

A "Gamut Clip Test" mode to high light pixel that fall outside Rec709 gamut, i.e. those color can only be viewed with a HDR display or wide color monitor (e.g. AdobeRGB / P3 monitor).
Highlighting clipped pixels with a cyan color
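The test itself is simple; a tiny sketch of one way to do it (my own illustration, not the demo's shader): convert the wide-gamut lighting value to Rec709 primaries and check for negative channels.

struct Float3 { float r, g, b; };

// a chromaticity outside the Rec709 triangle produces at least one negative channel
// when the color is expressed in Rec709 primaries; such pixels are tinted cyan in the debug view
bool IsOutsideRec709(const Float3& rec709)
{
    return rec709.r < 0.0f || rec709.g < 0.0f || rec709.b < 0.0f;
}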
A "SDR HDR Split" mode to compare the SDR/HDR images. But this mode can only be a rough preview of how SDR will look like on HDR back buffer. It is because SDR images are usually displayed brighter than 80 nit, the SDR part need to be brighten up (say to 100-200 nit) to make it look similar to using a real SDR back buffer. Also, in this view mode, because the HDR back buffer expect a pixel value in nit (either linear or PQ encoded), I don't apply any gamma curve to the SDR portion of the image, which may also result in differences to a real SDR version too.
A photo showing both SDR and HDR at the same time.
Bloom is exaggerated to show the clipped color in SDR
Conclusion
In this post, I have talked about the color spaces used for outputting to an HDR display. No matter which color space format is used (e.g. scRGB / Rec2020), the displayed image should be identical if transformed correctly (except for some precision differences). Also, I have tried to play around with the HDR10 metadata, but most of the metadata does not change the image on my TV... I guess how the metadata is interpreted is device dependent. Lastly, the SDR UI is composited with the HDR image by first blending the color and then the brightness. A simple blending formula is enough for the demo. A more complicated algorithm can be explored in the future, say, brightening up the UI depending on the brightness of a blurred HDR background (e.g. maybe storing the background luminance in the alpha channel and blurring it together with the bloom pass?). A demo can be downloaded here to test on your HDR display (some of the options are hidden if not connected to an HDR display).

References
[1] https://channel9.msdn.com/Events/Build/2017/P4061
[2]  https://www.pyromuffin.com/2018/07/how-to-render-to-hdr-displays-on.html
[3] https://www.gdcvault.com/play/1024803/Advances-in-the-HDR-Ecosystem
[4] https://www.gdcvault.com/play/1026443/Not-So-Little-Light-Bringing
[5] https://developer.nvidia.com/implementing-hdr-rise-tomb-raider
[6] https://www.asawicki.info/news_1703_programming_hdr_monitor_support_in_direct3d
[7] https://onedrive.live.com/?authkey=%21AFU3moSbUzyUgaE&cid=A4B88088C01D9E9A&id=A4B88088C01D9E9A%21170&parId=A4B88088C01D9E9A%21106&o=OneUp
[8] https://www.shadertoy.com/view/4tV3WW

DXR Path Tracer

Introduction
Can't believe it has been half a year since my last DXR AO post. It was a hard time in Hong Kong last year, but thanks to the social unrest and the medical workers' strike, the Wuhan Coronavirus has not spread widely in the local community (though there are still new cases every day...). Due to the virus, it is better to stay at home, so I continued to code my path tracer. This new path tracer is unbiased, terminating rays with Russian roulette. During path tracing, physical light units are used. Also, rendering can be done in a wide color space. Finally, the path traced result is tone mapped and output to sRGB or a wide color gamut depending on the display device. A demo can be downloaded here (please use the latest graphics driver, as I have encountered a device removed hang on my laptop RTX 2060 with an old driver, but not on my desktop GTX 1060... If the crash/hang still happens, please let me know. Thank you.).

Path Traced Sponza scene.


Render Loop
At the start of the demo (or after any camera movement/lighting changes), a structured buffer, Ray Buffer, is initialized with 1 ray per pixel using the camera transform.
The struct stored in Ray Buffer, not tightly packed for easier understanding.
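A purely hypothetical example of what such a RayData struct might contain (field names and layout are illustrative only, not the demo's actual definition):

struct RayData
{
    float origin[3];       // where the next ray segment starts
    float direction[3];    // direction of the next ray segment
    float throughput[3];   // accumulated path throughput so far
    unsigned pixelIndex;   // which pixel this path contributes to
};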
Then a ray generation shader is dispatched to read the Ray Buffer and trace rays into the scene. Lighting is calculated, and new Ray Buffer elements are generated if the rays are not terminated, continuing the path tracing in the next frame. Below is a simplified flow of the rendering operations executed every frame (with render passes on the left and resources on the right):

A simplified path tracing flow executed every frame
Let's start with the usage of resources in the above flow chart first:
  • Ray Buffers are structured buffers storing the RayData struct. A ray will be traced for each element and if the ray is not terminated by Russian Roulette, it will be stored back to the Ray Buffer for the next frame.
  • Lighting Path Texture is used for accumulating the lighting result while a ray is traversing along the path from the camera. It can be thought of as an intermediate result, because the path is not fully traversed within a single frame, but across several frames.
  • Progress Buffer is an 8-byte buffer, with 4 bytes storing the current path depth and the other 4 bytes storing the total accumulated sample count.
  • Lighting Sample Texture is used for accumulating the lighting result of all the terminated rays (i.e. accumulating all the terminated ray result from Lighting Path Texture).
About the operations done in each render pass:
  1. Ray Tracing Pass dispatches a ray generation shader, sampling the Ray Buffer according to DispatchRaysIndex() and then calling TraceRay() to calculate the lighting result inside the closest hit shader by randomly choosing diffuse or specular lighting (another shadow ray is traced towards the light source during the lighting calculation). The lighting result is added to the Lighting Path Texture, and non-terminated rays will be stored back into the Ray Buffer for the next frame.
  2. Check whether all rays are terminated by using a D3D12 predicate on the counter buffer of the Ray Buffer (i.e. all rays are terminated when the counter == 0). Then different shaders/operations will be executed depending on whether the Ray Buffer is empty (a minimal predication sketch is shown after this list).
  3. When there are still rays not terminated, increase the path depth in Progress Buffer.
  4. When all rays are terminated, increase the sample count and set path depth to 0 in Progress Buffer.
  5. Accumulate the current path lighting result in the Lighting Path Texture into the Lighting Sample Texture. Clear the Lighting Path Texture to 0 (it is cleared via a compute shader instead of a command list clear, as the predicate does not work on the clear API, despite the spec saying it should...)
  6. Regenerate the rays in the Ray Buffer with 1 ray per pixel using the camera transform, for path tracing new lighting samples in the next few frames.
  7. Display the current lighting result to the back buffer.
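A minimal sketch of the predication mentioned in step 2 (my own code, not the demo's; it assumes the Ray Buffer counter is available in a buffer in the D3D12_RESOURCE_STATE_PREDICATION state, and that D3D12_PREDICATION_OP_EQUAL_ZERO discards the predicated commands when the 64-bit value is zero):

#include <d3d12.h>

void RecordPredicatedPasses(ID3D12GraphicsCommandList* cmdList, ID3D12Resource* rayCounter)
{
    // work that should only run while some rays are still alive (counter != 0)
    cmdList->SetPredication(rayCounter, 0, D3D12_PREDICATION_OP_EQUAL_ZERO);
    // ... record the "increase path depth" pass here ...

    // work that should only run once all rays have terminated (counter == 0)
    cmdList->SetPredication(rayCounter, 0, D3D12_PREDICATION_OP_NOT_EQUAL_ZERO);
    // ... record the "accumulate sample / clear / regenerate rays" passes here ...

    // disable predication for the final display pass
    cmdList->SetPredication(nullptr, 0, D3D12_PREDICATION_OP_EQUAL_ZERO);
}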
Path traced images

With the core operations described above, at most 2 rays per pixel can be launched to maintain an interactive frame rate on my GTX 1060. On more powerful machines (i.e. RTX cards with hardware accelerated ray tracing), step 1 doesn't need to terminate at the first closest hit, but can bounce a few more times before storing back to the Ray Buffer (a "#Bounce/Frame" option is added to increase the number of bounces per frame for RTX cards).
Number of bounces per frame option to adjust performance
The current approach described above has 2 drawbacks. First, we don't know on the CPU how many rays are still left in the Ray Buffer, so DispatchRays() is called with the maximum number of rays (i.e. viewport width * height), and rays terminate early within the ray generation shader. This can be fixed with ExecuteIndirect() in DXR Tier 1.1 in the future. The second drawback is that the performance is not constant across frames, because the number of rays that need to be traced decreases every frame and then resets, so the frame rate fluctuates.

ACES tone mapping
After calculating the HDR lighting value, we need to perform a tone map pass to map the lighting value to a displayable range. ACES tone mapping is chosen due to its popularity in recent years. ACES has a few tone mapping curves (they call them RRT + ODT) for different displays with different color gamuts and viewing conditions. Some common display types are sRGB_100nit and Rec709_100nit. The input of the RRT+ODT function expects RGB values in the ACES2065-1 (AP0) gamut with a white point around (but not exactly) D60. So we need to convert our lighting value (L_sRGB) to the AP0 gamut by multiplying a few transformation matrices:
operation to convert RGB value from sRGB to AP0
The above steps mean first transforming sRGB values to the XYZ color space with a D65 white point (gamut transformation matrices can be calculated using the formula from here), then applying a Chromatic Adaptation Transform (CAT) due to the different white points between sRGB and AP0 (the matrix can be calculated using the formula from here). Finally, the XYZ value can be transformed to the AP0 gamut. All these matrices can be combined into 1 matrix-vector multiplication as an optimization. Then this value can be fed into the ACES RRT+ODT to compute the back buffer value for display.
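As a side note, here is a small sketch of building such a gamut matrix from chromaticities, following the formula from the Bruce Lindbloom page linked in the references (my own transcription; the demo may hard-code the combined matrices instead):

struct Mat3 { double m[3][3]; };

static double Det3(const double a[3][3])
{
    return a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
         - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
         + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]);
}

// (xr, yr), (xg, yg), (xb, yb): chromaticities of the primaries; (xw, yw): white point
Mat3 RgbToXyz(double xr, double yr, double xg, double yg, double xb, double yb,
              double xw, double yw)
{
    // XYZ of each primary with Y = 1
    double m[3][3] = { { xr / yr,            xg / yg,            xb / yb            },
                       { 1.0,                1.0,                1.0                },
                       { (1 - xr - yr) / yr, (1 - xg - yg) / yg, (1 - xb - yb) / yb } };
    const double W[3] = { xw / yw, 1.0, (1 - xw - yw) / yw };

    // solve m * S = W for S (Cramer's rule), then scale each primary column by S
    const double det = Det3(m);
    double S[3];
    for (int i = 0; i < 3; ++i)
    {
        double mi[3][3];
        for (int r = 0; r < 3; ++r)
            for (int c = 0; c < 3; ++c)
                mi[r][c] = (c == i) ? W[r] : m[r][c];
        S[i] = Det3(mi) / det;
    }

    Mat3 out;
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            out.m[r][c] = m[r][c] * S[c];
    return out;
}

// e.g. the sRGB primaries (0.64, 0.33), (0.30, 0.60), (0.15, 0.06) with a D65 (0.3127, 0.3290)
// white point reproduce the usual sRGB-to-XYZ(D65) matrix.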

So we just need to select the appropriate ODT for the target display device. But unfortunately, not all common display gamuts are provided, like for my recently bought RTX laptop which comes with a 100% AdobeRGB color gamut monitor. ACES does not provide a suitable ODT to display the image in the AdobeRGB color space. If the common sRGB ODT is used, the image will look too saturated. So I added a "Remap display color gamut" option in the demo:

Remapping option to display the path traced result according to the display color primaries
The Remap display color gamut option performs the following steps on the output of RRT+ODT:
  1. Apply the EOTF to the ODT output to get linear lighting values.
  2. Transform the resulting RGB value from step 1 to the target display color gamut (e.g. the AdobeRGB gamut on my laptop display), with a Chromatic Adaptation Transform applied.
  3. Apply the OETF to the output of step 2 for display.
By doing the above remapping, I can get similar results between my AdobeRGB laptop monitor and my sRGB desktop monitor. But one drawback is that, although we can query the color primaries of the display, they are not always accurate. For example, on my laptop, I can switch to a regular sRGB view mode, but IDXGIOutput6::GetDesc1() still returns the AdobeRGB color primaries. I have also tried some other monitors: they have color primaries greater than sRGB, but not exactly AdobeRGB or P3 primaries, and they also have different view modes such as AdobeRGB or sRGB. So I just leave the gamut remapping function optional in the demo, and the user can choose the remap color primaries.

Also, digging deeper into the ACES ODT source code, the 3 ODTs used in the demo share much common code and only differ in the color space transform function / OETF at the end of the ODT. In the future, I may refactor the RRT+ODT code, remove the remap display gamut function and directly transform the ACES ODT output in XYZ space to the display gamut queried by IDXGIOutput6::GetDesc1() (or a user-selected gamut).
An ODT from ACES; the blue part is the same for all 3 ODTs used in the demo.
The orange part differs depending on the display, and can be replaced by the display
primaries returned from IDXGIOutput6::GetDesc1(), so the "Remap display color
gamut" option in the demo can be removed in the future.

WCG rendering
Equipped with the knowledge of transforming between color spaces, I decided to try rendering in a Wide Color Gamut instead. Games like GT Sport render in wide color already (Rec2020). Performing the lighting calculation in a wide color gamut can result in more accurate lighting than rendering in the sRGB color space (despite displaying on an sRGB monitor).

Path Traced result rendered in different color space. Left:sRGB, Center:ACEScg, Right:Rec2020

In the demo, it can path trace in the sRGB, ACEScg or Rec2020 color space. Inside the closest hit shader, the albedo texture is read and transformed from sRGB into the chosen rendering color space. The light color is also converted to the chosen rendering color space and then multiplied with the intensity. Finally, inside the tone mapping pass, the result of the lighting calculation is transformed to the AP0 color space and fed into the ACES RRT+ODT for display. You may notice some difference between rendering in sRGB and in a wide color gamut (e.g. ACEScg and Rec2020). If you have a wide color gamut monitor (e.g. AdobeRGB or DCI-P3), you can try to use the Rec2020 ODT with the "Remap display color gamut" option on (described in the last section). This can produce less color clamping and display more saturated colors. But under normal lighting conditions, the difference is not that big; we need to set up specific lighting, such as a local sphere light with a saturated color, before those wide colors can be displayed. I guess this is because both the albedo textures and light colors are in sRGB space; content may need to be adjusted in order to take advantage of a wide color display.

Wide color path traced image, saved with different profiles. Left: saved with an sRGB profile. Right: saved with an AdobeRGB profile.
The right image shows more saturated color when viewed in a color managed browser on a wide color display (e.g. an iPhone screen);
otherwise the 2 images may look the same.

Also, please note that this kind of wide color support is different from the Windows 10 HDR/WCG settings. On my laptop, Windows reports No for both HDR and WCG, but it does have an AdobeRGB monitor capable of displaying wide color; we just need to correctly transform the images using the monitor color gamut.
My laptop has an AdobeRGB monitor, but Windows 10 Display capabilities report No for WCG.

ACES ODT blue light artifact
So far everything looks good when rendering in a wide color space. Colors get desaturated when they are over-exposed. But there are still some issues when using a strong blue light...

Using a strong sRGB blue light will introduce hue shift...

It is because pure blue (0, 0, 255) in sRGB space is not saturated enough when transformed to a wide color gamut (e.g. ACEScg/Rec2020). Looking inside the ACES dev repo, it has a blue light artifact fix LMT for this issue. It works by de-saturating the blue color a bit to lessen the hue shift. So in the demo, I provided a "Blue Correction" parameter to adjust the blue de-saturation strength (as a side note, UE4 also uses the ACES tone mapper and comes with a blue correction parameter in its post process settings).
desaturating blue color to fix the hue shift

But I do like the saturated blue color; using the blue light artifact fix LMT to de-saturate the blue made me sad. Below is a comparison with and without the blue light LMT:
Left: without blue light fix LMT, Right: with blue light fix LMT

So maybe we can work around the problem the other way. Instead of making the blue color less saturated, we can make the light color more saturated. So I added a light "Color Picker Space" combo box to specify the color space of the picked RGB light value, so that a more saturated blue light color can be chosen. By choosing an extremely saturated blue (0, 0, 255) RGB value in the ACEScg color space, we can get away from the purple color:
Using a saturated blue light in ACEScg space, without the blue light fix LMT

Bloom
Lastly, a bloom pass is added before tone mapping. Bloom pixels are extracted based on a threshold: pixels that exceed the maximum luminance of the current exposure value. The maximum luminance is calculated with:
max luminance calculated using EV100
But simply subtracting the threshold from the lighting value will introduce some hue shift to the bloom color. So the RGB lighting value is transformed to HSV space, the threshold is subtracted from V, and then the value is transformed back to RGB space (we keep all the RGB values in the rendering space without transforming the lighting value from ACEScg/Rec2020 to sRGB during the HSV conversion, as there is not much difference between the bloom results). Given an image with HDR values:
Input image for the bloom pass
The differences between using threshold in HSV space and RGB space:
Left column: bloom in HSV space. Right column: bloom in RGB space.
Upper row: Lighted scene combined with bloom.
Lower row: Debug images showing only the bloom component.

The bloom calculated in HSV space introduces less saturated colors. The difference is exaggerated when the image is over-exposed:
Left:Bloom input image.
Center:Bloom in HSV space.
Right: Bloom in RGB space.
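For reference, a rough sketch of the HSV-space threshold described above (my own code, not the demo's shader; it assumes the usual maxLuminance = 1.2 * 2^EV100 formulation from the Frostbite course notes referenced below, and a standard RGB/HSV conversion):

#include <algorithm>
#include <cmath>

struct Rgb { float r, g, b; };
struct Hsv { float h, s, v; };

static Hsv RgbToHsv(const Rgb& c)
{
    const float maxC = std::max({c.r, c.g, c.b});
    const float minC = std::min({c.r, c.g, c.b});
    const float d = maxC - minC;
    Hsv o{0.0f, maxC > 0.0f ? d / maxC : 0.0f, maxC};
    if (d > 0.0f)
    {
        if      (maxC == c.r) o.h = std::fmod((c.g - c.b) / d, 6.0f);
        else if (maxC == c.g) o.h = (c.b - c.r) / d + 2.0f;
        else                  o.h = (c.r - c.g) / d + 4.0f;
        o.h *= 60.0f;
        if (o.h < 0.0f) o.h += 360.0f;
    }
    return o;
}

static Rgb HsvToRgb(const Hsv& c)
{
    const float C = c.v * c.s;
    const float X = C * (1.0f - std::fabs(std::fmod(c.h / 60.0f, 2.0f) - 1.0f));
    const float m = c.v - C;
    float r = 0, g = 0, b = 0;
    if      (c.h <  60.0f) { r = C; g = X; }
    else if (c.h < 120.0f) { r = X; g = C; }
    else if (c.h < 180.0f) { g = C; b = X; }
    else if (c.h < 240.0f) { g = X; b = C; }
    else if (c.h < 300.0f) { r = X; b = C; }
    else                   { r = C; b = X; }
    return {r + m, g + m, b + m};
}

// subtract the threshold from V only, so the bloom keeps the original hue and saturation
Rgb ExtractBloom(const Rgb& lighting, float ev100)
{
    const float threshold = 1.2f * std::exp2(ev100); // max luminance for this exposure
    Hsv hsv = RgbToHsv(lighting);
    hsv.v = std::max(hsv.v - threshold, 0.0f);
    return HsvToRgb(hsv);
}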
Conclusion
In this post, the core algorithm of my DXR path tracer is described, together with some color space conversions. There is much more to be done in the future, like supporting dynamic geometry during ray tracing, adding a denoiser for the path traced output, implementing hybrid rasterization/ray tracing rendering, and spectral rendering to compute a ground truth reference. Also, this is my first time writing code for color space management. Currently, in the demo, the 3D lighting can be displayed correctly using the monitor gamut, but the UI is not managed properly. 4K and HDR need to be supported too.

References
[1] https://seblagarde.files.wordpress.com/2015/07/course_notes_moving_frostbite_to_pbr_v32.pdf
[2] https://microsoft.github.io/DirectX-Specs/d3d/Raytracing.html#addressing-calculations-within-shader-tables
[3] https://github.com/ampas/aces-dev
[4] http://www.brucelindbloom.com/index.html?Eqn_RGB_XYZ_Matrix.html
[5] http://www.brucelindbloom.com/index.html?Eqn_ChromAdapt.html
[6] http://www.polyphony.co.jp/publications/sa2018/