Screen-Space Reflections Explained


Introduction

Some of the most difficult algorithms for graphics developers to implement properly are the family of techniques known as screen-space reflections (SSR). They are difficult because they require the programmer to navigate the peculiarities of the transformations and coordinate systems inherent to their rendering API, and they demand a solid understanding of linear algebra. Implementation is further complicated by the need to render multiple data buffers as input: the algorithm I will describe requires a depth buffer, a normal buffer, and the output color buffer from the lighting pass. These buffers are commonly generated and used by a deferred renderer, though the algorithm could be used in a forward renderer as well.


Methodology

The simplest SSR algorithm starts with the view space calculation of the reflection vector R from the view vector V and normal N using the formula V - 2.0 * dot(N, V) * N, where V = pixelViewSpacePosition - cameraPosition. (Recall that the vector from point A to point B is B - A; since the camera sits at the origin in view space, this simplifies to V = pixelViewSpacePosition.) The math for this reflection calculation is handled by the built-in GLSL function 'reflect()'. The reflection vector is then used to march a ray in screen space from each pixel's origin until the ray's current screen-space depth value is close to the depth value from the depth buffer. Ray marching for reflections is easier in view space than in world space because, in view space, the camera is at the origin and all world positions are transformed to positions relative to it; if you were to calculate reflections in world space, ray origins far from the world space origin could cause serious problems due to float imprecision.
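
As a minimal sketch, the view space setup looks like the following, where ViewPositionFromDepth() and getViewNormal() are the same helpers assumed by the code sample later in this post:

    // Reconstruct the view space position of the current pixel and
    // reflect the view direction about the view space normal.
    vec3 positionVS = ViewPositionFromDepth(uv).xyz;
    vec3 V = normalize(positionVS);            // camera sits at the view space origin
    vec3 N = normalize(getViewNormal(uv).xyz);
    vec3 R = reflect(V, N);                    // R = V - 2.0 * dot(N, V) * N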

The process of transforming normals to view space can be separated into two transformations: traditionally in a deferred renderer, a model's normals are transformed to world space in the g-buffer pass by applying the inverse transpose of the model matrix. These world space normals can then be transformed to view space in the SSR shader by applying the inverse transpose of the view matrix. Typically it is not possible to access the model matrix for every pixel of the normal buffer after the geometry buffer ("g-buffer") pass, so in a deferred renderer view space normals must be calculated this way. It is possible to store already-transformed view space normals in a g-buffer texture and access them later, but I encountered problems with this method (see the "Issues" section).

The full transformation for a given normal N from object space to view space in OpenGL is 'mat3(transpose(inverse(viewMatrix * modelMatrix))) * N'. For those uncertain why the inverse transpose is used, it is worth working through the proof: the inverse transpose preserves the property dot(N, T) = 0 between the normal and any tangent vector T on the surface. In general, normals cannot be converted from one space to another simply by multiplying them by the plain transformation matrix: if the transform involves non-uniform scaling, the normal-tangent relationship will not be preserved and the basis will no longer be orthonormal.
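
Here is a sketch of the two stages, assuming the g-buffer pass writes world space normals to a texture; the names gNormal, modelMatrix, viewMatrix, and objectNormal are illustrative:

    // G-buffer pass: object space -> world space. mat3() takes the
    // upper-left 3x3, since normals are directions and ignore translation.
    mat3 normalMatrix = mat3(transpose(inverse(modelMatrix)));
    vec3 worldNormal = normalize(normalMatrix * objectNormal);
    // ... worldNormal is written to the gNormal texture ...

    // SSR pass: world space -> view space.
    vec3 worldN = texture(gNormal, uv).xyz;
    vec3 viewN = normalize(mat3(transpose(inverse(viewMatrix))) * worldN);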

Vanilla ray marching code for SSR can be relatively computationally expensive for the quality of the output, but a few essential optimizations can improve the results in terms of quality, performance, or both. For example, when the current ray depth first exceeds the pixel's depth buffer value, you can refine the hit by searching between the depth values of the previous and current ray steps; use your favorite search algorithm. Calculation time can also be reduced by stopping the march whenever the normal at the current point along the ray is oriented in the same direction as the camera-to-pixel vector V, i.e. where dot(N, V) > 0, since such surfaces face away from the viewer. See the "Optimizations" section for more info.
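
As a minimal sketch, a binary search refinement might look like the following, reusing the ray variables from the code sample below (the iteration count of 8 is an arbitrary illustrative choice):

    // Binary search between the last known miss (prevPos) and the
    // first detected hit (currPos) to sharpen the intersection point.
    vec3 lo = ray.prevPos;
    vec3 hi = ray.currPos;
    for (int i = 0; i < 8; ++i)
    {
        vec3 mid = 0.5 * (lo + hi);
        if (mid.z - getDepth(gDepth, mid.xy) >= 0.0)
            hi = mid;   // mid is behind the surface; search the near half
        else
            lo = mid;   // mid is in front of the surface; search the far half
    }
    vec2 hitUV = hi.xy; // refined hit coordinate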

Because SSR only works under limited conditions, it may be tempting to composite SSR reflections with reflections calculated by other methods, though in my experience this can be an arduous process that does not produce viable results. Mismatches in the look and accuracy of the various methods require value tweaking to fix, and methods for doing this are beyond my scope of knowledge.


[Figure: Unblurred SSR output]

Code Sample

    Ray ray;

    // View space params: reconstruct the pixel's position, fetch its
    // normal, and reflect the view direction about it.
    ray.o = ViewPositionFromDepth(uv).xyz;
    vec3 V = normalize(ray.o);
    ray.viewNormal = getViewNormal(uv).xyz;
    ray.d = normalize(reflect(V, ray.viewNormal));

    // Screen space params: the screen direction vector is found by
    // projecting a point one view space step along the ray and taking
    // the difference from the screen space ray origin.
    vec3 view_first_step = ray.o + ray.d;
    ray.o_screen = vec3(uv, getDepth(gDepth, uv));
    vec3 screen_first_step = ViewToScreen(view_first_step);
    vec3 stepDist = screen_first_step - ray.o_screen;
    ray.d_screen = SSRinitialStepAmount * normalize(stepDist);

    // Initialize the march one step from the origin.
    ray.steps = 0;
    ray.prevPos = ray.o_screen;
    ray.currPos = ray.prevPos + ray.d_screen;


    vec4 out_col = vec4(0.0);

    // Ray march in screen space.
    while (ray.steps < maxSteps)
    {
        // End early if offscreen; all screen space coordinates
        // (x, y, and depth) are only valid within [0, 1].
        if (minVec3(ray.currPos) < 0.0 || maxVec3(ray.currPos) > 1.0)
            break;

        // Check for a ray hit: the ray has just passed the surface if its
        // depth exceeds the buffer depth by less than one step's length.
        float diff = ray.currPos.z - getDepth(gDepth, ray.currPos.xy);
        if (diff >= 0.0 && diff < length(ray.d_screen))
        {
            // Do refinement here as necessary (e.g. the binary search above).
            out_col = texture(sourceTex, ray.currPos.xy);
            break;
        }

        // Iterate ray forward.
        ray.prevPos = ray.currPos;
        ray.currPos = ray.prevPos + ray.d_screen;
        ray.steps++;
    }

    gl_FragColor = out_col;


Issues

- Results are noisy, since missed rays are interspersed among ray hits. SSR is only an approximation of reflection; acceptable, not accurate, results are the goal here. Blurring the output frame or using another in-fill method is necessary.

- In one early implementation, reflections warped depending upon the pitch angle between the camera and the hit pixel. This was caused by a per-pixel scheme for view space normals: the x and y components of each view space normal were stored in channels of another output buffer, and the z component was reconstructed in situ during the SSR calculation. This reconstruction compounded error and produced the apparent reflection warping. Eventually, I fixed it by applying the object-to-world transform (the inverse transpose of the model matrix) in the g-buffer pass, then transforming the g-buffer normals to view space with the inverse transpose of the view matrix when accessing them in the SSR shader. Overall, this is a better method, since it reuses the world space g-buffer normal output already used by the lighting pass of the deferred renderer, and any time saved by pre-calculating view space normals is not significant. Another upside of this implementation is that the matrix multiplication required for normal transformation can be skipped for pixels which do not take part in the SSR calculation.

[Figure: Steep view angle reflection warping from inaccurate view space normals]

Optimizations

- In order to composite the SSR results with other reflection methods, I only applied SSR to pixels which register ray hits. To tame the inherently noisy output, I generated a blurred version of the initial output using a large-kernel Gaussian blur filter, then linearly interpolated between the raw and blurred output with the GLSL built-in 'mix()', using per-pixel material glossiness as the blend factor (see the first sketch after this list). After tweaking the results until they were aesthetically pleasing, the reflections had a sufficiently natural look.

- I also downsampled the SSR buffer by 2x, cutting the number of pixels to be traced and slightly blurring the output image. When rendering to an output frame with high enough resolution, there is almost no noticeable difference between SSR at full frame resolution and SSR at the downsampled resolution.

- Due to the nature of the screen-space reflection algorithm, there are edge cases where it simply does not work: it cannot provide reflections for objects whose rays are directed off-screen (these misses frequently appear at the edges of the frame); it cannot accurately register hits for pixels with large disparities in depth between the initial ray and the hit point without a large number of marching steps; and it cannot resolve reflected rays at angles so steep, relative to the incident ray, that the marching precision is insufficient to register hits. Its use is therefore limited to very favorable conditions: rays with incident angles fairly close in orientation to the camera/view direction, close in depth to the starting depth value, and close to the center of the screen. Fortunately, this is enough in most cases to create somewhat realistic-looking reflections. Three kinds of attenuation help the results look more natural: attenuation by distance travelled along the ray, attenuation by the hit point's x/y distance to the screen edge, and attenuation by the angle between the incident ray and the surface normal. With some adjustment, these factors taper off reflections at the edge of the screen and help hide unnatural-looking borders in the reflection output. For my own implementation, simple linear attenuation was good enough (see the second sketch after this list).

- Early exit checks can be implemented for various conditions. Checking during ray marching whether the ray has left the valid screen-space range (any x, y, or depth value outside [0, 1]) can greatly speed up rendering. Some of the previously mentioned attenuation factors can also be pre-calculated before ray marching, and when their values would attenuate the output to near zero, ray marching can be abandoned altogether. Some amount of tweaking is required here to balance performance gain against output appearance.
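
As a minimal sketch, the glossiness-based blend might look like this; ssrRawTex, ssrBlurTex, and the use of gMaterial's alpha channel for glossiness are illustrative assumptions:

    // Blend sharp and blurred SSR output according to per-pixel glossiness;
    // rougher surfaces (lower glossiness) receive more of the blurred result.
    vec4 sharpSSR = texture(ssrRawTex, uv);
    vec4 blurredSSR = texture(ssrBlurTex, uv);
    float glossiness = texture(gMaterial, uv).a;
    vec4 reflection = mix(blurredSSR, sharpSSR, glossiness);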
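
And here is a sketch of the simple linear attenuation factors, applied once a hit is found, reusing the ray variables from the code sample; maxTravelDist and edgeFadeStart are illustrative tuning uniforms:

    // Taper reflections by travel distance, proximity to the screen
    // edge, and viewing angle, all with simple linear falloff.
    float travelled = length(ray.currPos - ray.o_screen);
    vec2 edgeDist = min(ray.currPos.xy, 1.0 - ray.currPos.xy);

    float distFade = clamp(1.0 - travelled / maxTravelDist, 0.0, 1.0);
    float edgeFade = clamp(min(edgeDist.x, edgeDist.y) / edgeFadeStart, 0.0, 1.0);
    float angleFade = clamp(dot(-V, ray.viewNormal), 0.0, 1.0);

    out_col *= distFade * edgeFade * angleFade;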


Things I Tried That Didn't Help

- Multi-sampling rays per-pixel and averaging results. The computational cost isn't worth it for secondary rays.

- Applying a multi-pass bilateral filter weighted according to depth buffer value. This tended to make the results look too sharp at the edges in a surreal way, as if the reflections were outlined.

- Applying morphological dilation and other in-fill methods to the results to fill missed rays. Anything more computationally expensive than blurring isn't worth the performance loss.


References

McGuire, Morgan, and Michael Mara. "Efficient GPU Screen-Space Ray Tracing". Journal of Computer Graphics Techniques, Vol. 3, No. 4, 2014. http://jcgt.org/published/0003/04/04/paper.pdf

Wronski, Bart. "GDC follow-up: Screenspace reflections filtering and upsampling". March 23, 2014. https://bartwronski.com/2014/03/23/gdc-follow-up-screenspace-reflections-filtering-and-up-sampling/
