A Simple and Practical Approach to SSAO

Programming/Translation 2012. 7. 5. 17:38

Original article: http://www.gamedev.net/page/resources/_/technical/graphics-programming-and-theory/a-simple-and-practical-approach-to-ssao-r2753

Introduction

Global illumination (GI) is a term used in computer graphics to refer to all lighting phenomena caused by interaction between surfaces (light rebounding off them, refracting, or getting blocked), for example color bleeding (the color of a surface tinting nearby objects), caustics (light focused into bright patterns by reflection or refraction), and shadows. The term is often used more narrowly to mean only color bleeding and realistic ambient lighting. Direct illumination, meaning light that comes directly from a light source, is easily computed in real time with today's hardware. The same cannot be said of GI, because for every surface in the scene we need to gather information about nearby surfaces, and the cost of doing so quickly gets out of control. However, some approximations to GI are easier to manage. When light travels through a scene, rebounding off surfaces, some places have a smaller chance of being hit by light: corners, tight gaps between objects, creases, and so on. Those areas end up darker than their surroundings.

This effect is called ambient occlusion (AO). The usual way to simulate this darkening is to test, for each surface, how much it is "occluded" or "blocked from light" by other surfaces. Calculating this is faster than trying to account for all global lighting effects, but most existing AO algorithms still can't run in real time.

Real-time AO was out of reach until Screen Space Ambient Occlusion (SSAO) appeared. SSAO is a method to approximate ambient occlusion in screen space. It was first used in games by Crytek, in their "Crysis" franchise, and has been used in many other games since. In this article I will explain a simple and concise SSAO method that achieves better quality than the traditional implementation.

The SSAO in Crysis

Prerequisites

The original implementation by Crytek took a depth buffer as input and worked roughly like this: for each pixel in the depth buffer, sample a few points in 3D around it, project them back to screen space, and compare the depth of each sample with the depth stored at that position in the depth buffer to determine whether the sample is in front of a surface (no occlusion) or behind it (it hits an occluding object). An occlusion buffer is generated by averaging the distances of occluded samples to the depth buffer. However, this approach has some problems (such as self-occlusion and haloing) that I will illustrate later.

The algorithm I describe here does all calculations in 2D; no projection is needed. It uses per-pixel position and normal buffers, so if you're using a deferred renderer you have half of the work done already. If you're not, you can try to reconstruct position from depth, or you can store per-pixel position directly in a floating point buffer. I recommend the latter if this is your first time implementing SSAO, as I will not discuss position reconstruction from depth here. Either way, for the rest of the article I'll assume you have both buffers available. Positions and normals need to be in view space.
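For readers without a deferred renderer, the following is a minimal sketch of a pass that stores view-space position and normal per pixel. It is not part of the original article: the matrix names `g_world_view` and `g_world_view_proj`, the struct names, and the row-vector `mul` convention are all assumptions, and the normal is packed as `n * 0.5 + 0.5` only because the article's `getNormal()` unpacks with `* 2.0 - 1.0`.

```hlsl
// Hypothetical G-buffer pass (not from the article): writes view-space
// position and normal to two render targets for the SSAO shader to read.
float4x4 g_world_view;       // assumed: world * view matrix
float4x4 g_world_view_proj;  // assumed: world * view * projection matrix

struct GB_VS_INPUT
{
    float4 pos    : POSITION;
    float3 normal : NORMAL;
};

struct GB_VS_OUTPUT
{
    float4 pos      : POSITION;
    float3 view_pos : TEXCOORD0;
    float3 view_nrm : TEXCOORD1;
};

struct GB_PS_INPUT
{
    float3 view_pos : TEXCOORD0;
    float3 view_nrm : TEXCOORD1;
};

struct GB_PS_OUTPUT
{
    float4 position : COLOR0; // floating point render target
    float4 normal   : COLOR1;
};

GB_VS_OUTPUT gbuffer_vs(GB_VS_INPUT i)
{
    GB_VS_OUTPUT o;
    o.pos      = mul(i.pos, g_world_view_proj);
    o.view_pos = mul(i.pos, g_world_view).xyz;
    // Correct for rigid transforms and uniform scale only;
    // otherwise transform normals by the inverse transpose.
    o.view_nrm = mul(i.normal, (float3x3)g_world_view);
    return o;
}

GB_PS_OUTPUT gbuffer_ps(GB_PS_INPUT i)
{
    GB_PS_OUTPUT o;
    o.position = float4(i.view_pos, 1.0f);
    // Pack [-1,1] into [0,1]; getNormal() in the SSAO shader undoes this.
    o.normal = float4(normalize(i.view_nrm) * 0.5f + 0.5f, 1.0f);
    return o;
}
```

Storing the full position costs more bandwidth than storing depth alone, but it keeps the SSAO shader free of any reconstruction math.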

What we are going to do in this article is exactly this: take the position and normal buffers and generate an occlusion buffer with one component per pixel. How to use this occlusion information is up to you. The usual way is to subtract it from the ambient lighting in your scene, but you can also use it in more convoluted or unusual ways for NPR (non-photorealistic) rendering if you wish.
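As an illustration of the usual case, here is a small sketch that attenuates only the ambient term by the occlusion value, in the same way the Sponza screenshot later in the article multiplies ambient lighting by (1.0 - AO). The sampler `g_buffer_ao`, the constant `g_ambient`, and the function name are assumptions made for this example, and it assumes the buffer stores occlusion (0 means unoccluded).

```hlsl
sampler g_buffer_ao;  // assumed: occlusion buffer written by the SSAO pass
float3  g_ambient;    // assumed: ambient light color

// Combines lighting for one pixel. 'direct' is the already computed
// direct lighting, which is left untouched by ambient occlusion.
float3 applyOcclusion(in float2 uv, in float3 albedo, in float3 direct)
{
    float  ao      = saturate(tex2D(g_buffer_ao, uv).r);
    float3 ambient = g_ambient * (1.0f - ao);
    return albedo * (ambient + direct);
}
```

Leaving direct light unaffected avoids darkening surfaces that are fully lit, which is a common reason to prefer this over multiplying the final color.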

Algorithm

Given any pixel in the scene, it is possible to calculate its ambient occlusion by treating all neighboring pixels as small spheres and adding together their contributions. To simplify things, we will work with points instead of spheres: occluders will be just points with no orientation, and the occludee (the pixel which receives occlusion) will be a position/normal pair <P, N>. The occlusion contribution of each occluder then depends on two factors:

Distance "d" to the occludee.
Angle between the occludee's normal "N" and the vector between occluder and occludee "V".

With these two factors in mind, a simple formula to calculate occlusion is:

Occlusion = max( 0.0, dot( N, V ) ) * ( 1.0 / ( 1.0 + d ) )

The first term, max( 0.0, dot( N, V ) ), is based on the intuitive idea that points directly above the occludee contribute more than points that are near it but not quite on top of it. The second term, ( 1.0 / ( 1.0 + d ) ), attenuates the effect with distance, using a denominator that grows linearly with d. You could use quadratic attenuation or any other function instead; it's just a matter of taste.
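To make the choice of falloff concrete, the sketch below writes the formula as a function with a swappable attenuation term. The function names are hypothetical, and the quadratic variant is just one possible alternative, not something the article prescribes.

```hlsl
// Falloff used in the article's formula.
float attenuateInverse(in float d)
{
    return 1.0f / (1.0f + d);
}

// One possible alternative: falls off faster for distant occluders.
float attenuateQuadratic(in float d)
{
    return 1.0f / (1.0f + d * d);
}

// Occlusion of a point with normal n at occludee_pos caused by a single
// point occluder. All vectors are assumed to be in view space.
float pointOcclusion(in float3 n, in float3 occludee_pos, in float3 occluder_pos)
{
    float3 diff = occluder_pos - occludee_pos;
    float  d    = length(diff);
    float3 v    = diff / max(d, 1e-5f); // guard against a zero-length vector
    return max(0.0f, dot(n, v)) * attenuateQuadratic(d);
}
```

Swapping `attenuateQuadratic` for `attenuateInverse` gives back the formula above.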

The algorithm is very easy: sample a few neighbors around the current pixel and accumulate their occlusion contributions using the formula above. To gather occlusion, I use 4 sample directions (<1,0>, <-1,0>, <0,1>, <0,-1>), reflected using a random normal texture and then used both as they are (90º apart) and rotated by 45º.

Some tricks can be applied to speed up the calculations or improve the result: you can use half-sized position and normal buffers, and you can also apply a bilateral blur to the resulting SSAO buffer to hide sampling artifacts if you wish. Note that these two techniques can be applied to any SSAO algorithm.
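The article does not give blur code, so the following is only a sketch of a 3x3 depth-aware blur, a simplified form of bilateral filter in which all taps share the same spatial weight and only the depth difference changes a tap's influence. The names `g_buffer_ao`, `g_inv_screen_size`, and `g_depth_sharpness` are assumptions introduced for this example.

```hlsl
sampler g_buffer_ao;        // assumed: raw SSAO output
sampler g_buffer_pos;       // view-space positions, as in the SSAO shader
float2  g_inv_screen_size;  // assumed: 1.0 / (width, height) in pixels
float   g_depth_sharpness;  // assumed: larger values preserve edges more

float4 blur_main(float2 uv : TEXCOORD0) : COLOR0
{
    float center_z   = tex2D(g_buffer_pos, uv).z;
    float sum        = 0.0f;
    float weight_sum = 0.0f;

    for (int y = -1; y <= 1; ++y)
    {
        for (int x = -1; x <= 1; ++x)
        {
            float2 tap = uv + float2(x, y) * g_inv_screen_size;
            float  dz  = tex2D(g_buffer_pos, tap).z - center_z;
            // Taps at a very different depth belong to another surface,
            // so they get a small weight and edges stay sharp.
            float  w   = 1.0f / (1.0f + g_depth_sharpness * abs(dz));
            sum        += tex2D(g_buffer_ao, tap).r * w;
            weight_sum += w;
        }
    }

    // The center tap always has weight 1, so weight_sum is never zero.
    float ao = sum / weight_sum;
    return float4(ao, ao, ao, 1.0f);
}
```

A plain Gaussian blur would smear occlusion across object silhouettes; weighting by depth difference is what keeps the darkening on the surface it belongs to.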

This is the HLSL pixel shader code for the effect, which has to be applied to a full screen quad:

sampler g_buffer_norm;   // view-space normals, packed into [0,1]
sampler g_buffer_pos;    // view-space positions
sampler g_random;        // tiled random normal texture
float2 g_screen_size;    // screen width and height in pixels
float random_size;       // size of the random texture in texels
float g_sample_rad;
float g_intensity;
float g_scale;
float g_bias;

struct PS_INPUT
{
    float2 uv : TEXCOORD0;
};

struct PS_OUTPUT
{
    float4 color : COLOR0;
};

float3 getPosition(in float2 uv)
{
    return tex2D(g_buffer_pos, uv).xyz;
}

float3 getNormal(in float2 uv)
{
    return normalize(tex2D(g_buffer_norm, uv).xyz * 2.0f - 1.0f);
}

float2 getRandom(in float2 uv)
{
    return normalize(tex2D(g_random, g_screen_size * uv / random_size).xy * 2.0f - 1.0f);
}

// Occlusion of the occludee (position p, normal cnorm) caused by the
// surface point stored at tcoord + uv.
float doAmbientOcclusion(in float2 tcoord, in float2 uv, in float3 p, in float3 cnorm)
{
    float3 diff = getPosition(tcoord + uv) - p;
    const float3 v = normalize(diff);
    const float d = length(diff) * g_scale;
    return max(0.0, dot(cnorm, v) - g_bias) * (1.0 / (1.0 + d)) * g_intensity;
}

PS_OUTPUT main(PS_INPUT i)
{
    PS_OUTPUT o = (PS_OUTPUT)0;

    o.color.rgb = 1.0f;
    const float2 vec[4] = { float2(1,0), float2(-1,0),
                            float2(0,1), float2(0,-1) };

    float3 p = getPosition(i.uv);
    float3 n = getNormal(i.uv);
    float2 rand = getRandom(i.uv);

    float ao = 0.0f;
    float rad = g_sample_rad / p.z;

    //**SSAO Calculation**//
    int iterations = 4;
    for (int j = 0; j < iterations; ++j)
    {
        float2 coord1 = reflect(vec[j], rand) * rad;
        // coord1 rotated by 45 degrees (0.707 = cos 45 = sin 45)
        float2 coord2 = float2(coord1.x*0.707 - coord1.y*0.707,
                               coord1.x*0.707 + coord1.y*0.707);

        ao += doAmbientOcclusion(i.uv, coord1*0.25, p, n);
        ao += doAmbientOcclusion(i.uv, coord2*0.5, p, n);
        ao += doAmbientOcclusion(i.uv, coord1*0.75, p, n);
        ao += doAmbientOcclusion(i.uv, coord2, p, n);
    }
    ao /= (float)iterations * 4.0;
    //**END**//

    // Do stuff here with your occlusion value "ao": modulate ambient lighting,
    // write it to a buffer for later use, etc.
    return o;
}

The concept is very similar to the image space approach presented in "Hardware Accelerated Ambient Occlusion Techniques on GPUs" [1], the main differences being the sampling pattern and the AO function. It can also be understood as an image-space version of "Dynamic Ambient Occlusion and Indirect Lighting" [2]. Some details worth mentioning about the code:

The radius is divided by p.z to scale it depending on the distance to the camera. If you skip this division, all pixels on screen will use the same sampling radius, and the output will lose the illusion of perspective.
During the for loop, coord1 holds the original sampling coordinates, 90º apart. coord2 holds the same coordinates rotated by 45º.
The random texture contains randomized normal vectors, so it looks like an ordinary normal map. This is the random normal texture I use:

[Image: random normal texture]

It is tiled across the screen and then sampled for each pixel, using these texture coordinates:

g_screen_size * uv / random_size

Here "g_screen_size" contains the width and height of the screen in pixels and "random_size" is the size of the random texture (the one I use is 64x64). The normal obtained by sampling the texture is then used to reflect the sampling vector inside the for loop, which gives a different sampling pattern for each pixel on the screen (see "Interleaved Sampling" [4] in the references section).
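Tiling only works if the random texture repeats when the coordinates leave the [0,1] range, which the article does not spell out. Assuming the Direct3D 9 effect framework is used, the sampler could be declared roughly as follows in place of the plain `sampler g_random;` declaration. The texture variable name `g_random_tex` is hypothetical, and point filtering is a choice made here so that neighboring random normals are not blended together.

```hlsl
texture g_random_tex;  // assumed name: the 64x64 random normal texture

sampler g_random = sampler_state
{
    Texture   = <g_random_tex>;
    AddressU  = Wrap;   // repeat horizontally so the texture tiles
    AddressV  = Wrap;   // repeat vertically
    MinFilter = Point;  // keep each texel's normal unblended
    MagFilter = Point;
    MipFilter = None;
};
```

With a 1024x768 screen and a 64x64 texture, `g_screen_size * uv / random_size` runs from 0 to (16, 12), so the texture repeats 16 times across and 12 times down, and each screen pixel maps to exactly one texel.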

In the end, the shader boils down to iterating through some occluders, invoking our AO function for each of them, and accumulating the results. There are four artist variables in it:

g_scale: scales the distance between occluders and occludee.
g_bias: controls the width of the occlusion cone considered by the occludee.
g_sample_rad: the sampling radius.
g_intensity: the AO intensity.

Once you tweak the values a bit and see how the AO reacts to them, achieving the effect you want becomes very intuitive.

Results

a) raw output, 1 pass 16 samples b) raw output, 1 pass 8 samples c) directional light only d) directional light minus AO, 2 passes 16 samples each.

As you can see, the code is short and simple, and the results show no self-occlusion and very little to no haloing. These are the two main problems of all the SSAO algorithms that use only the depth buffer as input; you can see them in these images:

[Image: self-occlusion and haloing in depth-only SSAO]

Self-occlusion appears because the traditional algorithm samples inside a sphere around each pixel, so on non-occluded planar surfaces at least half of the samples are marked as 'occluded'. This gives the overall occlusion a grayish color. Haloing causes soft white edges around objects, because in these areas self-occlusion does not take place. So getting rid of self-occlusion actually helps a lot in hiding the halos.

The resulting occlusion from this method is also surprisingly consistent when moving the camera around. If you go for quality instead of speed, it is possible to use two or more passes of the algorithm (duplicate the for loop in the code) with different radii, one to capture more global AO and another to bring out small crevices. With lighting and/or textures applied, the sampling artifacts are less apparent, so you should usually not need an extra blurring pass.
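One way to structure the multi-radius variant is to move the sampling loop into a helper and call it once per radius. This is only a sketch: it relies on `doAmbientOcclusion()` and `g_sample_rad` from the shader above, while `gatherOcclusion`, `twoScaleOcclusion`, `g_fine_rad_scale`, and the choice to average the two results are assumptions made for this example.

```hlsl
float g_fine_rad_scale;  // assumed: e.g. 0.25, radius ratio of the fine pass

// The article's 16-sample loop, parameterized by screen-space radius.
float gatherOcclusion(in float2 uv, in float3 p, in float3 n,
                      in float2 rand, in float rad)
{
    const float2 vec[4] = { float2(1,0), float2(-1,0),
                            float2(0,1), float2(0,-1) };
    float ao = 0.0f;
    for (int j = 0; j < 4; ++j)
    {
        float2 coord1 = reflect(vec[j], rand) * rad;
        float2 coord2 = float2(coord1.x*0.707 - coord1.y*0.707,
                               coord1.x*0.707 + coord1.y*0.707);
        ao += doAmbientOcclusion(uv, coord1*0.25, p, n);
        ao += doAmbientOcclusion(uv, coord2*0.5,  p, n);
        ao += doAmbientOcclusion(uv, coord1*0.75, p, n);
        ao += doAmbientOcclusion(uv, coord2,      p, n);
    }
    return ao / 16.0f;
}

// Wide pass for large-scale occlusion plus a fine pass for small crevices.
float twoScaleOcclusion(in float2 uv, in float3 p, in float3 n, in float2 rand)
{
    float rad_wide = g_sample_rad / p.z;
    float rad_fine = rad_wide * g_fine_rad_scale;
    float ao_wide  = gatherOcclusion(uv, p, n, rand, rad_wide);
    float ao_fine  = gatherOcclusion(uv, p, n, rand, rad_fine);
    return saturate(0.5f * (ao_wide + ao_fine));
}
```

Averaging keeps the result in the same range as a single pass; summing or taking the maximum are equally plausible combinations and mostly a matter of the look you want.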

Taking it further

I have described a down-to-earth, simple SSAO implementation that suits games very well. However, it is easy to extend it to take into account hidden surfaces that face away from the camera, obtaining better quality. Usually this would require three buffers: two position/depth buffers (front and back faces) and one normal buffer. But you can do it with only two buffers: store the depth of front faces and back faces in the red and green channels of a single buffer, respectively, then reconstruct position from each one. This way you have one buffer for positions and a second buffer for normals.
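The article deliberately leaves position reconstruction out, so the sketch below is just one common way to do it, under several assumptions: both channels store view-space z divided by the far plane distance, the frustum is symmetric, view space has +z pointing away from the camera, and uv has its origin at the top-left corner. The names `g_buffer_depth`, `g_far_corner`, `viewRay`, and `getPositionFront` are hypothetical; `getPositionBack` here would replace the version that reads a separate back-face position buffer.

```hlsl
sampler g_buffer_depth;  // assumed: R = front-face depth, G = back-face depth,
                         // both stored as view-space z / far plane distance
float3  g_far_corner;    // assumed: view-space top-right corner of the far
                         // plane, (far*tan(fovx/2), far*tan(fovy/2), far)

// View-space point on the far plane seen through texture coordinate uv.
float3 viewRay(in float2 uv)
{
    float2 ndc = float2(uv.x * 2.0f - 1.0f, 1.0f - uv.y * 2.0f);
    return float3(ndc * g_far_corner.xy, g_far_corner.z);
}

// Scaling the far-plane point by z / far yields the view-space position.
float3 getPositionFront(in float2 uv)
{
    return viewRay(uv) * tex2D(g_buffer_depth, uv).r;
}

float3 getPositionBack(in float2 uv)
{
    return viewRay(uv) * tex2D(g_buffer_depth, uv).g;
}
```

Because the ray is computed from uv alone, the same functions work for the neighboring samples fetched inside the AO loop.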

These are the results when taking 16 samples for each position buffer:

left: front faces occlusion, right: back faces occlusion

To implement it, just add extra calls to "doAmbientOcclusion()" inside the sampling loop that sample the back-face position buffer when searching for occluders. As you can see, the back faces contribute very little, and they require doubling the number of samples, which almost doubles the render time. You could of course take fewer samples for back faces, but it is still not very practical.

This is the extra code that needs to be added:

Inside the for loop, add these calls:

ao += doAmbientOcclusionBack(i.uv,coord1*(0.25+0.125), p, n);
ao += doAmbientOcclusionBack(i.uv,coord2*(0.5+0.125), p, n);
ao += doAmbientOcclusionBack(i.uv,coord1*(0.75+0.125), p, n);
ao += doAmbientOcclusionBack(i.uv,coord2*1.125, p, n);

Add these two functions to the shader:

float3 getPositionBack(in float2 uv)
{
    return tex2D(g_buffer_posb, uv).xyz;
}

float doAmbientOcclusionBack(in float2 tcoord, in float2 uv, in float3 p, in float3 cnorm)
{
    float3 diff = getPositionBack(tcoord + uv) - p;
    const float3 v = normalize(diff);
    const float d = length(diff) * g_scale;
    return max(0.0, dot(cnorm, v) - g_bias) * (1.0 / (1.0 + d));
}

Add a sampler named "g_buffer_posb" containing the position of back faces (draw the scene with front face culling enabled to generate it).

Another small change that can be made, this time to improve speed instead of quality, is adding a simple LOD (level of detail) system to our shader. Replace the fixed number of iterations with this:

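// Note: vec[] has only 4 entries, so add more directions before looping past 4.
int iterations = (int)lerp(6.0, 2.0, saturate(p.z / g_far_clip));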

The variable "g_far_clip" is the distance of the far clipping plane, which must be passed to the shader. Now the number of iterations applied to each pixel depends on its distance to the camera. Distant pixels are therefore sampled more coarsely, which improves performance with no noticeable quality loss. I have not used this in the performance measurements (below), however.
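Because the LOD formula can ask for up to 6 iterations while the shader's `vec` array holds 4 directions, the loop needs more directions to read from. The fragment below is a sketch of how the sampling loop inside `main` might look with that in mind; the two diagonal entries are an assumption added for this example, not part of the article. Depending on the shader model and compiler, texture fetches inside a loop with a varying trip count may also need `tex2Dlod` instead of `tex2D`.

```hlsl
// Fragment intended to replace the sampling loop inside main().
// Six directions so that indices 4 and 5 are valid (last two are assumed).
const float2 lod_vec[6] = { float2(1,0), float2(-1,0),
                            float2(0,1), float2(0,-1),
                            float2(0.707,0.707), float2(-0.707,-0.707) };

// 6 iterations near the camera, 2 at the far plane.
int iterations = (int)lerp(6.0f, 2.0f, saturate(p.z / g_far_clip));

for (int j = 0; j < iterations; ++j)
{
    float2 coord1 = reflect(lod_vec[j], rand) * rad;
    float2 coord2 = float2(coord1.x*0.707 - coord1.y*0.707,
                           coord1.x*0.707 + coord1.y*0.707);

    ao += doAmbientOcclusion(i.uv, coord1*0.25, p, n);
    ao += doAmbientOcclusion(i.uv, coord2*0.5,  p, n);
    ao += doAmbientOcclusion(i.uv, coord1*0.75, p, n);
    ao += doAmbientOcclusion(i.uv, coord2,      p, n);
}
// Normalize by the number of samples actually taken for this pixel.
ao /= (float)iterations * 4.0f;
```

The `saturate` keeps the interpolation factor in [0,1], so the iteration count stays between 2 and 6 even if a stored depth exceeds the far plane distance.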

Conclusion and Performance Measurements

As I said at the beginning of the article, this method is very well suited for games using deferred lighting pipelines because it requires two buffers that are usually already available. It is straightforward to implement, and the quality is very good. It solves the self-occlusion issue and reduces haloing, but apart from that it has the same limitations as other screen-space ambient occlusion techniques.

Disadvantages:

Does not take into account hidden geometry (especially geometry outside the frustum).
The performance is very dependent on sampling radius and distance to the camera, since objects near the front plane of the frustum will use bigger radii than those far away.
The output is noisy.

Speed-wise, a 16-sample implementation is roughly equal to a 4x4 Gaussian blur, since it fetches only one texture per sample and the AO function is really simple, but in practice it is a bit slower. Here's a table showing the measured speed in a scene with the Hebe model at 900x650 with no blur applied on an Nvidia 8800GT:

In these last screenshots you can see how this algorithm looks when applied to different models. At highest quality (32 samples front and back faces, very big radius, 3x3 bilateral blur):

At lowest quality (8 samples front faces only, no blur, small radius):

It is also useful to consider how this technique compares to ray-traced AO. The purpose of this comparison is to see if the method would converge to real AO when using enough samples.

Left: the SSAO presented here, 48 samples per pixel (32 for front faces and 16 for back faces), no blur. Right: Ray traced AO in Mental Ray. 32 samples, spread = 2.0, maxdistance = 1.0; falloff = 1.0.

One last word of advice: don't expect to plug the shader into your pipeline and get a realistic look automatically. Although this implementation has a good performance/quality ratio, SSAO is a time-consuming effect, and you should tweak it carefully to suit your needs and obtain the best performance possible. Add or remove samples, add a bilateral blur on top, change the intensity, and so on. You should also consider whether SSAO is the way to go for you. Unless you have lots of dynamic objects in your scene, you should not need SSAO at all; light maps may be enough for your purpose, as they can provide better quality for static scenes.

I hope you will benefit in some way from this method. All code included in this article is made available under the MIT license.

References

[1] Hardware Accelerated Ambient Occlusion Techniques on GPUs
(Perumaal Shanmugam)

[2] Dynamic Ambient Occlusion and Indirect Lighting
(Michael Bunnell)

[3] Image-Based Proxy Accumulation for Real-Time Soft Global Illumination
(Peter-Pike Sloan, Naga K. Govindaraju, Derek Nowrouzezahrai, John Snyder)

[4] Interleaved Sampling
(Alexander Keller, Wolfgang Heidrich)

Crytek's Sponza rendered at 1024x768, 175 fps with a directional light.

The same scene rendered at 1024x768, 110 fps using SSAO medium settings: 16 samples, front faces, no blur. Ambient lighting has been multiplied by (1.0-AO).

The Sponza model was downloaded from Crytek's website.

About the Author(s)

José María Méndez is a 23 year old computer engineering student. He has been writing amateur games for 6 years and is currently working as lead programmer at a startup company called Minimal Drama Game Studio.


Posted by msparkms
