Texture AntiAliasing


Concepts

Using texture mapping for geometry antialiasing is not a new idea. I found descriptions and presentations about it, but no actual code. The basic idea is to replace the original primitive with texture mapped triangles, as illustrated:

The texture applied to the triangles is simply an antialiased circle. Slices of the circle are stretched out for the line and triangle cases. To accommodate fractionally variable primitive sizes, we can generate the circle texture at an arbitrary maximum size and generate mipmaps for smaller sizes. Trilinear mipmapping will then approximate a texture for any desired size.


Implementation

My implementation, dubbed "glAArg" (OpenGL AntiAliased rendering glue), has two basic parts:

First, the texture. The circle needs to retain a one-pixel empty border around it at every mip level, for blending. Ideally GL_ARB_texture_border_clamp could be used for this, but that extension is not available on all GPUs, so I fell back on a less efficient but universally supported approach. As shown, a 2D texture four times larger than the desired circle size is generated; texture REPEAT mode wraps the empty space around all sides of the circle. The circle itself is drawn into the texture with 8-way symmetry and variable exponential falloff from the center to the edge (sharp falloff pictured).

Second, I made a set of wrapper APIs which intercept normal immediate mode GL calls and queue their data to do the right thing when enough vertices have been submitted for the current primitive type. For example, glAAVertex2f(x, y) will buffer the first vertex of a line segment. When the second vertex is submitted, it will then calculate eight vertices to project a rectangle and two endcaps around the line, and perform a real submit to the GL. If the hardware supports it, the calculated vertices / colors / texture coordinates are stored in a large array and submitted in chunks using VAR (the vertex array range extension). The APIs are written in plain C and have been compiled into a GLUT demo app under Mac OS X, Linux, and Windows.


Results

Here is the test pattern rendered on four different GPUs with texture antialiasing:

The results are not quite bit-for-bit identical, due to tiny differences in the texture filtering between GPUs, but they are nearly identical, despite the very different results the same GPUs had with regular antialiasing and FSAA. In addition, the quality is better than any of the previous approaches, and in some cases arguably better than Quartz:

And some things are worse:

There are also some extra things that can be done in my textured implementation:


Performance

So now you are thinking, surely this quality comes at a huge speed hit? Well, yes and no. On the plus side, some operations that would require a state change in OpenGL (and thereby prevent using VAR for acceleration) such as setting the point size or line width can now be avoided, since these properties are calculated manually anyway. This means that for cases where the size of every primitive changes, the texture approach ends up being faster than regular antialiasing, because it can be accelerated with VAR. And in any case, it is much faster than Quartz.
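
The state-change point can be sketched in C as follows. This is an illustration under assumptions, not the glAArg code: `Batch`, `batch_point`, `batch_flush` and `count_flush` are hypothetical names, and the flush callback merely counts submissions where a real program would issue a vertex array draw. Each point's size is baked into its four corner positions, so points of different sizes share one array and one submission.

```c
#include <stddef.h>

typedef struct { float x, y, u, v; } BatchVertex;

/* Called whenever the array is submitted; stands in for a real GL draw. */
typedef void (*FlushFn)(const BatchVertex *verts, size_t count, void *user);

typedef struct {
    BatchVertex *verts;     /* caller-provided storage */
    size_t       capacity;  /* in vertices, assumed to be at least 4 */
    size_t       count;     /* vertices currently queued */
    FlushFn      flush;
    void        *user;
} Batch;

/* Submit whatever is queued and start over. */
static void batch_flush(Batch *b)
{
    if (b->count > 0) {
        b->flush(b->verts, b->count, b->user);
        b->count = 0;
    }
}

/* Queue one point as a textured quad; its size needs no GL state change. */
static void batch_point(Batch *b, float x, float y, float size)
{
    float r = 0.5f * size;
    BatchVertex *v;

    if (b->count + 4 > b->capacity)
        batch_flush(b);                 /* array full: submit this chunk */

    v = b->verts + b->count;
    v[0] = (BatchVertex){ x - r, y - r, 0.0f, 0.0f };
    v[1] = (BatchVertex){ x + r, y - r, 1.0f, 0.0f };
    v[2] = (BatchVertex){ x + r, y + r, 1.0f, 1.0f };
    v[3] = (BatchVertex){ x - r, y + r, 0.0f, 1.0f };
    b->count += 4;
}

/* Example callback: counts submissions in an int pointed to by user. */
static void count_flush(const BatchVertex *verts, size_t count, void *user)
{
    (void)verts;
    (void)count;
    ++*(int *)user;
}
```

With regular GL points, five points of five different sizes would each need their own size change and draw; here the number of submissions depends only on the array capacity.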

On the minus side, of course there is a lot of extra geometry calculation to do, and three or four times as many vertices to transform and render. Triangles in particular are a lot slower. Naturally, my implementation could be further optimized. For example, several areas are ripe for AltiVec acceleration. And on modern GPUs, ultimately it should be possible to implement all of the geometry projection as vertex shader programs.

The exact performance of course depends on both the speed of the CPU and the GPU's vertex and fillrate limitations. Here is a chart benchmarking various machines in all of the rendering modes, counting thousands of points/lines/triangles drawn per second:

System                               | Aliased      | Hardware AA  | FSAA 2x      | FSAA 4x      | FSAA 6x      | Texture AA   | Software AA  | Quartz
Rage 128 (iMac DVse 400 MHz G3)      | 159/13/27    | 159/13/27    | -            | -            | -            | 99/31/12     | 10/8/0       | 4/0/0
Rage 128 Pro (Sawtooth 500 MHz G4)   | 256/35/92    | 256/34/91    | -            | -            | -            | 293/92/41    | 14/10/0      | 5/0/0
Radeon 7000 (iBook 700 MHz G3)       | 201/239/101  | 192/228/88   | 195/176/51   | 156/136/27   | -            | 191/75/50    | 15/12/0      | 7/0/0
Radeon 7500 (iBook 800 MHz G3)       | 200/287/150  | 200/288/121  | 200/203/76   | 200/175/44   | -            | 200/78/76    | 17/14/0      | 8/0/0
Radeon 8500 (QS '02 Dual 1 GHz G4)   | 283/465/285  | 279/448/285  | 281/429/212  | 280/428/122  | -            | 700/341/141  | 20/19/0      | 11/0/0
Radeon 9000 (QS '02 Dual 1 GHz G4)   | 285/458/301  | 285/466/301  | 277/403/150  | 279/402/101  | -            | 793/326/148  | 20/19/0      | 11/0/0
Radeon 9200 (iBook 800 MHz G4)       | 190/325/151  | 190/296/151  | 188/240/87   | 180/200/53   | -            | 365/162/81   | 14/13/0      | 8/0/0
Radeon 9600 (Dual 2.0 GHz G5)        | 475/576/298  | 456/559/298  | 453/558/199  | 469/298/120  | 460/200/85   | 1181/475/198 | 57/32/0      | 22/0/0
Radeon 9700 (FW800 Dual 1.42 GHz G4) | 295/373/600  | 298/381/600  | 299/382/301  | 296/374/201  | 298/301/150  | 1198/510/194 | 28/27/0      | 15/0/0
Radeon 9800 (1.6 GHz G5)             | 395/512/600  | 385/485/600  | 386/487/300  | 389/397/201  | 378/302/150  | 1196/599/288 | 51/28/0      | 20/0/0
GeForce2 MX (QS Dual 800 MHz G4)     | 397/242/120  | 183/155/121  | 397/174/67   | 382/136/36   | -            | 379/134/56   | 16/15/0      | 9/0/0
GeForce4 MX (FW800 Dual 1.42 GHz G4) | 598/302/200  | 290/237/200  | 397/242/120  | 588/202/75   | -            | 595/200/86   | 28/27/0      | 15/0/0
GeForce4 Ti (FW800 Dual 1.42 GHz G4) | 847/731/600  | 195/405/600  | 855/602/300  | 871/403/150  | -            | 1198/530/188 | 28/27/0      | 15/0/0
GeForceFX 5200 (Dual 1.8 GHz G5)     | 1152/602/596 | 247/243/536  | 1152/602/300 | 596/401/151  | -            | 1199/600/285 | 52/28/0      | 20/0/0

Each cell is points/lines/triangles, in thousands per second. A dash marks a mode with no figure listed for that system.

Testing Notes:

Summary

The screenshots should convince you that the texture antialiasing quality is roughly equal to Quartz, and much better than any other OpenGL approach.

From the benchmark chart, we can see:

So there is a definite speed hit compared to regular OpenGL rendering overall, but in certain situations (randomly sized points) it can actually be faster. It is much faster than Quartz at nearly the same quality, and for my applications, Quartz is the only viable alternative given all the inconsistencies and problems with other OpenGL antialiasing methods.

Of course Texture AA is not a replacement for Quartz. It only draws a few types of antialiased shapes, nothing else. It is also subject to other limitations of OpenGL, such as maximum window size, which Quartz does not have. But for my needs it is a clear win.


Source Code

The current version of glAArg is 0.2.2:

Download
(Note: this software, written in 2004, has some assumptions based on the PPC Mac architecture of the day. The basic ideas can still be applied to current OpenGL or OpenGL ES.)