t3ssel8r

Pixel Art AA Video Notes

Added 2023-05-20 12:50:16 +0000 UTC

Edit: The final video, and therefore this post, is now publicly available :) Thanks for the early feedback.

Below are some supplementary notes:

Bilinear Sampling

In the video, I describe the bilinear sampler as a box filter, but it's important to note here that I'm referring to a theoretical continuous-domain box filter, rather than a discrete one of the kind you would compute in image processing. Just as discrete filters can be represented as convolution with a kernel matrix, so too can the continuous-domain box filter, with a continuous-domain kernel function:

k(x, y) = rect(x/w) * rect(y/h) / (w*h)

where rect is the rectangular function and w and h are the dimensions of the box. The nuance here is that, while this filter is not computable in the general case, it does have a closed-form solution when applied to a texture with discrete colors in a texel grid: solving the integral for continuous-domain convolution results in a weighted average of the texel colors, with each texel weighted by how much of its area falls within the box. When the box size is exactly equal to the texel size, the equation for the weighted average exactly matches the equation for bilinear interpolation.
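As a quick 1D sanity check of that last claim (the notation is mine, not from the video): put texel centers at integer coordinates, so that texel i covers [i - 1/2, i + 1/2] and has color c_i, and center a box of width 1 at x = i + t with 0 ≤ t ≤ 1. The box then covers [x - 1/2, x + 1/2], and the overlap lengths work out to:

```latex
\begin{aligned}
\text{overlap with texel } i     &= \left(i + \tfrac12\right) - \left(x - \tfrac12\right) = 1 - t \\
\text{overlap with texel } i + 1 &= \left(x + \tfrac12\right) - \left(i + \tfrac12\right) = t \\
g(x) &= (1 - t)\,c_i + t\,c_{i+1}
\end{aligned}
```

That is plain linear interpolation between the two texel colors. Because the box kernel factors per axis, the 2D area weights are products of the 1D ones, (1 - t_x)(1 - t_y), t_x(1 - t_y), (1 - t_x)t_y and t_x t_y, which are the bilinear weights.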

The key insight is that when we're computing these weighted averages, so long as we keep the relative areas in proportion, we can change the size of the box while maintaining the same average. On the flip side, we can convert any box that overlaps at most 4 texels into a texel-sized box, and this is the trick used in the video to compute pixel-sized box filters in texel-space using the bilinear sampler.
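Here is a minimal sketch of that conversion, not necessarily identical to the shader in the video. The function name is hypothetical, boxSize is assumed to be given in texels with each component in (0, 1], and texelSize is assumed to follow the layout of Unity's _MainTex_TexelSize (xy = 1 / resolution, zw = resolution):

```hlsl
// Hypothetical helper: returns the UV at which a single bilinear tap
// reproduces a box filter of size boxSize (in texels) centered on uv.
// Assumes 0 < boxSize <= 1 per axis, so the box overlaps at most
// 2 texels per axis (4 in total).
float2 BoxFilterUV(float2 uv, float2 boxSize, float4 texelSize)
{
    // Lower corner of the box in texel units. Texel n spans [n, n + 1),
    // so its center sits at n + 0.5.
    float2 tx = uv * texelSize.zw - 0.5 * boxSize;

    // Fraction of the box that spills over into the next texel:
    // 0 while the box sits entirely inside texel floor(tx),
    // rising linearly to 1 as the box crosses the boundary.
    float2 spill = saturate((frac(tx) - (1.0 - boxSize)) / boxSize);

    // A texel-sized box centered at (texel center + spill) covers the two
    // neighboring texels in the same proportion, so the bilinear sampler
    // returns the same weighted average as the smaller box would.
    return (floor(tx) + 0.5 + spill) * texelSize.xy;
}
```

The texture has to be sampled with bilinear filtering enabled for this to work, since the bilinear sampler is what evaluates the texel-sized box.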

While this understanding is not necessary for implementing the original filter from Cole's blog, it is necessary (or at least, it provides a good intuitive framework) when we generalize the problem to solve for rotated and distorted textures.

Deformation Gradient

Modeling the deformation using its gradient relies on an assumption that I didn't mention in the video: that the deformation between pixel space and texture space is linear. This is generally a reasonable assumption even though the perspective transform is nonlinear, because pixels are small relative to the nonlinearity in the perspective transform.

The reason for using this linear model is that the shader language provides us with functions for calculating screen-space derivatives (ddx and ddy, or in GLSL, dFdx and dFdy), which the GPU computes efficiently by comparing values between neighboring pixels in a 2x2 quad.
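For reference, a minimal sketch of turning those derivatives into a box size. The function name and the texelSize parameter (laid out like Unity's _MainTex_TexelSize, with zw holding the texture resolution) are my own for illustration, and taking the axis-aligned extent of the pixel footprint is one reasonable choice, not necessarily exactly what the video's shader does:

```hlsl
// Hypothetical helper, intended for a pixel shader (ddx/ddy rely on
// neighboring pixels in the 2x2 quad).
// texelSize.zw = texture width and height in texels.
float2 PixelFootprintInTexels(float2 uv, float4 texelSize)
{
    // Position in texel units, so derivatives come out in texels per pixel.
    float2 texelCoord = uv * texelSize.zw;

    // Columns of the deformation gradient: how far the texel coordinate
    // moves when stepping one pixel along screen x (dx) or screen y (dy).
    float2 dx = ddx(texelCoord);
    float2 dy = ddy(texelCoord);

    // Under the linear model, one pixel maps to the parallelogram spanned
    // by dx and dy. Its axis-aligned bounding box measures |dx| + |dy|
    // per axis, which is also what fwidth(texelCoord) returns.
    return abs(dx) + abs(dy);
}
```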

Mipmapping/Anisotropic Filtering

When the texture is small enough on-screen that the pixels are actually bigger than the texels, the computed box size becomes larger than 1. In the video, I clamp the box size to 1 and let the GPU's texture filtering (mipmapping and anisotropic filtering) take over. However, with these filters enabled, we can end up sampling from the wrong mip level even when the box size is smaller than 1, because the GPU picks the mip level from the derivatives of the UVs it is given, and we applied a highly nonlinear transformation to those UVs. To keep things stable, we use tex2Dgrad (or in GLSL, textureGrad) to pass in the gradient of the original UVs.
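A minimal sketch of that call, with a hypothetical function name and parameter names of my own choosing (remappedUV is whatever the anti-aliasing remap produced, originalUV is the interpolated UV from the vertex stage):

```hlsl
// Hypothetical helper, intended for a pixel shader.
float4 SampleWithOriginalGradients(sampler2D tex, float2 remappedUV, float2 originalUV)
{
    // tex2D(tex, remappedUV) would let the GPU derive the mip level from
    // the derivatives of remappedUV, which jump around at texel boundaries.
    // Passing the derivatives of the original, smoothly varying UVs keeps
    // mip and anisotropy selection tied to the actual on-screen scale.
    return tex2Dgrad(tex, remappedUV, ddx(originalUV), ddy(originalUV));
}
```

In GLSL the equivalent call would be textureGrad(tex, remappedUV, dFdx(originalUV), dFdy(originalUV)).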

This is typically a bit slower than using tex2D because some extra data needs to be passed from the shader to the texture sampler, so for applications where small texels are rare, it may be advisable to leave these filters off. That said, I haven't personally benchmarked these variations, because they don't currently pose a performance bottleneck for my applications.

Smoothstep

Toward the end of the video, I rush through an explanation that we can compute a more complex weighted average, where we weight the center of the box more heavily than the edges, using a quadratic kernel:

k(x, y) = ⁹⁄₄ max(1-4x², 0) max(1-4y², 0)

This kernel turns out to be separable (here ∗ denotes convolution):

k(x, y) = ³⁄₂ δ(y) max(1-4x², 0) ∗ ³⁄₂ δ(x) max(1-4y², 0)

And if we analyze it in either dimension, it becomes the integral

g(t) = ∫ ³⁄₂ max((1-4𝜏²), 0) f(t-𝜏) d𝜏

where f is our input and g is the filtered output. Again, for a texture with a discrete grid, this integral has a closed-form solution, which is computable using the smoothstep function presented in the video.
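To sketch why smoothstep shows up, here is a worked 1D case under assumptions of my own: a single texel edge at position e, with color a to its left and color b to its right, filtered with the kernel at the unit width written above. The weight of b at output position t is the part of the kernel that lands on b's side of the edge (𝜏 < t - e), which is the running integral of the kernel:

```latex
\begin{aligned}
w_b(t) &= \int_{-1/2}^{\,t - e} \tfrac32 \left(1 - 4\tau^2\right) d\tau
        = \tfrac12 + \tfrac32 (t - e) - 2 (t - e)^3 \\
       &= 3s^2 - 2s^3, \qquad s = t - e + \tfrac12 \in [0, 1] \\
g(t)   &= \left(1 - w_b(t)\right) a + w_b(t)\, b
\end{aligned}
```

The polynomial 3s² - 2s³ is exactly what smoothstep(0, 1, s) evaluates, so the filtered edge is a smoothstep ramp one kernel-width wide, and the weight reaches 0 and 1 at the points where the kernel stops overlapping the edge.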

This derivation turns out to be a bit of a narrative trick: in reality, I started with the smoothstep function, which is a common trick for tightening up anti-aliased edges, and derived a physical interpretation that fell in line with the context provided by the video up until that point.

Straight/Premultiplied Alpha

This is kind of a pain point in the Unity engine, which insists on using straight alpha both when importing assets and in its standard shaders. Transparent shaders in Unity blend with

Blend SrcAlpha OneMinusSrcAlpha

Instead, we can process our textures to pre-multiply the alpha using a custom texture importer, and use the blend mode

Blend One OneMinusSrcAlpha

Note that switching to premultiplied alpha blending means that all of our color sources need to be premultiplied, so in the vertex shader, if we sample vertex color or a passed-in color uniform, we need to either ensure the input is premultiplied, or multiply it ourselves.

if (_Premultiply == 1) {
    // Convert a straight-alpha color to premultiplied alpha.
    data.color.rgb *= data.color.a;
}

In the video, I make the statement:

The correct solution is to use pre-multiplied alpha blending, which is an alternative way of encoding transparency that bypasses this issue entirely by having well-defined color for transparent texels, while otherwise producing exactly equivalent results.

A viewer pointed out to me that this wording is slightly misleading: in general, straight and premultiplied alpha encoded images do not behave the same when blending between different semitransparent colors. I highlighted one particularly problematic difference (the case with fully transparent texels bordering multiple different colored opaque texels) but there are other differences involving semi-transparent texels as well, and in all cases, premultiplied alpha produces a better outcome.
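A small worked example of the difference, with numbers of my own chosen for illustration: take an opaque red texel (1, 0, 0, 1) next to a fully transparent texel whose stored color happens to be black (0, 0, 0, 0), sample halfway between them with the bilinear sampler, and blend the result over a white background. The ideal answer for 50% coverage of red over white is (1, 0.5, 0.5).

```latex
\begin{aligned}
\textbf{Straight:}\quad
  & \tfrac12 (1, 0, 0, 1) + \tfrac12 (0, 0, 0, 0) = (0.5,\ 0,\ 0,\ 0.5) \\
  & \text{SrcAlpha, OneMinusSrcAlpha:}\quad
    0.5\,(0.5, 0, 0) + 0.5\,(1, 1, 1) = (0.75,\ 0.5,\ 0.5) \\
\textbf{Premultiplied:}\quad
  & \tfrac12 (1, 0, 0, 1) + \tfrac12 (0, 0, 0, 0) = (0.5,\ 0,\ 0,\ 0.5) \\
  & \text{One, OneMinusSrcAlpha:}\quad
    (0.5, 0, 0) + 0.5\,(1, 1, 1) = (1,\ 0.5,\ 0.5)
\end{aligned}
```

The interpolated values happen to be identical in both encodings here; what differs is how they are interpreted. Under straight alpha, the red has been darkened by the transparent texel's black, and storing a different color in that texel would produce a different fringe. Under premultiplied alpha, a fully transparent texel is necessarily (0, 0, 0, 0), and the result matches the ideal.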

Sprites with normal maps

Sometimes, pixel art assets are authored with normal maps for use in games that have dynamic per-pixel lighting. It's of course entirely possible to apply this same technique to anti-alias the normal map, so long as it has the same resolution as the color texture, although for textures with transparency, care might need to be taken to provide a 1-texel margin in the normal map so the bilinear sampler returns good results at the transparent edge.