Lay of the Land - Text Rendering

(Kae) #1

I recently had a discussion on Discord with a couple of people from the UI team about rendering text, and thought it would be worth posting here so it reaches more people.

What I want to cover is essentially the problem of rendering arbitrary text in a real-time context. There are a number of requirements here that the most naive solutions handle poorly:

  • Support arbitrary text resizing. For example, when text is placed in a 3D world and the player walks very close to it, a single glyph may cover the entire screen. Other cases include animating UI text sizes.
  • Very efficient rendering in terms of both GPU and CPU resources. Large spikes are especially undesirable, even as text changes frequently.
  • Dynamic visual effects such as shadows, outlines, gradients or glow.
  • Support for arbitrary glyphs as defined by font files, including emoji and other non-Latin glyphs.

Background
Before I go into the industry solutions I know of, let’s cover the source data we are actually trying to render. Mike Acton would be proud.

  • Text is usually defined as a series of Unicode “Code Points”.
  • A Font is a collection of Glyphs, together with a mapping from Code Points to Glyphs.
  • A Glyph is a representation of the actual visual artifact that is put on the screen.
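
To make those terms concrete, here is a minimal Rust sketch of the data model (the types and names are made up for illustration, not taken from any particular library):

```rust
use std::collections::HashMap;

// A font pairs a set of glyph outlines with a mapping ("cmap" in
// TrueType/OpenType terms) from code points to glyphs.
struct Font {
    char_to_glyph: HashMap<char, GlyphId>,
    glyphs: Vec<Glyph>,
}

#[derive(Clone, Copy)]
struct GlyphId(u16);

// A glyph outline is one or more closed contours built from segments.
struct Glyph {
    contours: Vec<Vec<Segment>>,
}

enum Segment {
    Line { to: (f32, f32) },
    // TrueType outlines use quadratic Bézier curves.
    Quadratic { control: (f32, f32), to: (f32, f32) },
}
```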

The most popular glyph representation in a font is vector outlines (TrueType and OpenType, for example), which may be familiar if you’ve done graphic design in Illustrator or worked with SVG. A vector font uses lines and curves to represent each glyph.

So to summarize: the job of a text rendering system is to take a position on the screen and an input string of Unicode Code Points, and produce pixels in an image buffer.

Seems pretty straight-forward so far, right?
There are essentially two parts:

  • Find out which glyphs go where based on the string input. This stage outputs vector primitives and positional information.
  • Render the vector primitives.
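
In Rust-flavored pseudocode, the interface between the two stages might look like this (hypothetical signatures for illustration only):

```rust
// Hypothetical types; real libraries define richer versions of these.
struct Font;
struct ImageBuffer;
struct PositionedGlyph {
    glyph_id: u16,
    // Position assigned by the layout stage, in pixels.
    x: f32,
    y: f32,
}

// Stage 1: layout/shaping. Handles kerning, right-to-left runs, composite
// characters and the rest of Unicode. Usually delegated to the OS or a
// library, and hard to move to the GPU.
fn layout(text: &str, font: &Font, size_px: f32) -> Vec<PositionedGlyph> {
    todo!("delegate to a shaping/layout library")
}

// Stage 2: rasterization. Fills each glyph's vector outline with pixels,
// on the CPU or the GPU.
fn rasterize(glyphs: &[PositionedGlyph], font: &Font, target: &mut ImageBuffer) {
    todo!("one of the techniques discussed below")
}
```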

Should be easy & fast, right? Well, it turns out languages are hard: there are right-to-left scripts, composite characters, control codes and other fun stuff that make the first part quite a bit more complicated. Laying out glyphs, handling kerning and dealing with all the Unicode code point rules is a lot of work, but usually the OS or a third-party library can handle it for us. Unfortunately, because of the (necessary) complexity of Unicode, it is hard to offload this part to the GPU.

Rendering vector primitives on the GPU is kind of solved now? I mean, people have done it, but I don’t think it’s actually used in any of the big game engines yet.

Popular solutions
CPU Texture Rasterization
While Languages Are Hard, your OS or libraries like FreeType have good, simple-to-use support for taking Unicode text and rasterizing it into a bitmap at a specific font size. The bitmap can then be uploaded to the GPU and rendered in-engine like any other texture.
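
For instance, with the RustType crate (which comes up again at the end of this post), rasterizing a string into per-pixel coverage values looks roughly like this. This is a sketch: error handling is omitted, the font path is just an example, and the coverage writes would really go into a bitmap that gets uploaded to the GPU:

```rust
use rusttype::{point, Font, Scale};

fn main() {
    // Example path; any TrueType/OpenType font file works.
    let data = std::fs::read("assets/fonts/DejaVuSans.ttf").unwrap();
    let font = Font::try_from_vec(data).unwrap();

    let scale = Scale::uniform(32.0); // rasterize at 32 px
    let ascent = font.v_metrics(scale).ascent;

    // Layout: RustType's simple built-in layout positions the glyphs
    // (no complex shaping - see the Unicode caveats above).
    for glyph in font.layout("Hello", scale, point(0.0, ascent)) {
        if let Some(bb) = glyph.pixel_bounding_box() {
            // Rasterize: `v` is this pixel's coverage in 0.0..=1.0.
            glyph.draw(|x, y, v| {
                let px = x as i32 + bb.min.x;
                let py = y as i32 + bb.min.y;
                // A real implementation writes `v` into a bitmap at
                // (px, py) and uploads it as a texture.
                let _ = (px, py, v);
            });
        }
    }
}
```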

The good part about this technique is that it’s simple to implement, easy to render, and CPU rasterizers are generally well optimized. Unfortunately, the benefits end there.

  • While CPU rasterization is generally fast, it’s not fast enough to run every frame for every text string on the screen. To implement this properly, you’ll need to cache both the layout and the rasterization step. The most popular approach is to maintain cache textures of glyphs on the GPU and render from them just like you’d render a spritesheet - sprite = glyph, essentially (see the cache sketch after this list). When new glyphs are needed, it’s possible to upload only those glyphs to the GPU cache, so this doesn’t need to break your GPU bandwidth budget either.

  • Since glyphs are stored in a texture, you can’t render at a different pixel screen size without artifacts. This usually means you need to keep text in your game at a few set sizes and maintain separate caches for each size.

  • Dynamic visual effects such as shadows, outlines or glows are a bit hacky to implement. Usually you’d draw each glyph N extra times, offset in each direction and in a different color, to create the outline, shadow or glow.
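
Here is the cache sketch promised above: a minimal glyph atlas with a naive row ("shelf") allocator. Names and structure are simplified assumptions - a real cache also needs eviction, real per-glyph extents from the rasterizer, and padding between entries:

```rust
use std::collections::HashMap;

// Which glyph, at which (quantized) pixel size.
#[derive(PartialEq, Eq, Hash, Clone, Copy)]
struct CacheKey {
    glyph_id: u16,
    size_px: u32,
}

// Where the rasterized glyph lives in the GPU atlas texture.
#[derive(Clone, Copy)]
struct AtlasRegion {
    x: u32,
    y: u32,
    w: u32,
    h: u32,
}

struct GlyphCache {
    regions: HashMap<CacheKey, AtlasRegion>,
    atlas_size: u32,
    // Cursor state for a trivial row allocator.
    next_x: u32,
    next_y: u32,
    row_height: u32,
}

impl GlyphCache {
    // Returns the atlas region for a glyph, rasterizing and uploading only
    // on a miss; steady-state frames therefore cost no GPU bandwidth.
    fn get_or_insert(
        &mut self,
        key: CacheKey,
        rasterize_and_upload: impl FnOnce(AtlasRegion),
    ) -> AtlasRegion {
        if let Some(&region) = self.regions.get(&key) {
            return region;
        }
        // Placeholder extents; a real cache asks the rasterizer for them.
        let (w, h) = (key.size_px, key.size_px);
        if self.next_x + w > self.atlas_size {
            // Start a new row. (Running out of rows means evicting.)
            self.next_x = 0;
            self.next_y += self.row_height;
            self.row_height = 0;
        }
        let region = AtlasRegion { x: self.next_x, y: self.next_y, w, h };
        self.next_x += w;
        self.row_height = self.row_height.max(h);
        rasterize_and_upload(region); // upload just this glyph's pixels
        self.regions.insert(key, region);
        region
    }
}
```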

On the plus side, implementing this solution can unify the sprite and text rendering passes. Rendering a text string becomes a matter of rendering a number of sprites at very specific positions; for example, a UiText component may emit a number of sprite draw commands into the sprite rendering pass.
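
The shape of that unification might look something like this (entirely hypothetical names - Amethyst’s actual pass API will differ):

```rust
// Entirely hypothetical: text lowers to the same draw command the sprite
// pass already consumes, so no specialized text pass is needed.
struct SpriteDraw {
    atlas_texture: u32, // handle of the glyph cache texture
    src_uv: [f32; 4],   // where the glyph lives in the atlas
    dst_rect: [f32; 4], // screen rectangle from the layout stage
    color: [f32; 4],
}

// A UiText-like component emits one sprite command per positioned glyph.
fn text_to_sprites(glyphs: &[([f32; 4], [f32; 4])], atlas: u32) -> Vec<SpriteDraw> {
    glyphs
        .iter()
        .map(|&(src_uv, dst_rect)| SpriteDraw {
            atlas_texture: atlas,
            src_uv,
            dst_rect,
            color: [1.0, 1.0, 1.0, 1.0],
        })
        .collect()
}
```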

Unity’s Text implementation uses this solution.

Signed Distance Fields encoded in Texture
Chris Green’s beautiful paper, “Improved Alpha-Tested Magnification for Vector Textures and Special Effects” (Valve, SIGGRAPH 2007), has inspired a lot of implementations of good-looking text rendering in games. Valve’s games use it, and for Unity there’s TextMeshPro.

Basically: each pixel stores the distance to the closest glyph edge instead of the alpha value of the rasterized glyph. But you should really read the paper.
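
The shader side is small. Here is the per-fragment math written in Rust for illustration - in practice it runs in a fragment shader, with `d` sampled bilinearly from the distance texture and remapped so that 0.5 lies exactly on the glyph edge:

```rust
// GLSL's smoothstep, for reference.
fn smoothstep(e0: f32, e1: f32, x: f32) -> f32 {
    let t = ((x - e0) / (e1 - e0)).clamp(0.0, 1.0);
    t * t * (3.0 - 2.0 * t)
}

// Plain text: alpha ramps across a narrow band around the 0.5 edge,
// which is what makes bilinear scaling look good.
fn sdf_alpha(d: f32, smoothing: f32) -> f32 {
    smoothstep(0.5 - smoothing, 0.5 + smoothing, d)
}

// An outline falls out almost for free: pixels whose distance lies in a
// band just outside the edge get the outline color instead.
fn sdf_outline_alpha(d: f32, outline_width: f32, smoothing: f32) -> f32 {
    let edge = 0.5 - outline_width;
    smoothstep(edge - smoothing, edge + smoothing, d)
}
```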

Positives

  • Rendering the text requires a relatively simple custom shader so it’s quite fast.
  • Since the texture encodes distance instead of alpha, bilinear interpolation works well. The text scales from very small to very large without many visible artifacts.
  • Outlines, glows and shadows are easy to implement thanks to the distance field.

Negatives

  • Generating the distance field takes a really long time, especially with the method detailed in the paper. This generally means that it’s not viable to generate new glyphs for the cache at runtime - all glyphs that you want to use in your game need to be pre-generated. This makes it a non-starter for player-generated text, but it is fine for static text.
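
To see where the time goes, here is a brute-force field build in Rust - a sketch assuming a high-resolution binary rasterization as input. Every output texel scans the entire bitmap, so the cost is quadratic in the number of texels; the paper’s pipeline then downsamples the result, and real generators use a smarter nearest-edge search than this:

```rust
// `src` is a high-resolution binary rasterization (true = inside glyph).
// Returns distances remapped to 0..1 with 0.5 on the glyph edge.
fn build_sdf(src: &[bool], w: usize, h: usize, spread: f32) -> Vec<f32> {
    let mut out = vec![0.0f32; w * h];
    for y in 0..h {
        for x in 0..w {
            let inside = src[y * w + x];
            let mut best = f32::MAX;
            // Full scan for the nearest texel of the opposite state.
            for sy in 0..h {
                for sx in 0..w {
                    if src[sy * w + sx] != inside {
                        let dx = sx as f32 - x as f32;
                        let dy = sy as f32 - y as f32;
                        best = best.min((dx * dx + dy * dy).sqrt());
                    }
                }
            }
            let signed = if inside { best } else { -best };
            out[y * w + x] = (signed / spread * 0.5 + 0.5).clamp(0.0, 1.0);
        }
    }
    out
}
```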

The implementation of this solution looks quite similar to the previous one, except that generating the texture cache and the rendering pass differ, and you generally need fewer sizes. Maintaining the texture cache, looking up glyphs and so on will be identical, but it does require a custom shader and pass.

Rendering Vectors on the GPU
Aras’ “Font Rendering is Getting Interesting” is a good overview of the solutions in this space from about a year ago, and you might prefer his article over my post. Since he posted it, a commercial product called Slug has appeared that renders quadratic Bézier curves directly on the GPU. Quadratic Bézier curves can be derived from the vector primitives output by font libraries, so this solves the second step - rendering vector primitives - entirely on the GPU. Since it rasterizes on the GPU, scaling artifacts are eliminated entirely, but the method may not work on older hardware due to its compute requirements.
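
For a flavor of what Slug-class techniques compute per pixel, here is the core inside/outside test in Rust: cast a horizontal ray from the pixel and accumulate winding against the glyph’s quadratic Bézier segments. This is a sketch of the general ray-crossing idea, not Slug’s actual algorithm, which layers banding, precision handling and antialiasing on top:

```rust
#[derive(Clone, Copy)]
struct Vec2 { x: f32, y: f32 }

// One quadratic Bézier segment: start, control, end.
struct Quad(Vec2, Vec2, Vec2);

// Winding contribution of one segment for a ray going in +x from `p`.
fn winding(seg: &Quad, p: Vec2) -> i32 {
    // Translate so the pixel is the origin.
    let (y0, y1, y2) = (seg.0.y - p.y, seg.1.y - p.y, seg.2.y - p.y);
    let (x0, x1, x2) = (seg.0.x - p.x, seg.1.x - p.x, seg.2.x - p.x);
    // y(t) = a*t^2 + b*t + c
    let (a, b, c) = (y0 - 2.0 * y1 + y2, 2.0 * (y1 - y0), y0);
    let mut roots = [f32::NAN; 2];
    if a.abs() < 1e-6 {
        // Degenerate (nearly straight in y): linear solve.
        if b.abs() > 1e-6 { roots[0] = -c / b; }
    } else {
        let disc = b * b - 4.0 * a * c;
        if disc >= 0.0 {
            let s = disc.sqrt();
            roots[0] = (-b - s) / (2.0 * a);
            roots[1] = (-b + s) / (2.0 * a);
        }
    }
    let mut w = 0;
    for &t in roots.iter().filter(|t| t.is_finite()) {
        // Half-open range avoids double-counting shared endpoints.
        if (0.0..1.0).contains(&t) {
            // x(t): does the crossing happen to the right of the pixel?
            let mt = 1.0 - t;
            let x = mt * mt * x0 + 2.0 * mt * t * x1 + t * t * x2;
            if x > 0.0 {
                // Direction of the crossing: sign of dy/dt.
                w += if 2.0 * a * t + b > 0.0 { 1 } else { -1 };
            }
        }
    }
    w
}

// Nonzero winding rule: a pixel is inside if the total winding != 0.
fn inside(segments: &[Quad], p: Vec2) -> bool {
    segments.iter().map(|s| winding(s, p)).sum::<i32>() != 0
}
```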

Where do we go?
RustType and gfx-glyph are already used in Amethyst to implement solution #1. I think a unification of the sprite and text rendering passes is a good next step.

After that, maybe rendering vectors on the GPU is a more forward-thinking step than implementing the Distance Field in Texture thing?

(Hilmar Wiegand) #2

First of all, thanks for the super detailed writeup! I was in the discussion before, so I’ll just drop my immediate thoughts in here:

I think solidifying the first solution we have and making it as easy to use as possible should be the priority right now. I don’t think completely revamping the text rendering is in scope for the UI refactor, and it would block us from getting the rest of the refactor’s benefits.

However, starting to experiment with the newer tech (glyphy and pathfinder were mentioned in the discussion), and eventually maybe even coming up with our own solution based on similar techniques, could have huge benefits for us - and might even be a major selling point for the engine if those techniques mature.

In the end though, every approach is gonna have tradeoffs, so in a perfect world I believe we should offer multiple ways of rendering text (CPU, SDF and future stuff) and make the user aware of the pros and cons of each.

(Kae) #3

I agree that we should polish and optimize the current solution. As I mentioned in the text, I think we can unify the sprite/image rendering with the text rendering to eliminate a specialized text pass.

As for future rendering, yeah, I think it’d be good to offer multiple ways to render text. SDF is a good solution to a difficult problem, but we can clearly see from Slug that if someone were to implement quadratic Bézier curve rendering on the GPU using our rendering API, that could unlock bleeding-edge text rendering. It might even eliminate the need for the current solution - time will tell, I suppose.

(Nicolas Silva) #4

Glyphy is similar to Slug in principle, except that all curves are approximated with elliptic arc segments. It’s a bit of an odd choice compared to quadratic Béziers, and the author himself has admitted that going with the latter would have been a better choice. The implementation suffers from float precision bugs, and the arc approximation is more expensive to compute than quadratic Béziers. If you are looking into this category of rendering techniques, I would suggest following in the footsteps of Slug rather than Glyphy.
I think that pathfinder is simpler to implement and reason about though.

(Hilmar Wiegand) #5

The problem with pathfinder is that iirc it uses highly specialized OpenGL shaders that do the rendering, and since we’re using the gfx-rs backend and are looking to switch to gfx-hal in the future, it’s not feasible to use those shaders as they are. It would be interesting to look into whether or not those could be ported to gfx-hal though, as that would enable them to be used in a whole bunch of ways.

(Nicolas Silva) #6

Pathfinder’s shaders are not too complicated (I suspect Slug’s are much more complicated, for example). You’ll have to implement some Amethyst-specific things whichever path you choose, and I think porting pathfinder is reasonably easy compared to other GPU glyph rasterization techniques. Pathfinder will eventually be ported to gfx-rs since WebRender plans to use both (but that’ll be a while).

(Hilmar Wiegand) #7

I see, that’s good to know. Honestly, I haven’t looked into pathfinder too closely; it’s just that they advertise their super-optimized shaders in the GitHub README :yum:
