Foliage Optimization in Unity

In my last blog post Art Tips for Building Forests, I outlined some things I keep in mind when building the look of 3d forests. I stuck only to art tips, that is, tips for what assets you should have and things to think about when placing them. One of the most important things about a lush forest is density, and something that comes part and parcel with density is performance optimization.

The Problem

Since there are many ways to tackle forest optimization, there aren’t many universal guidelines for how to author your assets. However, one thing that is universal (almost anyway) with regards to forest optimization is that your number one enemy will be draw calls (or in Unity known as batches).

While poly count is also important, the problem is not as complex. One just needs to know the reasonable targets. Here are some polycounts for a few of my assets for reference on what might be reasonable for a given type of vegetation. If you find yourself needing too many alpha cards to reach your desired canopy fullness, you may look into increasing leaf coverage in the texture.

Finally, there is overdraw. This is when you have lots of overlap between alpha cards and objects. I don’t even think about this or look at this, because frankly I don’t know what I can do about it. I’m not going to modify the silhouette of my trees to reduce overlapping. In fact, I’m not going to modify the silhouette of my trees for any reason other than to make them look as good as possible artistically. I’m also not going to modify the layout of my forest to prevent overlapping vegetation.  It’s hard enough to make a forests believable without worrying about overdraw. I just concentrate on keeping down poly counts and draw calls.

The Plan

If you came to this article hoping to find hidden Unity settings you can tune to make things more “optimized” I am sorry but I don’t have any of those. To tackle this problem you do need a plan. Most plans will require your forest to be built from the ground up with your optimization strategy in mind. If your forest is a smorgasbord of assets from disparate sources, you may have a hard time with this. As I said, there are many different strategies to reduce draw calls. In this article, I will attempt to show you mine. My goal, in a sentence, was to combine the LOD1 meshes (second LOD stage) into mega meshes so draw calls consolidate as the player moves further.

Atlasing

Step one in my plan was getting every gosh darn thing in the forest (or as close as possible) to use a single material. That’s right. All my forest assets share a single material. To that end I needed all my veggies to sample a single texture.

The reason I needed to do this was so I could safely assume any cluster of foliage assets could consolidate to a single draw call when combined. Had each tree or grass clump a unique material, the resulting combined mesh might have many submeshes, and therefore many draw calls. Authoring all assets to a single texture wasn’t difficult since I planned it from the start. I designated a 2048×2048 and allotted areas on the sheet for assets I knew I’d need, and then when I authored them I simply kept adding to the texture. It’s possible to automate this process. It would require modifying the UVs of all your foliage assets through script though, and sometimes you can organize things more efficiently if you do it by hand.

Grouping

The eventual goal is combining, but first we need to determine how we’re going to combine things. It is unwise to make one giant super mesh, for two reasons: The first is that unity uses a 16 bit index buffer for its meshes meaning each mesh can only have 64k verts max (however it sounds like 2017.3 will use 32 bit index buffers, therefore this will be a moot point soon). Secondly, and most importantly, you won’t have a means of LODing individual groups or of taking advantage of frustum or occlusion culling, since your entire forest will be one mesh. The draw calls will be nice and low, but the triangle count might will be absurd. We can afford a few extra draw calls to save a boatload of triangles.

Enter the hex grid. Basically I wrote a script that automatically groups all my veggies into a hex grid. I chose hexes over squares since hexes are more circular, making a simple per-group position check for player distance more accurate. A square will have corners that may be sticking out closer to the player.

You will see later that having everything grouped into a grid is helpful for a few other optimization tricks beyond just combining.

With me so far? Here is one more layer to the madness. The hex groups are then grouped into super hexes. When every regular hex in a given super hex is at its LOD2 state (the furthest LOD which for me is basically two planes, sometimes referred to as an ‘imposter’ mesh), the super hex switches to a combined version of the LOD2 hexes, further consolidating draw calls. If you’re wondering why I didn’t simply make my first hex grid have larger hexes, the reason is that having smaller hexes allows me to capitalize on more granular transitioning to the far LODs, and more accurate culling (as mentioned in the opening paragraph of this section). When things are far enough and low poly enough they can be combined to larger hexes (this transition is unnoticable since by this point all the small hexes are already at their furthest LOD. We’re just switching to a big combined version of them).

Combining

Initially when I began to build my system, I combined the LOD0s as well as the LOD1s. I have found this to be too memory heavy. It makes every vertex on the highest poly version of all your foliage assets have a unique memory footprint, since every combined mesh is a unique mesh. Additionally, the larger your meshes are the less granular frustum culling is, thereby drawing more triangles than necessary. You tend to be standing close to or right on top of the LOD0 hexes, so the issue of inaccurate frustum culling is exacerbated. I found combining only the LOD1s to be the perfect balance. The LOD1 is low poly enough to use very little memory, but has the most massive savings since most of my loaded world will be in a LOD1 state.

Unfortunately combining meshes in Unity is not straightforward. I had to write my own script, since the scripts I tried on the asset store don’t handle submeshes or vertex colors. As far as how you structure the data for your LODs, that is up to you. I opted to not use the built-in LOD component, since it would merely be a data container anyway (the hex groups handle the LOD switching). I simply use a monobehaviour with references to deactivated child GameObjects, so I can easily get the materials and submeshes from their renderers.

LODing and Culling using the Hex Groups

Here’s an important thing to know: the built-in Unity LOD component is expensive. Especially if you have one active on every clump of grass. That is a lot of distance checks happening on the CPU. One of the wonderful benefits of having your world divided into tidy hex groups, is that you can distance check on each group, instead of each child object. Then the entire group LODs as one unit. I even have an animated transition that uses the alpha cutoff property in the shader (a simple way to get a nice blend!)

Another use for the groups is a sort of dynamic occlusion culling. I do not use Unity’s built-in occlusion culling, because last time I checked, it does not mix gracefully with multi-scene. The groups are few enough to where a few raycasts during runtime can be done to see if a hex is entirely occluded from view. I do not do these raycasts every frame, I just give enough extra leeway in the hex bounds to make sure things appear in time when going over hills and around corners. You could never do this for every object. It would be way too expensive. But a few raycasts for a group of 20-50 objects will be worth it, especially when it is only happening every few frames. I only check for occlusion against terrain, since nothing else in a forest is substantial enough to reliably occlude.

Authoring for Smooth LOD Transitions

A smooth LOD0 to LOD1 transition is usually better than a low polycount on the LOD0. If your transition is nice you can move the LOD distance closer, thereby reducing total polys on-screen. One thing I do particularly for my trees, is author with the LOD1 in mind. I build my trees from branch instances in my 3d package. This allows me to author a LOD model for a few single branches, and then update all the instances with this lower poly version to create my complete LOD1.

Terrain

Terrain (by terrain I mean the ground itself) is a little outside the scope of this article, but I feel it’s important to note that I don’t use the built-in Unity terrain system. Therefore all my veggies are regular GameObjects. I opted to use regular meshes over the terrain system for three reasons: The first is that the default Unity terrain system is very poor on performance for the blobby mesh it gives you, and creates hundreds of extra draw calls and thousands of poorly distributed polygons I can’t afford to waste. Using a regular mesh allows for controlled distribution of polygon resolution where you need it. Secondly, authoring shaders for the default terrain system is very restrictive, and there are a lot of idiosyncrasies about it which are poorly documented. Lastly, I have plenty of holes and overhangs. The shader I use for my ground is fairly straightforward. It’s a three channel vertex splat with a macro overlay and normals.

Shading

It’s important that your vegetation shader general performance (known as pixel fillrate) is reasonably optimized. If you are using a deferred rendering path, getting your vegetation shader to be fully deferred can offer huge savings. Mine used to be forward rendered, and when I finally figured out how to get the same shader deferred, I shaved 30% off my render time. Creating a fully deferred vegetation shader with the required translucency was not at all straightforward, as you need access to the light attenuation, which can’t normally be accessed in a deferred shader program. I realize this next part is getting far into the weeds of Unity specifics, but to any curious, I use a surface shader with a custom lighting model that writes a very low fidelity translucency mask into the unused 2 bits in the G-buffer (the alpha of RT2). Then I added the translucency function to Internal-DeferredShading.shader. It took me years before I finally figured out how to do this. Here is the thread where a kind soul eventually helped me figure it out. It was so brutally difficult for me to figure out (took two years) that I’d gladly help anyone with this if they reach out.

Light Baking

Baking lighting for all the plants in a forest results in massive lightmap memory usage, production unfriendly bake times, and I found it to not look that great anyway, since alpha cards don’t generally make nice bakes. I use light probes for anything smaller than a building. I use Light Probe Proxy Volumes for the trees, so there is a nice gradient to lighter values in the canopies. Since the trees are not static, and aren’t seen by the light mapper, I needed a way to darken the probes of shrouded areas manually. I wrote a simple script that tints all the probes within a given volume by a color of my choosing.

Miscellaneous Tricks

  • LOD1 half way up a tree – Some trees are tall enough that you can get away with the upper canopy being lower poly. This is where my packing all LOD stages into the same atlas and material is handy. It enables me to do this without adding extra materials to the LOD0.
  • Dead trees, or trunks without canopies – I tend to reach my desired canopy density long before I reach my desired trunk density, so adding trunks without canopies is a thrifty way to make a forest look thicker.
  • Mega patch assets – In flatter areas of your map, you can make large patches of grass as a single object, thereby reducing draw calls even when things are in the LOD0/uncombined state. Every one of my grass and undergrowth assets has a large patch version.

A Note About GPU Instancing

Unity introduced GPU instancing in 5.4. To use it you must draw the mesh from script. Its different than combining meshes, and in some respects it’s better since you can draw many meshes in a single draw call, but don’t pay the memory overhead associated with uniquely combined meshes. There are, however, a few disadvantages. Since you need to draw the meshes from script, if you want to do any culling of any kind, you need to maintain a list of which meshes should be visible. Beyond the simple fact that you have to write all this yourself, keeping this list maintained in C# can be expensive. Furthermore, passing this massive list (or multiple lists) to the GPU is expensive too.

I have tested GPU instancing fairly extensively. I even replaced my entire grouping system with a GPU instancing system. I found that it was not as performant as my combining system, and had more limitations (such as not being able to use light probes, or Light Probe Proxy Volumes, which are essential for my forest lighting).

There is a new method introduced more recently, called DrawMeshInstanceIndirect, wherein you can use a compute buffer to make the maintaining of your instance lists more performant. It is possible this is an even better solution than my combining system, however there isn’t much documentation on it, and I am not a good enough programmer to figure out how to do it. I tried. I failed.

Conclusion

The TLDR for performance optimization in lush forests:

  • Draw calls will likely be your biggest problem. You need a plan to keep them reduced.
  • Poly count is an easier problem, just make sure the triangle count of each asset is reasonable.
  • I ignore overdraw considerations because there’s nothing I can do without ruining the look.
  • All my foliage textures are atlased to one material, to ensure combined meshes become a single draw call.
  • I use a grouping system that combines the LOD1s of all the meshes in a group.
  • I don’t combine LOD0s because it uses too much memory as the meshes are higher poly.
  • The groups can’t be too big, because then you can’t take advantage of frustum or occlusion culling.
  • I use my own LODing script since I do LOD switching by group rather than by object.
  • I use my own mesh combine script, to gracefully handle vertex color and submeshes.
  • I author my foliage assets by hand, often with a plan for the LOD1 in mind.
  • You can animate alpha cutoff to make a smoother transition.
  • I use regular meshes rather than Unity’s built-in terrain system.
  • I wrote a fully deferred vegetation shader to keep pixel fillrate down.
  • Baking lighting for foliage was not feasible for me, I use light probes to light my forests, with a custom tinter volume for shrouded areas.

And that’s about it! My current combining system (I call it the hex grid) is my 3rd iteration of combining system, so these are the things that ended up working for me after a bit of trial and error. I use it for many of the objects in my world, not just foliage. It works well whenever there are a lot of objects of one kind (like barrels or rocks, for instance). At best multiple instances of an object will consolidate to a single draw call, and at worst there will be the same amount of draw calls as there were before, but with higher memory overhead. Please don’t hesitate to reach out if you have questions.

Voice Acting Casting Call!

For the past nearly four years of Eastshade development now, I’ve been pretty dogged in my belief that Eastshade should be text-only dialog, like Final Fantasy. I held this belief due to the following reasons:

  1. Voice acting means we have to do lip sync animation.
  2. Lip sync means tongue model, teeth model, full facial rigging, and full phoneme markup for every gosh darn character.
  3. The costs of casting and recording full voice for Eastshade’s 20,000+ words over more than 40 different characters.
  4. The overhead of managing many different voice actors and deliverables.

These are reasonable points in favor of no voice. However, things have changed, I’ve done more research, and my outlook has changed. The first critical thing that changed is I got rid of the mouth coverings in the character designs from Leaving Lyndow. I initially thought that mouth coverings would be a design decision that would make things easier, weather we did full voice or greeting lines only. However I learned the hard way what a terrible mistake this was. The difficulties of art design with mouthless characters turned out to be far greater than any savings in production. Of the criticism we received for Leaving Lyndow, the facial design of the characters was probably second most common.

New and improved!

Once the characters had mouths, I got to thinking: What other bad decisions have I made in the interest of saving production time? I questioned everything about the characters. How hard actually was lip sync? Was I blowing it out of proportion? I’d never actually tried, so I carved out a day, and told myself “If I can get a reasonable dynamic lip sync on a character in a work day, then full voice might be viable.” Well turns out lip sync was easier than I’d thought. With the help of a Unity plugin called Salsa, I smashed the goal with flying colors. The assumption I’d been holding for nearly four years was turned upside down in a single work day!

And moreover, I made another assumption-shattering discovery in the experiment: Even with extremely amateur voice acting recorded by yours truly, the character came to life before my eyes. As I imagined each character with their own unique voice, I could feel another dimension of discovery materialize. Each character would have a new type of feedback to offer, and despite how much repeat there was in the character models, distinct voices would bring another layer of spice.

Once I knew it would be technically feasible, and I realized that I didn’t necessarily need Meryl Streep, the last thing in my way was costs and management. After doing some research around the web, its clear now that costs aren’t as inhibiting as I’d assumed. And for how much value I can see it adds, I know it will be worth it. The last thing that remains to be seen is weather we can handle casting in an organized enough way, and empower each actor to create their own deliverables that we don’t have to do a ton of work making game-ready.

So without further ado, if you’re an interested voice actor with means to record yourself, here are our open audition guidelines!

Creating a Dynamic Sky in Unity

Creating a Dynamic Sky in Unity

The sky is an important character in Eastshade. In addition to taking up a large portion of the screen at any given moment, it also gives players the sense that the game world has its own beating heart. While I knew I wanted to make the sky spectacular and dynamic, I also didn’t want to spend tons of performance or development time on it. While something like dynamic volumetric clouds can look absolutely awesome and convincing, they are out of my comfort zone technically and are very expensive to render at high enough detail to look realistic. After all, Eastshade is not an airplane simulator! So I opted for a solution with simple parts that I could hone easily with my artist’s eye, rather than a more procedural approach.

lyndow-tods

The Day Cycle Manager

Firstly, I knew there were going to be a bunch of values to tweak for each distinct time of day, so I wanted one central place to save/load and tweak them all. Since the sky setup is made of quite a few parts and shaders, I was led to create what I call the “DayCycler” component, which looks like this:

day-cycler

This custom inspector may look formidable with its many fields and values, but it’s really just a bunch of references to material values, light values, and rotations pertinent to the global lighting. Inside this component is a simple update loop which interpolates between the most recent and incoming time of day presets. With this simple system, I can go through and tweak each distinct time of day like a painter, and all the in-between times are taken care of for me. No need to manage 20 different IPO curves. The times are 0-1 rather than 24:00 so don’t be confused by times like 0.175 (that would be something like 4:12 AM). There are a lot of things in the world that look at the time of day which are more local, such as a a hanging lantern, or chirping birds. Little lights and audio sources like these drift in and out of memory as the world streams around the player. Referencing and managing all these little things in this central controller would be quite cumbersome, so the second part of this system is this little component:

ValueCycler

I attach this to any little light or audio source that is day/night dependent and it looks at the current time of day and decides what value it should be independent of the DayCycler. No need to micro manage values like these.

Sky Composition

skybox-diagram

The “sky” is composed of a few different elements:

Fog – Starting with the closest and moving out, we have the all-important fog. I use the fantastic Fog Volume for this. The reason I love fog volume is because of its gorgeous and fairly cheap light in-scattering. If you’ve never heard of in-scattering, I suppose it can be described as the look of sunlight passing through fog. I’m not talking about god rays (though in real life I believe its caused by the same thing). It adds a lot of depth and a sense of light direction to the atmosphere.

Clouds – Call me old fashioned, but I like the look of photo clouds. The biggest issue with photo clouds is that it is difficult to make them dynamic. My strategy was simply to take photos on a day that the clouds didn’t have a strong sense of light direction, combined with touching up the parts that look too directional. Once I had my 360 degree cloud panorama, I made an alpha mask cutting out the blue parts because I wanted to encapsulate the clouds seperate from the atmosphere. I mapped my clouds to a dome that rotates slowly to give the impression that the clouds are moving along the horizon. This trick is stupid simple, and is ineffective for giving players the impression that clouds are passing over them, so if you want that you will have to combine this with other methods like overhead UV scrolling clouds or something like that. I actually haven’t gotten around to doing overhead clouds yet, but funnily enough players tend to rarely look directly up and haven’t noticed.

Skybox-Structure

Atmosphere – I find a simple 2-value gradient shader on a dome mesh is sufficient for the atmosphere. The opacity and colors animate with the day cycle. I increase opacity near the horizon so it looks thicker, while the stars show through more when you look directly up.

Sun – There are two parts to my sun. I have a sun flare, and an actual sphere mesh with a highly emissive solid color. This way I get a nice bloom, even if the sun is only partially showing. This is particularly useful for me in Eastshade, as there are daily eclipses and I needed a way of showing the sun slowly hide or emerge from behind the moon. The sun’s directional light doesn’t actually move around in the sky, it just rotates. To keep the sphere lined up with the flare, I rotate the sphere around the player’s head, rather than around its own center.

Midday-Eclipse

Moon – I designed a custom shader for the moon. It’s anything but physically accurate. It expands the light angle a bit, and uses heavy fresnel to fake the bending of light around the atmosphere. I have a special light that shines on the moon alone to simulate the sun hitting it. Here’s a bit of Eastshade trivia regarding its moon:

The moon in Eastshade appears habitable, and since its about the same size as the planet you’re standing on, you orbit it as much as it orbits you! In other words, both planets are moons to one another. This means Eastshade’s moon remains in the same place in the sky all the time, which creates daily solar and lunar eclipses. At midday it blocks you from the sun, and at midnight you block it from the sun. Tidally locked, you orbit around each other in a double planet dance all the way around the sun. Is there another world of intelligent life just across the cosmic pond? The residents of Eastshade can’t know. All they can do is look up and wonder…

Space – The furthest background is the space dome. This dome has a tiling star texture, supplemented with bits of geometry for the larger stars to break up the tiling. The reason I use geometry to break up tiling is because having a texture that wraps around the whole sky would require a MASSIVE texture to look sharp. The fact that stars are tiny little dots means they live or die on their sharpness. If I wanted to add a nebula or something like that I’d probably have that as a decal sticker on the star dome, because doesn’t tile and doesn’t need as much resolution as the stars. I’m trying to keep memory and build size down, mostly because I don’t want to waste development time maintaining a huge build.

geo-stars

Finally, its important to note that all these things follow the player around as they move. Since most of these things are supposed to look infinitely far, there shouldn’t be any parallax between the elements.

Not Done!

I’m not done with the sky systems in Eastshade. There are a few things left to do. Among them is come up with some sort of overhead cloud coverage to pass over the player. I’m thinking I will put a flat disc and use vertex color to taper the opacity around the edges. Then I’ll scroll the UVs over a tiling cloud texture. I also want to have multiple cloud textures for different weather conditions and fade between them. I’ll need to make a weather controller that operates on top of the DayCycler and plays off the base values, so I can have any weather condition at any time of day. I’ll need to implement a global wetness property in my shaders that increases gloss and spec, while darkening the diffuse a bit.

Press Roundup

Last week we released our first official trailer and the response has been hugely positive! People seem genuinely interested in the game and it leaves us inspired as it does flattered. In addition to discussion on a multitude of forums, we received coverage from a number of news sites. While I can’t track them all at this point, I will attempt to gather them in this press roundup list for my own personal records, and for you all to check out if you want. I tried to include only unique articles, and exclude reblogs.

English coverage:

Non-English coverage: