Improving the appearance and performance of runtime-generated branching trees
On the road to better-looking plants, I finally reached the point where it was time to improve the appearance of their trunks.
This system should have very simple inputs, to keep it flexible. The system's inputs will be a series of points forming a path for each stem, connected in a tree structure. Each point will have an extra dimension attached to it indicating the thickness of the tree at that point. In addition, certain material properties may be passed on such as a pallet index or vertex color. These material properties may or may not be blended by the branch generator system, but they will be passed through to the final mesh's vertexes. In this way, we are forced to design an algorithm that is highly agnostic and hopefully reusable. For example, when we have a trunk that's branching into many other stems, we cannot rely on information about which child stem is the "same" trunk. All child stems of a branching area are considered equally. This will make it easier to plug into our existing L-system.
I'll briefly summarize how extending the L-system library works here. If you just want to get to the generation methods, continue to the next section!
Since we only care about turning the l-system string into a mesh, we don't need to worry about extending the l-system update step itself. All we care about is adding a new turtle operation. Unfortunately, in the interest of performance, extending the turtle's operations requires cracking open the core source code.
Luckily the library has some support for this! We just need to write a new TurtleOperationSet scriptable object class and assign an instance of it to the Turtle. Our first move is to create a new implementation of TurtleOperationSet: this is a base class for any scriptable object used to configure a set of things the turtle can do.
The current workflow to add a new type of turtle operation into the library is a little awkward. Our turtle job indexes all of its operations by a single union struct, indexed by an enum. So to extend the process, we need to add a new type to the enum, and a new union-ed data type into our turtle operation structure representing our stem operation. This diff outlines the changes made to integrate our new operation into the existing turtle system:
public enum TurtleOperationType : byte
{
ADD_ORGAN,
+ ADD_STEM,
ROTATE,
SCALE_TRANSFORM,
}
+ public struct TurtleStemPlacementOperation
+ {
+ public byte materialIndex;
+ public ushort radialResolution;
+ public float baseLength;
+ public float baseRadius;
+ public void Operate(
+ ref TurtleState state,
+ int indexInString,
+ SymbolString<float> sourceString,
+ ...)
+ {
+ ...
+ }
+ }
[StructLayout(LayoutKind.Explicit)]
public struct TurtleOperation
{
[FieldOffset(0)] public TurtleOperationType operationType;
+ [FieldOffset(1)] public TurtleStemPlacementOperation stemOperation;
[FieldOffset(1)] public TurtleOrganOperation meshOperation;
[FieldOffset(1)] public TurtleScaleOperation scaleOperation;
[FieldOffset(1)] public TurtleRotationOperation rotationOperation;
public void Operate(
ref TurtleState currentState,
int indexInString,
SymbolString<float> sourceString,
...)
{
switch (operationType)
{
case TurtleOperationType.ADD_ORGAN:
meshOperation.Operate(ref currentState, indexInString, sourceString, ...);
return;
+ case TurtleOperationType.ADD_STEM:
+ stemOperation.Operate(ref currentState, indexInString, sourceString, ...);
+ return;
case TurtleOperationType.ROTATE:
rotationOperation.Operate(ref currentState, indexInString, sourceString);
return;
case TurtleOperationType.SCALE_TRANSFORM:
scaleOperation.Operate(ref currentState, indexInString, sourceString);
return;
default:
return;
}
}
}
[BurstCompile]
public struct TurtleCompilationJob : IJob
{
[ReadOnly]
public SymbolString<float> symbols;
[ReadOnly]
public NativeHashMap<int, TurtleOperation> operationsByKey;
public TmpNativeStack<TurtleState> nativeTurtleStack;
public CustomRuleSymbols customRules;
public TurtleState currentState;
public void Execute()
{
for (int symbolIndex = 0; symbolIndex < symbols.Length; symbolIndex++)
{
var symbol = symbols[symbolIndex];
if (symbol == customRules.branchOpenSymbol)
{
nativeTurtleStack.Push(currentState);
continue;
}
if (symbol == customRules.branchCloseSymbol)
{
currentState = nativeTurtleStack.Pop();
continue;
}
if (operationsByKey.TryGetValue(symbol, out var operation))
{
operation.Operate(
ref currentState,
symbolIndex,
symbols,
...);
}
}
}
}
With our operation included in the turtle's lifecycle, now we're free to start implementing the code and pulling information out of the turtle! If you'd like to dig into the code as it exists in the repository, you can check out all of it on its branch here: github folder link
This is the current state of the system. To produce the illusion of a single stem, we simply copy the same cylinder mesh multiple times onto the path of the trunk. Ultimately the L-system only outputs a series of oriented points, so this is the natural way to implement stems in that system.
In our example, this looks pretty decent. There are several drawbacks to this approach though:
An easy way to expand on this is by welding the vertices of each mesh to each other.
At this point, we will abandon the idea of "copying" meshes, instead our system will dynamically generate all vertices and triangles for the trunk. Each stem segment will be replaced with only one circle of vertexes and will draw triangles connecting to the previous stem segment to build the cylinder. The circle's orientation will come directly from the orientation used to copy the mesh, and triangles will be connected based on the angle from the local y-axis. This is how that looks, compared to our previous approach. Note how the corners on the tightly curved fern fronds are much smoother:
This solves some of our problems! We no longer have sharp edges, and the normals blend nicely now that it is all in one connected mesh. No UVs yet. But, there is one problem! We have these weird scronkled regions wherever the tree branches into multiple segments. This is because the organs' orientation rotates in multiple directions when splitting, this throws off the triangle linking.
To visualize the base data, I rendered every stem "point" orientation as a basic axis visualizer. This comes from using the initial copy-mesh approach and plugging in an axis indicator mesh. We can see that the orientation rotates nearly 180 degrees wherever the trunk splits:
We need to find a way to adjust our triangle indexes based on a rotational offset from the previous stem. Unfortunately, this means we must use MATH ! Don't worry though, most of it is handled by Matrix4x4 transformations for us. What we will do is project the y-axis of the next-segment onto the y-z plane of the previous-segment (in our coordinate system, the x-axis is forward). Taking the angle between this projected vector and the y-axis vector of the previous-segment plane will give us our offset. Below is the code for this, unfortunately, I don't have a good visualization for this. But I can recommend this tool to help build intuition about the dot-product's function.
/// <summary>
/// returns a value representing the rotation required to align the y-axis of <paramref name="next"/> up as closely as possible to the y-axis of <paramref name="parent"/>
/// </summary>
/// <param name="parent">the previous-stem orientation</param>
/// <param name="next">the next-stem orientation</param>
/// <returns>a value between -0.5 and 0.5, representing rotations about the x-axis from -180 to 180 degrees</returns>
static float GetNormalizedCircleOffset(Matrix4x4 parent, Matrix4x4 next)
{
var parentBasisPlaneX = parent.MultiplyVector(new Vector3(0, 0, 1));
var parentBasisPlaneY = parent.MultiplyVector(new Vector3(0, 1, 0));
var parentBasisPlaneNormal = parent.MultiplyVector(new Vector3(1, 0, 0));
var nextY = next.MultiplyVector(new Vector3(0, 1, 0));
var nextYProjectedOnParentBasisPlane = ProjectOntoPlane(nextY, parentBasisPlaneX, parentBasisPlaneY);
var angleOffset = Vector3.SignedAngle(parentBasisPlaneY, nextYProjectedOnParentBasisPlane, parentBasisPlaneNormal);
return angleOffset / 360f;
}
static Vector3 ProjectOntoPlane(Vector3 projectionVector, Vector3 planeBasisX, Vector3 planeBasisY)
{
var projectedX = planeBasisX * (Vector3.Dot(planeBasisX, projectionVector));
var projectedY = planeBasisY * (Vector3.Dot(planeBasisY, projectionVector));
return projectedX + projectedY;
}
With this normalized value, we can remap it to the # of vertices on our stem, and offset our triangle indexes accordingly:
// Add 1 to ensure the value is always positive. Overflow will be handled by the modulo operator when finding the exact index.
var myCircleIndexOffset = (int)math.round((GetNormalizedCircleOffset(parentStem.orientation, pointTransform) + 1) * radialResolution);
Much better, we've successfully descronkled our fern! There's one final touch we can improve though. The trunk becomes abruptly narrower when splitting. Currently, the first stem segment after a split is placed directly above the parent stem, so the cylinder downscales and overlaps with itself straight up. This produces the impression that the trunk first gets smaller, and then splits in two. Ideally, this downscaling would also be accompanied by a movement in the new forward axis of each child stem segment, to imply that the trunk gets thinner at the same time as it splits off from the main trunk.
We can fix this by offsetting all of the vertex circles by 1/2 of their length, effectively sliding the edge loops up along the mesh. Here's a comparison, it's a subtle difference but I think it helps. This also bears some benefit to the more tightly curved parts of the stem!
Another drawback we can now address is seamless texturing! Our goal will be to wrap a seamless tiling texture around the trunk and make sure it blends at least a little reasonably between stem segments and branching. The first approach we can try is wrapping the one axis of the UV map radially around the trunk. Luckily this part is pretty simple, we convert the angle to normalized coordinates in 0..1, and that's our x-coordinate! Easy... or so I thought. This approach requires an extra vertex in our circle mesh. Because UVs interpolate between vertexes, the very last vertex at uv 4/5
will interpolate down to the first vertex's 0/5
coordinate. So we need to break the mesh at the 0-th vertex, to make sure the uvs interpolate correctly.
This is what we're missing without our extra vertex. Fixed mesh is on the right, and improperly wrapped circle meshes are on the left. It might be hard to catch, but sections of the tree on the left are blurrier and stretched due to the UVs interpolating at a higher gradient than they should be. With this issue fixed, the plant is looking pretty decent!
Texture blending between different textures is something we'll want. The fresh growth tips of a tree never look the same as the more gnarled, aged bark of the trunk, and we need to represent this smoothly!
To accomplish texture blending, we use the Extra Vertex Data l-system extension to indicate a texture-index in our shader, based on the depth of the L-system. In this case, we'll use 4 textures, each appearing less aged and more freshly grown. Here's the texture pallet itself, ordered top to bottom, left to right:
By mapping this onto our plant, we can get the thinnest stems looking much more realistically green. The texture blending shader also applies some adjustable fuzzing between adjacent textures in the map, which is why we will see spots of brown mixed in with the green.
Keen-eyed readers might recognize the new texture has no mipmaps, appearing more noisy. Those readers would be right! The shader currently accepts this texture as a texture3D with 4 z-levels to help with interpolation. But, when generating mipmaps on a texture that thin, unity will blend all of the z-levels together into an unrecognizable mess. So for now we just turn it off. To improve this the next shader version should probably use a Texure2D array and handle lerping between pallets manually.
Since this is targeting real-time gameplay generation, any change we make must also consider how it affects performance. So, let's pull out the profiler! The meat of the new code we wrote is all inside the jobs threads, so we will compare how long the total mesh-building process takes with a mature tree. Another relevant property is the complexity of the resulting mesh. The previous method's pentagonal cylinder mesh used a total of 20 vertexes, and 16 triangles: a substantial sum compared to the new method's 6 vertexes and 10 triangles.
I selected the highest-value time for this process compared across about 10 different samples. This isn't the best way to gather info, but at least it gives us an idea of any large effects. Here are the results:
Amazingly, our technique has decreased the time to run from 1.2ms to 0.9ms, almost 25% less time! My theory as to why is that even though we do more math per stem segment, this is offset by using nearly 1/4 the number of vertexes. The actual writing of data to memory isn't cheap, and the job generating our mesh bounds scales directly with vertex count. Overall, we see a 62% reduction in vertexes and a 22% reduction in triangles. This will make the resulting mesh much more performant to push onto the GPU and to copy around in our game if we wish.
I'm very happy this solution does nothing to hamper the game speed, and in fact, may speed it up slightly. I should be able to roll this out into Seeb Defender without concerns.
Overall, I consider this a great improvement over the existing system. I think there are yet many more details to consider, I'll summarize the possible next directions here:
Hopefully, in the future, I'll put up another post looking into at least one of these five improvements. Thanks for reading!
Did you like it? Why don't you try also...