Render Smarter

I have a bit of a potentially-bad habit. Usually, by the time I’ve finished implementing some great new feature, I’m already so excited to begin the next that I forget to take a moment and talk about it. Consequently, a lot of work that goes into Jax goes largely unnoticed until it’s time for a release — and then it usually gets lost amongst the hub-bub.

I’m trying to break that habit, starting now. Fighting the urge to move on, I’m breaking just long enough to write a bit about what it is I’ve just completed: a fully dynamic, fully automated octree!

If you’re not aware of what an octree is, that’s not surprising because a good one will never let you know it exists. To understand octrees, we should first back up and see what the world looks like without them.

Imagine that you’re piloting a space ship in a game world. When you look around your ship, you see all sorts of objects in the distance: stars, planets, space stations, asteroids, and of course, other space ships. There are thousands of such objects. In fact, there are so many objects being rendered every single frame, that the game is unplayable because the graphics pipeline has been clogged with objects being rendered, most of which are too far away to be seen or are off-screen entirely.

The best way to speed up rendering is not to render at all, and one common approach to this is frustum culling. That’s an article in itself which I won’t go into here, but essentially, the frustum is the viewing extent of the camera, and the idea behind frustum culling is to simply not render any object that’s not actually within the bounds of the camera. A simple concept, to be sure, but not quite so simple in practice.

Frustum Culling

Even with frustum culling, the game can be unplayably slow. If there are many thousands of objects, the simple act of iterating through all of them can have a devastating impact on performance — especially in JavaScript, where brute-force iteration through all the objects in a scene just isn’t fast enough to be practical.

This is where space partitioning comes in, separating objects into more manageable groups. Instead of checking each object against the frustum individually, we just check whether a region of objects is inside the frustum, either partially or completely. If so, we can intuit that most or all of those objects will also be inside the frustum. In many cases we can even skip checking those objects entirely, in favor of just rendering them outright.

Octrees are a form of space partitioning, dividing vast quantities of geometry in a scene into more manageable chunks. An octree forms a giant cube around the scene. Then, the cube is subdivided into eight (oct-) equal sub-cubes. Those sub-cubes are themselves subdivided (-tree) into equal children, and so on, until some terminating condition is reached.

Octree

Because an octree consists of ever-shrinking cubes, they are an ideal solution to the problem of rendering many objects in a scene.

There are about as many octree implementations out there as there are uses for them, and they all vary somewhat depending on those uses. So before writing my own, I laid out some ground rules that the tree had to follow in order to be considered successful:

  1. It had to be flexible. I didn’t want an octree that is capable of just render optimizations. Though that is its initial purpose, I’d like to be able to extend it easily into the worlds of collision detection, physics, etc. I may have to alter the tree’s source code at some point, but I don’t want to have to rewrite it, or manage separate code bases for each use case.
  2. It had to to be invisible. An important part, if not the most important part, of Jax is its low learning curve and high adaptability. The octree had to follow the same paradigm, meaning there should be no reason for you, the developer, to have to work directly with the tree (discounting some bizarre edge case). As an example, most octree implementations are initialized with some pre-defined size, object count and/or subdivision depth in mind. With Jax, none of these are acceptable.
  3. It had to be dynamic. In keeping with the “stay out of the developer’s way” rule, the octree had to have no concept of minimum or maximum object size, nor of scene size. It needed to grow when large or distant objects are added, subdivide when there were too many, and merge together nodes that were too empty, all on its own. A complicating factor is object movement: when your object is moved from one edge of the scene to the other, the tree needed to stay up-to-date without any further interaction from you, and it had to do so quickly — there could be no noticeable delay while the tree re-sorted objects that had moved beyond the bounds of their node or even their parents’ nodes.

I’m pleased to say that this octree succeeds at following all 3 rules! (Of course, if it hadn’t, it wouldn’t have made its way into the framework and I wouldn’t be writing about it now.) My next task is to get it integrated into the Jax scene manager, so that it can start doing its thang without a second thought starting in the next release!

 Low Level Stuff

Now I’ll move on to some implementation details, in case any of you out there are working on your own octrees and are looking to see how others have done it. If you’d like to skip the theory and jump straight into the source, look no further than the Jax source code repository!

First, the basics: when an octree is created, what happens? Some octrees are initialized and then immediately subdivided to a set number of levels, but this was not an option in our case because we had no way of knowing the size of the tree. Without knowing the eventual size of the root node, we can’t create child nodes, whose own positions and sizes must be calculated based on some knowledge of their parent’s size and position.

Instead, our octree uses a split threshold representing the maximum number of objects in any given node. If the number of objects added to a node exceeds the threshold, the octree is subdivided and its objects are redistributed to the node’s children. If an object doesn’t fit inside of any of the node’s children, it stays where it is.

This is our terminating condition. If there are too few objects in a node, it won’t subdivide, which prevents a recursion error. In the event two objects occupy the same space, they should eventually exceed the bounds of the child node, which also prevents a recursion error. The only thing left, then, is objects with a size of 0 (because empty or missing meshes are allowed in Jax). It’s quite easy to let “size” revert to 0.1 (or some other arbitrary small number), and that rounds out our termination.

Deciding whether an object fits into a particular node is easy, but different than most implementations. Most octrees calculate the object’s position and size, and compare that to the position and size of the octree node. In our case, we compare the object’s size with the size of the node without taking into account its position, ensuring that the object itself is no more than half the size of the node. Then we compare the object’s position, making sure that it occupies a point somewhere within the node, regardless of the object’s size. If both of these tests pass separately, the object is deemed to “fit” within the node.

The benefit to this approach is that objects can be cleanly sorted into sub-nodes, even when they get close to the edge of the sub-node, which would ordinarily cause the object not to “fit”, forcing it to be placed at a higher, less efficient level in the octree. There is one caveat to remember about this approach, though. When doing the frustum calculations, we need to remember to double the size of any given node, because we’ve allowed objects to be positioned close enough to the edge of a node that they might “hang over” its border by as much as 1/2 the node’s size on either side. There is a small amount of extra overhead incurred by the larger (and therefore, more numerous) frustum checks, but I think that’s fairly negated by the improved efficiency of the tree itself.

In order to see if a node can contain an object, we must know the size of the node itself. So how do we decide on a node’s size if we don’t know anything about the scene? Well, we might not know the eventual size of the scene, but we can adapt to fit it. The root node of the octree is initialized with a size set to 1 unit. This number could actually be any positive number, but I chose 1 for simplicity. If objects are smaller, then there’s nothing more to be done; we go straight into distributing the objects as already explained. For objects that are larger than the root node, the root node and its children (if it has any) grow exponentially in size, doubling until the object fits; each time the size is doubled, the bottom-most node is subdivided, so that the deepest level doesn’t actually change in size at all. An important note is that the root node’s position never changes: it’s always centered at the origin, and the tree will grow to encompass both the object’s position and its size.

This obviously produces a ton of empty nodes at all levels of the octree. To prevent unnecessary recursion through these empty nodes, I decided to maintain two lists of objects in each node: objects that are contained directly within the node itself, and a list of all objects contained in both the node in question, and all of its children and grandchildren combined. Since objects are added to the tree in a top-down manner, all this means  is we have to build up the second list before distributing objects to child nodes; it’s not that difficult, and it affords us one great optimization: not only can we instantly check at any level of the octree whether a particular node contains any objects (thereby rejecting a ton of empty nodes early on), but we can actually access a list of those objects without recursing any deeper into the tree. This is a huge boon for cases where, for example, we’ve determined that a node is entirely within the view frustum and therefore further recursion would be redundant.

So now we’ve got an octree that can enlarge itself to suit the scene, subdivide itself when necessary, and place objects at their ideal depths based on the octree’s splitting threshold and the size and position of each individual object. What we haven’t talked about is how to make it dynamic. First, when an object is moved, the octree needs to be updated so that it remains accurate. Second, when the object moves outside of a given node, that node needs to be checked. If it doesn’t have enough children, it needs to be un-subdivided, or merged, so that we can avoid extra iterations through nearly-empty nodes.

This is actually a lot simpler than it sounds. Jax models already act as event emitters, so it’s trivial to hook into them to receive an event whenever the object moves or rotates. When this happens, we tell the tree to update the particular object in question. To update it, the tree merely checks to see whether the object still fits in its current node. If so, nothing more needs to be done; if not, then the object is removed from its current node and re-added to whichever parent or grandparent node it does fit into. Since the root node will always grow to fit the object’s new orientation, there’s no risk of leaving the bounds of the octree.

Then, whenever an object is removed from a node, that node needs to check the total number of objects contained within it and its children. If it’s dropped below the merging threshold, then it’s time to un-subdivide the node and move its objects back into the node itself.

In the Jax implementation, we don’t actually remove the references to merged children. This would lead to costly garbage collection, which is bad enough on its own but even worse when we already know we’ll probably need those nodes again. So instead, we keep the reference “alive” and just set a “subdivided = false” flag so that the octree knows not to use the nodes in question. For memory’s sake, this could be improved by setting a timestamp on the nodes, so that if they go too long without being used, they are opened up for garbage collection. I’m not going to implement that now, but if it becomes an issue, it’s an option that’s not too difficult to add.

And that’s it! This fairly simple construct is going to net huge performance gains for those of you who have a lot of objects to manage, while having a negligible impact on anybody else. The best part is, it’s completely invisible — as it should be!

Comments are closed.