Of Tanks and Quad Trees

I needed a bit of a diversion from the planet rendering itself, into something that would give some purpose behind it. Why is the planet there? Well, what better use is there for a planet than driving a tank on it?

The tank model from the XNA Heightmap Collision with Normals sample is the perfect tank to drive on a lonely, empty, and possibly dangerous new world. I grabbed the model and texture resources, and the Tank class, added them to my project, and added some “spawn tank” functionality to get it drawing, resulting in a beautiful flying tank that seemed to defy gravity. Since nobody should be able to defy gravity, it was time to bring the tank down to earth.

The basic concept is to find the triangle underneath the tank and the exact position within that triangle. This will give us the exact height at that location, and the normal so we can orient the tank correctly. When using a single height map that’s not too bad, but when there are potentially hundreds of height maps, organized in a quad tree, there’s some more work involved.

The first thing I do is find the quad tree node that’s under the tank. There isn’t a regular grid like with a height map, so I had to come up with a different way to find it. The first idea that came to mind was to traverse the quad tree with a simple point-in-bounding-box check, but that would only work if the tank were already very close to the ground. I want to be able to have a position 6,000 km above the surface and be able to find the proper node.

The next idea was to create a view frustum from the planet center looking outward at each node. To create the frustum I’d have to calculate the proper field of view so the frustum planes would go through the edges of the quad tree nodes. I actually tried this method but had trouble getting the field of view calculations to work so things were aligned properly. I still think this method would work if I spent enough time on it, but there’s actually quite a bit of code that has to be executed for each check, so it’s not the best method anyway.

The final idea, and the one I have working, is to do a simple ray-bounding-box intersection test. I was initially going to create the ray at the planet center, pointing out towards the position we’re looking at, but that would very likely run into floating point precision issues, so instead the ray starts 1 km below the planet surface, which gives plenty of precision to work with.

So, I traverse the quad tree, doing the ray-bounding-box intersection test. If I find a hit, I perform the check against the children, and so on, until I reach a leaf node. Once I have a leaf node I need to find the triangle within the leaf, using ray-triangle intersection tests.  Now, I understand the concept, but coding one with my current knowledge is beyond me, so I downloaded the XNA triangle picking example, which has a nice ray-triangle intersection method.

Now, the original ray-bounding-box intersection can give false hits because, on a spherical surface, the bounding boxes of neighboring nodes can overlap a bit. So the ray-triangle intersection tests might fail to find a triangle. If that happens I just continue traversing the tree with the bounding-box checks until I eventually find the proper node, as well as the proper triangle. This works 99% of the time. There is still a rare bug where the code fails to find a node at all; something I’ll need to figure out someday.
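
In rough C#, the whole lookup looks something like this. This is a minimal sketch rather than my exact code: PlanetRadius, QuadTreeNode, Triangle, TriangleHit, and RayIntersectsTriangle (standing in for the method from the triangle-picking sample) are all hypothetical names, while Ray and BoundingBox are the stock XNA types.

TriangleHit FindTriangleUnder(Vector3 position, QuadTreeNode root)
{
  // start the ray 1 km below the surface, pointing out through the position
  Vector3 direction = Vector3.Normalize(position);
  Ray ray = new Ray(direction * (PlanetRadius - 1000.0f), direction);
  return FindTriangle(ray, root);
}

TriangleHit FindTriangle(Ray ray, QuadTreeNode node)
{
  // bounding boxes of neighboring nodes can overlap on a spherical
  // surface, so a box hit is only a candidate, not a guarantee
  if (ray.Intersects(node.BoundingBox) == null)
    return null;

  if (node.IsLeaf)
  {
    foreach (Triangle tri in node.Triangles)
    {
      float? distance = RayIntersectsTriangle(ray, tri);
      if (distance != null)
        return new TriangleHit(tri, ray, distance.Value);
    }

    // false hit; let the caller keep searching the sibling nodes
    return null;
  }

  foreach (QuadTreeNode child in node.Children)
  {
    TriangleHit hit = FindTriangle(ray, child);
    if (hit != null)
      return hit;
  }

  return null;
}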

So, in order to get the proper height value I need the exact point within the triangle. When using a height map it’s common to use bilinear interpolation to find this height, and I spent some time trying to get that to work, with only partial success. I finally stepped back and realized that the ray-triangle test was already returning the distance along the ray to the intersection, so it was a simple matter of multiplying the normalized ray direction by that distance to give me the exact position I needed. Treating that position as a vector from the planet center, its magnitude is the height value needed for the tank position.

That leaves the normal so the tank can be oriented properly. When using a height map the normal is also found using bilinear interpolation. This presented a problem since I never could get that to work completely for the height. Instead of mucking with it too much I chose to average the triangle’s three vertex normals, which seems to fit well within my “good enough” expectations at this time.
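
Pulling the height and normal out of that hit is then just a few lines, continuing the sketch above with the same hypothetical types:

// the intersection distance gives the exact surface point; its length from
// the planet center is the height, and the averaged (then re-normalized)
// vertex normals orient the tank
Vector3 surfacePoint = hit.Ray.Position + hit.Ray.Direction * hit.Distance;
float height = surfacePoint.Length();

Triangle tri = hit.Triangle;
Vector3 normal = Vector3.Normalize(tri.Normal0 + tri.Normal1 + tri.Normal2);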

So, my planet has gained a purpose, and in so doing I now have some code that’s going to be very useful for things like placing trees and walking and crashing and shooting. I’ll try to put up a video in the next day or so.

GPU Geometry Map Rendering – Part 2

We left off in part 1 talking about the initial failures with my GPU geometry map shader. I did fail to mention that there was a bright spot the first time I ran the new code – it was amazingly fast. So fast I was able to increase the noise octaves from the 5 that would run reasonably well on the CPU up to 30 and still run at well over 60fps. I have to admit that I spent some of that first 18-hour day just roaming around on a barren, reddish planet. That huge improvement in performance made the pain to come well worth it.

So, at the end of part 1 we set up the C# code for executing the geometry map shader. Now let’s take a look at the shader itself.

float Left;
float Top;
float Width;
float Height;

struct Vertex
{
  float4 Position : Position;
  float2 UV : TexCoord0;
};

struct TransformedVertex
{
  float4 Position : Position;
  float2 UV : TexCoord0;
};

void QuadVertexShader(in Vertex input, out TransformedVertex output)
{
  output.Position = input.Position;
  output.UV = input.UV;
}

float4 QuadPixelShader(in TransformedVertex input) : COLOR0
{
  // grab the texture coordinates - you can use them directly, but doing this
  // lets you see the value in PIX more easily when you're debugging
  float2 uv = input.UV;

  // get the coordinates relative to the input dimensions
  float x = Width * uv.x;
  float y = Height * uv.y;

  // translate the final coordinates
  return float4(Left + x, y - Top, 1, 1);
}

The Left, Top, Width, and Height floats at the top are the parameters used to define the face-space area. These are the values you’re setting when using the XNA Effect class as mentioned in part 1.

quadEffect.Parameters["Left"].SetValue(-1.0f);
quadEffect.Parameters["Top"].SetValue(1.0f);
quadEffect.Parameters["Width"].SetValue(2.0f);
quadEffect.Parameters["Height"].SetValue(2.0f);

The Vertex struct defines the vertices entering the vertex shader, and the TransformedVertex defines the vertices leaving the vertex shader and entering the pixel shader. Since we’re using pre-transformed vertices the vertex shader doesn’t need to do anything but pass the input values through, which it does very nicely.

The pixel shader isn’t actually all that much more complex. The GPU takes the texture coordinates specified in each vertex and interpolates them for us as it’s rasterizing our quad. Let’s think again in just the horizontal dimension. As mentioned previously, we’ve set up the texture coordinates so they start at 0.0 on the left vertex and end at 1.0 on the right vertex. The intent of the shader is to map those values to face-space: in this example the left (0.0) should map to -1.0, and the right (1.0) to +1.0. Also, remember that we’re working with a 5×5 geometry map, and these are the face-space values we expect to get for each of the 5 horizontal positions: -1.0, -0.5, 0.0, 0.5, 1.0.

Walking through the shader for the left pixel we expect to see this:

The uv.x value the shader receives is 0.0
x = Width * uv.x = 2.0 * 0.0 = 0.0
Left + x = -1.0 + 0.0 = -1.0

And for the right pixel we (well, at least I did at one time) expect to see this:

The uv.x value the shader receives is 1.0
x = Width * uv.x = 2.0 * 1.0 = 2.0
Left + x = -1.0 + 2.0 = 1.0

But that isn’t what happens. In fact, the 5 texture coordinates we get are 0.0, 0.2, 0.4, 0.6, and 0.8, resulting in face-space values of -1.0, -0.6, -0.2, +0.2, and +0.6. That did not make any sense to me at all – shouldn’t the texture coordinates be interpolated from 0.0 to 1.0? I spent hours sifting through my code to find out what I had set up wrong. I even took the step of creating a separate bare-minimum test app, which I really hate to do. Everything I tried gave me the same results, and I was forced to conclude that somehow the GPU must not be interpolating the texture coordinates the way I expected. I suspected that it had something to do with how DirectX maps texels to pixels, but nothing I tried in that regard gave me the results I needed. I finally caved and started up PIX.

It was a bit daunting at first, but there are some simple tutorials out there that will walk you through the basics. I’m going to go through how I ended up using it, in order to easily run the same tests over, and over, and over again. Start by downloading the test project.

Unzip it into your project folder, open the QuadTest solution, rebuild it, and set QuadTestBad as the startup project. Run it, and verify that you get a pretty box with various shades of blue, white, and magenta.

Now start up PIX. Select File/New Experiment. Use the browse button to navigate to the QuadTestBad.exe you just built. Don’t change anything else for now and press the Start Experiment button. The app should start and you should see the pretty box, as well as some PIX overlay information in the upper left. If that’s the case, all is well. If not, you’ll need to figure out why on your own.

Now what we need to do is have PIX grab all the Direct3D data for a single frame. Often you can do this by selecting the “Single-frame capture of Direct3D whenever F12 is pressed” radio button before starting the experiment. That works just fine, and is the method I started out with. But there’s a better way in this case. From the experiment window (the one where you set up the program path), select the More Options button. You should see something like this:

The left tree view might have some different values in it if you changed any options on the initial screen, but the things you need to do here are the same regardless. In the tree view the green T lines are Triggers, and the purple A lines are Actions. When selecting a trigger line you’ll see the right panel which will let you select the type of trigger. Select “Frame” for the trigger type. This will reveal the options for the Frame trigger type. Enter “1” for the frame number. Now select the Action line, which will change the right panel to allow you to select an action. Select the “Set Call Capture” action. Then under Capture Type select “Single-frame capture Direct 3D”, and check the “capture d3dx calls also” box as well.

We’ve just defined a trigger-based action. We’ve told PIX to capture Direct3D data on frame #1. Now let’s create a second trigger-based action. Press the green T button to create the new trigger. Again select Frame for the trigger type, but this time enter 2 for the frame number. For the action select Terminate Program. So, we’ve now told PIX to exit the program on frame #2. To restate, on frame 1 PIX will capture a single frame of data, and then on frame 2 it will exit the program. You can just go the F12 route in many cases, but when you’re going to be repeating something over and over, setting up the triggers can save a lot of time. In some cases you may have to set the triggers, since it can be impossible to hit F12 at just the right moment to capture the data you want.

So, go ahead and start the experiment. You should see the QuadTestBad app start up, and then immediately quit. PIX should then display a screen full of data. For this discussion we’re only interested in the panels on the bottom: Events and Details. Events will show you all of the Direct3D calls, and Details will show you details about those events. There is a lot of noise in the D3D calls, so it can be difficult to find what you’re looking for by just scanning through it. There are some buttons at the top of the Events panel that let you move to the next or previous frame, and the next or previous draw call. So, press the button that has the D with a down arrow:

That will take us to the draw call we’re interested in examining. In the Detail panel select the Render tab. You should see the quad we drew. This view will let us step through the vertex and pixel shaders for each pixel displayed in the render tab. You can also look at the Mesh tab to see information at the vertex level, both pre-vertex-shader and post-vertex-shader. Go ahead and do that now, and you’ll see that the screen-space positions and texture coordinates we set up for the full-screen quad did indeed arrive at the vertex shader correctly, and they were correctly sent on to the pixel shader.

Let’s debug a pixel. Go back to the Render tab, mouse over the upper left corner of the image until the X and Y values displayed in the status area show 0, 0. You can zoom into the image if necessary using the buttons at the top of the panel. Once you have the mouse over the top left pixel, right click and select “Debug This Pixel” to open up the Debugger tab. This tab shows a history of the pixel for this frame. Scroll down a bit until you see the DrawPrimitive call. There are several links displayed for debugging the vertices, as well as the pixel. Let’s start with one of the vertices just so you can see it. Click the Debug Vertex 0 link to bring up the vertex shader debugger. Here you can step through the vertex shader code (both forward and backward) and examine all of the variables and registers and such that are involved. Press F10 to step through each line. You’ll see variable values added to the list at the bottom as the values change, or look at the Registers tab to see the individual register values. You can also switch to the Disassembly tab to see the assembly code, which contains comments to help match up registers to variable names in the HLSL code.

To get back to the initial debugger screen press the “back” toolbar button – it’s the one with the green circle and white arrow pointing left.

Now let’s do the fun part and debug the pixel shader. Click the Debug Pixel (0, 0) link, which takes you to the pixel shader debugger. You may have noticed that the pixel shader could be made much more efficient by changing things to use vectors instead of individual floats, and using the incoming texture coordinates directly rather than copying them into a local variable. If you noticed this, you would be right. But splitting things out this way makes debugging in PIX a lot easier, at least for me. You’ll notice that the pixel shader debugger doesn’t display the value for input.UV anywhere, and there is also no way to add a “watch” like you would do in Visual Studio. You could look at the Registers tab and get a good idea, but that can involve a lot of thinking and writing, and examining the assembly code to determine which register is mapped to which variable. So, I found that it helped a great deal to break things out like this because the debugger adds everything to the variable list as you make changes.

If you execute the first line, float2 uv = input.UV, you’ll see what I mean. The uv variable is added to the list and shows the current value, which is (0, 1). Now, if you debug all 5 of the top row of pixels you’ll see that the uv.x values for the pixels are 0.0, 0.2, 0.4, 0.6, 0.8. Why doesn’t it ever reach 1.0? Honestly, I still think it has something to do with texel-to-pixel mapping, and I tried I don’t know how many combinations of shifting things around by half pixels in various coordinate systems, but I still have no idea why it doesn’t make it all the way to 1.0. I’m hoping someone will read this and let me know.

I do know how it’s coming up with those values though. It starts uv.x out at 0, and adds 1.0 / GeometryMapWidth for the next pixel. In our test case GeometryMapWidth is 5, so it’s adding 0.2 each time. If I could make the GPU add 0.25 each time I’d be in business. What I’d like to do is have the GPU add 1.0 / (GeometryMapWidth – 1) each time, but I can’t change the divisor – the GPU is always going to find the step by dividing by GeometryMapWidth. But what if I could change the numerator? Accessing some little-used and rusty algebra skills I came up with this:

n / GeometryMapWidth = 0.25
n = GeometryMapWidth * 0.25

Our GeometryMapWidth is 5, so n = 1.25. But how do we make the GPU use that as the numerator? Well, as it turns out the numerator in the 1.0 / GeometryMapWidth formula isn’t always 1.0. It’s really the width defined by the texture coordinates from the left and right vertices (in the horizontal case we’re considering). So far the rightmost texture value has been 1.0, and the left has been 0.0. So the formula becomes (1.0 – 0.0) / GeometryMapWidth. If the left coordinate is something besides 0, for example 0.13, the formula would look like (1.0 – 0.13) / GeometryMapWidth.

So, using that knowledge, we can change the numerator to whatever we want by manipulating the texture coordinates. Since the left value is 0.0 and needs to stay that way, we can change the right value to 1.25. The GPU will then calculate the step value as (1.25 – 0.00) / GeometryMapWidth, or 1.25 / 5, which is the 0.25 value we’re looking for! So now this is what our full screen quad definition looks like:

float pw = 1.0f / (Width - 1);
float ph = 1.0f / (Height - 1);

vertices = new VertexPositionTexture[4];
vertices[0] = new VertexPositionTexture(
  new Vector3(-1, 1, 0f), new Vector2(0, 1));
vertices[1] = new VertexPositionTexture(
  new Vector3(1, 1, 0f), new Vector2(1 + pw, 1));
vertices[2] = new VertexPositionTexture(
  new Vector3(-1, -1, 0f), new Vector2(0, 0 - ph));
vertices[3] = new VertexPositionTexture(
  new Vector3(1, -1, 0f), new Vector2(1 + pw, 0 - ph));

This generalizes the approach a bit. Instead of hard-coding the 0.25 value we calculate what it needs to be based on the size of the geometry map. For the horizontal dimension we add that value to 1 to get the final right-hand coordinate. For the vertical dimension we actually subtract it from the bottom coordinate due to the relationship between the different coordinate systems.
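
Plugging the numbers back in for the 5×5 case confirms that the step works out to what we want:

pw = 1.0 / (Width - 1) = 1.0 / 4 = 0.25
step = ((1 + pw) - 0.0) / Width = 1.25 / 5 = 0.25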

So, in the sample code, set QuadTestGood as the Startup Project and run it through PIX. Debug each pixel and you’ll see that the texture coordinates are interpolated like we want them to be.

One final piece of the puzzle though. When we generate the geometry map we need to generate an extra border of vertices around it for use in calculating the vertex normals. We can do this by simply expanding the texture coordinates in each direction by the value we calculated in the previous step.

// if we have a border then expand the texture 
// coordinates out to account for it
if (border)
{
  vertices[0].TextureCoordinate.X -= pw;
  vertices[0].TextureCoordinate.Y += ph;

  vertices[1].TextureCoordinate.X += pw;
  vertices[1].TextureCoordinate.Y += ph;

  vertices[2].TextureCoordinate.X -= pw;
  vertices[2].TextureCoordinate.Y -= ph;

  vertices[3].TextureCoordinate.X += pw;
  vertices[3].TextureCoordinate.Y -= ph;
}

If you want you can run the QuadTestGoodWithBorder project through PIX and verify that it works as well.

Now that we have the right texture coordinates, the rest of the shader just scales the coordinate by the face-space width and height to calculate the correct x and y values to pass to the noise functions. The sample shader currently just returns those values as the color. You’d want to change the render target to floating point, and replace this line:

// translate the final coordinates
return float4(Left + x, y - Top, 1, 1);

with this:

return TerrainNoise(float3(Left + x, y - Top, 1));

And create your TerrainNoise function using any of the myriad methods revealed by Google.

So, I guess that’s it. I can’t say I entirely enjoyed this ride, but the destination was worth it. And if anyone wants to explain to me why texture coordinates don’t want to interpolate all the way to 1 on a full screen quad, please feel free. 🙂

GPU Geometry Map Rendering – Part 1

I spent the past week moving my procedural planet renderer’s geometry map creation code from the CPU to the GPU. It didn’t go as smoothly as I would have liked, but in a way that was a good thing since I gained a much deeper understanding of some render pipeline things that I had been taking for granted. I also learned how to use PIX for shader debugging, which I now realize I should’ve done a long time ago. I hope to walk through some shader debugging in a later post.

CPU Geometry Maps

You may recall that a planet is defined as a cube with six faces. Each face can be thought of as a single flat plane when it comes to generating height values, and going forward we’ll do just that by considering only the front cube face.

The cube face is defined with a coordinate system with (-1, -1) on the lower left, and (1, 1) at the upper right. This is very similar to clip-space, but we’ll refer to it as cube-face-space or just face-space.

Cube-Face-Space

The geometry map can be thought of as a grid overlaying the cube face. Each position in the grid can also be thought of as a pixel in a heightmap. The grid can be as detailed as needed, but each dimension should match 2^n+1 – for example: 33×33, 65×65 – in order to help with subdividing. A very commonly used size is 33×33, and this is what I use. However, for this discussion we’ll use 5×5 to make things a bit easier to talk about. The geometry map has its own coordinate space, with (0, 0) on the lower left and (1, 1) on the upper right. Let’s call this geomap-space.

Geomap-Space

We need to generate a height value for each position in the geometry map grid. To do this we map from a geomap-space coordinate to a face-space coordinate, and use the result to generate the height at that position and store that height in the geometry map.

Here is a stripped down version of the CPU code I was using to create my geometry maps. The left, top, width, and height parameters define the face-space area we’re generating height values for. For a 5×5 geometry map (GeometryMapWidth and GeometryMapHeight both set to 5) over the entire face this function would be called like this:

CreateGeometryMap(-1, 1, 2, 2);

public void CreateGeometryMap(float left, float top, float width,
                              float height)
{
  // Calculate how far we need to move horizontally for each vertex
  // so the first is at "left", and the last is at "left + width".
  // Do the same for the vertical dimension. GeometryMapWidth and
  // GeometryMapHeight define the geometry map dimensions
  float horizontalStep = width / (GeometryMapWidth - 1);
  float verticalStep = height / (GeometryMapHeight - 1);

  float y = top - height; // start at the bottom

  for (int gy = 0; gy < GeometryMapHeight; gy++)
  {
    float x = left;

    for (int gx = 0; gx < GeometryMapWidth; gx++)
    {
      geometrymap[gx * GeometryMapWidth + gy] = GetHeightAt(x, y);
      x += horizontalStep;
    }

    y += verticalStep;
  }
}

Let’s walk through what it’s doing, in just the horizontal dimension since the vertical is exactly the same. This is a key to understanding how the GPU version works, so don’t skip over it.

The code iterates over the geometry map pixels from 0 to 4. The first pixel corresponds to -1 in face-space, the last pixel corresponds to +1 in face-space. The intervening pixels are found using the horizontalStep variable, which is calculated by dividing the face-space width by the geometry map width minus one. Remember that we passed in -1 for left, and 2 for width, so horizontalStep is 2 / (5 – 1), or 0.5.

The variable x starts out with the value left, or -1. Walking through the inner loop we get these values for x:

gx = 0, x = -1
gx = 1, x = -0.5
gx = 2, x = 0
gx = 3, x = 0.5
gx = 4, x = 1.0

If our geometry map width was 33, horizontalStep would be 2 / (33 – 1) or 0.0625. The first few steps of the loop would look like this:

gx = 0, x = -1
gx = 1, x = -0.9375
gx = 2, x = -0.875
gx = 3, x = -0.8125
gx = 4, x = -0.75
gx = 5, x = -0.6875
gx = 6, x = -0.625

And so on out to gx = 32 and x = 1. As mentioned before, determining y works exactly the same.

So, that pretty much takes care of the CPU method. It works great, it’s fairly simple, and it’s slower than a turtle crossing an Iowa road in winter. Well, to be fair, the GetHeightAt() function is slow because of the nature of the noise functions. Moving that functionality to the GPU is where we see the huge performance wins. So let’s get to it.

GPU Geometry Maps

On the GPU side, we need to be able to generate exactly the same face-space coordinates as CreateGeometryMap. Why does it have to be exact? Since the GPU only returns the height values, the CPU still has to determine the x and y values since they’re also used to create the actual vertices. If the x and y values used by the GPU are different from the ones used by the CPU, bad things happen, like terrain patches shifting a pixel or two when they’re subdivided, and three 18-hour days spent trying to figure out why (yes, this is how I spent my week-long vacation from my “real” job).

When you think about needing the GPU to iterate over something in a general purpose way, what should immediately come to mind is texture coordinates. I’m enough of a n00b at this stuff that it didn’t immediately come to my mind, so while letting Google do my thinking for me I ran across the Britonia blog, which looks like it will prove to be a very helpful resource when it comes to this whole planet creation business.

The idea I came across on that blog is to have the GPU interpolate texture coordinates in such a way that they match the x and y values in the CPU version. It was a breakthrough moment for me, I ran with it, and soon came to a screeching halt. Let’s start with the “running with it” part though.

First thing is to set up a render target which will hold our geometry map. The render target needs to match the size of our desired geometry map, so we just create it with the same dimensions. (Note that all of this code will be available in the sample linked at the end of part 2 of the article).

const int Width = 5;
const int Height = 5;

renderTarget = new RenderTarget2D(
  GraphicsDevice, Width, Height, 1,
  SurfaceFormat.Color, MultiSampleType.None,
  0, RenderTargetUsage.DiscardContents);

Note that I’m using SurfaceFormat.Color here. In the actual version you want SurfaceFormat.Single so you get 32-bit floating point goodness. In this case Color works out fine since we’re going to be examining the output in PIX and don’t really care what the final format looks like.

Next thing is to set up a full screen quad with pre-transformed vertices. Pre-transformed means we’re defining the vertices in clip space, so no transformation is necessary in the vertex shader. We can just pass the coordinates directly to the vertex shader with no changes. Also, we define texture coordinates that cover the full quad, so the GPU will interpolate them from 0 to 1 for us as it processes each pixel.

vertices = new VertexPositionTexture[4];
vertices[0] = new VertexPositionTexture(
  new Vector3(-1, 1, 0f), new Vector2(0, 1));
vertices[1] = new VertexPositionTexture(
  new Vector3(1, 1, 0f), new Vector2(1, 1));
vertices[2] = new VertexPositionTexture(
  new Vector3(-1, -1, 0f), new Vector2(0, 0));
vertices[3] = new VertexPositionTexture(
  new Vector3(1, -1, 0f), new Vector2(1, 0));

Note that “full screen” really means “full render target”. Because the render target is 5×5, and we’re rendering a full screen quad to it, the quad will contain 5×5 pixels, and our pixel shader will be executed for each of those pixels, with the texture coordinates interpolated over the pixels from 0 to 1 in each dimension. (Yes, the screeching halt will soon be upon us).

The last thing we need is to tell the pixel shader what face-space dimensions to work with. These are the same parameters passed to the CreateGeometryMap function on the CPU.

quadEffect.Parameters["Left"].SetValue(-1.0f);
quadEffect.Parameters["Top"].SetValue(1.0f);
quadEffect.Parameters["Width"].SetValue(2.0f);
quadEffect.Parameters["Height"].SetValue(2.0f);

To rehash a bit, CreateGeometryMap required the face-space dimensions (left, top, width, height), as well as constant values defining the geometry map dimensions. For our pixel shader, the face-space dimensions are taken care of by the effect parameters, and the geometry map dimensions are taken care of by the render target dimensions.

All that’s left now is drawing the quad.

GraphicsDevice.DrawUserPrimitives(
  PrimitiveType.TriangleStrip, vertices, 0, 2);
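
That call sits inside the usual XNA 3.x render-target and effect plumbing. Roughly sketched, assuming a single-pass technique and the quadEffect and renderTarget objects set up earlier, it looks like this:

GraphicsDevice.SetRenderTarget(0, renderTarget);
GraphicsDevice.VertexDeclaration = new VertexDeclaration(
  GraphicsDevice, VertexPositionTexture.VertexElements);

quadEffect.Begin();
quadEffect.CurrentTechnique.Passes[0].Begin();

GraphicsDevice.DrawUserPrimitives(
  PrimitiveType.TriangleStrip, vertices, 0, 2);

quadEffect.CurrentTechnique.Passes[0].End();
quadEffect.End();

// resolve the render target so the geometry map can be read back as a texture
GraphicsDevice.SetRenderTarget(0, null);
Texture2D geometryMap = renderTarget.GetTexture();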

And, now the screeching halt:

The results were close, but not what I expected. Neighboring geometry maps were off by a pixel or two, and when quad tree nodes split, all the geometry seemed to shift a pixel or two. It was all very distracting and ugly. After spending quite a while walking through code and trying to figure out where I went wrong, I finally decided it was time to install PIX.

And the rest will have to wait until the next post, where I’ll go through the shader itself, and walk through debugging it in PIX to see what’s going on.

Procedural Planet Engine Status

Previously I mentioned I was going to do a mulligan on my procedural planet engine. The few hours I’ve worked on it so far have led to a beautiful new architecture that’s doing most of the same things as before, as well as some major new things, using about 25% of the code.

Here is where things stand currently. I’ll go through some of these in more detail in a later post:

The planet consists of a cube, with the vertices mapped to a sphere. Each of the six cube faces is a quad tree which is used for subdividing the terrain as you move closer to the planet. Each node in the quad tree represents a patch of terrain with 33×33 vertices that are spread out evenly to cover the patch’s area.
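
The simplest form of that cube-to-sphere mapping just normalizes the cube-space position onto the unit sphere and pushes it out by the planet radius plus the terrain height. A minimal sketch, not necessarily the engine’s exact mapping:

// normalize the cube-space position onto the unit sphere, then push it
// out by the planet radius plus the generated terrain height
Vector3 CubeToSphere(Vector3 cubePosition, float radius, float terrainHeight)
{
  Vector3 unitSphere = Vector3.Normalize(cubePosition);
  return unitSphere * (radius + terrainHeight);
}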

In the previous version the quad tree nodes were subdivided synchronously, which resulted in jerkiness when moving slowly, and outright 5-second waits when moving quickly if a lot of nodes needed to be subdivided. That was good enough then since my priorities were elsewhere, but it’s not good enough for the new version. Now, when a node needs to be split the request is queued on a separate thread. The current node will continue to draw until the split is complete. The split requests can be cancelled as well if the camera has moved elsewhere before the split request reaches the head of the queue.
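
Conceptually the queue is nothing exotic. Something along these lines, as a sketch with hypothetical SplitRequest and node types (and the usual System.Threading and System.Collections.Generic usings), rather than the engine’s actual threading code:

class SplitQueue
{
  Queue<SplitRequest> requests = new Queue<SplitRequest>();
  object sync = new object();

  public SplitQueue()
  {
    Thread worker = new Thread(ProcessRequests);
    worker.IsBackground = true;
    worker.Start();
  }

  public void Enqueue(SplitRequest request)
  {
    lock (sync) { requests.Enqueue(request); }
  }

  void ProcessRequests()
  {
    while (true)
    {
      SplitRequest request = null;
      lock (sync)
      {
        if (requests.Count > 0)
          request = requests.Dequeue();
      }

      if (request == null) { Thread.Sleep(1); continue; }

      // the camera may have moved on before this request reached
      // the head of the queue
      if (request.Cancelled)
        continue;

      request.Node.GeneratePatch();      // the expensive part
      request.Node.SplitComplete = true; // picked up on the next draw
    }
  }
}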

The nice thing about this design is that if you’re moving very fast you end up getting fewer node splits, because they’re cancelled before they happen since they’re no longer necessary. Conversely, if you’re moving slowly the splits can easily keep up with your location, so you get all of the required detail. On the con side, if you’re moving quickly down to a low level and then stop, it can take a bit for the queue to catch up generating the terrain patches, so the detail can take a while to show up.

Generating a patch currently happens on the CPU using Perlin as a noise basis and various fractal algorithms such as fBm, Turbulence, and Ridged Multifractal. I will be moving this to the GPU over the coming weeks which will vastly improve the “catching up” problem mentioned previously. This will also enable creating procedural normal maps and textures on the fly.
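
For context, fBm itself is just an octave loop over the noise basis. A rough sketch, where Noise3 stands in for the 3D Perlin implementation and the octave count, lacunarity, and gain are typical defaults rather than my real settings:

float FBm(Vector3 position)
{
  float value = 0.0f;
  float frequency = 1.0f;
  float amplitude = 1.0f;

  // every height sample pays for this whole loop, which is
  // why the CPU version is so slow
  for (int octave = 0; octave < 5; octave++)
  {
    value += Noise3(position * frequency) * amplitude;
    frequency *= 2.0f;   // lacunarity
    amplitude *= 0.5f;   // gain
  }

  return value;
}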

So, the current version of the app lets me start out in space and fly to an Earth-sized planet down to ground level with ever increasing detail, and absolutely no stalling. The entire planet can be explored, but there is no texturing yet, and lighting is using vertex normals so it’s fairly ugly, but it gets the job done at this stage.

I think the next thing I will do is work on moving the patch generation to the GPU. This seemed like a daunting task 8 months ago, but it should be pretty straightforward now. This is a requirement to allow generating higher resolution procedural normal maps, which will be a big step in improving the look of the terrain.

So, that’s it for now. In future posts I’ll go through some of these features in more detail and discuss how I did things.

What Next?

So, I’d been working on Guardian since May, and finally finished it up several weeks ago; it’s now up on Xbox Live Indie Games. I won’t be purchasing a fancy car any time soon, but I might be able to use the proceeds for a Sunburn Lighting Engine license in a few weeks. I’ll buy the license anyway, but it’d be cool to do it with money earned from my game.

I spent way too much time and energy finishing Guardian, but I’m pleased to know that I still have it in me after all these years. Still, I want to move back to working on these things as a hobby rather than a money-making enterprise. It was much more enjoyable and relaxing.

So, after a couple weeks off letting my brain recover, I’ve decided to dust off my procedural planet engine and work on it some more. By “dust off”, I mean completely scrap it and start over, using what I learned during the first iteration, and some of the things I’ve learned since while working on other things.

I also plan to get back to posting useful things on this blog.

Guardian Playtest Release

Just uploaded the next version of Guardian for playtesting (you’ll need to be an XNA Creators Club member and signed in to follow the link successfully). There were lots of changes this time around. Over 250 items checked off of the todo list, many of them polish type things, but also some very major changes and additions, including these:

  • Supports multiple control configurations. The “standard” set is default and uses suggestions received from the last playtest
  • 2-4 player cooperative play
  • Comprehensive tutorial
  • Cleaned up the help “wall of text” somewhat
  • Added “demo mode” to allow watching how the game is played – this will also function as “attract mode” in a later version
  • Added 4 difficulty levels: Easy, Normal, Hard, Legendary
  • High scores for each difficulty level
  • Cooperative high scores for each difficulty level
  • Global (peer-to-peer) high scores for each difficulty level
  • The background nebula is now animated in the menus
  • Background nebula regenerated for each new wave
  • Increased sprite sizes
  • Show ammo level on selected weapon, eliminated inventory display
  • Automatically switch back to laser canon if out of ammo

I have a couple more things to finish up and then I’ll probably release another version before the 7 day playtest is up. If that all goes well I’ll be submitting it for review in a couple of weeks!

Sprite Sheet Creator

When developing the iPhone version of Guardian I manually created my sprite sheets. I used individual sprites up until the end, so everything was pretty much set in stone by the time I created the sprite sheet. Even then I ended up having to recreate the sprite sheet two or three times, and let me tell you, manually figuring out the texture coordinates isn’t a particularly pleasant experience. In this case I believe I made the right choice. There were few enough sprites that I would have spent more time creating the tool than I would have saved.

The XBox version has quite a few more sprites, so I decided that spending time creating a sprite sheet tool was going to be well worth the effort. It didn’t take too long to get it working well enough to use, and not too much longer than that to make it solid enough for distribution.

Sprite Sheet Creator

The application is released as open source under the MIT License.

Download SpriteSheetCreator.zip

Dream Build Play 2009

I managed to make the deadline for entering the Dream Build Play 2009 competition. The results are expected to be announced by the end of the month. Based on the quality of entries this year I’m not holding out a huge amount of hope of actually winning. Regardless, it was a great experience and I learned quite a bit in the process.

As part of the contest submission we wrote up a game description including play instructions and some comments on the technical design, as well as a video trailer. The video is below, and the documentation I submitted follows. Many of the other entries have posted videos in the Dream Build Play 2009 YouTube group. And you can see all of the contest entries in the Dream Build Play Gallery.

I still have a few more things to do on the game before I put it up for sale on the XBox, but I’m going to take a bit of a break before continuing on. I should have more time to update this blog as well. Thanks for reading – hope you enjoy the video.

Guardian

Game Play

The overriding premise of the game is to keep asteroids from hitting your planet, and alien ships from shooting it. Your planet will be damaged as it’s hit, and when the damage reaches the planet center the game is over.  You’re competing for high score, and the top ten high scores are tracked locally.

You command a satellite that constantly orbits the planet.  Pressing A fires your selected primary weapon from the satellite towards the red target indicator that you move with the left thumbstick to aim your shots.  You have to be careful to time your shots so the planet isn’t in between the satellite and the target.  Your main weapon is the Laser Canon and has infinite shots available.  The Plasma Canon is more powerful and the shots move faster.  The Rail Gun is an instantaneous kill, and will destroy anything in its path, including taking out a large portion of your planet if it’s in the way.  You will receive large cumulative bonuses for killing multiple enemies with a single Rail Gun shot.  Each primary weapon has a different charge rate which limits how often you can fire.  As mentioned, the Laser Canon has infinite shots, but all other weapons require ammunition.

Pressing Y fires your selected secondary weapon.  These weapons fire from the surface of the planet.  Missiles will automatically target the asteroid or enemy ship closest to the red target – it takes a second or two for the missile to lock on.  Nuclear Missiles will target the actual red target location, so you can fire them at a point in space and use the large blast radius to take out multiple targets.  The BG4143 will destroy everything on the screen by sending out a shockwave with an ever increasing radius.

Alien ships will drop a powerup after destruction.  You grab the powerup by moving the target close by and using X to activate the planetary tractor beam.  The beam will slowly pull the powerup to the planet, after which it will be used or automatically added to your inventory.  Powerups can add energy to your shields or primary weapon, add maximum shield/weapon energy, and increase shield/weapon charge rate. Powerups will also make ammunition available for the various secondary weapons.

You cycle your primary weapon by pressing the Right trigger, and cycle the secondary weapon using the Left trigger.  The right shoulder button will generate a new background at any time, and the left shoulder button displays your current weapon inventory during game play.

Shields work on their own with no intervention.  They will protect your planet for a while but have a slow charge rate, which can be increased by powerups.  Once the shield power is used up, asteroids or enemy fire will damage your planet.  However, the shield will continue to recharge as long as nothing is hitting it.

There are 32 asteroids in each wave, and 0 to 3 enemy ships.  There are also bonus asteroids and comets that will move by your planet quickly.  These can be difficult to hit, but you’ll receive a bonus score for destroying them.  They will never hit your planet, but they do come in close enough to hit the satellite and destroy it.  The asteroids and ships start out fairly slowly, but they speed up over time until they’re moving quite rapidly.

Development and Design

A limited version of Guardian was originally created for the iPhone, but I wasn’t happy developing on that platform so I made the decision to port it to XNA and XBox and add much of the functionality I had originally planned for the iPhone.

Most of the game uses basic 2D technology: Sprite sheets, particle systems, state machines, and the like.  Collision detection is mostly accomplished through point-in-circle tests.  However the planetary collision detection uses pixel tests since the planet is eaten away throughout the game.
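
A point-in-circle test is about as cheap as collision detection gets. A trivial sketch, not the game’s actual code:

// compare squared distance to squared radius so there's no
// square root per check
static bool PointInCircle(Vector2 point, Vector2 center, float radius)
{
  return Vector2.DistanceSquared(point, center) <= radius * radius;
}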

Some of the more interesting technicalities are described in the following sections.

Background Nebula

The background nebulae are generated using a pixel shader which uses fractal Brownian motion and other procedural techniques to build a random cloud and star texture.  Each time you see a new background it was generated in real time.  The backgrounds can actually be animated at 60fps to get some very nice moving nebula effects, but combining it with the rest of the game dropped me down to 30fps.  At some point I plan to optimize it some more.

It is also interesting to note that the backgrounds are generated entirely on the GPU and exist entirely in video memory.

Planet Generation

The planets are also procedurally generated, and are actually 3D.  The basic spherical structure is a cube, and a mapping function is used to move each vertex out to the sphere’s radius, and then the noise functions are used to add the height value.  A second sphere is used for the ocean areas.  Each time the game is started you get a new planet.  In future versions I plan to allow regenerating the planet, as well as having different texture sets to allow for non-Earth type planets.

Generating the planet in 3D allowed me to show the planet rotating in the menu areas, with a seamless transition to the game play area by simply moving the camera out to the proper location using the SmoothStep function.  The planet displayed during game play is still 3D, and can actually be rotated, but it looks kind of strange since the craters don’t move with it.

Planet Craters

The craters are created by drawing one of several random crater sprites into a mask texture each time damage is done to the planet.  The mask texture holds the cumulative result of each crater application.  Before drawing the planet the mask is used to set the stencil buffer, then the planet is drawn with the craters masked out.
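
In XNA 3.x render-state terms that two-pass approach looks roughly like this. This is a sketch with hypothetical DrawCraterMaskQuad and DrawPlanet helpers (and craterMask texture), not the game’s actual code:

// Pass 1: write the crater mask into the stencil buffer. Pixels covered by
// an opaque crater texel set stencil = 1; color writes are disabled.
GraphicsDevice.RenderState.StencilEnable = true;
GraphicsDevice.RenderState.StencilFunction = CompareFunction.Always;
GraphicsDevice.RenderState.ReferenceStencil = 1;
GraphicsDevice.RenderState.StencilPass = StencilOperation.Replace;
GraphicsDevice.RenderState.ColorWriteChannels = ColorWriteChannels.None;
GraphicsDevice.RenderState.AlphaTestEnable = true;
GraphicsDevice.RenderState.AlphaFunction = CompareFunction.Greater;
GraphicsDevice.RenderState.ReferenceAlpha = 128;
DrawCraterMaskQuad(craterMask);

// Pass 2: draw the planet only where the stencil is still 0, so the
// cratered areas are masked out.
GraphicsDevice.RenderState.ColorWriteChannels = ColorWriteChannels.All;
GraphicsDevice.RenderState.AlphaTestEnable = false;
GraphicsDevice.RenderState.StencilFunction = CompareFunction.Equal;
GraphicsDevice.RenderState.ReferenceStencil = 0;
GraphicsDevice.RenderState.StencilPass = StencilOperation.Keep;
DrawPlanet();

GraphicsDevice.RenderState.StencilEnable = false;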

Yet Another Guardian Progress Video

Things are coming along nicely with Guardian. The Dream Build Play entry deadline is fast approaching, but I think I’m in pretty good shape to get my entry completed. The video shows most of the functionality. Pretty much all that’s left now is fleshing out some of the graphics, and adding a few more weapons. Then I can start getting some sleep.