Animation With Timers

Last time I talked about using timers for scheduling events by simply keeping track of a time limit and how much time elapsed. Another good use for timers is performing some action over time, such as modifying alpha to fade an object in or out, or sliding an object from one location to another.

Let’s use the fading example. Fading is often done by animating transparency. The object starts out completely transparent, and gradually becomes less transparent until it’s fully opaque. An object is fully transparent when it’s drawn using alpha blending with an alpha of zero, and fully opaque when alpha is one. By gradually changing alpha from 0 to 1 over time, and redrawing the object with the new alpha, it appears to fade into view.
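
In XNA 4.0 an easy way to apply that alpha when drawing is to scale the tint color (a minimal sketch; texture, position, and alpha are stand-ins for your own values):

spriteBatch.Begin();
spriteBatch.Draw(texture, position, Color.White * alpha);  // alpha runs 0 to 1
spriteBatch.End();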

You might say that when alpha is 0 the object is 0% visible, and when alpha is 1 it’s 100% visible. Thinking about the starting and ending values as percentages is very useful. For example, if you’re moving an object from one position to another, the object hasn’t moved at all at 0%, it’s moved halfway at 50%, and it’s moved all the way at 100%. As another example, if you’re fading in a song, it’s completely silent at 0% volume, at medium loudness at 50%, and at full volume at 100%. Being able to animate a value from 0 to 1 has all kinds of applications in a video game.

So we need to be able to animate a value from 0 to 1, and we want it to take a specified amount of time. In the last post we already made a timer that can run for a specified amount of time.

In that timer, the variable Elapsed keeps track of how much time has elapsed since the timer started. Elapsed starts out at 0, and as the timer runs it progresses to Limit, which is when the timer expires. Looking at it another way, Elapsed has progressed 0% when it starts, and 100% when it ends. In order to calculate the progress as a percentage, we simply divide Elapsed by Limit.

Let’s change the SimpleTimer class so it calculates this percentage for us. We’ll save it in a field called Time. Here’s the class as it stood at the end of the last post.

public class SimpleTimer
{
  public float Limit;
  public float Elapsed;

  public SimpleTimer(float limit)
  {
    Limit = limit;
    Stop(); // make sure it's not running
  }

  public void Start()
  {
    Elapsed = 0.0f;
  }

  public void Stop()
  {
    Elapsed = -1.0f;
  }

  public bool Update(float elapsedTime)
  {
    bool result = false;

    if (Elapsed >= 0.0f)
    {
      Elapsed += elapsedTime;
      result = Elapsed >= Limit;
    }

    return result;
  }
}

First, add our Time field right after Elapsed:

public float Limit;
public float Elapsed;
public float Time;    // <-- add this

Next, add an UpdateTime method that does the percentage calculation and saves it to the Time field:

private void UpdateTime()
{
  Time = Elapsed / Limit;
}

Now, every time we modify Elapsed we need to call UpdateTime so Time is recalculated. We set Elapsed to 0 in the Start method, so we need an UpdateTime call there:

public void Start()
{
  Elapsed = 0.0f;
  UpdateTime();  // <-- add this
}

We also modify Elapsed in the Update method:

if (Elapsed >= 0.0f)
{
  Elapsed += elapsedTime;
  result = Elapsed >= Limit;
  UpdateTime();  // <-- add this
}

And that’s it.  Here’s the full class:

public class SimpleTimer
{
  public float Limit;
  public float Elapsed;
  public float Time;

  public SimpleTimer(float limit)
  {
    Limit = limit;
    Stop();
  }
  public void Start()
  {
    Elapsed = 0.0f;
    UpdateTime();
  }
  public void Stop()
  {
    Elapsed = -1.0f;
  }
  private void UpdateTime()
  {
    Time = Elapsed / Limit;
  }
  public bool Update(float elapsedTime)
  {
    bool result = false;
    if (Elapsed >= 0.0f)
    {
      Elapsed += elapsedTime;
      result = Elapsed >= Limit;
      UpdateTime();
    }
    return result;
  }
}

Now, as our timer is running, it’s always calculating how far it’s gone towards Limit as a percentage, which gives us a value between 0 and 1.  If we’re animating something like alpha that already takes a value between 0 and 1 then we can just use Time directly. But what if we need to change from one color to another, or move an object from one position on the screen to another?

This is where interpolation comes into play. The dictionary defines interpolation as “the process of determining the value of a function between two points at which it has prescribed values”. The “prescribed values” are our starting and ending colors, or our starting and ending positions. There are multiple ways to do the interpolation depending on your needs. A common one is Linear Interpolation, and that’s what we’ll use here. I’m just going to describe a couple of methods supplied by the XNA framework. If you want to learn how linear interpolation works you should be able to find the information on the web pretty easily.

XNA provides several Lerp (Linear intERPolation) methods:

MathHelper.Lerp
Vector2.Lerp
Vector3.Lerp
Matrix.Lerp
Color.Lerp
Quaternion.Lerp

Each of these takes starting and ending values of the appropriate data type, an amount value between 0 and 1, and returns a new value that’s amount percent between the starting and ending values. For example, MathHelper.Lerp performs linear interpolation on float values. Here are some sample values and what the function returns:

MathHelper.Lerp(10, 20, 0.0f);  // returns 10
MathHelper.Lerp(10, 20, 0.2f);  // returns 12
MathHelper.Lerp(10, 20, 0.5f);  // returns 15
MathHelper.Lerp(10, 20, 0.8f);  // returns 18
MathHelper.Lerp(10, 20, 1.0f);  // returns 20
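
For the curious, linear interpolation itself is just one line. This mirrors what MathHelper.Lerp does for floats:

public static float Lerp(float start, float end, float amount)
{
  // move amount percent of the way from start to end
  return start + (end - start) * amount;
}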

To move an object you can use the Vector2.Lerp method and pass in Time from our timer class as amount. In this example position will start at <50, 50> and work its way in a straight line towards <750, 400> as timer.Time changes from 0 to 1.

Vector2 start = new Vector2(50, 50);
Vector2 end = new Vector2(750, 400);
Vector2 position = Vector2.Lerp(start, end, timer.Time);

Similarly, color can be interpolated between two values like this:

color = Color.Lerp(Color.CornflowerBlue, Color.White, timer.Time);
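
Tying it together, each frame you update the timer and recompute the interpolated values. Here’s a minimal sketch, assuming timer, start, and end are fields set up as in the examples above and that the timer was started somewhere else:

protected override void Update(GameTime gameTime)
{
  float elapsedTime = (float)gameTime.ElapsedGameTime.TotalSeconds;
  timer.Update(elapsedTime);

  // recompute the animated values from the timer's progress
  position = Vector2.Lerp(start, end, timer.Time);
  color = Color.Lerp(Color.CornflowerBlue, Color.White, timer.Time);
}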

I’ve included a sample project that contains the timer class as well as a simple demonstration that moves a square across the screen and cycles colors using the timer and linear interpolation.

There are many more useful things you can add to your timers, such as pausing, reversing, auto-reversing, and auto-restarting. It’s also very useful to add events that fire when the Time value updates, when the timer ends, and so on.
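
As a quick sketch of one of those ideas, auto-restarting only takes a flag and one extra line in Update (the AutoRestart field is my addition, not part of the class above):

public bool AutoRestart;  // hypothetical extension to SimpleTimer

public bool Update(float elapsedTime)
{
  bool result = false;

  if (Elapsed >= 0.0f)
  {
    Elapsed += elapsedTime;
    result = Elapsed >= Limit;
    UpdateTime();

    // when the timer expires, immediately start the next cycle
    if (result && AutoRestart)
      Start();
  }

  return result;
}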

Download SimpleTimer.zip

Make Your Time!

It’s going on seven months since my last post, which is kind of pathetic. When finishing a large project, like the WP7 port of Guardian, my brain tends to shut down for a while to recharge. That, on top of starting a new job back in November, led to a much longer blogging hiatus than intended. So, how about something simple to get back into the swing of things? And since we’re talking about time…

Very often in video games there is a need to perform some action after a certain amount of time elapses. Examples include deciding when to show the next frame of an animation, deciding when to spawn the next enemy, hiding a message after it’s been displayed for a while, and so on. For the most basic implementation of this we need two values: something to describe how long we want to wait, and something to track how long we’ve been waiting.

float timerLimit = 2.0f;
float timerElapsed = 0.0f;

We also need to pick a unit of time to work with. Seconds seem to be pretty convenient, and that’s what we’ll use here. So the value of 2.0 for timerLimit means we want to wait 2 seconds before doing something.

Once we have these values established it’s a matter of updating timerElapsed each frame and checking it against timerLimit to see if the desired amount of time has passed. Since the game programming API of choice here is XNA, the proper place to do this updating is in the game’s Update() method.

protected override void Update(GameTime gameTime)
{
  // grab the number of seconds that have elapsed since the last
  // frame, including fractional seconds, so if you're running
  // at 60fps this value will be 0.0166667 more or less, or
  // 1/60th of a second.
  float elapsedTime = (float)gameTime.ElapsedGameTime.TotalSeconds;

  // add the elapsed time to our timer
  timerElapsed += elapsedTime;

  // if the timer has exceeded our limit then do something
  if (timerElapsed >= timerLimit)
  {
    // TODO : do something
  }
}

Once timerElapsed passes timerLimit we can perform our action. There’s a problem with this code though: once the timer exceeds our limit it will keep executing the action every frame thereafter. While this may be what we want sometimes, in many cases, if not most, it isn’t. So we need a way to turn off the timer, or restart it.

To restart it we can simply reset timerElapsed back to zero if we want the event to fire again in the same amount of time. To stop the timer we can use timerElapsed as a flag and set it to -1 to indicate that the timer is inactive. Our timer code now looks like this:

// update the timer if it's enabled
if (timerElapsed != -1)
{
  // add the elapsed time to our timer
  timerElapsed += elapsedTime;

  // if the timer has exceeded our limit then do something
  if (timerElapsed >= timerLimit)
  {
    // TODO : do something

    // set the elapsed time back to 0 to schedule the event again,
    // otherwise set it to -1 to disable the timer
    timerElapsed = -1.0f;
  }
}

To start the timer initially our code needs to set timerElapsed to 0, which will begin the process of tracking elapsed time until timerLimit is reached.

This code practically begs to become a class, so let’s not disappoint it.

public class SimpleTimer
{
  public float Limit;
  public float Elapsed;

  public SimpleTimer(float limit)
  {
    Limit = limit;
    Stop(); // make sure it's not running
  }

  public void Start()
  {
    Elapsed = 0.0f;
  }

  public void Stop()
  {
    Elapsed = -1.0f;
  }

  public bool Update(float elapsedTime)
  {
    bool result = false;

    if (Elapsed >= 0.0f)
    {
      Elapsed += elapsedTime;
      result = Elapsed >= Limit;
    }

    return result;
  }
}

Our previous example now looks like this:

SimpleTimer timer = new SimpleTimer(2.0f);
protected override void Update(GameTime gameTime)
{
  // grab the number of seconds that have elapsed since the last
  // frame, including fractional seconds, so if you're running
  // at 60fps this value will be 0.0166667 more or less, or
  // 1/60th of a second.
  float elapsedTime = (float)gameTime.ElapsedGameTime.TotalSeconds;

  if (timer.Update(elapsedTime))
  {
    // TODO : do something

    timer.Stop(); // or use timer.Start() to restart it
  }
}

Somewhat less crappy.

Scheduling events like this is one use for a timer. Another thing that’s often needed is the ability to perform an action over time, such as fading out text. The next post will cover that.

Guardian Menus

Spruced up the menus quite a bit with some floaty asteroids instead of just the raw text. Looks much nicer I think. The todo list is pretty much empty now. Mostly some final balancing and play testing left to go, and some art work for the various icons and other app store images. If all goes well over the next couple of days it’ll be complete and ready to test when I hopefully get my hardware on Monday.

User Interface for Guardian High Scores and Challenges

Here is my attempt at using XNA to duplicate some of the UI functionality on WP7. Well, not really duplicate so much as “create a passing resemblance to”.

The menu is pretty boring. I’ve been trying to come up with a good idea to make it interesting, and think I finally have something that will be worth the effort. That’ll probably be the next movie.

The first WP7 phones are supposed to be released in the U.S. on November 8th – just one week away. Here’s hoping Guardian works well on the hardware without needing too many optimizations.

My name is Crappy Coding Guy, and I use Texture2D.GetData

In a previous post about texture modification, I mentioned the evils of transferring data from the GPU to the CPU, and then presented an example showing one way to avoid doing it. The post wasn’t really about deformable 2D terrain or collision detection, but was intended to help newer game programmers open up a new way of thinking when it comes to using the GPU to accomplish tasks.

Since that post, and the one showing a video of my WP7 game, I’ve received a couple of questions about how I do the collision detection in Guardian, which would seem to require the use of Texture2D.GetData.

As it turns out, I am evil, and I do use GetData.  But, my evilness is optimized based on information from here, here, and here.

  • Crater drawing is batched, meaning that rather than draw each one as it’s created, I add them to a list and draw all of them every few frames. This reduces the number of GetData calls – one per batch of craters rather than one per crater.
  • After drawing craters to the render target, I wait a few frames before calling GetData to make sure the GPU has processed all of the drawing commands. This minimizes pipeline stalls.
  • If I have a pending GetData call to make and more craters come in, the craters will stay batched until the GetData call is complete. In other words, the drawing and getting are synchronized so that a GetData call always happens several frames after drawing a batch of craters, and any new crater draw requests wait until a pending GetData completes (a rough sketch of this flow follows the list).
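
Here’s a rough sketch of how that batching and synchronization might be arranged (DrawCraterBatch, craterRenderTarget, collisionData, and the three-frame delay are illustrative guesses, not Guardian’s actual code):

List<Vector2> pendingCraters = new List<Vector2>();
Color[] collisionData;          // sized to match the render target
int framesUntilGetData = -1;    // -1 means no GetData is scheduled

void UpdateCraters()
{
  if (framesUntilGetData < 0 && pendingCraters.Count > 0)
  {
    DrawCraterBatch(pendingCraters);  // draw the whole batch to the render target
    pendingCraters.Clear();
    framesUntilGetData = 3;           // give the GPU a few frames to finish
  }
  else if (framesUntilGetData > 0)
  {
    framesUntilGetData--;             // still waiting; new craters stay queued
  }
  else if (framesUntilGetData == 0)
  {
    // several frames after the draw, so this shouldn't stall the pipeline
    craterRenderTarget.GetData(collisionData);
    framesUntilGetData = -1;
  }
}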

If there are a lot of craters being created the built-in delays can cause some slightly inaccurate collision detection since we may be looking at collision data that’s outdated by several frames. At least in this particular game there are never huge numbers of crater adds going on so this isn’t a problem. If there are more than several crater adds they tend to be bunched close together, so the explosion animation hides any visual oddities.

There is one other optimization that I have available but haven’t needed to use. The collision data doesn’t need to be at the same resolution as the drawing data. Basically, have two sets of render targets – one for the visual texture, and a lower resolution set for the collision data. Do the GetData on the collision texture and scale everything appropriately when doing the collision check. You have to draw twice – once for the visual data and once for the collision data – but you’re pulling much less data from the GPU, which could offset the extra drawing time (this isn’t something I’ve tested yet). You won’t be pixel perfect, but for this type of game that isn’t necessary. As I write this, it occurs to me that using multiple render targets would eliminate the “draw twice” issue, but I’ve never done that so some research would be required.
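
The lookup half of that idea might look something like this (the sizes and names are made up for illustration):

const int VisualSize = 512;     // resolution of the visual render target
const int CollisionSize = 128;  // resolution of the collision render target
Color[] collisionData = new Color[CollisionSize * CollisionSize];

bool IsSolid(Vector2 visualPosition)
{
  // scale from visual-space down to collision-space
  int x = (int)(visualPosition.X * CollisionSize / VisualSize);
  int y = (int)(visualPosition.Y * CollisionSize / VisualSize);

  // any pixel that still has alpha counts as solid ground
  return collisionData[y * CollisionSize + x].A > 0;
}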

So there you have it. Is this the best or most efficient way? I don’t know – I’m far from an expert on any of this. To be honest, I never actually tested doing this just on the CPU, so it’s entirely possible that approach is better for some collision detection requirements. There are also other considerations, such as whether your game is CPU or GPU bound, which would go into determining which method is better suited to your needs. Ultimately, whatever works in your situation is the right method.

Guardian WP7 Bosses

Added some tough “boss” asteroids. This one just has a single kill point so it’s relatively simple. Later waves will require destroying multiple targets for the kill. Still need to add a destruction sequence when it’s destroyed, rather than just fading away like it currently does.

WP7 Planet Defender Progress Video

I’ve been working on a Windows Phone 7 port of Guardian and finally have something to show for it. All of the systems are in place now, just need to do a lot of tuning and game play tweaking. Of course, at this point I have no idea how it’s going to perform on an actual phone, but it should do fairly well. Hopefully there won’t be too much optimization required after I get my hands on some hardware.

I recorded the video using Fraps while running the version compiled for Windows. Keeping a version working in Windows has made testing and debugging work much more smoothly than having to deploy to the simulator each time I make a change. It’s also a fairly simple matter to simulate the touch functionality using the mouse.

Texture Modification in XNA 3.1

I’ve had a couple of questions about what changes are needed to get the texture modification tutorial to work in XNA 3.1.

So, here’s a 3.1 version of the project, and a quick overview of the major things that need to change.

  • You need to create the depth/stencil buffer yourself, set it on the GraphicsDevice when setting the render target, and restore the previous buffer when you’re done.
  • RenderTarget2D can’t be used directly as a texture, you must call RenderTarget2D.GetTexture instead and use that when drawing.
  • Render states are all set in GraphicsDevice.RenderState instead of the various classes used in 4.0.
  • Various minor syntax changes.
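
In code, the first two bullets look roughly like this (a sketch of the 3.1 pattern; the 512 size and variable names are made up for illustration):

// create a depth/stencil buffer to match the render target
DepthStencilBuffer stencilBuffer = new DepthStencilBuffer(
  GraphicsDevice, 512, 512, DepthFormat.Depth24Stencil8);

// swap it in while drawing to the render target
DepthStencilBuffer previousBuffer = GraphicsDevice.DepthStencilBuffer;
GraphicsDevice.SetRenderTarget(0, renderTarget);
GraphicsDevice.DepthStencilBuffer = stencilBuffer;

// ... draw ...

// restore the backbuffer and the previous depth/stencil buffer
GraphicsDevice.SetRenderTarget(0, null);
GraphicsDevice.DepthStencilBuffer = previousBuffer;

// in 3.1 you can't draw a RenderTarget2D directly - get its texture
Texture2D texture = renderTarget.GetTexture();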

Texture Modification using Render Targets, with some Stencil Buffer Action

Sometimes you need to modify a texture while your game is running, and there are a number of ways to do this. One of the first things newer game programmers often try to do is use Texture2D.GetData to copy the texture data from the GPU to an array on the CPU, modify the bytes, and then send it back to the GPU with Texture2D.SetData.

This is a bad idea on many levels. Beyond issues with pipeline stalls, GetData and SetData can be slow, especially when working with a large texture. Any time you’re tempted to grab data from the GPU for use on the CPU you should very carefully consider all of your options. There are often other solutions that let you keep the data entirely on the GPU and accomplish the same thing.

This tutorial will use an example that could be solved with GetData and SetData, and show you another alternative using render targets and the stencil buffer that will let you perform the same function entirely on the GPU.

CPU Craters

Let’s pretend you want to draw a 2D planet, and periodically add a crater to it. You want a hole to appear somewhere on the planet, so it looks like part of it was removed.

You could do this using the GetData/SetData method by getting the data from a texture into an array, setting the color to the background (or alpha to 0) in the shape of the crater, then writing the data back to the texture. Or you could be a little cleverer and eliminate GetData by always keeping the data in the array, but you still have to do the SetData to get it into the texture on the GPU each time it’s changed.
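
Just to make the naive version concrete, here’s a hedged sketch that punches a circular hole (craterX, craterY, and craterRadius are hypothetical values):

// hypothetical crater parameters
int craterX = 100, craterY = 80, craterRadius = 25;

Color[] pixels = new Color[planetTexture.Width * planetTexture.Height];
planetTexture.GetData(pixels);

// clear the pixels inside a circle around the crater center
for (int y = 0; y < planetTexture.Height; y++)
{
  for (int x = 0; x < planetTexture.Width; x++)
  {
    int dx = x - craterX;
    int dy = y - craterY;
    if (dx * dx + dy * dy < craterRadius * craterRadius)
      pixels[y * planetTexture.Width + x] = Color.Transparent;
  }
}

planetTexture.SetData(pixels);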

GPU Craters

The method we’ll use to do this entirely on the GPU involves several steps. First, we need a couple of resources. We’ll use a simple textured circle for a planet, and a crater-shaped texture for the crater.

It’s important to note that the black areas on these have an alpha value of 0, meaning completely transparent. For the planet this just lets us draw the round shape over the background without looking like a square image. But for the crater image the alpha value is very important, since it controls the shape that gets removed from the planet.

Next, we need to set up two render targets (these will be referred to later as Render Target A and Render Target B). When we need to add a crater, one of these will be used as a target for drawing to, while the other is used as a texture. The next time we add a crater they will swap roles – the texture will become the target, and the target will become the texture. This is called “ping-ponging” and will be discussed more fully later.

Once we have these resources ready to go, the method for adding a crater goes like this:

  1. Activate Render Target A using GraphicsDevice.SetRenderTarget.
  2. Clear the graphics device, setting the color to solid black, and the stencil buffer to 0.
  3. Set up the stencil buffer state so whatever we draw writes a value of 1 to the stencil buffer.
  4. Set up the alpha test state so we only draw where the alpha value is zero.
  5. Draw the crater texture. Because of the way we’ve set up the graphics device, only the parts of the crater texture that have alpha = 0 will be drawn, and those parts will write a 1 to the stencil buffer. So what we have at this point is a “mask” in the stencil buffer that we can use in the next step. The white area in the following image represents the stencil mask we’ve set up – the stencil buffer contains “1” in the white area, and “0” everywhere else.
  6. Set up the stencil buffer so when we draw, anything that has a value of 1 in the stencil buffer will be masked out – meaning it won’t draw.
  7. Draw the “planet texture”. Because of the way we’ve set up the graphics device, anything with a 1 in the stencil buffer won’t be drawn – since these 1’s are in the shape of a crater, that shape will be masked out of the planet texture, leaving holes that look like craters.
  8. Set the render target to the backbuffer. We can now access Render Target A as a texture, and that texture contains the planet texture with a crater-shaped hole in it.
[Image: the stencil mask produced in Step 5]
[Image: the planet with the crater masked out, from Step 7]

From now on, until we need to add another crater, we can treat Render Target A as a texture and draw it using SpriteBatch, and we’ll have a nice crater. Now, what if we need to add another crater? This is where the ping-ponging comes in. Since Render Target A is now the “planet texture”, we need to be able to draw somewhere else when we’re filling in the stencil buffer with our crater shape. It just so happens that we set up another place to draw to, Render Target B.

So now, in Step 1, instead of activating Render Target A we need to activate Render Target B and draw the crater shapes into that. But what happens when we get to Step 7? Well, the “planet texture” is now in Render Target A, so we draw that. And in Step 8, Render Target B now contains our new planet texture with two craters.

And if we add a third crater then we’re back to where we started – drawing to Render Target A, and using Render Target B as the source texture. In other words, we “ping-pong” between the two render targets – each time we need to modify the texture, one is used for a texture, and one is used for drawing to, and then those roles are swapped.

You may have noticed that there’s one issue here. The first time through, Render Target B has nothing in it, so we can’t use it as the planet texture. This can be handled by using the actual planet texture the first time, and the render target thereafter.

The Code

Now let’s walk through the code involved, using XNA 4.0. You can do this in 3.1, but you’ll have to make significant changes when creating the render targets and setting the render states.

The complete code is in the downloadable project linked at the end of the tutorial. We’ll just go through the highlights here, referring to the steps mentioned above as we go.

The XNA 4.0 API has been changed substantially where render states are concerned, and for the better. Render states have been grouped by functionality into several classes. You create instances of these classes to represent the state you want, then set them on the graphics device, or pass them to SpriteBatch. So first we need to create these render state objects.

Set Up Render State Objects

For Step 3, we need to use the DepthStencilState class to set up the device to always set the stencil buffer to 1. We enable the stencil buffer, set the stencil function to Always, the pass operation to Replace, and ReferenceStencil to 1. This means that as we’re drawing, each pixel will Always pass, and the value in the stencil buffer will be Replaced with 1.

stencilAlways = new DepthStencilState();
stencilAlways.StencilEnable = true;
stencilAlways.StencilFunction = CompareFunction.Always;
stencilAlways.StencilPass = StencilOperation.Replace;
stencilAlways.ReferenceStencil = 1;
stencilAlways.DepthBufferEnable = false;

And for Step 4 we need to use the standard AlphaTestEffect so we can draw the crater texture only where the alpha value is 0.

Matrix projection = Matrix.CreateOrthographicOffCenter(0, PlanetDataSize, PlanetDataSize, 0, 0, 1);
Matrix halfPixelOffset = Matrix.CreateTranslation(-0.5f, -0.5f, 0);
alphaTestEffect = new AlphaTestEffect(GraphicsDevice);
alphaTestEffect.VertexColorEnabled = true;
alphaTestEffect.DiffuseColor = Color.White.ToVector3();
alphaTestEffect.AlphaFunction = CompareFunction.Equal;
alphaTestEffect.ReferenceAlpha = 0;
alphaTestEffect.World = Matrix.Identity;
alphaTestEffect.View = Matrix.Identity;
alphaTestEffect.Projection = halfPixelOffset * projection;

We first set up an orthographic projection matrix that matches SpriteBatch. We set AlphaFunction to Equal, and ReferenceAlpha to 0. This means the alpha test will pass whenever the alpha value we’re drawing is equal to 0. In our crater texture, the crater area has an alpha value of 0, while the surrounding area has 1, so only the crater area will be drawn.

For Step 6 we need a stencil buffer state that allows drawing only where the stencil buffer contains a 0. We enable the stencil buffer, set the stencil function to Equal, the pass operation to Keep, and the reference stencil to 0. This means that when we’re drawing, each pixel will pass if the value in the stencil buffer is Equal to 0.

stencilKeepIfZero = new DepthStencilState();
stencilKeepIfZero.StencilEnable = true;
stencilKeepIfZero.StencilFunction = CompareFunction.Equal;
stencilKeepIfZero.StencilPass = StencilOperation.Keep;
stencilKeepIfZero.ReferenceStencil = 0;
stencilKeepIfZero.DepthBufferEnable = false;

Create Render Targets

Now that we have the render state objects created, it’s time to create the render targets. Both are the same, so just one is shown here. This creates a render target with a Color format, and a depth format that includes a stencil buffer.

renderTargetA = new RenderTarget2D(GraphicsDevice, PlanetDataSize, 
  PlanetDataSize, false, SurfaceFormat.Color, 
  DepthFormat.Depth24Stencil8, 0, 
  RenderTargetUsage.DiscardContents);

Draw the Crater Mask

Next up is drawing the crater masks (Steps 2-5). First we activate the render target, clear it to solid black, and clear the stencil buffer to 0.

GraphicsDevice.SetRenderTarget(activeRenderTarget);
GraphicsDevice.Clear(ClearOptions.Target | ClearOptions.Stencil,
                     new Color(0, 0, 0, 1), 0, 0);

Next we begin a SpriteBatch, passing in the stencilAlways and alphaTestEffect objects that we created earlier. Calculate some random rotation, size the crater texture using a Rectangle, and call SpriteBatch.Draw to draw the crater.

spriteBatch.Begin(SpriteSortMode.Immediate, BlendState.Opaque,
                  null, stencilAlways, null, alphaTestEffect);
Vector2 origin = new Vector2(craterTexture.Width * 0.5f,
                             craterTexture.Height * 0.5f);
float rotation = (float)random.NextDouble() * MathHelper.TwoPi;
Rectangle r = new Rectangle((int)position.X, (int)position.Y, 50, 50);

spriteBatch.Draw(craterTexture, r, null, Color.White, rotation,
                 origin, SpriteEffects.None, 0);

spriteBatch.End();

Draw the Planet Texture

Now we need to draw the latest planet texture, using the stencil buffer to mask out the craters (Steps 6-7). We begin a SpriteBatch, passing in the stencilKeepIfZero object we created earlier. Note that the first time we draw the actual planet texture, but subsequently we draw using the texture from the previous iteration.

spriteBatch.Begin(SpriteSortMode.Immediate, BlendState.Opaque,
                  null, stencilKeepIfZero, null, null);

if (firstTime)
{
  spriteBatch.Draw(planetTexture, Vector2.Zero, Color.White);
  firstTime = false;
}
else
  spriteBatch.Draw(textureRenderTarget, Vector2.Zero, Color.White);

spriteBatch.End();

Swap Render Targets

Finally we activate the backbuffer render target.

GraphicsDevice.SetRenderTarget(null);

And then swap the render targets as discussed previously.

RenderTarget2D t = activeRenderTarget;
activeRenderTarget = textureRenderTarget;
textureRenderTarget = t;

In the main Draw function, you draw the latest cratered planet using the textureRenderTarget. Of course, you need to deal with using the planet texture the first time through. The downloadable code shows one simple way to do that.

GraphicsDevice.Clear(Color.CornflowerBlue);
spriteBatch.Begin();
spriteBatch.Draw(textureRenderTarget, planetPosition, Color.White);
spriteBatch.End();

Conclusion

And there you have it, a powerful technique for altering textures during your game. Doing this entirely on the GPU is quite a bit more complex than GetData/SetData, but is well worth the extra trouble.

There are some things you can do to improve this technique. If you need to add a lot of craters, rather than adding them one at a time you can batch them up for a while, then in Step 5 draw all of them at once.
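
For example, a sketch reusing the Step 5 SpriteBatch setup from above:

// queue crater positions instead of drawing each one immediately
List<Vector2> pendingCraters = new List<Vector2>();

// then in Step 5, draw the whole batch in a single pass
spriteBatch.Begin(SpriteSortMode.Immediate, BlendState.Opaque,
                  null, stencilAlways, null, alphaTestEffect);
foreach (Vector2 p in pendingCraters)
{
  Rectangle r = new Rectangle((int)p.X, (int)p.Y, 50, 50);
  spriteBatch.Draw(craterTexture, r, null, Color.White, 0,
                   Vector2.Zero, SpriteEffects.None, 0);
}
spriteBatch.End();
pendingCraters.Clear();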

I hope you found this tutorial informative. Learning about render targets and stencil buffers opens up a whole new world of possibilities beyond just making craters. What other uses can you think of?

Download the sample XNA 4.0 project

Download the sample XNA 3.1 project

GPU Geometry Map Rendering – Part 2

We left off in part 1 talking about the initial failures with my GPU geometry map shader. I did fail to mention that there was a bright spot the first time I ran the new code – it was amazingly fast. So fast I was able to increase the noise octaves from the 5 that would run reasonably well on the CPU up to 30, and still run at well over 60fps. I have to admit that I spent some of that first 18-hour day just roaming around on a barren, reddish planet. That huge improvement in performance made the pain to come well worth it.

So, at the end of part 1 we set up the C# code for executing the geometry map shader. Now let’s take a look at the shader itself.

float Left;
float Top;
float Width;
float Height;

struct Vertex
{
  float4 Position : Position;
  float2 UV : TexCoord0;
};

struct TransformedVertex
{
  float4 Position : Position;
  float2 UV : TexCoord0;
};

void QuadVertexShader(in Vertex input, out TransformedVertex output)
{
  output.Position = input.Position;
  output.UV = input.UV;
}

float4 QuadPixelShader(in TransformedVertex input) : COLOR0
{
  // grab the texture coordinates - you can use them directly, but doing this
  // lets you see the value in PIX more easily when you're debugging
  float2 uv = input.UV;

  // get the coordinates relative to the input dimensions
  float x = Width * uv.x;
  float y = Height * uv.y;

  // translate the final coordinates
  return float4(Left + x, y - Top, 1, 1);
}

The Left, Top, Width, and Height floats at the top are the parameters used to define the face-space area. These are the values you’re setting when using the XNA Effect class as mentioned in part 1.

quadEffect.Parameters["Left"].SetValue(-1.0f);
quadEffect.Parameters["Top"].SetValue(1.0f);
quadEffect.Parameters["Width"].SetValue(2.0f);
quadEffect.Parameters["Height"].SetValue(2.0f);

The Vertex struct defines the vertices entering the vertex shader, and the TransformedVertex defines the vertices leaving the vertex shader and entering the pixel shader. Since we’re using pre-transformed vertices the vertex shader doesn’t need to do anything but pass the input values through, which it does very nicely.

The pixel shader isn’t actually all that much more complex. The GPU takes the texture coordinates specified in each vertex and interpolates them for us as it’s rasterizing our quad. Let’s think again in just the horizontal dimension. As mentioned previously, we’ve set up the texture coordinates so they start at 0.0 on the left vertex and end at 1.0 on the right vertex. The intent of the shader is to map those values to face-space: in this example, 0.0 on the left should map to -1.0, and 1.0 on the right should map to +1.0. Also, remember that we’re working with a 5×5 geometry map, and these are the face-space values we expect to get for each of the 5 horizontal positions: -1.0, -0.5, 0.0, 0.5, 1.0.

Walking through the shader for the left pixel we expect to see this:

The uv.x value the shader receives is 0.0
x = Width * uv.x = 2.0 * 0.0 = 0.0
Left + x = -1.0 + 0.0 = -1.0

And for the right pixel we (well, at least I did at one time) expect to see this:

The uv.x value the shader receives is 1.0
x = Width * uv.x = 2.0 * 1.0 = 2.0
Left + x = -1.0 + 2.0 = 1.0

But that isn’t what happens. In fact, the 5 texture coordinates we get are 0.0, 0.2, 0.4, 0.6, 0.8, resulting in face-space values of: -1.0, -0.6, -0.2, +0.2, +0.6. That did not make any sense to me at all – shouldn’t the texture coordinates be interpolated from 0.0 to 1.0? I spent hours sifting through my code to find out what I had set up wrong. I even took the step of creating a separate bare minimum test app, which I really hate to do. Everything I tried gave me the same results, and I was forced to conclude that somehow the GPU must not be interpolating the texture coordinates the way I expected. I suspected that it had something to do with how DirectX maps texels to pixels, but nothing I tried in that regard gave me the results I needed. I finally caved and started up PIX.

It was a bit daunting at first, but there are some simple tutorials out there that will walk you through the basics. I’m going to go through how I ended up using it, in order to easily run the same tests over, and over, and over again. Start by downloading the test project.

Unzip it into your project folder, open the QuadTest solution, rebuild it, and set QuadTestBad as the startup project. Run it, and verify that you get a pretty box with various shades of blue, white, and magenta.

Now start up PIX. Select File/New Experiment. Use the browse button to navigate to the QuadTestBad.exe you just built. Don’t change anything else for now and press the Start Experiment button. The app should start and you should see the pretty box, as well as some PIX overlay information in the upper left. If that’s the case, all is well. If not, you’ll need to figure out why on your own.

Now what we need to do is have PIX grab all the Direct3D data for a single frame. Often you can do this by selecting the “Single-frame capture of Direct3D whenever F12 is pressed” radio button before starting the experiment. That works just fine, and is the method I started out with. But there’s a better way in this case. From the experiment window (the one where you set up the program path), select the More Options button. You should see something like this:

The left tree view might have some different values in it if you changed any options on the initial screen, but the things you need to do here are the same regardless. In the tree view the green T lines are Triggers, and the purple A lines are Actions. When selecting a trigger line you’ll see the right panel which will let you select the type of trigger. Select “Frame” for the trigger type. This will reveal the options for the Frame trigger type. Enter “1” for the frame number. Now select the Action line, which will change the right panel to allow you to select an action. Select the “Set Call Capture” action. Then under Capture Type select “Single-frame capture Direct 3D”, and check the “capture d3dx calls also” box as well.

We’ve just defined a trigger-based action: we’ve told PIX to capture Direct3D data on frame #1. Now let’s create a second trigger-based action. Press the green T button to create the new trigger. Again select Frame for the trigger type, but this time enter 2 for the frame number. For the action select Terminate Program. So, we’ve now told PIX to exit the program on frame #2. To restate: on frame 1 PIX will capture a single frame of data, and on frame 2 it will exit the program. You can just go the F12 route in many cases, but when you’re going to be repeating something over and over, setting up the triggers can save a lot of time. In some cases you may have to set the triggers, since it can be impossible to hit F12 at just the right moment to capture the data you want.

So, go ahead and start the experiment. You should see the QuadTestBad app start up, and then immediately quit. PIX should then display a screen full of data. For this discussion we’re only interested in the panels on the bottom: Events and Details. Events will show you all of the Direct3D calls, and Details will show you details about those events. There is a lot of noise in the D3D calls, so it can be difficult to find what you’re looking for by just scanning through it. There are some buttons at the top of the Events panel that let you move to the next or previous frame, and the next or previous draw call. So, press the button that has the D with a down arrow:

That will take us to the draw call we’re interested in examining. In the Detail panel select the Render tab. You should see the quad we drew. This view will let us step through the vertex and pixel shaders for each pixel displayed in the render tab. You can also look at the Mesh tab to see information at the vertex level, both pre-vertex-shader and post-vertex-shader. Go ahead and do that now, and you’ll see that the screen-space positions and texture coordinates we set up for the full-screen quad did indeed arrive at the vertex shader correctly, and they were correctly sent on to the pixel shader.

Let’s debug a pixel. Go back to the Render tab, mouse over the upper left corner of the image until the X and Y values displayed in the status area show 0, 0. You can zoom into the image if necessary using the buttons at the top of the panel. Once you have the mouse over the top left pixel, right click and select “Debug This Pixel” to open up the Debugger tab. This tab shows a history of the pixel for this frame. Scroll down a bit until you see the DrawPrimitive call. There are several links displayed for debugging the vertices, as well as the pixel. Let’s start with one of the vertices just so you can see it. Click the Debug Vertex 0 link to bring up the vertex shader debugger. Here you can step through the vertex shader code (both forward and backward) and examine all of the variables and registers and such that are involved. Press F10 to step through each line. You’ll see variable values added to the list at the bottom as the values change, or look at the Registers tab to see the individual register values. You can also switch to the Disassembly tab to see the assembly code, which contains comments to help match up registers to variable names in the HLSL code.

To get back to the initial debugger screen press the “back” toolbar button – it’s the one with the green circle and white arrow pointing left.

Now let’s do the fun part and debug the pixel shader. Click the Debug Pixel (0, 0) link, which takes you to the pixel shader debugger. You may have noticed that the pixel shader could be made much more efficient by changing things to use vectors instead of individual floats, and using the incoming texture coordinates directly rather than copying them into a local variable. If you noticed this, you would be right. But splitting things out this way makes debugging in PIX a lot easier, at least for me. You’ll notice that the pixel shader debugger doesn’t display the value for input.UV anywhere, and there is also no way to add a “watch” like you would do in Visual Studio. You could look at the Registers tab and get a good idea, but that can involve a lot of thinking and writing, and examining the assembly code to determine which register is mapped to which variable. So, I found that it helped a great deal to break things out like this because the debugger adds everything to the variable list as you make changes.

If you execute the first line, float2 uv = input.UV, you’ll see what I mean. The uv variable is added to the list and shows the current value, which is (0, 1). Now, if you debug all 5 of the top row of pixels you’ll see that the uv.x values for the pixels are 0.0, 0.2, 0.4, 0.6, 0.8. Why doesn’t it ever reach 1.0? Honestly, I still think it has something to do with texel to pixel mapping, and I tried I don’t know how many combinations of shifting things around by half pixels in various coordinate systems, but I still have no idea why it doesn’t make it all the way to 1.0. I’m hoping someone will read this and let me know.

I do know how it’s coming up with those values though. It starts uv.x out at 0, and adds 1.0 / GeometryMapWidth for the next pixel. In our test case GeometryMapWidth is 5, so it’s adding 0.2 each time. If I could make the GPU add 0.25 each time I’d be in business. What I’d like to do is have the GPU add 1.0 / (GeometryMapWidth – 1) each time, but I can’t change the divisor – the GPU is always going to find the step by dividing by GeometryMapWidth. What I can change, though, is the numerator. Accessing some little-used and rusty Algebra skills I came up with this:

n / GeometryMapWidth = 0.25
n = GeometryMapWidth * 0.25

Our GeometryMapWidth is 5, so n = 1.25. But how do we make the GPU use that as the numerator? Well, as it turns out the numerator in the 1.0 / GeometryMapWidth formula isn’t always 1.0. It’s really the width defined by the texture coordinates from the left and right vertices (in the horizontal case we’re considering). So far the rightmost texture value has been 1.0, and the left has been 0.0. So the formula becomes (1.0 – 0.0) / GeometryMapWidth. If the left coordinate is something besides 0, for example 0.13, the formula would look like (1.0 – 0.13) / GeometryMapWidth.

So, using that knowledge, we can change the numerator to whatever we want by manipulating the texture coordinates. Since the left value is 0.0 and needs to stay that way, we can change the right value to 1.25. The GPU will then calculate the step value as (1.25 – 0.00) / GeometryMapWidth, or 1.25 / 5, which is the 0.25 value we’re looking for! So now this is what our full screen quad definition looks like:

float pw = 1.0f / (Width - 1);
float ph = 1.0f / (Height - 1);

vertices = new VertexPositionTexture[4];
vertices[0] = new VertexPositionTexture(
  new Vector3(-1, 1, 0f), new Vector2(0, 1));
vertices[1] = new VertexPositionTexture(
  new Vector3(1, 1, 0f), new Vector2(1 + pw, 1));
vertices[2] = new VertexPositionTexture(
  new Vector3(-1, -1, 0f), new Vector2(0, 0 - ph));
vertices[3] = new VertexPositionTexture(
  new Vector3(1, -1, 0f), new Vector2(1 + pw, 0 - ph));

This generalizes the approach a bit. Instead of hard coding the 0.25 value we can calculate what it needs to be based on the size of the geometry map. We then add that value to 1 to get the final value, for the horizontal dimension. For the vertical dimension we’re actually subtracting it from the bottom coordinate due to the relationship between the different coordinate systems.

So, in the sample code, set QuadTestGood as the Startup Project and run it through PIX. Debug each pixel and you’ll see that the texture coordinates are interpolated like we want them to be.

One final piece of the puzzle though. When we generate the geometry map we need to generate an extra border of vertices around it for use in calculating the vertex normals. We can do this by simply expanding the texture coordinates in each direction by the value we calculated in the previous step.

// if we have a border then expand the texture 
// coordinates out to account for it
if (border)
{
  vertices[0].TextureCoordinate.X -= pw;
  vertices[0].TextureCoordinate.Y += ph;

  vertices[1].TextureCoordinate.X += pw;
  vertices[1].TextureCoordinate.Y += ph;

  vertices[2].TextureCoordinate.X -= pw;
  vertices[2].TextureCoordinate.Y -= ph;

  vertices[3].TextureCoordinate.X += pw;
  vertices[3].TextureCoordinate.Y -= ph;
}

If you want you can run the QuadTestGoodWithBorder project through PIX and verify that it works as well.

Now that we have the right texture coordinates, the rest of the shader just scales the coordinate by the face-space width and height to calculate the correct x and y values to pass to the noise functions. The sample shader currently just returns those values as the color. You’d want to change the render target to floating point, and replace this line:

// translate the final coordinates
return float4(Left + x, y - Top, 1, 1);

with this:

return TerrainNoise(float3(Left + x, y - Top, 1));

And create your TerrainNoise function using any of the myriad methods revealed by Google.

So, I guess that’s it. I can’t say I entirely enjoyed this ride, but the destination was worth it. And if anyone wants to explain to me why texture coordinates don’t want to interpolate all the way to 1 on a full screen quad, please feel free. 🙂