Ongoing guide: Getting started with Shaders in AndEngine


Postby ltmon » Sun Jan 29, 2012 10:10 am

Hi All,

I thought I'd write up this tutorial as a running commentary on my own learning about shaders, and specifically how to use them in AndEngine. The existing specification and many tutorials on shaders are often somewhat complex and assume a lower-level knowledge of OpenGL than many of us who don't write game engines for a living have. They also tend to focus on 3d problems, such as lighting.

This tutorial will hopefully distill the various sources of information I've been using to figure this out into some small, practical fragments of code to help you get started.

I would expect you to already know Java and AndEngine, and to have at least a passing understanding of what OpenGL does and how it does it.

I'm just getting started myself, so this tutorial is partly to "error check" my understanding as well as help others. If I'm talking nonsense, please shoot me down and I'll fix the problem. If I subsequently learn better on any point I'll correct it and highlight the correction.

1. What are shaders?

In OpenGL ES 2.0 (the version used by the GLES2 branch of AndEngine and available on Android devices >= 2.2) the rendering pipeline is programmable. This means that at various stages between a Sprite or other entity being sent to the GPU and it appearing on screen we have opportunities to insert some code that is executed directly on the GPU and affects the final appearance of our entity.

The advantage of doing this is speed: the code executed directly on the GPU can be orders of magnitude faster than other code due to the high parallelism in a modern GPU. This has turned out to be useful for all sorts of nice things: lighting, blur, motion effects, bloom ("glowing") effects and more. Well-crafted shaders with the right purpose in mind should allow you to make beautiful graphical effects in AndEngine without blowing out your performance.

There are two specific points in the GPU pipeline where we can insert our own code. The programs we insert are called the vertex shader and the fragment shader.

2. The vertex shader

AndEngine sends to OpenGL a list of points in space that represent what you are drawing to screen. These are the vertices of your scene -- for example the 4 points on each corner of a Sprite. The vertex shader is executed early in the pipeline and is called once for each vertex.

The final goal of the vertex shader is to return a position where that vertex should be rendered on the screen. It can do other things, like calculate normals, but that's not all that useful in a 2d world.

3. The fragment shader

Later in the pipeline the vertices and other information have been distilled into fragments. By the time these fragments get to your fragment shader you can pretty much regard them as individual pixels. As such, your shader will be executed once for each pixel on the screen.

The final goal of the fragment shader is to decide on the color of every single pixel.

4. You're already running shaders!

The OpenGL ES 2.0 pipeline needs shaders to output anything to screen at all. AndEngine already provides very simple shader programs to take your entities and put them in the right place according to your camera, with the right colors for each pixel. You can find all of this in the AndEngine source: look in src/org/andengine/opengl/shader.

The important thing to remember there is that when you implement a shader you aren't just modifying the original color or position of your entity -- you're starting from scratch. Luckily it's very simple to reproduce the basic default shaders, so we'll do that first.

5. The shader language -- GLSL
Shader programs are written in their own language, GLSL. You can find the formal specification of this language at the OpenGL specification page -- ... 1.0.17.pdf (warning: PDF). I'll try to explain some of the language as we go along; suffice to say it's similar in many ways to Java and C, so it shouldn't be too hard to understand.

In AndEngine, GLSL programs are usually just defined as Strings in your source, though there's nothing stopping you from writing a shader in an external file and loading it instead. For the moment we'll stick to embedded Strings.

6. Digging through AndEngine code

Looking around AndEngine, you'll find a class ShaderProgram, with the following constructor:

```java
public ShaderProgram(IShaderSource pVertexShaderSource, IShaderSource pFragmentShaderSource) { ... }
```

Knowing about vertex and fragment shaders, this should be clear: We can create a ShaderProgram with a vertex and fragment shader. A look at the implementations of IShaderSource shows a likely candidate for simple use:

```java
public StringShaderSource(String pShaderSource) { ... }
```

So, we can create a ShaderProgram out of two StringShaderSource objects defining the vertex and fragment shaders as Strings. It looks like we have the building blocks.

7. The basic vertex shader

Earlier I told you there was a basic vertex shader that AndEngine already uses. Here it is (or close enough) as a StringShaderSource, simplified to its absolute most basic necessities:

```java
IShaderSource vShader = new StringShaderSource(
    "uniform mat4 " + ShaderProgramConstants.UNIFORM_MODELVIEWPROJECTIONMATRIX + ";\n" +
    "attribute vec4 " + ShaderProgramConstants.ATTRIBUTE_POSITION + ";\n" +
    "void main()\n" +
    "{\n" +
    "   gl_Position = " + ShaderProgramConstants.UNIFORM_MODELVIEWPROJECTIONMATRIX + " * " + ShaderProgramConstants.ATTRIBUTE_POSITION + ";\n" +
    "}"
);
```

Let's break it down:

```java
"uniform mat4 " + ShaderProgramConstants.UNIFORM_MODELVIEWPROJECTIONMATRIX + ";\n" +
```

A variable is defined. Its name is held in the constant ShaderProgramConstants.UNIFORM_MODELVIEWPROJECTIONMATRIX, which will allow us to reference it easily from the Java code later. It is a matrix that defines how AndEngine scene coordinates map to the world, the camera and ultimately the screen. It's a "uniform", meaning it doesn't change during the rendering of a frame, but can change between frames. It's a fairly complex matrix to construct, but for most purposes you will use the one built into AndEngine. Its data type, "mat4", is a 4x4 matrix.

```java
"attribute vec4 " + ShaderProgramConstants.ATTRIBUTE_POSITION + ";\n" +
```

This is another variable, this one holding the position of the vertex. It's a 4-dimensional vector (although the 4th component is always 1). As it's an "attribute", it carries a separate value for each vertex.

```java
"void main()\n" +
"{\n" +
"   gl_Position = " + ShaderProgramConstants.UNIFORM_MODELVIEWPROJECTIONMATRIX + " * " + ShaderProgramConstants.ATTRIBUTE_POSITION + ";\n" +
"}"
```

This is now doing what vertex shaders should do! The special variable "gl_Position" is set by multiplying the vertex's position vector by the model view projection matrix. The final result? gl_Position holds a vector giving the position of the vertex on screen. This is the voodoo magic provided by the model view projection matrix -- a bit of reading or browsing the source code should help you start to understand how it's calculated in the first place.
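If the matrix multiply feels abstract, here is a hypothetical pure-Java sketch of the arithmetic the shader performs per vertex. The class and method names (MvpSketch, ortho, multiply) are mine, not AndEngine's, and the matrix is a simple orthographic projection standing in for the real camera-derived MVP:

```java
// A pure-Java sketch of what "gl_Position = MVP * position" computes.
// The ortho matrix here stands in for AndEngine's camera-derived MVP matrix.
public class MvpSketch {

    // Column-major 4x4 orthographic matrix mapping scene coordinates
    // (0..width, 0..height) into clip space (-1..1).
    static float[] ortho(float width, float height) {
        return new float[] {
            2f / width, 0f, 0f, 0f,
            0f, 2f / height, 0f, 0f,
            0f, 0f, -1f, 0f,
            -1f, -1f, 0f, 1f
        };
    }

    // mat4 * vec4: the per-vertex multiply the vertex shader performs.
    static float[] multiply(float[] m, float[] v) {
        float[] out = new float[4];
        for (int row = 0; row < 4; row++) {
            out[row] = m[row] * v[0] + m[4 + row] * v[1]
                     + m[8 + row] * v[2] + m[12 + row] * v[3];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] mvp = ortho(1024f, 512f);
        // A vertex in the middle of a 1024x512 scene...
        float[] clip = multiply(mvp, new float[] {512f, 256f, 0f, 1f});
        // ...lands at the center of clip space.
        System.out.println(clip[0] + ", " + clip[1]); // prints 0.0, 0.0
    }
}
```

Running this shows a scene-center vertex mapping to clip-space (0, 0), which is exactly the job the real shader does for every vertex.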

8. Color in a pixel with fragment shaders

This time I'm going to use a very simple fragment shader to set the color of all pixels that are run through it. The default AndEngine fragment shader is a bit more complex, as it needs to work out the color you have set, the texture you may have set and maybe apply blending. This one is going to just set a very simple color.

```java
IShaderSource fShader = new StringShaderSource(
    "precision mediump float;\n" +
    "uniform vec4 theColor;\n" +
    "void main()\n" +
    "{\n" +
    "   gl_FragColor = theColor;\n" +
    "}"
);
```

Breaking it down:

```java
"precision mediump float;\n" +
```

We're setting the default precision for all floats (including those inside vectors) to mediump. In OpenGL ES 2.0 fragment shaders there is no default float precision, so you must declare one; mediump is usually a good trade-off between accuracy and speed. Your usage will ultimately determine what precision to use.

```java
"uniform vec4 theColor;\n" +
```

This uniform will hold a 4-component vector containing our color in RGBA format.

```java
"void main()\n" +
"{\n" +
"   gl_FragColor = theColor;\n" +
"}"
```

A new special variable: gl_FragColor is set to a vector giving the output color of this fragment (or pixel). We're simply setting it to the value we pass in via theColor. Real, useful fragment shaders will generally do much more than this, but we have to start somewhere.
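One thing that trips people up is that the vec4 is normalized: GLSL colors run from 0.0 to 1.0, not 0 to 255. Here's a hypothetical little helper (the names ColorToVec4 and toVec4 are mine) showing the conversion from a familiar byte color to the four floats you'd hand to theColor:

```java
// Hypothetical helper: converts an ordinary 0-255 RGBA color into the
// normalized 0.0..1.0 floats that a GLSL vec4 uniform like "theColor" expects.
public class ColorToVec4 {

    static float[] toVec4(int r, int g, int b, int a) {
        // GLSL colors are normalized floats, not bytes.
        return new float[] { r / 255f, g / 255f, b / 255f, a / 255f };
    }

    public static void main(String[] args) {
        float[] v = toVec4(255, 0, 0, 255); // opaque red
        // These four floats are what you would pass to
        // GLES20.glUniform4f(theColorLocation, v[0], v[1], v[2], v[3]).
        System.out.println(v[0] + " " + v[3]); // prints 1.0 1.0
    }
}
```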

9. Setting the scene

Here's an arbitrary Rectangle, added to a scene:

```java
Rectangle r = new Rectangle(100f, 100f, 300f, 100f, getVertexBufferObjectManager());
mScene.attachChild(r);
```

Easy enough. I've left out the boilerplate, as that's assumed knowledge: you should have created a LimitedFPSEngine and a Scene in a BaseGameActivity to house this. Once you can see the rectangle, it's time to move on. You might have to ensure your background color is not the same as your rectangle's.

AndEngine entities can have ShaderPrograms set directly on them... so let's do that:

```java
Rectangle r = new Rectangle(100f, 100f, 300f, 100f, getVertexBufferObjectManager());

IShaderSource vShader = new StringShaderSource(...);
IShaderSource fShader = new StringShaderSource(...);

r.setShaderProgram(new ShaderProgram(vShader, fShader));

mScene.attachChild(r);
```

Just imagine the vertex and fragment shader strings I showed above are copied out in full again.

Run your program, and... well, nothing will be on screen, actually. We haven't taken an important step!

10. Linking and Binding, Binding and Linking

In our shaders we had several attributes and uniforms that need values. Assigning or changing values in a shader happens in the link and bind methods of the ShaderProgram, so we have to override these to get what we want into the output.

AndEngine provides some extensions of ShaderProgram that do just this for the normal things: the all important model view projection matrix for example. We are going to do most of it ourselves just to learn.

The following should exist in your extension of ShaderProgram:

```java
public int theColorLocation = ShaderProgram.LOCATION_INVALID;
private Random r = new Random();

@Override
protected void link(final GLState pGLState) throws ShaderProgramLinkException {
    GLES20.glBindAttribLocation(this.mProgramID, ShaderProgramConstants.ATTRIBUTE_POSITION_LOCATION, ShaderProgramConstants.ATTRIBUTE_POSITION);

    super.link(pGLState);

    PositionTextureCoordinatesShaderProgram.sUniformModelViewPositionMatrixLocation = this.getUniformLocation(ShaderProgramConstants.UNIFORM_MODELVIEWPROJECTIONMATRIX);
    theColorLocation = this.getUniformLocation("theColor");
}

@Override
public void bind(final GLState pGLState, final VertexBufferObjectAttributes pVertexBufferObjectAttributes) {
    super.bind(pGLState, pVertexBufferObjectAttributes);
    GLES20.glUniformMatrix4fv(PositionTextureCoordinatesShaderProgram.sUniformModelViewPositionMatrixLocation, 1, false, pGLState.getModelViewProjectionGLMatrix(), 0);
    GLES20.glUniform4f(theColorLocation, r.nextFloat(), r.nextFloat(), r.nextFloat(), 0.3f);
}
```

In the link method, we link the attributes and uniforms in our shader program to locations in Java: specifically, just integers that give us a way of referencing that attribute or uniform later on.

The bind method is called every frame, and allows us to bind a uniform or attribute to a value for that frame. You can see here that we're binding the model view projection matrix, and our color.

We randomize the color on each frame so you can see that bind() really is called every time: set your engine to a low framerate to see this clearly in action.

11. Done for today

We've now seen how to do some simple manipulation of a single entity with GLSL shader programs. Nothing here was very visually exciting, nor was it even a faster way to get a random colored rectangle if you ever wanted that. Hopefully, though, it was enough to start to understand how shaders fit together.

I'll hopefully be extending this with new posts as I learn. For my own purposes I'll be wanting a bloom (glowing) shader and a drop shadow shader, so that's the initial direction I'll be taking.

12. Addendum

Please submit to me any corrections on the above. As I said, this is really just my own first steps that I'm putting down in paper (it's a great way to learn, try it).

A question for the (advanced?) reader: how is ShaderProgramConstants.ATTRIBUTE_POSITION linked and bound? I couldn't find where that is done in the AndEngine source. Does it need to be done?

13. Addendum, part 2

One important part missed was the need to load your ShaderPrograms. This is usually best done in the onCreateResources() method of your game:

```java
this.getShaderProgramManager().loadShaderProgram(new ShaderProgram(...));
```

Your shader will work without this, but it will likely not resume correctly when your game is paused and you switch to another program before returning.
Last edited by ltmon on Thu Feb 02, 2012 12:18 pm, edited 3 times in total.

Re: Getting started with Shaders

Postby Mathew » Sun Jan 29, 2012 6:10 pm

Great article, nice to know new techniques - thank you!

Re: Getting started with Shaders

Postby RealMayo » Sun Jan 29, 2012 6:49 pm

Awesome, thank you!

Re: Getting started with Shaders

Postby ltmon » Wed Feb 01, 2012 3:56 am

Hi Again -- time for part 2.

This time we are doing something a little more interesting -- a targeted Gaussian blur.

A few things before getting started:

  • The previous post has been updated with a new addendum about loading your shaders, please go read it if you haven't already.
  • The source code for this tutorial is presented as an attached file rather than inline. I'll be referring to line numbers throughout, so please download it now and open it in your favourite editor with line numbers.
  • You can use this code in a new AndEngine project -- just replace your main game activity with it. You'll need a graphic called "mona_lisa.<jpg|png>" in assets/gfx, my reference one being a 312x400 jpg.

1. Blur

One of the things a fragment shader is good at, that is also very useful, is performing blurring operations. Any blur algorithm basically works by retrieving a portion of the colour of neighbouring pixels and adding them to the current pixel. Usually you will take values from pixels out to a certain distance and add less and less of each.

Blur is exciting because it can be a building block of many other effects, such as bloom or shadows. On its own, it can be used to create a depth-of-field effect, or just a nice effect when pausing your game.

In this particular case we are going to use an algorithm called Gaussian blur. Gaussian blur gives a nice-looking blur while still keeping up good performance. It is used commonly in many places, such as the semi-transparent title bars on a Windows 7 desktop. For a full description of the algorithm, I'm going to refer you to a great little article: ... er-shader/

You'll notice that the article even contains a couple of GLSL shaders for us to use. We'll be focusing on implementing those shaders in AndEngine and explaining a bit along the way.
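Before diving into GLSL, it may help to see the idea on the CPU. The following is a hypothetical pure-Java sketch of a one-dimensional Gaussian pass (class and method names weights and blurRow are mine, and this is not the linked article's shader): build normalized bell-curve weights, then, for each pixel, sum weighted contributions from its neighbours.

```java
// Hypothetical CPU-side sketch of one horizontal Gaussian blur pass.
public class GaussianSketch {

    // Build normalized weights for a symmetric kernel of the given radius.
    static float[] weights(int radius, float sigma) {
        float[] w = new float[2 * radius + 1];
        float sum = 0f;
        for (int i = -radius; i <= radius; i++) {
            w[i + radius] = (float) Math.exp(-(i * i) / (2f * sigma * sigma));
            sum += w[i + radius];
        }
        for (int i = 0; i < w.length; i++) w[i] /= sum; // weights sum to 1
        return w;
    }

    // One pass over a row of gray pixels (clamped at the edges): the same
    // shape of loop the fragment shader runs once per pixel.
    static float[] blurRow(float[] row, float[] w) {
        int radius = w.length / 2;
        float[] out = new float[row.length];
        for (int x = 0; x < row.length; x++) {
            float acc = 0f;
            for (int i = -radius; i <= radius; i++) {
                int xi = Math.min(row.length - 1, Math.max(0, x + i));
                acc += row[xi] * w[i + radius];
            }
            out[x] = acc;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] row = {0f, 0f, 1f, 0f, 0f}; // a single bright pixel
        float[] blurred = blurRow(row, weights(2, 1f));
        // The spike spreads into its neighbours, brightest in the middle.
        System.out.println(blurred[2] > blurred[1] && blurred[1] > blurred[0]); // prints true
    }
}
```

The real shader does exactly this per fragment, in two separate horizontal and vertical passes.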

2. The shaders

The source code for the AndEngine shaders starts at line 159. This time we're subclassing ShaderProgram explicitly.

The first thing you'll notice is that we use the singleton pattern. There's actually very little point in having more than one instance of your ShaderProgram in most cases, so we make sure that no extra ones are created.

Starting at line 168 is the fragment shader source code -- we'll get to that later.

In the constructor at line 192 we initialize this with our fragment shader and the vertex shader taken directly from the AndEngine source code. We don't need anything new or exciting in our vertex shader for a blur, so we can save ourselves the effort and use the default one. Still, it's worth having a look at it because it introduces something new.

3. The default vertex shader

The source for the default AndEngine vertex shader is reproduced below. It's a little more complex than the one we used last time, but is very similar in many respects.

```java
public static final String VERTEXSHADER =
    "uniform mat4 " + ShaderProgramConstants.UNIFORM_MODELVIEWPROJECTIONMATRIX + ";\n" +
    "attribute vec4 " + ShaderProgramConstants.ATTRIBUTE_POSITION + ";\n" +
    "attribute vec2 " + ShaderProgramConstants.ATTRIBUTE_TEXTURECOORDINATES + ";\n" +
    "varying vec2 " + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ";\n" +
    "void main() {\n" +
    "   " + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + " = " + ShaderProgramConstants.ATTRIBUTE_TEXTURECOORDINATES + ";\n" +
    "   gl_Position = " + ShaderProgramConstants.UNIFORM_MODELVIEWPROJECTIONMATRIX + " * " + ShaderProgramConstants.ATTRIBUTE_POSITION + ";\n" +
    "}";
```

We can see that its final goal is to set gl_Position. It does this in the same way we did last time, multiplying the vertex's position by the model view projection matrix to end up with a coordinate for that vertex on the screen.

The new bit is the ability to attach texture coordinates to each vertex, effectively "pinning" a texture on top of any shape we define with a series of vertices.

The program takes an attribute "ShaderProgramConstants.ATTRIBUTE_TEXTURECOORDINATES" which will be bound to a coordinate on a texture. In AndEngine this generally means that 4 vertices will be bound to the 4 corners of a rectangular texture, but it can be much more complex than that if required.

Additionally, we are defining a varying, ShaderProgramConstants.VARYING_TEXTURECOORDINATES. A "varying" is a new concept for us. It doesn't define a value that we provide to the shader, but a value the shader outputs. The way it returns the varying is to pass it onwards through the pipeline -- eventually to the fragment shader.

As we are simply setting the varying texture coordinates to the attribute texture coordinates, we will have direct access to the texture coordinates within our fragment shader -- a useful piece of knowledge.

But wait...

4. Interpolation

We are passing down the coordinates for where a texture is pinned to a vertex. But there isn't a vertex on every fragment. What about all those fragments that are between vertices?

Varyings solve this problem by interpolation. Any fragment (pixel) lying between two or more vertices receives a linearly interpolated value of any varying passed down the pipeline. For example, if you set a varying to a vec4 representing green on one vertex and to a vec4 representing red on another, the pixels between those two vertices would each be handed a vec4 from the smooth gradient between green and red.

As we are setting our varying in the vertex shader to a corner of a texture, what we will have on any given fragment is the coordinate of the texture that should be rendered onto that pixel. Pretty useful I would think!

Interpolation is a core feature of the GLES2.0 rendering pipeline, and we should use it where possible to make life fast and easy.
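The green-to-red example above can be sketched in plain Java. This is a hypothetical illustration (the class VaryingInterpolation and method lerp are my names, not pipeline API) of the per-component linear interpolation the hardware performs for every fragment between two vertices:

```java
// Hypothetical sketch of how the pipeline fills in a varying for fragments
// between two vertices: per-component linear interpolation.
public class VaryingInterpolation {

    // t runs from 0.0 (at vertex a) to 1.0 (at vertex b).
    static float[] lerp(float[] a, float[] b, float t) {
        float[] out = new float[4];
        for (int i = 0; i < 4; i++) out[i] = a[i] + (b[i] - a[i]) * t;
        return out;
    }

    public static void main(String[] args) {
        float[] green = {0f, 1f, 0f, 1f};
        float[] red = {1f, 0f, 0f, 1f};
        float[] mid = lerp(green, red, 0.5f);
        // Halfway between the vertices, the fragment sees a 50/50 blend.
        System.out.println(mid[0] + " " + mid[1]); // prints 0.5 0.5
    }
}
```

The same mechanism is what hands our fragment shader a smoothly varying texture coordinate for every pixel inside the sprite.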

5. Back to the fragment shader

So we now have a vertex shader, and we know it's going to be passing us a position on a texture that should be rendered. Let's have a look at our fragment shader on line 168.

Firstly we're setting our default precision to be lowp. This is just to speed it up a bit, as high precision maths is a bit pointless and the effects will not be visible.

Next we define a uniform of type sampler2D called ShaderProgramConstants.UNIFORM_TEXTURE_0. This, if you haven't guessed, is the texture currently being rendered. We'll make sure to link and bind it in the appropriate places.

The const "blurSize" just makes the settings easy to change. You can make it bigger or smaller to pull in values from pixels further from or closer to the one being processed. As texture coordinates are floating point numbers between 0.0 and 1.0, we have to normalise the pixel distance (division by WIDTH - 1) to convert from our language of camera width and height.
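That normalisation is a one-liner, but worth seeing concretely. A hypothetical sketch (BlurSizeSketch and blurSize are my names) of converting a distance in pixels into texture-coordinate units:

```java
// Hypothetical sketch of the blurSize normalisation: texture coordinates run
// 0.0..1.0 regardless of pixel size, so a "one pixel" offset must be expressed
// as a fraction of the texture's width.
public class BlurSizeSketch {

    static float blurSize(float offsetInPixels, float textureWidth) {
        return offsetInPixels / (textureWidth - 1f);
    }

    public static void main(String[] args) {
        // On a tiny 5-pixel-wide texture, a 1-pixel step is a quarter of
        // the 0.0..1.0 coordinate range.
        System.out.println(blurSize(1f, 5f)); // prints 0.25
    }
}
```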

The main method holds the guts of the algorithm. You should be able to line it up with the algorithm as described in the earlier link: it grabs values from a number of pixels away, multiplies each by a proportion of its original value and adds them to the currently processed pixel.

Next come the link() and bind() methods. Hopefully you can read and understand them now, but ask questions if not.

6. But wait, there's more

If you read the Gaussian blur description, you would know that it's a 2-pass algorithm: one pass to blur on the horizontal and another to blur on the vertical. To define the vertical blurring, I've made a new ShaderProgram starting on line 232. It's nearly identical to the first one, but blurs in the vertical direction instead of the horizontal.

Yet, we can only draw a Sprite with a single ShaderProgram -- how do we get two processing passes?

7. Render-to-texture

The answer lies in a common OpenGL pattern: render-to-texture. In this pattern you first render your sprite to a separate texture (instead of the screen), then render that texture to the screen. This way, we can apply a different ShaderProgram each time we render.

AndEngine supports this pattern with its RenderTexture class. On line 89 we initialize two of them. We also create a Sprite containing each texture, and set the ShaderProgram of each to one of our two passes.

On line 143, we draw our Sprite to the first texture. We do this by calling the RenderTexture.begin() method, which needs the current GLState. Optionally, I also clear the texture to transparent before rendering onto it, by passing in the transparent Color.

The game is constructed so that anything drawn to mRenderTexture1 will end up blurred. In this way we can add non-blurred objects to the scene also. If you want to see an example of a blur which does the entire scene, have a look at the RadialBlurExample in the AndEngine examples project. It should be quite understandable after this tutorial.

8. Playing texture ping-pong

If we needed even more passes, we wouldn't need a RenderTexture for each. Just clear each RenderTexture in turn, set its Sprite's ShaderProgram for that pass, and draw it into the other RenderTexture.

Even though it's not necessarily the most efficient way, this example plays ping-pong with mRenderTexture1 and mRenderTexture2 to show how it works.
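The ping-pong bookkeeping can be sketched without any OpenGL at all. This hypothetical illustration (PingPongSketch and finalSource are my names, with strings standing in for the two RenderTextures) shows how the source and target roles swap each pass, and where the result ends up:

```java
// Hypothetical sketch of render-texture ping-pong: each pass draws the
// current source into the current target, then the two swap roles.
public class PingPongSketch {

    // Returns which texture holds the final result after the given passes.
    static String finalSource(int passes) {
        String source = "texture1";
        String target = "texture2";
        for (int pass = 0; pass < passes; pass++) {
            // "Draw" the source into the target with that pass's shader...
            System.out.println("pass " + pass + ": draw " + source + " into " + target);
            // ...then swap roles for the next pass.
            String tmp = source;
            source = target;
            target = tmp;
        }
        return source;
    }

    public static void main(String[] args) {
        // Two blur passes: horizontal into texture2, vertical back into texture1.
        System.out.println("result in " + finalSource(2)); // prints "result in texture1"
    }
}
```

After an even number of passes the result is back in the first texture's slot, which is why the tutorial's final draw uses mRenderTextureSprite1.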

Let's look closer at the overridden onDrawFrame() method of the engine on line 65.

Firstly, before we draw any frame, we make sure that the required RenderTextures have been initialized.

On line 73 we just draw our frame as per normal. What we are expecting, however, is that some of our Sprites would have rendered to mRenderTexture1 instead of to the screen -- the texture that holds all the objects we wish to blur.

With mRenderTexture1 (and therefore mRenderTextureSprite1) populated with stuff to be blurred, we begin() mRenderTexture2, filling it with transparent pixels as we do so. We then draw mRenderTextureSprite1 -- remembering it will do this with the GaussianBlurPass1ShaderProgram -- and the result ends up on mRenderTexture2.

Finally, starting from line 81, we render the Sprite connected to mRenderTexture2 to the screen. There's a little bit of boiler plate around this to ensure that mRenderTextureSprite2 is projected onto the screen correctly by the model view projection matrix, but this is generally code that is the same in all instances.

If you want to see another example of playing RenderTexture ping-pong, have a look at the MotionStreakExample in the AndEngine examples project.

9. And we're done, but trouble in paradise

The rest of the code is boilerplate, and should be fully reproducible by anyone who already knows how to use AndEngine.

Start it up and you should have a nicely blurry Mona Lisa smiling back at you.

You can uncomment line 152 in order to make her spin around. One thing you'll likely notice is a very low frame rate: 10fps or so on my nice new (and powerful) Galaxy Nexus.

Next time (hopefully) I'll be able to present a way to make this blur usable in real-time with a decent frame rate.

EDIT: I have found that this code pulls a much more respectable 30fps on the much less respectable Huawei Ideos X5. It is lower resolution screen to drive, but also a much lesser graphics chip. I would be really interested to hear anyone else's experience with their device.

EDIT AGAIN: The Samsung Galaxy S2 can run this at a full 60 fps. The problem with the Galaxy Nexus is a GPU that isn't quite up to the task of feeding its 720p screen: the fill rate of the PowerVR GPU on the Galaxy Nexus is below what is needed to do this full-screen blur in real time. I've got some optimisations that should make it more acceptable, so next time I'll go through that.

10. Addendum

Standard disclaimer: I'm a newbie at this too. I'll fix anything that's wrong, just tell me.
Attachment: the all-important source code (2.64 KiB)

Re: Getting started with Shaders

Postby ltmon » Thu Feb 02, 2012 12:16 pm

Well I'm back again, hopefully some of you are still following on.

This entry of the tutorial will be a little bit of hard slog, with some gory technical details so hang in there -- it's worth it because you'll learn some techniques for making fragment shaders perform well on mobile devices.

Get out the code from the previous entry, because we'll be using that once more.

1. But why was it slow?

Our performance problem comes about because HD phones and tablets are now widely available. This new generation has fantastic screens, but not yet the GPUs to really drive them -- not well enough for our purposes, anyway. These devices are expected to push more than 2.5 times as many pixels as the previous generation, yet have GPUs that are really not that much better.

This situation is likely to continue for a while, with maybe only Tegra 3 holding enough grunt to run the described fullscreen Gaussian blur at qHD or HD resolution at a high framerate.

2. Downsampling

The first trick we're going to implement is to downsample our textures. What this means is:

  • Make the RenderTextures smaller
  • Render our Gaussian blur
  • Put the texture back on the screen rescaled back to full resolution

The outcome of this should be obvious: we are losing resolution, but processing fewer pixels. This process is very useful for blur algorithms because the blurring itself tends to hide many artifacts produced by the downsampling.
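The pixel savings are easy to quantify. A hypothetical arithmetic sketch (DownsampleMath and pixelsProcessed are my names; 1280x720 is just an example surface size):

```java
// Hypothetical sketch of the downsampling arithmetic: how many pixels a
// blur pass touches at a given downsample ratio.
public class DownsampleMath {

    static int pixelsProcessed(int width, int height, int ratio) {
        return (width / ratio) * (height / ratio);
    }

    public static void main(String[] args) {
        int full = pixelsProcessed(1280, 720, 1);
        int half = pixelsProcessed(1280, 720, 2);
        // Halving each dimension quarters the fragment-shader work.
        System.out.println(full / half); // prints 4
    }
}
```

A ratio of 2 quarters the per-pass fragment work, which is where the FPS jump described below comes from.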

To do this we first halve the size of the RenderTextures we are working with. Let's look at the initRenderTexture function.

```java
private void initRenderTexture(GLState pGLState) {
    mRenderTexture1 = new RenderTexture(mCamera.getSurfaceWidth() / DOWNSAMPLE_RATIO, mCamera.getSurfaceHeight() / DOWNSAMPLE_RATIO, PixelFormat.RGBA_4444);
    mRenderTexture1.init(pGLState);
    mRenderTextureSprite1 = new UncoloredSprite(0f, 0f, TextureRegionFactory.extractFromTexture(mRenderTexture1), getVertexBufferObjectManager());
    mRenderTextureSprite1.setShaderProgram(GaussianBlurPass1ShaderProgram.getInstance());

    mRenderTexture2 = new RenderTexture(mCamera.getSurfaceWidth() / DOWNSAMPLE_RATIO, mCamera.getSurfaceHeight() / DOWNSAMPLE_RATIO, PixelFormat.RGBA_4444);
    mRenderTexture2.init(pGLState);
    mRenderTextureSprite2 = new UncoloredSprite(0f, 0f, TextureRegionFactory.extractFromTexture(mRenderTexture2), getVertexBufferObjectManager());
    mRenderTextureSprite2.setShaderProgram(GaussianBlurPass2ShaderProgram.getInstance());
}
```

You'll see that I'm dividing the width and height of the two RenderTextures by a statically defined DOWNSAMPLE_RATIO. You'll have to add this integer constant to your code. 2 is a good value to start with, but you can try higher values and see the visual changes they produce.

Next, let's modify the main onDrawFrame method, starting at line 75:

```java
mRenderTexture2.begin(pGLState, false, true, Color.TRANSPARENT);
{
    mRenderTextureSprite1.onDraw(pGLState, mCamera);
}
mRenderTexture2.end(pGLState);

mRenderTexture1.begin(pGLState, false, true, Color.TRANSPARENT);
{
    mRenderTextureSprite2.onDraw(pGLState, mCamera);
}
mRenderTexture1.end(pGLState);

mRenderTextureSprite1.setShaderProgram(PositionTextureCoordinatesShaderProgram.getInstance());

pGLState.pushProjectionGLMatrix();
pGLState.orthoProjectionGLMatrixf(0, mCamera.getSurfaceWidth() / DOWNSAMPLE_RATIO, 0, mCamera.getSurfaceHeight() / DOWNSAMPLE_RATIO, -1, 1);
{
    mRenderTextureSprite1.onDraw(pGLState, mCamera);
}
pGLState.popProjectionGLMatrix();

mRenderTextureSprite1.setShaderProgram(GaussianBlurPass1ShaderProgram.getInstance());
```

What you'll notice here is that we are doing an extra round of ping-pong compared to the previous iteration. Because the final stage is drawing to screen at full resolution -- not downsampled -- we need to complete the drawing of the blur on our smaller RenderTextures in order to reap the full performance benefit before going to the screen.

Another change is to factor our DOWNSAMPLE_RATIO into the orthoProjectionGLMatrixf call, so that the final draw is at full resolution rather than at the resolution of our scaled-down textures.

Finally, we set the first render texture back to the default shader program in order to have it ready for a new frame.

By setting DOWNSAMPLE_RATIO to 2 we are already rendering only a quarter of the pixels we were previously, and will gain a large jump in FPS from that alone. You can try it now, before moving on to the next change.
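As quick arithmetic (the 1280x720 surface size below is just an illustrative assumption, not anything from AndEngine):

```java
// Back-of-the-envelope check: halving each dimension means each blur pass
// touches only a quarter of the fragments.
public class DownsampleMath {
    // Fragments processed per pass at a given downsample ratio.
    static int fragments(int w, int h, int ratio) {
        return (w / ratio) * (h / ratio);
    }

    public static void main(String[] args) {
        // Hypothetical 1280x720 surface: ratio 2 gives a 4x reduction in work.
        System.out.println(fragments(1280, 720, 1) / fragments(1280, 720, 2)); // prints 4
    }
}
```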

3. Linear sampling

The default fragment shader doesn't suffer from slowdowns, so what is it about the Gaussian blur shader that causes such a hit to performance?

The answer lies with pretty much every call in the shader to "texture2D". This call retrieves the color value of a given pixel on a texture, and that is a slow operation, especially on mobile GPUs. The only exception is the current pixel, which is prefetched by the GPU and as such is essentially free.

So in our shaders we are fetching 8 pixel values that are expensive and one that is very cheap. The default shader only needs to fetch the cheap current pixel, which is why it's so fast compared to ours.

Texture fetches whose coordinates are computed inside the fragment shader, rather than passed straight through from a varying -- which is what our offset lookups do -- are known as "dependent texture reads", and it's wise to avoid them as much as possible in any shader.

So if we reduce the number of calls to texture2D we'll get a performance win. One option is simply to reduce the number of taps in our Gaussian blur from 9 down to 7 or even 5. This is certainly worth considering, as you may not notice the difference in quality.

We also have a better option: linear sampling. For the full gory details, have a read of: ... -sampling/

The synopsis is that the interpolation functionality of our graphics card doesn't just work from pixel to pixel, but also between pixels. The graphics card can just as easily fetch the color value of pixel (0, 1.5) as it can pixel (0,1). The (0,1.5) value will be an interpolated value midway between (0,1) and (0,2).

Using this knowledge we can construct a blur that uses fewer samples without loss of quality by looking in between the pixels and essentially gaining knowledge about the colors of 2 adjacent pixels from a single call to texture2d.
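To see where the magic numbers in the shader below come from, here's a small standalone sketch (plain Java, nothing AndEngine-specific; the weight table is the standard one-sided 9-tap Gaussian from the linked article). Merging two adjacent discrete taps into one linear sample preserves the total weight and the weighted average position:

```java
// Sketch: derive linear-sampling weights/offsets from discrete 9-tap weights.
public class LinearSampling {
    // One-sided discrete weights for a 9-tap Gaussian: center, then offsets 1..4
    // (values from the linked linear-sampling article).
    static final double[] W = {0.2270270270, 0.1945945946, 0.1216216216,
                               0.0540540541, 0.0162162162};

    // Combined weight of two adjacent taps at offsets o and o+1.
    static double mergedWeight(int o) {
        return W[o] + W[o + 1];
    }

    // Sampling position between the two taps, weighted by their contributions.
    static double mergedOffset(int o) {
        return (o * W[o] + (o + 1) * W[o + 1]) / mergedWeight(o);
    }

    public static void main(String[] args) {
        // Taps at offsets 1 and 2 collapse into one sample at ~1.3846153846
        System.out.printf("%.10f %.10f%n", mergedOffset(1), mergedWeight(1));
        // Taps at offsets 3 and 4 collapse into one sample at ~3.2307692308
        System.out.printf("%.10f %.10f%n", mergedOffset(3), mergedWeight(3));
    }
}
```

The merged values reproduce exactly the 1.3846153846 / 0.3162162162 and 3.2307692308 / 0.0702702703 constants used in the shader.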

Here's a horizontal fragment shader made with this method:

public static final String FRAGMENTSHADER =
        "precision lowp float;\n" +
        "uniform lowp sampler2D " + ShaderProgramConstants.UNIFORM_TEXTURE_0 + ";\n" +
        "varying mediump vec2 " + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ";\n" +
        "const float blurSize = 2.0/" + (WIDTH - 1) + ".0;\n" +
        "void main()\n" +
        "{\n" +
        "       vec4 sum = vec4(0.0);\n" +
        "       sum += texture2D(" + ShaderProgramConstants.UNIFORM_TEXTURE_0 + ", vec2(" + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".x - 3.2307692308*blurSize, " + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".y)) * 0.0702702703;\n" +
        "       sum += texture2D(" + ShaderProgramConstants.UNIFORM_TEXTURE_0 + ", vec2(" + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".x - 1.3846153846*blurSize, " + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".y)) * 0.3162162162;\n" +
        "       sum += texture2D(" + ShaderProgramConstants.UNIFORM_TEXTURE_0 + ", vec2(" + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".x, " + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".y)) * 0.2270270270;\n" +
        "       sum += texture2D(" + ShaderProgramConstants.UNIFORM_TEXTURE_0 + ", vec2(" + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".x + 1.3846153846*blurSize, " + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".y)) * 0.3162162162;\n" +
        "       sum += texture2D(" + ShaderProgramConstants.UNIFORM_TEXTURE_0 + ", vec2(" + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".x + 3.2307692308*blurSize, " + ShaderProgramConstants.VARYING_TEXTURECOORDINATES + ".y)) * 0.0702702703;\n" +
        "       gl_FragColor = sum;\n" +
        "}\n";

4 dependent texture reads instead of 8 means we should nearly double our speed of execution. The static weights and offsets used here were drawn from the linked article -- that article also describes how you can derive your own weights and offsets for your own implementation if you need different steps or sigma values.

Armed with this you should also be able to modify the vertical pass fragment shader to do the same.
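As a sketch, the vertical pass might look like the following. To keep it self-contained, the names u_texture_0 and v_textureCoordinates are placeholders standing in for AndEngine's ShaderProgramConstants values, and HEIGHT is a hypothetical stand-in for your render texture height -- it's simply the horizontal shader with the blur offsets moved from x to y:

```java
// Sketch only: TEX, COORDS and HEIGHT are placeholders for the
// ShaderProgramConstants values and the texture height used in the tutorial.
public class VerticalBlurShaderSketch {
    static final String TEX = "u_texture_0";             // placeholder uniform name
    static final String COORDS = "v_textureCoordinates"; // placeholder varying name
    static final int HEIGHT = 480;                       // hypothetical texture height

    public static final String FRAGMENTSHADER =
        "precision lowp float;\n" +
        "uniform lowp sampler2D " + TEX + ";\n" +
        "varying mediump vec2 " + COORDS + ";\n" +
        "const float blurSize = 2.0/" + (HEIGHT - 1) + ".0;\n" +
        "void main()\n" +
        "{\n" +
        "   vec4 sum = vec4(0.0);\n" +
        // Same linear-sampling offsets and weights, applied to y instead of x.
        "   sum += texture2D(" + TEX + ", vec2(" + COORDS + ".x, " + COORDS + ".y - 3.2307692308*blurSize)) * 0.0702702703;\n" +
        "   sum += texture2D(" + TEX + ", vec2(" + COORDS + ".x, " + COORDS + ".y - 1.3846153846*blurSize)) * 0.3162162162;\n" +
        "   sum += texture2D(" + TEX + ", " + COORDS + ") * 0.2270270270;\n" +
        "   sum += texture2D(" + TEX + ", vec2(" + COORDS + ".x, " + COORDS + ".y + 1.3846153846*blurSize)) * 0.3162162162;\n" +
        "   sum += texture2D(" + TEX + ", vec2(" + COORDS + ".x, " + COORDS + ".y + 3.2307692308*blurSize)) * 0.0702702703;\n" +
        "   gl_FragColor = sum;\n" +
        "}\n";
}
```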

4. Results

With a 2x downsample and the linear sampling of pixels the full screen Gaussian blur shader now runs at 60fps solid on a Galaxy Nexus -- a huge jump from our previous 10 fps. Personally, I can only just pick the quality difference introduced by the downsample.

The Galaxy Nexus is almost the worst-case modern Android device for fragment shaders due to its high resolution and low fill rate, so being able to support it means that most GLES 2.0 phones should run this code quite well.

If your game is particularly fast moving, you could even downsample a whole lot further without anyone noticing the difference.

5. Other things to do

There are of course many other optimisations to be made, some of which will be worthwhile or not depending on your own needs. Here are a few to think about:

  • It might seem obvious, but why not process your textures offline and ship a second sprite? Or process only on the first frame and keep that texture around? Make sure you actually need dynamic effects before you use them.
  • Only process the pixels you need to. If only one small Sprite needs blurring, use a RenderTexture only the size of that Sprite and process only that, rather than the full screen.
  • Modify the fragment shader to simply set all transparent pixels to vec4(0.0). This means there will be no soft edges on your blurred object, but that could be exactly what you want.
  • Use compressed textures and mipmaps. This won't help much when using RenderTextures, but on single pass implementations having a proper compressed texture will do wonders for performance. This functionality is in AndEngine now and should be used for this among many other reasons.
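As a sketch of the transparent-pixel idea from the list above, the change is just a guard at the end of the blur's main(). The GLSL here is illustrative only, with u_texture_0 and v_textureCoordinates standing in for the ShaderProgramConstants values used elsewhere in this tutorial:

```java
// Illustrative GLSL tail for the "no soft edges" variant: after accumulating
// the blur, zero out fragments whose source pixel was fully transparent.
public class HardEdgeBlurSketch {
    public static final String MAIN_TAIL =
        "   if (texture2D(u_texture_0, v_textureCoordinates).a == 0.0) {\n" +
        "       sum = vec4(0.0);        // keep the silhouette crisp\n" +
        "   }\n" +
        "   gl_FragColor = sum;\n";
}
```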

6. Addendum

Kittehface, creator of fantastic live wallpapers, was kind enough to answer a couple of questions to help me out with this. Go buy his stuff on the market!
Last edited by ltmon on Fri Feb 03, 2012 4:05 am, edited 1 time in total.
Posts: 41
Joined: Wed May 25, 2011 12:54 am

Re: Ongoing guide: Getting started with Shaders in AndEngine

Postby Niffy » Fri Feb 03, 2012 1:31 am

I've yet to read it in depth, but a quick read-through suggests it will be a great help.
I've bookmarked it; it will come in handy in the future. This should be pinned, because it introduces us to a new feature of GLES2.
Posts: 284
Joined: Sat Sep 17, 2011 8:39 pm

Re: Ongoing guide: Getting started with Shaders in AndEngine

Postby makers-f » Mon Feb 13, 2012 6:27 pm

Bookmarked it too! Really nice article, you put a lot of effort into it! Thanks a lot!
Posts: 36
Joined: Sat Sep 17, 2011 6:11 pm

Re: Ongoing guide: Getting started with Shaders in AndEngine

Postby pep_dj » Tue Feb 14, 2012 10:09 am

Thanks for your tutorials! I'll try to make my own shaders ;)

Any new content for this tutorial will be appreciated :)
Posts: 170
Joined: Fri Nov 12, 2010 9:05 pm

Re: Ongoing guide: Getting started with Shaders in AndEngine

Postby ltmon » Tue Feb 14, 2012 1:04 pm

pep_dj wrote:Any new content for this tutorial will be appreciated :)

I've got some prepped, but need to find time to clean it up and finish it off. Will be here soon I hope.
Posts: 41
Joined: Wed May 25, 2011 12:54 am

Re: Ongoing guide: Getting started with Shaders in AndEngine

Postby pep_dj » Tue Feb 14, 2012 1:07 pm

ltmon wrote:
pep_dj wrote:Any new content for this tutorial will be appreciated :)

I've got some prepped, but need to find time to clean it up and finish it off. Will be here soon I hope.

Thanks ltmon, I'll be waiting for it. They're so useful :!: :)
Posts: 170
Joined: Fri Nov 12, 2010 9:05 pm

