Research at Stealthy Labs

Protecting Software Assets Using OpenGL


Synopsis

Suppose you are an independent game developer. You are facing piracy and fake copies of your game, and you do not have the legal or economic power to handle the problem. You want to continue making games without getting discouraged by pirates, who most likely reside in other countries. What do you do? How do you prevent, or at least reduce the incentive for, pirating your game through reverse engineering?

Maybe you could encrypt your game assets, such as textures, shaders and images, to thwart piracy and copycat efforts? You could use a standard encryption library like OpenSSL, but that still leaves the decrypted data in CPU memory, where anyone running a debugger on your software can read it.

What if you could use OpenGL to do the decryption, leave the data in a framebuffer object, and render it from there using OpenGL itself? Then you would never have to copy the decrypted data from GPU memory into CPU memory! As of this writing, debugging tools for OpenGL are far less capable than CPU debuggers, and we know of no reverse engineering tools for OpenGL. Let's try it!


Introducing TeaTime

We are going to demonstrate how to encrypt and decrypt data on the GPU using OpenGL and the OpenGL Shading Language (GLSL). To simplify the proof of concept, we have chosen the Tiny Encryption Algorithm (TEA) for the cryptography. The advantage of TEA over other algorithms is twofold:

  • it is computationally lightweight, and
  • if you want to use it on mobile, it will not drain your battery as much as something like AES-256, which is compute intensive.

You could even use a simple XOR operation instead of TEA, but that's too easy to break. We have called this proof-of-concept software TeaTime.
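
To see why a plain XOR is too easy to break, consider a known-plaintext attack. Many asset formats start with fixed bytes (a PNG file always begins with the signature 89 50 4E 47 0D 0A 1A 0A), so XORing the start of the ciphertext with that known header gives back the key. The sketch below is illustrative only; the function names xor_cipher and xor_recover_key are hypothetical and are not part of TeaTime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR `len` bytes of `src` with a repeating key and write them to `dst`.
 * The same call both "encrypts" and "decrypts". */
static void xor_cipher(const uint8_t *src, uint8_t *dst, size_t len,
                       const uint8_t *key, size_t keylen)
{
    assert(keylen > 0);
    for (size_t i = 0; i < len; ++i)
        dst[i] = src[i] ^ key[i % keylen];
}

/* Known-plaintext attack: if the first `keylen` bytes of the plaintext
 * are known (for example a file-format signature), XORing them with the
 * ciphertext yields the key directly. */
static void xor_recover_key(const uint8_t *cipher, const uint8_t *known,
                            uint8_t *key_out, size_t keylen)
{
    for (size_t i = 0; i < keylen; ++i)
        key_out[i] = cipher[i] ^ known[i];
}
```

An attacker who sees the encrypted file and guesses that it is a PNG can call xor_recover_key() with the PNG signature and then decrypt the whole file with xor_cipher(). No debugger is needed, which is why the post uses a real block cipher instead.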

NOTE: This blog post:

  • is not for demonstrating the merits of TEA.
  • is not a tutorial on OpenGL.
  • is not a tutorial on cryptography and encryption algorithms.
  • only demonstrates a proof of concept.

Here is a simple method of implementing this idea:

  1. Create a new framebuffer in addition to the default framebuffer.
  2. Create two textures in the framebuffer.
  3. Load your encrypted data into the first texture.
  4. Decrypt the texture using the TEA shader code, written in GLSL, with the output written into the second texture.
  5. Directly operate on the second texture and transfer it to the default framebuffer or do something else with it.

That's it! All of these are ordinary OpenGL operations that you already use. You are already rendering to multiple textures and framebuffers, so why not use the same rendering concept to encrypt and decrypt data? That is exactly what we have done with TeaTime, and we have abstracted it into a high-level API for you to use.
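
Step 3 implies a layout decision: how an array of 32-bit words maps onto texels. The following is a minimal sketch of one possible layout, assuming four words per RGBA texel (which matches the uvec4 that the decryption shader reads) and a square, zero-padded texture. The helper names are hypothetical, and TeaTime's own layout may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_TEXEL 4u /* one RGBA texel with 32-bit unsigned channels */

/* Number of texels needed to hold `nwords` 32-bit words, rounded up. */
static size_t texels_for_words(size_t nwords)
{
    return (nwords + WORDS_PER_TEXEL - 1) / WORDS_PER_TEXEL;
}

/* Smallest side length of a square texture with at least `ntexels` texels. */
static size_t square_side(size_t ntexels)
{
    size_t side = 0;
    while (side * side < ntexels)
        ++side;
    return side;
}

/* Copy `nwords` words into a zero-padded buffer covering side * side
 * texels. `dst` must have room for side * side * WORDS_PER_TEXEL words. */
static void pack_words(const uint32_t *src, size_t nwords,
                       uint32_t *dst, size_t side)
{
    size_t capacity = side * side * WORDS_PER_TEXEL;
    assert(nwords <= capacity);
    for (size_t i = 0; i < capacity; ++i)
        dst[i] = (i < nwords) ? src[i] : 0u;
}
```

With this layout, the 64-word input used in the demo would occupy 16 texels, that is, a 4 x 4 texture. A buffer packed this way could then be uploaded with glTexImage2D() using an unsigned-integer internal format such as GL_RGBA32UI.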


Description

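The source code is available on GitHub and has been tested on Debian 7.0 Linux with the AMD/ATI fglrx driver. You will need the GLUT, GLEW and OpenGL libraries installed to run it. The minimum required OpenGL version is 3.0. Since it is written in C and uses only standard OpenGL calls, we see no reason why it should not work on Windows and Mac OS X as well. iOS and Android use OpenGL ES rather than desktop OpenGL, so a port there would need OpenGL ES 3.0 or later for integer textures and bitwise shader operations, plus replacements for GLUT and GLEW.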

Our test example displays a solid teapot in the window while encrypting and decrypting sample data in the background.

If you look at the source code, the encryption part is placed in a function called teatime_demo(). The function is reproduced here, and the comments explain what is happening.


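#include <errno.h>   /* ENOMEM */
#include <stdint.h>  /* uint32_t */
#include <stdio.h>   /* printf */
#include "teatime.h" /* the teatime_* API */

/* TEA_cpu_encrypt() and TEA_cpu_decrypt() are defined in teapot.c */

#define INPUT_SZ 64
#define TEA_ROUNDS 32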

void teatime_demo()
{
    int rc = 0;
    teatime_t *tea = NULL;
    do {
        uint32_t ilen = INPUT_SZ;
        uint32_t input[INPUT_SZ];
        uint32_t olen = INPUT_SZ;
        uint32_t output[INPUT_SZ];
        uint32_t elen = INPUT_SZ;
        uint32_t expected[INPUT_SZ];
        /* the encryption key */
        uint32_t ikey[4] = { 0xDEADBEEF, 0xCAFEFACE,
                             0xFACEB00C, 0xF00D1337 };
        uint32_t rounds = TEA_ROUNDS;
        for (uint32_t i = 0; i < ilen; ++i)
            input[i] = (i + 1) * 5;
        for (uint32_t i = 0; i < olen; ++i)
            output[i] = 0;
        for (uint32_t i = 0; i < elen; ++i)
            expected[i] = 0;
        /* create the framebuffer here */
        tea = teatime_setup();
        if (!tea) {
            rc = -ENOMEM;
            break;
        }
        teatime_print_version(stdout);
        /* you need to set the viewport for the framebuffer */
        rc = teatime_set_viewport(tea, ilen);
        if (rc < 0)
            break;
        /* Perform encryption */
        /* Create the texture with the input data */
        rc = teatime_create_textures(tea, input, ilen);
        if (rc < 0)
            break;
        /* here you are creating the shader program for encryption */
        /* the teatime_encrypt_source() returns a null-terminated string
         * that has the source code for the TEA shader */
        rc = teatime_create_program(tea, teatime_encrypt_source());
        if (rc < 0)
            break;
        /* Provide the number of TEA rounds to run and the key to
         * encrypt with */
        rc = teatime_run_program(tea, ikey, rounds);
        if (rc < 0)
            break;
        /* Since we want to verify the output, we read it back into CPU
         * memory. Ideally you would just reuse the output texture through
         * the tea->otexid variable. */
        rc = teatime_read_textures(tea, output, olen);
        if (rc < 0)
            break;
        /* Verify the output by doing CPU encryption and checking */
        for (uint32_t i = 0; i < olen && i < ilen; i += 2) {
            TEA_cpu_encrypt(&input[i], ikey, &expected[i], rounds);
            printf("%u. Encrypting Input = %08x Output = %08x Expected = %08x\n", i,
                    input[i], output[i], expected[i]);
            printf("%u. Encrypting Input = %08x Output = %08x Expected = %08x\n", i + 1,
                    input[i + 1], output[i + 1], expected[i + 1]);
        }
        for (uint32_t i = 0; i < elen; ++i)
            expected[i] = 0;
        for (uint32_t i = 0; i < olen && i < elen; i += 2) {
            TEA_cpu_decrypt(&output[i], ikey, &expected[i], rounds);
            printf("%u. Decrypting Input = %08x Output = %08x Expected = %08x\n", i, output[i],
                    expected[i], input[i]);
            printf("%u. Decrypting Input = %08x Output = %08x Expected = %08x\n", i + 1,
                    output[i + 1], expected[i + 1], input[i + 1]);
        }
        /* free up memory for now */
        teatime_delete_textures(tea);
        teatime_delete_program(tea);
        /* Perform decryption */
        for (uint32_t i = 0; i < elen; ++i)
            expected[i] = 0;
        /* the input for decryption is always the output of the encryption */
        rc = teatime_create_textures(tea, output, olen);
        if (rc < 0)
            break;
        /* here you are creating the shader program for decryption */
        /* the teatime_decrypt_source() returns a null-terminated string
         * that has the source code for the TEA shader */
        rc = teatime_create_program(tea, teatime_decrypt_source());
        if (rc < 0)
            break;
        rc = teatime_run_program(tea, ikey, rounds);
        if (rc < 0)
            break;
        rc = teatime_read_textures(tea, expected, elen);
        if (rc < 0)
            break;
        /* Compare the GPU-decrypted data with the original input */
        for (uint32_t i = 0; i < olen && i < elen; i++) {
            printf("%u. Decrypting Input = %08x Output = %08x Expected = %08x\n", i, output[i],
                    expected[i], input[i]);
        }
        teatime_delete_textures(tea);
        teatime_delete_program(tea);
    } while (0);
    teatime_cleanup(tea);
}
            

As you can see, the demo runs both encryption and decryption on the GPU and prints the results next to those of the CPU implementations of TEA so that they can be checked. The CPU implementations of TEA are in the file teapot.c, and the OpenGL implementations of all the teatime_* functions are in the files teatime.h and teatime.c on GitHub.
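
For readers who want to check GPU output without opening teapot.c, here is a minimal sketch of a CPU reference for TEA, written to match the key usage in the decryption shader. The names tea_encrypt_block, tea_decrypt_block and count_mismatches are hypothetical; they are not the TEA_cpu_* functions from the repository, whose exact implementation may differ. Unlike the demo, which only prints values, count_mismatches() turns the comparison into a number you can test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TEA_DELTA 0x9e3779b9u

/* Encrypt one 64-bit block (two 32-bit words) with a 128-bit key. */
static void tea_encrypt_block(const uint32_t in[2], const uint32_t key[4],
                              uint32_t out[2], uint32_t rounds)
{
    uint32_t v0 = in[0], v1 = in[1], sum = 0;
    for (uint32_t i = 0; i < rounds; ++i) {
        sum += TEA_DELTA;
        v0 += ((v1 << 4) + key[0]) ^ (v1 + sum) ^ ((v1 >> 5) + key[1]);
        v1 += ((v0 << 4) + key[2]) ^ (v0 + sum) ^ ((v0 >> 5) + key[3]);
    }
    out[0] = v0;
    out[1] = v1;
}

/* Decrypt one block by running the rounds of tea_encrypt_block backwards. */
static void tea_decrypt_block(const uint32_t in[2], const uint32_t key[4],
                              uint32_t out[2], uint32_t rounds)
{
    uint32_t v0 = in[0], v1 = in[1], sum = TEA_DELTA * rounds;
    for (uint32_t i = 0; i < rounds; ++i) {
        v1 -= ((v0 << 4) + key[2]) ^ (v0 + sum) ^ ((v0 >> 5) + key[3]);
        v0 -= ((v1 << 4) + key[0]) ^ (v1 + sum) ^ ((v1 >> 5) + key[1]);
        sum -= TEA_DELTA;
    }
    out[0] = v0;
    out[1] = v1;
}

/* Encrypt `input` on the CPU and count the words that differ from the
 * buffer read back from the GPU. Zero means the two agree. */
static size_t count_mismatches(const uint32_t *gpu_out, const uint32_t *input,
                               const uint32_t key[4], size_t nwords,
                               uint32_t rounds)
{
    size_t bad = 0;
    assert(nwords % 2 == 0); /* TEA works on pairs of words */
    for (size_t i = 0; i < nwords; i += 2) {
        uint32_t expected[2];
        tea_encrypt_block(&input[i], key, expected, rounds);
        bad += (gpu_out[i] != expected[0]);
        bad += (gpu_out[i + 1] != expected[1]);
    }
    return bad;
}
```

In the demo, a call such as count_mismatches(output, input, ikey, olen, rounds) after teatime_read_textures() would report how many words of the GPU result disagree with the CPU result.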

The variable ikey is the encryption key in the above code. Here it is a constant, but in practice you may want to derive it dynamically, for example from a user's email address, phone number or device identifier. Explaining key escrow and key management is beyond the scope of this blog post. Nevertheless, here are some suggestions:

  • Use a unique key per asset you want to encrypt.
  • Try to use a set of unique keys per user of the product.
  • The keys should never be held in CPU memory.
  • Embed the key or set of keys in an encrypted asset. This will allow you to load the set of keys into texture memory and not in the CPU. Then you can decrypt the rest of the assets also on the GPU using methods like TeaTime, and directly display them on screen.
  • Use a well-established key exchange scheme such as Diffie-Hellman to exchange the key that will decrypt the set of unique keys onto the GPU.

Not every suggestion above may be feasible in your situation, but each of them is doable. You as a developer can make that choice.

This code is fully reusable, and the GLSL shader code is a string that your OpenGL driver compiles at runtime. This allows you, the developer, to use different encryption and decryption algorithms for different assets through the same API! We have written the TEA algorithm for you, but you could swap in Extended TEA (XTEA) or, if you are up for a challenge, even AES-128 or AES-256. The GPU is fully capable of running those operations as long as enough GPU memory is available. If you are running an online game service, you could also send the shader code over the internet using HTTPS or another encrypted channel.
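
As a starting point for such a swap, here is a minimal CPU sketch of the XTEA round function in C. It is not part of TeaTime, and the names are hypothetical. Compared with TEA, the key word used in each half-round is selected by the running sum. Porting the loop body to a GLSL shader is left as an exercise; the CPU version can then serve as the reference to verify against.

```c
#include <assert.h>
#include <stdint.h>

#define XTEA_DELTA 0x9e3779b9u

/* Encrypt one 64-bit block (two 32-bit words) with a 128-bit key. */
static void xtea_encrypt_block(const uint32_t in[2], const uint32_t key[4],
                               uint32_t out[2], uint32_t rounds)
{
    uint32_t v0 = in[0], v1 = in[1], sum = 0;
    for (uint32_t i = 0; i < rounds; ++i) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += XTEA_DELTA;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    out[0] = v0;
    out[1] = v1;
}

/* Decrypt one block by undoing the half-rounds in reverse order. */
static void xtea_decrypt_block(const uint32_t in[2], const uint32_t key[4],
                               uint32_t out[2], uint32_t rounds)
{
    uint32_t v0 = in[0], v1 = in[1], sum = XTEA_DELTA * rounds;
    for (uint32_t i = 0; i < rounds; ++i) {
        v1 -= (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
        sum -= XTEA_DELTA;
        v0 -= (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
    }
    out[0] = v0;
    out[1] = v1;
}

/* Small self-check: a block must survive an encrypt/decrypt round trip. */
static void xtea_self_test(const uint32_t key[4])
{
    const uint32_t plain[2] = { 5u, 10u };
    uint32_t enc[2], dec[2];
    xtea_encrypt_block(plain, key, enc, 32u);
    xtea_decrypt_block(enc, key, dec, 32u);
    assert(dec[0] == plain[0] && dec[1] == plain[1]);
}
```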

This is what the shader code for decryption looks like.


#version 130
#extension GL_EXT_gpu_shader4 : enable
uniform usampler2D idata;
uniform uvec4 ikey;
uniform uint rounds;
out uvec4 odata;
void main(void) {
    uvec4 x = texture(idata, gl_TexCoord[0].st);
    /* GLSL 1.30 has no implicit int-to-uint conversion, so unsigned
     * literals need the 'u' suffix. */
    uint delta = 0x9e3779b9u;
    uint sum = delta * rounds;
    /* Each texel holds two 64-bit TEA blocks: (x[0], x[1]) and (x[2], x[3]). */
    for (uint i = 0u; i < rounds; ++i) {
        x[1] -= (((x[0] << 4) + ikey[2]) ^ (x[0] + sum)) ^ ((x[0] >> 5) + ikey[3]);
        x[0] -= (((x[1] << 4) + ikey[0]) ^ (x[1] + sum)) ^ ((x[1] >> 5) + ikey[1]);
        x[3] -= (((x[2] << 4) + ikey[2]) ^ (x[2] + sum)) ^ ((x[2] >> 5) + ikey[3]);
        x[2] -= (((x[3] << 4) + ikey[0]) ^ (x[3] + sum)) ^ ((x[3] >> 5) + ikey[1]);
        sum -= delta;
    }
    odata = x;
}
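
The shader treats each texel as two independent TEA blocks. To make that lane mapping easier to follow, here is a C sketch that mirrors the body of main() on a four-word array, together with the inverse operation. The encryption shader is not reproduced in this post, so texel_encrypt() below is only an assumption about its shape, derived by reversing the decryption steps; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TEA_DELTA 0x9e3779b9u

/* Mirror of the decryption shader: x[0..1] and x[2..3] are two separate
 * 64-bit blocks that share the same key and the same running sum. */
static void texel_decrypt(uint32_t x[4], const uint32_t ikey[4],
                          uint32_t rounds)
{
    uint32_t sum = TEA_DELTA * rounds;
    for (uint32_t i = 0; i < rounds; ++i) {
        x[1] -= (((x[0] << 4) + ikey[2]) ^ (x[0] + sum)) ^ ((x[0] >> 5) + ikey[3]);
        x[0] -= (((x[1] << 4) + ikey[0]) ^ (x[1] + sum)) ^ ((x[1] >> 5) + ikey[1]);
        x[3] -= (((x[2] << 4) + ikey[2]) ^ (x[2] + sum)) ^ ((x[2] >> 5) + ikey[3]);
        x[2] -= (((x[3] << 4) + ikey[0]) ^ (x[3] + sum)) ^ ((x[3] >> 5) + ikey[1]);
        sum -= TEA_DELTA;
    }
}

/* Assumed counterpart of the encryption shader: the same steps in the
 * opposite order, with the sum counting up instead of down. */
static void texel_encrypt(uint32_t x[4], const uint32_t ikey[4],
                          uint32_t rounds)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < rounds; ++i) {
        sum += TEA_DELTA;
        x[0] += (((x[1] << 4) + ikey[0]) ^ (x[1] + sum)) ^ ((x[1] >> 5) + ikey[1]);
        x[1] += (((x[0] << 4) + ikey[2]) ^ (x[0] + sum)) ^ ((x[0] >> 5) + ikey[3]);
        x[2] += (((x[3] << 4) + ikey[0]) ^ (x[3] + sum)) ^ ((x[3] >> 5) + ikey[1]);
        x[3] += (((x[2] << 4) + ikey[2]) ^ (x[2] + sum)) ^ ((x[2] >> 5) + ikey[3]);
    }
}

/* Self-check: encrypting and then decrypting a texel restores it. */
static void texel_self_test(const uint32_t ikey[4])
{
    uint32_t x[4] = { 5u, 10u, 15u, 20u };
    texel_encrypt(x, ikey, 32u);
    texel_decrypt(x, ikey, 32u);
    assert(x[0] == 5u && x[1] == 10u && x[2] == 15u && x[3] == 20u);
}
```

Because the two blocks in a texel never mix, changing x[2] or x[3] has no effect on the result for x[0] and x[1]. On the GPU the same independence holds across texels, which is what lets every fragment be processed in parallel.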

            

Demonstration

Here is a short video of the demo.
