Buffers

In OpenGL, the CPU-side code is mostly copying data to GPU buffers with code like this.

var vertexVbo uint32
gl.GenBuffers(1, &vertexVbo)
gl.BindBuffer(gl.ARRAY_BUFFER, vertexVbo)
gl.VertexAttribPointer(0, 3, gl.FLOAT, false, 0, nil)
gl.EnableVertexAttribArray(0)
gl.BufferData(gl.ARRAY_BUFFER, 4*len(triangles), gl.Ptr(&triangles[0]), gl.STATIC_DRAW)

In this tutorial we will create a BufferList type that encapsulates storing data on the CPU and copying it to the GPU. This type creates all the buffers a shader needs, so it relies on the Shader type we defined in an earlier tutorial.

Buffer List Type

The following is the basis for our BufferList object.

type BufferList struct {
    Shader      shaders.Shader
    CpuBuffers  [][]float32
    GpuBuffers  []uint32
    TypeSize    []int32
    NumSprites  int
    MaxSprites  int
    RenderOrder int
    Texture     uint32
}

A BufferList stores the data needed for a specific Shader. It is convenient to keep a copy of the shader in the object. The Shader object is lightweight so we don’t need to use a reference.

CpuBuffers is a list of lists that holds our data on the CPU side. For each list in CpuBuffers we have an entry in GpuBuffers. Each GPU buffer entry just needs a uint32 identifier since the actual storage is handled by OpenGL.

Parameters NumSprites and MaxSprites hold the current number and max number of sprites in the buffers respectively. We use MaxSprites to preallocate our buffers. Preallocating is important for performance reasons. Go slices can be dynamically expanded but we generally don’t want to do that while our game is running, because growing a slice may allocate a new backing array and copy the existing contents, which can be slow. Instead, we preallocate a big array during game setup and afterwards we can add data to our buffers really quickly. For performance reasons we also don’t delete data from the buffers. Instead, we write over existing data and use NumSprites to make sure we don’t render expired data.

The downside of using preallocated buffers is that we must have a good estimate of the number of sprites we expect to store. If we underestimate, we risk running out of space. If we overestimate, we waste space.
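To see why growing a slice at runtime is worth avoiding, here is a small standalone sketch (not part of the engine; `countGrowths` is a name made up for this example) that counts how often `append` has to change the capacity of a slice, with and without preallocation.

```go
package main

import "fmt"

// countGrowths appends n values to a slice that starts with the given
// capacity and reports how many times the capacity changed, i.e. how many
// times append had to move the data to a bigger backing array.
func countGrowths(n, startCap int) int {
	s := make([]float32, 0, startCap)
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, float32(i))
		if cap(s) != before {
			growths++
		}
	}
	return growths
}

func main() {
	fmt.Println("growths without preallocation:", countGrowths(10000, 0))
	fmt.Println("growths with preallocation:   ", countGrowths(10000, 10000))
}
```

The exact number of growths in the first case depends on the Go runtime's growth policy, but with a large enough starting capacity it is always zero.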

Parameter RenderOrder is used by the renderer to control the order that BufferList objects are processed. The order of rendering is important if we are rendering transparent sprites or if we are rendering without depth testing enabled. This parameter is only used by the renderer and we won’t need it here.
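The renderer is not shown in this tutorial, so the following is only a guess at how RenderOrder could be used. It assumes that lower values are drawn first and uses a cut-down `layer` type (hypothetical, holding just the relevant field) instead of the full BufferList.

```go
package main

import (
	"fmt"
	"sort"
)

// layer keeps only the field relevant here; it is not the full BufferList.
type layer struct {
	Name        string
	RenderOrder int
}

// sortLayers orders layers so that lower RenderOrder values come first
// (an assumption of this sketch). The stable sort keeps the relative order
// of layers that share the same value.
func sortLayers(layers []layer) {
	sort.SliceStable(layers, func(i, j int) bool {
		return layers[i].RenderOrder < layers[j].RenderOrder
	})
}

func main() {
	layers := []layer{{"ui", 2}, {"background", 0}, {"characters", 1}}
	sortLayers(layers)
	for _, l := range layers {
		fmt.Println(l.RenderOrder, l.Name)
	}
}
```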

The last field of our object is Texture, which holds the id of the texture used by the sprites stored in our buffers. The texture itself is created in a SpriteAtlas and we store it here only as a convenience.

Constructor

The BufferList constructor takes the shader as a parameter. First, we create the struct with all the fields that can be copied in directly. The CPU and GPU buffer lists are then preallocated with one entry for each shader attribute except the vertex attribute, which is handled separately (hence the -1).

func NewBufferList(shader shaders.Shader, maxSprites, renderOrder int, texture uint32) (*BufferList, error) {
    sb := BufferList{
        RenderOrder: renderOrder,
        Shader:      shader,
        CpuBuffers:  make([][]float32, len(shader.Attributes)-1),
        GpuBuffers:  make([]uint32, len(shader.Attributes)-1),
        MaxSprites:  maxSprites,
        NumSprites:  0,
        Texture:     texture,
    }
    ...

Next, we sort the attributes by location to ensure that they are added in the correct order. This is necessary because Go maps do not guarantee any iteration order. Having the shader attributes stored in the order they appear in the shader is very beneficial as we can index them directly.

    // same sorting as sprite.shaderData
    attrKeys := ds.SortMapByValue(shader.Attributes, func(a, b shaders.ShaderAttribute) bool {
        return a.Location < b.Location
    })
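The implementation of ds.SortMapByValue is not shown in this tutorial. As a rough idea of what such a helper could look like, here is a sketch using Go generics (Go 1.18 or later). The function body and the `attribute` type are assumptions made for illustration; only the call shape mirrors the snippet above.

```go
package main

import (
	"fmt"
	"sort"
)

// sortMapByValue returns the keys of m, ordered by applying less to the
// values they map to. This is a guess at what a helper like
// ds.SortMapByValue does, not its actual source.
func sortMapByValue[K comparable, V any](m map[K]V, less func(a, b V) bool) []K {
	keys := make([]K, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		return less(m[keys[i]], m[keys[j]])
	})
	return keys
}

// attribute is a simplified stand-in for shaders.ShaderAttribute.
type attribute struct {
	Name     string
	Location uint32
}

func main() {
	attrs := map[string]attribute{
		"color":    {Name: "color", Location: 2},
		"vertex":   {Name: "vertex", Location: 0},
		"position": {Name: "position", Location: 1},
	}
	keys := sortMapByValue(attrs, func(a, b attribute) bool {
		return a.Location < b.Location
	})
	fmt.Println(keys) // [vertex position color]
}
```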

The final step is to allocate the buffer for each attribute. We use the sorted keys to ensure correct ordering.

    gl.GenVertexArrays(1, &sb.VAO)
    gl.BindVertexArray(sb.VAO)
    
    for i, v := range attrKeys {
        attr := shader.Attributes[v]
        if attr.Name == "vertex" {
            createSpriteVertexBuffer()
            continue
        }
        // initialize empty CPU buffers of fixed size
        sb.CpuBuffers[i-1] = make([]float32, maxSprites*int(attr.Type.Size))
        // initialize GPU buffers
        sb.GpuBuffers[i-1] = newGLBuffer(maxSprites*int(attr.Type.Size), int(attr.Type.Size), attr.Location)
        sb.TypeSize = append(sb.TypeSize, attr.Type.Size)
    }

    gl.BindVertexArray(0)
    return &sb, nil
}

For CPU buffers we allocate space for the maximum number of sprites allowed: maxSprites*int(attr.Type.Size) floats. We store Type.Size in the helper slice TypeSize because we will be accessing it multiple times and we don’t want to go through Shader.Attributes, which is a map (it’s better to avoid map lookups in a very fast loop such as our render loop).

For GPU buffers we use the initialization process that we saw in the instancing tutorial. All buffers are initialized with newGLBuffer seen below except for the vertex buffer.

// Allocate a buffer object
func newGLBuffer(size, typesize int, location uint32) uint32 {
    var vbo uint32
    gl.GenBuffers(1, &vbo)
    gl.BindBuffer(gl.ARRAY_BUFFER, vbo)
    gl.VertexAttribPointer(location, int32(typesize), gl.FLOAT, false, 0, nil)
    gl.EnableVertexAttribArray(location)
    gl.BufferData(gl.ARRAY_BUFFER, 4*size, gl.Ptr(nil), gl.DYNAMIC_DRAW) // 4 bytes in a float32
    // this sets the buffer index to move once per instance instead of once per vertex so all
    // vertices in the instance get the same value
    gl.VertexAttribDivisor(location, 1)
    return vbo
}

The vertex buffer does not need the VertexAttribDivisor call since it is the attribute that will be instanced: it advances once per vertex, which is the default. A minor difference is that we create the vertex buffer with STATIC_DRAW since it will never be updated; it always stores our sprite template.

func createSpriteVertexBuffer() uint32 {
    var vbo uint32
    gl.GenBuffers(1, &vbo)
    gl.BindBuffer(gl.ARRAY_BUFFER, vbo)
    gl.VertexAttribPointer(0, 3, gl.FLOAT, false, 0, nil)
    gl.EnableVertexAttribArray(0)
    gl.BufferData(gl.ARRAY_BUFFER, 4*len(spriteTemplateCentered), gl.Ptr(&spriteTemplateCentered[0]), gl.STATIC_DRAW)
    return vbo
}
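To make the effect of VertexAttribDivisor more concrete, the following CPU-side sketch mimics how the element of an attribute array is chosen for a given vertex and instance. `attributeIndex` is a made-up helper for illustration (it ignores base-instance offsets); no OpenGL calls are involved.

```go
package main

import "fmt"

// attributeIndex mimics which element of an attribute array is read for a
// given vertex of a given instance. With divisor 0 the index advances once
// per vertex; with divisor N > 0 it advances once every N instances.
func attributeIndex(divisor, vertex, instance int) int {
	if divisor == 0 {
		return vertex
	}
	return instance / divisor
}

func main() {
	const verticesPerSprite = 6
	for instance := 0; instance < 2; instance++ {
		for vertex := 0; vertex < verticesPerSprite; vertex++ {
			fmt.Printf("instance %d vertex %d: template[%d] perSprite[%d]\n",
				instance, vertex,
				attributeIndex(0, vertex, instance), // vertex buffer, no divisor
				attributeIndex(1, vertex, instance)) // per-sprite buffer, divisor 1
		}
	}
}
```

Every vertex of an instance reads the same per-sprite element, while the template vertices are walked again for each instance.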

Vertex Array Objects

One inconvenience with our vertex buffers is that when we want to use them we have to iterate over every buffer and bind it. Fortunately, we can fix that using an OpenGL mechanism called vertex array objects (VAO). A VAO records the vertex attribute setup, including which vertex buffer object (VBO) feeds each attribute. To use a VAO we must first generate it, similarly to how we generate VBOs:

var vao uint32
gl.GenVertexArrays(1, &vao)

We then bind the VAO to enable it and do our buffer allocation as usual.

gl.BindVertexArray(vao)

// create a bunch of VBOs
for range buffers {
    newGLBuffer(...)
}
gl.BindVertexArray(0)

When we are done we unbind the VAO by binding the zero VAO (please take a moment to appreciate this intuitive OpenGL API). Now, when we want to use our buffers, we don’t have to bind each VBO individually; we just bind the VAO:

gl.BindVertexArray(vao)
gl.DrawArraysInstanced(...)
gl.BindVertexArray(0)

We will store the VAO identifier in the BufferList object for easy access:

type BufferList struct {
    RenderOrder int
    Shader      shaders.Shader
    CpuBuffers  [][]float32
    GpuBuffers  []uint32
    TypeSize    []int32
    NumSprites  int
    MaxSprites  int
    Texture     uint32
    VAO         uint32 // Vertex Array Object
}

BufferList Usage

The BufferList object is used by the renderer inside the render loop. We will go over the render loop in detail in another tutorial but for now it’s useful to know the basic steps in the loop.

1. Clear all buffers
2. Go over game sprites and add sprite data to BufferList
3. Move data from CPU to GPU
4. Render

Steps 1, 2 and 3 are BufferList functionality that we will see now.
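Since the real render loop is covered in another tutorial, the following is only a sketch of how the four steps might fit together. The `frameBuffers` interface, the `recorder` stub and `renderFrame` are hypothetical names for this example, and sprites are reduced to plain ints so that the code runs without OpenGL.

```go
package main

import "fmt"

// frameBuffers lists the three BufferList operations used each frame.
// It is a hypothetical interface; the real renderer may look different.
type frameBuffers interface {
	Empty()
	AddSprite(id int) error
	MoveCpuToGpu() int
}

// recorder is a stub that only logs which step was called.
type recorder struct{ log []string }

func (r *recorder) Empty() { r.log = append(r.log, "empty") }

func (r *recorder) AddSprite(id int) error {
	r.log = append(r.log, "add")
	return nil
}

func (r *recorder) MoveCpuToGpu() int {
	r.log = append(r.log, "upload")
	return 0
}

// renderFrame runs the four steps in order; draw stands in for the GL draw call.
func renderFrame(b frameBuffers, sprites []int, draw func()) error {
	b.Empty() // 1. clear all buffers
	for _, s := range sprites {
		// 2. add sprite data to the buffers
		if err := b.AddSprite(s); err != nil {
			return err
		}
	}
	b.MoveCpuToGpu() // 3. move data from CPU to GPU
	draw()           // 4. render
	return nil
}

func main() {
	r := &recorder{}
	err := renderFrame(r, []int{1, 2}, func() { r.log = append(r.log, "draw") })
	fmt.Println(r.log, err) // [empty add add upload draw] <nil>
}
```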

Adding Data to the Buffers

The AddSprite method is our way of adding data to the buffers. It checks NumSprites to see if there is space to add more and exits if not. If there is space available it copies data from a Sprite object into our CPU buffers.

func (bf *BufferList) AddSprite(sprite *Sprite) error {
    if bf.NumSprites == bf.MaxSprites {
        return errors.New("Store full")
    }

    for i := range bf.CpuBuffers {
        bf.UpdateBuffers(i, bf.NumSprites, sprite.shaderData[i].Data)
    }
    bf.NumSprites++
    return nil
}

We haven’t defined the Sprite object yet but the only thing we need from it here is the data that goes in the buffers. This data is stored in a slice that has an entry for each shader attribute.

type Sprite struct {
    //...
    shaderData   []*shaders.ShaderAttribute
    //...
}

The shaderData slice is ordered in the same way we order our CpuBuffers so we just iterate through and copy:

func (sb *BufferList) UpdateBuffers(buffer, index int, data []float32) {
    typeSize := int(sb.TypeSize[buffer])
    startIndex := index * typeSize
    copy(sb.CpuBuffers[buffer][startIndex:startIndex+len(data)], data)
}
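As a worked example of the index arithmetic, here is a standalone version of the same copy. It assumes a single buffer for an attribute made of two floats and room for three sprites; `writeAt` is a made-up name for this sketch.

```go
package main

import "fmt"

// writeAt copies data into buf at the slot of the sprite with the given
// index, where each sprite occupies typeSize consecutive floats.
func writeAt(buf []float32, typeSize, index int, data []float32) {
	start := index * typeSize
	copy(buf[start:start+len(data)], data)
}

func main() {
	const maxSprites, typeSize = 3, 2 // e.g. a two-component attribute
	buf := make([]float32, maxSprites*typeSize)

	writeAt(buf, typeSize, 1, []float32{5, 6}) // sprite 1 -> floats 2 and 3
	writeAt(buf, typeSize, 0, []float32{1, 2}) // sprite 0 -> floats 0 and 1

	fmt.Println(buf) // [1 2 5 6 0 0]
}
```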

We don’t update the GPU buffers at this point because we will likely be adding many sprites and copying to the GPU is a high latency operation. Instead, when we have added all the sprites to the CPU buffers we copy them to the GPU all at once. This is done with MoveCpuToGpu.

func (bf *BufferList) MoveCpuToGpu() int {
    dataMoved := 0
    if bf.NumSprites <= 0 {
        return 0
    }
    for i := range bf.CpuBuffers {
        typeSize := int(bf.TypeSize[i])
        data := bf.CpuBuffers[i][0 : typeSize*bf.NumSprites]
        updateGLBuffer(bf.GpuBuffers[i], data, 0)
        dataMoved += typeSize * bf.NumSprites
    }
    return dataMoved
}

func updateGLBuffer(glbuffer uint32, newData []float32, position int) {
    gl.BindBuffer(gl.ARRAY_BUFFER, glbuffer)
    gl.BufferSubData(gl.ARRAY_BUFFER, 4*position, 4*len(newData), gl.Ptr(newData))
}

This loops through all CPU buffers, calculates the number of floats that must be moved (typeSize*bf.NumSprites) and creates a slice of the data that must be moved.

data := bf.CpuBuffers[i][0 : typeSize*bf.NumSprites]

The above doesn’t trigger any copying. In Go, slicing just creates a small header that points into the original array and records a length and a capacity, so there is no performance penalty for this. The slice is then passed to updateGLBuffer which binds the appropriate GPU buffer and copies the data over. This function is able to partially update a GPU buffer by copying after a start index given by position but we don’t use this functionality here and we always copy from index zero. It is the responsibility of the renderer to decide when to trigger MoveCpuToGpu.
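The short sketch below checks both claims without touching OpenGL: that the sub-slice shares memory with the CPU buffer, and how the float counts turn into the byte offset and byte size handed to gl.BufferSubData. `uploadRange` is a helper invented for this example.

```go
package main

import "fmt"

const bytesPerFloat = 4 // size of a float32 in bytes

// uploadRange returns the byte offset and byte size that would be passed to
// gl.BufferSubData for data starting at float index position.
func uploadRange(position int, data []float32) (offset, size int) {
	return bytesPerFloat * position, bytesPerFloat * len(data)
}

func main() {
	cpu := make([]float32, 8) // room for 4 sprites with typeSize 2
	numSprites, typeSize := 3, 2

	live := cpu[0 : typeSize*numSprites] // no copy: same backing array
	live[0] = 42                         // writing through the slice...
	fmt.Println(cpu[0] == 42)            // ...changes the original: true
	fmt.Println(len(live), cap(live))    // 6 8

	offset, size := uploadRange(0, live)
	fmt.Println(offset, size) // 0 24
}
```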

Clearing the Buffers

To clear the buffers we just set NumSprites to zero.

func (sb *BufferList) Empty() {
    sb.NumSprites = 0
}

We can do this because our buffers are pre-allocated. When we call AddSprite to add data again it will simply overwrite existing data. When we copy to the GPU we only copy up to NumSprites so we are not rendering old data.
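The overwrite behaviour can be seen in a cut-down, GL-free stand-in for BufferList. The `miniList` type below is hypothetical: it has a single buffer whose attribute is one float wide, and it skips the capacity check for brevity.

```go
package main

import "fmt"

// miniList is a simplified stand-in for BufferList with one CPU buffer.
type miniList struct {
	cpu        []float32
	numSprites int
}

// add writes into the next free slot (capacity check omitted for brevity).
func (m *miniList) add(v float32) {
	m.cpu[m.numSprites] = v
	m.numSprites++
}

// empty "clears" the buffer by resetting the counter only.
func (m *miniList) empty() { m.numSprites = 0 }

// live returns the part of the buffer that would be copied to the GPU.
func (m *miniList) live() []float32 { return m.cpu[:m.numSprites] }

func main() {
	m := &miniList{cpu: make([]float32, 3)}
	m.add(1)
	m.add(2)
	m.add(3)

	m.empty()
	m.add(9)

	fmt.Println(m.cpu)    // [9 2 3]: stale values are still in memory
	fmt.Println(m.live()) // [9]: but only this part would be uploaded
}
```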

Comments

Our BufferList object is very strongly interlinked with the renderer, which we will see in an upcoming tutorial. This is why some implementation details, like how we use the VAO, might not be very clear yet. Also, the BufferList object is intended to be used internally by the engine and the user should never have to create one manually. This is why we allowed some programming war crimes, like storing a reference to the atlas texture, which is a member of another object!