WebGL Shaders: GPU-Accelerated Particle Systems
If you're still iterating over an array of particle objects in JavaScript and updating their x and y positions every frame, you're doing it wrong. It's 2016, and the bottleneck is the CPU-GPU bridge. We need to move the entire simulation loop into the shaders.
The Bottleneck: bufferSubData
Calling gl.bufferSubData every frame for 100,000 particles is a performance killer: the CPU has to recompute every particle and then re-upload the whole array to the GPU. Instead, we store the state (position, velocity, age) in textures or use Transform Feedback (if you're on the bleeding edge of WebGL 2, though WebGL 1 is still king for compatibility).
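To make the texture option concrete, here is a minimal sketch of how particle state might be laid out in a square RGBA float texture, one texel per particle. The function names (stateTextureSize, particleUV, packState) are hypothetical, and the layout assumes WebGL 1 with the OES_texture_float extension available.

```javascript
// Sketch only: one RGBA texel per particle (xyz = position, w = age).
// Assumes float textures are available (OES_texture_float in WebGL 1).

// Smallest power-of-two side whose square can hold `count` texels.
function stateTextureSize(count) {
  let side = 1;
  while (side * side < count) side *= 2;
  return side;
}

// Texture coordinate of the texel that stores particle `index`.
// The +0.5 samples the texel center rather than its corner.
function particleUV(index, side) {
  const x = index % side;
  const y = Math.floor(index / side);
  return [(x + 0.5) / side, (y + 0.5) / side];
}

// Pack xyz triples into RGBA texel data; the fourth channel (age) starts at 0.
function packState(positions, side) {
  const count = positions.length / 3;
  if (count > side * side) {
    throw new Error("texture too small for particle count");
  }
  const texels = new Float32Array(side * side * 4);
  for (let i = 0; i < count; i++) {
    texels[i * 4 + 0] = positions[i * 3 + 0];
    texels[i * 4 + 1] = positions[i * 3 + 1];
    texels[i * 4 + 2] = positions[i * 3 + 2];
    texels[i * 4 + 3] = 0;
  }
  return texels;
}
```

The resulting array could then be uploaded with gl.texImage2D using gl.RGBA and gl.FLOAT, and each particle would look up its own texel through a per-vertex UV attribute.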
Vertex Shader Logic
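When the motion has a closed-form solution, such as ballistic motion under constant gravity, we don't need to store any evolving state at all. We can calculate the current position from the initial position, the initial velocity, and the elapsed time directly in the vertex shader.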
attribute vec3 a_initialPosition;
attribute vec3 a_velocity;
attribute float a_startTime;
uniform float u_time;
uniform float u_duration;
uniform mat4 u_projectionMatrix;
uniform mat4 u_modelViewMatrix;
varying float v_age;
void main() {
  float elapsed = u_time - a_startTime;
  v_age = elapsed / u_duration; // 0.0 at birth, 1.0 at death
  // Draw only particles that have been born and have not yet expired
  if (v_age >= 0.0 && v_age < 1.0) {
    // Simple physics: p = p0 + v*t + 0.5*a*t^2
    vec3 gravity = vec3(0.0, -9.8, 0.0);
    vec3 currentPosition = a_initialPosition + (a_velocity * elapsed) + (0.5 * gravity * elapsed * elapsed);
    gl_Position = u_projectionMatrix * u_modelViewMatrix * vec4(currentPosition, 1.0);
    gl_PointSize = 2.0 * (1.0 - v_age); // shrink as the particle ages
  } else {
    // Move outside the clip volume so the point is never rasterized
    gl_Position = vec4(999.0, 999.0, 999.0, 1.0);
    gl_PointSize = 0.0;
  }
}
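A CPU-side reference of the same formula can be handy for sanity-checking what the shader should produce for a given particle. The sketch below mirrors the shader math in plain JavaScript; particleState is a hypothetical helper, and treating not-yet-started particles as hidden is an assumption of this sketch.

```javascript
// Sketch only: CPU reference for the vertex shader's closed-form motion.
const GRAVITY = [0.0, -9.8, 0.0];

// p0 and v are [x, y, z] arrays; all times are in seconds.
function particleState(p0, v, startTime, time, duration) {
  const elapsed = time - startTime;
  const age = elapsed / duration;
  // Assumption: particles that have not started yet are hidden, like expired ones.
  if (age < 0.0 || age >= 1.0) {
    return { alive: false, age: age, position: null, pointSize: 0.0 };
  }
  // p = p0 + v*t + 0.5*a*t^2, evaluated per component
  const position = p0.map(
    (p, i) => p + v[i] * elapsed + 0.5 * GRAVITY[i] * elapsed * elapsed
  );
  return { alive: true, age: age, position: position, pointSize: 2.0 * (1.0 - age) };
}
```

For example, particleState([0, 10, 0], [1, 0, 0], 0, 1, 2) describes a particle one second into a two-second lifetime, which has moved one unit along x and fallen 4.9 units.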
Fragment Shader Beauty
To make the particles read as fire or smoke rather than square points, we use gl_PointCoord to draw a procedural soft-edged circle (or to sample a noise texture).
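precision mediump float;
varying float v_age;
void main() {
  // gl_PointCoord runs from (0,0) to (1,1) across the point sprite
  float dist = distance(gl_PointCoord, vec2(0.5, 0.5));
  if (dist > 0.5) discard; // clip the square sprite to a circle
  float alpha = 1.0 - v_age; // fade out over the particle's lifetime
  // Opaque at the center, transparent at the edge
  gl_FragColor = vec4(1.0, 0.5, 0.2, alpha * (0.5 - dist) * 2.0);
}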
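The alpha value written by the fragment shader only has a visible effect if blending is enabled on the JavaScript side. The sketch below shows one possible setup, assuming additive blending suits a fire effect and that particles should not write depth; both are assumptions, and enableAdditiveBlending and fragmentAlpha are hypothetical helpers. fragmentAlpha simply mirrors the shader's falloff so it can be checked on the CPU.

```javascript
// Sketch only: additive blending for glowing particles.
// `gl` is expected to be a WebGLRenderingContext.
function enableAdditiveBlending(gl) {
  gl.enable(gl.BLEND);
  // result = src.rgb * src.a + dst.rgb, so overlapping particles brighten
  gl.blendFunc(gl.SRC_ALPHA, gl.ONE);
  // Assumption: particles should not occlude each other via the depth buffer
  gl.depthMask(false);
}

// CPU mirror of the fragment shader's alpha; returns null where the
// shader would discard the fragment.
function fragmentAlpha(pointX, pointY, age) {
  const dist = Math.hypot(pointX - 0.5, pointY - 0.5);
  if (dist > 0.5) return null;
  return (1.0 - age) * (0.5 - dist) * 2.0;
}
```

With this falloff a brand-new particle is fully opaque at its center and fades linearly to transparent at a radius of 0.5.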
CPU-Side Setup
The key is to upload the data once and just update a single u_time uniform.
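const particles = new Float32Array(count * 7); // x, y, z, vx, vy, vz, startTime (seconds)
// ... fill data ...
const buffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
// STATIC_DRAW: the data is uploaded once and never modified afterwards
gl.bufferData(gl.ARRAY_BUFFER, particles, gl.STATIC_DRAW);
function render(now) {
  // now is in milliseconds; the shader expects seconds, on the same clock as startTime
  gl.uniform1f(u_timeLocation, now / 1000);
  gl.drawArrays(gl.POINTS, 0, count);
  requestAnimationFrame(render);
}
requestAnimationFrame(render); // kick off the loop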
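The snippet above leaves out how the buffer is filled and how the shader's three attributes are mapped onto the interleaved array. The sketch below shows one way this might look; fillParticles, bindParticleAttributes, the emission pattern, and the `program` argument (a linked WebGLProgram) are all hypothetical. The stride and offsets follow from the 7-float layout: 28 bytes per particle, with velocity at byte 12 and startTime at byte 24.

```javascript
// Sketch only: fill and bind an interleaved particle buffer.
const FLOATS_PER_PARTICLE = 7; // x, y, z, vx, vy, vz, startTime
const BYTES_PER_FLOAT = Float32Array.BYTES_PER_ELEMENT; // 4
const STRIDE = FLOATS_PER_PARTICLE * BYTES_PER_FLOAT; // 28 bytes

// Hypothetical emitter: every particle starts at the origin with a random
// upward velocity and a start time staggered over the next two seconds.
function fillParticles(count, nowSeconds, random = Math.random) {
  const data = new Float32Array(count * FLOATS_PER_PARTICLE);
  for (let i = 0; i < count; i++) {
    const o = i * FLOATS_PER_PARTICLE;
    data[o + 0] = 0; // initial position
    data[o + 1] = 0;
    data[o + 2] = 0;
    data[o + 3] = (random() - 0.5) * 2; // velocity
    data[o + 4] = random() * 5;
    data[o + 5] = (random() - 0.5) * 2;
    data[o + 6] = nowSeconds + random() * 2; // start time, in seconds
  }
  return data;
}

// Point each attribute at its slice of the interleaved buffer.
// The particle buffer must be bound to gl.ARRAY_BUFFER when this is called.
function bindParticleAttributes(gl, program) {
  const attributes = [
    { name: "a_initialPosition", size: 3, offset: 0 },
    { name: "a_velocity", size: 3, offset: 3 * BYTES_PER_FLOAT },
    { name: "a_startTime", size: 1, offset: 6 * BYTES_PER_FLOAT },
  ];
  for (const { name, size, offset } of attributes) {
    const location = gl.getAttribLocation(program, name);
    gl.enableVertexAttribArray(location);
    gl.vertexAttribPointer(location, size, gl.FLOAT, false, STRIDE, offset);
  }
}
```

Interleaving keeps each particle's data contiguous in memory; three separate buffers would work just as well with a stride of 0.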
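By leveraging the parallel nature of the GPU, the per-frame CPU cost drops to one uniform update and one draw call. That keeps 60fps within reach at particle counts (hundreds of thousands and up, depending on the GPU and on fill rate) that would melt a mobile CPU if simulated in JS.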