Introduction to GPUs and GPGPU

A project log for The Hardware Side of Computation

An exploration of CPUs, GPUs, and FPGAs, and how they each perform computation

PointyOintment, 04/10/2017 at 05:21

It is time for the third type of processor I will be covering: the graphics processing unit (GPU). This is the one about which I know the least so far, so this will be the shortest post.

Specialized chips for graphics processing have existed since the 1970s. They were first used in arcade games, where they connected the CPU to the monitor and provided graphics capabilities without requiring lots of RAM to store the image or constant processing by the CPU to move 2D images around onscreen. In the late 1980s and 1990s, new devices were created that could handle basic 3D graphics calculations (transform, clipping, and lighting) in hardware. Again, the first of these were found in arcade machines. In 1996 and 1997 they became available in home video game and computer systems. Modern GPUs are descended from these.

Modern GPUs can perform programmable vertex shading and pixel shading. These consist of running a short program for each geometric vertex and each pixel in the output image. To speed this up, GPUs are built with many physical processing units (called shaders), which run these short programs (also called shaders) in parallel. The pixel shaders in early GPUs of this kind weren't actually fully programmable; they were more like blocks in an FPGA, programmable only for a certain set of functions. This was necessary to be able to run them at high pixel clock speeds to get all of the pixels in the image prepared in time. By the early 2000s, though, hardware pixel and vertex shaders were converging and also becoming more similar to CPUs, gaining looping and floating point math abilities, while keeping (and increasing) their parallelism. This programmability soon came to be used for bump mapping, a process that simulates a texture on a rendered surface, even though the actual surface in the 3D model is flat.
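To make the "short program per vertex and per pixel" idea concrete, here is a minimal sketch in plain Python. It is only an illustration: real shaders are written in a shading language and run on the GPU, and the function names, the scale-and-translate transform, and the gradient coloring are all made up for this example.

```python
# Hypothetical stand-ins for shader programs. Real shaders run on the GPU;
# these plain functions only show the "one short program per item" shape.

def vertex_shader(vertex, scale, offset):
    """Run once per vertex: scale and translate an (x, y) position."""
    x, y = vertex
    return (x * scale + offset[0], y * scale + offset[1])


def pixel_shader(x, y, width, height):
    """Run once per pixel: return an (r, g, b) color based on position."""
    r = x / (width - 1)   # red ramps from left to right
    g = y / (height - 1)  # green ramps from top to bottom
    return (r, g, 0.0)


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# No call depends on any other call, which is what lets a GPU
# hand each vertex or pixel to a different processing unit.
transformed = [vertex_shader(v, 2.0, (1.0, 1.0)) for v in triangle]

WIDTH, HEIGHT = 4, 3
image = [[pixel_shader(x, y, WIDTH, HEIGHT) for x in range(WIDTH)]
         for y in range(HEIGHT)]
```

Here the list comprehensions run the "shaders" one item at a time; on a GPU the same per-item functions would be spread across the hardware shader units.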

Graphics APIs

Various APIs have been created to enable easy use of GPUs for graphics in your programs. They are useful for 3D games, 3D modeling programs, et cetera. OpenGL, a popular one, was first released in 1992 and is still used widely today. DirectX, a proprietary API (really a group of APIs, only some of which are GPU-related) from Microsoft, was first released in 1995 and is also very popular today, particularly for commercial games.

GPGPU

Most of the calculations used in 3D graphics are based on matrices and vectors, and involve doing a great many similar, simple calculations simultaneously, so modern GPUs are highly optimized for these demands. However, 3D graphics is far from the only field where highly parallel matrix and vector math is used. Using a GPU for non-graphics applications is called general-purpose computing on graphics processing units (GPGPU). This is done by means of compute kernels (or compute shaders), which are given to the GPU in place of vertex and pixel shaders. The data is usually provided in the form of a 2D grid, because GPUs are optimized for working with data in this form. Using a GPU to run a compute kernel on a grid of data is equivalent to using a CPU to loop over the grid and perform an operation during each iteration of the loop. However, the GPU performs the operation in parallel on multiple (perhaps all of the) elements in the grid simultaneously.
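The equivalence between a kernel launch and a CPU loop can be sketched as follows. This is a simulation in ordinary Python, not real GPU code: the kernel body (double and add one) is an arbitrary choice, and a thread pool merely stands in for the GPU's many processing units.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(grid, x, y):
    """Hypothetical compute kernel: the work done for one grid element."""
    return grid[y][x] * 2 + 1


def run_on_cpu(grid):
    """CPU style: visit the elements one after another in nested loops."""
    height, width = len(grid), len(grid[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            out[y][x] = kernel(grid, x, y)
    return out


def run_gpu_style(grid):
    """GPU style (simulated): one kernel invocation per element.

    The thread pool is only a stand-in for the GPU's processing units;
    it shows the structure of the launch, not the speedup.
    """
    height, width = len(grid), len(grid[0])
    coords = [(x, y) for y in range(height) for x in range(width)]
    with ThreadPoolExecutor() as pool:
        flat = list(pool.map(lambda c: kernel(grid, c[0], c[1]), coords))
    # Reassemble the flat result list into rows.
    return [flat[y * width:(y + 1) * width] for y in range(height)]


grid = [[1, 2, 3], [4, 5, 6]]
cpu_result = run_on_cpu(grid)
gpu_style_result = run_gpu_style(grid)
```

Both functions produce the same output grid. The only difference is whether the kernel invocations are ordered by a loop or handed out independently.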

It is important to note that GPGPU is only effective for problems that can be solved efficiently by stream processing, where the same operation is applied independently to many data elements. A GPU can still solve problems where the subproblems are heavily interconnected, but it will often be slower at them than a CPU, because there is little independent work to spread across its processing units and the parallelism goes unused.
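The difference between the two kinds of problem shows up in the data dependencies. In the sketch below, both functions and the particular recurrence are hypothetical examples chosen for illustration: the first has no dependencies between elements, while in the second there is no obvious way to compute a step without first finishing the one before it.

```python
def brighten(values, amount):
    """Stream-friendly: each output depends only on its own input element,
    so every element could be processed at the same time."""
    return [v + amount for v in values]


def iterate_recurrence(x0, steps):
    """Interconnected: each value is computed from the previous one.

    Hypothetical recurrence x[n+1] = (x[n]**2 + 1) mod 17. Extra
    processing units have nothing to do while waiting for the prior step.
    """
    xs = [x0]
    for _ in range(steps):
        xs.append((xs[-1] ** 2 + 1) % 17)
    return xs


brightened = brighten([1, 2, 3], 10)
sequence = iterate_recurrence(2, 4)
```

A problem shaped like `brighten` maps well onto a GPU; one shaped like `iterate_recurrence` is likely to run better on a CPU with its faster individual cores.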

When designing algorithms for GPGPU, you should try to maximize the arithmetic intensity (the number of arithmetic operations performed per unit of data transferred to or from memory), to keep the processing from being slowed down by memory access.
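As a rough worked example, arithmetic intensity can be estimated by counting operations and dividing by the bytes moved. The sketch below assumes 32-bit floats and an idealized memory model in which every value is read or written exactly once; real hardware with caches and imperfect data reuse will give different numbers. The two workloads are illustrative choices.

```python
BYTES_PER_VALUE = 4  # assumption: 32-bit floating point values


def intensity(operations, values_moved):
    """Arithmetic operations per byte moved to or from memory."""
    return operations / (values_moved * BYTES_PER_VALUE)


def vector_add_intensity(n):
    # n additions; reads 2n inputs and writes n outputs.
    return intensity(n, 3 * n)


def matmul_intensity(n):
    # n*n outputs, each needing n multiplications and n additions.
    # Idealized traffic: read two n-by-n matrices and write one.
    return intensity(2 * n ** 3, 3 * n ** 2)


low = vector_add_intensity(1000)   # stays at 1/12 regardless of n
high = matmul_intensity(1024)      # grows with n (n/6 under these assumptions)
```

Under these assumptions, adding two vectors performs one operation for every 12 bytes moved no matter how large the vectors are, so it tends to be limited by memory access. Matrix multiplication's intensity grows with the matrix size, which is one reason it suits GPUs well.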

While you can do GPGPU using graphics APIs or direct access to the GPU, APIs also exist that allow easier use of GPUs for GPGPU, including OpenCL and OpenMP (neither of which is GPGPU-specific; they can run on many kinds of processors), as well as DirectCompute (part of DirectX).

GPGPU is useful for many scientific applications such as weather forecasting and climate research, bioinformatics, and astrophysics, as well as non-scientific (more immediately profitable) applications including audio, video, and still image processing, machine learning, data mining, cryptography/cryptanalysis, and computer-aided design and engineering. GPUs and GPGPU are commonly used in supercomputers.