The Simple DirectMedia Layer (SDL) provides three kinds of surfaces for rendering graphics: software, hardware, and OpenGL. Software surfaces are stored in the computer's main memory. Hardware surfaces are stored in memory on your video card. OpenGL surfaces are handled in whatever way OpenGL does things on your system. My previous SDL article provides basic information about using SDL and details of software surfaces. This article explores the promise and problems of using hardware surfaces.
People assume that because hardware surfaces live in video hardware, programs that use them will be faster than programs that use software surfaces. That assumption is often wrong. In fact, some applications run slower with hardware surfaces than with software surfaces. The decision to use a hardware surface must be based on testing, not on assumptions. People also seem to forget that video memory is a limited resource. Just because you can get enough memory to make the program fly on your development machine does not mean it will perform as well on another computer. When they work for you, hardware surfaces are fast, easy to use, and give you smooth animation without tearing and other unpleasant visual artifacts. When they don't, they can lead to a set of bewildering problems.
A cross-platform development tool like SDL is designed to let you write programs that work without change on many different kinds of hardware and operating systems. Hardware-dependent programming is the opposite of cross-platform development. The conflict between the realities of hardware-dependent development and the goals of SDL is at the root of many of the problems you may encounter when using hardware surfaces. SDL hardware surfaces are both cross-platform and hardware dependent.
While writing about SDL hardware surfaces, I have used more weasel words than a pork-barrel politician the week before election day. I do that because most features have a few special cases where they don't quite work the way you expect them to. The special cases come from the many factors that affect how hardware surfaces really work:
The actual hardware installed in your computer. There are many ways to build a video card. Your video card may have hundreds of megabytes of dedicated super fast DDR RAM. Then again, it may not have any dedicated memory, instead sharing the computer's main memory. Your hardware may have a blindingly fast graphics accelerator or it may let the CPU do all the work. Using hardware surfaces doesn't tell you much about how they will perform on any given system. In fact, they may perform poorly on a system with a high end graphics accelerator while performing remarkably well on a system with a weak graphics system.
How your computer talks to your video card. Most computers talk to the video card over a data bus, like the AGP bus, that is designed to send data from the CPU to the video card, not the other way around. The CPU can almost always write to the video card much faster than it can read from it. The CPU can also usually read and write its main memory faster than it can read or write video memory. All of that means that using the CPU to copy images (or anything else) around in graphics memory is going to be slow.
The version of the device drivers you are using. Your hardware may be capable of providing hardware surfaces and hardware acceleration, but the drivers you are using may not make those abilities available to user programs. You can write a program that works great on your computer but that will fail on a similar computer with the same OS and the same graphics card just because it has a different version of the device driver.
The way the operating system controls access to hardware. Operating systems control access to the physical hardware and try to keep programs from messing with the hardware in ways that can crash the computer. Because of differences in their design, Windows allows normal programs to get hardware surfaces while on Linux and other Unix-like operating systems, a program must have root privileges to access the hardware.
After reading about so many problems with SDL hardware surfaces, you might think they are not worth using. However, if you are writing a two-dimensional game on a platform with good support for SDL hardware surfaces, they may be the correct choice. You just have to know enough to know when they are a bad choice.
The easiest way to show the differences between hardware and software
surfaces is to convert the softlines.cpp program, which I
wrote for my last article, from using software to hardware surfaces. The
new program is called hardlines.cpp. Converting the code did
not require me to change many lines of code. Of course, what seemed like
tiny details kept the code from working as expected. The closer you get to
the hardware, the pickier the work gets.
hardlines.cpp has the same sections with the same
functions as softlines.cpp. Working from the top of the
program down, I didn't have to make any changes in the program until I
reached the main() function.
The first change I had to make was to add some Linux specific code just
before the call to SDL_Init():
#ifdef __linux__
putenv("SDL_VIDEODRIVER=dga");
#endif
SDL checks the value of the SDL_VIDEODRIVER environment
variable to decide which driver to use. To get hardware surfaces while
running on Linux under X, you have to specify which driver to use. I've
chosen the DGA driver because the default X11 driver does not support
hardware surfaces. The SDL
FAQ has more information about selecting drivers on Linux and Windows.
There is also a
detailed list of SDL environment variables and their use. The number
of different drivers that you have to choose from is staggering and shows
the range of applications for which SDL could be used.
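One refinement worth considering, sketched here under assumptions rather than taken from hardlines.cpp: an unconditional putenv() call overrides any driver the user has already chosen. The hypothetical helper preferVideoDriver() below uses the POSIX setenv() call with its overwrite argument set to 0, so an existing SDL_VIDEODRIVER value is left alone. It assumes a POSIX system and must run before SDL_Init().

```cpp
#include <cassert>
#include <cstdlib>   // getenv; setenv is declared here on POSIX systems
#include <string>

// Hypothetical helper, not part of SDL: request `driver` only when the user
// has not already picked one. Returns the value SDL will see, or an empty
// string if the variable could not be set.
std::string preferVideoDriver(const char *driver)
{
    assert(driver != 0);

    // A third argument of 0 means "do not overwrite an existing value".
    if (setenv("SDL_VIDEODRIVER", driver, 0) != 0)
    {
        return std::string();
    }

    const char *value = std::getenv("SDL_VIDEODRIVER");
    return value ? std::string(value) : std::string();
}
```

Calling preferVideoDriver("dga") inside the #ifdef __linux__ block would keep DGA as the default while still letting a user select another driver from the shell.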
The next change is in the call to SDL_SetVideoMode():
screen = SDL_SetVideoMode(screenWidth,
screenHeight,
0,
SDL_ANYFORMAT |
SDL_FULLSCREEN |
SDL_HWSURFACE |
SDL_DOUBLEBUF);
The changes are small, but the reasons for the changes aren't. The
options tell SDL that I want a full screen (SDL_FULLSCREEN),
double buffered (SDL_DOUBLEBUF), hardware surface
(SDL_HWSURFACE). The part that isn't obvious is that on my
desktop system if I want a hardware surface, it has to be full screen. I
can't get a hardware surface for a window. This is one of those things
that is operating system and device driver specific. Some systems let you
have a hardware surface for a window. Even if you can get a hardware
surface for a window, you may not be able to get a double buffered
hardware surface for a window.
There are good reasons to refuse a hardware surface for a
window. SDL_SetVideoMode() returns a pointer to an SDL_Surface.
Inside that structure is a pointer to the pixel data for the
surface. Without that pointer you can't draw anything. The demo program
uses that pointer to draw lines. Having a pointer to a window on the
screen means there is a good chance that you can write to any pixel on the
screen, not just the ones in your window. You can probably read from any
pixel on the screen, which creates a nasty security hole. A bug in your
program can scramble the whole desktop, not just your window.
Because you have a pointer to the data in the window, you also have to worry about what happens when the window is moved, resized, or obscured. When the window moves, the address of the image data for that window also moves. If it changes and you use an old copy of the pointer, your program winds up drawing in the wrong place. If another window partially covers your window, who is responsible for keeping you from writing to the covered parts of your window? How does an SDL application even find out what those are? Double buffering introduces another set of problems. You may be able to get a hardware surface in a window, and not be able to get a double buffered surface for that window, because the entire desktop is not double buffered.
All of these problems can and in fact have been solved many different ways. By far the easiest solution is just to require that applications that directly access the screen run as full screen applications. If you want to use SDL hardware surfaces, assume that your application will have to run in full screen mode.
To make sure that the program actually got a hardware surface I added code that tests the surface type right after I set the video mode:
if (0 == (screen->flags & SDL_HWSURFACE))
{
printf("Can't get hardware surface\n");
exit(1);
}
If SDL can't give you what you ask for it will give you what it can. If it can't give you a hardware surface, SDL will give you a software surface. We have to check to see if we really got a hardware surface.
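Because SDL falls back quietly, it can be useful to check every flag you asked for, not just one. Take care with the test itself: in C and C++, == binds more tightly than &, so an expression of the form 0 == flags & MASK compares first and masks second. Below is a minimal sketch of a hypothetical helper that avoids the trap. The kHwSurface and kDoubleBuf constants are stand-ins so that the example compiles without SDL; real code would use SDL_HWSURFACE and SDL_DOUBLEBUF from SDL.h.

```cpp
#include <cstdint>

// Stand-in flag values so this sketch compiles without SDL. In a real
// program use SDL_HWSURFACE and SDL_DOUBLEBUF from SDL.h instead.
const std::uint32_t kHwSurface = 0x00000001u;
const std::uint32_t kDoubleBuf = 0x40000000u;

// Hypothetical helper: true only if every bit in `wanted` is set in `granted`.
// The parentheses around the & are required because == binds more tightly.
bool hasAllFlags(std::uint32_t granted, std::uint32_t wanted)
{
    return (granted & wanted) == wanted;
}
```

With SDL this would be called as hasAllFlags(screen->flags, SDL_HWSURFACE | SDL_DOUBLEBUF) right after SDL_SetVideoMode() returns.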
After you have set the video mode, you can use
SDL_CreateRGBSurface() to create more hardware surfaces
and
SDL_FreeSurface() to release them. These surfaces are
used to hold image data, such as sprites or fonts, that you want to draw
onto the screen. If your screen and your graphics are both in hardware
surfaces, SDL can use the graphics hardware to copy from one surface to
another. Using the graphics hardware gives you a significant performance
boost.
This may sound obvious, but if the video card has 32 megabytes of memory you aren't going to store more than 32 megabytes of data in it. You won't get the full 32 megabytes because the windowing system and other applications may also be storing information in graphics memory. When you use hardware surfaces, you have to set a budget for graphics memory use and then stick to that budget.
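To make the budget concrete: a double buffered 1024 by 768 screen at 32 bits per pixel needs 2 x 1024 x 768 x 4 = 6,291,456 bytes, or 6 megabytes, before a single sprite is loaded. The sketch below shows one way such a budget might be tracked. The surfaceBytes() function and the VideoMemoryBudget class are hypothetical names, and the arithmetic assumes no row padding or alignment, which real drivers may add.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for one surface, ignoring any row padding or alignment the
// driver may add (an assumption; real surfaces can be larger).
std::size_t surfaceBytes(std::size_t width, std::size_t height,
                         std::size_t bitsPerPixel)
{
    assert(bitsPerPixel % 8 == 0);
    return width * height * (bitsPerPixel / 8);
}

// Hypothetical budget tracker: refuses any reservation that would push
// total use past the limit chosen for video memory.
class VideoMemoryBudget
{
public:
    explicit VideoMemoryBudget(std::size_t limitBytes)
        : limit(limitBytes), used(0) {}

    bool reserve(std::size_t bytes)
    {
        if (bytes > limit - used) return false;  // would exceed the budget
        used += bytes;
        return true;
    }

    std::size_t remaining() const { return limit - used; }

private:
    std::size_t limit;
    std::size_t used;
};
```

A program could call reserve() before each SDL_CreateRGBSurface() request for a hardware surface and fall back to a software surface when the reservation is refused.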
Graphics hardware is a shared resource. Operating systems generally
require that we lock shared resources before we use them and unlock them
after we are done. SDL provides
SDL_LockSurface() and
SDL_UnlockSurface() to lock and unlock hardware
surfaces. It is possible to have a hardware surface that should not be
locked and SDL provides the
SDL_MUSTLOCK() macro so that we can tell them apart.
Failing to lock a hardware surface can cause unexpected results or even
program crashes. Locking the surface ensures that all pending graphics
hardware operations are completed before you can touch the buffer. In
hardlines.cpp the call to SDL_FillRect()
may be performed by the graphics hardware and run in parallel with your
code. In fact, there could be several graphics operations that are queued
up waiting for the graphics accelerator to perform them. If we don't wait
for those operations to complete, the program can be drawing lines in
software while the background is being filled by the graphics
accelerator. No matter what happens, the results are unpredictable and
certainly not what you want. Further, the pointer stored in the surface
record can change. If you are using double buffering and you swap the
buffers, the current buffer is a different block of video memory. The
pointer can also change if the window was moved. The pointer is only
guaranteed to be valid while the surface is locked.
After learning why you have to lock hardware surfaces, you might think that you should just lock them at the beginning of the program and leave them locked. We can't do that because while the hardware is locked, we cannot safely make any system calls. System calls may not be able to complete until the hardware is unlocked.
To make the sample program work with hardware surfaces, I wrapped the code that updates the screen in calls that lock and unlock the hardware screen surface:
if (SDL_MUSTLOCK(screen))
{
if (-1 == SDL_LockSurface(screen))
{
printf("Can't lock hardware surface\n");
exit(1);
}
}
rl->update(t);
gl->update(t);
bl->update(t);
if (SDL_MUSTLOCK(screen))
{
SDL_UnlockSurface(screen);
}
To be as portable and fast as possible, I only lock the surface if
SDL_MUSTLOCK() says it must be locked. There is a real cost
to locking the surface, so we don't want to lock it if we don't have
to. Using SDL_MUSTLOCK() also lets the code work with
software buffers.
At the very end of the original animation loop, we had two lines of code:
SDL_Flip(screen);
SDL_Delay(10);
When using a double buffered display, graphics are drawn into the back
buffer and only become visible after the call to SDL_Flip(). When
used with software surfaces SDL_Flip() copies the contents of
the back buffer to the display and returns immediately. The story is more
complicated with hardware surfaces.
The version of SDL_Flip() used for hardware surfaces can
be implemented in at least two different ways. It can copy the back buffer
to the front buffer, or it can tell the hardware to stop displaying the
current surface and start displaying what is in the back buffer. In the
second case it just changes the value of a pointer that tells the hardware
where the graphics are. At that point the display surface (also called
the front buffer) becomes the back buffer and the back buffer becomes the
display buffer. No copying is done at all.
Copying and swapping both get the next frame on the screen. You only
care about the difference if you are doing incremental updates of the
frames. If SDL_Flip() is copying buffers, the back buffer
always has a copy of the last frame that was drawn. If
SDL_Flip() is doing page swapping, the back buffer usually
contains the next-to-last frame. I say usually because double buffering
can be implemented using a hidden third buffer to reduce the time spent
waiting for the buffer swap to happen. You can find out what kind of
swapping is being done by watching the value of the back buffer pointer
(screen->pixels in hardlines.cpp) to see if it
changes and how many different values it has. If it never changes, then
SDL_Flip() is copying the pixels. If it toggles back and
forth between two values, then page swapping is being used.
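That test can be sketched as a small classifier. The code below is an illustration, not part of hardlines.cpp: it assumes you have recorded the value of screen->pixels once per frame, while the surface was locked, and the FlipKind and classifyFlip() names are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <vector>

enum FlipKind { FlipUnknown, FlipCopies, FlipSwapsTwo, FlipSwapsMany };

// Hypothetical helper: classify SDL_Flip() behavior from the back buffer
// pointers recorded on successive frames.
FlipKind classifyFlip(const std::vector<const void *> &pixelPointers)
{
    if (pixelPointers.size() < 2) return FlipUnknown;  // too few frames to tell

    // Count how many different addresses were seen.
    std::set<const void *> distinct(pixelPointers.begin(), pixelPointers.end());

    if (distinct.size() == 1) return FlipCopies;    // pointer never changed
    if (distinct.size() == 2) return FlipSwapsTwo;  // toggles between two pages
    return FlipSwapsMany;                           // three or more buffers seen
}
```

A result of FlipSwapsMany would suggest that a hidden extra buffer is in use, in which case the back buffer does not hold the next-to-last frame.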
Using hardware surfaces changes the timing behavior of
SDL_Flip() and lets us get rid of tearing. Image tearing
results from changing the display buffer while the video hardware is
drawing what you see on the screen. The video hardware is constantly
reading the contents of video memory, your animation frame, and converting
it to a video signal that your monitor then turns into a pattern of
colored light that you see. The process of painting an image on the
screen takes time. At 85 frames per second, it takes just under 12
milliseconds to draw the frame on your screen. The process is broken up
into several phases, but the ones we are interested in are the frame time
and the video retrace period. The frame time is the length of time from
when the hardware starts displaying the current image on the screen until
it starts displaying the next image on the screen. The video retrace period
is a brief period at the end of the frame time when the video system has
finished displaying one image but hasn't started displaying the next
image.
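The frame time follows directly from the refresh rate: 1000 milliseconds divided by 85 frames per second is about 11.76 milliseconds, and at 60 frames per second it is about 16.67 milliseconds. A minimal sketch of that arithmetic, using a hypothetical frameTimeMs() function:

```cpp
#include <cassert>

// Time available to display one frame, in milliseconds, for a given
// refresh rate in frames per second.
double frameTimeMs(double refreshHz)
{
    assert(refreshHz > 0.0);
    return 1000.0 / refreshHz;
}
```

Everything a program does for one frame, including drawing and waiting for the swap, has to fit inside that interval to keep up with the display.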
If we change the content of the display buffer during the frame time, the hardware will display part of the front buffer at the top of the screen and part of the back buffer at the bottom of the screen. Splitting the image like that is called tearing. We want the buffers to switch during the vertical retrace period so we never see parts of two frames on the screen at the same time.
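We want our animation programs to draw a frame, wait for the video retrace period, and then swap the buffers.
Unfortunately, that wait can be very long. There is a lot of work that we could be doing instead of waiting for the buffer swap. What we really want to do is draw a frame, request a buffer swap at the next retrace, keep working, and wait only when we need to touch a buffer again.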
This is precisely what SDL tries to do. The call to
SDL_Flip() tells the hardware to swap buffers at the next
video retrace, but it does not wait for the retrace. When you try to lock
the surface, or when one of the SDL graphics routines tries to, SDL waits
until the buffers have swapped. Delaying the wait lets you keep working
after calling SDL_Flip() but prevents tearing and prevents
you from writing to a buffer that is being displayed. This design lets
your program do all the set up work needed for drawing the next frame
while waiting for the buffers to swap.
There is, of course, a caveat. On some systems it is not possible to
implement SDL_Flip() to work the way I just described. On
those systems, SDL_Flip() may wait until the buffers have
swapped or it may never wait and give you tearing. I have never
encountered these problems, but you need to test SDL_Flip() on
your target system before depending on a specific behavior.
SDL_Delay() is rarely needed when using hardware
surfaces. The wait for the hardware buffer swap keeps the program from
generating frames faster than they can be drawn on the screen and forces
the program to give up time to the operating system. Thus the next to last
change to hardlines.cpp was to remove that line. Removing the
call to SDL_Delay() is not always correct. It would have been
more correct to time the animation loop and call SDL_Delay()
if we were drawing an unreasonable number of frames per second.
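One way to do that, sketched under assumptions rather than taken from hardlines.cpp, is to compute how long each pass through the loop took and sleep for the remainder of a minimum frame time. The delayForFrameCap() function below is a hypothetical helper, and the choice of maximum frame rate is left to the caller.

```cpp
#include <cassert>

// Hypothetical helper: milliseconds to sleep so that a frame that took
// `elapsedMs` to produce does not exceed `maxFps` frames per second.
// Returns 0 when the frame already took long enough.
unsigned int delayForFrameCap(unsigned int elapsedMs, unsigned int maxFps)
{
    assert(maxFps > 0);
    const unsigned int minFrameMs = 1000u / maxFps;  // whole ms, rounded down
    return (elapsedMs < minFrameMs) ? (minFrameMs - elapsedMs) : 0u;
}
```

In the animation loop, elapsedMs would be the difference between two SDL_GetTicks() readings taken at the top and bottom of the loop, and a nonzero result would be passed to SDL_Delay().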
I added code to compute the average frame rate of the animation and print it out at the end. I just count the number of frames that were drawn and divide by the time it took to draw them. If the program is working correctly the frame rate should be very close to the frames per second setting on your display.
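The calculation amounts to the following sketch. The averageFps() function is a hypothetical name, and it assumes the elapsed time is measured in milliseconds, as SDL_GetTicks() reports it.

```cpp
// Average frames per second over a run. Returns 0 if no time has elapsed,
// to avoid dividing by zero.
double averageFps(unsigned long frames, unsigned long elapsedMs)
{
    if (elapsedMs == 0) return 0.0;
    return (1000.0 * frames) / elapsedMs;
}
```

For example, 850 frames drawn in 10,000 milliseconds is an average of 85 frames per second, which is what you would hope to see on a display refreshing at 85 Hz.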
This article covered the details of using SDL hardware surfaces, along with the problems and incompatibilities that interfere with their use. Because no standards for hardware, device drivers, and operating systems span the range of platforms that SDL supports, there are bound to be incompatibilities and inconsistencies. This is another case where SDL isn't amazing because it works so well; SDL is amazing because it works at all.
Next time I'll be looking at how to use OpenGL from within SDL. The combination of a portable 3D API like OpenGL with the portable input and multimedia capabilities of SDL makes it possible to write high performance commercial games that run on Linux, Windows, and the Mac.
Bob Pendleton has been fascinated by computer games ever since his first paid programming job -- porting games from an HP 2100 minicomputer to a UNIVAC 1108 mainframe.
Copyright © 2009 O'Reilly Media, Inc.