Checking OpenGL error state after OpenGL calls in debug builds can be an invaluable tool for finding bugs in OpenGL code. But what about errors like running out of memory when allocating textures or other OpenGL resources? What are the best practices on handling or avoiding errors like these?
An OpenGL resource allocation failure would probably be fatal in most cases so should a program just try to allocate a reasonable amount of resources and hope for the best? What kind of approaches are used in real world projects on different platforms, e.g., on PC and on mobile platforms?
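For context, the debug-build error checking mentioned above is usually wrapped in a macro that drains the GL error queue after each call. A minimal sketch of that pattern follows; note that `glGetError` and the error constants normally come from the GL headers and driver, and are stubbed here only so the sketch compiles standalone:

```c
#include <stdio.h>

/* Stub of the GL error API so this compiles without a GL context.
 * In a real build, include <GL/gl.h> and delete this block. */
typedef unsigned int GLenum;
#define GL_NO_ERROR      0
#define GL_OUT_OF_MEMORY 0x0505

static GLenum g_fake_error = GL_NO_ERROR;   /* stand-in for driver error state */
static GLenum glGetError(void) {
    GLenum e = g_fake_error;
    g_fake_error = GL_NO_ERROR;             /* glGetError clears the flag it returns */
    return e;
}

/* Drain all pending GL errors, log each one, and return how many were seen. */
int gl_check(const char *file, int line) {
    int count = 0;
    GLenum err;
    while ((err = glGetError()) != GL_NO_ERROR) {
        fprintf(stderr, "GL error 0x%04X at %s:%d\n", err, file, line);
        ++count;
    }
    return count;
}

/* In debug builds, wrap every GL call; in release builds, the check vanishes. */
#ifndef NDEBUG
#define GL_CHECK(call) do { call; gl_check(__FILE__, __LINE__); } while (0)
#else
#define GL_CHECK(call) call
#endif
```

Usage would look like `GL_CHECK(glBindTexture(GL_TEXTURE_2D, tex));`. The loop matters because GL can queue more than one error flag at a time.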
Running out of memory when allocating resources for textures and vertex buffers is on the rare side these days. By the time you run into this sort of situation, you should already know that you are approaching the limits of your system requirements, and you should have a resource manager smart enough to deal with it.
In the PC spectrum, the amount of available memory is becoming less relevant and harder to define. Textures are becoming virtualized resources, where portions of them are only fetched and stored in local (GPU) memory when a specific sub-region is referenced in a shader (Sparse Textures in OpenGL 4.4 terms, or Tiled Resources in D3D 11.2 terms). You may also hear this feature referred to as Partially Resident Textures, and that is the term I like to use most often.
Since Partially Resident Textures (PRT) are an architectural trend on DX 11.2+ PC hardware and a key feature of the Xbox One / PS4, the amount of available memory will be less and less of an application-terminating event. It will be more of a performance hitch when page faults have to be serviced (e.g. memory for part of a texture is referenced for the first time), and care will have to be taken to minimize thrashing. This is really not much different from the situation 10 years ago, except that instead of a texture being either completely resident or completely non-resident, individual tiles in a texture atlas or individual mipmap levels may now be in different states. The way that memory faults are handled can actually open up doors for more efficient procedurally generated content and streaming from optical / network based storage.
Having said that, virtualizing memory resources is not the most efficient way to approach real-time and/or embedded applications. Extra hardware is usually needed to handle memory mapping, and extra latency is introduced when a memory fetch for a non-resident resource is issued. In the mobile domain I doubt PRTs are going to change a whole lot; there you will still benefit from lower-level memory management and from techniques like proxy textures to test an allocation before committing to it. Unfortunately, OpenGL ES does not even support proxy textures.
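On desktop GL, the proxy-texture idiom is: issue `glTexImage2D` against `GL_PROXY_TEXTURE_2D`, then query the level's width back; the driver reports 0 if it would refuse the real allocation. A sketch of that pattern, with the GL entry points stubbed (including a made-up `FAKE_MAX_DIM` driver limit) so it compiles standalone without a context:

```c
/* Stubbed slice of the GL API; in a real program these come from
 * <GL/gl.h> and the driver. FAKE_MAX_DIM is a pretend driver limit. */
typedef unsigned int GLenum;
typedef int GLint;
typedef int GLsizei;
#define GL_PROXY_TEXTURE_2D 0x8064
#define GL_TEXTURE_WIDTH    0x1000
#define GL_RGBA8            0x8058
#define GL_RGBA             0x1908
#define GL_UNSIGNED_BYTE    0x1401
#define FAKE_MAX_DIM        8192

static GLsizei g_proxy_width = 0;

static void glTexImage2D(GLenum target, GLint level, GLint internalformat,
                         GLsizei w, GLsizei h, GLint border,
                         GLenum format, GLenum type, const void *pixels) {
    (void)target; (void)level; (void)internalformat;
    (void)border; (void)format; (void)type; (void)pixels;
    /* A real driver records width 0 when the allocation would fail. */
    g_proxy_width = (w <= FAKE_MAX_DIM && h <= FAKE_MAX_DIM) ? w : 0;
}

static void glGetTexLevelParameteriv(GLenum target, GLint level,
                                     GLenum pname, GLint *out) {
    (void)target; (void)level; (void)pname;
    *out = (GLint)g_proxy_width;
}

/* Returns 1 if a w x h RGBA8 texture would likely be accepted.
 * No texel data is passed: a proxy target never allocates storage. */
int texture_would_fit(GLsizei w, GLsizei h) {
    glTexImage2D(GL_PROXY_TEXTURE_2D, 0, GL_RGBA8, w, h, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, 0);
    GLint got = 0;
    glGetTexLevelParameteriv(GL_PROXY_TEXTURE_2D, 0, GL_TEXTURE_WIDTH, &got);
    return got != 0;
}
```

Keep in mind the proxy check only validates dimensions and format against driver limits; it does not guarantee that memory will still be free when you make the real allocation.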
Your resource manager should be designed to keep a running tab of the memory allocated for all types of resources. It will not be completely accurate, because OpenGL hides a lot of details from you, but it will give you the big picture. You will be able to see immediately that switching from an RGBA16F render buffer to an RGBA8 one saves you X-many bytes of memory, or that eliminating one vertex attribute from a vertex buffer changes its storage requirements, for instance. You can insert your own checks when allocating resources and handle them as assertion failures, etc. at run-time. Better to define and monitor your own thresholds than to have OpenGL complain only AFTER it cannot satisfy a memory request.
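A running tab of this sort is mostly bookkeeping arithmetic. Here is a minimal sketch; the per-texel sizes and the mip-chain estimate are approximations (real drivers pad and align allocations), and all the names are mine, not part of any API:

```c
#include <stddef.h>

/* Approximate bytes per texel for two common formats. Treat these as
 * estimates for a running tab, not exact driver-side allocations. */
enum Fmt { FMT_RGBA8 = 4, FMT_RGBA16F = 8 };

static size_t g_gpu_bytes = 0;   /* running tab across all tracked resources */

/* Estimated bytes for a 2D texture; with mipmaps, sums the whole chain
 * (roughly 4/3 of the base level for square power-of-two textures). */
size_t tex2d_bytes(size_t w, size_t h, enum Fmt f, int mipmapped) {
    if (!mipmapped)
        return w * h * (size_t)f;
    size_t total = 0;
    for (;;) {
        total += w * h * (size_t)f;
        if (w == 1 && h == 1)
            break;
        if (w > 1) w /= 2;
        if (h > 1) h /= 2;
    }
    return total;
}

void   track_alloc(size_t bytes) { g_gpu_bytes += bytes; }
void   track_free(size_t bytes)  { g_gpu_bytes -= bytes; }
size_t tracked_total(void)       { return g_gpu_bytes; }
```

With this you can answer the "X-many bytes" question concretely: dropping a 1920x1080 render target from RGBA16F to RGBA8 saves 1920 * 1080 * 4 = 8,294,400 bytes, roughly 8 MB.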
There's no "one size fits all" approach for this. It all depends on the application and how critical it is. The general rule is: wherever possible, fail gracefully and safely.
In the case of a game, a preferable course of action would be to save a snapshot of the current game state (it's a good idea to add autosave spots before and right after critical points), terminate the game process, and show the user an understandable reason for the failure; and if there is a save game, assure them that their progress is not lost.
In the case of a medical diagnostics system, inform the user that the graphics display has become corrupt and that what is currently visible on screen must not be used for any further diagnostic purposes.
In the case of a flight controller display, a medical treatment system, or similar applications where total failure is not an option, your system must be built so that in any partial failure the failing part is isolated, with enough redundancy and backups that operation can continue normally.
Flight controller displays, for example, are not fed by a single computer; each display has (IIRC) three independently operating computers producing identical output. Their programming differs, so that a programming failure in one of the computers will create an inconsistency with the other two. Each computer feeds its internal state into an arbiter which makes sure that all computers agree on their data. The display signal itself is fed through a further independent comparing arbiter which compares the display output and would disable the offending system's output in case of failure as well.