Early Raw Notes on JSON, trace packets, history of code, etc.
Note: This is a historical document compiled from several emails with some useful info, written for internal consumption back when vogl didn't have a name yet. Substitute "vogl" for all instances of "gli":
I've put a lot of effort into supporting JSON in this project. Feel free to poke around the JSON trace files: the data that lives there is all the trace data that the replayer and GL state restorer use during replay (i.e. nothing is hidden, and the conversion to JSON is lossless).
I've historically had some issues with compatibility with JSON tools such as "jq" because my stuff allowed duplicate keys in objects, but I believe I've fixed all of that (and I've added code to detect these cases). Note that when the serializer gets stuck and can't losslessly encode a double as a JSON double data type, it'll resort to encoding the value as a hex string (always beginning with "0x"). Likewise, when it can't convert what is supposed to be a GLenum back to a string, it'll resort to a hex value.
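
The hex fallback for doubles can be sketched roughly as follows. This is only an illustration, not the project's serializer: `encode_double_for_json` is a hypothetical name, the code uses C++11 library headers for brevity, and it assumes the process runs in the "C" locale and that doubles are 64-bit IEEE 754.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper (not the actual serializer): returns JSON text for v.
// Finite values are written as a decimal number only if parsing the text
// back yields the exact same bits. Anything else (NaN and infinities, which
// JSON numbers cannot represent) becomes a quoted hex string of the raw
// 64-bit pattern, always beginning with "0x".
std::string encode_double_for_json(double v)
{
    char buf[64];
    if (std::isfinite(v))
    {
        // 17 significant digits are enough to round-trip an IEEE 754 double.
        int n = std::snprintf(buf, sizeof(buf), "%.17g", v);
        assert(n > 0 && n < (int)sizeof(buf));
        (void)n;
        double back = std::strtod(buf, NULL);
        if (std::memcmp(&back, &v, sizeof(v)) == 0)
            return std::string(buf);
    }
    // Fallback: dump the raw bits so nothing is lost.
    uint64_t bits = 0;
    std::memcpy(&bits, &v, sizeof(bits));
    std::snprintf(buf, sizeof(buf), "\"0x%016llX\"", (unsigned long long)bits);
    return std::string(buf);
}
```

A reader of such a file would treat any string value starting with "0x" in a numeric slot as raw bits rather than as text.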
I've got a lot of C/C++ utility routines that either give access to, or help with the following common GL tasks. Much of this lives in the glicommon lib which is used by the tracer and glireplay:
- Take a GL enum (pname) and convert it to a string, optionally given a function or category hint (because GL_NO_ERROR, GL_NONE, and GL_ZERO are all 0, for example)
- Take a GL enum string and convert it to a value
- Given a pname (a GL enum), retrieve the # of parameter array elements associated with the enum (this table was originally from apitrace, then I heavily tweaked/fixed it, still ongoing)
- I have various arrays and helpers with info about every GL/GLX function: return value type/namespace, param type/namespace, whether or not the NVidia driver exported the func (or if you must get it via GetProcAddress), etc.
- There are around 125 GL types ("ctypes"), I have helpers that can give you their names, whether or not they are pointers, whether or not they are unsigned/signed/float/etc.
- Lots of helpers to determine the array sizes of GL textures, images, etc.
- Lots of internal texture format related helpers
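
To make the first two helpers in the list above more concrete, here is a minimal sketch of an enum lookup with a category hint. The category names, the tiny table, and the function names are all hypothetical (the real tables are generated by gligen and are far larger); only the GL enum values themselves are real.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical categories used to disambiguate enums that share a value.
enum enum_category { CAT_ANY, CAT_ERROR, CAT_BLEND, CAT_BUFFER };

struct enum_desc { unsigned value; enum_category category; const char* name; };

// Tiny illustrative table; the enum values are the real GL ones.
static const enum_desc g_enum_table[] = {
    { 0x0000, CAT_ERROR,  "GL_NO_ERROR" },
    { 0x0000, CAT_BLEND,  "GL_ZERO" },
    { 0x0000, CAT_BUFFER, "GL_NONE" },
    { 0x0001, CAT_BLEND,  "GL_ONE" },
    { 0x0500, CAT_ERROR,  "GL_INVALID_ENUM" },
    { 0x0DE1, CAT_ANY,    "GL_TEXTURE_2D" },
};
static const size_t g_enum_table_size = sizeof(g_enum_table) / sizeof(g_enum_table[0]);

// Value -> name. Prefers an entry whose category matches the hint, otherwise
// returns the first entry with that value. Returns NULL for unknown values,
// in which case a caller could fall back to printing hex.
const char* enum_to_string(unsigned value, enum_category hint)
{
    const char* fallback = NULL;
    for (size_t i = 0; i < g_enum_table_size; ++i)
    {
        if (g_enum_table[i].value != value)
            continue;
        if (g_enum_table[i].category == hint)
            return g_enum_table[i].name;
        if (!fallback)
            fallback = g_enum_table[i].name;
    }
    return fallback;
}

// Name -> value. Returns false if the name is not in the table.
bool string_to_enum(const char* name, unsigned* out)
{
    assert(name && out);
    for (size_t i = 0; i < g_enum_table_size; ++i)
    {
        if (std::strcmp(g_enum_table[i].name, name) == 0)
        {
            *out = g_enum_table[i].value;
            return true;
        }
    }
    return false;
}
```

A linear scan keeps the sketch short; a real table with thousands of entries would want a sorted array or a hash map.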
Note the 64-bit replayer can play back traces made in a 32-bit process, and vice versa. Trace pointers are always represented as 64-bit uints internally.
The C++ tool "gligen" scans the old official GL spec and enum files and spits out all the .inc files needed by the interceptor and the glicommon lib to get this info. In a minor fit of madness I also hacked up a custom parser for glapi.py (from apitrace) to get key GL function return and parameter namespace info that was unavailable anywhere else. I also use a set of Python scripts that scrape opengl.org for GL API info, which is compiled into a huge XML file which gligen loads. I don't use this XML file to drive my stuff (none of the info in there affects what gets output in the .inc files), but I do cross reference everything in there against everything else to check for correctness. The files which drive gligen currently live in bin/glspec. gligen must be run from this directory.
libglitrace.so only outputs binary traces. Everything in the trace file is organized into simple packets, and the packets are always just appended to the end of the file, as if it were a network stream. Each packet starts with a simple header containing an ID and the packet's full size.
The file begins with a tiny "start of file" (SOF) header packet. Most importantly, this packet sends down the process's pointer size. Immediately following the SOF packet are 3 glInternalTraceCommandRAD() packets which are currently used for debugging. These guys describe every GL function and data type (ctype) that the tracer knew about. I plan on merging the SOF and these trace command packets into a single packet of some sort soon, which will simplify some code in several places. Almost all following packets are "GL entrypoint" packets, and the trace should usually be terminated by a final EOF packet.
After the initial intro packets there is a raw dump of every GL call made by the app. Nothing is filtered, transposed, deleted or expanded (unlike apitrace, which uses the trace file more as a flexible instruction stream for its replayer). The trace file write is mutexed, so if the app uses multiple contexts on different threads it's possible to see the trace flip flop between threads/contexts. Each GLX/GL entrypoint packet contains:
- The index of the GL call, the GL call counter, the context that made the call, etc.
- The return value (up to a uint64)
- The parameter values (up to a uint64)
- There's a blob of data in the packet containing any referred to return or param arrays, which are copied from client memory verbatim. Note the tracer only follows parameter pointers once, i.e. it can't automatically serialize complex data structures containing pointers to pointers. There are very few GL data structs this complex, so manually coding these exceptional cases works fine.
- Finally, there's an optional "name value map" data structure that appears in more complex GL calls. The tracer can put anything it wants there. Names can be any common data type (ints/strings/floats/etc.). Values can be a common data type, or uint8 blobs. It's really like a simple JSON object, but it's purely binary.
These packets are handled by a class named "gli_trace_packet" in the glicommon lib. This code handles the JSON and binary serialization/deserialization of trace packets. Both the tracer and replayer leverage this class to deal with trace files. The JSON trace files are a 1:1 direct rep of the binary data whenever possible. The JSON deserializer tries to be very forgiving in case the JSON text was manually edited.
There are also GL state snapshots, generated by glireplay's -trim command, which are currently always serialized as textual JSON as a separate loose file. These snapshots describe the GL context state, and all live GL objects. It's not all the state that GL supports, but a decent chunk of it and everything that Source1 and Scaleform GL need. The -trim stuff is still a work in progress; I intend on serializing state snapshots directly to the binary trace using binary JSON.
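
The append-only packet stream described above (a header with an ID and the packet's full size, then the payload) can be sketched like this. The header layout, the field widths, and the packet ID values are assumptions made for illustration; they are not the actual on-disk format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical header: the notes only say it holds an ID and the full size.
struct packet_header
{
    uint32_t id;    // packet type
    uint32_t size;  // full packet size in bytes, header included
};

// Hypothetical packet IDs.
enum { PKT_SOF = 1, PKT_GL_ENTRYPOINT = 2, PKT_EOF = 3 };

// Appends one packet to the end of the stream, like writing to a socket.
void append_packet(std::vector<uint8_t>& stream, uint32_t id,
                   const void* payload, uint32_t payload_size)
{
    assert(payload || payload_size == 0);
    packet_header h = { id, (uint32_t)(sizeof(packet_header) + payload_size) };
    const uint8_t* p = reinterpret_cast<const uint8_t*>(&h);
    stream.insert(stream.end(), p, p + sizeof(h));
    if (payload_size)
    {
        const uint8_t* d = static_cast<const uint8_t*>(payload);
        stream.insert(stream.end(), d, d + payload_size);
    }
}

struct walk_result
{
    size_t num_packets;
    size_t num_gl_packets;
    bool saw_eof;
    bool ok;  // false if a header or payload was truncated or malformed
};

// Walks the stream packet by packet using only the size in each header.
walk_result walk_packets(const std::vector<uint8_t>& stream)
{
    walk_result r = { 0, 0, false, true };
    size_t ofs = 0;
    while (ofs < stream.size() && !r.saw_eof)
    {
        packet_header h;
        if (stream.size() - ofs < sizeof(h)) { r.ok = false; break; }
        std::memcpy(&h, &stream[ofs], sizeof(h));
        if (h.size < sizeof(h) || h.size > stream.size() - ofs) { r.ok = false; break; }
        ++r.num_packets;
        if (h.id == PKT_GL_ENTRYPOINT) ++r.num_gl_packets;
        if (h.id == PKT_EOF) r.saw_eof = true;
        ofs += h.size;
    }
    return r;
}
```

Because every header carries the full packet size, a reader can skip packet types it doesn't understand, and a trace that was cut short (for example by a crash) shows up as a missing EOF packet or a truncated final packet rather than as silent corruption.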
- The pointer values in the packets are just for debugging; the replayer almost entirely ignores them. All it cares about is whether they are 0 (NULL) or not. This is very different from apitrace, which uses more of a virtual memory design with dynamic pointer remapping.
- The GL object handles in the trace must be consistent, but they can be absolutely anything. The replayer maps from the trace to GL domain on the fly as it plays back traces. It does the same thing for program uniform locations.
- There's some special handling for program attribs to ensure the tracer and replayer don't need to dynamically remap attrib indices. This is somewhat hairy. apitrace historically hasn't gotten this right everywhere and it causes replay divergence problems on Source1.
- There are some evil name value maps that I need to fix. For example, the name value map sent down by glLinkProgram() (containing attrib and uniform information, and location handles) is virtually impossible to manually understand or manually edit. The goal is for the file to be editable by a human, and for there to be no super complex data structs that need to be preserved to successfully edit a file.
- In a successful replay, the backbuffer's CRC64 must match the tracer's. If the app uses multisampling, I can't use a CRC64 and must resort to per-component sums and make sure they are close enough (multisampling is not deterministic on some/all (?) drivers).
- The replayer always makes every GL call it's told to make, even if it knows the call is just going to fail. This is very important, because these GL calls may expose driver bugs that the user is trying to debug. Even glGet()'s must be faithfully replayed.
- The replayer tries to diff its context state vs. what the tracer saw on the fly, and report differences. This is extremely useful to detect divergence or other weirdness.
- The replayer calls glGetError() religiously. Yes there are newer ways but at the end of the day we want to know about GL errors as quickly and precisely as possible.
- It is very hard to build proper state shadows of GL context state. Much harder than most GL devs think. You can't build a proper state shadow unless you also call glGetError() after every call which could change GL state. Full state shadows are the devil, generally. I would argue that building a state shadow that works properly during error conditions, and on every driver (and driver version from the same vendor) is probably impossible (or at least way too much work for it to be practical). I've seen some tracers that try to do this, and they are useless for real debugging because when the shadows diverge (and they do) you wind up debugging the shadow's state and not the actual driver's state.
- I'm currently only using C++03. I'm very leery of requiring C++11 until we know how much it'll actually impact our portability, but I would love to have stuff like rvalue references.
- crnlib started as the crunch Linux port. It contains a lot of stuff that I'll never use on this project (like clusterization and fancy types of dxt compression), so I'll be cutting it down and renaming it soon.
- The tracer is purposely mostly C, with a bit of C++. The tracer must be able to live and survive in arbitrary apps and (long term) it needs to be pretty portable. Also, perf actually matters in the tracer, and more work needs to be done there, so I tended to keep to straight C. I use ugly, C-style global arrays instead of nice classes on some important things due to my past experiences with optimizing this kind of code.
- gligen is more like C++ scripting, perf doesn't matter here, I do all kinds of ugly stuff in there with strings and it doesn't matter.
- I use 99% custom containers: crnlib::vector, crnlib::map (skip list, also supports sets), crnlib::hash_map (open addressing with linear probing, supports sets too), crnlib::introsort, crnlib::dynamic_string (which uses the small string optimization - not refcounting or copy on write), etc. I've been bitten enough times by MS's crappy STL (especially in debug builds of all things) that I've just had enough of relying on any of it.
- glitest -test runs a bunch of unit and smoke tests on a bunch of core classes in crnlib. I keep on adding more tests as time permits.
- I don't call new/delete/malloc/free/etc. - everything goes through crnlib_malloc/crn_new/etc. wrappers.
- You purposely can't free a pointer returned by crnlib_new_array by calling crnlib_free(); you must call crnlib_delete_array(). Array news are prefixed by a special header to detect mismatches at runtime. (This is like standard C++, except in regular C++ you may get lucky or it'll just corrupt the heap.)
- Set CRNLIB_MALLOC_DEBUGGING to 1 to enable fairly robust (but sorta slow) malloc correctness and leak checking, but it only works if you use crnlib_malloc/crn_new. I have been burned by glibc's crappy malloc debugging implementation so I found something else that I can trivially enable and know it's working.
- I still rely on the C heap but we've now got access to Iggy's new heap code and I intend on switching to it soon.
- I'm currently just using a single global heap, but I know this will probably not perform well enough in the tracer, so long term it'll need to be fixed at least there to use pools, multiple heaps, etc.
- I don't use C++ exceptions at all, or RTTI, like most game devs. I'm deeply in the "C++ exceptions are garbage" camp.
- When you see my code use virtuals it means I really need them. Virtuals were insanely slow on X360 so I learned to avoid them whenever possible.
- I rarely, if ever, use multiple inheritance. I may inherit from multiple abstract base classes on a full moon, though.
- I like templates and think they are mostly fine, I just try not to go too crazy with things like mixins and metaprogramming because I know it can make the code impossible to understand by other people. (Believe me, I've done it, and loved doing it, but none of my ex-coworkers at MS understood any of it.)
- I now try to avoid the C run time's string code due to locale issues. Locale issues bit us hard on the TF2/L4D2 Linux ports, so I'm now just writing my own string helpers whenever possible. I don't even trust the C runtime to convert an int to a string for crying out loud, what's the world coming to. We still rely on the C run time's sprintf(), etc. implementation but I want to kill that as soon as possible.
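
As an illustration of the handle mapping mentioned in the replayer notes above (trace handles can be anything, and the replayer maps them to whatever the replay driver hands back), here is a minimal sketch. The class and method names are hypothetical, and std::map merely stands in for the custom crnlib containers the project prefers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical trace-handle -> replay-handle table for one object namespace
// (textures, buffers, programs, ...).
class handle_remapper
{
public:
    // Called after replaying a glGen*/glCreate* style call: remember which
    // replay-side handle corresponds to the handle recorded in the trace.
    void bind(uint32_t trace_handle, uint32_t replay_handle)
    {
        m_map[trace_handle] = replay_handle;
    }

    // Translates a trace handle before it is passed to the replay driver.
    // Handle 0 ("no object") always maps to 0. Returns false for a handle
    // the trace never created, which a replayer would want to report.
    bool remap(uint32_t trace_handle, uint32_t* replay_handle) const
    {
        assert(replay_handle);
        if (trace_handle == 0)
        {
            *replay_handle = 0;
            return true;
        }
        std::map<uint32_t, uint32_t>::const_iterator it = m_map.find(trace_handle);
        if (it == m_map.end())
            return false;
        *replay_handle = it->second;
        return true;
    }

    // Called after replaying a glDelete* style call.
    void erase(uint32_t trace_handle) { m_map.erase(trace_handle); }

private:
    std::map<uint32_t, uint32_t> m_map;
};
```

The same shape of table would work for program uniform locations, keyed per program, since those are also remapped on the fly.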