The past few months things have been quiet as I'm working towards the goal of a 1.0 version release. That means you shouldn't get your hopes up for anything shiny and new in this release! There has been work in the background doing refactoring and some under-the-hood improvements, and this version still brings plenty of bugfixes.
Also as things converge towards a 1.0, please take a moment to read the following two documents if they are important to you:
Since privacy is very important to me I want to make sure I hear any concerns anyone might have on these topics as soon as possible. A little while ago I gauged general reactions to these ideas in principle and they seemed positive - insomuch as very few people were starkly opposed to these changes happening - so now I've written up the details for everyone to check. Some people will always want to opt out of any such options, which is perfectly fine.
Any feedback on these can be left as comments on those pages, or sent to me via twitter, email or on #renderdoc on freenode.
Version v0.34
Binaries for this release are up on the downloads page for Windows and x64 linux as a binary tarball.
For anyone building from source now, note that VS2015 is now required to build on windows. VS2010 was retired with a heavy heart a couple of months ago.
To anyone packaging renderdoc for linux, there have been some changes to the build requirements to support building python integration in. There have also been improvements to version tagging, so pay attention to the BUILD_VERSION_* cmake variables, and also take a look at renderdoc/api/replay/version.h.
Features/Improvements
- The Qt UI now supports full python integration, and there's a fully documented and deliberately designed python API available for the internals. Also if PySide2 is available at build-time, there's cross-integration with Qt so you can create new UI panels and widgets from python and have them integrate into the windowing system as well.
- Many other smaller incremental improvements to the Qt UI.
- Vulkan captures will now do simple remapping of physical devices. If the capture system and replay system both contain two physical devices (for example Intel IGPU and Nvidia DGPU) and the order changes between capture and replay, the closest matching device will be used for each. This also means if you capture on a system with only an NV card and replay on a system with AMD and NV cards, the NV card will be explicitly selected. This should reduce some annoying crashes between capture and replay on the same system, but vulkan captures are still not generally portable between differing systems unless the hardware and driver are similar enough.
- Further improvements to lower-end support on GL. In particular more code that was unnecessarily relying on EXT_dsa was removed or emulated.
- Add support for KHX experimental external sharing extensions KHX_external_memory*, VK_KHX_external_semaphore*.
- Add a potential workaround for slow-down or lockups in the UI. This wasn't seen consistently, but it seems that primarily on OpenGL captures the UI's mousemove events could come in faster than the underlying system could, for example, pick texture values. This led to a backlog of queued events and made the UI laggy - or locked it up if a synchronous event happened, like changing drawcall. Now high-frequency events like texture picks on mouse move are allowed to pre-empt and remove any queued events, so the queue will never be more than one event behind and can quickly catch up once the mouse stops moving.
- Changed version number tagging - particularly for linux builds. Instead of packing an "-official" suffix onto the git hash, we now configure several variables independently to store the git commit hash, distribution name, distribution-specific version, and contact URL for the distribution package.
- Added vulkan hardware counters by @victor-moya.
- Improvements to handling of VS output/GS output. The output buffer is no longer a fixed size but now resizes up to whatever size is needed, and GS output is fetched individually for each instance. This fixes a couple of bugs where the VS/GS output would be corrupt for instances after 0 due to an incorrect per-instance stride calculation.
- Apply a fudge-factor to the non-contractual refcount on the D3D12 backbuffer, to try to match the runtime's behaviour.
- When launching a new process from the UI, open a little infinite progress bar if it's going to take a while instead of locking up the program.
- Fetch the renderpass state in vulkan even if no pipeline is bound - this allows previewing a renderpass when the vkCmdBeginRenderPass event is selected.
- Unset the renderdoc vulkan capture layer when replaying. This prevents problems if the env var was accidentally left set when running the replay program.
- Added a floateleven specifier in the buffer viewer for unpacking R11G11B10 data.
- Improved copy-paste support from tree or list views. Ctrl-A will now 'select all', and the results will be sorted in-order before being copied.
- Added an option to completely disable the fake event markers that are added to captures with no markers.
- Added a --python command line option to renderdocui to run a python script from the command line completely automated.
- Add support for saving Depth24 textures in GL to HDR/EXR formats, and saving double formatted data.
- Add a python function to return the resource ID of the texture debug overlay.
- On GL cache the results of fetching a particular mip level. Because all array slices are together in a mip level in GL, this could lead to extreme memory allocation overhead when fetching each slice individually.
- Add a call to XInitThreads on linux in replay applications to ensure nvidia driver optimisations can work without crashing, and a warning about the 378 series where these will crash unless disabled by an environment variable.
- When naming command buffers in vulkan, make sure to propagate the name to all baked command buffers.
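
As a rough illustration of the UI event pre-emption described in the list above, here is a minimal sketch in python. It is not the actual RenderDoc code: the names UIEvent, PreemptingQueue and the "pick"/"set_event" labels are hypothetical, and it assumes that a pre-empting event only discards older queued events of the same kind, so that a queued drawcall change is not lost.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class UIEvent:
    kind: str                    # hypothetical label, e.g. "pick" or "set_event"
    action: Callable[[], None]   # work to run on the replay side
    preempt: bool = False        # True for high-frequency events (mouse-move picks)


class PreemptingQueue:
    """Queue where a high-frequency event replaces stale ones of the same kind."""

    def __init__(self) -> None:
        self._pending: Deque[UIEvent] = deque()

    def push(self, event: UIEvent) -> None:
        if event.preempt:
            # Assumption: only older events of the same kind are dropped.
            self._pending = deque(e for e in self._pending if e.kind != event.kind)
        self._pending.append(event)

    def drain(self) -> None:
        # Run everything still queued, oldest first.
        while self._pending:
            self._pending.popleft().action()

    def __len__(self) -> int:
        return len(self._pending)


def demo() -> List[str]:
    log: List[str] = []
    queue = PreemptingQueue()
    queue.push(UIEvent("set_event", lambda: log.append("set_event")))
    # Simulate 100 mouse moves arriving before the replay side catches up.
    for x in range(100):
        queue.push(UIEvent("pick", lambda x=x: log.append(f"pick {x}"), preempt=True))
    queue.drain()
    return log
```

With this shape the queue never holds more than one pending pick, however fast mouse moves arrive, so only the most recent pick is ever processed once the replay side is free.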
Bugfixes
- Fix a regression on linux that could cause UI panels not to draw, due to a fix being lost in a bad merge.
- Update the windows hooking code to handle the same dll filename being loaded from multiple places, and so having unique module entries. This commonly manifested as any application using AMD's extensions crashing on replay - since the AMD extensions weren't properly force-disabled on capture as atidxx64.dll was loaded twice.
- Fix a crash on D3D12 if the program was captured as 32-bit and is then replayed in a 64-bit UI.
- Remove code that ignored SIGCHLD signals, since Qt needs them internally to function. This will cause every process launched by RenderDoc to become a zombie process until qrenderdoc closes.
- Add missing handling of VK_FORMAT_A8B8G8R8_*_PACK32.
- Remove direct-mode display vulkan extensions when replaying as they aren't used.
- Correct some parsing of /proc/self/maps - device numbers are in hex not decimal.
- Fetch pipeline state after replaying drawcall, not before, so we pick up the state consistently (mostly only applies to mutable data in the state like hidden atomic counter values).
- Handle disassembling unknown extension-set operations in SPIR-V without crashing. Also add support for AMD/NV extension operations.
- Only serialise Vulkan queue indices if the sharing mode is CONCURRENT.
- Fix calculation of compressed texture size per-mip to avoid over-allocation.
- Stopped the python shell incorrectly complaining about missing libraries.
- Experimental fix for an unknown crash disassembling SPDB chunks.
- Fix crash with coherent vulkan maps if a not-mapped memory handle was unmapped again.
- Fixes for handling use of VAO 0 (the default VAO).
- Handle errors in glCreateShaderProgramv by returning immediately instead of trying to wrap a '0' program.
- Fix off-by-one event IDs in runtime generated debug messages.
- Fix DrawInstanced setting baseVertex instead of vertexOffset property.
- When patching D3D12 pipeline state objects, ensure samplemask and sampledesc are initialised properly as they might not have been in the original.
- Fix an out-of-range error when picking vertices in the GS output which is expanded to more vertices than existed in VS input/VS output.
- Apply the correct image usage to vulkan swapchains instead of just our own, so that e.g. STORAGE_BIT is replayed correctly.
- Fix a mistaken output merger validation condition - depth-read-only DSVs where the texture only contains depth can be bound if an SRV is already bound. Previously we were only handling the case for textures with depth and stencil and depth-read-only.
- CUDA dlls are no longer hooked to allow capturing applications that use CUDA.
- On Vulkan when creating images during replay we add any usage bits we might have needed for the image on capture, to ensure memory requirements are compatible. We don't need to do this though for images that aren't part of the capture itself. This fixes an issue with the intel vulkan mesa driver where it doesn't support storage multisample images.
- SPIR-V reflection should list all outputs from pixel shaders as system values containing colour, even if they're not annotated explicitly.
- Fix the calculation of slices for displaying 3D textures on D3D12.
- Fix a copy-paste error that would replay vkCmdDrawIndexedIndirect as vkCmdDrawIndirect.
- Fixed element size not being set for D3D11 structured buffer UAVs when bound to the OM instead of CS.
- Fixed a possible state-vector trash when resizing swapchains that could show up as incorrect state.
- Fix an incorrect sample mask being used when fetching shader output values in pixel history.
- On D3D11 fix OM UAVs not being shown as used if a stage other than pixel shader uses them.
- Fixed vulkan secondary command buffers not taking a reference to inherited framebuffer and renderpass.
- Fix a crash on sparse buffer serialisation where a structure was being trashed with incorrect data.
- Handle vulkan physical devices without the depthClamp capability.
- On Vulkan removed a leak of semaphores when replaying.
- Fixed handling of small sparse buffers in vulkan that might need to be rounded up to larger memory requirements.
- Made sure that shader search paths for separate binary info are always available even if the user isn't using SetPrivateData to specify the relative path for a shader.
- Changed shader search paths to be stored publicly so they are reflected into the xml file.
- Fixed a refcounting problem on D3D11, where resources that were released after being recorded onto a deferred device context but before being executed onto the immediate device context would not be kept alive.
- Fixed a case where Vulkan MSAA textures that haven't been touched and need only be cleared instead of having data restored, would crash on replay trying to copy data that didn't exist.
- Fixed some cases where the code didn't handle strip restart indices properly.
- Clamp the 'maximum row count' value in the raw buffer viewer to 0.
- Fix a D3D12 crash where the number of descriptors in a range was unbounded.
- Fix an issue editing shaders on GL where fragdata bindings being copied from the previous shader program to the edited shader program would return duplicate bindings (multiple variables bound to the same slot) and needed to be ignored.
- Fixes to size calculation on compressed array textures on GL.
- On GL we serialise out program attrib and fragdata bindings. In normal execution the program sets these explicitly in the shader or in code so it works out, but if the program uses the undefined default values and reflects to change its calls, then replaying on a different machine with different undefined values could break things. Now we always serialise out the program state and apply it.
- Fix the highlighting of matching GL attributes and vertex buffers in the GL pipeline state view.
- Fix a crash if the filesystem watcher looking for modifications to custom shaders fired after the texture viewer was closed.
- Fix a crash with a progress bar value going outside of 0.0-1.0.
- Processes that were created by the user as CREATE_SUSPENDED are no longer resumed inadvertently.
- Work around an intel driver bug ironically triggered by driver bug checks at startup.
- Fix a mistake where we would change the currently bound VAO when replaying a glEnableVertexAttrib or glDisableVertexAttrib call.
- On top of the above, we also save and restore the VAO state explicitly because nvidia's driver has the same exact bug!
- When uploading the GL font texture, make sure to reset the pixel unpack state to default/tightly packed in case the application changed it.
- Fix GL VAOs not replaying properly if double attributes were used.
- Fix handling of empty structs in SPIR-V disassembly.
- Fix a crash when shader debugging if an index into an immediate constant buffer was negative and caused the bounds check to overflow.
- Change the ctrl-left and ctrl-right shortcut to move to previous/next drawcall so that it doesn't override the existing shortcut in textboxes to move between words.
- Fix some configuration being trashed when restoring ID3DDeviceContextState, that would lead to inaccurate reference counting and a crash if a debug message was generated.
- Stop an out-of-bounds array access when checking for samplers in non-sampled load operations.
- Fix handling of InstanceDataStepRate = 0 for instanced vertex inputs in D3D11, treating it as all values being identical.
- Add an ignore to hooking CoreMessaging.dll - it causes a bizarre crash in PeekMessage if hooked.
- Fix vulkan indirect draw functions not advancing the parameter pointer when parsing each individual draw, leading to all draws looking identical in the mesh viewer etc.
- Slightly increased the timeout waiting for linux processes to start and open a port for target control connections. This avoids an error like "failed to launch X" when X actually launched just fine.
- Fix several flipped states where a cross was shown instead of a tick in the GL pipeline viewer.
- Don't let linux applications delete the shared log while it's still open in another program (like the replay UI).
- Make sure to align elements in an array to float4 boundaries in constant buffers when fetching structured data. This still allows variables to sit after the array in a shared packed float4.
- On D3D11 fix a potential crash reading out-of-bounds when trying to update more vertices than are available in the drawcall, if the buffer was resized to be larger by a previous draw. Now we only upload to a subset of the buffer.
- On linux don't print to stdout/stderr with log messages in captured programs, since this can interfere with child processes, for example if you run a script which then calls dirname or pwd.
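
To illustrate the /proc/self/maps fix mentioned in the list above, here is a small python sketch of parsing one line of that file. This is not RenderDoc's own parser (which is C++); the MapEntry type and parse_maps_line function are hypothetical names for illustration. The point it shows is that the address range, offset and device major:minor fields are all hexadecimal, while the inode is decimal.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    dev_major: int
    dev_minor: int
    inode: int
    path: str


def parse_maps_line(line: str) -> MapEntry:
    # The pathname is optional and may contain spaces, so split at most 5 times.
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""

    # Address range and offset are hexadecimal.
    start, end = (int(x, 16) for x in addr.split("-"))
    # Device is major:minor, also hexadecimal - parsing these as decimal
    # would fail or give wrong values for devices such as "fd:01".
    major, minor = (int(x, 16) for x in dev.split(":"))
    # The inode is the one decimal field.
    return MapEntry(start, end, perms, int(offset, 16), major, minor, int(inode), path)


entry = parse_maps_line(
    "7f1c2a400000-7f1c2a5c0000 r-xp 00000000 fd:01 1312345    /usr/lib/libfoo.so\n"
)
```

In this sample line (made up for the example) the device "fd:01" is major 253, minor 1, which a decimal parse could not represent.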