Skip to content

Added Win32 Import function prototype#503

Merged
jlacroixAMD merged 1 commit intoGPUOpen-LibrariesAndSDKs:masterfrom
Agrael1:master
Sep 9, 2025
Merged

Added Win32 Import function prototype#503
jlacroixAMD merged 1 commit intoGPUOpen-LibrariesAndSDKs:masterfrom
Agrael1:master

Conversation

@Agrael1
Copy link
Copy Markdown
Contributor

@Agrael1 Agrael1 commented Aug 28, 2025

I have added an import function for Win32 Handle and also added vmaGetMemoryWin32Handle2 that can export handle with supplied handle type. Note that Allocator must be created with support for supplied handle type.

Closes #499

@sawickiap
Copy link
Copy Markdown
Contributor

I am sorry but I don't want to add such a significant amount of code to avoid feature creep. Importing Win32 or D3D11 handles is a special feature that looks very detached from what VMA library does. The purpose of VMA is to simplify the most common use case which is creating buffers and images while suballocating them from larger VkDeviceMemory blocks that are hidden from the user. The requirement to query for eligible memory types in a special way and create a dedicated memory block in a special way looks to me more like something that should be done using pure Vulkan functions, not wrapped in VMA code. The code you proposed in function TestImportWin32Handle demonstrates that VMA doesn't do much more than you would do with raw Vulkan functions - creating VkDeviceMemory object instead of the special kind of VmaAllocation object from imported handle.

I also don't want to add dependency on DXGI or D3D11 to the sample project. Initializing D3D11 device in the VMA sample is far out of scope of this codebase.

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 1, 2025

Yes, it always allocates a dedicated block that is managed by VmaAllocation. The main feature of the block is to hold the reference to shared object and clean it up. Secondly There is D3D12 heap and Vulkan Opaque Win32 Handle/KMT, which are allocations of the block of memory, which is to be suballocated. I can see the latter being allocated with VMAPool object, but with D3D12 there is more work.

My test was misleading a bit, I acknowledge it.

Should I then decline the PR?

@sawickiap
Copy link
Copy Markdown
Contributor

Regarding your comment in #499, what do you mean by "at least add an export function and update the handle for KMT handles"? Do you mean adding vmaGetMemoryWin32Handle2 to support a custom handleType? It may be OK if it doesn't add too much logic, but then, should the function be still called "Win32" if it supports other handle types?

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 1, 2025

You see, the problem with those extensions are, that they don't provide a uniform way to export handles.

vmaGetMemoryWin32Handle2 can theoretically export any heap (D3D11/12, Win32 and KMT variations). Right now vmaGetMemoryWin32Handle only exports a Win32 Opaque, which is well managed, but limited, as D3D can't import those, only OpenGL and Vulkan can.

All D3D and KMT versions are still Win32 HANDLEs, so the logic holds.

I could also add FD and others support, but that would be in another PR.

Now I actually think, maybe it is worth to create an extension header for those features, instead of adding everything to a single file?

Comment thread include/vk_mem_alloc.h Outdated
private:
HANDLE m_hHandle;
VMA_RW_MUTEX m_Mutex; // Protects access m_Handle
bool m_NTHandle = false; // True if m_Handle is NT handle, false if it's a KMT handle.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest renaming this variable to m_IsNTHandle.

Comment thread include/vk_mem_alloc.h Outdated
m_hHandle = VMA_NULL;
return res;
}
m_NTHandle = (::GetHandleInformation(m_hHandle, VMA_NULL) != 0);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you rather use IsNTHandle method here instead of duplicating its logic?

Comment thread include/vk_mem_alloc.h
public:
// Strengthened
VkResult GetHandle(VkDevice device, VkDeviceMemory memory, PFN_vkGetMemoryWin32HandleKHR pvkGetMemoryWin32HandleKHR, HANDLE hTargetProcess, bool useMutex, HANDLE* pHandle) noexcept
VkResult GetHandle(VkDevice device, VkDeviceMemory memory, PFN_vkGetMemoryWin32HandleKHR pvkGetMemoryWin32HandleKHR, VkExternalMemoryHandleTypeFlagBits handleType, HANDLE hTargetProcess, bool useMutex, HANDLE* pHandle) noexcept
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add an VMA_ASSERT here restricting handleType to only compatible values?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added it to vmaGetMemoryWin32Handle2, since it is the only place this could matter

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I trust you in implementing it the best way, since you know this are and you can test it.

@sawickiap
Copy link
Copy Markdown
Contributor

I propose to split this change into 2 phases.

1. Extending export

I am happy to accept the part of your code related to extending export to other types of handles - the logic that starts at function vmaGetMemoryWin32Handle2.

I just posted few comments as a code review.

2. Adding import

I have another idea. What if I add following functions that would create VmaAllocation, optionally also VkBuffer or VkImage, always with dedicated memory, and they would offer extra parameter for the pNext chain of the AllocateInfo? Everything else, like providing VkImportMemoryWin32HandleInfoKHR structure, would be on the user's side.

I think this approach would:

  • give a lot of flexibility and generality,
  • not add too much code to VMA.
VMA_CALL_PRE VkResult VMA_CALL_POST vmaAllocateDedicatedMemory(
    VmaAllocator VMA_NOT_NULL allocator,
    const VkMemoryRequirements* VMA_NOT_NULL pVkMemoryRequirements,
    const VmaAllocationCreateInfo* VMA_NOT_NULL pCreateInfo,
    void* VMA_NULLABLE VMA_EXTENDS_VK_STRUCT(VkMemoryAllocateInfo) pMemoryAllocateNext,
    VmaAllocation VMA_NULLABLE* VMA_NOT_NULL pAllocation,
    VmaAllocationInfo* VMA_NULLABLE pAllocationInfo);
VMA_CALL_PRE VkResult VMA_CALL_POST vmaCreateDedicatedBuffer(
    VmaAllocator VMA_NOT_NULL allocator,
    const VkBufferCreateInfo* VMA_NOT_NULL pBufferCreateInfo,
    const VmaAllocationCreateInfo* VMA_NOT_NULL pAllocationCreateInfo,
    void* VMA_NULLABLE VMA_EXTENDS_VK_STRUCT(VkMemoryAllocateInfo) pMemoryAllocateNext,
    VkBuffer VMA_NULLABLE_NON_DISPATCHABLE* VMA_NOT_NULL pBuffer,
    VmaAllocation VMA_NULLABLE* VMA_NOT_NULL pAllocation,
    VmaAllocationInfo* VMA_NULLABLE pAllocationInfo);
VMA_CALL_PRE VkResult VMA_CALL_POST vmaCreateDedicatedImage(
    VmaAllocator VMA_NOT_NULL allocator,
    const VkImageCreateInfo* VMA_NOT_NULL pImageCreateInfo,
    const VmaAllocationCreateInfo* VMA_NOT_NULL pAllocationCreateInfo,
    void* VMA_NULLABLE VMA_EXTENDS_VK_STRUCT(VkMemoryAllocateInfo) pMemoryAllocateNext,
    VkImage VMA_NULLABLE_NON_DISPATCHABLE* VMA_NOT_NULL pImage,
    VmaAllocation VMA_NULLABLE* VMA_NOT_NULL pAllocation,
    VmaAllocationInfo* VMA_NULLABLE pAllocationInfo);

Would that work for you? If yes, I can implement those.

I don't want to add dependency on extra Vulkan function vkGetMemoryWin32HandlePropertiesKHR to VMA. I also don't want to add dependency on DXGI or D3D11 to the entire project, even it it would mean not having a full test for the feature.

What do you mean by "extension header"? Some new library on top of VMA? That would be out of scope of this repository. Any external repository or fork is welcome - I can link to it from README.

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 2, 2025

I like that idea more, flexibility is very good! That could actually overshadow creation of dedicated allocator just for exporting memory.

I think that is the best course this can go.

I will deliver the export call shortly.

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 2, 2025

Another question. Now that I think of it, FD and HANDLE are mutually exclusive, so it can theoretically be created in a single function call. They behave the same, only RAII wrappers would be changed a bit.

I will try doing that and present it for review.

EDIT: I have looked up the behavior of FD, it does not make sense to do this for FD, since it has no such UB as windows HANDLEs

@Agrael1 Agrael1 force-pushed the master branch 3 times, most recently from 886d2c0 to f990ad9 Compare September 2, 2025 17:46
@sawickiap
Copy link
Copy Markdown
Contributor

sawickiap commented Sep 2, 2025

Thank you for preparing this code. It all looks good to me.

However, when I tested it, it crashes on my machine. Did you test it with launching the sample app and pressing "T"?

Call stack:

KernelBase.dll!00007ffe6e756bd1
VmaSample.exe!VmaWin32Handle::IsNTHandle
VmaSample.exe!VmaWin32Handle::GetHandle
VmaSample.exe!VmaDeviceMemoryBlock::CreateWin32Handle
VmaSample.exe!VmaAllocation_T::GetWin32Handle
VmaSample.exe!vmaGetMemoryWin32Handle
VmaSample.exe!TestWin32Handles
VmaSample.exe!Test
VmaSample.exe!WndProc

It's inside GetHandleInformation. hHandle = 0x000000000000063c.

Exception thrown at 0x00007FFE6E756BD1 (KernelBase.dll) in VmaSample.exe: 0xC0000005: Access violation writing location 0x0000000000000000.

I use Windows 11 24H2 (OS Build 26100.4946).

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 2, 2025

I did, yet I couldn't test it after the changes, since I'm a bit away from the machine. I will do it tomorrow morning

EDIT: I did a small change, that might be causing this. I wonder why it didn't throw anything on me before. If you could run it again I would be grateful.

@sawickiap
Copy link
Copy Markdown
Contributor

Thanks for the fix! I merged your code - to my fork for now:
https://github.com/sawickiap/VulkanMemoryAllocator

I will implement my part with "Dedicated" functions soon.

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 2, 2025

Thanks!

@sawickiap
Copy link
Copy Markdown
Contributor

I've updated documentation. Can you please check if it looks correct? You can find local HTML pages in my fork, latest code of the master branch, file "docs/html/index.html".

What I changed is:

  • Documentation of functions vmaGetMemoryWin32Handle, vmaGetMemoryWin32Handle2
  • Chapter "Interop with other graphics APIs"

@sawickiap
Copy link
Copy Markdown
Contributor

I am also wondering if specifying multiple handle types is allowed and should be supported? These are bit flags, after all.

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 3, 2025

Adding multiple handle export types are allowed only on memory creation.
VkExportMemoryAllocateInfo says that you can create a memory with multiple handle types, but VkMemoryGetWin32HandleInfoKHR has only FlagBits, meaning it supports a single handle type to export

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 3, 2025

if handleType == VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT_KHR, each call to this function creates a new handle that must be closed using:

CloseHandle(handle);
You can close it any time, before or after destroying the allocation object. It is reference-counted internally by Windows.

Well, technically, handleType can be a KMT (Kernel Mode Thunk) or an NT handle. Any NT handle must be closed with CloseHandle. KMT handles are global and they must not be closed. Also, KMT handles don't maintain reference count to underlying memory, so destroying the memory in one process will cause a device removal (analogous to SIGSEGV on GPU).

@sawickiap
Copy link
Copy Markdown
Contributor

How does the concept of KMT and NT handles map to handle types available in Vulkan? How do you propose to fix the documentation?

@sawickiap
Copy link
Copy Markdown
Contributor

sawickiap commented Sep 3, 2025

I started coding something on branch add-create-dedicated, adding function vmaAllocateDedicatedMemory.

I found a way to test memory import without pulling D3D11. I create a WinAPI shared memory handle using CreateFileMapping and import it. What do you think about it?

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 3, 2025

I started coding something on branch add-create-dedicated, adding function vmaAllocateDedicatedMemory.

I found a way to test it without pulling D3D11. I create a WinAPI shared memory handle using CreateFileMapping and import it. What do you think about it?

You don't have to. It is possible to just create a memory on Vulkan side with handle type VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_BIT_KHR and then import it on the same device. That should create a compatible handle.

@sawickiap
Copy link
Copy Markdown
Contributor

sawickiap commented Sep 3, 2025

OK, that's another option, but possibly testing this shared memory block is better because I also write to it using pointer obtained through WinAPI MapViewOfFile, read the data on Vulkan side using vmaCopyAllocationToMemory (which internally maps the Vulkan buffer) and compare it, which makes a nice test I think.

Also, this way the tests for export and import stay separate.

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 3, 2025

How does the concept of KMT and NT handles map to handle types available in Vulkan? How do you propose to fix the documentation?

KMT are explicitly stated as KMT handles, they are VK_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_TEXTURE_KMT_BIT_KHR and VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT_BIT_KHR.

They are global, opaque, use the same type, but can't be freed. This is internally checked in VMA.

Documentation of the VK_KHR_external_memory_win32 extension states that:

If handleType is defined as an NT handle, vkGetMemoryWin32HandleKHR must be called no more than once for each valid unique combination of memory and handleType.

This is ensured automatically inside VMA. If VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_BIT is used as the handle type, or other NT handle types (maybe add a table with handle types), the library fetches the handle on first use, remembers it internally, and closes it when the memory block or dedicated allocation is destroyed. Every time you call [vmaGetMemoryWin32Handle2()](https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator/pull/group__group__alloc.html#ga1a8d7aba3bf5a4de66c801b9988afa58), VMA calls DuplicateHandle and returns a new handle that you need to close. For further information, please check the documentation of this function.

If the Handle type is KMT, then every call to vmaGetMemoryWin32HandleKHR2 will return the same handle, that does not need to be closed. Lifetime of the memory is tied to the memory block itself.

Something like that

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 3, 2025

OK, that's another option, but possibly testing this shared memory block is better because I also write to it using pointer obtained through WinAPI MapViewOfFile, read the data on Vulkan side using vmaCopyAllocationToMemory (which internally maps the Vulkan buffer) and compare it, which makes a nice test I think.

Also, this way the tests for export and import stay separate.

Maybe, if it works :)
I haven't tried it myself

sawickiap added a commit to sawickiap/VulkanMemoryAllocator that referenced this pull request Sep 3, 2025
@sawickiap
Copy link
Copy Markdown
Contributor

OK, I updated the docs on the master branch of my fork, adding "or other NT handle types", as you suggested.

@sawickiap
Copy link
Copy Markdown
Contributor

I finished implementing the functions I planned on branch add-create-dedicated: vmaAllocateDedicatedMemory, vmaCreateDedicatedBuffer, vmaCreateDedicatedImage, with full documentation. Can you please check if they work for you and meet your requirements?

However, I have a mysterious problem. On my RTX 4090, when TestWin32HandlesImport is executed, some further tests fail. Namely, TestLinearAllocator fails to allocate Vulkan memory, observing error code VK_ERROR_OUT_OF_DEVICE_MEMORY. I am not sure how to fix this.

@sawickiap
Copy link
Copy Markdown
Contributor

I can't make my test working correctly. It passes on Nvidia card, but on my AMD integrated graphics it crashes somewhere later.

Maybe the test you proposed with export and import would work better? Can you please show how it should look like?

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 3, 2025

Sure, I am a bit exhausted now, I will get back in an hour or so.

The test is simple, just copy the test with handle export, but instead of creating buffer allocate memory with D3D12_HEAP over the pool with added flag. Then export the handle of that allocation and import it back. That should work from what I remember.

The thing is, I was able to create heaps for export, but I wasn't able to create buffers over heaps that are exported to D3D11/12, but you can still import them. Which is odd, since after import you can create a buffer. But that worked on NVidia, don't know what would AMD say on that. It looks suspiciously.

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 5, 2025

Hi, I have created a patch with test of importing capabilities:
test_import.patch

The test is basic, just creation doing create buffer on pool with export -> export its handle -> import in main allocator without any export capabilities. I have tried creating a D3D12 heap, but ultimately it didn't work out.
This test simulates export from one process to other, which is also a use case.

I have also tested D3D11 handle locally, it works.

sawickiap added a commit to sawickiap/VulkanMemoryAllocator that referenced this pull request Sep 8, 2025
@sawickiap
Copy link
Copy Markdown
Contributor

Did you test the code you proposed in the patch? It failed for me on TEST(CloseHandle(handle2)); because handle2 was unused and stayed null. I fixed it.

When checking for externalBufferProperties.externalMemoryProperties.externalMemoryFeatures & VK_EXTERNAL_MEMORY_FEATURE_EXPORTABLE_BIT, in the importing test shouldn't we also check for VK_EXTERNAL_MEMORY_FEATURE_IMPORTABLE_BIT? I think yes, so I added this check as well.

I pushed the changes to this test to my branch and fork, like before. Please let me know if all looks good to you.

@Agrael1
Copy link
Copy Markdown
Contributor Author

Agrael1 commented Sep 8, 2025

Yes, I have tested the code, but I think I have debugged through and stopped right before the CloseHandle.
I think importable could be tested, I know that all the DX memory and opaque handles have to be importable by the spec, but it won't hurt.

Everything seems good

@jlacroixAMD jlacroixAMD merged commit 33d7d27 into GPUOpen-LibrariesAndSDKs:master Sep 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support for importing the external memory

3 participants