The vulkan_rust Guide
Vulkan is a powerful graphics and compute API, but its explicitness comes at a cost: there is a lot to learn before you can put a single triangle on screen. Most documentation dumps the full specification on you and expects you to swim. This guide takes a different approach.
Every concept in this guide follows the same progression:
- Why it matters, the problem this concept solves, in plain language.
- Intuition, a mental model, analogy, or diagram that builds the right picture before you see any code.
- Worked example, annotated code you can read, run, and modify.
- Formal reference, spec terminology, edge cases, and links to the rustdoc API reference for when you need the full picture.
This structure is deliberate. Research in cognitive science shows that understanding develops from concrete to abstract, not the other way around. We build your intuition first, then formalize it.
Who this guide is for
You know Rust. You have some idea that GPUs exist and do interesting things. You may or may not have used OpenGL, DirectX, Metal, or WebGPU before, none of that is required. This guide assumes zero prior Vulkan knowledge.
If you are coming from another Vulkan crate like ash, the
migration guide shows the differences
side by side.
How this guide is organized
This guide follows the Diataxis documentation framework, which separates content by purpose:
| Section | Purpose | Start here if… |
|---|---|---|
| Getting Started | Step-by-step tutorials | You want to draw something now |
| Concepts | Explanations of how Vulkan works | You want to understand why |
| How-To Guides | Recipes for specific tasks | You know what you need to do |
| Architecture | Design decisions behind vulkan_rust | You want to contribute or evaluate |
Concept dependency map
The concepts section is ordered so each chapter builds on the ones before it. Here is the dependency structure:
Object Model
|
+---> Memory Management
| |
| v
+---> Command Buffers ----+
| | |
| v v
+---> Synchronization Render Passes
| | |
| v v
+---> Pipelines <---------+
| |
| v
+---> Descriptor Sets
|
+---> pNext Extension Chain (independent, read any time)
+---> Validation Layers (independent, read any time)
You can read linearly from top to bottom, or jump to whatever you need. The dependency map shows you which chapters you should read first if something doesn’t make sense.
API documentation
This guide is a companion to the API reference. The API docs cover every type, method, and constant with spec links, error codes, safety requirements, and thread safety annotations. This guide covers the why and how that API docs cannot.
Quick taste
Here is the minimum code to initialize Vulkan with vulkan_rust:
use vulkan_rust::{Entry, LibloadingLoader};
fn main() {
// Load the Vulkan loader library from the system.
let loader = LibloadingLoader::new().expect("Failed to find Vulkan");
let entry = unsafe { Entry::new(loader) }.expect("Failed to load Vulkan");
// Query the highest Vulkan version the driver supports.
let version = entry.version().expect("Failed to query version");
println!("Vulkan {}.{}.{}", version.major, version.minor, version.patch);
}
Ready to go further? Start with Installation.
Installation
Add vulkan_rust to your project
[dependencies]
vulkan-rust = "0.10"
Platform requirements
Windows
Install the LunarG Vulkan SDK. This
provides vulkan-1.dll and the validation layers.
Linux
Install your distribution’s Vulkan packages:
# Ubuntu / Debian
sudo apt install libvulkan-dev vulkan-validationlayers
# Fedora
sudo dnf install vulkan-loader-devel vulkan-validation-layers
# Arch
sudo pacman -S vulkan-icd-loader vulkan-validation-layers
macOS
Install the LunarG Vulkan SDK for macOS, which includes MoltenVK for Vulkan-on-Metal translation.
Verify your setup
After installing, run this to confirm Vulkan is available:
# If you installed the Vulkan SDK:
vulkaninfo --summary
You should see your GPU listed with a supported Vulkan version.
Next steps
Ready to write code? Continue to Hello Triangle, Part 1.
Hello Triangle, Part 1: Instance & Device
This is the first part of a four-part tutorial that builds a complete Vulkan application from scratch. By the end of part 4, you will have a colored triangle on screen. By the end of this part, you will have a working connection to your GPU.
What we build in this part:
Load Vulkan ──> Create Instance ──> Pick a GPU ──> Create Device ──> Get a Queue
Each step depends on the previous one. We will take them one at a time, with an explanation of why each step exists before the code.
Prerequisites
- Install vulkan_rust and the Vulkan SDK
- A working Rust toolchain (cargo build succeeds)
- A system with a Vulkan-capable GPU
Create the project
cargo new hello-triangle
cd hello-triangle
Add vulkan-rust to your Cargo.toml:
[dependencies]
vulkan-rust = "0.10"
Step 1: Load the Vulkan library
Before you can call any Vulkan function, you must load the Vulkan shared
library (vulkan-1.dll on Windows, libvulkan.so on Linux,
libvulkan.dylib on macOS). This library is the loader, the
entry point that routes your calls to the correct GPU driver.
use vulkan_rust::{Entry, LibloadingLoader};
fn main() {
// Load the Vulkan shared library from the system.
// This can fail if the Vulkan SDK is not installed.
let loader = LibloadingLoader::new()
.expect("Failed to find Vulkan library");
// Create the Entry, which resolves the bootstrap function pointers
// (vkGetInstanceProcAddr, vkGetDeviceProcAddr).
let entry = unsafe { Entry::new(loader) }
.expect("Failed to load Vulkan entry points");
// Verify: query the highest Vulkan version the driver supports.
let version = entry.version().expect("Failed to query Vulkan version");
println!("Vulkan {}.{}.{}", version.major, version.minor, version.patch);
}
Run this with cargo run. If you see output like Vulkan 1.3.280, your
setup is working.
Why is this unsafe? Loading a shared library and calling its functions through raw pointers is inherently unsafe. The compiler cannot verify that the library is valid or that the function pointers it returns are correct. This is the only unsafe we need to understand right now; the rest follow the same pattern.
Step 2: Create a Vulkan Instance
An Instance is your application’s connection to the Vulkan runtime. It loads the driver, enables validation layers, and provides access to the physical GPUs on the system.
Think of it as opening a session: “I am application X, I want to use Vulkan version Y, please give me access.”
use vulkan_rust::vk;
use vk::*;
// ── Describe your application ──────────────────────────────────
//
// ApplicationInfo tells the driver who you are. This is optional
// but helps driver vendors optimize for known applications.
let app_info = ApplicationInfo::builder()
.application_name(c"Hello Triangle")
.application_version(1)
.engine_name(c"No Engine")
.engine_version(1)
.api_version(1 << 22); // Vulkan 1.0
// ── Describe what you need ─────────────────────────────────────
//
// No layers or extensions yet. We will add validation layers and
// surface extensions in later parts.
let create_info = InstanceCreateInfo::builder()
.application_info(&app_info);
// ── Create the instance ────────────────────────────────────────
let instance = unsafe { entry.create_instance(&create_info, None) }
.expect("Failed to create Vulkan instance");
println!("Instance created successfully");
Before reading on: why do you think the Instance takes an api_version field? What would happen if you requested a version the driver doesn't support?
The api_version tells the driver the highest Vulkan version your
application is written against. If the driver supports that version or
higher, it succeeds. If you request 1.3 on a 1.0-only driver, instance
creation fails with ERROR_INCOMPATIBLE_DRIVER.
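The version integer itself is a packed bit field. Following the Vulkan 1.0 version layout (major in bits 22 and up, minor in bits 12 to 21, patch in bits 0 to 11), this sketch shows why the code above passes 1 << 22 to mean "Vulkan 1.0.0":

```rust
// Vulkan 1.0 version packing: major.minor.patch in a single u32.
// major occupies the high bits (22+), minor the middle 10, patch the low 12.
fn make_version(major: u32, minor: u32, patch: u32) -> u32 {
    (major << 22) | (minor << 12) | patch
}

// Unpack a version integer back into its three components.
fn decode_version(v: u32) -> (u32, u32, u32) {
    (v >> 22, (v >> 12) & 0x3ff, v & 0xfff)
}
```

So `make_version(1, 0, 0)` is exactly the `1 << 22` used in the builder, and the `entry.version()` output like 1.3.280 decodes the same way.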
Step 3: Pick a physical device (GPU)
A system can have multiple GPUs: a discrete NVIDIA/AMD card, an integrated Intel GPU, or even a software renderer. You must choose one.
use vk::PhysicalDeviceType;
// ── Enumerate GPUs ─────────────────────────────────────────────
let physical_devices = unsafe { instance.enumerate_physical_devices() }
.expect("Failed to enumerate GPUs");
println!("Found {} GPU(s):", physical_devices.len());
// ── Inspect each one ───────────────────────────────────────────
for (i, &pd) in physical_devices.iter().enumerate() {
let props = unsafe { instance.get_physical_device_properties(pd) };
// The device name is a null-terminated C string in a fixed-size array.
let name_bytes: Vec<u8> = props.device_name
.iter()
.take_while(|&&c| c != 0)
.map(|&c| c as u8)
.collect();
let name = String::from_utf8_lossy(&name_bytes);
let device_type = match props.device_type {
PhysicalDeviceType::DISCRETE_GPU => "Discrete GPU",
PhysicalDeviceType::INTEGRATED_GPU => "Integrated GPU",
PhysicalDeviceType::VIRTUAL_GPU => "Virtual GPU",
PhysicalDeviceType::CPU => "CPU (software)",
_ => "Other",
};
println!(" [{}] {} ({})", i, name, device_type);
}
// ── Pick the first GPU ─────────────────────────────────────────
//
// A real application would score GPUs by capability (discrete >
// integrated, required features, memory size). For this tutorial,
// the first one is fine.
let physical_device = physical_devices[0];
Before reading on: the code above uses get_physical_device_properties to read the GPU name and type. What other information do you think the driver exposes about each physical device?
The PhysicalDeviceProperties struct also contains the driver version,
the Vulkan API version the device supports, and limits, a struct
with hundreds of fields describing maximum texture sizes, buffer
alignments, and other hardware limits.
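To make the "score GPUs by capability" idea from Step 3 concrete, here is a minimal standalone sketch. The DeviceKind enum and point values are hypothetical stand-ins, not part of vulkan_rust; a real scorer would also weigh required features and memory size:

```rust
// Hypothetical scoring sketch: prefer discrete GPUs, then integrated,
// and fall back to software renderers only when nothing else exists.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DeviceKind {
    Discrete,
    Integrated,
    Virtual,
    Cpu,
    Other,
}

fn score(kind: DeviceKind) -> u32 {
    match kind {
        DeviceKind::Discrete => 400,
        DeviceKind::Integrated => 300,
        DeviceKind::Virtual => 200,
        DeviceKind::Cpu => 100,
        DeviceKind::Other => 0,
    }
}

// Return the index of the best-scoring device, if any.
fn pick_best(kinds: &[DeviceKind]) -> Option<usize> {
    kinds
        .iter()
        .enumerate()
        .max_by_key(|(_, &k)| score(k))
        .map(|(i, _)| i)
}
```

With this in place, replacing physical_devices[0] with a call like pick_best over the enumerated device types is the shape of the fix exercise 3 asks for.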
Step 4: Find a queue family that supports graphics
The GPU exposes queues, which are the endpoints where you submit work. Queues are grouped into families, where each family supports a specific set of operations (graphics, compute, transfer, etc.).
We need a queue family that supports graphics operations.
use vk::QueueFlags;
// ── Query queue families ───────────────────────────────────────
let queue_families = unsafe {
instance.get_physical_device_queue_family_properties(physical_device)
};
// ── Find one that supports graphics ────────────────────────────
let graphics_family_index = queue_families
.iter()
.enumerate()
.find(|(_, family)| {
family.queue_flags & QueueFlags::GRAPHICS
!= QueueFlags::empty()
})
.map(|(index, _)| index as u32)
.expect("No graphics queue family found");
println!("Using queue family {} for graphics", graphics_family_index);
Queue families are identified by their index in the array. We will pass this index to device creation (to request a queue from that family) and to many other calls throughout the application.
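The queue_flags test above is ordinary bit masking. Using the raw bit values from the Vulkan specification (GRAPHICS = 0x1, COMPUTE = 0x2, TRANSFER = 0x4), the same check can be sketched on bare integers:

```rust
// Raw VkQueueFlagBits values from the Vulkan spec.
const GRAPHICS: u32 = 0x1;
const COMPUTE: u32 = 0x2;
const TRANSFER: u32 = 0x4;

// A family supports graphics if the GRAPHICS bit is set in its flags.
fn supports_graphics(queue_flags: u32) -> bool {
    queue_flags & GRAPHICS != 0
}

// Find the index of the first family whose flags include GRAPHICS.
fn find_graphics_family(families: &[u32]) -> Option<u32> {
    families
        .iter()
        .position(|&flags| supports_graphics(flags))
        .map(|i| i as u32)
}
```

A family frequently advertises several bits at once (for example GRAPHICS | COMPUTE | TRANSFER on most discrete GPUs), which is why the check masks for one bit rather than comparing for equality.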
Step 5: Create a logical Device
A Device is your interface to one physical GPU. It loads all the device-level function pointers and provides the methods you will use for the rest of the application: creating buffers, recording commands, submitting work.
Creating a Device also creates the queues you requested.
use vk::*;
// ── Request one queue from the graphics family ─────────────────
let queue_priority = 1.0_f32;
let queue_info = DeviceQueueCreateInfo::builder()
.queue_family_index(graphics_family_index)
.queue_priorities(std::slice::from_ref(&queue_priority));
// ── Create the device ──────────────────────────────────────────
//
// No extensions or features yet. We will add the swapchain
// extension in Part 2.
let device_info = DeviceCreateInfo::builder()
.queue_create_infos(std::slice::from_ref(&queue_info));
let device = unsafe {
instance.create_device(physical_device, &device_info, None)
}
.expect("Failed to create logical device");
println!("Device created successfully");
Before reading on: we requested a queue with priority 1.0. What do you think the priority controls?
Queue priority is a hint to the driver about how to schedule work when
multiple queues compete for GPU resources. 1.0 is the highest
priority. Most applications use a single queue and set it to 1.0.
The actual effect is driver-dependent.
Step 6: Get a queue handle
The Device created our queues internally. We retrieve handles to them
with get_device_queue.
// ── Retrieve the graphics queue ────────────────────────────────
//
// Queue family index: the family we chose above.
// Queue index: 0, because we only requested 1 queue from this family.
let graphics_queue = unsafe {
device.get_device_queue(graphics_family_index, 0)
};
println!("Graphics queue ready");
The queue handle is not created or destroyed by you. It is owned by the Device and valid for the Device’s lifetime. (See The Vulkan Object Model for the distinction between created, allocated, and enumerated objects.)
Step 7: Clean up
Vulkan requires explicit destruction in reverse creation order.
vulkan_rust has no Drop implementations on purpose, so you must
call the destroy methods yourself.
// ── Destroy in reverse order ───────────────────────────────────
//
// Queue handles are owned by the Device, no destroy needed.
// Device must be destroyed before Instance.
// Instance must be destroyed last.
unsafe {
device.destroy_device(None);
instance.destroy_instance(None);
}
println!("Cleaned up successfully");
Putting it all together
Here is the complete program. Copy this into src/main.rs and run it
with cargo run.
use vulkan_rust::{Entry, LibloadingLoader};
use vulkan_rust::vk;
use vk::*;
fn main() {
// ── Step 1: Load Vulkan ────────────────────────────────────
let loader = LibloadingLoader::new()
.expect("Vulkan library not found");
let entry = unsafe { Entry::new(loader) }
.expect("Failed to load Vulkan");
let version = entry.version().expect("Failed to query version");
println!("Vulkan {}.{}.{}", version.major, version.minor, version.patch);
// ── Step 2: Create Instance ────────────────────────────────
let app_info = ApplicationInfo::builder()
.application_name(c"Hello Triangle")
.application_version(1)
.engine_name(c"No Engine")
.engine_version(1)
.api_version(1 << 22); // Vulkan 1.0
let create_info = InstanceCreateInfo::builder()
.application_info(&app_info);
let instance = unsafe { entry.create_instance(&create_info, None) }
.expect("Failed to create instance");
// ── Step 3: Pick a GPU ─────────────────────────────────────
let physical_devices = unsafe { instance.enumerate_physical_devices() }
.expect("Failed to enumerate GPUs");
let physical_device = physical_devices[0];
let props = unsafe {
instance.get_physical_device_properties(physical_device)
};
let name_bytes: Vec<u8> = props.device_name
.iter()
.take_while(|&&c| c != 0)
.map(|&c| c as u8)
.collect();
println!("GPU: {}", String::from_utf8_lossy(&name_bytes));
// ── Step 4: Find a graphics queue family ───────────────────
let queue_families = unsafe {
instance.get_physical_device_queue_family_properties(physical_device)
};
let graphics_family_index = queue_families
.iter()
.enumerate()
.find(|(_, family)| {
family.queue_flags & QueueFlags::GRAPHICS
!= QueueFlags::empty()
})
.map(|(index, _)| index as u32)
.expect("No graphics queue family found");
// ── Step 5: Create Device ──────────────────────────────────
let queue_priority = 1.0_f32;
let queue_info = DeviceQueueCreateInfo::builder()
.queue_family_index(graphics_family_index)
.queue_priorities(std::slice::from_ref(&queue_priority));
let device_info = DeviceCreateInfo::builder()
.queue_create_infos(std::slice::from_ref(&queue_info));
let device = unsafe {
instance.create_device(physical_device, &device_info, None)
}
.expect("Failed to create device");
// ── Step 6: Get the graphics queue ─────────────────────────
let _graphics_queue = unsafe {
device.get_device_queue(graphics_family_index, 0)
};
println!("Vulkan initialized successfully!");
println!("Ready for Part 2: Swapchain & Surface");
// ── Step 7: Clean up ───────────────────────────────────────
unsafe {
device.destroy_device(None);
instance.destroy_instance(None);
}
}
Expected output:
Vulkan 1.3.280
GPU: NVIDIA GeForce RTX 4070
Vulkan initialized successfully!
Ready for Part 2: Swapchain & Surface
(Your version number and GPU name will differ.)
What we learned
This part covered the Vulkan initialization sequence:
| Step | What | Why |
|---|---|---|
| Load library | LibloadingLoader::new() + Entry::new() | Get access to Vulkan function pointers |
| Create Instance | entry.create_instance() | Open a session with the Vulkan driver |
| Pick GPU | enumerate_physical_devices() + get_physical_device_properties() | Choose which hardware to use |
| Find queue family | get_physical_device_queue_family_properties() | Find a queue that supports graphics |
| Create Device | instance.create_device() | Get a logical interface to the GPU |
| Get queue | device.get_device_queue() | Get the submission endpoint |
Every Vulkan application does these steps. They are the foundation that everything else builds on.
What we skipped (and will add later)
- Validation layers (Part 2), catch API misuse during development. See Validation Layers for the concept.
- Surface and swapchain (Part 2), connect to a window so we can display pixels.
- Extensions, we will enable VK_KHR_swapchain and surface extensions in Part 2.
Exercises
- Print all GPUs. Modify the program to print every physical device with its name and type, not just the first one.
- Print all queue families. For the chosen GPU, print every queue family with its flags (GRAPHICS, COMPUTE, TRANSFER) and queue count.
- Choose discrete over integrated. Modify the GPU selection to prefer a discrete GPU when one is available.
Next
Part 2: Swapchain & Surface adds a window, creates a swapchain, and introduces validation layers.
Hello Triangle, Part 2: Swapchain & Surface
In Part 1 we loaded Vulkan, created an Instance and Device, and retrieved a graphics queue. We can talk to the GPU, but we have nowhere to show anything.
What we build in this part:
Open a window ──> Create Surface ──> Create Swapchain ──> Get image views
+ validation layers
By the end of this part, we will have a window with a swapchain ready to receive rendered frames.
New dependencies
We need a windowing library. This tutorial uses winit, but vulkan_rust
works with anything that implements raw-window-handle.
[dependencies]
vulkan-rust = "0.10"
winit = "0.30"
Step 1: Open a window
Before creating a Vulkan surface, we need a platform window.
use winit::application::ApplicationHandler;
use winit::event::WindowEvent;
use winit::event_loop::{ActiveEventLoop, EventLoop};
use winit::window::{Window, WindowId};
struct App {
window: Option<Window>,
}
impl ApplicationHandler for App {
fn resumed(&mut self, event_loop: &ActiveEventLoop) {
if self.window.is_some() {
return;
}
let attrs = Window::default_attributes()
.with_title("Hello Triangle")
.with_inner_size(winit::dpi::LogicalSize::new(800, 600));
let window = event_loop
.create_window(attrs)
.expect("Failed to create window");
// ... Vulkan initialization uses &window here ...
self.window = Some(window);
}
fn window_event(
&mut self,
event_loop: &ActiveEventLoop,
_id: WindowId,
event: WindowEvent,
) {
if matches!(event, WindowEvent::CloseRequested) {
event_loop.exit();
}
}
}
fn main() {
let event_loop = EventLoop::new().expect("Failed to create event loop");
let mut app = App { window: None };
event_loop.run_app(&mut app).expect("Event loop error");
}
Step 2: Create the Instance with surface extensions
In Part 1 we created an Instance with no extensions. Now we need the platform surface extensions so Vulkan can render to our window.
vulkan_rust provides required_extensions() which returns the right
extensions for your platform.
use vulkan_rust::{Entry, LibloadingLoader};
use vulkan_rust::vk;
use vk::*;
// ── Load Vulkan ────────────────────────────────────────────────
let loader = LibloadingLoader::new()
.expect("Vulkan library not found");
let entry = unsafe { Entry::new(loader) }
.expect("Failed to load Vulkan");
// ── Gather required extensions ─────────────────────────────────
//
// required_extensions() returns platform-specific extensions:
// Windows: VK_KHR_surface + VK_KHR_win32_surface
// Linux: VK_KHR_surface + VK_KHR_xlib_surface + VK_KHR_wayland_surface
// macOS: VK_KHR_surface + VK_EXT_metal_surface
let surface_extensions = vulkan_rust::required_extensions();
let extension_ptrs: Vec<*const i8> = surface_extensions
.iter()
.map(|ext| ext.as_ptr())
.collect();
// ── Enable the validation layer ────────────────────────────────
//
// Always enable during development. See the Validation Layers
// concept chapter for details.
let validation_layer = c"VK_LAYER_KHRONOS_validation";
let layer_ptrs = [validation_layer.as_ptr()];
// ── Create the instance ────────────────────────────────────────
let app_info = ApplicationInfo::builder()
.application_name(c"Hello Triangle")
.application_version(1)
.engine_name(c"No Engine")
.engine_version(1)
.api_version(1 << 22); // Vulkan 1.0
let create_info = InstanceCreateInfo::builder()
.application_info(&app_info)
.enabled_extension_names(&extension_ptrs)
.enabled_layer_names(&layer_ptrs);
let instance = unsafe { entry.create_instance(&create_info, None) }
.expect("Failed to create instance");
Before reading on: we enabled validation layers here but did not set up a debug messenger callback. What happens to validation errors?
They go to stderr on most platforms. Setting up a debug messenger (as shown in the Validation chapter) gives you programmatic control over the output. For a tutorial, stderr is fine.
Step 3: Create a Surface
A Surface is Vulkan’s abstraction over a platform window. It represents the thing you render to: a Win32 HWND, an X11 Window, a Wayland wl_surface, etc.
vulkan_rust provides instance.create_surface() which handles the
platform dispatch for you via raw-window-handle.
// ── Create the surface ─────────────────────────────────────────
//
// create_surface uses raw-window-handle to detect the platform
// and call the right vkCreate*Surface function.
let surface = unsafe { instance.create_surface(&window, &window, None) }
.expect("Failed to create surface");
The surface is an Instance-level object. It must be destroyed before the Instance.
Step 4: Pick a GPU (with presentation support)
In Part 1 we picked the first GPU. Now we also need to verify it can present to our surface, which means it has a queue family that supports both graphics and presentation.
// ── Enumerate GPUs ─────────────────────────────────────────────
let physical_devices = unsafe { instance.enumerate_physical_devices() }
.expect("Failed to enumerate GPUs");
// ── Find a GPU with a queue family that supports both graphics
// and presentation to our surface ────────────────────────────
use vk::*;
let mut physical_device = PhysicalDevice::null();
let mut graphics_family_index = 0u32;
'outer: for &pd in &physical_devices {
let queue_families = unsafe {
instance.get_physical_device_queue_family_properties(pd)
};
for (i, family) in queue_families.iter().enumerate() {
let supports_graphics =
family.queue_flags & QueueFlags::GRAPHICS
!= QueueFlags::empty();
// Check if this queue family can present to our surface.
let supports_present = unsafe {
instance.get_physical_device_surface_support_khr(
pd,
i as u32,
surface,
)
}
.unwrap_or(false);
if supports_graphics && supports_present {
physical_device = pd;
graphics_family_index = i as u32;
break 'outer;
}
}
}
assert!(
!physical_device.is_null(),
"No GPU found with graphics + presentation support"
);
Before reading on: why do we check for presentation support separately from graphics support? Can a queue family support graphics but not presentation?
Yes. On some hardware, a queue family can execute graphics commands but cannot present to a specific surface. Presentation support depends on both the queue family and the surface (which is tied to a specific monitor/display). Always check with
get_physical_device_surface_support_khr.
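The nested loop in Step 4 implements a simple rule: take the first (device, family) pair where both capabilities hold. As a standalone sketch of just that selection logic, with boolean pairs standing in for the real Vulkan queries:

```rust
// Each inner Vec describes one GPU's queue families as
// (supports_graphics, supports_present) pairs.
fn pick_device_and_family(gpus: &[Vec<(bool, bool)>]) -> Option<(usize, u32)> {
    for (gpu_index, families) in gpus.iter().enumerate() {
        for (family_index, &(graphics, present)) in families.iter().enumerate() {
            if graphics && present {
                // First pair satisfying both wins, mirroring the 'outer break.
                return Some((gpu_index, family_index as u32));
            }
        }
    }
    None
}
```

Returning an Option instead of a null handle also makes the "no suitable GPU" case impossible to forget, which is the role the assert! plays in the real code above.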
Step 5: Create the Device with the swapchain extension
Now we add VK_KHR_swapchain, the extension that lets us create a
swapchain.
use vk::extension_names::KHR_SWAPCHAIN_EXTENSION_NAME;
let device_extensions = [KHR_SWAPCHAIN_EXTENSION_NAME.as_ptr()];
let queue_priority = 1.0_f32;
let queue_info = DeviceQueueCreateInfo::builder()
.queue_family_index(graphics_family_index)
.queue_priorities(std::slice::from_ref(&queue_priority));
let device_info = DeviceCreateInfo::builder()
.queue_create_infos(std::slice::from_ref(&queue_info))
.enabled_extension_names(&device_extensions);
let device = unsafe {
instance.create_device(physical_device, &device_info, None)
}
.expect("Failed to create device");
let graphics_queue = unsafe {
device.get_device_queue(graphics_family_index, 0)
};
Step 6: Query surface capabilities
Before creating a swapchain, we must ask the surface what it supports: image formats, present modes, minimum/maximum image count, and supported image sizes.
// ── Query what the surface supports ────────────────────────────
let capabilities = unsafe {
instance.get_physical_device_surface_capabilities_khr(
physical_device,
surface,
)
}
.expect("Failed to query surface capabilities");
let formats = unsafe {
instance.get_physical_device_surface_formats_khr(
physical_device,
surface,
)
}
.expect("Failed to query surface formats");
let present_modes = unsafe {
instance.get_physical_device_surface_present_modes_khr(
physical_device,
surface,
)
}
.expect("Failed to query present modes");
Step 7: Choose swapchain settings
We need to decide three things: the image format, the present mode, and the image extent (resolution).
use vk::*;
// ── Choose format ──────────────────────────────────────────────
//
// Prefer B8G8R8A8_SRGB with SRGB_NONLINEAR color space.
// Fall back to whatever is available.
let surface_format = formats
.iter()
.find(|f| {
f.format == Format::B8G8R8A8_SRGB
&& f.color_space == ColorSpaceKHR::SRGB_NONLINEAR
})
.unwrap_or(&formats[0]);
// ── Choose present mode ────────────────────────────────────────
//
// MAILBOX = triple buffering (low latency, no tearing).
// FIFO = vsync (guaranteed available).
let present_mode = if present_modes.contains(&PresentModeKHR::MAILBOX) {
PresentModeKHR::MAILBOX
} else {
PresentModeKHR::FIFO // always available
};
// ── Choose extent (resolution) ─────────────────────────────────
//
// If current_extent is 0xFFFFFFFF, the surface size is determined
// by the swapchain extent. Otherwise, use the surface's size.
let extent = if capabilities.current_extent.width != u32::MAX {
capabilities.current_extent
} else {
let size = window.inner_size();
Extent2D {
width: size.width.clamp(
capabilities.min_image_extent.width,
capabilities.max_image_extent.width,
),
height: size.height.clamp(
capabilities.min_image_extent.height,
capabilities.max_image_extent.height,
),
}
};
// ── Choose image count ─────────────────────────────────────────
//
// Request one more than the minimum so we always have an image
// to render to while the display is reading another.
let image_count = {
let desired = capabilities.min_image_count + 1;
if capabilities.max_image_count > 0 {
desired.min(capabilities.max_image_count)
} else {
desired // max_image_count == 0 means no upper limit
}
};
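The image-count rule is worth isolating, because the max_image_count == 0 sentinel (meaning "no upper limit") is easy to get wrong. The logic above as a pure function:

```rust
// Request one more than the minimum, clamped to the maximum
// unless max_image_count is 0, which means "no upper limit".
fn choose_image_count(min_image_count: u32, max_image_count: u32) -> u32 {
    let desired = min_image_count + 1;
    if max_image_count > 0 {
        desired.min(max_image_count)
    } else {
        desired
    }
}
```

Note the clamp matters in practice: a surface reporting min 3, max 3 must get exactly 3, not 4.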
Step 8: Create the swapchain
let swapchain_info = SwapchainCreateInfoKHR::builder()
.surface(surface)
.min_image_count(image_count)
.image_format(surface_format.format)
.image_color_space(surface_format.color_space)
.image_extent(extent)
.image_array_layers(1)
.image_usage(ImageUsageFlags::COLOR_ATTACHMENT)
.image_sharing_mode(SharingMode::EXCLUSIVE)
.pre_transform(capabilities.current_transform)
.composite_alpha(CompositeAlphaFlagsKHR::OPAQUE)
.present_mode(present_mode)
.clipped(true) // discard pixels behind other windows
.old_swapchain(SwapchainKHR::null());
let swapchain = unsafe {
device.create_swapchain_khr(&swapchain_info, None)
}
.expect("Failed to create swapchain");
The swapchain now owns a set of images. We retrieve their handles next.
Step 9: Get swapchain images and create image views
The swapchain images are owned by the swapchain, so we do not destroy them ourselves. But we need image views to use them in render passes and framebuffers.
// ── Get the swapchain images ───────────────────────────────────
let swapchain_images = unsafe {
device.get_swapchain_images_khr(swapchain)
}
.expect("Failed to get swapchain images");
println!("Swapchain has {} images", swapchain_images.len());
// ── Create an image view for each swapchain image ──────────────
let swapchain_image_views: Vec<ImageView> = swapchain_images
.iter()
.map(|&image| {
let view_info = ImageViewCreateInfo::builder()
.image(image)
.view_type(ImageViewType::_2D)
.format(surface_format.format)
.components(ComponentMapping {
r: ComponentSwizzle::IDENTITY,
g: ComponentSwizzle::IDENTITY,
b: ComponentSwizzle::IDENTITY,
a: ComponentSwizzle::IDENTITY,
})
.subresource_range(ImageSubresourceRange {
aspect_mask: ImageAspectFlags::COLOR,
base_mip_level: 0,
level_count: 1,
base_array_layer: 0,
layer_count: 1,
});
unsafe { device.create_image_view(&view_info, None) }
.expect("Failed to create image view")
})
.collect();
Where we are now
At this point we have:
Window (winit)
│
└── Surface (VK_KHR_surface)
│
└── Swapchain (VK_KHR_swapchain)
│
├── Image 0 ──> ImageView 0
├── Image 1 ──> ImageView 1
└── Image 2 ──> ImageView 2
The swapchain gives us images to render into. The image views let us use those images in render passes. In Part 3, we will create a render pass and a graphics pipeline so we can actually draw something.
Clean up
Destruction in reverse creation order:
unsafe {
// Image views (we created these)
for &view in &swapchain_image_views {
device.destroy_image_view(view, None);
}
// Swapchain (device-level)
device.destroy_swapchain_khr(swapchain, None);
// Device
device.destroy_device(None);
// Surface (instance-level, before instance)
instance.destroy_surface(surface, None);
// Instance
instance.destroy_instance(None);
}
What we learned
| Step | What | Why |
|---|---|---|
| Surface extensions | required_extensions() | Platform-specific window integration |
| Validation layer | VK_LAYER_KHRONOS_validation | Catch mistakes during development |
| Surface | instance.create_surface() | Connect Vulkan to a window |
| Presentation check | get_physical_device_surface_support_khr | Ensure the GPU can present to this surface |
| Swapchain extension | VK_KHR_swapchain | Enable swapchain creation on the device |
| Surface capabilities | get_physical_device_surface_capabilities_khr | Query supported formats, sizes, present modes |
| Swapchain | create_swapchain_khr | A set of images the display rotates through |
| Image views | create_image_view | Make swapchain images usable by render passes |
Concepts to explore
- Validation Layers & Debugging, how to set up a debug messenger for better error output.
- The Vulkan Object Model, why we destroy in reverse order.
Exercises
- Print all surface formats. Before choosing a format, print every format and color space the surface supports.
- Print the chosen present mode. Print which present mode was selected (MAILBOX or FIFO) and why.
- Handle no validation layer. What happens if the validation layer is not installed? Modify the code to check for its availability with enumerate_instance_layer_properties and skip it gracefully.
Next
Part 3: Render Pass & Pipeline creates the graphics pipeline that defines how we draw our triangle.
Hello Triangle, Part 3: Render Pass & Pipeline
In Part 2 we opened a window, created a surface and swapchain, and retrieved image views. We have somewhere to render, but no instructions for how to render.
What we build in this part:
Write shaders ──> Create Render Pass ──> Create Pipeline ──> Create Framebuffers
Threshold concept. The graphics pipeline is one of Vulkan’s biggest conceptual shifts. Instead of setting state one call at a time (like OpenGL’s
glEnable(GL_DEPTH_TEST)), you define all rendering state in a single pipeline object. This is verbose, but it means the driver has complete information at creation time and compiles everything to GPU machine code once, not at draw time.
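To make that shift concrete, here is a conceptual sketch in plain Rust (not real vulkan_rust API): instead of toggling a global state machine between draws, each distinct state combination becomes its own immutable object, built once at startup:

```rust
// Conceptual only: every state combination is baked into one
// immutable description at creation time, never mutated at draw time.
#[derive(Clone, Copy, Debug, PartialEq)]
struct PipelineDesc {
    depth_test: bool,
    blending: bool,
}

fn create_pipeline(desc: PipelineDesc) -> PipelineDesc {
    // A real driver would compile shaders + state to GPU machine code here.
    desc
}

// Created once, up front; drawing later just binds the finished objects.
fn build_pipelines() -> (PipelineDesc, PipelineDesc) {
    let opaque = create_pipeline(PipelineDesc { depth_test: true, blending: false });
    let transparent = create_pipeline(PipelineDesc { depth_test: true, blending: true });
    (opaque, transparent)
}
```

The trade is clear: more objects and more up-front code, but no hidden state and no shader recompilation stalls in the middle of a frame.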
Step 1: Write shaders
We need a vertex shader (positions the triangle) and a fragment shader (colors it). Write these as GLSL and compile to SPIR-V.
triangle.vert:
#version 450
// Hard-coded triangle vertices (no vertex buffer needed).
vec2 positions[3] = vec2[](
vec2( 0.0, -0.5),
vec2( 0.5, 0.5),
vec2(-0.5, 0.5)
);
vec3 colors[3] = vec3[](
vec3(1.0, 0.0, 0.0), // red
vec3(0.0, 1.0, 0.0), // green
vec3(0.0, 0.0, 1.0) // blue
);
layout(location = 0) out vec3 frag_color;
void main() {
gl_Position = vec4(positions[gl_VertexIndex], 0.0, 1.0);
frag_color = colors[gl_VertexIndex];
}
triangle.frag:
#version 450
layout(location = 0) in vec3 frag_color;
layout(location = 0) out vec4 out_color;
void main() {
out_color = vec4(frag_color, 1.0);
}
Compile them with glslc (included in the Vulkan SDK):
glslc triangle.vert -o triangle.vert.spv
glslc triangle.frag -o triangle.frag.spv
Place the .spv files in your project’s src/ directory (or wherever
you prefer, adjust the path in the code below).
Before reading on: this vertex shader hard-codes the triangle positions inside the shader rather than reading them from a vertex buffer. Why might this be useful for a first example?
It eliminates the need for vertex buffers, memory allocation, and buffer binding, letting us focus on the pipeline and render pass without those distractions. A real application reads vertices from buffers (covered in the Memory Management chapter).
Step 2: Load SPIR-V and create shader modules
use vulkan_rust::vk;
use vulkan_rust::cast_to_u32;
use vk::*;
// ── Load SPIR-V bytecode ───────────────────────────────────────
let vert_bytes = include_bytes!("triangle.vert.spv");
let frag_bytes = include_bytes!("triangle.frag.spv");
// SPIR-V must be aligned to 4 bytes. cast_to_u32 checks alignment.
let vert_code = cast_to_u32(vert_bytes)
.expect("Vertex shader SPIR-V is not 4-byte aligned");
let frag_code = cast_to_u32(frag_bytes)
.expect("Fragment shader SPIR-V is not 4-byte aligned");
// ── Create shader modules ──────────────────────────────────────
let vert_info = ShaderModuleCreateInfo::builder()
.code(vert_code);
let frag_info = ShaderModuleCreateInfo::builder()
.code(frag_code);
let vert_module = unsafe { device.create_shader_module(&vert_info, None) }
.expect("Failed to create vertex shader module");
let frag_module = unsafe { device.create_shader_module(&frag_info, None) }
.expect("Failed to create fragment shader module");
Shader modules are temporary containers. After the pipeline is created, we can destroy them.
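As an aside, the 4-byte alignment rule that cast_to_u32 enforces can be illustrated without any GPU. The sketch below (bytes_to_spirv_words is a hypothetical helper, not the crate's API) sidesteps alignment entirely by copying bytes into words, and validates the SPIR-V magic number while it is at it:

```rust
/// Reinterpret a byte slice as 32-bit SPIR-V words.
/// Returns None if the data is empty or not a whole number of words.
fn bytes_to_spirv_words(bytes: &[u8]) -> Option<Vec<u32>> {
    // SPIR-V is a stream of 32-bit words; the byte length must be a multiple of 4.
    if bytes.is_empty() || bytes.len() % 4 != 0 {
        return None;
    }
    // Copying into a Vec<u32> works for any input alignment, because
    // from_le_bytes reads bytes rather than casting pointers.
    let words: Vec<u32> = bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    // Every SPIR-V module begins with the magic number 0x0723_0203.
    if words[0] != 0x0723_0203 {
        return None;
    }
    Some(words)
}

fn main() {
    // First word: the SPIR-V magic number in little-endian byte order.
    let fake_module = [0x03u8, 0x02, 0x23, 0x07, 0x00, 0x00, 0x01, 0x00];
    assert!(bytes_to_spirv_words(&fake_module).is_some());
    assert!(bytes_to_spirv_words(&fake_module[..5]).is_none()); // not word-sized
}
```

A zero-copy cast (as cast_to_u32 presumably does) avoids the allocation but must verify the pointer alignment; the copying version trades a small allocation for unconditional safety.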
Step 3: Create the render pass
The render pass describes what attachments we render to and how they are handled. See Render Passes & Framebuffers for the full concept.
use vulkan_rust::vk;
use vk::*;
// ── Color attachment: the swapchain image ──────────────────────
let color_attachment = AttachmentDescription {
flags: AttachmentDescriptionFlags::empty(),
format: surface_format.format, // from Part 2
samples: SampleCountFlagBits::_1,
load_op: AttachmentLoadOp::CLEAR, // clear to black
store_op: AttachmentStoreOp::STORE, // keep the result
stencil_load_op: AttachmentLoadOp::DONT_CARE,
stencil_store_op: AttachmentStoreOp::DONT_CARE,
initial_layout: ImageLayout::UNDEFINED,
final_layout: ImageLayout::PRESENT_SRC, // ready for display
};
// ── Subpass: use the color attachment ──────────────────────────
let color_ref = AttachmentReference {
attachment: 0,
layout: ImageLayout::COLOR_ATTACHMENT_OPTIMAL,
};
let subpass = SubpassDescription {
flags: SubpassDescriptionFlags::empty(),
pipeline_bind_point: PipelineBindPoint::GRAPHICS,
input_attachment_count: 0,
p_input_attachments: core::ptr::null(),
color_attachment_count: 1,
p_color_attachments: &color_ref,
p_resolve_attachments: core::ptr::null(),
p_depth_stencil_attachment: core::ptr::null(),
preserve_attachment_count: 0,
p_preserve_attachments: core::ptr::null(),
};
// ── Subpass dependency ─────────────────────────────────────────
//
// Ensure the image layout transition happens before we write color.
let dependency = SubpassDependency {
src_subpass: vk::SUBPASS_EXTERNAL,
dst_subpass: 0,
src_stage_mask: PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT,
dst_stage_mask: PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT,
src_access_mask: AccessFlags::NONE,
dst_access_mask: AccessFlags::COLOR_ATTACHMENT_WRITE,
dependency_flags: DependencyFlags::empty(),
};
let render_pass_info = RenderPassCreateInfo::builder()
.attachments(std::slice::from_ref(&color_attachment))
.subpasses(std::slice::from_ref(&subpass))
.dependencies(std::slice::from_ref(&dependency));
let render_pass = unsafe {
device.create_render_pass(&render_pass_info, None)
}
.expect("Failed to create render pass");
Step 4: Create the pipeline layout
Our shaders don’t use any descriptors or push constants, so the layout is empty.
use vulkan_rust::vk;
use vk::*;
let layout_info = PipelineLayoutCreateInfo::builder();
let pipeline_layout = unsafe {
device.create_pipeline_layout(&layout_info, None)
}
.expect("Failed to create pipeline layout");
Step 5: Create the graphics pipeline
This is the largest struct in the Vulkan API. Every piece of rendering state is specified here.
use vulkan_rust::vk;
use vk::*;
// ── Shader stages ──────────────────────────────────────────────
let entry_name = c"main";
let stages = [
*PipelineShaderStageCreateInfo::builder()
.stage(ShaderStageFlags::VERTEX)
.module(vert_module)
.name(entry_name),
*PipelineShaderStageCreateInfo::builder()
.stage(ShaderStageFlags::FRAGMENT)
.module(frag_module)
.name(entry_name),
];
// ── Vertex input: empty (positions are hard-coded in shader) ───
let vertex_input = PipelineVertexInputStateCreateInfo::builder();
// ── Input assembly: triangle list ──────────────────────────────
let input_assembly = PipelineInputAssemblyStateCreateInfo::builder()
.topology(PrimitiveTopology::TRIANGLE_LIST);
// ── Viewport and scissor: dynamic (set at draw time) ───────────
let mut viewport_state = PipelineViewportStateCreateInfo::builder();
viewport_state.viewport_count = 1;
viewport_state.scissor_count = 1;
// ── Rasterization ──────────────────────────────────────────────
let rasterizer = PipelineRasterizationStateCreateInfo::builder()
.polygon_mode(PolygonMode::FILL)
.cull_mode(CullModeFlags::BACK)
.front_face(FrontFace::CLOCKWISE)
.line_width(1.0);
// ── Multisampling: off ─────────────────────────────────────────
let multisampling = PipelineMultisampleStateCreateInfo::builder()
.rasterization_samples(SampleCountFlagBits::_1);
// ── Color blending: no blending, write all channels ────────────
let blend_attachment = PipelineColorBlendAttachmentState {
blend_enable: 0,
color_write_mask: ColorComponentFlags::R
| ColorComponentFlags::G
| ColorComponentFlags::B
| ColorComponentFlags::A,
..unsafe { core::mem::zeroed() }
};
let color_blending = PipelineColorBlendStateCreateInfo::builder()
.attachments(std::slice::from_ref(&blend_attachment));
// ── Dynamic state ──────────────────────────────────────────────
let dynamic_states = [DynamicState::VIEWPORT, DynamicState::SCISSOR];
let dynamic_state = PipelineDynamicStateCreateInfo::builder()
.dynamic_states(&dynamic_states);
// ── Assemble the pipeline ──────────────────────────────────────
let pipeline_info = GraphicsPipelineCreateInfo::builder()
.stages(&stages)
.vertex_input_state(&vertex_input)
.input_assembly_state(&input_assembly)
.viewport_state(&viewport_state)
.rasterization_state(&rasterizer)
.multisample_state(&multisampling)
.color_blend_state(&color_blending)
.dynamic_state(&dynamic_state)
.layout(pipeline_layout)
.render_pass(render_pass)
.subpass(0);
let pipeline = unsafe {
device.create_graphics_pipelines(
PipelineCache::null(),
&[*pipeline_info],
None,
)
}
.expect("Failed to create graphics pipeline")[0];
// ── Shader modules are no longer needed ────────────────────────
unsafe {
device.destroy_shader_module(vert_module, None);
device.destroy_shader_module(frag_module, None);
};
Before reading on: we set cull_mode to BACK and front_face to CLOCKWISE. What happens if the triangle vertices are wound counter-clockwise? What would you see?
The triangle would be culled (invisible). Back-face culling discards triangles whose vertices appear in the wrong winding order from the camera’s perspective. If your triangle is invisible, try switching to COUNTER_CLOCKWISE or disabling culling with CullModeFlags::NONE.
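You can check winding on the CPU with a few lines of plain Rust (is_clockwise is an illustrative helper, not part of vulkan_rust). In Vulkan's framebuffer convention Y points down, so a positive 2D cross product corresponds to clockwise as seen on screen:

```rust
/// Winding of a 2D triangle in Vulkan's coordinate convention (Y points down).
/// A positive cross product means clockwise as seen on screen.
fn is_clockwise(a: [f32; 2], b: [f32; 2], c: [f32; 2]) -> bool {
    // 2D cross product of edge vectors (b - a) and (c - a).
    let cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]);
    cross > 0.0
}

fn main() {
    // The tutorial's triangle: top, bottom-right, bottom-left.
    assert!(is_clockwise([0.0, -0.5], [0.5, 0.5], [-0.5, 0.5]));
    // Swapping two vertices flips the winding: with BACK culling and a
    // CLOCKWISE front face, this triangle would be invisible.
    assert!(!is_clockwise([0.0, -0.5], [-0.5, 0.5], [0.5, 0.5]));
}
```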
Step 6: Create framebuffers
A framebuffer binds specific image views to a render pass. We need one per swapchain image.
use vulkan_rust::vk;
use vk::*;
let framebuffers: Vec<Framebuffer> = swapchain_image_views
.iter()
.map(|&view| {
let views = [view];
let fb_info = FramebufferCreateInfo::builder()
.render_pass(render_pass)
.attachments(&views)
.width(extent.width)
.height(extent.height)
.layers(1);
unsafe { device.create_framebuffer(&fb_info, None) }
.expect("Failed to create framebuffer")
})
.collect();
Where we are now
Render Pass "clear to black, store the result, present"
│
Pipeline "use these shaders, fill triangles, no blending"
│
Framebuffers [swapchain image 0, swapchain image 1, ...]
We have everything needed to describe what to draw and how. In Part 4, we record commands that use the pipeline and render pass, submit them, and present the result.
Clean up (new objects)
Add these to the cleanup sequence from Part 2, before device destruction:
unsafe {
for &fb in &framebuffers {
device.destroy_framebuffer(fb, None);
}
device.destroy_pipeline(pipeline, None);
device.destroy_pipeline_layout(pipeline_layout, None);
device.destroy_render_pass(render_pass, None);
// ... then image views, swapchain, device, surface, instance
}
What we learned
| Step | What | Why |
|---|---|---|
| Shaders | GLSL → SPIR-V → ShaderModule | GPU programs that position and color pixels |
| Render pass | create_render_pass | Declares attachments and how they are loaded/stored |
| Pipeline layout | create_pipeline_layout | Declares what resources shaders expect (none for now) |
| Graphics pipeline | create_graphics_pipelines | Bakes all rendering state into one compiled object |
| Framebuffers | create_framebuffer | Binds specific images to a render pass |
Concepts to explore
- Pipelines, dynamic state, compute pipelines, pipeline cache.
- Render Passes & Framebuffers, load ops, subpass dependencies, dynamic rendering.
Exercises
- Change the clear color. Modify the render pass begin info (in Part 4) to clear to a different color. The clear value is passed when beginning the render pass, not when creating it.
- Add a depth attachment. Create a depth image and image view, add a second attachment to the render pass, and enable depth testing in the pipeline.
- Try PolygonMode::LINE. Change the polygon mode to LINE to see the triangle as wireframe. (Requires the fillModeNonSolid device feature.)
Next
Part 4: Command Buffers & Drawing records the draw commands, submits them, and presents the triangle to the screen.
Hello Triangle, Part 4: Command Buffers & Drawing
This is the final part. In Part 3 we created the render pass, pipeline, and framebuffers. Now we record commands, submit them, and present a triangle to the screen.
What we build in this part:
Create sync objects ──> Create command pool/buffers
│ │
└──> Render loop: acquire image ──> record commands ──> submit ──> present
This part ties together every concept from the previous three parts. When you see the triangle, you will have written a complete Vulkan application.
Step 1: Create synchronization objects
We need fences and semaphores to coordinate CPU and GPU work. See Synchronization for the full concept.
use vulkan_rust::vk;
use vk::*;
// ── Semaphores: GPU-to-GPU synchronization ─────────────────────
let sem_info = SemaphoreCreateInfo::builder();
// "The swapchain image is ready to render into."
let image_available = unsafe { device.create_semaphore(&sem_info, None) }
.expect("Failed to create semaphore");
// "Rendering is done, safe to present."
let render_finished = unsafe { device.create_semaphore(&sem_info, None) }
.expect("Failed to create semaphore");
// ── Fence: GPU-to-CPU synchronization ──────────────────────────
//
// SIGNALED so the first frame doesn't block forever waiting for
// a "previous frame" that never existed.
let fence_info = FenceCreateInfo::builder()
.flags(FenceCreateFlags::SIGNALED);
let in_flight_fence = unsafe { device.create_fence(&fence_info, None) }
.expect("Failed to create fence");
Before reading on: why do we create the fence with SIGNALED? What would happen on the first frame if we didn’t?
The render loop starts by waiting for the fence. On the first frame, no GPU work has been submitted yet, so an unsignaled fence would block forever. Starting it signaled lets the first frame pass through immediately.
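The deadlock can be modeled without a GPU. In this toy sketch (ToyFence is purely illustrative, not a vulkan_rust type), wait() reports whether the wait would ever return:

```rust
/// A toy stand-in for a fence: just a boolean flag.
struct ToyFence {
    signaled: bool,
}

impl ToyFence {
    fn new(signaled: bool) -> Self {
        Self { signaled }
    }
    /// Models wait_for_fences: false means "would block forever".
    fn wait(&self) -> bool {
        self.signaled
    }
    /// Models reset_fences at the start of a frame.
    fn reset(&mut self) {
        self.signaled = false;
    }
    /// Models what queue_submit's fence argument does when GPU work finishes.
    fn signal(&mut self) {
        self.signaled = true;
    }
}

fn main() {
    // Created SIGNALED: frame 1 passes the wait immediately.
    let mut fence = ToyFence::new(true);
    assert!(fence.wait());
    fence.reset();
    // The GPU (played here by us) signals at the end of the frame,
    // so frame 2's wait succeeds as well.
    fence.signal();
    assert!(fence.wait());
    // Created unsignaled, the very first wait would never return.
    assert!(!ToyFence::new(false).wait());
}
```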
Step 2: Create a command pool and command buffer
use vulkan_rust::vk;
use vk::*;
// ── Command pool ───────────────────────────────────────────────
let pool_info = CommandPoolCreateInfo::builder()
.flags(CommandPoolCreateFlags::RESET_COMMAND_BUFFER)
.queue_family_index(graphics_family_index);
let command_pool = unsafe { device.create_command_pool(&pool_info, None) }
.expect("Failed to create command pool");
// ── Allocate one command buffer ────────────────────────────────
let alloc_info = CommandBufferAllocateInfo::builder()
.command_pool(command_pool)
.level(CommandBufferLevel::PRIMARY)
.command_buffer_count(1);
let command_buffer = unsafe {
device.allocate_command_buffers(&alloc_info)
}
.expect("Failed to allocate command buffer")[0];
Step 3: Record drawing commands
This function records all the commands needed to draw one frame. We call it every frame with the correct framebuffer for the current swapchain image.
use vulkan_rust::vk;
use vk::*;
unsafe fn record_commands(
device: &vulkan_rust::Device,
command_buffer: CommandBuffer,
render_pass: RenderPass,
framebuffer: Framebuffer,
pipeline: Pipeline,
extent: Extent2D,
) {
unsafe {
// ── Begin recording ────────────────────────────────────────
let begin_info = CommandBufferBeginInfo::builder();
device.begin_command_buffer(command_buffer, &begin_info)
.expect("Failed to begin command buffer");
// ── Begin render pass ──────────────────────────────────────
let clear_value = ClearValue {
color: ClearColorValue {
float32: [0.0, 0.0, 0.0, 1.0], // black
},
};
let clear_values = [clear_value];
let rp_begin = RenderPassBeginInfo::builder()
.render_pass(render_pass)
.framebuffer(framebuffer)
.render_area(Rect2D {
offset: Offset2D { x: 0, y: 0 },
extent,
})
.clear_values(&clear_values);
device.cmd_begin_render_pass(
command_buffer,
&rp_begin,
SubpassContents::INLINE,
);
// ── Bind the pipeline ──────────────────────────────────────
device.cmd_bind_pipeline(
command_buffer,
PipelineBindPoint::GRAPHICS,
pipeline,
);
// ── Set dynamic viewport and scissor ───────────────────────
let viewport = Viewport {
x: 0.0,
y: 0.0,
width: extent.width as f32,
height: extent.height as f32,
min_depth: 0.0,
max_depth: 1.0,
};
device.cmd_set_viewport(command_buffer, 0, &[viewport]);
let scissor = Rect2D {
offset: Offset2D { x: 0, y: 0 },
extent,
};
device.cmd_set_scissor(command_buffer, 0, &[scissor]);
// ── Draw the triangle ──────────────────────────────────────
//
// 3 vertices, 1 instance, starting at vertex 0, instance 0.
// The vertex data is hard-coded in the shader.
device.cmd_draw(command_buffer, 3, 1, 0, 0);
// ── End render pass and recording ──────────────────────────
device.cmd_end_render_pass(command_buffer);
device.end_command_buffer(command_buffer)
.expect("Failed to end command buffer");
}
}
This is the core of every Vulkan frame: begin recording, begin render pass, bind pipeline, set state, draw, end render pass, end recording.
Step 4: The render loop
Now we tie everything together in the event loop. Each frame:
- Wait for the previous frame’s fence (CPU waits for GPU).
- Acquire the next swapchain image (GPU signals image_available).
- Record commands into the command buffer.
- Submit the command buffer (waits on image_available, signals render_finished and the fence).
- Present the image (waits on render_finished).
use winit::application::ApplicationHandler;
use winit::event::WindowEvent;
use winit::event_loop::{ActiveEventLoop, EventLoop};
use winit::window::WindowId;
impl ApplicationHandler for App {
fn resumed(&mut self, _event_loop: &ActiveEventLoop) {
// Window and Vulkan setup already done (see Part 2).
}
fn window_event(
&mut self,
event_loop: &ActiveEventLoop,
_id: WindowId,
event: WindowEvent,
) {
match event {
WindowEvent::CloseRequested => {
event_loop.exit();
}
WindowEvent::RedrawRequested => {
unsafe { self.draw_frame() };
// Request the next frame immediately.
self.window.as_ref().unwrap().request_redraw();
}
_ => {}
}
}
}
// In main:
let event_loop = EventLoop::new().expect("Failed to create event loop");
event_loop.run_app(&mut app).expect("Event loop error");
The draw_frame function:
use vulkan_rust::vk;
use vk::*;
unsafe fn draw_frame(
device: &vulkan_rust::Device,
swapchain: SwapchainKHR,
in_flight_fence: Fence,
image_available: Semaphore,
render_finished: Semaphore,
command_buffer: CommandBuffer,
framebuffers: &[Framebuffer],
render_pass: RenderPass,
pipeline: Pipeline,
extent: Extent2D,
graphics_queue: Queue,
) {
unsafe {
// ── 1. Wait for previous frame ─────────────────────────────
device.wait_for_fences(&[in_flight_fence], true, u64::MAX)
.expect("Failed to wait for fence");
device.reset_fences(&[in_flight_fence])
.expect("Failed to reset fence");
// ── 2. Acquire next swapchain image ────────────────────────
let image_index = device
.acquire_next_image_khr(
swapchain,
u64::MAX,
image_available,
Fence::null(),
)
.expect("Failed to acquire swapchain image");
// ── 3. Record commands ─────────────────────────────────────
device.reset_command_buffer(
command_buffer,
CommandBufferResetFlags::empty(),
)
.expect("Failed to reset command buffer");
record_commands(
device,
command_buffer,
render_pass,
framebuffers[image_index as usize],
pipeline,
extent,
);
// ── 4. Submit ──────────────────────────────────────────────
let wait_sems = [image_available];
let wait_stages = [PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];
let cmd_bufs = [command_buffer];
let signal_sems = [render_finished];
let submit_info = SubmitInfo::builder()
.wait_semaphores(&wait_sems)
.wait_dst_stage_mask(&wait_stages)
.command_buffers(&cmd_bufs)
.signal_semaphores(&signal_sems);
device.queue_submit(graphics_queue, &[*submit_info], in_flight_fence)
.expect("Failed to submit draw command buffer");
// ── 5. Present ─────────────────────────────────────────────
let present_wait = [render_finished];
let swapchains = [swapchain];
let indices = [image_index];
let present_info = PresentInfoKHR::builder()
.wait_semaphores(&present_wait)
.swapchains(&swapchains)
.image_indices(&indices);
device.queue_present_khr(graphics_queue, &present_info)
.expect("Failed to present");
}
}
The synchronization flow each frame:
CPU: wait_for_fences ────────────────────────────────────> (free to continue)
│
v
GPU: acquire_next_image ──signals──> image_available
│ (GPU waits at COLOR_ATTACHMENT_OUTPUT)
v
GPU: queue_submit ──signals──> render_finished
│ ──signals──> in_flight_fence
v │
GPU: queue_present │
│
CPU: (next frame) wait_for_fences <────────┘
Step 5: Wait before cleanup
Before destroying anything, wait for the GPU to finish all work:
// After the event loop exits:
unsafe { device.device_wait_idle() }
.expect("Failed to wait for device idle");
Then destroy everything in reverse creation order:
unsafe {
device.destroy_fence(in_flight_fence, None);
device.destroy_semaphore(render_finished, None);
device.destroy_semaphore(image_available, None);
device.destroy_command_pool(command_pool, None);
for &fb in &framebuffers {
device.destroy_framebuffer(fb, None);
}
device.destroy_pipeline(pipeline, None);
device.destroy_pipeline_layout(pipeline_layout, None);
device.destroy_render_pass(render_pass, None);
for &view in &swapchain_image_views {
device.destroy_image_view(view, None);
}
device.destroy_swapchain_khr(swapchain, None);
device.destroy_device(None);
instance.destroy_surface(surface, None);
instance.destroy_instance(None);
}
You did it
Run cargo run. You should see a window with a colored triangle on a
black background:
┌──────────────────────────────────────┐
│ │
│ ▲ (red) │
│ ╱ ╲ │
│ ╱ ╲ │
│ (blue) ╱ ╲ (green) │
│ ▔▔▔▔▔ │
│ │
└──────────────────────────────────────┘
If you see a black window with no triangle, check these common issues:
- Validation errors in the console. Read them. They usually point directly at the problem.
- Front face winding. If your triangle vertices are wound counter-clockwise but you set CLOCKWISE, the triangle is culled. Try CullModeFlags::NONE to test.
- Missing SPIR-V files. include_bytes! causes a compile-time error if the file is not found.
What we built across all four parts
Part 1: Entry ──> Instance ──> PhysicalDevice ──> Device ──> Queue
Part 2: Window ──> Surface ──> Swapchain ──> ImageViews
Part 3: Shaders ──> RenderPass ──> Pipeline ──> Framebuffers
Part 4: Sync objects ──> CommandPool/Buffer ──> Render loop
Every Vulkan application follows this structure. The details change (more pipelines, more buffers, more complex synchronization), but the architecture is the same.
What we skipped
This tutorial focused on getting a triangle on screen. A production application would add:
- Multiple frames in flight to avoid the CPU waiting for the GPU every frame. See Double Buffering.
- Window resize handling to recreate the swapchain when the window size changes. See Handle Window Resize.
- Vertex buffers to pass vertex data from CPU memory to the GPU. See Memory Management.
- Descriptor sets to pass uniforms and textures to shaders. See Descriptor Sets.
- Depth testing for 3D rendering.
Exercises
- Change the triangle color. Modify the fragment shader (or the vertex shader’s color array) and recompile the SPIR-V.
- Draw a rectangle. Change the shader to output 6 vertices (two triangles) and update the cmd_draw vertex count.
- Add frames in flight. Create two sets of sync objects and command buffers. Alternate between them each frame so the CPU can record frame N+1 while the GPU renders frame N.
- Handle resize. When the window is resized, recreate the swapchain, image views, and framebuffers. The Handle Window Resize guide covers this.
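For the frames-in-flight exercise, the core bookkeeping is just a counter that wraps modulo the number of frames; every per-frame resource (semaphores, fence, command buffer) is indexed by it. A minimal sketch (next_frame is illustrative, not a crate API):

```rust
/// Two frames in flight is the common choice: the CPU records frame N+1
/// while the GPU still renders frame N.
const MAX_FRAMES_IN_FLIGHT: usize = 2;

/// Advance the per-frame resource index, wrapping around.
fn next_frame(current: usize) -> usize {
    (current + 1) % MAX_FRAMES_IN_FLIGHT
}

fn main() {
    // The index cycles 0, 1, 0, 1, ... selecting which set of sync
    // objects and command buffers this frame uses.
    let mut frame = 0;
    let mut seen = Vec::new();
    for _ in 0..4 {
        seen.push(frame);
        frame = next_frame(frame);
    }
    assert_eq!(seen, vec![0, 1, 0, 1]);
}
```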
Where to go from here
- Concepts section: deep dives into every Vulkan subsystem.
- How-To Guides: recipes for specific tasks (textures, resize, push constants).
- API reference: every type and method with spec links and error codes.
How to Read This Section
This section explains how Vulkan works, not as a tutorial to follow, but as a set of mental models you can carry with you while writing any Vulkan code.
Structure of each chapter
Every concept chapter follows the same four-part structure:
| Part | Purpose | How to use it |
|---|---|---|
| Motivation | Why this concept exists | Read first, it tells you what problem you’re solving |
| Intuition | Analogy, diagram, or informal explanation | Build a mental picture before touching code |
| Worked example | Annotated code showing the concept in practice | Read the annotations, not just the code |
| Formal reference | Spec terminology, edge cases, API links | Come back to this when you need precision |
You do not need to memorize the formal reference on first reading. The intuition and worked example are enough to start writing code. The formal section is there for when your intuition hits an edge case and you need to know exactly what the spec says.
Threshold concepts
Some ideas in Vulkan are threshold concepts, once they click, they permanently change how you understand the API. These are flagged with a marker:
Threshold concept. This idea transforms how you think about Vulkan. If it feels confusing, that is normal, it means your mental model is being restructured. Stay with it.
The three biggest threshold concepts in Vulkan are:
- Explicit memory management, you allocate GPU memory yourself and decide what goes where.
- Synchronization is your responsibility, the GPU runs asynchronously and Vulkan gives you no implicit ordering guarantees.
- State is baked into pipeline objects, you cannot change rendering state on the fly like in OpenGL.
Reading order
The chapters are ordered by dependency, each builds on the ones before it. If a concept doesn’t make sense, check the dependency map to see which prerequisite you might need to revisit.
Two chapters are independent of this ordering and can be read at any time.
Active reading
Throughout each chapter, you will find questions like:
Before reading on: why do you think Vulkan requires explicit synchronization instead of handling it automatically?
These are retrieval prompts. Pausing to answer, even briefly, even wrong, significantly improves retention. You are not expected to know the answer. The act of thinking about it before reading the explanation is what matters.
The Vulkan Object Model
Motivation
Every Vulkan API call operates on handles, opaque references to objects that live on the GPU or in the driver. Before you can do anything useful in Vulkan, you need to understand what these handles are, how they relate to each other, and who is responsible for destroying them.
If you have used file descriptors on Unix, database connections, or COM objects on Windows, the concept is the same: you request a resource, you get back an opaque identifier, you use that identifier in every subsequent call, and you close it when you are done. Vulkan has roughly 59 different handle types, but they all follow this pattern.
Intuition
Handles are opaque identifiers, not objects
A Vulkan handle is not a pointer to a struct you can inspect. It is an opaque number the driver gives you. You pass it back to the driver in later calls, and the driver uses it to look up the real resource internally. You never dereference a handle or read its fields.
In vulkan_rust, every handle is a #[repr(transparent)] newtype over
either usize or u64:
// This is the entire definition of a Buffer handle.
// There is nothing inside it except a number.
#[repr(transparent)]
pub struct Buffer(u64);
Handles form a parent-child tree
Vulkan objects are not independent. They form a hierarchy where each object is created from (and belongs to) a parent:
Instance (your connection to the Vulkan driver)
├── PhysicalDevice (a GPU on the system, enumerated, not created)
│ └── Device (your logical interface to that GPU)
│ ├── Queue (a submission endpoint, retrieved, not created)
│ ├── CommandPool
│ │ └── CommandBuffer (allocated from a pool, not created directly)
│ ├── Buffer
│ ├── Image
│ ├── Fence
│ ├── Semaphore
│ ├── Pipeline
│ ├── DescriptorPool
│ │ └── DescriptorSet (allocated from a pool, not created directly)
│ └── ... (~50 more types)
└── SurfaceKHR (a window's rendering target)
This hierarchy determines two things:
- Creation order. You cannot create a
Bufferwithout aDevice, and you cannot create aDevicewithout aPhysicalDevice, which requires anInstance. - Destruction order. You must destroy children before their parent.
If you destroy a Device while it still has live Buffer handles, that is undefined behavior.
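A useful rule of thumb: reversing creation order always yields a valid destruction order, because every child is necessarily created after its parent. A sketch with plain strings (destruction_order is illustrative only; a real application must also ensure the GPU is idle first):

```rust
/// Record object names in creation order; destroying in reverse then
/// guarantees every child is destroyed before its parent.
fn destruction_order(created_in_order: &[&str]) -> Vec<String> {
    created_in_order.iter().rev().map(|s| s.to_string()).collect()
}

fn main() {
    // A Buffer needs a Device, which needs an Instance, so creation
    // order is Instance -> Device -> Buffer.
    let created = ["Instance", "Device", "Buffer"];
    // Destruction runs child-first: Buffer, then Device, then Instance.
    assert_eq!(
        destruction_order(&created),
        vec!["Buffer", "Device", "Instance"]
    );
}
```

This is exactly the shape of the cleanup blocks in the Hello Triangle tutorial: each one is the creation sequence read backwards.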
Before reading on: look at the tree above. Why do you think CommandBuffer and DescriptorSet are “allocated from a pool” instead of “created directly” like Buffer or Image?
The creation-destruction lifecycle
Almost every Vulkan object follows the same lifecycle:
1. Fill a CreateInfo struct (describe what you want)
2. Call create_xxx() (driver creates it, gives you a handle)
3. Use the handle (pass it to other API calls)
4. Call destroy_xxx() (you are done, release it)
The exception is objects that are enumerated (PhysicalDevice, Queue) or allocated from pools (CommandBuffer, DescriptorSet). These have slightly different creation/destruction patterns, covered below.
Dispatchable vs non-dispatchable handles
Vulkan has two categories of handle, and the difference matters for understanding how the driver works internally.
Dispatchable handles (Instance, PhysicalDevice, Device,
CommandBuffer, Queue) are pointer-sized (usize). Internally, the
driver stores a dispatch table at the address the handle points to.
When you call a Vulkan function, the loader uses this dispatch table
to route the call to the correct driver. There are only 5 dispatchable
handle types.
Non-dispatchable handles (Buffer, Image, Fence, Pipeline, and
all the rest) are 64-bit integers (u64). They are opaque identifiers
that the driver interprets however it likes. There are roughly 54 of
these.
You rarely need to think about this distinction in application code. It matters when you are doing interop (passing handles between processes or APIs) or when you are debugging driver internals.
Worked example: the complete lifecycle of a Buffer
This example shows the full create-use-destroy lifecycle. Each step is labeled with its purpose.
use vulkan_rust::vk;
use vulkan_rust::vk::*;
use vulkan_rust::vk::Handle;
use vulkan_rust::Device;
unsafe fn buffer_lifecycle(device: &Device) {
// ── Step 1: Describe what you want ──────────────────────────
//
// Every create call takes a CreateInfo struct. The builder
// fills in sType automatically and provides a typed API
// for each field.
let buffer_info = BufferCreateInfo::builder()
.size(1024) // 1 KiB
.usage(BufferUsageFlags::VERTEX_BUFFER)
.sharing_mode(SharingMode::EXCLUSIVE);
// ── Step 2: Create the object ───────────────────────────────
//
// The driver allocates the resource and returns a handle.
// This can fail (out of memory, invalid parameters), so it
// returns a Result.
let buffer: Buffer = device
.create_buffer(&buffer_info, None)
.expect("Failed to create buffer");
// The handle is just a number. You can copy it, compare it,
// hash it, or check if it is null.
assert!(!buffer.is_null());
// ── Step 3: Use the handle ──────────────────────────────────
//
// You would normally bind memory to this buffer, then use
// it in command buffer recording. For this example, we just
// show that the handle is a lightweight Copy type.
let buffer_copy = buffer; // handles are Copy
assert_eq!(buffer, buffer_copy);
// ── Step 4: Destroy the object ──────────────────────────────
//
// You must destroy the buffer before destroying the Device
// that created it. vulkan_rust does not track this for you.
// There is no Drop implementation. You are responsible.
device.destroy_buffer(buffer, None);
// After this point, using `buffer` is undefined behavior.
// Rust's type system does not prevent this, the handle is
// still a valid Copy value. Vulkan's validation layers
// will catch use-after-destroy if you enable them.
}
Before reading on: the code above calls
device.destroy_buffer(buffer, None). What do you think the second argument (None) is for? Hint: it relates to custom memory allocation, not GPU memory.
Objects that come from pools
CommandBuffers and DescriptorSets are not created individually. They are allocated in bulk from a pool, and freed back to that pool (or the entire pool is reset/destroyed at once):
use vulkan_rust::vk;
use vulkan_rust::vk::*;
use vulkan_rust::vk::Handle;
// Pool-based lifecycle (simplified)
unsafe {
// Create the pool (this is a normal create/destroy object).
let pool_info = CommandPoolCreateInfo::builder()
.queue_family_index(graphics_queue_family);
let pool = device.create_command_pool(&pool_info, None)?;
// Allocate command buffers FROM the pool.
let alloc_info = CommandBufferAllocateInfo::builder()
.command_pool(pool)
.level(CommandBufferLevel::PRIMARY)
.command_buffer_count(2);
let command_buffers = device.allocate_command_buffers(&alloc_info)?;
// Use command_buffers[0], command_buffers[1], ...
// Option A: Free individual command buffers back to the pool.
device.free_command_buffers(pool, &command_buffers);
// Option B: Reset the entire pool (returns all buffers to initial state).
device.reset_command_pool(pool, CommandPoolResetFlags::empty())?;
// Destroy the pool (implicitly frees all remaining command buffers).
device.destroy_command_pool(pool, None);
}
This pool pattern exists for performance: allocating and freeing individual small objects is expensive, so Vulkan amortizes the cost by batching them through pools.
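The amortization idea can be sketched without Vulkan: a bump allocator that hands out handles cheaply and reclaims them all in one reset (ToyPool is illustrative, not a vulkan_rust type):

```rust
/// Minimal illustration of the pool pattern: allocate in bulk, reset in bulk.
struct ToyPool {
    next: usize,
    capacity: usize,
}

impl ToyPool {
    fn new(capacity: usize) -> Self {
        Self { next: 0, capacity }
    }
    /// Hand out `count` handles (here: indices) with one cheap bump.
    fn allocate(&mut self, count: usize) -> Option<Vec<usize>> {
        if self.next + count > self.capacity {
            return None; // pool exhausted
        }
        let handles = (self.next..self.next + count).collect();
        self.next += count;
        Some(handles)
    }
    /// Resetting reclaims everything at once; no per-object free needed.
    fn reset(&mut self) {
        self.next = 0;
    }
}

fn main() {
    let mut pool = ToyPool::new(4);
    assert_eq!(pool.allocate(2), Some(vec![0, 1]));
    assert_eq!(pool.allocate(3), None); // not enough room left
    pool.reset();
    assert_eq!(pool.allocate(3), Some(vec![0, 1, 2]));
}
```

Real driver pools are more sophisticated, but the cost model is the same: per-object bookkeeping is replaced by one pointer bump on allocation and one reset on reclamation.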
Objects that are enumerated, not created
PhysicalDevices and Queues are not created by you. They are discovered:
unsafe {
// PhysicalDevices: the driver tells you what GPUs exist.
let physical_devices = instance.enumerate_physical_devices()?;
// Queues: retrieved from a Device after creation.
let queue = device.get_device_queue(queue_family_index, 0);
}
You do not destroy enumerated objects. Their lifetime is tied to their parent (PhysicalDevice lives as long as the Instance, Queue lives as long as the Device).
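Discovery usually ends with a selection step: scanning the enumerated properties for the index you need. Here is a self-contained sketch of the queue-family search, using a mock flags array in place of `get_physical_device_queue_family_properties` (the bit values match Vulkan's `QueueFlags`; real code would test `QueueFlags::GRAPHICS` on each family's `queue_flags`):

```rust
// Mock of VkQueueFlags bits (these values match the Vulkan spec).
const GRAPHICS: u32 = 0x1;
const COMPUTE: u32 = 0x2;
const TRANSFER: u32 = 0x4;

/// Find the first queue family whose flags contain all `required` bits.
/// Real code would iterate `QueueFamilyProperties.queue_flags` instead
/// of a plain `u32` slice.
fn find_queue_family(families: &[u32], required: u32) -> Option<u32> {
    families
        .iter()
        .position(|&flags| flags & required == required)
        .map(|i| i as u32)
}

fn main() {
    // A typical discrete GPU: family 0 does everything,
    // family 1 is a transfer-only DMA queue.
    let families = [GRAPHICS | COMPUTE | TRANSFER, TRANSFER];
    assert_eq!(find_queue_family(&families, GRAPHICS), Some(0));
    assert_eq!(find_queue_family(&families, TRANSFER), Some(0));
    assert_eq!(find_queue_family(&families, 0x8), None); // no sparse binding
}
```

The returned index is what you then pass to `get_device_queue` and to command pool creation.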
Formal reference
The Handle trait
Every handle type in vulkan_rust implements the Handle trait:
pub trait Handle: Copy + Eq + Hash {
type Repr; // usize or u64
fn null() -> Self; // the null handle (0)
fn from_raw(raw: Self::Repr) -> Self;
fn as_raw(self) -> Self::Repr;
fn is_null(self) -> bool;
}
All handles also derive Copy, Clone, PartialEq, Eq, Hash,
Default (returns null), and Debug (prints the type name and hex value).
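To make the shape of the trait concrete, here is a self-contained mock of a non-dispatchable handle: a plain newtype over `u64`. The real types in `vulkan_rust` are generated; this only illustrates the pattern:

```rust
/// Mock of a non-dispatchable handle: an opaque u64 newtype.
#[derive(Copy, Clone, PartialEq, Eq, Hash, Default, Debug)]
struct Buffer(u64);

impl Buffer {
    fn null() -> Self { Buffer(0) }
    fn from_raw(raw: u64) -> Self { Buffer(raw) }
    fn as_raw(self) -> u64 { self.0 }
    fn is_null(self) -> bool { self.0 == 0 }
}

fn main() {
    let b = Buffer::null();
    assert!(b.is_null());              // Default and null() agree
    assert_eq!(b, Buffer::default());

    // Round-trip through the raw representation: what you would do
    // when passing a handle across an FFI boundary.
    let raw = 0x5566_7788_u64;
    let b2 = Buffer::from_raw(raw);
    assert_eq!(b2.as_raw(), raw);
    assert!(!b2.is_null());

    // Copy semantics: both variables name the same GPU object.
    let b3 = b2;
    assert_eq!(b2, b3);
}
```

Note what the `Copy` in the last lines implies: copying the handle does not copy the GPU object, which is exactly why destruction cannot be automatic.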
Handle categories
| Category | Repr | Examples | Count |
|---|---|---|---|
| Dispatchable | usize | Instance, PhysicalDevice, Device, CommandBuffer, Queue | 5 |
| Non-dispatchable | u64 | Buffer, Image, Fence, Semaphore, Pipeline, … | ~54 |
Destruction rules
- You must destroy what you create. vulkan_rust has no Drop implementations on handles. This is deliberate: automatic destruction would require tracking creation order, reference counting, and deferred destruction (the GPU might still be using the object). That complexity belongs in your application, not in the bindings.
- Destroy children before parents. The tree above defines the order. Validation layers will warn you if you get it wrong.
- The GPU must be done with an object before you destroy it. If a command buffer references a Buffer that you then destroy, the GPU will read freed memory. Use fences or device_wait_idle() to ensure GPU work has completed.
- Pool destruction frees all children. Destroying a CommandPool implicitly frees all CommandBuffers allocated from it. Same for DescriptorPool and DescriptorSets.
- Enumerated objects are not destroyed. PhysicalDevice and Queue handles are valid for the lifetime of their parent.
Interop: from_raw_parts
If another system creates Vulkan objects for you (OpenXR, a C library, a test harness), you can wrap them:
// Wrap an externally-created Instance.
let instance = unsafe {
Instance::from_raw_parts(raw_instance_handle, get_instance_proc_addr_fn)
};
// Wrap an externally-created Device.
let device = unsafe {
Device::from_raw_parts(raw_device_handle, get_device_proc_addr_fn)
};
The wrapped objects load all function pointers from the provided
get_*_proc_addr function, so they work identically to objects
created through Entry::create_instance.
API reference links
Key takeaways
- Vulkan handles are opaque numbers, not pointers to inspectable structs.
- Handles form a parent-child tree. Create bottom-up, destroy top-down.
- Most objects follow create → use → destroy. Pools and enumerated objects are the two exceptions.
- vulkan_rust gives you Copy handles with no Drop. You manage lifetimes. Validation layers are your safety net during development.
Memory Management
Threshold concept. Vulkan memory management permanently changes how you think about GPU resources. In OpenGL, the driver decided where your data lived. In Vulkan, you decide, and that decision affects performance more than almost anything else.
Motivation
A GPU has multiple memory pools with different properties: some are fast
for the GPU but invisible to the CPU, some are accessible to both but
slower, some are special-purpose. OpenGL hid this complexity behind
glBufferData and hoped the driver would make good choices. Sometimes
it did. Often it didn’t.
Vulkan exposes this hardware reality directly because the “right” memory choice depends on your workload, and only you know your workload. A mesh that never changes after upload needs different memory than a uniform buffer you update every frame.
Intuition
The warehouse analogy
Think of GPU memory like a warehouse with different storage areas:
- Device-local memory is the high-speed shelving right next to the assembly line (GPU cores). Fast to access, but the front office (CPU) can’t reach it directly.
- Host-visible memory is the loading dock, both the warehouse workers (GPU) and the delivery trucks (CPU) can access it, but it’s slower for the assembly line.
- Host-coherent memory is a special loading dock where changes are immediately visible to both sides, without needing to shout “new stuff here!” (flush/invalidate).
- Host-cached memory is a loading dock with a clipboard: the CPU reads are fast because they come from a cache, but you need to invalidate before reading to make sure the clipboard is up to date.
The two-step binding model
In Vulkan, creating a Buffer and allocating memory for it are separate operations. This is different from most APIs and is often the first surprise:
1. Create a Buffer (describes shape and usage, no memory yet)
2. Query memory requirements (driver tells you: size, alignment, compatible types)
3. Allocate DeviceMemory (reserve a block from a memory pool)
4. Bind memory to buffer (connect the two)
This separation exists because multiple buffers can share a single memory allocation (sub-allocation), which is far more efficient than allocating individually. Production Vulkan applications almost always use a memory allocator (like VMA) to manage sub-allocation, but understanding the raw API is essential before using one.
Before reading on: why do you think Vulkan separates “create buffer” from “allocate memory”? What advantage does this give you that a single
create_buffer_with_memory() call would not?
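One concrete payoff of the separation is sub-allocation, and sub-allocation is mostly arithmetic: each resource is placed at the next offset inside one big allocation, rounded up to that resource's alignment requirement. A minimal, self-contained sketch of that bookkeeping (hypothetical helper names; real allocators such as VMA also handle freeing, memory-type buckets, and dedicated allocations):

```rust
/// Round `offset` up to the next multiple of `alignment`
/// (alignments from MemoryRequirements are powers of two).
fn align_up(offset: u64, alignment: u64) -> u64 {
    (offset + alignment - 1) & !(alignment - 1)
}

/// Given (size, alignment) pairs as reported by
/// get_buffer_memory_requirements, compute each resource's offset
/// inside one shared allocation and the total size to allocate.
fn plan_suballocation(requirements: &[(u64, u64)]) -> (Vec<u64>, u64) {
    let mut offsets = Vec::new();
    let mut cursor = 0u64;
    for &(size, alignment) in requirements {
        let offset = align_up(cursor, alignment);
        offsets.push(offset);
        cursor = offset + size;
    }
    (offsets, cursor)
}

fn main() {
    // Three buffers: 100 B @ 256, 4096 B @ 256, 16 B @ 64.
    let (offsets, total) =
        plan_suballocation(&[(100, 256), (4096, 256), (16, 64)]);
    assert_eq!(offsets, vec![0, 256, 4352]);
    assert_eq!(total, 4368);
    // Each buffer would then be bound with one call per resource:
    //   device.bind_buffer_memory(buffer, shared_memory, offsets[i])
}
```

One `allocate_memory` call instead of three, which matters because implementations cap the total number of allocations (`maxMemoryAllocationCount`).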
Memory types and heaps
Every Vulkan device exposes a set of memory heaps (physical pools of VRAM or system RAM) and memory types (combinations of properties that describe how a heap can be used).
┌─────────────────────────────────────────────────────────┐
│ Physical Device Memory Properties │
│ │
│ Heaps: │
│ ┌──────────────────────┐ ┌──────────────────────────┐ │
│ │ Heap 0: 8 GiB │ │ Heap 1: 16 GiB │ │
│ │ flags: DEVICE_LOCAL │ │ flags: (none) │ │
│ │ (dedicated GPU VRAM) │ │ (system RAM) │ │
│ └──────────────────────┘ └──────────────────────────┘ │
│ │
│ Memory Types (each points to a heap): │
│ ┌─────────────────────────────────────────────┐ │
│ │ Type 0: heap 0, DEVICE_LOCAL │ │
│ │ Type 1: heap 1, HOST_VISIBLE | HOST_COHERENT│ │
│ │ Type 2: heap 0, DEVICE_LOCAL | HOST_VISIBLE │ ←BAR │
│ │ Type 3: heap 1, HOST_VISIBLE | HOST_CACHED │ │
│ └─────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────┘
The number and properties of heaps and types vary between GPUs. A discrete GPU typically has separate heaps for VRAM and system RAM. An integrated GPU often has a single heap that is both device-local and host-visible. Your code must query these at runtime and choose accordingly.
The decision tree
When allocating memory for a resource, follow this logic:
Is this data written by the CPU every frame?
├── Yes → HOST_VISIBLE | HOST_COHERENT
│ (uniform buffers, dynamic vertex data)
│
└── No → Is this data uploaded once and never touched again?
├── Yes → DEVICE_LOCAL (use a staging buffer to upload)
│ (static meshes, textures)
│
└── No → Is this data read back by the CPU?
├── Yes → HOST_VISIBLE | HOST_CACHED
│ (readback buffers, screenshots)
│
└── No → DEVICE_LOCAL
(render targets, compute output)
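The tree translates directly into code. Here is a self-contained sketch that encodes it as a function and exercises it against the hypothetical memory types from the diagram above (the flag bits match Vulkan's `MemoryPropertyFlags` values; real code would use the named constants):

```rust
// Bits match VkMemoryPropertyFlagBits.
const DEVICE_LOCAL: u32 = 0x1;
const HOST_VISIBLE: u32 = 0x2;
const HOST_COHERENT: u32 = 0x4;
const HOST_CACHED: u32 = 0x8;

/// The decision tree as a function: access pattern in, desired flags out.
fn desired_flags(cpu_writes_every_frame: bool, cpu_reads_back: bool) -> u32 {
    if cpu_writes_every_frame {
        HOST_VISIBLE | HOST_COHERENT // uniforms, dynamic vertex data
    } else if cpu_reads_back {
        HOST_VISIBLE | HOST_CACHED   // readback buffers, screenshots
    } else {
        DEVICE_LOCAL                 // static data, render targets
    }
}

fn main() {
    // The four memory types from the diagram above.
    let types = [
        DEVICE_LOCAL,                 // Type 0
        HOST_VISIBLE | HOST_COHERENT, // Type 1
        DEVICE_LOCAL | HOST_VISIBLE,  // Type 2 (BAR)
        HOST_VISIBLE | HOST_CACHED,   // Type 3
    ];
    let pick = |desired: u32| types.iter().position(|&t| t & desired == desired);
    assert_eq!(pick(desired_flags(true, false)), Some(1));  // uniform buffer
    assert_eq!(pick(desired_flags(false, true)), Some(3));  // readback
    assert_eq!(pick(desired_flags(false, false)), Some(0)); // static mesh
}
```

On real hardware you would additionally intersect the candidates with `memory_type_bits` from the resource's memory requirements, as the worked example below does.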
Worked example: uploading a mesh to the GPU
This is the most common memory operation in Vulkan: getting vertex data from the CPU into fast GPU memory. It uses the staging buffer pattern.
Step 1: Create the destination buffer
use vulkan_rust::vk;
use vk::*;
// The buffer that will hold the mesh on the GPU.
// TRANSFER_DST means "this buffer can receive data from a copy command."
let buffer_info = BufferCreateInfo::builder()
.size(vertex_data_size)
.usage(
BufferUsageFlags::VERTEX_BUFFER
| BufferUsageFlags::TRANSFER_DST
)
.sharing_mode(SharingMode::EXCLUSIVE);
let gpu_buffer = unsafe { device.create_buffer(&buffer_info, None)? };
Step 2: Query what memory this buffer needs
// The driver tells us: how many bytes, what alignment, and which
// memory types are compatible with this buffer.
let mem_requirements = unsafe {
device.get_buffer_memory_requirements(gpu_buffer)
};
// mem_requirements.size → minimum allocation size
// mem_requirements.alignment → byte alignment requirement
// mem_requirements.memory_type_bits → bitmask of compatible memory types
Step 3: Find the right memory type
use vulkan_rust::vk;
use vk::*;
// Query what memory the hardware offers.
let mem_properties = unsafe {
instance.get_physical_device_memory_properties(physical_device)
};
// Find a memory type that is:
// 1. Compatible with the buffer (listed in memory_type_bits)
// 2. Device-local (fast GPU access)
let desired = MemoryPropertyFlags::DEVICE_LOCAL;
let memory_type_index = (0..mem_properties.memory_type_count)
.find(|&i| {
let type_compatible =
mem_requirements.memory_type_bits & (1 << i) != 0;
let properties_match =
mem_properties.memory_types[i as usize]
.property_flags & desired == desired;
type_compatible && properties_match
})
.expect("No suitable memory type found");
Before reading on: the code above iterates memory types in order (0, 1, 2, …). The Vulkan spec recommends that drivers list memory types from most preferred to least preferred. Why does picking the first match give you the best performance?
Step 4: Allocate and bind
use vulkan_rust::vk;
use vk::*;
let alloc_info = MemoryAllocateInfo::builder()
.allocation_size(mem_requirements.size)
.memory_type_index(memory_type_index);
let gpu_memory = unsafe { device.allocate_memory(&alloc_info, None)? };
// Bind the memory to the buffer. After this, the buffer is backed
// by real memory and can be used.
unsafe { device.bind_buffer_memory(gpu_buffer, gpu_memory, 0)? };
Step 5: Upload via staging buffer
Device-local memory is usually not host-visible, so you can’t write to it directly from the CPU. The solution: create a temporary staging buffer in host-visible memory, write your data there, then copy to the GPU buffer.
use vulkan_rust::vk;
use vk::*;
// Create a temporary staging buffer in host-visible memory.
let staging_info = BufferCreateInfo::builder()
.size(vertex_data_size)
.usage(BufferUsageFlags::TRANSFER_SRC)
.sharing_mode(SharingMode::EXCLUSIVE);
let staging_buffer = unsafe { device.create_buffer(&staging_info, None)? };
let staging_reqs = unsafe {
device.get_buffer_memory_requirements(staging_buffer)
};
// Find HOST_VISIBLE | HOST_COHERENT memory for the staging buffer.
let staging_desired =
MemoryPropertyFlags::HOST_VISIBLE
| MemoryPropertyFlags::HOST_COHERENT;
let staging_type_index = (0..mem_properties.memory_type_count)
.find(|&i| {
let type_ok = staging_reqs.memory_type_bits & (1 << i) != 0;
let props_ok =
mem_properties.memory_types[i as usize]
.property_flags & staging_desired == staging_desired;
type_ok && props_ok
})
.expect("No host-visible memory type found");
let staging_alloc = MemoryAllocateInfo::builder()
.allocation_size(staging_reqs.size)
.memory_type_index(staging_type_index);
let staging_memory = unsafe {
device.allocate_memory(&staging_alloc, None)?
};
unsafe { device.bind_buffer_memory(staging_buffer, staging_memory, 0)? };
// Map the staging memory, copy vertex data in, then unmap.
unsafe {
let data_ptr = device.map_memory(
staging_memory,
0,
vertex_data_size,
MemoryMapFlags::empty(),
)?;
core::ptr::copy_nonoverlapping(
vertices.as_ptr() as *const u8,
data_ptr as *mut u8,
vertex_data_size as usize,
);
// Because we chose HOST_COHERENT, we do not need to call
// flush_mapped_memory_ranges. The write is automatically
// visible to the GPU.
device.unmap_memory(staging_memory);
};
// Record a command to copy from staging → gpu buffer.
// (Command buffer recording is covered in the Command Buffers chapter.)
// ... cmd_copy_buffer(staging_buffer, gpu_buffer, &[region]) ...
// After the copy completes on the GPU, clean up the staging buffer.
unsafe {
device.destroy_buffer(staging_buffer, None);
device.free_memory(staging_memory, None);
};
Why not skip the staging buffer?
On some GPUs (especially integrated GPUs and GPUs with Resizable BAR),
there is a memory type that is both DEVICE_LOCAL and HOST_VISIBLE.
In that case, you can map device-local memory directly and skip
the staging buffer. But this memory is often limited in size and not
available on all hardware. The staging buffer pattern works everywhere.
Formal reference
Memory property flags
| Flag | Meaning |
|---|---|
| DEVICE_LOCAL | Fastest for GPU access. Usually not host-visible on discrete GPUs. |
| HOST_VISIBLE | Can be mapped with map_memory for CPU read/write. |
| HOST_COHERENT | Mapped writes are automatically visible to the GPU (no flush needed). |
| HOST_CACHED | Mapped reads come from CPU cache (fast reads). Requires invalidate before reading GPU-written data. |
| LAZILY_ALLOCATED | Memory may not be allocated until used. For transient attachments only. |
| PROTECTED | For DRM-protected content. |
The memory type selection algorithm
use vulkan_rust::vk;
use vk::*;
fn find_memory_type(
mem_properties: &PhysicalDeviceMemoryProperties,
type_bits: u32, // from MemoryRequirements.memory_type_bits
desired: MemoryPropertyFlags,
) -> Option<u32> {
(0..mem_properties.memory_type_count).find(|&i| {
let compatible = type_bits & (1 << i) != 0;
let has_properties =
mem_properties.memory_types[i as usize].property_flags
& desired == desired;
compatible && has_properties
})
}
This function appears in nearly every Vulkan application. It finds the first memory type that is compatible with the resource and has the properties you need.
Flush and invalidate
If you use memory that is HOST_VISIBLE but not HOST_COHERENT:
- After writing from the CPU, call flush_mapped_memory_ranges to make your writes visible to the GPU.
- Before reading on the CPU (after the GPU has written), call invalidate_mapped_memory_ranges to refresh the CPU's view.
With HOST_COHERENT memory, neither call is needed. Most applications
use coherent memory for simplicity.
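One spec detail worth knowing before you use these calls: the offset and size in a MappedMemoryRange must be multiples of the device's nonCoherentAtomSize limit (or the size must be VK_WHOLE_SIZE). A self-contained sketch of the rounding, with the Vulkan calls themselves left out:

```rust
/// Expand a (offset, size) byte range so both ends land on
/// nonCoherentAtomSize boundaries, as flush/invalidate require.
fn atom_aligned_range(offset: u64, size: u64, atom: u64) -> (u64, u64) {
    let start = offset / atom * atom;                   // round start down
    let end = (offset + size + atom - 1) / atom * atom; // round end up
    (start, end - start)
}

fn main() {
    // Typical nonCoherentAtomSize values are 64 or 256 bytes;
    // query the real limit from PhysicalDeviceLimits at runtime.
    let (offset, size) = atom_aligned_range(100, 30, 64);
    assert_eq!((offset, size), (64, 128));
    // The flushed range [64, 192) fully covers the written [100, 130),
    // and both ends are multiples of the atom size.
}
```

You would then build a `MappedMemoryRange` from the aligned pair and pass it to `flush_mapped_memory_ranges` (or `invalidate_mapped_memory_ranges` before a CPU read).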
Key structs
| Struct | Purpose |
|---|---|
| PhysicalDeviceMemoryProperties | Describes all heaps and types on the hardware |
| MemoryType | One entry: property flags + which heap it draws from |
| MemoryHeap | One pool: total size in bytes + heap flags |
| MemoryRequirements | What a buffer/image needs: size, alignment, compatible types |
| MemoryAllocateInfo | Input to allocate_memory: how many bytes, which type |
| MappedMemoryRange | Range for flush/invalidate when not using coherent memory |
Destruction order
1. Ensure GPU is not using the buffer/image (fence or device_wait_idle)
2. Destroy the buffer/image (device.destroy_buffer / device.destroy_image)
3. Free the memory (device.free_memory)
Bindings cannot be undone in Vulkan; destroying a buffer or image is the only way to release its claim. Destroy all buffers and images bound to a DeviceMemory before freeing it; once the memory is freed, any resource still bound to it can no longer be used.
API reference links
- MemoryPropertyFlags
- PhysicalDeviceMemoryProperties
- MemoryRequirements
- Vulkan spec: Memory Allocation
Key takeaways
- Vulkan separates buffer/image creation from memory allocation. You create the resource, ask what memory it needs, allocate, then bind.
- Memory types have different properties (device-local, host-visible, coherent, cached). Choose based on your access pattern.
- The staging buffer pattern (host-visible temp → device-local permanent) is the standard way to upload data on discrete GPUs.
- Query memory properties at runtime. Never assume a specific memory layout; it varies between GPUs.
- In production, use a sub-allocator (like VMA). Allocating per-buffer is correct but slow.
Command Buffers
Motivation
In OpenGL, calling glDrawArrays immediately sends work to the GPU
(or at least, the driver pretends it does). In Vulkan, you record
commands into a buffer, then submit that buffer to a queue. The GPU
processes the queue asynchronously while your CPU moves on.
This separation exists for three reasons:
- Batching. One submission of many commands is cheaper than many individual calls. Each submission has overhead (kernel transitions, driver bookkeeping), so bundling hundreds of draw calls into a single command buffer and submitting once is dramatically faster.
- Reuse. You can record a command buffer once and submit it many times. If a scene doesn’t change, why re-record every frame?
- Multi-threading. Different CPU threads can record into different command buffers simultaneously, then submit them all on one thread. This is how modern engines scale across CPU cores.
Intuition
The shopping list analogy
A command buffer is a shopping list. You write down everything you need (“bind this pipeline”, “draw 36 vertices”, “copy this image”), then hand the list to someone else (a GPU queue) who goes and does it all. You don’t stand in the store waiting for each item, you hand off the list and do other work.
The lifecycle looks like this:
┌────────────┐ ┌────────────┐ ┌────────────┐
│ Record │────>│ Submit │────>│ Execute │
│ (CPU) │ │ (CPU→GPU) │ │ (GPU) │
│ │ │ │ │ │
│ "bind X" │ │ hand off │ │ GPU reads │
│ "draw Y" │ │ to queue │ │ the list │
│ "copy Z" │ │ │ │ and acts │
└────────────┘ └────────────┘ └────────────┘
The CPU is free to do other work (including recording the next frame’s command buffer) while the GPU executes.
Command pools: why they exist
Allocating command buffers one at a time would be like allocating individual bytes from the OS. It’s correct, but the overhead per allocation is huge. Command pools solve this by pre-allocating a chunk of memory, then handing out command buffers from that pool cheaply.
┌──────────── Command Pool ────────────┐
│ │
│ ┌──────────┐ ┌──────────┐ │
│ │ CmdBuf 0 │ │ CmdBuf 1 │ ... │
│ └──────────┘ └──────────┘ │
│ │
│ (all allocated from one pool) │
│ (pool is tied to one queue family) │
└──────────────────────────────────────┘
Each pool is tied to a single queue family. This lets the driver optimize the memory layout for that queue type.
Before reading on: if command pools are tied to a single queue family, and you want to record commands for both a graphics queue and a transfer queue, how many pools do you need?
Primary vs secondary command buffers
Primary command buffers are what you submit to queues. They can contain any command.
Secondary command buffers cannot be submitted directly. Instead,
they are executed from within a primary command buffer using
cmd_execute_commands. Think of them as subroutines: you record
reusable chunks of work (like “render the UI”) into secondary buffers,
then call them from your primary buffer.
Primary command buffer:
  begin render pass
    bind pipeline A
    draw meshes
    execute_commands(secondary_ui_buffer)   ← calls the secondary
  end render pass
Most applications start with primary buffers only and add secondary buffers when they need multi-threaded recording or reusable sub-passes.
Worked example: record and submit
This example creates a command pool, allocates a command buffer, records a simple buffer copy, and submits it.
Step 1: Create a command pool
use vulkan_rust::vk;
use vk::*;
// Create a pool for the graphics queue family.
// RESET_COMMAND_BUFFER lets us reset individual command buffers
// instead of resetting the entire pool.
let pool_info = CommandPoolCreateInfo::builder()
.flags(CommandPoolCreateFlags::RESET_COMMAND_BUFFER)
.queue_family_index(graphics_queue_family);
let command_pool = unsafe {
device.create_command_pool(&pool_info, None)?
};
Step 2: Allocate a command buffer
use vulkan_rust::vk;
use vk::*;
// Allocate one primary command buffer from the pool.
let alloc_info = CommandBufferAllocateInfo::builder()
.command_pool(command_pool)
.level(CommandBufferLevel::PRIMARY)
.command_buffer_count(1);
// allocate_command_buffers returns a Vec of handles.
let command_buffer = unsafe {
device.allocate_command_buffers(&alloc_info)?
}[0];
Step 3: Record commands
use vulkan_rust::vk;
use vk::*;
// Begin recording. ONE_TIME_SUBMIT tells the driver this buffer
// will be submitted once and then reset or freed, enabling
// driver-side optimizations.
let begin_info = CommandBufferBeginInfo::builder()
.flags(CommandBufferUsageFlags::ONE_TIME_SUBMIT);
unsafe {
device.begin_command_buffer(command_buffer, &begin_info)?;
};
// Record a buffer copy command.
// This does NOT execute the copy. It records the instruction
// into the command buffer for later execution.
let copy_region = BufferCopy {
src_offset: 0,
dst_offset: 0,
size: 1024,
};
unsafe {
device.cmd_copy_buffer(
command_buffer,
src_buffer,
dst_buffer,
&[copy_region],
);
};
// Finish recording.
unsafe { device.end_command_buffer(command_buffer)? };
Before reading on: between
begin_command_buffer and end_command_buffer, the command buffer is in the "recording" state. What do you think happens if you try to submit a command buffer that is still in the recording state?
Step 4: Submit to a queue
use vulkan_rust::vk;
use vk::*;
// Build a submit info. This describes:
// - which command buffers to execute
// - which semaphores to wait on before starting
// - which semaphores to signal when done
let submit_info = SubmitInfo::builder()
.command_buffers(&[command_buffer]);
// Submit to the graphics queue.
// The Fence (here Fence::null()) will be signaled when the GPU
// finishes all commands in this submission. Passing null means
// "I don't need to know when it's done from the CPU."
unsafe {
device.queue_submit(
graphics_queue,
&[*submit_info],
Fence::null(),
)?;
};
// For this example, we wait for the queue to finish before
// continuing. In a real application, you would use a fence
// instead of blocking the CPU.
unsafe { device.queue_wait_idle(graphics_queue)? };
Step 5: Clean up
use vulkan_rust::vk;
use vk::*;
// Option A: Free the command buffer back to the pool.
unsafe {
device.free_command_buffers(command_pool, &[command_buffer]);
};
// Option B: Reset for reuse (only if pool was created with
// RESET_COMMAND_BUFFER flag).
unsafe {
device.reset_command_buffer(
command_buffer,
CommandBufferResetFlags::empty(),
)?;
};
// When you're done with the pool entirely:
unsafe { device.destroy_command_pool(command_pool, None) };
// This implicitly frees all command buffers allocated from it.
Command buffer states
A command buffer is always in one of these states:
allocate
   │
   v
Initial ──begin──> Recording ──end──> Executable ──submit──> Pending
   ^                                       ^                    │
   │                                       │                    │
   └──────────────── reset ────────────────┴── (GPU finishes) ──┘
                (Pending returns to Executable when the GPU
                 finishes; reset returns a buffer to Initial)
| State | What you can do |
|---|---|
| Initial | Nothing useful. Call begin_command_buffer to start recording. |
| Recording | Record commands (cmd_* methods). Call end_command_buffer when done. |
| Executable | Submit to a queue. Or reset to record again. |
| Pending | The GPU is executing it. Do not touch it. Wait for completion. |
The most common mistake is trying to re-record or reset a command buffer while it is still pending (the GPU hasn’t finished yet). Validation layers will catch this.
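The state table above can be encoded as a small state machine. This is a self-contained mock (not part of vulkan_rust) whose rejected transitions correspond to what validation layers would flag:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum CbState { Initial, Recording, Executable, Pending }

/// Mock of the command buffer lifecycle: each operation is legal
/// only from specific states, mirroring the table above.
fn transition(state: CbState, op: &str) -> Result<CbState, String> {
    use CbState::*;
    match (state, op) {
        (Initial, "begin") => Ok(Recording),
        (Recording, "end") => Ok(Executable),
        (Executable, "submit") => Ok(Pending),
        (Pending, "gpu_finishes") => Ok(Executable),
        (Recording | Executable, "reset") => Ok(Initial),
        (s, o) => Err(format!("illegal op `{o}` in state {s:?}")),
    }
}

fn main() {
    let mut s = CbState::Initial;
    for op in ["begin", "end", "submit"] {
        s = transition(s, op).unwrap();
    }
    assert_eq!(s, CbState::Pending);
    // The classic mistake: resetting while the GPU still owns it.
    assert!(transition(s, "reset").is_err());
    s = transition(s, "gpu_finishes").unwrap();
    assert!(transition(s, "reset").is_ok());
}
```

The real driver does not check any of this for you; only the validation layers do, which is why you enable them during development.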
Common patterns
One-shot command buffer for transfers
Many operations (uploading textures, transitioning image layouts) need a command buffer just once. The pattern:
use vulkan_rust::vk;
use vk::*;
unsafe fn one_shot_submit(
device: &Device,
pool: CommandPool,
queue: Queue,
record: impl FnOnce(CommandBuffer),
) -> VkResult<()> {
// Allocate
let alloc_info = CommandBufferAllocateInfo::builder()
.command_pool(pool)
.level(CommandBufferLevel::PRIMARY)
.command_buffer_count(1);
let cmd = unsafe { device.allocate_command_buffers(&alloc_info)? }[0];
// Record
let begin = CommandBufferBeginInfo::builder()
.flags(CommandBufferUsageFlags::ONE_TIME_SUBMIT);
unsafe { device.begin_command_buffer(cmd, &begin)? };
record(cmd);
unsafe { device.end_command_buffer(cmd)? };
// Submit and wait
let submit = SubmitInfo::builder()
.command_buffers(&[cmd]);
unsafe {
device.queue_submit(queue, &[*submit], Fence::null())?;
device.queue_wait_idle(queue)?;
};
// Free
unsafe { device.free_command_buffers(pool, &[cmd]) };
Ok(())
}
This is the pattern used for staging buffer uploads in the Memory Management chapter.
Per-frame command buffers
For rendering, you typically have one command buffer per frame in flight:
Frame 0: [record on CPU] ──submit──> [execute on GPU]
Frame 1:                  [record on CPU] ──submit──> [execute on GPU]
                                  ↑                          ↑
                        recording while the GPU    executing the commands
                        runs the previous frame    we just submitted
Each frame waits for its fence before re-recording. See Synchronization for how fences and semaphores coordinate this.
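The bookkeeping behind "one command buffer per frame in flight" is a cursor that cycles modulo the frame count, with each slot owning its own command buffer and fence. A self-contained sketch, with placeholder ids standing in for real CommandBuffer and Fence handles:

```rust
const MAX_FRAMES_IN_FLIGHT: usize = 2;

/// Per-frame resources. Real code stores a CommandBuffer and the
/// Fence guarding it; plain ids stand in here.
struct FrameSlot { command_buffer_id: usize, fence_id: usize }

struct FrameCycle { slots: Vec<FrameSlot>, current: usize }

impl FrameCycle {
    fn new() -> Self {
        let slots = (0..MAX_FRAMES_IN_FLIGHT)
            .map(|i| FrameSlot { command_buffer_id: i, fence_id: i })
            .collect();
        FrameCycle { slots, current: 0 }
    }

    /// Called once per frame: returns the slot to record into, then
    /// advances the cursor. Real code first waits on this slot's
    /// fence, so the GPU is guaranteed to be done with its buffer.
    fn next(&mut self) -> &FrameSlot {
        let slot = self.current;
        self.current = (self.current + 1) % MAX_FRAMES_IN_FLIGHT;
        &self.slots[slot]
    }
}

fn main() {
    let mut frames = FrameCycle::new();
    let order: Vec<usize> =
        (0..5).map(|_| frames.next().command_buffer_id).collect();
    assert_eq!(order, vec![0, 1, 0, 1, 0]); // slots alternate
}
```

With two slots, the CPU can record frame N+1 while the GPU executes frame N, and the fence wait in slot N+2's turn throttles the CPU if it races ahead.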
Formal reference
Command pool creation flags
| Flag | Meaning |
|---|---|
| TRANSIENT | Hint: command buffers from this pool are short-lived. Lets the driver optimize allocation. |
| RESET_COMMAND_BUFFER | Allows individual command buffers to be reset. Without this, you can only reset the entire pool. |
| PROTECTED | Command buffers allocated from this pool can operate on protected resources. |
Command buffer begin flags
| Flag | Meaning |
|---|---|
| ONE_TIME_SUBMIT | This buffer will be submitted once, then reset or freed. Enables driver optimizations. |
| RENDER_PASS_CONTINUE | Secondary command buffer: this will be entirely inside a render pass. |
| SIMULTANEOUS_USE | This buffer can be submitted to multiple queues or resubmitted while still pending. |
Recording methods on Device
All recording methods follow the pattern device.cmd_*(command_buffer, ...).
The device dispatches to the correct function pointer; the command_buffer
identifies which buffer to record into. Examples:
| Method | Purpose |
|---|---|
| cmd_bind_pipeline(cb, bind_point, pipeline) | Set the active pipeline |
| cmd_draw(cb, vertices, instances, first_vert, first_inst) | Draw without an index buffer |
| cmd_copy_buffer(cb, src, dst, &[regions]) | Copy between buffers |
| cmd_begin_render_pass(cb, &begin_info, contents) | Start a render pass |
| cmd_end_render_pass(cb) | End the current render pass |
The full list has ~150 cmd_* methods covering every Vulkan command.
Destruction rules
- Wait for the GPU before freeing. A command buffer in the Pending state must not be freed or reset. Use a fence or device_wait_idle.
- Destroying a pool frees all its buffers. You do not need to free command buffers individually before destroying their pool.
- Pools are not thread-safe. If two threads record command buffers from the same pool, you must synchronize externally. The typical solution: one pool per thread.
SubmitInfo structure
SubmitInfo connects command buffers to synchronization primitives:
SubmitInfo {
wait_semaphores + wait_dst_stage_mask ← "wait for these before starting"
command_buffers ← "execute these"
signal_semaphores ← "signal these when done"
}
The wait_dst_stage_mask specifies which pipeline stages must wait,
not the entire submission. This enables the GPU to start early stages
while still waiting for a semaphore on a later stage.
API reference links
Key takeaways
- Commands are recorded, not executed. Recording is cheap CPU work; execution happens asynchronously on the GPU.
- Command pools amortize allocation cost. One pool per queue family, typically one pool per thread.
- Command buffers have states: Initial → Recording → Executable → Pending. Never touch a Pending buffer.
- Use ONE_TIME_SUBMIT for throw-away work (uploads, transitions). Use per-frame buffers with fences for rendering.
- The SubmitInfo struct is where command buffers meet synchronization. That connection is the topic of the next chapter.
Synchronization
Threshold concept. Synchronization is the single most confusing aspect of Vulkan for newcomers. Once you understand it, you understand Vulkan’s execution model. If this chapter takes you three reads, that is completely normal.
Motivation
The GPU does not execute your commands in the order you recorded them. Not between queues, not between submissions, and not even between draw calls within the same command buffer. The GPU pipelines work: while one draw call is running its fragment shader, the next draw call might already be running its vertex shader.
Vulkan gives you zero implicit ordering guarantees.
This sounds terrifying, and it is also why Vulkan is fast. The GPU can overlap operations, reorder for efficiency, and keep all its hardware units busy. But it means you must tell the driver when ordering matters, because only you know which operations depend on each other.
Intuition: the factory
Imagine a factory with multiple assembly lines (queue families). Each line has workers at different stations (pipeline stages) who process items one after another.
Without synchronization, the factory runs at full speed: items flow through stations as fast as possible, and different lines operate independently. This is great, until you have a dependency: “station B needs the output from station A before it can start.”
Vulkan gives you four tools to express these dependencies:
┌─────────┬────────────────────────────┬──────────────────────────┐
│ Tool │ What it synchronizes │ Analogy │
├─────────┼────────────────────────────┼──────────────────────────┤
│ Fence │ GPU → CPU │ A sign on the factory │
│ │ "is the GPU done yet?" │ door: "batch complete" │
├─────────┼────────────────────────────┼──────────────────────────┤
│ Sema- │ Queue → Queue │ A conveyor belt between │
│ phore │ "queue B waits for queue A"│ two assembly lines │
├─────────┼────────────────────────────┼──────────────────────────┤
│ Barrier │ Command → Command │ A supervisor on one │
│ │ (within a command buffer) │ line: "wait for station │
│ │ │ A before station B" │
├─────────┼────────────────────────────┼──────────────────────────┤
│ Event │ Split barrier │ A sticky note: "I'll │
│ │ (signal now, wait later) │ leave this here, check │
│ │ │ for it later" │
└─────────┴────────────────────────────┴──────────────────────────┘
Each tool solves a different problem. Using the wrong tool for the job is a common source of bugs.
Before reading on: you submit two command buffers to the same queue, one after the other. Does the second one wait for the first to finish before it starts executing?
Answer: No. Submissions to the same queue begin in order, but their execution can overlap. The second submission might start its vertex shader while the first is still running its fragment shader. If you need the first to fully complete before the second starts, you need explicit synchronization.
Worked example 1: CPU waits for GPU (Fence)
Problem: You submitted a command buffer. You need to know when the GPU is done so you can read back the results, or so you can safely re-record the command buffer for the next frame.
Solution: A fence. You pass it to queue_submit, and the GPU
signals it when all commands in that submission finish.
use vulkan_rust::vk;
use vk::*;
// ── Create a fence ──────────────────────────────────────────────
//
// SIGNALED means the fence starts in the signaled state.
// This matters for the first frame: wait_for_fences on an
// unsignaled fence with no prior submission would block forever.
let fence_info = FenceCreateInfo::builder()
.flags(FenceCreateFlags::SIGNALED);
let fence = unsafe { device.create_fence(&fence_info, None)? };
use vulkan_rust::vk;
use vk::*;
// ── The render loop ─────────────────────────────────────────────
// Step 1: Wait for the previous frame's GPU work to finish.
// timeout = u64::MAX means "wait forever"
// wait_all = true means "wait for ALL fences in the slice"
unsafe {
device.wait_for_fences(&[fence], true, u64::MAX)?;
};
// Step 2: Reset the fence so it can be signaled again.
// A fence can only be signaled once. You must reset it
// before reusing.
unsafe { device.reset_fences(&[fence])? };
// Step 3: Record and submit a command buffer.
// ...record commands...
let submit = SubmitInfo::builder()
.command_buffers(&[command_buffer]);
// Pass the fence to queue_submit. The GPU will signal it
// when this submission completes.
unsafe {
device.queue_submit(queue, &[*submit], fence)?;
};
// The CPU continues immediately. The GPU works in parallel.
// Next iteration, wait_for_fences will block until the GPU
// signals this fence.
The fence lifecycle:
create (SIGNALED)
        │
        v
  ┌──> wait ──> reset ──> submit (with fence) ──> GPU signals ──┐
  │                                                             │
  └─────────────────────────────────────────────────────────────┘
When to use fences
- Waiting for a frame to finish before re-recording its command buffer
- Waiting for a transfer to complete before reading the result on the CPU
- Throttling the CPU so it doesn’t race too far ahead of the GPU
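The lifecycle diagram above can be captured as a tiny state machine in plain Rust. This is a CPU-only model for intuition (the `ModelFence` type here is a stand-in, not the Vulkan handle; real fences are signaled by the GPU):

```rust
// A CPU-only model of the fence lifecycle. The GPU normally signals
// a fence; here we signal by hand to illustrate the state rules.
#[derive(Debug, PartialEq)]
enum FenceState {
    Signaled,
    Unsignaled,
}

struct ModelFence {
    state: FenceState,
}

impl ModelFence {
    // Mirrors FenceCreateFlags::SIGNALED: start signaled so the
    // very first wait does not block forever.
    fn new_signaled() -> Self {
        ModelFence { state: FenceState::Signaled }
    }

    // Models wait_for_fences: succeeds only once the fence is signaled.
    fn wait(&self) -> Result<(), &'static str> {
        match self.state {
            FenceState::Signaled => Ok(()),
            FenceState::Unsignaled => Err("would block: fence not signaled"),
        }
    }

    // Models reset_fences: a fence must be reset before reuse.
    fn reset(&mut self) {
        self.state = FenceState::Unsignaled;
    }

    // Models the GPU signaling the fence when a submission completes.
    fn gpu_signal(&mut self) {
        self.state = FenceState::Signaled;
    }
}

fn main() {
    let mut fence = ModelFence::new_signaled();
    assert!(fence.wait().is_ok()); // first frame: created SIGNALED
    fence.reset();                 // must reset before reuse
    assert!(fence.wait().is_err()); // waiting now would block
    fence.gpu_signal();            // GPU finishes the submission
    assert!(fence.wait().is_ok()); // next frame's wait succeeds
    println!("fence lifecycle ok");
}
```

Note how forgetting `reset()` would leave the fence permanently signaled, which is exactly the "reusing a fence without resetting it" mistake described later in this chapter.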
Worked example 2: Queue-to-queue sync (Semaphore)
Problem: The swapchain gives you an image to render into, but the image might not be ready yet (the display might still be reading it). After rendering, you need to present the image, but only after the render commands finish.
Solution: Two semaphores. One says “the image is ready to render into.” The other says “rendering is done, safe to present.”
use vulkan_rust::vk;
use vk::*;
// Create two semaphores (no flags needed).
let sem_info = SemaphoreCreateInfo::builder();
let image_available = unsafe { device.create_semaphore(&sem_info, None)? };
let render_finished = unsafe { device.create_semaphore(&sem_info, None)? };
use vulkan_rust::vk;
use vk::*;
// ── Acquire a swapchain image ───────────────────────────────────
//
// This signals image_available when the image is ready.
let image_index = unsafe {
device.acquire_next_image_khr(
swapchain,
u64::MAX, // timeout
image_available, // semaphore to signal
Fence::null(), // no fence needed here
)?
};
// ── Submit rendering commands ───────────────────────────────────
//
// Wait on image_available (at the COLOR_ATTACHMENT_OUTPUT stage,
// because that's when we actually write to the image).
// Signal render_finished when done.
let wait_stages = [PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];
let submit = SubmitInfo::builder()
.wait_semaphores(&[image_available])
.wait_dst_stage_mask(&wait_stages)
.command_buffers(&[command_buffer])
.signal_semaphores(&[render_finished]);
unsafe {
device.queue_submit(queue, &[*submit], frame_fence)?;
};
// ── Present the image ───────────────────────────────────────────
//
// Wait on render_finished before the display reads the image.
let present_info = PresentInfoKHR::builder()
.wait_semaphores(&[render_finished])
.swapchains(&[swapchain])
.image_indices(&[image_index]);
unsafe { device.queue_present_khr(queue, &present_info)? };
The semaphore flow:
acquire_next_image ──signals──> image_available
│
│ (GPU waits at COLOR_ATTACHMENT_OUTPUT)
v
queue_submit ──signals──> render_finished
│
│ (presentation waits)
v
queue_present
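The same chain can be mimicked on the CPU with channels standing in for semaphores: sending is "signal," receiving is "wait." This is purely an analogy sketch with no Vulkan objects involved:

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Each channel plays the role of a binary semaphore.
    let (image_available_tx, image_available_rx) = mpsc::channel::<u32>();
    let (render_finished_tx, render_finished_rx) = mpsc::channel::<u32>();

    // "acquire_next_image": signals image_available with an image index.
    image_available_tx.send(0).unwrap();

    // "queue_submit": waits on image_available, then signals
    // render_finished when rendering is done.
    let render = thread::spawn(move || {
        let image_index = image_available_rx.recv().unwrap(); // wait
        // ... rendering would happen here ...
        render_finished_tx.send(image_index).unwrap(); // signal
    });

    // "queue_present": waits on render_finished before displaying.
    let presented = render_finished_rx.recv().unwrap();
    render.join().unwrap();
    assert_eq!(presented, 0);
    println!("presented image {presented}");
}
```

The analogy is imperfect (Vulkan binary semaphores carry no payload and cannot be queried), but it captures the key point: each stage blocks only at the handoff, and everything else runs concurrently.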
Semaphores vs fences
| | Fence | Semaphore |
|---|---|---|
| Who waits? | The CPU (via wait_for_fences) | The GPU (another queue operation) |
| Who signals? | The GPU (via queue_submit) | The GPU (via queue_submit) |
| Use case | CPU needs to know when GPU is done | One GPU operation depends on another |
| Can you query it? | Yes (get_fence_status) | No (GPU-only) |
Before reading on: why does the submit wait at COLOR_ATTACHMENT_OUTPUT specifically, instead of waiting at TOP_OF_PIPE (the very beginning)? What work can the GPU do before it needs the swapchain image?

Answer: The vertex shader, tessellation, and geometry stages do not write to the swapchain image. They can run while the image is still being read by the display. Only the color attachment output stage needs the image, so we delay waiting until that point. This lets the GPU overlap more work.
Worked example 3: Image layout transition (Pipeline Barrier)
Problem: You want to copy data into an image, then sample it in
a fragment shader. The image must be in TRANSFER_DST_OPTIMAL layout
for the copy, then transitioned to SHADER_READ_ONLY_OPTIMAL for
sampling. The GPU must finish the copy before the transition, and
finish the transition before the shader reads.
Solution: A pipeline barrier with an image memory barrier.
use vulkan_rust::vk;
use vk::*;
// Transition image from TRANSFER_DST to SHADER_READ_ONLY.
//
// This barrier says:
// "All TRANSFER_WRITE operations in the TRANSFER stage must
// complete before any SHADER_READ operations in the
// FRAGMENT_SHADER stage can begin. Also, change the image
// layout."
let barrier = ImageMemoryBarrier::builder()
.src_access_mask(AccessFlags::TRANSFER_WRITE)
.dst_access_mask(AccessFlags::SHADER_READ)
.old_layout(ImageLayout::TRANSFER_DST_OPTIMAL)
.new_layout(ImageLayout::SHADER_READ_ONLY_OPTIMAL)
.src_queue_family_index(QUEUE_FAMILY_IGNORED)
.dst_queue_family_index(QUEUE_FAMILY_IGNORED)
.image(texture_image)
.subresource_range(ImageSubresourceRange {
aspect_mask: ImageAspectFlags::COLOR,
base_mip_level: 0,
level_count: 1,
base_array_layer: 0,
layer_count: 1,
});
unsafe {
device.cmd_pipeline_barrier(
command_buffer,
PipelineStageFlags::TRANSFER, // src stage
PipelineStageFlags::FRAGMENT_SHADER, // dst stage
DependencyFlags::empty(),
&[], // no memory barriers
&[], // no buffer memory barriers
&[*barrier], // one image memory barrier
);
};
Understanding the three parts of a barrier
A pipeline barrier has three components that work together:
1. Stage mask: WHEN must things happen?
"Transfer stage must finish before fragment shader starts"
2. Access mask: WHAT memory operations are involved?
"Writes from transfers must be visible to shader reads"
3. Layout: HOW should the image be reorganized?
"Convert from transfer-optimal to shader-read-optimal tiling"
All three are needed. The stage mask creates an execution dependency (ordering of operations). The access mask creates a memory dependency (visibility of writes). The layout transition physically reorganizes how the image data is stored in memory.
A common mistake is setting the stage masks correctly but forgetting the access masks, or vice versa. Both are required for correctness.
Common barrier recipes
| From | To | src stage | dst stage | src access | dst access |
|---|---|---|---|---|---|
| Transfer → Shader read | TRANSFER_DST → SHADER_READ_ONLY | TRANSFER | FRAGMENT_SHADER | TRANSFER_WRITE | SHADER_READ |
| Undefined → Transfer dst | UNDEFINED → TRANSFER_DST | TOP_OF_PIPE | TRANSFER | NONE | TRANSFER_WRITE |
| Undefined → Color attachment | UNDEFINED → COLOR_ATTACHMENT | TOP_OF_PIPE | COLOR_ATTACHMENT_OUTPUT | NONE | COLOR_ATTACHMENT_WRITE |
| Color attachment → Present | COLOR_ATTACHMENT → PRESENT_SRC | COLOR_ATTACHMENT_OUTPUT | BOTTOM_OF_PIPE | COLOR_ATTACHMENT_WRITE | NONE |
| Color attachment → Shader read | COLOR_ATTACHMENT → SHADER_READ_ONLY | COLOR_ATTACHMENT_OUTPUT | FRAGMENT_SHADER | COLOR_ATTACHMENT_WRITE | SHADER_READ |
Keep this table handy. Most applications only need these transitions.
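To make the table mechanical, here it is encoded as a plain-Rust lookup. Stage and access names are plain strings in this sketch; it is not part of the vulkan_rust API, just the table as data:

```rust
// The common-recipes table: (old layout, new layout) ->
// (src stage, dst stage, src access, dst access).
fn barrier_recipe(
    old: &str,
    new: &str,
) -> Option<(&'static str, &'static str, &'static str, &'static str)> {
    match (old, new) {
        ("TRANSFER_DST", "SHADER_READ_ONLY") =>
            Some(("TRANSFER", "FRAGMENT_SHADER", "TRANSFER_WRITE", "SHADER_READ")),
        ("UNDEFINED", "TRANSFER_DST") =>
            Some(("TOP_OF_PIPE", "TRANSFER", "NONE", "TRANSFER_WRITE")),
        ("UNDEFINED", "COLOR_ATTACHMENT") =>
            Some(("TOP_OF_PIPE", "COLOR_ATTACHMENT_OUTPUT", "NONE", "COLOR_ATTACHMENT_WRITE")),
        ("COLOR_ATTACHMENT", "PRESENT_SRC") =>
            Some(("COLOR_ATTACHMENT_OUTPUT", "BOTTOM_OF_PIPE", "COLOR_ATTACHMENT_WRITE", "NONE")),
        ("COLOR_ATTACHMENT", "SHADER_READ_ONLY") =>
            Some(("COLOR_ATTACHMENT_OUTPUT", "FRAGMENT_SHADER", "COLOR_ATTACHMENT_WRITE", "SHADER_READ")),
        _ => None, // anything else: consult the spec, not this table
    }
}

fn main() {
    let (src_stage, dst_stage, src_access, dst_access) =
        barrier_recipe("TRANSFER_DST", "SHADER_READ_ONLY").unwrap();
    println!("{src_stage} -> {dst_stage}: {src_access} -> {dst_access}");
}
```

Notice the pattern: the src side names the stage and access that *produced* the data, and the dst side names the stage and access that will *consume* it. When the old layout is UNDEFINED, nothing produced the data, so the src access is NONE.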
Pipeline stages: the execution order
To understand stage masks, you need to know the order the GPU processes work. Here is the graphics pipeline, simplified:
TOP_OF_PIPE (pseudo-stage: "before anything")
│
v
DRAW_INDIRECT (read indirect draw parameters)
│
v
VERTEX_INPUT (read vertex/index buffers)
│
v
VERTEX_SHADER (run vertex shader)
│
v
EARLY_FRAGMENT_TESTS (depth/stencil test before fragment shader)
│
v
FRAGMENT_SHADER (run fragment shader)
│
v
LATE_FRAGMENT_TESTS (depth/stencil test after fragment shader)
│
v
COLOR_ATTACHMENT_OUTPUT (write to color attachments)
│
v
BOTTOM_OF_PIPE (pseudo-stage: "after everything")
Special stages (not in the pipeline order):
TRANSFER (copy/blit/clear operations)
COMPUTE_SHADER (compute dispatch)
HOST (CPU reads/writes to mapped memory)
ALL_GRAPHICS (shorthand for all graphics stages)
ALL_COMMANDS (shorthand for everything)
When you set src_stage = TRANSFER and dst_stage = FRAGMENT_SHADER,
you are saying: “everything in the TRANSFER stage that came before
this barrier must finish before anything in the FRAGMENT_SHADER stage
that comes after this barrier can start.”
Events: split barriers
Events are an advanced optimization. A pipeline barrier creates a dependency at a single point in the command buffer. An event lets you split the barrier: signal it at one point, wait for it at a later point. This gives the GPU more room to reorder work between the signal and the wait.
use vulkan_rust::vk;
use vk::*;
// Signal the event after the transfer completes.
unsafe {
device.cmd_set_event(
command_buffer,
event,
PipelineStageFlags::TRANSFER,
);
};
// ... other commands that don't depend on the transfer ...
// Wait for the event before the fragment shader reads.
unsafe {
device.cmd_wait_events(
command_buffer,
&[event],
PipelineStageFlags::TRANSFER,
PipelineStageFlags::FRAGMENT_SHADER,
&[], &[], &[*image_barrier],
);
};
Most applications do not need events. Use pipeline barriers until profiling shows you need the extra overlap.
Formal reference
Synchronization primitives
| Primitive | Scope | Signal | Wait | Create | Destroy |
|---|---|---|---|---|---|
| Fence | GPU → CPU | queue_submit | wait_for_fences | create_fence | destroy_fence |
| Semaphore | Queue → Queue | queue_submit (signal) | queue_submit (wait) | create_semaphore | destroy_semaphore |
| Pipeline Barrier | Within command buffer | N/A | N/A | N/A (inline command) | N/A |
| Event | Split barrier | cmd_set_event | cmd_wait_events | create_event | destroy_event |
FenceCreateFlags
| Flag | Meaning |
|---|---|
| SIGNALED | Fence starts in the signaled state. Use this for the first frame so wait_for_fences doesn’t block forever. |
PipelineStageFlags (most used)
| Flag | When it runs |
|---|---|
| TOP_OF_PIPE | Before any work. Used as src when there’s nothing to wait for. |
| VERTEX_INPUT | Reading vertex/index buffers. |
| VERTEX_SHADER | Running the vertex shader. |
| FRAGMENT_SHADER | Running the fragment shader. |
| COLOR_ATTACHMENT_OUTPUT | Writing to color attachments. |
| TRANSFER | Copy, blit, and clear operations. |
| COMPUTE_SHADER | Running compute shaders. |
| BOTTOM_OF_PIPE | After all work. Used as dst when nothing needs to wait. |
| ALL_COMMANDS | Shorthand for every stage. Correct but may be slower than a precise mask. |
AccessFlags (most used)
| Flag | What it protects |
|---|---|
| VERTEX_ATTRIBUTE_READ | Vertex shader reads from vertex buffers. |
| UNIFORM_READ | Shader reads from uniform buffers. |
| SHADER_READ | Shader reads (sampled images, storage buffers). |
| SHADER_WRITE | Shader writes (storage images, storage buffers). |
| COLOR_ATTACHMENT_READ | Reading color attachments (e.g., blending). |
| COLOR_ATTACHMENT_WRITE | Writing to color attachments. |
| TRANSFER_READ | Source of a copy/blit. |
| TRANSFER_WRITE | Destination of a copy/blit. |
| HOST_READ | CPU reads from mapped memory. |
| HOST_WRITE | CPU writes to mapped memory. |
ImageLayout values
| Layout | Optimized for |
|---|---|
| UNDEFINED | Nothing. Contents are discarded. Use as old_layout when you don’t care about existing data. |
| GENERAL | Anything, but not optimal for anything. Last resort. |
| COLOR_ATTACHMENT_OPTIMAL | Writing as a color attachment (rendering). |
| DEPTH_STENCIL_ATTACHMENT_OPTIMAL | Writing as a depth/stencil attachment. |
| SHADER_READ_ONLY_OPTIMAL | Sampling in a shader. |
| TRANSFER_SRC_OPTIMAL | Source of a copy/blit. |
| TRANSFER_DST_OPTIMAL | Destination of a copy/blit. |
| PRESENT_SRC | Presentation to the display (swapchain). |
The happens-before relationship
Vulkan defines ordering through execution dependencies and memory dependencies:
- An execution dependency guarantees that operations in the first synchronization scope (src) complete before operations in the second synchronization scope (dst) begin.
- A memory dependency guarantees that writes in the first access scope are visible to reads in the second access scope.
Both are needed. Without the execution dependency, operations might overlap. Without the memory dependency, caches might serve stale data even after the operation completes.
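This two-part pattern has a close cousin in Rust's own memory model: a Release store creates the ordering and publishes prior writes, and the matching Acquire load makes them visible. The sketch below is an analogy only (GPU barriers are not atomics), but it shows the same execution-plus-visibility pairing:

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        // The "write" (think TRANSFER_WRITE): relaxed on its own.
        d.store(42, Ordering::Relaxed);
        // Release publishes it: orders the write before the flag
        // and makes it visible to whoever Acquires the flag.
        r.store(true, Ordering::Release);
    });

    // Acquire pairs with the Release above: once we observe
    // ready == true, the data write is guaranteed visible
    // (the "memory dependency" half).
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    assert_eq!(data.load(Ordering::Relaxed), 42);
    producer.join().unwrap();
    println!("happens-before established");
}
```

Dropping the Release/Acquire pair here is the atomics equivalent of setting stage masks without access masks: the operations might still happen in order by luck, but nothing guarantees the write is visible.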
API reference links
Common mistakes
- Forgetting access masks. Stage masks alone create execution dependencies, but GPU caches can still serve stale data. You need access masks for memory visibility.
- Using ALL_COMMANDS / ALL_GRAPHICS everywhere. Correct, but overly broad. The GPU can’t overlap anything across a full-pipeline barrier. Use precise stages for better performance.
- Reusing a fence without resetting it. A signaled fence stays signaled forever. wait_for_fences returns immediately on an already-signaled fence. Always reset_fences before resubmitting.
- Submitting while a command buffer is still pending. If the GPU hasn’t finished with a command buffer, you cannot re-record it. Wait for its fence first.
- Missing the initial fence SIGNALED flag. On the first frame, there is no prior submission to signal the fence. Creating with SIGNALED avoids an infinite wait.
Key takeaways
- The GPU does not execute commands in order. You must add explicit synchronization where ordering matters.
- Fences let the CPU wait for the GPU. Semaphores let one GPU operation wait for another. Barriers order commands within a command buffer.
- Barriers have three parts: when (stage masks), what (access masks), and how (layout transitions). All three are needed.
- Start with the common barrier recipes table. Most applications only need a handful of transitions.
- When in doubt, use broader stages (ALL_COMMANDS) to get correct behavior first, then narrow down for performance later.
Render Passes & Framebuffers
Motivation
A render pass tells Vulkan the structure of your rendering: what attachments you use (color, depth), how they are loaded and stored, and how subpasses depend on each other. This information lets the driver make hardware-specific optimizations, especially on tile-based GPUs (mobile) where the render pass boundaries determine what fits in on-chip memory.
If you skip this concept and just try to render, the validation layers will immediately tell you: “you need a render pass.” Understanding why it exists will save you from cargo-culting boilerplate you don’t understand.
Intuition
Blueprint and canvas
A render pass is a blueprint for a painting session. It describes:
- What surfaces you’ll paint on (attachments: color, depth, stencil)
- How each surface is prepared before painting (load ops)
- What happens to each surface after painting (store ops)
- If there are multiple phases (subpasses) and how they depend on each other
A framebuffer is the specific canvas, the actual images that match the blueprint’s description.
Render Pass (blueprint) Framebuffer (canvas)
┌───────────────────────┐ ┌────────────────────────┐
│ Attachment 0: │ │ Attachment 0: │
│ format: B8G8R8A8 │───matches──│ swapchain_image_view │
│ load: CLEAR │ │ │
│ store: STORE │ │ Attachment 1: │
│ layout: → PRESENT │───matches──│ depth_image_view │
│ │ │ │
│ Attachment 1: │ │ width: 1920 │
│ format: D32_SFLOAT │ │ height: 1080 │
│ load: CLEAR │ │ layers: 1 │
│ store: DONT_CARE │ └────────────────────────┘
│ layout: → DEPTH_OPT │
│ │
│ Subpass 0: │
│ color: [0] │
│ depth: [1] │
└───────────────────────┘
You create the render pass once. You create a framebuffer for each set of images you render to (typically one per swapchain image).
Load and store ops: why they matter
When a render pass begins, the driver needs to know what to do with each attachment’s existing contents:
| Load Op | Meaning | When to use |
|---|---|---|
| CLEAR | Fill with a clear value | Start of frame, you want a clean slate |
| LOAD | Preserve the existing contents | Continuing previous rendering |
| DONT_CARE | Contents are undefined | You will overwrite every pixel anyway |
When the render pass ends:
| Store Op | Meaning | When to use |
|---|---|---|
| STORE | Write results to memory | You need the results (color for present, etc.) |
| DONT_CARE | Results may be discarded | Transient data (depth buffer you won’t read later) |
Before reading on: on a tile-based mobile GPU, rendering happens in small tiles stored in fast on-chip memory. The load op controls whether tile data is loaded from main memory, and the store op controls whether it is written back. Why would DONT_CARE be significantly faster than LOAD on such hardware?

Answer: DONT_CARE lets the driver skip the expensive memory transfer entirely. On a mobile GPU, loading a full-screen depth buffer from main memory into tile memory can take milliseconds. If you are clearing it anyway, CLEAR tells the driver to fill tiles on-chip without touching main memory. DONT_CARE is even cheaper: it does nothing at all.
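A back-of-envelope estimate shows the scale of what is saved. The resolution is real; the 25 GB/s of usable bandwidth is a hypothetical mobile figure chosen for illustration:

```rust
fn main() {
    // A full-screen D32_SFLOAT depth buffer at 1080p.
    let (width, height) = (1920u64, 1080u64);
    let bytes_per_pixel = 4; // 32-bit depth
    let buffer_bytes = width * height * bytes_per_pixel;
    assert_eq!(buffer_bytes, 8_294_400); // about 8.3 MB

    // LOAD at pass start + STORE at pass end moves the whole buffer
    // across the memory bus twice per pass. At a hypothetical
    // 25 GB/s of usable bandwidth:
    let bandwidth_bytes_per_ms = 25_000_000_000u64 / 1000;
    let ms_x100 = buffer_bytes * 2 * 100 / bandwidth_bytes_per_ms;
    println!(
        "LOAD+STORE cost: ~{}.{:02} ms of bandwidth per pass",
        ms_x100 / 100,
        ms_x100 % 100
    );
    // CLEAR fills tiles on-chip (no main-memory read);
    // DONT_CARE on both ends moves zero bytes.
}
```

A fraction of a millisecond per pass sounds small, but multiply it by several passes per frame at 60 fps and it becomes a meaningful slice of the frame budget, and of the battery.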
Worked example: a single-subpass render pass
This is the most common setup: one color attachment (the swapchain image) and one depth attachment.
Step 1: Describe the attachments
use vulkan_rust::vk;
use vk::*;
// Color attachment: the swapchain image we render into.
let color_attachment = AttachmentDescription {
flags: AttachmentDescriptionFlags::empty(),
format: swapchain_format, // e.g. B8G8R8A8_SRGB
samples: SampleCountFlagBits::_1,
load_op: AttachmentLoadOp::CLEAR, // clear at start
store_op: AttachmentStoreOp::STORE, // keep the result
stencil_load_op: AttachmentLoadOp::DONT_CARE,
stencil_store_op: AttachmentStoreOp::DONT_CARE,
initial_layout: ImageLayout::UNDEFINED, // we don't care about previous contents
final_layout: ImageLayout::PRESENT_SRC, // ready for presentation after the pass
};
// Depth attachment: used for depth testing, discarded after.
let depth_attachment = AttachmentDescription {
flags: AttachmentDescriptionFlags::empty(),
format: Format::D32_SFLOAT,
samples: SampleCountFlagBits::_1,
load_op: AttachmentLoadOp::CLEAR,
store_op: AttachmentStoreOp::DONT_CARE, // we won't read it later
stencil_load_op: AttachmentLoadOp::DONT_CARE,
stencil_store_op: AttachmentStoreOp::DONT_CARE,
initial_layout: ImageLayout::UNDEFINED,
final_layout: ImageLayout::DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
};
Step 2: Define the subpass
use vulkan_rust::vk;
use vk::*;
// Subpass 0 uses attachment 0 as color output and attachment 1 as depth.
let color_ref = AttachmentReference {
attachment: 0, // index into the attachments array
layout: ImageLayout::COLOR_ATTACHMENT_OPTIMAL,
};
let depth_ref = AttachmentReference {
attachment: 1,
layout: ImageLayout::DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
};
let subpass = SubpassDescription {
flags: SubpassDescriptionFlags::empty(),
pipeline_bind_point: PipelineBindPoint::GRAPHICS,
input_attachment_count: 0,
p_input_attachments: core::ptr::null(),
color_attachment_count: 1,
p_color_attachments: &color_ref,
p_resolve_attachments: core::ptr::null(),
p_depth_stencil_attachment: &depth_ref,
preserve_attachment_count: 0,
p_preserve_attachments: core::ptr::null(),
};
Step 3: Add a subpass dependency
use vulkan_rust::vk;
use vk::*;
// This dependency ensures that the image layout transition
// (from the previous frame's PRESENT_SRC to our UNDEFINED→COLOR_ATTACHMENT)
// happens before we start writing color output.
let dependency = SubpassDependency {
src_subpass: SUBPASS_EXTERNAL, // operations before the render pass
dst_subpass: 0, // our subpass
src_stage_mask: PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT
| PipelineStageFlags::EARLY_FRAGMENT_TESTS,
dst_stage_mask: PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT
| PipelineStageFlags::EARLY_FRAGMENT_TESTS,
src_access_mask: AccessFlags::NONE,
dst_access_mask: AccessFlags::COLOR_ATTACHMENT_WRITE
| AccessFlags::DEPTH_STENCIL_ATTACHMENT_WRITE,
dependency_flags: DependencyFlags::empty(),
};
Step 4: Create the render pass
use vulkan_rust::vk;
use vk::*;
let attachments = [color_attachment, depth_attachment];
let render_pass_info = RenderPassCreateInfo::builder()
.attachments(&attachments)
.subpasses(&[subpass])
.dependencies(&[dependency]);
let render_pass = unsafe {
device.create_render_pass(&render_pass_info, None)?
};
Step 5: Create framebuffers (one per swapchain image)
use vulkan_rust::vk;
use vk::*;
let framebuffers: Vec<Framebuffer> = swapchain_image_views
.iter()
.map(|&view| {
// Each framebuffer uses a different swapchain image view
// but the same depth image view (shared across frames).
let attachments = [view, depth_image_view];
let fb_info = FramebufferCreateInfo::builder()
.render_pass(render_pass) // must be compatible
.attachments(&attachments)
.width(swapchain_extent.width)
.height(swapchain_extent.height)
.layers(1);
unsafe { device.create_framebuffer(&fb_info, None).unwrap() }
})
.collect();
Step 6: Use in command recording
use vulkan_rust::vk;
use vk::*;
let clear_values = [
ClearValue {
color: ClearColorValue {
float32: [0.0, 0.0, 0.0, 1.0], // black
},
},
ClearValue {
depth_stencil: ClearDepthStencilValue {
depth: 1.0,
stencil: 0,
},
},
];
let begin_info = RenderPassBeginInfo::builder()
.render_pass(render_pass)
.framebuffer(framebuffers[image_index as usize])
.render_area(Rect2D {
offset: Offset2D { x: 0, y: 0 },
extent: swapchain_extent,
})
.clear_values(&clear_values);
unsafe {
// INLINE means we record drawing commands directly in this
// primary command buffer (not via secondary command buffers).
device.cmd_begin_render_pass(
command_buffer,
&begin_info,
SubpassContents::INLINE,
);
    // ... bind pipeline, bind descriptor sets, draw ...
device.cmd_end_render_pass(command_buffer);
};
Dynamic rendering (Vulkan 1.3)
Vulkan 1.3 introduced cmd_begin_rendering / cmd_end_rendering,
which lets you skip render pass and framebuffer objects entirely.
You specify attachments inline at recording time:
use vulkan_rust::vk;
use vk::*;
let color_attachment = RenderingAttachmentInfo::builder()
.image_view(swapchain_image_view)
.image_layout(ImageLayout::COLOR_ATTACHMENT_OPTIMAL)
.load_op(AttachmentLoadOp::CLEAR)
.store_op(AttachmentStoreOp::STORE)
.clear_value(ClearValue {
color: ClearColorValue {
float32: [0.0, 0.0, 0.0, 1.0],
},
});
let rendering_info = RenderingInfo::builder()
.render_area(Rect2D {
offset: Offset2D { x: 0, y: 0 },
extent: swapchain_extent,
})
.layer_count(1)
.color_attachments(&[*color_attachment]);
unsafe {
device.cmd_begin_rendering(command_buffer, &rendering_info);
// ... draw ...
device.cmd_end_rendering(command_buffer);
};
Dynamic rendering is simpler for most use cases. Use traditional render passes when you need subpass dependencies, input attachments, or compatibility with Vulkan 1.0/1.1/1.2.
Formal reference
Key structs
| Struct | Purpose |
|---|---|
| AttachmentDescription | Describes one attachment: format, samples, load/store ops, layouts |
| AttachmentReference | Points a subpass to an attachment by index + desired layout |
| SubpassDescription | Lists which attachments a subpass uses (color, depth, input, preserve) |
| SubpassDependency | Synchronization between subpasses (same fields as a pipeline barrier) |
| RenderPassCreateInfo | Combines attachments + subpasses + dependencies |
| FramebufferCreateInfo | Binds specific image views to a render pass |
| RenderPassBeginInfo | Starts a render pass instance with a framebuffer + clear values |
Subpass dependencies are barriers
A SubpassDependency has the same fields as a pipeline barrier:
src_stage_mask, dst_stage_mask, src_access_mask, dst_access_mask.
The special value SUBPASS_EXTERNAL refers to commands outside the
render pass (before it starts or after it ends).
If you understood Synchronization, subpass dependencies will feel familiar. They are barriers that the driver inserts automatically at subpass transitions.
Layout transitions are automatic
The render pass handles image layout transitions for you. Each
attachment has an initial_layout and final_layout. The driver
transitions the image at render pass begin/end. Within a subpass, the
image is in the layout specified by the AttachmentReference.
This is one of the render pass’s biggest conveniences: you do not need
to insert manual cmd_pipeline_barrier calls for attachment layout
transitions inside a render pass.
API reference links
Key takeaways
- A render pass is a blueprint describing attachments, subpasses, and dependencies. A framebuffer binds specific images to that blueprint.
- Load and store ops tell the driver how to handle attachment data at
the start and end of the pass. Choosing
DONT_CAREorCLEARoverLOADcan dramatically improve performance on mobile GPUs. - Most applications need only a single subpass. Multiple subpasses are for advanced techniques (deferred rendering, input attachments).
- Vulkan 1.3 dynamic rendering (
cmd_begin_rendering) eliminates the need for render pass and framebuffer objects in simple cases. - Render passes handle layout transitions automatically. You do not need manual barriers for attachment images inside a render pass.
Pipelines
Threshold concept. In OpenGL, you set rendering state one call at a time, blend mode here, depth test there, and the driver compiles the final state lazily. In Vulkan, all state is compiled into a pipeline object up front. This removes driver guesswork and hitching at the cost of more explicit setup.
Motivation
A GPU is not a general-purpose processor. It is a configurable state machine with fixed-function stages (vertex input, rasterization, blending) and programmable stages (vertex shader, fragment shader). A pipeline object captures the full configuration of this machine, every stage, every setting, so the driver can compile it to hardware instructions once and reuse it many times.
This is why OpenGL applications sometimes stutter when a new material appears: the driver has to compile a new internal pipeline on the fly. In Vulkan, you create all your pipelines at load time and switch between them during rendering with zero compilation cost.
Intuition
The mixing console preset
A pipeline is like a preset on a mixing console. Instead of adjusting every knob during a live performance (and risking a pop or crackle), you save the full board state as a preset and recall it instantly. You can have many presets and switch between them, but you cannot twiddle individual knobs mid-song.
(Vulkan 1.3 added dynamic state to relax this, certain knobs can be adjusted at draw time. But the core idea holds: most state is baked.)
What goes into a graphics pipeline
A graphics pipeline is the largest create info in the Vulkan API. It bundles together every stage of the rendering process:
GraphicsPipelineCreateInfo
│
├── Shader stages (vertex shader, fragment shader, ...)
├── Vertex input state (what vertex data looks like)
├── Input assembly state (triangles, lines, points)
├── Viewport state (viewport + scissor rectangle)
├── Rasterization state (polygon mode, culling, depth bias)
├── Multisample state (MSAA settings)
├── Depth/stencil state (depth test, stencil test)
├── Color blend state (blending per attachment)
├── Dynamic state (which of the above can change at draw time)
├── Pipeline layout (what resources the shaders expect)
└── Render pass + subpass (which render pass this pipeline is used in)
Every one of these must be specified. There are no defaults. This is verbose, but it means the driver has complete information at creation time and can optimize aggressively.
Before reading on: if you need to render some objects with blending and some without, how many pipeline objects do you need?
Answer: Two. Each pipeline bakes its blend state. You cmd_bind_pipeline to switch between them during command recording. Dynamic state (Vulkan 1.3) can make some of these switches cheaper, but you still need separate pipelines for fundamental differences like different shaders.
Pipeline layout: the bridge to resources
A pipeline layout declares what resources the shaders expect:
- Descriptor set layouts: “binding 0 is a uniform buffer, binding 1 is a sampled image” (covered in Descriptor Sets)
- Push constant ranges: small inline data passed at draw time (covered in Push Constants)
The pipeline layout is shared between pipeline creation and command recording, ensuring the resources you bind match what the shaders expect.
Worked example: creating a graphics pipeline
This is a minimal pipeline for rendering colored triangles.
Step 1: Load shaders
use vulkan_rust::vk;
use vulkan_rust::vk::*;
// SPIR-V bytecode, compiled from GLSL with glslc or shaderc.
let vert_code: &[u32] = /* load from file or include_bytes! */;
let frag_code: &[u32] = /* load from file or include_bytes! */;
let vert_info = ShaderModuleCreateInfo::builder()
.code(vert_code);
let frag_info = ShaderModuleCreateInfo::builder()
.code(frag_code);
let vert_module = unsafe { device.create_shader_module(&vert_info, None)? };
let frag_module = unsafe { device.create_shader_module(&frag_info, None)? };
// Shader stage descriptions.
let entry_name = c"main"; // GLSL entry point
let stages = [
*PipelineShaderStageCreateInfo::builder()
.stage(ShaderStageFlags::VERTEX)
.module(vert_module)
.name(entry_name),
*PipelineShaderStageCreateInfo::builder()
.stage(ShaderStageFlags::FRAGMENT)
.module(frag_module)
.name(entry_name),
];
Step 2: Define vertex input
use vulkan_rust::vk;
use vulkan_rust::vk::*;
// Describe how vertex data is laid out in memory.
let binding = VertexInputBindingDescription {
binding: 0,
stride: std::mem::size_of::<Vertex>() as u32,
input_rate: VertexInputRate::VERTEX,
};
let attributes = [
// position: vec3 at offset 0
VertexInputAttributeDescription {
location: 0,
binding: 0,
format: Format::R32G32B32_SFLOAT,
offset: 0,
},
// color: vec3 at offset 12
VertexInputAttributeDescription {
location: 1,
binding: 0,
format: Format::R32G32B32_SFLOAT,
offset: 12,
},
];
let vertex_input = PipelineVertexInputStateCreateInfo::builder()
.vertex_binding_descriptions(&[binding])
.vertex_attribute_descriptions(&attributes);
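The stride and offsets above must match the actual layout of the Rust struct you upload. A quick compile-and-run check, assuming the `Vertex` type used in this chapter is a `#[repr(C)]` pair of `[f32; 3]` fields (that definition is an assumption of this sketch):

```rust
// #[repr(C)] guarantees C-compatible field ordering and offsets,
// which is what the binding/attribute descriptions rely on.
// Without it, the Rust compiler is free to reorder fields.
#[repr(C)]
struct Vertex {
    position: [f32; 3], // location 0, offset 0
    color: [f32; 3],    // location 1, offset 12
}

fn main() {
    // These must agree with stride and the attribute offsets above.
    assert_eq!(std::mem::size_of::<Vertex>(), 24); // stride
    assert_eq!(std::mem::offset_of!(Vertex, position), 0);
    assert_eq!(std::mem::offset_of!(Vertex, color), 12);
    println!("vertex layout matches the pipeline description");
}
```

Assertions like these (or `std::mem::offset_of!` used directly in the attribute descriptions) catch layout drift at compile/test time instead of as garbage triangles at runtime.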
Step 3: Configure fixed-function state
use vulkan_rust::vk;
use vulkan_rust::vk::*;
let input_assembly = PipelineInputAssemblyStateCreateInfo::builder()
.topology(PrimitiveTopology::TRIANGLE_LIST);
// Use dynamic viewport and scissor so we don't bake window size
// into the pipeline. Set them at draw time with cmd_set_viewport
// and cmd_set_scissor.
let viewport_state = PipelineViewportStateCreateInfo::builder()
.viewport_count(1)
.scissor_count(1);
let rasterizer = PipelineRasterizationStateCreateInfo::builder()
.polygon_mode(PolygonMode::FILL)
.cull_mode(CullModeFlags::BACK)
.front_face(FrontFace::COUNTER_CLOCKWISE)
.line_width(1.0);
let multisampling = PipelineMultisampleStateCreateInfo::builder()
.rasterization_samples(SampleCountFlagBits::_1);
let depth_stencil = PipelineDepthStencilStateCreateInfo::builder()
    .depth_test_enable(true)
    .depth_write_enable(true)
    .depth_compare_op(CompareOp::LESS);
// No blending: write color directly.
let blend_attachment = PipelineColorBlendAttachmentState {
    blend_enable: 0,
    color_write_mask: ColorComponentFlags::R
        | ColorComponentFlags::G
        | ColorComponentFlags::B
        | ColorComponentFlags::A,
    // The remaining blend factors/ops are ignored when blending is off.
    ..Default::default()
};
let color_blending = PipelineColorBlendStateCreateInfo::builder()
.attachments(&[blend_attachment]);
// Dynamic state: viewport and scissor are set at draw time.
let dynamic_states = [
DynamicState::VIEWPORT,
DynamicState::SCISSOR,
];
let dynamic_state = PipelineDynamicStateCreateInfo::builder()
.dynamic_states(&dynamic_states);
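A side note on the `CompareOp::LESS` choice above: depth buffers are conventionally cleared to 1.0 (the far plane), so a fragment survives only if it is nearer than everything drawn before it. A minimal software model of that test:

```rust
// A software model of the depth test with CompareOp::LESS and
// depth writes enabled. Real hardware does this per pixel.
fn depth_test_less(depth_buffer: &mut f32, fragment_depth: f32) -> bool {
    if fragment_depth < *depth_buffer {
        *depth_buffer = fragment_depth; // depth_write_enable = true
        true                            // fragment survives
    } else {
        false                           // occluded: discarded
    }
}

fn main() {
    let mut depth = 1.0_f32; // cleared to the far plane
    assert!(depth_test_less(&mut depth, 0.5));  // first fragment passes
    assert!(!depth_test_less(&mut depth, 0.7)); // farther fragment rejected
    assert!(depth_test_less(&mut depth, 0.2));  // nearer fragment passes
    assert_eq!(depth, 0.2);
    println!("depth test model ok");
}
```

This is also why the render pass chapter clears depth to 1.0: clearing to 0.0 with LESS would reject every fragment and render nothing.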
Step 4: Create pipeline layout and pipeline
use vulkan_rust::vk;
use vulkan_rust::vk::*;
use vulkan_rust::vk::Handle;
// Empty layout (no descriptor sets, no push constants).
let layout_info = PipelineLayoutCreateInfo::builder();
let pipeline_layout = unsafe {
device.create_pipeline_layout(&layout_info, None)?
};
// Assemble everything into one create info.
let pipeline_info = GraphicsPipelineCreateInfo::builder()
.stages(&stages)
.vertex_input_state(&vertex_input)
.input_assembly_state(&input_assembly)
.viewport_state(&viewport_state)
.rasterization_state(&rasterizer)
.multisample_state(&multisampling)
.depth_stencil_state(&depth_stencil)
.color_blend_state(&color_blending)
.dynamic_state(&dynamic_state)
.layout(pipeline_layout)
.render_pass(render_pass)
.subpass(0);
// create_graphics_pipelines can create multiple pipelines at once.
let pipeline = unsafe {
device.create_graphics_pipelines(
PipelineCache::null(), // no cache for now
&[*pipeline_info],
None,
)?
}[0];
// Shader modules can be destroyed after pipeline creation.
// The compiled code is baked into the pipeline.
unsafe {
device.destroy_shader_module(vert_module, None);
device.destroy_shader_module(frag_module, None);
};
Step 5: Use in command recording
use vulkan_rust::vk;
use vulkan_rust::vk::*;
unsafe {
device.cmd_bind_pipeline(
command_buffer,
PipelineBindPoint::GRAPHICS,
pipeline,
);
// Set dynamic state.
device.cmd_set_viewport(command_buffer, 0, &[viewport]);
device.cmd_set_scissor(command_buffer, 0, &[scissor]);
// Draw.
device.cmd_draw(command_buffer, vertex_count, 1, 0, 0);
};
Compute pipelines
Compute pipelines are dramatically simpler: just a shader stage and a pipeline layout. No vertex input, no rasterization, no blending.
use vulkan_rust::vk;
use vulkan_rust::vk::*;
use vulkan_rust::vk::Handle;
let compute_info = ComputePipelineCreateInfo::builder()
.stage(*PipelineShaderStageCreateInfo::builder()
.stage(ShaderStageFlags::COMPUTE)
.module(compute_module)
.name(c"main"))
.layout(compute_layout);
let mut compute_pipeline = Pipeline::null();
unsafe {
device.create_compute_pipelines(
PipelineCache::null(),
&[*compute_info],
None,
&mut compute_pipeline,
)?;
};
Pipeline cache
Creating pipelines involves compiling shaders to GPU-specific machine code. A pipeline cache stores this compiled output so subsequent creations (in the same run or across runs, if you save/load the cache) are faster.
use vulkan_rust::vk;
use vulkan_rust::vk::*;
// Create a cache (optionally seeded with data from a previous run).
let cache_info = PipelineCacheCreateInfo::builder();
let cache = unsafe { device.create_pipeline_cache(&cache_info, None)? };
// Pass the cache when creating pipelines.
unsafe {
let pipeline = device.create_graphics_pipelines(cache, &[*pipeline_info], None)?[0];
};
// At shutdown, retrieve cache data and save to disk for next run.
// (use get_pipeline_cache_data)
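The disk round-trip itself is plain file I/O. Here is a minimal sketch of the load/save helpers, assuming the bytes come from and go back to `get_pipeline_cache_data`; the file name is an arbitrary choice, and a missing or unreadable file simply means "start with an empty cache":

```rust
use std::fs;
use std::path::Path;

// Read a previously saved pipeline cache blob. A missing or unreadable
// file collapses to an empty Vec, so startup never fails on cache I/O.
// (An empty blob passed to PipelineCacheCreateInfo means "cold cache".)
fn load_pipeline_cache_blob(path: &Path) -> Vec<u8> {
    fs::read(path).unwrap_or_default()
}

// Persist the blob retrieved from get_pipeline_cache_data at shutdown.
fn save_pipeline_cache_blob(path: &Path, data: &[u8]) -> std::io::Result<()> {
    fs::write(path, data)
}
```

The drivers validate the blob's internal header themselves, so feeding back a stale or foreign-GPU blob is safe: the driver detects the mismatch and starts fresh.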
Dynamic state (Vulkan 1.3)
By default, every setting in the pipeline is baked. Dynamic state lets you mark specific settings as “set at draw time”:
| Dynamic State | What it replaces |
|---|---|
| VIEWPORT | Viewport in viewport state |
| SCISSOR | Scissor in viewport state |
| LINE_WIDTH | Line width in rasterization state |
| DEPTH_TEST_ENABLE | Depth test enable in depth/stencil state |
| CULL_MODE | Cull mode in rasterization state |
| FRONT_FACE | Front face in rasterization state |
| PRIMITIVE_TOPOLOGY | Topology in input assembly state |
Dynamic VIEWPORT and SCISSOR have been available since Vulkan 1.0, and
using them is near-universal convention (otherwise a window resize
forces pipeline recreation). Vulkan 1.3 promoted much more aggressive
dynamic state into core (from VK_EXT_extended_dynamic_state), including
CULL_MODE, FRONT_FACE, PRIMITIVE_TOPOLOGY, and DEPTH_TEST_ENABLE. This
lets you consolidate pipelines: instead of separate pipelines for
different cull modes, use one pipeline with CULL_MODE dynamic.
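The payoff is multiplicative. Each baked setting multiplies the number of pipeline permutations you must create, while each dynamic setting contributes a factor of one. A tiny illustrative calculation (the variant counts are made-up examples, not anything from the Vulkan spec):

```rust
// Number of pipeline objects needed: each baked setting multiplies the
// permutation count; a dynamic setting is set at draw time and so
// contributes no factor at all.
fn pipeline_count(baked_variant_counts: &[usize]) -> usize {
    baked_variant_counts.iter().product()
}
```

With, say, 3 cull modes, 2 front faces, and 5 topologies all baked you need 30 pipelines; make all three dynamic and a single pipeline covers every combination.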
Formal reference
Graphics pipeline stages (in order)
| Stage | State struct | Required? |
|---|---|---|
| Vertex input | PipelineVertexInputStateCreateInfo | Yes |
| Input assembly | PipelineInputAssemblyStateCreateInfo | Yes |
| Tessellation | PipelineTessellationStateCreateInfo | Only with tessellation shaders |
| Viewport | PipelineViewportStateCreateInfo | Yes (unless rasterizer discards) |
| Rasterization | PipelineRasterizationStateCreateInfo | Yes |
| Multisample | PipelineMultisampleStateCreateInfo | Yes |
| Depth/stencil | PipelineDepthStencilStateCreateInfo | If render pass has depth attachment |
| Color blend | PipelineColorBlendStateCreateInfo | If render pass has color attachments |
| Dynamic | PipelineDynamicStateCreateInfo | Optional |
Destruction order
- Destroy pipelines before their pipeline layout.
- Destroy pipeline layouts before their descriptor set layouts.
- Shader modules can be destroyed immediately after pipeline creation.
- Pipeline caches can be destroyed at any time (they are independent of the pipelines created through them).
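In Rust you can encode this ordering structurally: struct fields drop in declaration order, so declaring the pipeline wrapper before its layout wrapper guarantees the right destruction sequence. A toy sketch (the wrapper types are hypothetical stand-ins, not vulkan_rust items) that records the order in which `Drop` runs:

```rust
use std::cell::RefCell;
use std::rc::Rc;

type DropLog = Rc<RefCell<Vec<&'static str>>>;

// Stand-ins for RAII wrappers that would call destroy_* in Drop.
struct Pipeline(DropLog);
struct PipelineLayout(DropLog);

impl Drop for Pipeline {
    fn drop(&mut self) { self.0.borrow_mut().push("pipeline"); }
}
impl Drop for PipelineLayout {
    fn drop(&mut self) { self.0.borrow_mut().push("layout"); }
}

// Rust drops struct fields in declaration order, so putting the
// pipeline field first encodes "destroy pipelines before their layout".
struct Renderer {
    pipeline: Pipeline,     // dropped first
    layout: PipelineLayout, // dropped second
}

fn destruction_order() -> Vec<&'static str> {
    let log: DropLog = Rc::new(RefCell::new(Vec::new()));
    let r = Renderer {
        pipeline: Pipeline(log.clone()),
        layout: PipelineLayout(log.clone()),
    };
    drop(r);
    let order = log.borrow().clone();
    order
}
```

Reordering the fields would silently reverse the destruction order, which is why wrapper crates tend to document field order as part of their safety story.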
API reference links
Key takeaways
- A graphics pipeline bakes all rendering state into one object: shaders, vertex layout, rasterization, blending, depth test, everything.
- You create pipelines at load time and switch between them with cmd_bind_pipeline during rendering. Zero compilation cost at draw time.
- Compute pipelines are much simpler: just a shader + layout.
- Dynamic state lets you defer certain settings to draw time, reducing the number of pipeline objects you need.
- Pipeline caches avoid redundant shader compilation across pipeline creations and across application runs.
Descriptor Sets & Resource Binding
Motivation
Shaders need access to resources: buffers containing transformation
matrices, images to sample, storage buffers for compute output.
Descriptors are Vulkan’s mechanism for connecting shader bindings
(layout(binding = 0) uniform ...) to actual GPU resources.
The descriptor system is more complex than OpenGL’s glBindTexture,
but it exists because binding resources one at a time is a bottleneck.
Vulkan lets you bind sets of resources at once, and reuse those sets
across multiple draw calls.
Intuition
The surgeon’s tray
Think of a descriptor set as a tray of tools laid out for a surgeon:
- The descriptor set layout is the diagram showing which tool goes in which slot (“slot 0: scalpel, slot 1: forceps, slot 2: sutures”).
- The descriptor pool is the sterilization room where trays are prepared (pre-allocated memory for many trays).
- The descriptor set is one prepared tray, with actual tools in each slot.
- Writing a descriptor set is placing specific tools into the slots.
- Binding is sliding the tray under the surgeon’s hands during the operation.
The flow:
1. Define layout → "what slots exist and what types they hold"
2. Create pool → "how many trays can we prepare at once"
3. Allocate set → "give me an empty tray matching this layout"
4. Write descriptors → "put this buffer in slot 0, this image in slot 1"
5. Bind set → "use this tray for the next draw calls"
Before reading on: why do you think Vulkan uses descriptor “pools” instead of allocating descriptors individually? What performance problem does this solve?
Answer: Same reason as command pools: individual allocations are expensive because each one requires driver bookkeeping and possibly a kernel call. Pools pre-allocate a block of memory and hand out descriptors cheaply from that block.
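The allocation pattern is essentially a bump allocator. A toy model (not the real driver behavior, just the shape of it): one up-front reservation, trivially cheap hand-outs, and a reset that reclaims everything at once, like `reset_descriptor_pool`:

```rust
// Toy descriptor-pool model: one up-front allocation, cheap hand-outs,
// wholesale reset.
struct ToyPool {
    capacity: usize,
    next: usize,
}

impl ToyPool {
    fn with_capacity(capacity: usize) -> Self {
        ToyPool { capacity, next: 0 }
    }

    // Allocation is an index bump: no per-allocation bookkeeping.
    fn allocate(&mut self) -> Option<usize> {
        if self.next < self.capacity {
            let slot = self.next;
            self.next += 1;
            Some(slot)
        } else {
            None // pool exhausted, analogous to ERROR_OUT_OF_POOL_MEMORY
        }
    }

    // Frees every set allocated from this pool in one step.
    fn reset(&mut self) {
        self.next = 0;
    }
}
```

This is also why freeing individual descriptor sets requires an opt-in flag in Vulkan: per-set free reintroduces exactly the bookkeeping the pool design avoids.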
Descriptor types
Each slot in a descriptor set has a specific type:
| Type | What it binds | GLSL example |
|---|---|---|
| UNIFORM_BUFFER | Read-only buffer (matrices, parameters) | layout(binding=0) uniform UBO { mat4 mvp; }; |
| STORAGE_BUFFER | Read/write buffer (compute data) | layout(binding=0) buffer SSBO { float data[]; }; |
| COMBINED_IMAGE_SAMPLER | Image + sampler together | layout(binding=0) uniform sampler2D tex; |
| SAMPLED_IMAGE | Image without sampler | layout(binding=0) uniform texture2D tex; |
| SAMPLER | Sampler without image | layout(binding=0) uniform sampler s; |
| STORAGE_IMAGE | Read/write image (compute) | layout(binding=0, rgba8) uniform image2D img; |
| INPUT_ATTACHMENT | Previous subpass output | layout(input_attachment_index=0) uniform subpassInput; |
The most common are UNIFORM_BUFFER and COMBINED_IMAGE_SAMPLER.
Worked example: binding a uniform buffer and a texture
Step 1: Create a descriptor set layout
use vulkan_rust::vk;
use vk::*;
// Describe the bindings: slot 0 is a uniform buffer visible to
// the vertex shader, slot 1 is a combined image sampler visible
// to the fragment shader.
let bindings = [
DescriptorSetLayoutBinding {
binding: 0,
descriptor_type: DescriptorType::UNIFORM_BUFFER,
descriptor_count: 1,
stage_flags: ShaderStageFlags::VERTEX,
p_immutable_samplers: core::ptr::null(),
},
DescriptorSetLayoutBinding {
binding: 1,
descriptor_type: DescriptorType::COMBINED_IMAGE_SAMPLER,
descriptor_count: 1,
stage_flags: ShaderStageFlags::FRAGMENT,
p_immutable_samplers: core::ptr::null(),
},
];
let layout_info = DescriptorSetLayoutCreateInfo::builder()
.bindings(&bindings);
let descriptor_layout = unsafe {
device.create_descriptor_set_layout(&layout_info, None)?
};
// This layout is also passed to create_pipeline_layout, connecting
// the pipeline to the descriptor set structure.
Step 2: Create a descriptor pool
use vulkan_rust::vk;
use vk::*;
// The pool must have enough room for the descriptor types we need.
// If we want 10 sets, each with 1 uniform buffer and 1 image sampler:
let pool_sizes = [
DescriptorPoolSize {
r#type: DescriptorType::UNIFORM_BUFFER,
descriptor_count: 10,
},
DescriptorPoolSize {
r#type: DescriptorType::COMBINED_IMAGE_SAMPLER,
descriptor_count: 10,
},
];
let pool_info = DescriptorPoolCreateInfo::builder()
.max_sets(10)
.pool_sizes(&pool_sizes);
let descriptor_pool = unsafe {
device.create_descriptor_pool(&pool_info, None)?
};
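The sizing rule in the comment above generalizes: each pool size is (descriptors of that type per set) × (number of sets). A small helper sketch, using plain string tags as stand-ins for DescriptorType values so it stays self-contained:

```rust
// Given the number of sets and how many descriptors of each type one
// set uses, compute the totals a descriptor pool must reserve.
// The &str tags stand in for DescriptorType values.
fn pool_totals(
    max_sets: u32,
    per_set: &[(&'static str, u32)],
) -> Vec<(&'static str, u32)> {
    per_set
        .iter()
        .map(|&(ty, count)| (ty, count * max_sets))
        .collect()
}
```

Undersizing either `max_sets` or any per-type total makes `allocate_descriptor_sets` fail with an out-of-pool-memory error, so it is worth computing these numbers rather than guessing.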
Step 3: Allocate a descriptor set
use vulkan_rust::vk;
use vk::*;
// Bind the layout array to a variable so the slice the builder stores
// a pointer to outlives this statement.
let set_layouts = [descriptor_layout];
let alloc_info = DescriptorSetAllocateInfo::builder()
    .descriptor_pool(descriptor_pool)
    .set_layouts(&set_layouts);
let descriptor_set = unsafe {
device.allocate_descriptor_sets(&alloc_info)?
}[0];
Step 4: Write descriptors (point slots to actual resources)
use vulkan_rust::vk;
use vk::*;
// Point binding 0 to our uniform buffer.
let buffer_info = DescriptorBufferInfo {
buffer: uniform_buffer,
offset: 0,
range: std::mem::size_of::<UniformData>() as u64,
};
// Point binding 1 to our texture.
let image_info = DescriptorImageInfo {
sampler: texture_sampler,
image_view: texture_image_view,
image_layout: ImageLayout::SHADER_READ_ONLY_OPTIMAL,
};
// Bind the info arrays to variables so the slices the builders store
// pointers to outlive this statement.
let buffer_infos = [buffer_info];
let image_infos = [image_info];
let writes = [
    *WriteDescriptorSet::builder()
        .dst_set(descriptor_set)
        .dst_binding(0)
        .descriptor_type(DescriptorType::UNIFORM_BUFFER)
        .buffer_info(&buffer_infos),
    *WriteDescriptorSet::builder()
        .dst_set(descriptor_set)
        .dst_binding(1)
        .descriptor_type(DescriptorType::COMBINED_IMAGE_SAMPLER)
        .image_info(&image_infos),
];
// This updates the descriptor set immediately. No command buffer needed.
unsafe { device.update_descriptor_sets(&writes, &[]) };
Step 5: Bind during command recording
use vulkan_rust::vk;
use vk::*;
unsafe {
device.cmd_bind_descriptor_sets(
command_buffer,
PipelineBindPoint::GRAPHICS,
pipeline_layout,
0, // first set index
&[descriptor_set], // sets to bind
&[], // dynamic offsets (none)
);
// Now draw calls in this command buffer can access the
// uniform buffer at binding 0 and the texture at binding 1.
device.cmd_draw(command_buffer, vertex_count, 1, 0, 0);
};
Multiple descriptor sets
You can bind multiple descriptor sets at once. A common pattern:
Set 0: Per-frame data (camera matrices, lighting, time)
Set 1: Per-material data (textures, material properties)
Set 2: Per-object data (model matrix)
This lets you update and bind sets at different frequencies. Set 0 changes once per frame, set 1 changes when you switch materials, set 2 changes per object. You only rebind the sets that changed.
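The savings are easy to quantify. With frequency-sorted sets you issue one bind for set 0, one per material for set 1, and one per object for set 2; a naive scheme rebinds all three sets for every object. A quick illustrative count:

```rust
// Bind-call counts for one frame, with M materials and N objects per
// material. Frequency-sorted: 1 (per-frame) + M (per-material) + M*N
// (per-object).
fn sorted_bind_calls(materials: u32, objects_per_material: u32) -> u32 {
    1 + materials + materials * objects_per_material
}

// Naive: rebind all three sets for every object drawn.
fn naive_bind_calls(materials: u32, objects_per_material: u32) -> u32 {
    3 * materials * objects_per_material
}
```

For 10 materials with 100 objects each, that is 1,011 bind calls versus 3,000, and the gap widens as per-frame and per-material data grows.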
use vulkan_rust::vk;
use vk::*;
// In pipeline layout creation:
let layouts = [per_frame_layout, per_material_layout, per_object_layout];
let layout_info = PipelineLayoutCreateInfo::builder()
.set_layouts(&layouts);
// During rendering:
unsafe {
// Bind set 0 once per frame.
device.cmd_bind_descriptor_sets(
cmd, PipelineBindPoint::GRAPHICS,
pipeline_layout, 0, &[per_frame_set], &[],
);
for material in &materials {
// Bind set 1 per material.
device.cmd_bind_descriptor_sets(
cmd, PipelineBindPoint::GRAPHICS,
pipeline_layout, 1, &[material.descriptor_set], &[],
);
for object in &material.objects {
// Bind set 2 per object.
device.cmd_bind_descriptor_sets(
cmd, PipelineBindPoint::GRAPHICS,
pipeline_layout, 2, &[object.descriptor_set], &[],
);
device.cmd_draw(cmd, object.vertex_count, 1, 0, 0);
}
}
};
Before reading on: in the pattern above, when you bind set 1 for a new material, does set 0 (per-frame) stay bound or does it need to be rebound?
Answer: It stays bound. Binding set N only affects set N. Sets at other indices remain bound from their previous cmd_bind_descriptor_sets call, as long as the pipeline layout is compatible.
Formal reference
The descriptor set creation flow
DescriptorSetLayoutBinding[]
│
v
DescriptorSetLayoutCreateInfo ──> create_descriptor_set_layout ──> DescriptorSetLayout
│
┌─────────────────────────────────────────────────────┘
v
DescriptorPoolCreateInfo ──> create_descriptor_pool ──> DescriptorPool
│ │
v v
DescriptorSetAllocateInfo ──────> allocate_descriptor_sets ──> DescriptorSet
│
v
WriteDescriptorSet[] ──────────> update_descriptor_sets (set is now usable)
│
v
cmd_bind_descriptor_sets ──────> (shaders can access resources)
Descriptor types reference
| Type | Read/Write | Typical use |
|---|---|---|
| UNIFORM_BUFFER | Read | Matrices, parameters (small, frequently updated) |
| UNIFORM_BUFFER_DYNAMIC | Read | Same, with dynamic offset at bind time |
| STORAGE_BUFFER | Read/Write | Large data, compute buffers |
| STORAGE_BUFFER_DYNAMIC | Read/Write | Same, with dynamic offset |
| COMBINED_IMAGE_SAMPLER | Read | Textures |
| SAMPLED_IMAGE | Read | Image without sampler (separate sampler) |
| SAMPLER | N/A | Sampler without image |
| STORAGE_IMAGE | Read/Write | Compute shader image output |
| INPUT_ATTACHMENT | Read | Previous subpass output |
| INLINE_UNIFORM_BLOCK | Read | Small uniform data inline in the set |
Destruction order
- Destroy pipeline layouts before descriptor set layouts.
- Destroying a descriptor pool frees all sets allocated from it.
- Descriptor set layouts can be destroyed after pipeline creation (the pipeline bakes a copy of the layout information).
API reference links
- DescriptorSetLayout
- DescriptorPool
- DescriptorSet
- WriteDescriptorSet
- DescriptorType
- Vulkan spec: Resource Descriptors
Key takeaways
- Descriptors connect shader bindings to GPU resources (buffers, images).
- The flow is: define layout → create pool → allocate set → write → bind.
- Use multiple descriptor sets (per-frame, per-material, per-object) to minimize rebinding. Only rebind sets that change.
- Descriptor pools work like command pools: pre-allocate in bulk, hand out cheaply.
- update_descriptor_sets is a CPU-side operation, not a GPU command. You can update sets between submissions without recording commands.
The pNext Extension Chain
Motivation
Vulkan evolves through extensions, and extensions often need to add fields
to existing structs. But Vulkan structs are #[repr(C)] with a fixed
layout, you cannot just add fields. The solution is pNext: a linked
list pointer in every extensible struct that lets you chain additional
data structures onto it.
This is Vulkan’s most powerful extensibility mechanism and one of its most confusing features for newcomers. Once you understand it, enabling new Vulkan features and extensions becomes straightforward.
Intuition
The envelope analogy
Every Vulkan struct with a pNext field is an envelope. The main struct
is the letter inside. The pNext chain lets you stuff additional pages
into the same envelope.
The driver opens the envelope, reads the main page, then checks if there
are more pages. Each extra page has a header (sType) that identifies
what it is, so the driver knows how to interpret it. Pages it doesn’t
recognize are silently skipped.
DeviceCreateInfo (envelope)
├── sType: DEVICE_CREATE_INFO (header: "this is a device create info")
├── pNext ──────────────────────────┐
├── ... (normal fields) │
│ v
│ PhysicalDeviceVulkan12Features (extra page)
│ ├── sType: PHYSICAL_DEVICE_VULKAN_1_2_FEATURES
│ ├── pNext ──────────────────────────┐
│ ├── ... (Vulkan 1.2 feature flags) │
│ v
│ PhysicalDeviceVulkan13Features (another page)
│ ├── sType: PHYSICAL_DEVICE_VULKAN_1_3_FEATURES
│ ├── pNext: null (end of chain)
│ ├── ... (Vulkan 1.3 feature flags)
Under the hood: two pointers
Every extensible Vulkan struct starts with the same two fields:
pub struct SomeCreateInfo {
pub s_type: StructureType, // identifies the struct type
pub p_next: *const core::ffi::c_void, // pointer to next struct in chain
// ... rest of the fields
}
The sType field is a discriminator, like a tagged union. The driver
reads sType to know what struct it’s looking at, then casts the
pointer to the correct type. This is the same pattern as COM’s
QueryInterface or protobuf’s Any.
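The whole trick rests on every chainable struct sharing the same two-field prefix, so any node can be read through a common header type, which is exactly what BaseInStructure/BaseOutStructure exist for. A self-contained miniature of the pattern (the struct names and tag values here are invented for illustration):

```rust
use core::ffi::c_void;

// The common prefix every extensible struct starts with, analogous to
// BaseInStructure: a tag plus a pointer to the next node.
#[repr(C)]
struct BaseHeader {
    s_type: u32,
    p_next: *const c_void,
}

// A concrete "extension struct": same prefix, then its own payload.
#[repr(C)]
struct FeatureToy {
    s_type: u32,
    p_next: *const c_void,
    enabled: u32,
}

// Walk a chain the way a driver would: read each node through the
// common header, record its tag, follow p_next until null.
unsafe fn collect_stypes(mut node: *const c_void) -> Vec<u32> {
    let mut tags = Vec::new();
    while !node.is_null() {
        let header = &*(node as *const BaseHeader);
        tags.push(header.s_type);
        node = header.p_next;
    }
    tags
}
```

A real driver does one extra step per node: match on the tag and, if it recognizes the struct, cast the pointer to the concrete type; unrecognized tags are simply walked past.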
Worked example: enabling Vulkan 1.2 and 1.3 features
The most common use of pNext chains is enabling device features from newer Vulkan versions or extensions.
Without vulkan_rust builders (raw C-style)
use vulkan_rust::vk;
use vulkan_rust::vk::*;
// You would need to manually link the structs:
let mut features_13 = PhysicalDeviceVulkan13Features {
s_type: StructureType::PHYSICAL_DEVICE_VULKAN_1_3_FEATURES,
    p_next: core::ptr::null(), // end of chain
dynamic_rendering: 1, // enable dynamic rendering
synchronization2: 1, // enable synchronization2
..unsafe { core::mem::zeroed() }
};
let mut features_12 = PhysicalDeviceVulkan12Features {
s_type: StructureType::PHYSICAL_DEVICE_VULKAN_1_2_FEATURES,
p_next: &mut features_13 as *mut _ as *const _, // link to next
buffer_device_address: 1,
descriptor_indexing: 1,
..unsafe { core::mem::zeroed() }
};
let device_info = DeviceCreateInfo {
s_type: StructureType::DEVICE_CREATE_INFO,
p_next: &mut features_12 as *mut _ as *const _, // link to chain
// ...
};
This is error-prone: wrong sType, dangling pointers, forgetting to
link the chain. vulkan_rust builders fix all of these problems.
With vulkan_rust builders (type-safe)
use vulkan_rust::vk;
use vulkan_rust::vk::*;
let mut features_12 = *PhysicalDeviceVulkan12Features::builder()
.buffer_device_address(1)
.descriptor_indexing(1);
let mut features_13 = *PhysicalDeviceVulkan13Features::builder()
.dynamic_rendering(1)
.synchronization2(1);
let device_info = DeviceCreateInfo::builder()
.push_next(&mut features_12)
.push_next(&mut features_13)
// ... other fields
;
The builder handles:
- sType is set automatically by builder().
- pNext linking is handled by push_next, which prepends each struct to the chain.
- Type safety via marker traits: push_next only accepts types that the Vulkan spec says are valid extensions for that struct. Passing an invalid type is a compile error.
Before reading on: what do you think happens if you chain a struct that the driver doesn’t recognize (e.g., an extension struct the driver doesn’t support)?
Answer: The driver skips it. Every struct in the chain has an sType header. The driver reads each sType, processes structs it recognizes, and follows the pNext pointer past structs it doesn’t. This is how forward compatibility works: old drivers ignore new extension structs.
How push_next works
The push_next method prepends to the chain. Each call inserts the
new struct at the front:
// push_next implementation (simplified):
pub fn push_next<T: ExtendsDeviceCreateInfo>(mut self, next: &'a mut T) -> Self {
unsafe {
let next_ptr = next as *mut T as *mut BaseOutStructure;
// Point the new struct's pNext to the current chain head.
(*next_ptr).p_next = self.inner.p_next as *mut _;
// Make the new struct the chain head.
self.inner.p_next = next_ptr as *const _;
}
self
}
After two push_next calls:
DeviceCreateInfo.pNext → features_13 → features_12 → null
(last pushed (first pushed
is first) is last)
The order in the chain does not matter to the driver. It walks the entire chain regardless of order.
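Prepending is the cheapest way to grow a singly linked list, which is why push_next works front-first. The resulting "walk order is reverse push order" behavior can be modeled with a plain Vec, no unsafe needed:

```rust
// Model of push_next's prepend semantics: each push makes the new
// struct the chain head, so walking the chain visits structs in
// reverse push order.
fn chain_after_pushes(pushes: &[&'static str]) -> Vec<&'static str> {
    let mut chain = Vec::new();
    for p in pushes {
        chain.insert(0, *p); // prepend, like push_next
    }
    chain
}
```

Since the driver walks the whole chain regardless, this reversal is an implementation detail you never need to compensate for.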
The Extends marker traits
For each extensible struct, vulkan_rust generates an unsafe trait:
pub unsafe trait ExtendsDeviceCreateInfo {}
Types that the Vulkan spec says can appear in DeviceCreateInfo’s
pNext chain implement this trait:
unsafe impl ExtendsDeviceCreateInfo for PhysicalDeviceVulkan12Features {}
unsafe impl ExtendsDeviceCreateInfo for PhysicalDeviceVulkan13Features {}
unsafe impl ExtendsDeviceCreateInfo for DevicePrivateDataCreateInfo {}
// ... hundreds more
These traits are generated from the structextends attribute in
vk.xml, so they are always in sync with the Vulkan spec.
If you try to push_next a struct that doesn’t implement the trait:
use vulkan_rust::vk;
use vulkan_rust::vk::*;
// Compile error: PhysicalDeviceMemoryProperties does not implement
// ExtendsDeviceCreateInfo
let info = DeviceCreateInfo::builder()
.push_next(&mut mem_props); // ← won't compile
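The gating mechanism itself is ordinary Rust generics: a trait bound on `push_next` means only opted-in types compile. Here is the pattern in miniature, with toy names (none of these are real vulkan_rust items); the real marker traits are empty, but this sketch gives the trait a `tag` method so the effect is observable:

```rust
// Toy stand-in for DeviceCreateInfo's builder, tracking its chain as tags.
struct DeviceCreateInfoToy {
    chain: Vec<&'static str>,
}

// Marker-style trait: "this type may appear in the pNext chain".
trait ExtendsDeviceCreateInfoToy {
    fn tag(&self) -> &'static str;
}

struct Features12Toy;
impl ExtendsDeviceCreateInfoToy for Features12Toy {
    fn tag(&self) -> &'static str { "Vulkan12Features" }
}

impl DeviceCreateInfoToy {
    // The trait bound is the whole trick: a type without the impl
    // fails to compile here, mirroring the real push_next.
    fn push_next<T: ExtendsDeviceCreateInfoToy>(mut self, next: &T) -> Self {
        self.chain.insert(0, next.tag());
        self
    }
}
```

Calling `push_next` with a type lacking the impl (say, a memory-properties struct) is rejected at compile time, exactly like the `mem_props` example above.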
The builder Deref pattern
vulkan_rust builders implement Deref<Target = InnerStruct>, so you can
pass a builder anywhere a reference to the inner struct is expected:
use vulkan_rust::vk;
use vulkan_rust::vk::*;
let info = DeviceCreateInfo::builder()
.queue_create_infos(&queue_infos)
.push_next(&mut features_12);
// No need to call .build(), just pass &info or *info.
let device = unsafe { instance.create_device(physical_device, &info, None)? };
The *info dereference gives you the inner DeviceCreateInfo.
The &info auto-derefs to &DeviceCreateInfo through Deref.
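The mechanism is a few lines of ordinary Rust. A minimal sketch of the pattern with invented types (the real builders add lifetimes and many more setters):

```rust
use std::ops::Deref;

// The inner POD struct, as would be passed over FFI.
struct Inner {
    count: u32,
}

// The builder owns the inner struct and mutates it through setters.
struct InnerBuilder {
    inner: Inner,
}

impl InnerBuilder {
    fn new() -> Self {
        InnerBuilder { inner: Inner { count: 0 } }
    }
    fn count(mut self, count: u32) -> Self {
        self.inner.count = count;
        self
    }
}

// Deref exposes the inner struct, so &builder coerces to &Inner and
// *builder yields it directly; no .build() call needed.
impl Deref for InnerBuilder {
    type Target = Inner;
    fn deref(&self) -> &Inner {
        &self.inner
    }
}

// An API that expects the inner struct, like create_device expects
// &DeviceCreateInfo.
fn consume(inner: &Inner) -> u32 {
    inner.count
}
```

The coercion is automatic at call sites, which is what lets the worked examples pass `&info` straight to `create_device` without unwrapping the builder first.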
Lifetime safety
Builders carry a lifetime parameter 'a to ensure that references
passed to push_next (and slice methods like queue_create_infos)
live long enough:
pub struct DeviceCreateInfoBuilder<'a> {
inner: DeviceCreateInfo,
_marker: PhantomData<&'a ()>,
}
This means the builder and everything chained into it must live in the same scope. The compiler enforces this:
use vulkan_rust::vk;
use vulkan_rust::vk::*;
let info = {
let mut features = PhysicalDeviceVulkan12Features::builder();
DeviceCreateInfo::builder()
.push_next(&mut features)
// ← compile error: `features` does not live long enough
};
Common pNext patterns
Querying supported features
Chain feature structs into PhysicalDeviceFeatures2 and call
get_physical_device_features2:
use vulkan_rust::vk;
use vulkan_rust::vk::*;
let mut features_12 = *PhysicalDeviceVulkan12Features::builder();
let mut features_13 = *PhysicalDeviceVulkan13Features::builder();
let mut features2 = PhysicalDeviceFeatures2::builder()
.push_next(&mut features_12)
.push_next(&mut features_13);
unsafe {
instance.get_physical_device_features2(physical_device, &mut *features2);
};
// Now features_12 and features_13 are filled in by the driver.
if features_12.buffer_device_address != 0 {
println!("Buffer device address is supported");
}
Enabling features at device creation
Pass the same structs (with your desired features set to 1) into
DeviceCreateInfo via push_next, as shown in the worked example
above.
Formal reference
Key types
| Type | Purpose |
|---|---|
| BaseInStructure | Generic pNext chain traversal (const). Fields: s_type, p_next. |
| BaseOutStructure | Generic pNext chain traversal (mutable). Fields: s_type, p_next. |
| StructureType | Enum identifying each struct type. Set automatically by builder(). |
| ExtendsXxx traits | Marker traits generated from vk.xml structextends attribute. |
Rules
- Never set sType manually. builder() does it for you.
- Never manipulate pNext directly. Use push_next.
- Order in the chain does not matter. The driver walks the full chain.
- Lifetimes must be valid. All chained structs must outlive the API call that consumes them.
- Unknown structs are skipped. Chaining an extension struct the driver doesn’t support is safe; it will be ignored.
API reference links
Key takeaways
- pNext is a linked list that lets extensions add data to existing structs without changing their layout.
- vulkan_rust builders make pNext chains type-safe: push_next only accepts types the spec allows, sType is set automatically, and lifetimes are enforced by the compiler.
- The most common use case is enabling device features from Vulkan 1.2, 1.3, or extensions at device creation time.
- Chain order does not matter. Unknown structs are silently skipped.
Validation Layers & Debugging
Motivation
Vulkan does almost no error checking at runtime: calling a function incorrectly is undefined behavior, not an error message. This is fast but makes debugging brutal. A typo in a pipeline barrier’s access mask won’t crash immediately; it will cause a subtle rendering glitch three frames later on one specific GPU.
Validation layers are optional middleware that intercepts every Vulkan call and checks it against the spec. They catch invalid usage, report synchronization hazards, and point you to the exact spec section that explains what went wrong. You should always enable them during development.
Intuition
The strict code reviewer
Validation layers are a strict code reviewer sitting between your application and the driver. Every API call passes through the reviewer first. In development, the reviewer catches your mistakes before they reach the driver. In production, you remove the reviewer and calls go straight through.
Your app ──> Validation Layer ──> Vulkan Driver ──> GPU
│
│ "ERROR: Buffer 0x42 was not created with
│ TRANSFER_DST usage, but you're using it
│ as a copy destination. See spec section 7.4."
v
Callback (your code logs or prints this)
Without validation layers:
Your app ──────────────────────> Vulkan Driver ──> GPU
│
│ (undefined behavior,
│ maybe works, maybe
│ corrupts memory,
│ maybe crashes later)
Before reading on: why do you think Vulkan chose to make error checking optional instead of always-on?
Answer: Performance. Validation checking every API call adds measurable overhead (sometimes 2-5x slower). For a shipped game running at 60fps, that cost is unacceptable. By making validation optional, development builds get thorough checking while release builds get maximum performance.
Worked example: enabling validation with a debug messenger
Step 1: Enable the validation layer at instance creation
use std::ffi::CStr;
use vulkan_rust::vk;
use vk::*;
// The standard validation layer name.
let validation_layer = c"VK_LAYER_KHRONOS_validation";
let layer_names = [validation_layer.as_ptr()];
// The debug utils extension lets us receive callbacks.
use vk::extension_names::EXT_DEBUG_UTILS_EXTENSION_NAME;
let extension_names = [
EXT_DEBUG_UTILS_EXTENSION_NAME.as_ptr(),
];
let instance_info = InstanceCreateInfo::builder()
.enabled_layer_names(&layer_names)
.enabled_extension_names(&extension_names);
let instance = unsafe { entry.create_instance(&instance_info, None)? };
Step 2: Set up a debug messenger
The debug messenger calls your function whenever validation finds a problem.
use vulkan_rust::vk;
use vk::*;
// This callback receives validation messages.
// The signature must match PFN_vkDebugUtilsMessengerCallbackEXT.
unsafe extern "system" fn debug_callback(
severity: DebugUtilsMessageSeverityFlagsEXT,
message_type: DebugUtilsMessageTypeFlagsEXT,
callback_data: *const DebugUtilsMessengerCallbackDataEXT,
_user_data: *mut core::ffi::c_void,
) -> u32 {
let message = if !callback_data.is_null() {
let data = &*callback_data;
if !data.p_message.is_null() {
CStr::from_ptr(data.p_message).to_string_lossy()
} else {
std::borrow::Cow::Borrowed("(no message)")
}
} else {
std::borrow::Cow::Borrowed("(no callback data)")
};
if severity & DebugUtilsMessageSeverityFlagsEXT::ERROR
!= DebugUtilsMessageSeverityFlagsEXT::empty()
{
eprintln!("[VULKAN ERROR] {message}");
} else if severity & DebugUtilsMessageSeverityFlagsEXT::WARNING
!= DebugUtilsMessageSeverityFlagsEXT::empty()
{
eprintln!("[VULKAN WARNING] {message}");
}
    0 // VK_FALSE. Applications should always return 0; 1 (VK_TRUE) is reserved for layer development.
}
use vulkan_rust::vk;
use vk::*;
// Create the messenger.
let messenger_info = DebugUtilsMessengerCreateInfoEXT::builder()
.message_severity(
DebugUtilsMessageSeverityFlagsEXT::WARNING
| DebugUtilsMessageSeverityFlagsEXT::ERROR,
)
.message_type(
DebugUtilsMessageTypeFlagsEXT::GENERAL
| DebugUtilsMessageTypeFlagsEXT::VALIDATION
| DebugUtilsMessageTypeFlagsEXT::PERFORMANCE,
)
.pfn_user_callback(Some(debug_callback));
let messenger = unsafe {
instance.create_debug_utils_messenger_ext(&messenger_info, None)?
};
Step 3: Trigger an error (intentionally)
To verify validation is working, do something wrong on purpose:
use vulkan_rust::vk;
use vk::*;
// Create a buffer without TRANSFER_DST usage, then try to copy into it.
let bad_buffer_info = BufferCreateInfo::builder()
.size(1024)
.usage(BufferUsageFlags::VERTEX_BUFFER) // no TRANSFER_DST!
.sharing_mode(SharingMode::EXCLUSIVE);
let bad_buffer = unsafe { device.create_buffer(&bad_buffer_info, None)? };
// Recording a copy to this buffer will produce a validation error:
// "vkCmdCopyBuffer: dstBuffer was not created with VK_BUFFER_USAGE_TRANSFER_DST_BIT"
Step 4: Clean up
use vulkan_rust::vk;
// Destroy the messenger before destroying the instance.
unsafe {
instance.destroy_debug_utils_messenger_ext(messenger, None);
};
Message severity levels
| Severity | Meaning | Action |
|---|---|---|
| VERBOSE | Diagnostic noise (loader info, layer status) | Usually filtered out |
| INFO | Informational (resource creation, state changes) | Useful for deep debugging |
| WARNING | Potential problem (suboptimal usage, deprecated behavior) | Investigate |
| ERROR | Spec violation (undefined behavior if ignored) | Fix immediately |
Filter severity in the messenger creation to control verbosity. Most
applications enable WARNING | ERROR and only enable VERBOSE | INFO
when debugging specific issues.
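The filtering is a plain bitmask test: a message is surfaced only if its severity bit is present in the enabled mask. A sketch with integer stand-ins for the real DebugUtilsMessageSeverityFlagsEXT bits (the constants here are illustrative, not the actual Vulkan bit values):

```rust
// Severity bits modeled as a plain bitmask; these constants stand in
// for DebugUtilsMessageSeverityFlagsEXT values.
const VERBOSE: u32 = 0b0001;
const INFO: u32 = 0b0010;
const WARNING: u32 = 0b0100;
const ERROR: u32 = 0b1000;

// A message passes the filter only if its severity bit is enabled.
fn should_log(severity: u32, enabled: u32) -> bool {
    severity & enabled != 0
}
```

This is the same test the messenger performs internally using the mask you pass in `message_severity`, and the same shape as the `severity & ... != empty()` checks in the callback above.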
Message types
| Type | What it checks |
|---|---|
| GENERAL | General events (loader, layer lifecycle) |
| VALIDATION | Spec violations (the most important type) |
| PERFORMANCE | Suboptimal API usage that may hurt performance |
| DEVICE_ADDRESS_BINDING | Buffer device address binding events |
Catching errors during instance creation
There is a bootstrap problem: you need an instance to create a debug
messenger, but errors can occur during instance creation. The
solution: chain the messenger create info into the instance create info
via pNext. The validation layer will use it for messages during
create_instance:
use vulkan_rust::vk;
use vk::*;
let mut debug_info = DebugUtilsMessengerCreateInfoEXT::builder()
.message_severity(
DebugUtilsMessageSeverityFlagsEXT::WARNING
| DebugUtilsMessageSeverityFlagsEXT::ERROR,
)
.message_type(
DebugUtilsMessageTypeFlagsEXT::GENERAL
| DebugUtilsMessageTypeFlagsEXT::VALIDATION
| DebugUtilsMessageTypeFlagsEXT::PERFORMANCE,
)
.pfn_user_callback(Some(debug_callback));
// Chain into instance creation via pNext.
// DebugUtilsMessengerCreateInfoEXT implements ExtendsInstanceCreateInfo.
let instance_info = InstanceCreateInfo::builder()
.enabled_layer_names(&layer_names)
.enabled_extension_names(&extension_names)
.push_next(&mut debug_info);
// Validation errors during create_instance will now trigger the callback.
let instance = unsafe { entry.create_instance(&instance_info, None)? };
// After instance creation, create a persistent messenger for the
// rest of the application's lifetime.
let messenger = unsafe {
instance.create_debug_utils_messenger_ext(&debug_info, None)?
};
This is a practical example of pNext in action (see The pNext Extension Chain).
Common validation errors and what they mean
| Error message (abbreviated) | Cause | Fix |
|---|---|---|
| “not created with … usage” | Resource missing a usage flag | Add the required usage flag at creation |
| “layout is UNDEFINED but expected …” | Image in wrong layout | Add a pipeline barrier to transition |
| “access mask … not supported by stage …” | Access mask doesn’t match pipeline stage | Check the barrier recipes table |
| “must not be in RECORDING state” | Submitting a command buffer that wasn’t ended | Call end_command_buffer before submitting |
| “is still in use by the GPU” | Destroying an object the GPU is using | Wait for the fence before destroying |
| “extension not enabled” | Using an extension feature without enabling it | Add the extension to instance/device creation |
Performance impact
Validation layers add significant overhead:
- CPU time: Every API call is checked against the spec. Expect 2-5x slower CPU-side Vulkan calls.
- Memory: The layer tracks all objects and their state.
- GPU time: Minimal, but synchronization validation may serialize GPU work.
Always disable validation in release builds. A common pattern:
let enable_validation = cfg!(debug_assertions);
let layer_names: Vec<*const i8> = if enable_validation {
vec![c"VK_LAYER_KHRONOS_validation".as_ptr()]
} else {
vec![]
};
Formal reference
Key types
| Type | Purpose |
|---|---|
| DebugUtilsMessengerEXT | Handle to the debug messenger |
| DebugUtilsMessengerCreateInfoEXT | Configuration: severity filter, type filter, callback |
| DebugUtilsMessageSeverityFlagsEXT | Severity bitmask (VERBOSE, INFO, WARNING, ERROR) |
| DebugUtilsMessageTypeFlagsEXT | Type bitmask (GENERAL, VALIDATION, PERFORMANCE) |
Required extension
The debug messenger requires the VK_EXT_debug_utils instance
extension. Enable it with vk::extension_names::EXT_DEBUG_UTILS_EXTENSION_NAME.
Destruction order
- Destroy the debug messenger before destroying the instance.
- The pNext-chained messenger (for instance creation) is temporary and does not need separate destruction.
API reference links
- DebugUtilsMessengerEXT
- DebugUtilsMessengerCreateInfoEXT
- DebugUtilsMessageSeverityFlagsEXT
- Vulkan spec: Debugging
Key takeaways
- Always enable validation layers during development. They catch undefined behavior that would otherwise silently corrupt rendering.
- Set up a debug messenger callback to receive errors in your code. Don’t rely on console output; some platforms don’t have one.
- Chain `DebugUtilsMessengerCreateInfoEXT` into `InstanceCreateInfo` via pNext to catch errors during instance creation.
- Filter by severity (WARNING + ERROR) and type (VALIDATION + PERFORMANCE) for the best signal-to-noise ratio.
- Disable validation in release builds. The overhead is significant.
Load and Sample Textures
Task: Load an image from disk, upload it to GPU memory, and sample it in a fragment shader.
Prerequisites
You should be comfortable with:
- Memory Management (staging buffers, memory types)
- Command Buffers (one-shot transfers)
- Descriptor Sets (binding samplers)
- Synchronization (image layout transitions)
Overview
Sampling a texture in Vulkan requires several steps that OpenGL handled behind the scenes: creating a staging buffer, allocating a device-local image, transitioning layouts with pipeline barriers, copying data, and finally binding the image through a descriptor set. This recipe walks through each step.
Step 1: Load pixels from disk
Use the image crate to decode an image file into raw RGBA pixels.
let img = image::open("assets/texture.png")
.expect("Failed to open image")
.to_rgba8();
let (width, height) = img.dimensions();
let pixels = img.as_raw();
let image_size = (width * height * 4) as u64; // 4 bytes per RGBA pixel
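Before uploading, it is worth checking that the computed size matches what the decoder actually returned; a crate-free sketch of the arithmetic (function name hypothetical):

```rust
/// Expected byte size of a tightly packed RGBA8 image.
fn rgba8_byte_size(width: u32, height: u32) -> u64 {
    width as u64 * height as u64 * 4
}

fn main() {
    // A 256x128 RGBA8 texture occupies 131,072 bytes. Compare this
    // against pixels.len() to catch decode or format mismatches early.
    assert_eq!(rgba8_byte_size(256, 128), 131_072);
}
```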
Step 2: Create a staging buffer
The CPU cannot write directly to device-local memory on most hardware. Upload the pixels into a host-visible staging buffer first.
use vulkan_rust::vk;
use vk::*;
let staging_info = BufferCreateInfo::builder()
.size(image_size)
.usage(BufferUsageFlags::TRANSFER_SRC)
.sharing_mode(SharingMode::EXCLUSIVE);
let staging_buffer = unsafe { device.create_buffer(&staging_info, None) }
.expect("Failed to create staging buffer");
let staging_reqs = unsafe { device.get_buffer_memory_requirements(staging_buffer) };
let staging_memory = allocate_and_bind_buffer(
device,
staging_buffer,
&staging_reqs,
&mem_properties,
MemoryPropertyFlags::HOST_VISIBLE | MemoryPropertyFlags::HOST_COHERENT,
);
// Map, copy pixels, unmap.
unsafe {
let ptr = device.map_memory(
staging_memory, 0, image_size,
MemoryMapFlags::empty(),
)
.expect("Failed to map memory");
core::ptr::copy_nonoverlapping(
pixels.as_ptr(), ptr as *mut u8, image_size as usize,
);
device.unmap_memory(staging_memory);
}
See Memory Management for the `allocate_and_bind_buffer` helper and the `find_memory_type` algorithm.
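At its core, the memory-type search referenced above is a pair of bit tests per memory type; a crate-agnostic sketch with the property flags simplified to raw bitmasks (names hypothetical):

```rust
/// Find the index of a memory type that is allowed by `type_filter`
/// (the bitmask from the memory requirements) and has every `required`
/// property flag set. Returns None if no suitable type exists.
fn find_memory_type(type_filter: u32, type_flags: &[u32], required: u32) -> Option<usize> {
    type_flags.iter().enumerate().find_map(|(i, &flags)| {
        let allowed = type_filter & (1 << i) != 0;
        let has_props = flags & required == required;
        (allowed && has_props).then_some(i)
    })
}

fn main() {
    // Two memory types: type 0 has flags 0b001 (say, DEVICE_LOCAL),
    // type 1 has flags 0b110 (say, HOST_VISIBLE | HOST_COHERENT).
    let types = [0b001, 0b110];
    assert_eq!(find_memory_type(0b11, &types, 0b110), Some(1));
    assert_eq!(find_memory_type(0b11, &types, 0b001), Some(0));
    // Filter excludes type 1, so the host-visible request fails.
    assert_eq!(find_memory_type(0b01, &types, 0b110), None);
}
```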
Step 3: Create the device-local image
The image needs TRANSFER_DST (we will copy into it) and SAMPLED
(the fragment shader will sample it).
use vulkan_rust::vk;
use vk::*;
let image_info = ImageCreateInfo::builder()
.image_type(ImageType::_2D)
.format(Format::R8G8B8A8_SRGB)
.extent(Extent3D { width, height, depth: 1 })
.mip_levels(1)
.array_layers(1)
.samples(SampleCountFlagBits::_1)
.tiling(ImageTiling::OPTIMAL)
.usage(
ImageUsageFlags::TRANSFER_DST
| ImageUsageFlags::SAMPLED
)
.sharing_mode(SharingMode::EXCLUSIVE)
.initial_layout(ImageLayout::UNDEFINED);
let texture_image = unsafe { device.create_image(&image_info, None) }
.expect("Failed to create image");
// Allocate DEVICE_LOCAL memory and bind it to the image.
let img_reqs = unsafe { device.get_image_memory_requirements(texture_image) };
let texture_memory = allocate_and_bind_image(
device, texture_image, &img_reqs, &mem_properties,
MemoryPropertyFlags::DEVICE_LOCAL,
);
Step 4: Transition layout UNDEFINED to TRANSFER_DST_OPTIMAL
Before copying into the image, transition it to a layout the transfer engine can write to. This requires a pipeline barrier.
Before reading on: why can’t we just copy into an image that is in UNDEFINED layout? What does the layout tell the driver?
use vulkan_rust::vk;
use vk::*;
use vk::constants::*; // for QUEUE_FAMILY_IGNORED
let barrier_to_transfer = ImageMemoryBarrier::builder()
.old_layout(ImageLayout::UNDEFINED)
.new_layout(ImageLayout::TRANSFER_DST_OPTIMAL)
.src_queue_family_index(QUEUE_FAMILY_IGNORED)
.dst_queue_family_index(QUEUE_FAMILY_IGNORED)
.image(texture_image)
.subresource_range(ImageSubresourceRange {
aspect_mask: ImageAspectFlags::COLOR,
base_mip_level: 0,
level_count: 1,
base_array_layer: 0,
layer_count: 1,
})
// No prior access to wait for (image was UNDEFINED).
.src_access_mask(AccessFlags::NONE)
// The transfer write must wait until the transition completes.
.dst_access_mask(AccessFlags::TRANSFER_WRITE);
unsafe {
device.cmd_pipeline_barrier(
cmd,
PipelineStageFlags::TOP_OF_PIPE, // src stage: nothing before
PipelineStageFlags::TRANSFER, // dst stage: transfer write
DependencyFlags::empty(),
&[], // memory barriers
&[], // buffer memory barriers
&[*barrier_to_transfer],
);
}
See Synchronization for a deeper explanation of pipeline barriers and access masks.
Step 5: Copy staging buffer to image
use vulkan_rust::vk;
use vk::*;
let region = BufferImageCopy {
buffer_offset: 0,
// 0 means tightly packed (no padding between rows).
buffer_row_length: 0,
buffer_image_height: 0,
image_subresource: ImageSubresourceLayers {
aspect_mask: ImageAspectFlags::COLOR,
mip_level: 0,
base_array_layer: 0,
layer_count: 1,
},
image_offset: Offset3D { x: 0, y: 0, z: 0 },
image_extent: Extent3D { width, height, depth: 1 },
};
unsafe {
device.cmd_copy_buffer_to_image(
cmd,
staging_buffer,
texture_image,
ImageLayout::TRANSFER_DST_OPTIMAL,
&[region],
);
}
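The zero values for `buffer_row_length` and `buffer_image_height` mean tightly packed; when nonzero they give the stride in texels. The addressing the transfer engine uses then works out to the following (a sketch, names hypothetical):

```rust
/// Byte offset of texel (x, y) within a buffer-image copy region,
/// for a 2D image whose buffer rows are `row_length` texels wide.
fn texel_offset(buffer_offset: u64, row_length: u32, x: u32, y: u32, texel_size: u64) -> u64 {
    buffer_offset + (y as u64 * row_length as u64 + x as u64) * texel_size
}

fn main() {
    // RGBA8 (4 bytes per texel) with rows padded to 300 texels:
    // texel (10, 2) sits (2 * 300 + 10) * 4 = 2440 bytes in.
    assert_eq!(texel_offset(0, 300, 10, 2, 4), 2_440);
}
```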
Step 6: Transition layout TRANSFER_DST to SHADER_READ_ONLY
After the copy, transition the image to a layout the shader can read.
use vulkan_rust::vk;
use vk::*;
use vk::constants::*; // for QUEUE_FAMILY_IGNORED
let barrier_to_shader = ImageMemoryBarrier::builder()
.old_layout(ImageLayout::TRANSFER_DST_OPTIMAL)
.new_layout(ImageLayout::SHADER_READ_ONLY_OPTIMAL)
.src_queue_family_index(QUEUE_FAMILY_IGNORED)
.dst_queue_family_index(QUEUE_FAMILY_IGNORED)
.image(texture_image)
.subresource_range(ImageSubresourceRange {
aspect_mask: ImageAspectFlags::COLOR,
base_mip_level: 0,
level_count: 1,
base_array_layer: 0,
layer_count: 1,
})
.src_access_mask(AccessFlags::TRANSFER_WRITE)
.dst_access_mask(AccessFlags::SHADER_READ);
unsafe {
device.cmd_pipeline_barrier(
cmd,
PipelineStageFlags::TRANSFER,
PipelineStageFlags::FRAGMENT_SHADER,
DependencyFlags::empty(),
&[], &[],
&[*barrier_to_shader],
);
}
Step 7: Create image view and sampler
The shader does not access images directly. It reads through an image view (which selects format, mip levels, and array layers) and a sampler (which controls filtering and addressing).
use vulkan_rust::vk;
use vk::*;
let view_info = ImageViewCreateInfo::builder()
.image(texture_image)
.view_type(ImageViewType::_2D)
.format(Format::R8G8B8A8_SRGB)
.subresource_range(ImageSubresourceRange {
aspect_mask: ImageAspectFlags::COLOR,
base_mip_level: 0,
level_count: 1,
base_array_layer: 0,
layer_count: 1,
});
let texture_view = unsafe { device.create_image_view(&view_info, None) }
.expect("Failed to create image view");
let sampler_info = SamplerCreateInfo::builder()
.mag_filter(Filter::LINEAR)
.min_filter(Filter::LINEAR)
.address_mode_u(SamplerAddressMode::REPEAT)
.address_mode_v(SamplerAddressMode::REPEAT)
.address_mode_w(SamplerAddressMode::REPEAT)
// Requires the samplerAnisotropy device feature to be enabled.
// Set anisotropy_enable(false) if the feature is not available.
.anisotropy_enable(true)
.max_anisotropy(16.0)
.border_color(BorderColor::INT_OPAQUE_BLACK)
.mipmap_mode(SamplerMipmapMode::LINEAR)
.min_lod(0.0)
.max_lod(0.0);
let sampler = unsafe { device.create_sampler(&sampler_info, None) }
.expect("Failed to create sampler");
Step 8: Bind via descriptor set
Update a descriptor set so the shader can access the combined image/sampler pair at a binding point.
use vulkan_rust::vk;
use vk::*;
let image_descriptor = DescriptorImageInfo {
sampler,
image_view: texture_view,
image_layout: ImageLayout::SHADER_READ_ONLY_OPTIMAL,
};
let image_infos = [image_descriptor];
let write = WriteDescriptorSet::builder()
.dst_set(descriptor_set)
.dst_binding(1) // must match the binding in the shader
.dst_array_element(0)
.descriptor_type(DescriptorType::COMBINED_IMAGE_SAMPLER)
.image_info(&image_infos); // bind the array so it outlives the builder
unsafe { device.update_descriptor_sets(&[*write], &[]) };
In the fragment shader (GLSL):
layout(set = 0, binding = 1) uniform sampler2D texSampler;
void main() {
outColor = texture(texSampler, fragTexCoord);
}
See Descriptor Sets for descriptor pool creation and layout setup.
Cleanup
Because vulkan_rust handles do not implement Drop, you must destroy
resources manually when they are no longer needed.
// Wait for the GPU to finish using these resources first.
unsafe {
device.device_wait_idle()
.expect("Failed to wait for device idle");
device.destroy_sampler(sampler, None);
device.destroy_image_view(texture_view, None);
device.destroy_image(texture_image, None);
device.free_memory(texture_memory, None);
// Staging buffer should already be destroyed after the upload.
}
Notes
- Format choice. `R8G8B8A8_SRGB` is correct for most color textures. Use `R8G8B8A8_UNORM` for data textures (normal maps, roughness) where sRGB gamma correction would be wrong.
- Mipmaps. This recipe creates a single mip level. For proper texture filtering at a distance, generate a full mip chain using `cmd_blit_image` in a loop, with a barrier between each level.
- One-shot command buffer. Steps 4 through 6 are typically recorded into a short-lived command buffer that is submitted and waited on immediately. Reuse command buffers from a transient pool for this.
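A full mip chain has floor(log2(max(width, height))) + 1 levels; a small helper for computing the count (name hypothetical, assumes nonzero dimensions):

```rust
/// Number of mip levels in a full chain down to 1x1.
/// Equivalent to floor(log2(max(w, h))) + 1 for nonzero inputs.
fn mip_level_count(width: u32, height: u32) -> u32 {
    32 - width.max(height).leading_zeros()
}

fn main() {
    assert_eq!(mip_level_count(512, 512), 10); // 512, 256, ..., 1
    assert_eq!(mip_level_count(1024, 768), 11);
    assert_eq!(mip_level_count(1, 1), 1);
}
```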
Implement Double Buffering
Task: Set up frames-in-flight so the CPU records frame N+1 while the GPU renders frame N.
Prerequisites
- Synchronization (fences, semaphores)
- Command Buffers
- Hello Triangle, Part 4 (basic render loop)
The problem
In a single-buffered render loop, the CPU submits a frame and then waits for the GPU to finish before it can start recording the next frame. This means the CPU sits idle during GPU rendering, and the GPU sits idle during CPU recording. You get roughly half the throughput you could.
Single buffered:
CPU: [record 0]...........[record 1]...........[record 2]...
GPU: ...........[render 0]...........[render 1]...........[render 2]
└── idle ──┘ └── idle ──┘
With double buffering (two frames in flight), the CPU records the next frame while the GPU is still rendering the current one:
Double buffered:
CPU: [record 0][record 1][record 2][record 3]...
GPU: .......[render 0][render 1][render 2][render 3]...
The overlap keeps both processors busy.
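A back-of-the-envelope model makes the gain concrete (purely illustrative numbers, not measurements):

```rust
/// Total wall time if CPU recording and GPU rendering run strictly
/// back to back (single buffered).
fn single_buffered_ms(frames: u64, record_ms: u64, render_ms: u64) -> u64 {
    frames * (record_ms + render_ms)
}

/// Total wall time if recording frame N+1 overlaps rendering frame N:
/// after the first record, the slower of the two stages dominates.
fn double_buffered_ms(frames: u64, record_ms: u64, render_ms: u64) -> u64 {
    record_ms + frames * record_ms.max(render_ms)
}

fn main() {
    // 100 frames, 4 ms of CPU recording, 8 ms of GPU rendering.
    assert_eq!(single_buffered_ms(100, 4, 8), 1200);
    assert_eq!(double_buffered_ms(100, 4, 8), 804);
}
```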
Step 1: Define the frame count
Two frames in flight is the standard choice. Three is occasionally used, but adds latency without much throughput gain on most hardware.
const MAX_FRAMES_IN_FLIGHT: usize = 2;
Step 2: Create per-frame synchronization objects
Each frame in flight needs its own set of sync primitives:
- Fence: the CPU waits on this before reusing the frame’s resources.
- Image-available semaphore: signals when the swapchain image is ready to be rendered into.
- Render-finished semaphore: signals when rendering is done and the image can be presented.
Before reading on: why does each frame need its own fence? What would go wrong if all frames shared a single fence?
use vulkan_rust::vk;
use vk::*;
struct FrameSync {
in_flight_fence: Fence,
image_available: Semaphore,
render_finished: Semaphore,
}
let fence_info = FenceCreateInfo::builder()
.flags(FenceCreateFlags::SIGNALED); // start signaled so frame 0 doesn't deadlock
let semaphore_info = SemaphoreCreateInfo::builder();
let mut frame_sync = Vec::with_capacity(MAX_FRAMES_IN_FLIGHT);
for _ in 0..MAX_FRAMES_IN_FLIGHT {
let sync = FrameSync {
in_flight_fence: unsafe { device.create_fence(&fence_info, None) }
.expect("Failed to create fence"),
image_available: unsafe { device.create_semaphore(&semaphore_info, None) }
.expect("Failed to create semaphore"),
render_finished: unsafe { device.create_semaphore(&semaphore_info, None) }
.expect("Failed to create semaphore"),
};
frame_sync.push(sync);
}
Note the SIGNALED flag on the fences. The render loop starts by
waiting on the fence, so frame 0 needs the fence to be signaled already
or the first wait_for_fences call will block forever.
Step 3: Create per-frame command buffers
Each frame in flight needs its own command buffer so the CPU can record into one while the GPU executes the other.
use vulkan_rust::vk;
use vk::*;
let alloc_info = CommandBufferAllocateInfo::builder()
.command_pool(command_pool)
.level(CommandBufferLevel::PRIMARY)
.command_buffer_count(MAX_FRAMES_IN_FLIGHT as u32);
let command_buffers = unsafe {
device.allocate_command_buffers(&alloc_info)
}
.expect("Failed to allocate command buffers");
Step 4: The render loop
The frame index cycles through 0..MAX_FRAMES_IN_FLIGHT. Each
iteration uses only the resources belonging to that frame index.
use vulkan_rust::vk;
use vk::*;
let mut current_frame: usize = 0;
loop {
// Handle window events (poll_events, etc.)
// ...
let sync = &frame_sync[current_frame];
let cmd = command_buffers[current_frame];
unsafe {
// --- 1. Wait for this frame's previous submission to finish ---
device.wait_for_fences(&[sync.in_flight_fence], true, u64::MAX)
.expect("Failed to wait for fence");
// --- 2. Acquire the next swapchain image ---
let image_index = device.acquire_next_image_khr(
swapchain,
u64::MAX,
sync.image_available, // signaled when image is ready
Fence::null(),
)
.expect("Failed to acquire swapchain image");
// --- 3. Reset the fence only after we know we will submit work ---
// Resetting before acquire_next_image could deadlock if acquire fails.
device.reset_fences(&[sync.in_flight_fence])
.expect("Failed to reset fence");
// --- 4. Record commands ---
device.reset_command_buffer(cmd, CommandBufferResetFlags::empty())
.expect("Failed to reset command buffer");
let begin_info = CommandBufferBeginInfo::builder();
device.begin_command_buffer(cmd, &begin_info)
.expect("Failed to begin command buffer");
// ... record render pass, draw calls, etc. ...
device.end_command_buffer(cmd)
.expect("Failed to end command buffer");
// --- 5. Submit ---
let wait_semaphores = [sync.image_available];
let wait_stages = [PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];
let signal_semaphores = [sync.render_finished];
let command_buffers_to_submit = [cmd];
let submit_info = SubmitInfo::builder()
.wait_semaphores(&wait_semaphores)
.wait_dst_stage_mask(&wait_stages)
.command_buffers(&command_buffers_to_submit)
.signal_semaphores(&signal_semaphores);
device.queue_submit(
graphics_queue,
&[*submit_info],
sync.in_flight_fence, // signal this fence when done
)
.expect("Failed to submit");
// --- 6. Present ---
let swapchains = [swapchain];
let image_indices = [image_index];
let present_info = PresentInfoKHR::builder()
.wait_semaphores(&signal_semaphores)
.swapchains(&swapchains)
.image_indices(&image_indices);
device.queue_present_khr(graphics_queue, &present_info)
.expect("Failed to present");
}
// --- 7. Advance frame index ---
current_frame = (current_frame + 1) % MAX_FRAMES_IN_FLIGHT;
}
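The frame-index advance in step 7 is plain modular arithmetic; each iteration touches only the resources belonging to its index:

```rust
const MAX_FRAMES_IN_FLIGHT: usize = 2;

/// Frame indices produced by `(current + 1) % MAX_FRAMES_IN_FLIGHT`.
fn frame_indices(iterations: usize) -> Vec<usize> {
    (0..iterations).map(|i| i % MAX_FRAMES_IN_FLIGHT).collect()
}

fn main() {
    // With two frames in flight, the loop alternates 0, 1, 0, 1, ...
    assert_eq!(frame_indices(6), [0, 1, 0, 1, 0, 1]);
}
```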
Step 5: Clean shutdown
Before destroying anything, wait for all frames to finish.
unsafe {
device.device_wait_idle()
.expect("Failed to wait for device idle");
for sync in &frame_sync {
device.destroy_fence(sync.in_flight_fence, None);
device.destroy_semaphore(sync.image_available, None);
device.destroy_semaphore(sync.render_finished, None);
}
}
The synchronization flow
Each frame follows this dependency chain:
wait_for_fences(in_flight_fence) CPU blocks until frame N-2 is done
│
acquire_next_image(image_available) GPU signals when image is ready
│
reset_fences(in_flight_fence) Safe to reset now
│
record commands CPU work, no GPU dependency
│
queue_submit( GPU work begins
wait: image_available, Wait for image before color output
signal: render_finished, Signal when rendering is done
fence: in_flight_fence Signal fence when fully complete
)
│
queue_present( Present to screen
wait: render_finished Wait for rendering before presenting
)
Common mistakes
Fence reset before acquire. If you reset the fence before
acquire_next_image, and the acquire call returns an error (e.g.
OUT_OF_DATE_KHR), the fence stays unsignaled. The next iteration will
wait on it forever. Always reset the fence after a successful acquire.
Sharing command buffers. If two frames in flight use the same command buffer, the CPU might overwrite it while the GPU is still reading it. Always use one command buffer per frame in flight.
Forgetting SIGNALED on initial fences. The loop starts with
wait_for_fences. If the fence starts unsignaled, the first frame
deadlocks.
Notes
- Triple buffering. Setting `MAX_FRAMES_IN_FLIGHT = 3` adds one more frame of latency but can help if the CPU or GPU has variable frame times. Measure before committing to it.
- Swapchain images vs frames in flight. The number of swapchain images (typically 2 or 3) is independent of `MAX_FRAMES_IN_FLIGHT`. Frames in flight control CPU/GPU overlap; swapchain image count controls how many images the presentation engine juggles.
- Resize handling. When the swapchain is recreated after a window resize, you need to wait for all in-flight frames to finish first. See Handle Window Resize.
Handle Window Resize
Task: Detect window resize events and recreate the swapchain without crashing or leaking resources.
Prerequisites
- Hello Triangle, Part 2 (swapchain creation)
- Synchronization (device idle)
- Implement Double Buffering (frames in flight)
The problem
When the window is resized, the swapchain images no longer match the window dimensions. Vulkan tells you this has happened through two mechanisms:
- `acquire_next_image_khr` or `queue_present_khr` returns `ERROR_OUT_OF_DATE`, meaning the swapchain is no longer compatible with the surface.
- `queue_present_khr` returns `SUBOPTIMAL`, meaning the swapchain still works but no longer matches the surface properties perfectly.
In either case, you must recreate the swapchain (and everything that depends on its images) before rendering can continue.
Step 1: Detect the resize
Track resize events from your windowing library and from Vulkan return codes.
let mut framebuffer_resized = false;
// In your window event handler (winit example):
match event {
WindowEvent::Resized(_) => {
framebuffer_resized = true;
}
_ => {}
}
In the render loop, check both the flag and the Vulkan result codes:
use vulkan_rust::vk;
use vk::*;
use vk::Result as VkError;
let acquire_result = unsafe {
device.acquire_next_image_khr(
swapchain, u64::MAX, image_available_semaphore, Fence::null(),
)
};
let image_index = match acquire_result {
Ok(index) => index,
Err(VkError::ERROR_OUT_OF_DATE) => {
recreate_swapchain(/* ... */);
continue; // restart this loop iteration
}
Err(e) => panic!("Failed to acquire swapchain image: {e:?}"),
};
// ... record and submit ...
let present_result = unsafe {
device.queue_present_khr(graphics_queue, &present_info)
};
match present_result {
Ok(_) => {}
Err(VkError::ERROR_OUT_OF_DATE | VkError::SUBOPTIMAL) => {
framebuffer_resized = false;
recreate_swapchain(/* ... */);
}
Err(e) => panic!("Failed to present: {e:?}"),
}
// Also check the manual flag (some platforms don't always return OUT_OF_DATE).
if framebuffer_resized {
framebuffer_resized = false;
recreate_swapchain(/* ... */);
}
Before reading on: why do we check `framebuffer_resized` separately from the Vulkan error codes? Why not rely on `OUT_OF_DATE_KHR` alone?
Some window systems (notably X11) do not always report out-of-date when the window is resized. The manual flag from the window event handler catches those cases.
Step 2: Wait for the GPU
Before destroying any swapchain-related resources, all in-flight work must finish.
unsafe { device.device_wait_idle() }
.expect("Failed to wait for device idle");
This is simple and correct. For higher performance you could track
individual fences per swapchain image, but device_wait_idle is the
right choice for a resize path that runs infrequently.
Step 3: Destroy old resources
Destroy everything that depends on the swapchain images, in reverse creation order.
// Destroy framebuffers (one per swapchain image).
for &fb in &swapchain_framebuffers {
unsafe { device.destroy_framebuffer(fb, None); }
}
// Destroy image views (one per swapchain image).
for &view in &swapchain_image_views {
unsafe { device.destroy_image_view(view, None); }
}
// Do NOT destroy the old swapchain yet, we pass it to the new one.
You do not need to destroy the swapchain images themselves. They are owned by the swapchain and will be cleaned up when the old swapchain is destroyed.
Step 4: Query new surface capabilities
The surface extent may have changed, so re-query it.
use vulkan_rust::vk;
use vk::*;
let surface_caps = unsafe {
instance.get_physical_device_surface_capabilities_khr(physical_device, surface)
}
.expect("Failed to query surface capabilities");
let new_extent = if surface_caps.current_extent.width != u32::MAX {
// The surface has a defined size, use it.
surface_caps.current_extent
} else {
// The surface size is undefined (e.g. Wayland), clamp to limits.
let window_size = window.inner_size();
Extent2D {
width: window_size.width.clamp(
surface_caps.min_image_extent.width,
surface_caps.max_image_extent.width,
),
height: window_size.height.clamp(
surface_caps.min_image_extent.height,
surface_caps.max_image_extent.height,
),
}
};
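The extent selection can be exercised without a live surface; a self-contained sketch of the same rule (the local `Extent2D` mirrors the Vulkan struct, `choose_extent` is a hypothetical name):

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
struct Extent2D { width: u32, height: u32 }

/// Pick the swapchain extent: honor a defined current_extent,
/// otherwise clamp the window size to the surface limits.
fn choose_extent(current: Extent2D, window: Extent2D, min: Extent2D, max: Extent2D) -> Extent2D {
    if current.width != u32::MAX {
        current
    } else {
        Extent2D {
            width: window.width.clamp(min.width, max.width),
            height: window.height.clamp(min.height, max.height),
        }
    }
}

fn main() {
    let min = Extent2D { width: 1, height: 1 };
    let max = Extent2D { width: 4096, height: 4096 };
    // Undefined surface extent (e.g. Wayland): clamp an oversized window.
    let undefined = Extent2D { width: u32::MAX, height: u32::MAX };
    let window = Extent2D { width: 5000, height: 2000 };
    let picked = choose_extent(undefined, window, min, max);
    assert_eq!(picked, Extent2D { width: 4096, height: 2000 });
}
```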
Step 5: Handle minimized windows
When a window is minimized, the surface extent can be (0, 0). You
cannot create a swapchain with zero dimensions. Pause the render loop
until the window is restored.
if new_extent.width == 0 || new_extent.height == 0 {
// Window is minimized. Wait for a resize event before continuing.
// With winit, use Event::MainEventsCleared to avoid busy-waiting.
return Ok(());
}
Step 6: Create the new swapchain
Pass the old swapchain handle to old_swapchain. This lets the driver
reuse internal resources and can make the transition smoother.
use vulkan_rust::vk;
use vk::*;
let old_swapchain = swapchain; // save the handle
let swapchain_info = SwapchainCreateInfoKHR::builder()
.surface(surface)
.min_image_count(desired_image_count)
.image_format(surface_format.format)
.image_color_space(surface_format.color_space)
.image_extent(new_extent)
.image_array_layers(1)
.image_usage(ImageUsageFlags::COLOR_ATTACHMENT)
.image_sharing_mode(SharingMode::EXCLUSIVE)
.pre_transform(surface_caps.current_transform)
.composite_alpha(CompositeAlphaFlagBitsKHR::OPAQUE)
.present_mode(present_mode)
.clipped(true)
.old_swapchain(old_swapchain); // <-- reuse hint
swapchain = unsafe { device.create_swapchain_khr(&swapchain_info, None) }
.expect("Failed to create swapchain");
// Now destroy the old swapchain.
unsafe { device.destroy_swapchain_khr(old_swapchain, None); }
Step 7: Recreate image views and framebuffers
The new swapchain has new images, so create fresh image views and framebuffers.
use vulkan_rust::vk;
use vk::*;
let swapchain_images = unsafe { device.get_swapchain_images_khr(swapchain) }
.expect("Failed to get swapchain images");
swapchain_image_views = swapchain_images
.iter()
.map(|&image| {
let view_info = ImageViewCreateInfo::builder()
.image(image)
.view_type(ImageViewType::_2D)
.format(surface_format.format)
.subresource_range(ImageSubresourceRange {
aspect_mask: ImageAspectFlags::COLOR,
base_mip_level: 0,
level_count: 1,
base_array_layer: 0,
layer_count: 1,
});
unsafe { device.create_image_view(&view_info, None) }
.expect("Failed to create image view")
})
.collect();
swapchain_framebuffers = swapchain_image_views
.iter()
.map(|&view| {
let attachments = [view];
let fb_info = FramebufferCreateInfo::builder()
.render_pass(render_pass)
.attachments(&attachments)
.width(new_extent.width)
.height(new_extent.height)
.layers(1);
unsafe { device.create_framebuffer(&fb_info, None) }
.expect("Failed to create framebuffer")
})
.collect();
Putting it all together
A helper function that bundles the recreation logic:
use vulkan_rust::vk;
use vk::*;
fn recreate_swapchain(
instance: &vulkan_rust::Instance,
device: &vulkan_rust::Device,
physical_device: PhysicalDevice,
surface: SurfaceKHR,
window: &winit::window::Window,
render_pass: RenderPass,
swapchain: &mut SwapchainKHR,
swapchain_image_views: &mut Vec<ImageView>,
swapchain_framebuffers: &mut Vec<Framebuffer>,
surface_format: SurfaceFormatKHR,
present_mode: PresentModeKHR,
) -> Extent2D {
unsafe {
device.device_wait_idle()
.expect("Failed to wait for device idle");
// Destroy old framebuffers and image views.
for &fb in swapchain_framebuffers.iter() {
device.destroy_framebuffer(fb, None);
}
for &view in swapchain_image_views.iter() {
device.destroy_image_view(view, None);
}
}
// Query new extent, create new swapchain, views, framebuffers.
// ... (Steps 4 through 7 from above) ...
new_extent
}
Common mistakes
Forgetting to update the viewport and scissor. If you use dynamic viewport/scissor state (which you should), update them to the new extent each frame. If you baked them into the pipeline, you need to recreate the pipeline too.
Leaking old image views. Every create_image_view must have a
matching destroy_image_view. If you overwrite the Vec without
destroying the old views first, those handles leak.
Not handling SUBOPTIMAL. SUBOPTIMAL from queue_present_khr is
not a fatal error, but ignoring it means you render at the wrong
resolution until something else triggers an ERROR_OUT_OF_DATE.
Notes
- Depth buffers. If your render pass uses a depth attachment, you must also recreate the depth image, its memory, and its image view when the swapchain extent changes.
- Render pass compatibility. The render pass itself does not depend on the swapchain extent, only on the image format. You do not need to recreate it unless the surface format changes (which is extremely rare).
- Dynamic state. Using
DynamicState::VIEWPORTandDynamicState::SCISSORavoids having to recreate the pipeline on resize. This is the recommended approach.
Use Push Constants
Task: Pass small, frequently-changing data (like a model matrix) to shaders without descriptor sets or buffer allocations.
Prerequisites
- Pipelines (pipeline layout)
- Descriptor Sets (for comparison with uniform buffers)
What push constants are
Push constants are a small block of data written directly into the command buffer. Unlike uniform buffers, they require no buffer allocation, no memory binding, and no descriptor set update. You declare a range in the pipeline layout, record the data inline during command recording, and the shader reads it.
The tradeoff is size: the Vulkan spec guarantees at least 128 bytes of push constant storage. Most desktop GPUs offer 256 bytes. This is enough for a 4x4 matrix (64 bytes) plus a handful of scalar parameters, but not enough for large data sets.
When to use push constants vs uniform buffers
| Criterion | Push constants | Uniform buffers |
|---|---|---|
| Size | Up to 128-256 bytes | Unlimited |
| Setup cost | None (inline in command buffer) | Allocate buffer, bind memory, write descriptor |
| Per-draw update | Free (just cmd_push_constants) | Requires dynamic offsets or multiple descriptors |
| Best for | Model matrix, time, material index | Large arrays, shared view/projection data |
Rule of thumb: if the data changes per draw call and fits in 128 bytes, use push constants. For anything larger or shared across many draws, use a uniform buffer.
Step 1: Define the push constant data
Create a #[repr(C)] struct that matches the layout the shader expects.
#[repr(C)]
#[derive(Clone, Copy)]
struct PushConstants {
model: [f32; 16], // 4x4 matrix, 64 bytes
time: f32, // 4 bytes
_padding: [f32; 3], // align to 16 bytes if needed
}
Before reading on: why does the struct need
#[repr(C)]? What would happen if Rust reordered the fields?
#[repr(C)] guarantees that the fields are laid out in declaration
order with C-compatible alignment. Without it, the Rust compiler may
reorder fields, and the shader would read garbage.
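A cheap compile-time assertion catches layout drift against the 128-byte guaranteed minimum; this sketch reuses the struct from step 1 (the sizes follow directly from its fields):

```rust
#[repr(C)]
#[derive(Clone, Copy)]
struct PushConstants {
    model: [f32; 16],   // 64 bytes
    time: f32,          // 4 bytes
    _padding: [f32; 3], // 12 bytes
}

// Fails the build, not a draw call, if the struct ever outgrows the
// spec-guaranteed minimum push constant budget.
const _: () = assert!(std::mem::size_of::<PushConstants>() <= 128);

fn main() {
    assert_eq!(std::mem::size_of::<PushConstants>(), 80);
}
```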
Step 2: Declare push constant range in the pipeline layout
The push constant range tells Vulkan how many bytes of push constant data your shaders use and which stages access them.
use vulkan_rust::vk;
use vk::*;
let push_constant_range = PushConstantRange {
stage_flags: ShaderStageFlags::VERTEX,
offset: 0,
size: std::mem::size_of::<PushConstants>() as u32,
};
let push_ranges = [push_constant_range];
let layout_info = PipelineLayoutCreateInfo::builder()
.set_layouts(&descriptor_set_layouts) // can be empty if you have no descriptors
.push_constant_ranges(&push_ranges);
let pipeline_layout = unsafe {
device.create_pipeline_layout(&layout_info, None)
}
.expect("Failed to create pipeline layout");
If both vertex and fragment shaders read push constants, you have two options:
- One range with `stage_flags: VERTEX | FRAGMENT` if both stages read the same bytes.
- Two ranges at different offsets if each stage reads different data.
use vulkan_rust::vk;
use vk::*;
// Example: vertex reads bytes 0..64, fragment reads bytes 64..80.
let ranges = [
PushConstantRange {
stage_flags: ShaderStageFlags::VERTEX,
offset: 0,
size: 64,
},
PushConstantRange {
stage_flags: ShaderStageFlags::FRAGMENT,
offset: 64,
size: 16,
},
];
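A startup check that the declared ranges stay within the device budget is easy to write; a sketch where the local `Range` mirrors `PushConstantRange`'s offset/size fields (helper name hypothetical):

```rust
#[derive(Clone, Copy)]
struct Range { offset: u32, size: u32 }

/// True if every push constant range ends within `limit` bytes.
/// Compare `limit` against max_push_constants_size at startup.
fn ranges_within_limit(ranges: &[Range], limit: u32) -> bool {
    ranges.iter().all(|r| r.offset + r.size <= limit)
}

fn main() {
    let ranges = [
        Range { offset: 0, size: 64 },  // vertex: bytes 0..64
        Range { offset: 64, size: 16 }, // fragment: bytes 64..80
    ];
    assert!(ranges_within_limit(&ranges, 128));
    assert!(!ranges_within_limit(&ranges, 64));
}
```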
Step 3: Declare push constants in the shader
In GLSL, push constants appear as a uniform block with the
push_constant layout qualifier.
Vertex shader:
#version 450
layout(push_constant) uniform PushConstants {
mat4 model;
float time;
} pc;
layout(location = 0) in vec3 inPosition;
void main() {
gl_Position = pc.model * vec4(inPosition, 1.0);
}
There can be only one push_constant block per shader stage. The block
members must match the byte layout of your Rust struct.
Step 4: Record push constants during command recording
Use cmd_push_constants to write the data into the command buffer. This
is typically called once per draw, right before the draw command.
use vulkan_rust::vk;
use vk::*;
let push_data = PushConstants {
model: compute_model_matrix(entity),
time: elapsed_seconds,
_padding: [0.0; 3],
};
unsafe {
device.cmd_push_constants(
cmd,
pipeline_layout,
ShaderStageFlags::VERTEX,
0, // offset in bytes
std::slice::from_raw_parts(
&push_data as *const PushConstants as *const u8,
std::mem::size_of::<PushConstants>(),
),
);
device.cmd_draw(cmd, vertex_count, 1, 0, 0);
}
For a scene with many objects, you push new constants before each draw:
use vulkan_rust::vk;
use vk::*;
for entity in &scene.entities {
let push_data = PushConstants {
model: entity.transform,
time: elapsed_seconds,
_padding: [0.0; 3],
};
unsafe {
device.cmd_push_constants(
cmd, pipeline_layout,
ShaderStageFlags::VERTEX,
0,
std::slice::from_raw_parts(
&push_data as *const PushConstants as *const u8,
std::mem::size_of::<PushConstants>(),
),
);
device.cmd_draw_indexed(
cmd, entity.index_count, 1, entity.first_index, 0, 0,
);
}
}
A helper for safe byte casting
The std::slice::from_raw_parts pattern is error-prone. A small
helper makes it clearer:
use vulkan_rust::vk;
use vk::*;
/// Reinterpret a reference to a `Copy` type as a `&[u8]` slice
/// suitable for `cmd_push_constants`.
///
/// # Safety
/// The type must be `#[repr(C)]` with no padding that contains
/// uninitialized bytes.
unsafe fn as_push_bytes<T: Copy>(data: &T) -> &[u8] {
std::slice::from_raw_parts(
data as *const T as *const u8,
std::mem::size_of::<T>(),
)
}
// Usage:
unsafe {
device.cmd_push_constants(
cmd, pipeline_layout,
ShaderStageFlags::VERTEX,
0,
as_push_bytes(&push_data),
);
}
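Outside the render loop, the helper can be sanity-checked on a plain value to confirm it views exactly the native-endian bytes of the input:

```rust
/// Reinterpret a reference to a `Copy` type as a byte slice
/// (same body as the helper above).
unsafe fn as_push_bytes<T: Copy>(data: &T) -> &[u8] {
    std::slice::from_raw_parts(
        data as *const T as *const u8,
        std::mem::size_of::<T>(),
    )
}

fn main() {
    let value: u32 = 0x0102_0304;
    let bytes = unsafe { as_push_bytes(&value) };
    // The view is exactly size_of::<u32>() bytes over the value's
    // native-endian representation.
    assert_eq!(bytes.len(), 4);
    assert_eq!(bytes, &value.to_ne_bytes()[..]);
}
```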
Common mistakes
Exceeding the size limit. If your push constant struct is larger
than the device’s max_push_constants_size (query from
PhysicalDeviceLimits), pipeline creation will fail. Check the limit
at startup.
Mismatched stage flags. The stage_flags in cmd_push_constants
must match the flags declared in the push constant range. If your range
says VERTEX | FRAGMENT but you push with VERTEX only, the
validation layer will warn.
Incorrect offset. The offset parameter in cmd_push_constants is
a byte offset into the push constant block. If you update only part of
the block (e.g. fragment-only data at offset 64), the vertex portion
retains its previously pushed values.
Forgetting #[repr(C)]. Without it, Rust may reorder struct fields.
The GPU will read bytes at fixed offsets, so reordered fields mean
corrupted data with no obvious error.
Notes
- Alignment. GLSL `push_constant` blocks follow `std430` layout rules. A `vec3` takes 12 bytes (not 16), but the next member aligns to its own size. Prefer `vec4`/`mat4` to avoid alignment surprises, or add explicit padding in your Rust struct.
- Performance. Push constants are the fastest way to pass small per-draw data. On most architectures they live in GPU registers or a small on-chip cache, not in memory.
- Compatibility. 128 bytes is the guaranteed minimum. If you need more, check `max_push_constants_size` in `PhysicalDeviceLimits`. Most desktop drivers report 256 bytes.
- Combining with descriptors. Push constants and descriptor sets are complementary. A typical setup uses push constants for per-draw data (model matrix) and uniform buffers via descriptors for per-frame data (view/projection matrices, lighting).
Port from ash to vulkan_rust
Task: Migrate an existing `ash`-based project to `vulkan_rust` (published as `vulkan-rust` on crates.io).
If you already have a working ash project, switching to vulkan_rust
is mostly mechanical. The Vulkan concepts are identical, and the API
surface maps one-to-one. This guide covers every difference you will
encounter.
What stays the same
Before diving into differences, note what does not change:
- All Vulkan functions are `unsafe`.
- You must explicitly destroy every object you create (no RAII/`Drop` on handles).
- Handles are lightweight `Copy` types.
- The same Vulkan mental model applies: instances, devices, queues, command buffers, pipelines, descriptor sets, synchronization primitives.
Key differences at a glance
| Aspect | ash | vulkan_rust |
|---|---|---|
| Crate name | ash | vulkan-rust |
| Command style | Trait methods (DeviceV1_0, KhrSwapchainFn) | Inherent methods on Device / Instance |
| Trait imports | One per API version + one per extension | None needed |
| Raw types | ash::vk::* | vulkan_rust::vk::* |
| Builders | ::builder() returns Builder, call .build() | ::builder() returns Builder that derefs to inner struct |
| Extensions | Manual loader structs (ash::khr::swapchain::Device) | All loaded automatically, call methods on Device directly |
| Interop | Limited from_raw on some types | Instance::from_raw_parts / Device::from_raw_parts |
| Error type | ash::vk::Result with separate success/error enums | VkResult<T> wrapping vk::Result |
Step 1: Replace the Cargo dependency
# Before (ash)
[dependencies]
ash = "0.38"
# After (vulkan_rust)
[dependencies]
vulkan-rust = "0.10"
Step 2: Remove trait imports
This is the single biggest ergonomic difference. In ash, every
Vulkan API version and extension requires a trait import:
// ash: you need these traits in scope to call device methods
use ash::vk;
use ash::Device;
// Without this import, device.create_buffer() does not exist:
use ash::version::DeviceV1_0;
// Without this import, device.create_swapchain_khr() does not exist:
use ash::khr::swapchain::Device as SwapchainDevice;
In vulkan_rust, every command is an inherent method on Device or
Instance. No trait imports, no extension loader structs:
// vulkan_rust: this is all you need
use vulkan_rust::vk;
use vulkan_rust::Device;
// device.create_buffer() and device.create_swapchain_khr()
// are both available immediately.
Migration action: Delete all use ash::version::* and
use ash::extensions::* imports. Replace use ash::vk with
use vulkan_rust::vk.
Step 3: Replace Entry, Instance, and Device creation
Entry and Instance
// ── ash ─────────────────────────────────────────────────
let entry = ash::Entry::linked();
let app_info = vk::ApplicationInfo::builder()
.api_version(vk::make_api_version(0, 1, 3, 0))
.build();
let create_info = vk::InstanceCreateInfo::builder()
.application_info(&app_info)
.build();
let instance = unsafe { entry.create_instance(&create_info, None)? };
// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;
let loader = vulkan_rust::LibloadingLoader::new()
.expect("Failed to load Vulkan");
let entry = unsafe { vulkan_rust::Entry::new(loader) }
.expect("Failed to create entry");
let app_info = ApplicationInfo::builder()
.api_version((1 << 22) | (3 << 12)); // Vulkan 1.3
let create_info = InstanceCreateInfo::builder()
.application_info(&app_info);
let instance = unsafe { entry.create_instance(&create_info, None) }
.expect("Failed to create instance");
The main changes: `Entry` is loaded through `LibloadingLoader` instead of `linked()`, `make_api_version` is replaced with a raw `u32` expression, and the `.build()` calls are removed. The builder derefs to the inner struct, so you can pass `&create_info` directly wherever a `&InstanceCreateInfo` is expected.
Device
// ── ash ─────────────────────────────────────────────────
let queue_info = vk::DeviceQueueCreateInfo::builder()
.queue_family_index(0)
.queue_priorities(&[1.0])
.build();
let device_info = vk::DeviceCreateInfo::builder()
.queue_create_infos(std::slice::from_ref(&queue_info))
.build();
let device = unsafe {
instance.create_device(physical_device, &device_info, None)?
};
// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;
let queue_info = DeviceQueueCreateInfo::builder()
.queue_family_index(0)
.queue_priorities(&[1.0]);
let device_info = DeviceCreateInfo::builder()
.queue_create_infos(std::slice::from_ref(&queue_info));
let device = unsafe {
instance.create_device(physical_device, &device_info, None)
}
.expect("Failed to create device");
Step 4: Update builders (drop .build())
In ash, builders require .build() to produce the final struct.
In vulkan_rust, builders implement Deref<Target = T>, so the
conversion is implicit:
// ── ash ─────────────────────────────────────────────────
let info = vk::BufferCreateInfo::builder()
.size(1024)
.usage(vk::BufferUsageFlags::VERTEX_BUFFER)
.sharing_mode(vk::SharingMode::EXCLUSIVE)
.build(); // <-- required in ash
// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;
let info = BufferCreateInfo::builder()
.size(1024)
.usage(BufferUsageFlags::VERTEX_BUFFER)
.sharing_mode(SharingMode::EXCLUSIVE);
// No .build(), pass &info directly to create_buffer()
Migration action: Search your codebase for .build() and remove
every occurrence on Vulkan builder types.
Step 5: Command buffer recording
The pattern is identical, just without trait imports:
// ── ash ─────────────────────────────────────────────────
use ash::version::DeviceV1_0; // required for begin/end
let begin_info = vk::CommandBufferBeginInfo::builder()
.flags(vk::CommandBufferUsageFlags::ONE_TIME_SUBMIT)
.build();
unsafe {
device.begin_command_buffer(cmd, &begin_info)?;
device.cmd_bind_pipeline(cmd, vk::PipelineBindPoint::GRAPHICS, pipeline);
device.cmd_draw(cmd, 3, 1, 0, 0);
device.end_command_buffer(cmd)?;
}
// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;
let begin_info = CommandBufferBeginInfo::builder()
.flags(CommandBufferUsageFlags::ONE_TIME_SUBMIT);
unsafe {
device.begin_command_buffer(cmd, &begin_info)
.expect("Failed to begin command buffer");
device.cmd_bind_pipeline(cmd, PipelineBindPoint::GRAPHICS, pipeline);
device.cmd_draw(cmd, 3, 1, 0, 0);
device.end_command_buffer(cmd)
.expect("Failed to end command buffer");
}
Step 6: Queue submission
// ── ash ─────────────────────────────────────────────────
let submit_info = vk::SubmitInfo::builder()
.command_buffers(&[cmd])
.wait_semaphores(&[image_available])
.wait_dst_stage_mask(&[vk::PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT])
.signal_semaphores(&[render_finished])
.build();
unsafe { device.queue_submit(queue, &[submit_info], fence)? };
// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;
let wait_stages = [PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];
let cmd_bufs = [cmd];
let wait_sems = [image_available];
let signal_sems = [render_finished];
let submit_info = SubmitInfo::builder()
.command_buffers(&cmd_bufs)
.wait_semaphores(&wait_sems)
.wait_dst_stage_mask(&wait_stages)
.signal_semaphores(&signal_sems);
unsafe {
device.queue_submit(queue, &[*submit_info], fence)
.expect("Failed to submit");
};
Step 7: Error handling
ash splits Vulkan results into success codes and error codes.
vulkan_rust uses a single VkResult<T> type:
// ── ash ─────────────────────────────────────────────────
match unsafe { device.create_buffer(&info, None) } {
Ok(buffer) => { /* ... */ }
Err(vk::Result::ERROR_OUT_OF_DEVICE_MEMORY) => { /* ... */ }
Err(e) => panic!("Unexpected: {:?}", e),
}
// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::Result as VkError;
match unsafe { device.create_buffer(&info, None) } {
Ok(buffer) => { /* ... */ }
Err(VkError::ERROR_OUT_OF_DEVICE_MEMORY) => { /* ... */ }
Err(e) => panic!("Unexpected: {e:?}"),
}
The match arms look the same. The difference is that the error type `vk::Result`
implements `std::error::Error`, so it works with `anyhow`, `eyre`,
and the `?` operator out of the box.
Step 8: Extensions
In ash, extensions require separate loader structs:
// ash: manual extension loading
let swapchain_loader = ash::khr::swapchain::Device::new(&instance, &device);
let swapchain = unsafe {
swapchain_loader.create_swapchain(&create_info, None)?
};
In vulkan_rust, all extension functions are loaded automatically when
the Device or Instance is created. You call them as regular methods:
// vulkan_rust: no loader, just call the method
let swapchain = unsafe {
device.create_swapchain_khr(&create_info, None)
}
.expect("Failed to create swapchain");
Migration action: Delete all extension loader struct construction.
Replace loader.method() with device.method() or instance.method().
Step 9: Interop with from_raw_parts
If another library (OpenXR, a C plugin, a test harness) gives you raw
Vulkan handles, vulkan_rust provides from_raw_parts to wrap them:
// Wrap an externally-created VkInstance
let instance = unsafe {
vulkan_rust::Instance::from_raw_parts(raw_instance, get_instance_proc_addr)
};
// Wrap an externally-created VkDevice
let device = unsafe {
vulkan_rust::Device::from_raw_parts(raw_device, get_device_proc_addr)
};
This loads all function pointers from the provided get_*_proc_addr,
so the wrapped object works identically to one created through Entry.
Quick-reference migration checklist
- Replace `ash` with `vulkan-rust` in `Cargo.toml`
- Replace `use ash::vk` with `use vulkan_rust::vk`
- Delete all `use ash::version::*` trait imports
- Delete all extension loader struct construction
- Remove every `.build()` on Vulkan builder types
- Replace `ash::Entry` / `ash::Instance` / `ash::Device` with `vulkan_rust::*`
- Replace extension loader method calls with direct `device.method()` calls
- Update error handling if you matched on ash-specific error types
- Compile and fix any remaining type mismatches
Map C Vulkan Calls to vulkan_rust
Task: You have C Vulkan code (or you are reading the Vulkan spec) and want to find the equivalent `vulkan_rust` API.
This page is a translation reference. It covers the naming rules, the structural patterns that differ between C and Rust, and a lookup table for the most common API calls.
Naming conventions
Functions
Strip the vk prefix, convert to snake_case, and call as a method
on the parent object (Device or Instance):
| C | vulkan_rust |
|---|---|
vkCreateBuffer(device, ...) | device.create_buffer(...) |
vkCmdDraw(commandBuffer, ...) | device.cmd_draw(command_buffer, ...) |
vkEnumeratePhysicalDevices(instance, ...) | instance.enumerate_physical_devices() |
vkDestroyPipeline(device, ...) | device.destroy_pipeline(...) |
Note that vkCmd* functions take the CommandBuffer as a parameter
but are still called on Device, not on the command buffer handle.
Types
Strip the Vk prefix. All types are re-exported at the vk root:
| C | vulkan_rust |
|---|---|
VkBuffer | vk::Buffer |
VkBufferCreateInfo | vk::BufferCreateInfo |
VkPhysicalDeviceProperties | vk::PhysicalDeviceProperties |
VkInstance | vk::Instance (the raw handle) |
Use use vk::* to bring them into scope without the module prefix.
Enum variants
Strip the type prefix and keep SCREAMING_CASE:
| C | vulkan_rust |
|---|---|
VK_FORMAT_R8G8B8A8_SRGB | vk::Format::R8G8B8A8_SRGB |
VK_IMAGE_LAYOUT_UNDEFINED | vk::ImageLayout::UNDEFINED |
VK_PRESENT_MODE_FIFO_KHR | vk::PresentModeKHR::FIFO |
VK_SUCCESS | vk::Result::SUCCESS |
Bitmask flags
Strip the type prefix and the _BIT suffix:
| C | vulkan_rust |
|---|---|
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | vk::BufferUsageFlags::VERTEX_BUFFER |
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT | vk::ImageUsageFlags::COLOR_ATTACHMENT |
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT | vk::PipelineStageFlags::FRAGMENT_SHADER |
Combine flags with the | operator, just like in C:
use vulkan_rust::vk;
use vk::*;
let usage = BufferUsageFlags::VERTEX_BUFFER
| BufferUsageFlags::TRANSFER_DST;
Extension names
// C: VK_KHR_SWAPCHAIN_EXTENSION_NAME
// Rust: generated constants in vk::extension_names
use vulkan_rust::vk::extension_names::KHR_SWAPCHAIN_EXTENSION_NAME;
let device_extensions = [KHR_SWAPCHAIN_EXTENSION_NAME.as_ptr()];
Structural patterns
Struct initialization
C uses designated initializers. vulkan_rust uses the builder pattern,
which auto-fills sType and zeroes all other fields:
// C
VkBufferCreateInfo info = {
.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
.pNext = NULL,
.size = 1024,
.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
.sharingMode = VK_SHARING_MODE_EXCLUSIVE,
};
VkBuffer buffer;
vkCreateBuffer(device, &info, NULL, &buffer);
// vulkan_rust
use vulkan_rust::vk;
use vk::*;
let info = BufferCreateInfo::builder()
.size(1024)
.usage(BufferUsageFlags::VERTEX_BUFFER)
.sharing_mode(SharingMode::EXCLUSIVE);
let buffer = unsafe { device.create_buffer(&info, None) }
.expect("Failed to create buffer");
Key differences:
- `sType` is set automatically by `::builder()`.
- `pNext` defaults to null (use `push_next()` to chain extensions).
- The result is returned, not written through an output pointer.
- The allocator callback (`NULL` in C) becomes `None`.
pNext extension chains
In C, you manually link structs through pNext:
// C
VkPhysicalDeviceVulkan12Features features12 = {
.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_2_FEATURES,
.pNext = NULL,
.bufferDeviceAddress = VK_TRUE,
};
VkDeviceCreateInfo info = {
.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
.pNext = &features12,
// ...
};
In vulkan_rust, use push_next():
// vulkan_rust
use vulkan_rust::vk;
use vk::*;
let mut features12 = *PhysicalDeviceVulkan12Features::builder()
.buffer_device_address(1); // VkBool32: 1 = true
let info = DeviceCreateInfo::builder()
.push_next(&mut features12)
.queue_create_infos(&queue_infos);
push_next is type-safe: you can only chain structs the Vulkan spec
allows for that parent struct.
The two-call enumerate pattern
Many C Vulkan functions require two calls: one to get the count, one to fill the array:
// C: two calls to enumerate physical devices
uint32_t count = 0;
vkEnumeratePhysicalDevices(instance, &count, NULL);
VkPhysicalDevice* devices = malloc(count * sizeof(VkPhysicalDevice));
vkEnumeratePhysicalDevices(instance, &count, devices);
In vulkan_rust, these return a Vec directly:
// vulkan_rust: one call, returns Vec
let devices = unsafe { instance.enumerate_physical_devices() }
.expect("Failed to enumerate devices");
The crate handles the two-call pattern internally.
Output parameters
C Vulkan uses pointer parameters for output values. vulkan_rust
returns them as VkResult<T> or plain T:
// C
VkBuffer buffer;
VkResult result = vkCreateBuffer(device, &info, NULL, &buffer);
if (result != VK_SUCCESS) { /* handle error */ }
// vulkan_rust
use vulkan_rust::vk;
use vk::*;
let buffer: Buffer = unsafe { device.create_buffer(&info, None) }
.expect("Failed to create buffer");
Functions that output multiple handles (like vkAllocateCommandBuffers)
return a Vec directly:
use vulkan_rust::vk;
use vk::*;
let cmd_buffers = unsafe {
device.allocate_command_buffers(&alloc_info)
}
.expect("Failed to allocate command buffers");
Search tip: #[doc(alias)]
All vulkan_rust types and functions carry #[doc(alias = "vkOriginalName")]
attributes. If you know the C name, type it into the rustdoc search bar
and it will find the Rust equivalent. For example, searching for
VkBufferCreateInfo will find vk::BufferCreateInfo.
Common API mapping table
| C function | vulkan_rust method | Returns |
|---|---|---|
vkCreateInstance | entry.create_instance(&info, None) | VkResult<Instance> |
vkDestroyInstance | instance.destroy_instance(None) | () |
vkEnumeratePhysicalDevices | instance.enumerate_physical_devices() | VkResult<Vec<PhysicalDevice>> |
vkGetPhysicalDeviceProperties | instance.get_physical_device_properties(phys) | PhysicalDeviceProperties |
vkGetPhysicalDeviceQueueFamilyProperties | instance.get_physical_device_queue_family_properties(phys) | Vec<QueueFamilyProperties> |
vkCreateDevice | instance.create_device(phys, &info, None) | VkResult<Device> |
vkDestroyDevice | device.destroy_device(None) | () |
vkGetDeviceQueue | device.get_device_queue(family, index) | Queue |
vkCreateBuffer | device.create_buffer(&info, None) | VkResult<Buffer> |
vkDestroyBuffer | device.destroy_buffer(buffer, None) | () |
vkAllocateMemory | device.allocate_memory(&info, None) | VkResult<DeviceMemory> |
vkFreeMemory | device.free_memory(memory, None) | () |
vkBindBufferMemory | device.bind_buffer_memory(buffer, memory, offset) | VkResult<()> |
vkMapMemory | device.map_memory(memory, offset, size, flags) | VkResult<*mut c_void> |
vkUnmapMemory | device.unmap_memory(memory) | () |
vkCreateImage | device.create_image(&info, None) | VkResult<Image> |
vkDestroyImage | device.destroy_image(image, None) | () |
vkCreateImageView | device.create_image_view(&info, None) | VkResult<ImageView> |
vkCreateRenderPass | device.create_render_pass(&info, None) | VkResult<RenderPass> |
vkCreateGraphicsPipelines | device.create_graphics_pipelines(cache, &infos, None) | VkResult<Vec<Pipeline>> |
vkCreateCommandPool | device.create_command_pool(&info, None) | VkResult<CommandPool> |
vkAllocateCommandBuffers | device.allocate_command_buffers(&info) | VkResult<Vec<CommandBuffer>> |
vkBeginCommandBuffer | device.begin_command_buffer(cmd, &info) | VkResult<()> |
vkEndCommandBuffer | device.end_command_buffer(cmd) | VkResult<()> |
vkCmdBeginRenderPass | device.cmd_begin_render_pass(cmd, &info, contents) | () |
vkCmdEndRenderPass | device.cmd_end_render_pass(cmd) | () |
vkCmdBindPipeline | device.cmd_bind_pipeline(cmd, bind_point, pipeline) | () |
vkCmdDraw | device.cmd_draw(cmd, vertices, instances, first_v, first_i) | () |
vkCmdCopyBuffer | device.cmd_copy_buffer(cmd, src, dst, ®ions) | () |
vkQueueSubmit | device.queue_submit(queue, &submits, fence) | VkResult<()> |
vkQueuePresentKHR | device.queue_present_khr(queue, &info) | VkResult<()> |
vkDeviceWaitIdle | device.device_wait_idle() | VkResult<()> |
vkCreateFence | device.create_fence(&info, None) | VkResult<Fence> |
vkWaitForFences | device.wait_for_fences(&fences, wait_all, timeout) | VkResult<()> |
vkResetFences | device.reset_fences(&fences) | VkResult<()> |
vkCreateSemaphore | device.create_semaphore(&info, None) | VkResult<Semaphore> |
vkCreateDescriptorSetLayout | device.create_descriptor_set_layout(&info, None) | VkResult<DescriptorSetLayout> |
vkAllocateDescriptorSets | device.allocate_descriptor_sets(&info) | VkResult<Vec<DescriptorSet>> |
vkUpdateDescriptorSets | device.update_descriptor_sets(&writes, &copies) | () |
Worked example: full translation
C version
// Create a vertex buffer, bind memory, copy data
VkBufferCreateInfo buf_info = {
.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
.size = sizeof(vertices),
.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
.sharingMode = VK_SHARING_MODE_EXCLUSIVE,
};
VkBuffer buffer;
vkCreateBuffer(device, &buf_info, NULL, &buffer);
VkMemoryRequirements mem_req;
vkGetBufferMemoryRequirements(device, buffer, &mem_req);
VkMemoryAllocateInfo alloc_info = {
.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
.allocationSize = mem_req.size,
.memoryTypeIndex = find_memory_type(mem_req.memoryTypeBits,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT),
};
VkDeviceMemory memory;
vkAllocateMemory(device, &alloc_info, NULL, &memory);
vkBindBufferMemory(device, buffer, memory, 0);
void* data;
vkMapMemory(device, memory, 0, buf_info.size, 0, &data);
memcpy(data, vertices, sizeof(vertices));
vkUnmapMemory(device, memory);
vulkan_rust version
use vulkan_rust::vk;
use vk::*;
unsafe {
let buf_info = BufferCreateInfo::builder()
.size(std::mem::size_of_val(&vertices) as u64)
.usage(BufferUsageFlags::VERTEX_BUFFER)
.sharing_mode(SharingMode::EXCLUSIVE);
let buffer = device.create_buffer(&buf_info, None)
.expect("Failed to create buffer");
let mem_req = device.get_buffer_memory_requirements(buffer);
let alloc_info = MemoryAllocateInfo::builder()
.allocation_size(mem_req.size)
.memory_type_index(find_memory_type(
mem_req.memory_type_bits,
MemoryPropertyFlags::HOST_VISIBLE
| MemoryPropertyFlags::HOST_COHERENT,
));
let memory = device.allocate_memory(&alloc_info, None)
.expect("Failed to allocate memory");
device.bind_buffer_memory(buffer, memory, 0)
.expect("Failed to bind buffer memory");
let data = device.map_memory(
memory, 0, buf_info.size, MemoryMapFlags::empty(),
)
.expect("Failed to map memory");
std::ptr::copy_nonoverlapping(
vertices.as_ptr() as *const u8,
data as *mut u8,
std::mem::size_of_val(&vertices),
);
device.unmap_memory(memory);
}
The structure is the same: create, query requirements, allocate, bind, map, copy, unmap. The differences are syntactic, not conceptual.
Design Decisions & Safety Model
This page explains the major design decisions in vulkan_rust and why they
were made. Each section addresses a common “why not do it the other way?”
question.
Why two crates?
vulkan_rust is split into two crates with distinct roles:
- `vulkan-rust-sys` is machine-generated from `vk.xml`. It contains ~40,000 lines of `#[repr(C)]` structs, `#[repr(transparent)]` enum newtypes, bitmask types, handle types, and function pointer typedefs. It is `#![no_std]`.
- `vulkan-rust` is hand-written. It provides `Entry`, `Instance`, `Device`, command loading, builders, surface helpers, and the error types.

Users depend on `vulkan-rust` and access raw types via `vulkan_rust::vk::*`.
This separation exists for three reasons:
- Build speed. Regenerating `vulkan-rust-sys` only happens when a new Vulkan spec version arrives. Day-to-day development in `vulkan-rust` does not trigger a rebuild of 40k lines of generated code.
- Reviewability. Generated code is validated by the generator’s test suite, not by human review. Hand-written code gets normal review. Mixing them in one crate blurs that boundary.
- `no_std` compatibility. `vulkan-rust-sys` has zero dependencies and can be used in environments without `std`. `vulkan-rust` requires `std` for library loading and allocation.
Why inherent methods instead of traits?
All Vulkan commands are inherent methods on Device or Instance:
use vulkan_rust::vk;
// No trait import needed, just call the method.
let buffer = unsafe { device.create_buffer(&info, None) }?;
Some Vulkan wrappers split commands across extension traits (e.g.
KhrSwapchainExtension). This forces callers to import the right trait
before calling the method, and IDE autocomplete only works when the trait
is already in scope.
With inherent methods, every command appears in autocomplete on Device
immediately, and there is nothing to import.
Why complete command loading?
When Device or Instance is created, vulkan_rust loads every function
pointer from every enabled extension in a single pass. Some wrappers require
callers to explicitly request which extension command tables to load.
Complete loading avoids that bookkeeping. The cost is negligible: loading a
few hundred function pointers takes microseconds at startup, and the
per-pointer memory cost is one Option<fn> each.
Why from_raw_parts?
Both Instance and Device provide an unsafe fn from_raw_parts
constructor that wraps an externally-owned Vulkan handle:
use vulkan_rust::Device;
let device = unsafe {
Device::from_raw_parts(raw_vk_device, Some(get_device_proc_addr_fn))
};
This exists for three use cases:
- OpenXR interop. The XR runtime creates the `VkInstance` and `VkDevice`. Your code receives raw handles and needs to wrap them.
- Middleware. Profiling layers and debug tools may hand you raw handles.
- Testing. Unit tests can construct wrapper objects without a real GPU.
Why no Drop on handles?
Instance and Device do not implement Drop. Destruction is explicit:
use vulkan_rust::vk;
unsafe { device.destroy_device(None) };
Automatic destruction via Drop is tempting, but breaks in several
real scenarios:
- `from_raw_parts` and shared ownership. If two wrappers hold the same raw handle (e.g. your code and an OpenXR runtime), a `Drop` impl would double-destroy it.
- GPU-async lifetimes. The GPU may still be using resources when Rust drops a handle. Correct destruction requires calling `device_wait_idle` or using fences first. A `Drop` impl cannot know when the GPU is done.
- Destruction order. Vulkan objects have strict parent-child destruction ordering. Rust’s drop order (reverse declaration order within a scope) may not match what Vulkan requires.
Explicit destruction makes the caller responsible, which matches Vulkan’s own model.
Why builders Deref to the inner struct?
Every builder dereferences to its inner vk::* struct:
use vulkan_rust::vk;
use vk::*;
let info = BufferCreateInfo::builder()
.size(1024)
.usage(BufferUsageFlags::VERTEX_BUFFER);
// Pass the builder directly where a &BufferCreateInfo is expected.
let buffer = unsafe { device.create_buffer(&info, None) }?;
Because BufferCreateInfoBuilder implements Deref<Target = BufferCreateInfo>,
there is no .build() call. The builder is the struct, with a convenient
setter API on top. This means you can pass a builder reference anywhere a
struct reference is expected.
Why #[repr(transparent)] newtypes for enums?
Vulkan “enums” are open sets of integer constants, not closed sets. Drivers and extensions
can return values that did not exist when your code was compiled. Constructing a Rust
`enum` from such an unknown discriminant is instant undefined behavior.
Instead, vulkan-rust-sys represents each Vulkan enum as a #[repr(transparent)]
newtype around i32:
#[repr(transparent)]
pub struct Format(i32);
impl Format {
pub const UNDEFINED: Self = Self(0);
pub const R8G8B8A8_UNORM: Self = Self(37);
// ... hundreds more
}
Unknown values are perfectly legal, they just lack a named constant. Pattern matching uses associated constants, and the compiler does not assume the set is exhaustive.
The safety model
All Vulkan command wrappers are unsafe fn. The caller is responsible
for meeting every precondition the Vulkan spec defines: valid handles,
correct synchronization, matching lifetimes, and so on.
vulkan_rust does not attempt to encode Vulkan’s safety rules in the Rust
type system. The spec is too large and too nuanced for compile-time
enforcement to be practical without severe ergonomic cost.
Instead, the safety strategy is:
- Validation layers during development. Enable `VK_LAYER_KHRONOS_validation` in debug builds. The validation layer catches spec violations, use-after-free, missing synchronization, and more. It is the primary safety net.
- Type-safe newtypes. You cannot accidentally pass a `Buffer` where a `Pipeline` is expected. This catches a class of handle mixups at compile time.
- Builder `push_next` with marker traits. The `push_next` method on builders is generic over an `Extends*` marker trait, so only structs that the spec allows in a given pNext chain can be appended.
- Panic on missing function pointers. If you call a command from an extension that was not enabled, the stub panics with a descriptive message (e.g. `"VK_KHR_surface not loaded"`). This catches misconfiguration early.
What the generator handles vs what is hand-written
| Generated (vulkan-rust-sys) | Hand-written (vulkan-rust) |
|---|---|
| #[repr(C)] struct definitions | Entry, Instance, Device wrappers |
| #[repr(transparent)] enum newtypes | Command loading and dispatch tables |
| Bitmask types and flag constants | from_raw_parts constructors |
| Handle newtypes | Error types (VkResult, LoadError) |
| Function pointer typedefs | Surface creation (SurfaceError) |
| Builder structs with Deref | SPIR-V bytecode loading |
| push_next methods + Extends* traits | Version parsing |
| Wrapper methods on Device/Instance | Loader trait and library loading |
Error Handling Philosophy
This page explains how vulkan_rust maps Vulkan’s C-style error model into
idiomatic Rust, and where the boundaries between error types lie.
Vulkan’s error model
Every Vulkan command that can fail returns a VkResult, which is a plain
int32_t. The spec defines named constants for it:
- Success codes are non-negative: `VK_SUCCESS` (0), `VK_INCOMPLETE` (5), `VK_SUBOPTIMAL_KHR` (1000001003), and a few others.
- Error codes are negative: `VK_ERROR_OUT_OF_HOST_MEMORY` (-1), `VK_ERROR_OUT_OF_DEVICE_MEMORY` (-2), `VK_ERROR_DEVICE_LOST` (-4), etc.
There is no exception system, no errno, no callback. The caller checks the return value after every call.
The VkResult<T> type alias
vulkan-rust defines a single result type for all Vulkan command wrappers:
use vulkan_rust::vk;
pub type VkResult<T> = std::result::Result<T, vk::Result>;
Here vk::Result is the #[repr(transparent)] i32 newtype from vulkan-rust-sys.
The Err variant holds any negative value. The Ok variant holds the
command’s output (a handle, a vector of properties, or just ()).
A helper function performs the conversion:
use vulkan_rust::vk;
pub(crate) fn check(result: vk::Result) -> VkResult<()> {
if result.as_raw() >= 0 {
Ok(())
} else {
Err(result)
}
}
This means all non-negative codes, including INCOMPLETE and SUBOPTIMAL,
are treated as success by default.
Success codes that are not SUCCESS
Some Vulkan commands return positive success codes that carry meaning:
- `INCOMPLETE` from enumeration commands means the output buffer was too small. `vulkan-rust`’s two-call helpers handle this internally by querying the count first, so callers rarely see it.
- `SUBOPTIMAL_KHR` from `vkAcquireNextImageKHR` means the swapchain still works but no longer matches the surface optimally. You should recreate the swapchain, but the current frame is still valid.
Because check maps all non-negative codes to Ok(()), these success
codes do not propagate as errors. Wrapper methods that need to distinguish
them (e.g. swapchain acquisition) inspect the raw code explicitly after
the check call.
LoadError for library loading
Before any Vulkan command runs, the shared library (vulkan-1.dll,
libvulkan.so) must be loaded and vkGetInstanceProcAddr resolved.
Failures here are not Vulkan API errors, they mean the Vulkan runtime
is not available at all.
LoadError captures these:
use vulkan_rust::vk;
pub enum LoadError {
/// The Vulkan shared library could not be found or opened.
Library(libloading::Error),
/// vkGetInstanceProcAddr could not be resolved from the library.
MissingEntryPoint,
}
LoadError implements std::error::Error and is returned from
Entry::new. It is entirely separate from vk::Result.
SurfaceError for surface creation
Creating a window surface involves platform-specific logic and
raw-window-handle integration. Three distinct failure modes exist:
use vulkan_rust::vk;
pub enum SurfaceError {
/// The display/window handle combination is not supported.
UnsupportedPlatform,
/// raw-window-handle returned an error.
HandleError(raw_window_handle::HandleError),
/// Vulkan error from the surface creation call.
Vulkan(vk::Result),
}
SurfaceError unifies platform detection failures, handle errors, and
the underlying Vulkan error into one type, so callers of
Instance::create_surface have a single Result to handle.
When vulkan_rust panics
Panics are reserved for programmer mistakes, never for runtime failures that a correct program could encounter:
- Calling an unloaded function pointer. If you call a command from an extension that was not enabled at instance or device creation, the function pointer is `None`. The generated wrapper calls `.expect()` with a message like `"VK_KHR_surface not loaded"`. This is a configuration error, not a recoverable failure.
- Internal invariant violations. These should never happen. If they do, a panic with a descriptive message is the right response.
Vulkan runtime errors (out of memory, device lost, surface lost) are always
returned as Err(vk::Result), never panicked.
The standard pattern
Most application code follows the same pattern: call the command, propagate
errors with ?, handle them at the boundary.
use vulkan_rust::vk;
use vulkan_rust::Device;
use vk::*;
unsafe fn create_pipeline(
device: &Device,
layout: PipelineLayout,
render_pass: RenderPass,
// ...
) -> VkResult<Pipeline> {
let shader = device.create_shader_module(&shader_info, None)?;
let pipeline = device.create_graphics_pipelines(
PipelineCache::null(),
&[pipeline_info],
None,
)?[0];
device.destroy_shader_module(shader, None);
Ok(pipeline)
}
Individual commands propagate errors upward. The top-level caller (your main loop or initialization function) decides whether to retry, fall back, or exit.
Validation layers vs error codes
These are complementary, not overlapping:
| Concern | Mechanism |
|---|---|
| Spec violations (wrong usage, missing sync) | Validation layers (VK_LAYER_KHRONOS_validation) |
| Recoverable runtime failures (OOM, device lost) | vk::Result error codes via VkResult<T> |
| Missing Vulkan runtime | LoadError |
| Platform surface issues | SurfaceError |
| Programmer misconfiguration (extension not enabled) | Panic |
Validation layers are a development-time tool. They intercept every Vulkan call, check it against the spec, and report violations via debug callbacks. They have significant overhead and are typically disabled in release builds.
Error codes are a production-time mechanism. They report conditions the application can respond to: allocate less memory, recreate the swapchain, or shut down gracefully.
A well-structured vulkan_rust application uses both: validation layers to
catch bugs during development, error propagation to handle failures in
production.