The vulkan_rust Guide

Vulkan is a powerful graphics and compute API, but its explicitness comes at a cost: there is a lot to learn before you can put a single triangle on screen. Most documentation dumps the full specification on you and expects you to swim. This guide takes a different approach.

Every concept in this guide follows the same progression:

  1. Why it matters: the problem this concept solves, in plain language.
  2. Intuition: a mental model, analogy, or diagram that builds the right picture before you see any code.
  3. Worked example: annotated code you can read, run, and modify.
  4. Formal reference: spec terminology, edge cases, and links to the rustdoc API reference for when you need the full picture.

This structure is deliberate. Research in cognitive science shows that understanding develops from concrete to abstract, not the other way around. We build your intuition first, then formalize it.

Who this guide is for

You know Rust. You have some idea that GPUs exist and do interesting things. You may or may not have used OpenGL, DirectX, Metal, or WebGPU before; none of that is required. This guide assumes zero prior Vulkan knowledge.

If you are coming from another Vulkan crate like ash, the migration guide shows the differences side by side.

How this guide is organized

This guide follows the Diataxis documentation framework, which separates content by purpose:

| Section         | Purpose                             | Start here if…                     |
|-----------------|-------------------------------------|------------------------------------|
| Getting Started | Step-by-step tutorials              | You want to draw something now     |
| Concepts        | Explanations of how Vulkan works    | You want to understand why         |
| How-To Guides   | Recipes for specific tasks          | You know what you need to do       |
| Architecture    | Design decisions behind vulkan_rust | You want to contribute or evaluate |

Concept dependency map

The concepts section is ordered so each chapter builds on the ones before it. Here is the dependency structure:

Object Model
    |
    +---> Memory Management
    |         |
    |         v
    +---> Command Buffers ----+
    |         |               |
    |         v               v
    +---> Synchronization   Render Passes
    |         |               |
    |         v               v
    +---> Pipelines <---------+
    |         |
    |         v
    +---> Descriptor Sets
    |
    +---> pNext Extension Chain  (independent, read any time)
    +---> Validation Layers      (independent, read any time)

You can read linearly from top to bottom, or jump to whatever you need. The dependency map shows you which chapters you should read first if something doesn’t make sense.

API documentation

This guide is a companion to the API reference. The API docs cover every type, method, and constant with spec links, error codes, safety requirements, and thread safety annotations. This guide covers the why and how that API docs cannot.

Quick taste

Here is the minimum code to initialize Vulkan with vulkan_rust:

use vulkan_rust::{Entry, LibloadingLoader};

fn main() {
    // Load the Vulkan loader library from the system.
    let loader = LibloadingLoader::new().expect("Failed to find Vulkan");
    let entry = unsafe { Entry::new(loader) }.expect("Failed to load Vulkan");

    // Query the highest Vulkan version the driver supports.
    let version = entry.version().expect("Failed to query version");
    println!("Vulkan {}.{}.{}", version.major, version.minor, version.patch);
}

Ready to go further? Start with Installation.

Installation

Add vulkan_rust to your project

[dependencies]
vulkan-rust = "0.10"

Platform requirements

Windows

Install the LunarG Vulkan SDK. This provides vulkan-1.dll and the validation layers.

Linux

Install your distribution’s Vulkan packages:

# Ubuntu / Debian
sudo apt install libvulkan-dev vulkan-validationlayers

# Fedora
sudo dnf install vulkan-loader-devel vulkan-validation-layers

# Arch
sudo pacman -S vulkan-icd-loader vulkan-validation-layers

macOS

Install the LunarG Vulkan SDK for macOS, which includes MoltenVK for Vulkan-on-Metal translation.

Verify your setup

After installing, run this to confirm Vulkan is available:

# If you installed the Vulkan SDK:
vulkaninfo --summary

You should see your GPU listed with a supported Vulkan version.

Next steps

Ready to write code? Continue to Hello Triangle, Part 1.

Hello Triangle, Part 1: Instance & Device

This is the first part of a four-part tutorial that builds a complete Vulkan application from scratch. By the end of part 4, you will have a colored triangle on screen. By the end of this part, you will have a working connection to your GPU.

What we build in this part:

Load Vulkan ──> Create Instance ──> Pick a GPU ──> Create Device ──> Get a Queue

Each step depends on the previous one. We will take them one at a time, with an explanation of why each step exists before the code.

Prerequisites

Create the project

cargo new hello-triangle
cd hello-triangle

Add vulkan-rust to your Cargo.toml:

[dependencies]
vulkan-rust = "0.10"

Step 1: Load the Vulkan library

Before you can call any Vulkan function, you must load the Vulkan shared library (vulkan-1.dll on Windows, libvulkan.so on Linux, libvulkan.dylib on macOS). This library is the loader, the entry point that routes your calls to the correct GPU driver.

use vulkan_rust::{Entry, LibloadingLoader};

fn main() {
    // Load the Vulkan shared library from the system.
    // This can fail if the Vulkan SDK is not installed.
    let loader = LibloadingLoader::new()
        .expect("Failed to find Vulkan library");

    // Create the Entry, which resolves the bootstrap function pointers
    // (vkGetInstanceProcAddr, vkGetDeviceProcAddr).
    let entry = unsafe { Entry::new(loader) }
        .expect("Failed to load Vulkan entry points");

    // Verify: query the highest Vulkan version the driver supports.
    let version = entry.version().expect("Failed to query Vulkan version");
    println!("Vulkan {}.{}.{}", version.major, version.minor, version.patch);
}

Run this with cargo run. If you see output like Vulkan 1.3.280, your setup is working.

Why is this unsafe? Loading a shared library and calling its functions through raw pointers is inherently unsafe. The compiler cannot verify that the library is valid or that the function pointers it returns are correct. This is the only unsafe we need to understand right now; the rest follow the same pattern.

Step 2: Create a Vulkan Instance

An Instance is your application’s connection to the Vulkan runtime. It loads the driver, enables validation layers, and provides access to the physical GPUs on the system.

Think of it as opening a session: “I am application X, I want to use Vulkan version Y, please give me access.”

use vulkan_rust::vk;
use vk::*;

// ── Describe your application ──────────────────────────────────
//
// ApplicationInfo tells the driver who you are. This is optional
// but helps driver vendors optimize for known applications.
let app_info = ApplicationInfo::builder()
    .application_name(c"Hello Triangle")
    .application_version(1)
    .engine_name(c"No Engine")
    .engine_version(1)
    .api_version(1 << 22);  // Vulkan 1.0

// ── Describe what you need ─────────────────────────────────────
//
// No layers or extensions yet. We will add validation layers and
// surface extensions in later parts.
let create_info = InstanceCreateInfo::builder()
    .application_info(&app_info);

// ── Create the instance ────────────────────────────────────────
let instance = unsafe { entry.create_instance(&create_info, None) }
    .expect("Failed to create Vulkan instance");

println!("Instance created successfully");

Before reading on: why do you think the Instance takes an api_version field? What would happen if you requested a version the driver doesn’t support?

The api_version tells the driver the highest Vulkan version your application is written against. If the driver supports that version or higher, it succeeds. If you request 1.3 on a 1.0-only driver, instance creation fails with ERROR_INCOMPATIBLE_DRIVER.
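The packed encoding is simple enough to compute by hand. Here is a minimal sketch of the Vulkan 1.0 bit layout (major in bits 22 and up, minor in bits 12–21, patch in bits 0–11); the helper names are illustrative, not part of vulkan_rust:

```rust
// Pack a version the way VK_MAKE_VERSION does: major | minor | patch.
fn make_api_version(major: u32, minor: u32, patch: u32) -> u32 {
    (major << 22) | (minor << 12) | patch
}

// Unpack the same fields (10 bits of minor, 12 bits of patch).
fn decode_api_version(v: u32) -> (u32, u32, u32) {
    (v >> 22, (v >> 12) & 0x3ff, v & 0xfff)
}

fn main() {
    let v = make_api_version(1, 0, 0);
    // This is exactly the `1 << 22` literal used in the create-info above.
    assert_eq!(v, 1 << 22);
    assert_eq!(decode_api_version(make_api_version(1, 3, 280)), (1, 3, 280));
    println!("packed 1.0.0 = {v:#x}");
}
```

This is also how `entry.version()` recovers the `major.minor.patch` triple it prints.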

Step 3: Pick a physical device (GPU)

A system can have multiple GPUs: a discrete NVIDIA/AMD card, an integrated Intel GPU, or even a software renderer. You must choose one.

use vk::PhysicalDeviceType;

// ── Enumerate GPUs ─────────────────────────────────────────────
let physical_devices = unsafe { instance.enumerate_physical_devices() }
    .expect("Failed to enumerate GPUs");

println!("Found {} GPU(s):", physical_devices.len());

// ── Inspect each one ───────────────────────────────────────────
for (i, &pd) in physical_devices.iter().enumerate() {
    let props = unsafe { instance.get_physical_device_properties(pd) };

    // The device name is a null-terminated C string in a fixed-size array.
    let name_bytes: Vec<u8> = props.device_name
        .iter()
        .take_while(|&&c| c != 0)
        .map(|&c| c as u8)
        .collect();
    let name = String::from_utf8_lossy(&name_bytes);

    let device_type = match props.device_type {
        PhysicalDeviceType::DISCRETE_GPU => "Discrete GPU",
        PhysicalDeviceType::INTEGRATED_GPU => "Integrated GPU",
        PhysicalDeviceType::VIRTUAL_GPU => "Virtual GPU",
        PhysicalDeviceType::CPU => "CPU (software)",
        _ => "Other",
    };

    println!("  [{}] {} ({})", i, name, device_type);
}

// ── Pick the first GPU ─────────────────────────────────────────
//
// A real application would score GPUs by capability (discrete >
// integrated, required features, memory size). For this tutorial,
// the first one is fine.
let physical_device = physical_devices[0];

Before reading on: the code above uses get_physical_device_properties to read the GPU name and type. What other information do you think the driver exposes about each physical device?

The PhysicalDeviceProperties struct also contains the driver version, the Vulkan API version the device supports, and limits, a struct with hundreds of fields describing maximum texture sizes, buffer alignments, and other hardware limits.
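The "score GPUs by capability" idea mentioned in the code comment looks roughly like this. `DeviceType` below is a stand-in for `vk::PhysicalDeviceType` so the sketch is self-contained; real selection code would match on the crate's constants and also check required features and extensions:

```rust
// Stand-in for vk::PhysicalDeviceType, just for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DeviceType { DiscreteGpu, IntegratedGpu, VirtualGpu, Cpu, Other }

// Higher score = preferred. A real scorer would also weigh memory
// size and reject devices missing required features entirely.
fn score(device_type: DeviceType) -> u32 {
    match device_type {
        DeviceType::DiscreteGpu => 4,
        DeviceType::IntegratedGpu => 3,
        DeviceType::VirtualGpu => 2,
        DeviceType::Cpu => 1,
        DeviceType::Other => 0,
    }
}

fn main() {
    let gpus = [DeviceType::IntegratedGpu, DeviceType::DiscreteGpu, DeviceType::Cpu];
    // Pick the highest-scoring device instead of blindly taking index 0.
    let best = gpus.iter().copied().max_by_key(|&t| score(t)).unwrap();
    assert_eq!(best, DeviceType::DiscreteGpu);
}
```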

Step 4: Find a queue family that supports graphics

The GPU exposes queues, which are the endpoints where you submit work. Queues are grouped into families, where each family supports a specific set of operations (graphics, compute, transfer, etc.).

We need a queue family that supports graphics operations.

use vk::QueueFlags;

// ── Query queue families ───────────────────────────────────────
let queue_families = unsafe {
    instance.get_physical_device_queue_family_properties(physical_device)
};

// ── Find one that supports graphics ────────────────────────────
let graphics_family_index = queue_families
    .iter()
    .enumerate()
    .find(|(_, family)| {
        family.queue_flags & QueueFlags::GRAPHICS
            != QueueFlags::empty()
    })
    .map(|(index, _)| index as u32)
    .expect("No graphics queue family found");

println!("Using queue family {} for graphics", graphics_family_index);

Queue families are identified by their index in the array. We will pass this index to device creation (to request a queue from that family) and to many other calls throughout the application.
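The flag test above is an ordinary bitmask check. Stripped of the vk types, the same search looks like this (the bit values and the `QueueFamily` struct are illustrative, not the crate's):

```rust
// Illustrative bit values; the real ones come from vk::QueueFlags.
const GRAPHICS: u32 = 0b0001;
const COMPUTE: u32 = 0b0010;
const TRANSFER: u32 = 0b0100;

struct QueueFamily { queue_flags: u32, queue_count: u32 }

// Return the index of the first family whose flags include GRAPHICS.
fn find_graphics_family(families: &[QueueFamily]) -> Option<u32> {
    families
        .iter()
        .enumerate()
        .find(|(_, f)| f.queue_flags & GRAPHICS != 0)
        .map(|(i, _)| i as u32)
}

fn main() {
    let families = [
        QueueFamily { queue_flags: TRANSFER, queue_count: 2 },
        QueueFamily { queue_flags: GRAPHICS | COMPUTE | TRANSFER, queue_count: 16 },
    ];
    // Family 0 is transfer-only, so the search lands on family 1.
    assert_eq!(find_graphics_family(&families), Some(1));
}
```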

Step 5: Create a logical Device

A Device is your interface to one physical GPU. It loads all the device-level function pointers and provides the methods you will use for the rest of the application: creating buffers, recording commands, submitting work.

Creating a Device also creates the queues you requested.

use vk::*;

// ── Request one queue from the graphics family ─────────────────
let queue_priority = 1.0_f32;

let queue_info = DeviceQueueCreateInfo::builder()
    .queue_family_index(graphics_family_index)
    .queue_priorities(std::slice::from_ref(&queue_priority));

// ── Create the device ──────────────────────────────────────────
//
// No extensions or features yet. We will add the swapchain
// extension in Part 2.
let device_info = DeviceCreateInfo::builder()
    .queue_create_infos(std::slice::from_ref(&queue_info));

let device = unsafe {
    instance.create_device(physical_device, &device_info, None)
}
.expect("Failed to create logical device");

println!("Device created successfully");

Before reading on: we requested a queue with priority 1.0. What do you think the priority controls?

Queue priority is a hint to the driver about how to schedule work when multiple queues compete for GPU resources. 1.0 is the highest priority. Most applications use a single queue and set it to 1.0. The actual effect is driver-dependent.

Step 6: Get a queue handle

The Device created our queues internally. We retrieve handles to them with get_device_queue.

// ── Retrieve the graphics queue ────────────────────────────────
//
// Queue family index: the family we chose above.
// Queue index: 0, because we only requested 1 queue from this family.
let graphics_queue = unsafe {
    device.get_device_queue(graphics_family_index, 0)
};

println!("Graphics queue ready");

The queue handle is not created or destroyed by you. It is owned by the Device and valid for the Device’s lifetime. (See The Vulkan Object Model for the distinction between created, allocated, and enumerated objects.)

Step 7: Clean up

Vulkan requires explicit destruction in reverse creation order. vulkan_rust has no Drop implementations on purpose, so you must call the destroy methods yourself.

// ── Destroy in reverse order ───────────────────────────────────
//
// Queue handles are owned by the Device, no destroy needed.
// Device must be destroyed before Instance.
// Instance must be destroyed last.
unsafe {
    device.destroy_device(None);
    instance.destroy_instance(None);
}

println!("Cleaned up successfully");

Putting it all together

Here is the complete program. Copy this into src/main.rs and run it with cargo run.

use vulkan_rust::{Entry, LibloadingLoader};
use vulkan_rust::vk;
use vk::*;

fn main() {
    // ── Step 1: Load Vulkan ────────────────────────────────────
    let loader = LibloadingLoader::new()
        .expect("Vulkan library not found");
    let entry = unsafe { Entry::new(loader) }
        .expect("Failed to load Vulkan");

    let version = entry.version().expect("Failed to query version");
    println!("Vulkan {}.{}.{}", version.major, version.minor, version.patch);

    // ── Step 2: Create Instance ────────────────────────────────
    let app_info = ApplicationInfo::builder()
        .application_name(c"Hello Triangle")
        .application_version(1)
        .engine_name(c"No Engine")
        .engine_version(1)
        .api_version(1 << 22);  // Vulkan 1.0

    let create_info = InstanceCreateInfo::builder()
        .application_info(&app_info);

    let instance = unsafe { entry.create_instance(&create_info, None) }
        .expect("Failed to create instance");

    // ── Step 3: Pick a GPU ─────────────────────────────────────
    let physical_devices = unsafe { instance.enumerate_physical_devices() }
        .expect("Failed to enumerate GPUs");

    let physical_device = physical_devices[0];

    let props = unsafe {
        instance.get_physical_device_properties(physical_device)
    };
    let name_bytes: Vec<u8> = props.device_name
        .iter()
        .take_while(|&&c| c != 0)
        .map(|&c| c as u8)
        .collect();
    println!("GPU: {}", String::from_utf8_lossy(&name_bytes));

    // ── Step 4: Find a graphics queue family ───────────────────
    let queue_families = unsafe {
        instance.get_physical_device_queue_family_properties(physical_device)
    };

    let graphics_family_index = queue_families
        .iter()
        .enumerate()
        .find(|(_, family)| {
            family.queue_flags & QueueFlags::GRAPHICS
                != QueueFlags::empty()
        })
        .map(|(index, _)| index as u32)
        .expect("No graphics queue family found");

    // ── Step 5: Create Device ──────────────────────────────────
    let queue_priority = 1.0_f32;
    let queue_info = DeviceQueueCreateInfo::builder()
        .queue_family_index(graphics_family_index)
        .queue_priorities(std::slice::from_ref(&queue_priority));

    let device_info = DeviceCreateInfo::builder()
        .queue_create_infos(std::slice::from_ref(&queue_info));

    let device = unsafe {
        instance.create_device(physical_device, &device_info, None)
    }
    .expect("Failed to create device");

    // ── Step 6: Get the graphics queue ─────────────────────────
    let _graphics_queue = unsafe {
        device.get_device_queue(graphics_family_index, 0)
    };

    println!("Vulkan initialized successfully!");
    println!("Ready for Part 2: Swapchain & Surface");

    // ── Step 7: Clean up ───────────────────────────────────────
    unsafe {
        device.destroy_device(None);
        instance.destroy_instance(None);
    }
}

Expected output:

Vulkan 1.3.280
GPU: NVIDIA GeForce RTX 4070
Vulkan initialized successfully!
Ready for Part 2: Swapchain & Surface

(Your version number and GPU name will differ.)

What we learned

This part covered the Vulkan initialization sequence:

| Step              | What                                                            | Why                                    |
|-------------------|-----------------------------------------------------------------|----------------------------------------|
| Load library      | LibloadingLoader::new() + Entry::new()                          | Get access to Vulkan function pointers |
| Create Instance   | entry.create_instance()                                         | Open a session with the Vulkan driver  |
| Pick GPU          | enumerate_physical_devices() + get_physical_device_properties() | Choose which hardware to use           |
| Find queue family | get_physical_device_queue_family_properties()                   | Find a queue that supports graphics    |
| Create Device     | instance.create_device()                                        | Get a logical interface to the GPU     |
| Get queue         | device.get_device_queue()                                       | Get the submission endpoint            |

Every Vulkan application does these steps. They are the foundation that everything else builds on.

What we skipped (and will add later)

  • Validation layers (Part 2): catch API misuse during development. See Validation Layers for the concept.
  • Surface and swapchain (Part 2): connect to a window so we can display pixels.
  • Extensions: we will enable VK_KHR_swapchain and surface extensions in Part 2.

Exercises

  1. Print all GPUs. Modify the program to print every physical device with its name and type, not just the first one.
  2. Print all queue families. For the chosen GPU, print every queue family with its flags (GRAPHICS, COMPUTE, TRANSFER) and queue count.
  3. Choose discrete over integrated. Modify the GPU selection to prefer a discrete GPU when one is available.

Next

Part 2: Swapchain & Surface adds a window, creates a swapchain, and introduces validation layers.

Hello Triangle, Part 2: Swapchain & Surface

In Part 1 we loaded Vulkan, created an Instance and Device, and retrieved a graphics queue. We can talk to the GPU, but we have nowhere to show anything.

What we build in this part:

Open a window ──> Create Surface ──> Create Swapchain ──> Get image views
                                                          + validation layers

By the end of this part, we will have a window with a swapchain ready to receive rendered frames.

New dependencies

We need a windowing library. This tutorial uses winit, but vulkan_rust works with anything that implements raw-window-handle.

[dependencies]
vulkan-rust = "0.10"
winit = "0.30"

Step 1: Open a window

Before creating a Vulkan surface, we need a platform window.

use winit::application::ApplicationHandler;
use winit::event::WindowEvent;
use winit::event_loop::{ActiveEventLoop, EventLoop};
use winit::window::{Window, WindowId};

struct App {
    window: Option<Window>,
}

impl ApplicationHandler for App {
    fn resumed(&mut self, event_loop: &ActiveEventLoop) {
        if self.window.is_some() {
            return;
        }

        let attrs = Window::default_attributes()
            .with_title("Hello Triangle")
            .with_inner_size(winit::dpi::LogicalSize::new(800, 600));
        let window = event_loop
            .create_window(attrs)
            .expect("Failed to create window");

        // ... Vulkan initialization uses &window here ...

        self.window = Some(window);
    }

    fn window_event(
        &mut self,
        event_loop: &ActiveEventLoop,
        _id: WindowId,
        event: WindowEvent,
    ) {
        if matches!(event, WindowEvent::CloseRequested) {
            event_loop.exit();
        }
    }
}

fn main() {
    let event_loop = EventLoop::new().expect("Failed to create event loop");
    let mut app = App { window: None };
    event_loop.run_app(&mut app).expect("Event loop error");
}

Step 2: Create the Instance with surface extensions

In Part 1 we created an Instance with no extensions. Now we need the platform surface extensions so Vulkan can render to our window.

vulkan_rust provides required_extensions() which returns the right extensions for your platform.

use vulkan_rust::{Entry, LibloadingLoader};
use vulkan_rust::vk;
use vk::*;

// ── Load Vulkan ────────────────────────────────────────────────
let loader = LibloadingLoader::new()
    .expect("Vulkan library not found");
let entry = unsafe { Entry::new(loader) }
    .expect("Failed to load Vulkan");

// ── Gather required extensions ─────────────────────────────────
//
// required_extensions() returns platform-specific extensions:
//   Windows: VK_KHR_surface + VK_KHR_win32_surface
//   Linux:   VK_KHR_surface + VK_KHR_xlib_surface + VK_KHR_wayland_surface
//   macOS:   VK_KHR_surface + VK_EXT_metal_surface
let surface_extensions = vulkan_rust::required_extensions();
let extension_ptrs: Vec<*const i8> = surface_extensions
    .iter()
    .map(|ext| ext.as_ptr())
    .collect();

// ── Enable the validation layer ────────────────────────────────
//
// Always enable during development. See the Validation Layers
// concept chapter for details.
let validation_layer = c"VK_LAYER_KHRONOS_validation";
let layer_ptrs = [validation_layer.as_ptr()];

// ── Create the instance ────────────────────────────────────────
let app_info = ApplicationInfo::builder()
    .application_name(c"Hello Triangle")
    .application_version(1)
    .engine_name(c"No Engine")
    .engine_version(1)
    .api_version(1 << 22);  // Vulkan 1.0

let create_info = InstanceCreateInfo::builder()
    .application_info(&app_info)
    .enabled_extension_names(&extension_ptrs)
    .enabled_layer_names(&layer_ptrs);

let instance = unsafe { entry.create_instance(&create_info, None) }
    .expect("Failed to create instance");

Before reading on: we enabled validation layers here but did not set up a debug messenger callback. What happens to validation errors?

They go to stderr on most platforms. Setting up a debug messenger (as shown in the Validation chapter) gives you programmatic control over the output. For a tutorial, stderr is fine.

Step 3: Create a Surface

A Surface is Vulkan’s abstraction over a platform window. It represents the thing you render to: a Win32 HWND, an X11 Window, a Wayland wl_surface, etc.

vulkan_rust provides instance.create_surface() which handles the platform dispatch for you via raw-window-handle.

// ── Create the surface ─────────────────────────────────────────
//
// create_surface uses raw-window-handle to detect the platform
// and call the right vkCreate*Surface function.
let surface = unsafe { instance.create_surface(&window, &window, None) }
    .expect("Failed to create surface");

The surface is an Instance-level object. It must be destroyed before the Instance.

Step 4: Pick a GPU (with presentation support)

In Part 1 we picked the first GPU. Now we also need to verify it can present to our surface, which means it has a queue family that supports both graphics and presentation.

// ── Enumerate GPUs ─────────────────────────────────────────────
let physical_devices = unsafe { instance.enumerate_physical_devices() }
    .expect("Failed to enumerate GPUs");

// ── Find a GPU with a queue family that supports both graphics
//    and presentation to our surface ────────────────────────────
use vk::*;

let mut physical_device = PhysicalDevice::null();
let mut graphics_family_index = 0u32;

'outer: for &pd in &physical_devices {
    let queue_families = unsafe {
        instance.get_physical_device_queue_family_properties(pd)
    };

    for (i, family) in queue_families.iter().enumerate() {
        let supports_graphics =
            family.queue_flags & QueueFlags::GRAPHICS
            != QueueFlags::empty();

        // Check if this queue family can present to our surface.
        let supports_present = unsafe {
            instance.get_physical_device_surface_support_khr(
                pd,
                i as u32,
                surface,
            )
        }
        .unwrap_or(false);

        if supports_graphics && supports_present {
            physical_device = pd;
            graphics_family_index = i as u32;
            break 'outer;
        }
    }
}

assert!(
    !physical_device.is_null(),
    "No GPU found with graphics + presentation support"
);

Before reading on: why do we check for presentation support separately from graphics support? Can a queue family support graphics but not presentation?

Yes. On some hardware, a queue family can execute graphics commands but cannot present to a specific surface. Presentation support depends on both the queue family and the surface (which is tied to a specific monitor/display). Always check with get_physical_device_surface_support_khr.

Step 5: Create the Device with the swapchain extension

Now we add VK_KHR_swapchain, the extension that lets us create a swapchain.

use vk::extension_names::KHR_SWAPCHAIN_EXTENSION_NAME;
let device_extensions = [KHR_SWAPCHAIN_EXTENSION_NAME.as_ptr()];

let queue_priority = 1.0_f32;
let queue_info = DeviceQueueCreateInfo::builder()
    .queue_family_index(graphics_family_index)
    .queue_priorities(std::slice::from_ref(&queue_priority));

let device_info = DeviceCreateInfo::builder()
    .queue_create_infos(std::slice::from_ref(&queue_info))
    .enabled_extension_names(&device_extensions);

let device = unsafe {
    instance.create_device(physical_device, &device_info, None)
}
.expect("Failed to create device");

let graphics_queue = unsafe {
    device.get_device_queue(graphics_family_index, 0)
};

Step 6: Query surface capabilities

Before creating a swapchain, we must ask the surface what it supports: image formats, present modes, minimum/maximum image count, and supported image sizes.

// ── Query what the surface supports ────────────────────────────
let capabilities = unsafe {
    instance.get_physical_device_surface_capabilities_khr(
        physical_device,
        surface,
    )
}
.expect("Failed to query surface capabilities");

let formats = unsafe {
    instance.get_physical_device_surface_formats_khr(
        physical_device,
        surface,
    )
}
.expect("Failed to query surface formats");

let present_modes = unsafe {
    instance.get_physical_device_surface_present_modes_khr(
        physical_device,
        surface,
    )
}
.expect("Failed to query present modes");

Step 7: Choose swapchain settings

We need to decide three things: the image format, the present mode, and the image extent (resolution).

use vk::*;

// ── Choose format ──────────────────────────────────────────────
//
// Prefer B8G8R8A8_SRGB with SRGB_NONLINEAR color space.
// Fall back to whatever is available.
let surface_format = formats
    .iter()
    .find(|f| {
        f.format == Format::B8G8R8A8_SRGB
            && f.color_space == ColorSpaceKHR::SRGB_NONLINEAR
    })
    .unwrap_or(&formats[0]);

// ── Choose present mode ────────────────────────────────────────
//
// MAILBOX = triple buffering (low latency, no tearing).
// FIFO = vsync (guaranteed available).
let present_mode = if present_modes.contains(&PresentModeKHR::MAILBOX) {
    PresentModeKHR::MAILBOX
} else {
    PresentModeKHR::FIFO  // always available
};

// ── Choose extent (resolution) ─────────────────────────────────
//
// If current_extent is 0xFFFFFFFF, the surface size is determined
// by the swapchain extent. Otherwise, use the surface's size.
let extent = if capabilities.current_extent.width != u32::MAX {
    capabilities.current_extent
} else {
    let size = window.inner_size();
    Extent2D {
        width: size.width.clamp(
            capabilities.min_image_extent.width,
            capabilities.max_image_extent.width,
        ),
        height: size.height.clamp(
            capabilities.min_image_extent.height,
            capabilities.max_image_extent.height,
        ),
    }
};

// ── Choose image count ─────────────────────────────────────────
//
// Request one more than the minimum so we always have an image
// to render to while the display is reading another.
let image_count = {
    let desired = capabilities.min_image_count + 1;
    if capabilities.max_image_count > 0 {
        desired.min(capabilities.max_image_count)
    } else {
        desired // max_image_count == 0 means no upper limit
    }
};
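The image-count clamp is worth checking against a few surface configurations. A standalone version of the same logic (the parameter names mirror the fields of the surface capabilities struct):

```rust
// max == 0 means the surface imposes no upper limit.
fn choose_image_count(min_image_count: u32, max_image_count: u32) -> u32 {
    let desired = min_image_count + 1;
    if max_image_count > 0 {
        desired.min(max_image_count)
    } else {
        desired
    }
}

fn main() {
    assert_eq!(choose_image_count(2, 0), 3); // no upper limit: min + 1
    assert_eq!(choose_image_count(2, 8), 3); // room to spare
    assert_eq!(choose_image_count(3, 3), 3); // clamped to the maximum
}
```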

Step 8: Create the swapchain

let swapchain_info = SwapchainCreateInfoKHR::builder()
    .surface(surface)
    .min_image_count(image_count)
    .image_format(surface_format.format)
    .image_color_space(surface_format.color_space)
    .image_extent(extent)
    .image_array_layers(1)
    .image_usage(ImageUsageFlags::COLOR_ATTACHMENT)
    .image_sharing_mode(SharingMode::EXCLUSIVE)
    .pre_transform(capabilities.current_transform)
    .composite_alpha(CompositeAlphaFlagBitsKHR::OPAQUE)
    .present_mode(present_mode)
    .clipped(true)       // discard pixels behind other windows
    .old_swapchain(SwapchainKHR::null());

let swapchain = unsafe {
    device.create_swapchain_khr(&swapchain_info, None)
}
.expect("Failed to create swapchain");

The swapchain now owns a set of images. We retrieve their handles next.

Step 9: Get swapchain images and create image views

The swapchain images are owned by the swapchain, so we do not destroy them ourselves. But we need image views to use them in render passes and framebuffers.

// ── Get the swapchain images ───────────────────────────────────
let swapchain_images = unsafe {
    device.get_swapchain_images_khr(swapchain)
}
.expect("Failed to get swapchain images");

println!("Swapchain has {} images", swapchain_images.len());

// ── Create an image view for each swapchain image ──────────────
let swapchain_image_views: Vec<ImageView> = swapchain_images
    .iter()
    .map(|&image| {
        let view_info = ImageViewCreateInfo::builder()
            .image(image)
            .view_type(ImageViewType::_2D)
            .format(surface_format.format)
            .components(ComponentMapping {
                r: ComponentSwizzle::IDENTITY,
                g: ComponentSwizzle::IDENTITY,
                b: ComponentSwizzle::IDENTITY,
                a: ComponentSwizzle::IDENTITY,
            })
            .subresource_range(ImageSubresourceRange {
                aspect_mask: ImageAspectFlags::COLOR,
                base_mip_level: 0,
                level_count: 1,
                base_array_layer: 0,
                layer_count: 1,
            });

        unsafe { device.create_image_view(&view_info, None) }
            .expect("Failed to create image view")
    })
    .collect();

Where we are now

At this point we have:

Window (winit)
  │
  └── Surface (VK_KHR_surface)
        │
        └── Swapchain (VK_KHR_swapchain)
              │
              ├── Image 0 ──> ImageView 0
              ├── Image 1 ──> ImageView 1
              └── Image 2 ──> ImageView 2

The swapchain gives us images to render into. The image views let us use those images in render passes. In Part 3, we will create a render pass and a graphics pipeline so we can actually draw something.

Clean up

Destruction in reverse creation order:

unsafe {
    // Image views (we created these)
    for &view in &swapchain_image_views {
        device.destroy_image_view(view, None);
    }

    // Swapchain (device-level)
    device.destroy_swapchain_khr(swapchain, None);

    // Device
    device.destroy_device(None);

    // Surface (instance-level, before instance)
    instance.destroy_surface(surface, None);

    // Instance
    instance.destroy_instance(None);
}

What we learned

| Step | What | Why |
|------|------|-----|
| Surface extensions | `required_extensions()` | Platform-specific window integration |
| Validation layer | `VK_LAYER_KHRONOS_validation` | Catch mistakes during development |
| Surface | `instance.create_surface()` | Connect Vulkan to a window |
| Presentation check | `get_physical_device_surface_support_khr` | Ensure the GPU can present to this surface |
| Swapchain extension | `VK_KHR_swapchain` | Enable swapchain creation on the device |
| Surface capabilities | `get_physical_device_surface_capabilities_khr` | Query supported formats, sizes, present modes |
| Swapchain | `create_swapchain_khr` | A set of images the display rotates through |
| Image views | `create_image_view` | Make swapchain images usable by render passes |

Concepts to explore

Exercises

  1. Print all surface formats. Before choosing a format, print every format and color space the surface supports.
  2. Print the chosen present mode. Print which present mode was selected (MAILBOX or FIFO) and why.
  3. Handle no validation layer. What happens if the validation layer is not installed? Modify the code to check for its availability with enumerate_instance_layer_properties and skip it gracefully.

Next

Part 3: Render Pass & Pipeline creates the graphics pipeline that defines how we draw our triangle.

Hello Triangle, Part 3: Render Pass & Pipeline

In Part 2 we opened a window, created a surface and swapchain, and retrieved image views. We have somewhere to render, but no instructions for how to render.

What we build in this part:

Write shaders ──> Create Render Pass ──> Create Pipeline ──> Create Framebuffers

Threshold concept. The graphics pipeline is one of Vulkan’s biggest conceptual shifts. Instead of setting state one call at a time (like OpenGL’s glEnable(GL_DEPTH_TEST)), you define all rendering state in a single pipeline object. This is verbose, but it means the driver has complete information at creation time and compiles everything to GPU machine code once, not at draw time.
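
The "baked state" idea can be sketched with a toy model in plain Rust (illustrative only; every name here is made up for the analogy, none of it is vulkan_rust API):

```rust
// A toy contrast between a mutable state machine and Vulkan-style
// baked pipeline state. All names here are invented for the analogy.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Cull {
    None,
    Back,
}

// Vulkan-style: every piece of state is fixed at creation time, so
// a driver can validate and compile the combination once instead of
// re-checking mutable state at every draw call.
struct Pipeline {
    cull: Cull,
    depth_test: bool,
}

impl Pipeline {
    fn new(cull: Cull, depth_test: bool) -> Self {
        Self { cull, depth_test }
    }
}

fn main() {
    // Need different state? Create (and later bind) another pipeline
    // object; you never mutate an existing one.
    let solid = Pipeline::new(Cull::Back, true);
    let wireframe_like = Pipeline::new(Cull::None, false);
    assert_ne!(solid.cull, wireframe_like.cull);
    assert!(solid.depth_test && !wireframe_like.depth_test);
}
```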

Step 1: Write shaders

We need a vertex shader (positions the triangle) and a fragment shader (colors it). Write these as GLSL and compile to SPIR-V.

triangle.vert:

#version 450

// Hard-coded triangle vertices (no vertex buffer needed).
vec2 positions[3] = vec2[](
    vec2( 0.0, -0.5),
    vec2( 0.5,  0.5),
    vec2(-0.5,  0.5)
);

vec3 colors[3] = vec3[](
    vec3(1.0, 0.0, 0.0),   // red
    vec3(0.0, 1.0, 0.0),   // green
    vec3(0.0, 0.0, 1.0)    // blue
);

layout(location = 0) out vec3 frag_color;

void main() {
    gl_Position = vec4(positions[gl_VertexIndex], 0.0, 1.0);
    frag_color = colors[gl_VertexIndex];
}

triangle.frag:

#version 450

layout(location = 0) in vec3 frag_color;
layout(location = 0) out vec4 out_color;

void main() {
    out_color = vec4(frag_color, 1.0);
}

Compile them with glslc (included in the Vulkan SDK):

glslc triangle.vert -o triangle.vert.spv
glslc triangle.frag -o triangle.frag.spv

Place the .spv files in your project’s src/ directory (or wherever you prefer; adjust the path in the code below).

Before reading on: this vertex shader hard-codes the triangle positions inside the shader rather than reading them from a vertex buffer. Why might this be useful for a first example?

It eliminates the need for vertex buffers, memory allocation, and buffer binding, letting us focus on the pipeline and render pass without those distractions. A real application reads vertices from buffers (covered in the Memory Management chapter).

Step 2: Load SPIR-V and create shader modules

use vulkan_rust::vk;
use vulkan_rust::cast_to_u32;
use vk::*;

// ── Load SPIR-V bytecode ───────────────────────────────────────
let vert_bytes = include_bytes!("triangle.vert.spv");
let frag_bytes = include_bytes!("triangle.frag.spv");

// SPIR-V must be aligned to 4 bytes. cast_to_u32 checks alignment.
let vert_code = cast_to_u32(vert_bytes)
    .expect("Vertex shader SPIR-V is not 4-byte aligned");
let frag_code = cast_to_u32(frag_bytes)
    .expect("Fragment shader SPIR-V is not 4-byte aligned");

// ── Create shader modules ──────────────────────────────────────
let vert_info = ShaderModuleCreateInfo::builder()
    .code(vert_code);
let frag_info = ShaderModuleCreateInfo::builder()
    .code(frag_code);

let vert_module = unsafe { device.create_shader_module(&vert_info, None) }
    .expect("Failed to create vertex shader module");
let frag_module = unsafe { device.create_shader_module(&frag_info, None) }
    .expect("Failed to create fragment shader module");

Shader modules are temporary containers. After the pipeline is created, we can destroy them.
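
For intuition, here is a simplified sketch of the kind of check a helper like cast_to_u32 performs. This is a hypothetical stand-in, not the crate's implementation: it copies into a Vec<u32>, which sidesteps the pointer-alignment concern that an in-place cast (like the real helper) must also verify.

```rust
// Hypothetical stand-in for a check like `cast_to_u32` (NOT the
// crate's implementation). SPIR-V is a stream of little-endian
// 32-bit words, so the byte length must be a multiple of 4.
fn bytes_to_spirv_words(bytes: &[u8]) -> Option<Vec<u32>> {
    if bytes.is_empty() || bytes.len() % 4 != 0 {
        return None; // not a whole number of SPIR-V words
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|w| u32::from_le_bytes([w[0], w[1], w[2], w[3]]))
            .collect(),
    )
}

fn main() {
    // A valid SPIR-V module starts with the magic number 0x07230203.
    let header = [0x03, 0x02, 0x23, 0x07];
    assert_eq!(bytes_to_spirv_words(&header), Some(vec![0x0723_0203]));

    // A truncated file is rejected.
    assert_eq!(bytes_to_spirv_words(&[0x03, 0x02, 0x23]), None);
}
```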

Step 3: Create the render pass

The render pass describes what attachments we render to and how they are handled. See Render Passes & Framebuffers for the full concept.

use vulkan_rust::vk;
use vk::*;

// ── Color attachment: the swapchain image ──────────────────────
let color_attachment = AttachmentDescription {
    flags: AttachmentDescriptionFlags::empty(),
    format: surface_format.format,  // from Part 2
    samples: SampleCountFlagBits::_1,
    load_op: AttachmentLoadOp::CLEAR,       // clear to black
    store_op: AttachmentStoreOp::STORE,      // keep the result
    stencil_load_op: AttachmentLoadOp::DONT_CARE,
    stencil_store_op: AttachmentStoreOp::DONT_CARE,
    initial_layout: ImageLayout::UNDEFINED,
    final_layout: ImageLayout::PRESENT_SRC,  // ready for display
};

// ── Subpass: use the color attachment ──────────────────────────
let color_ref = AttachmentReference {
    attachment: 0,
    layout: ImageLayout::COLOR_ATTACHMENT_OPTIMAL,
};

let subpass = SubpassDescription {
    flags: SubpassDescriptionFlags::empty(),
    pipeline_bind_point: PipelineBindPoint::GRAPHICS,
    input_attachment_count: 0,
    p_input_attachments: core::ptr::null(),
    color_attachment_count: 1,
    p_color_attachments: &color_ref,
    p_resolve_attachments: core::ptr::null(),
    p_depth_stencil_attachment: core::ptr::null(),
    preserve_attachment_count: 0,
    p_preserve_attachments: core::ptr::null(),
};

// ── Subpass dependency ─────────────────────────────────────────
//
// Ensure the image layout transition happens before we write color.
let dependency = SubpassDependency {
    src_subpass: vk::SUBPASS_EXTERNAL,
    dst_subpass: 0,
    src_stage_mask: PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT,
    dst_stage_mask: PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT,
    src_access_mask: AccessFlags::NONE,
    dst_access_mask: AccessFlags::COLOR_ATTACHMENT_WRITE,
    dependency_flags: DependencyFlags::empty(),
};

let render_pass_info = RenderPassCreateInfo::builder()
    .attachments(std::slice::from_ref(&color_attachment))
    .subpasses(std::slice::from_ref(&subpass))
    .dependencies(std::slice::from_ref(&dependency));

let render_pass = unsafe {
    device.create_render_pass(&render_pass_info, None)
}
.expect("Failed to create render pass");

Step 4: Create the pipeline layout

Our shaders don’t use any descriptors or push constants, so the layout is empty.

use vulkan_rust::vk;
use vk::*;

let layout_info = PipelineLayoutCreateInfo::builder();
let pipeline_layout = unsafe {
    device.create_pipeline_layout(&layout_info, None)
}
.expect("Failed to create pipeline layout");

Step 5: Create the graphics pipeline

This is the largest struct in the Vulkan API. Every piece of rendering state is specified here.

use vulkan_rust::vk;
use vk::*;

// ── Shader stages ──────────────────────────────────────────────
let entry_name = c"main";
let stages = [
    *PipelineShaderStageCreateInfo::builder()
        .stage(ShaderStageFlags::VERTEX)
        .module(vert_module)
        .name(entry_name),
    *PipelineShaderStageCreateInfo::builder()
        .stage(ShaderStageFlags::FRAGMENT)
        .module(frag_module)
        .name(entry_name),
];

// ── Vertex input: empty (positions are hard-coded in shader) ───
let vertex_input = PipelineVertexInputStateCreateInfo::builder();

// ── Input assembly: triangle list ──────────────────────────────
let input_assembly = PipelineInputAssemblyStateCreateInfo::builder()
    .topology(PrimitiveTopology::TRIANGLE_LIST);

// ── Viewport and scissor: dynamic (set at draw time) ───────────
let mut viewport_state = PipelineViewportStateCreateInfo::builder();
viewport_state.viewport_count = 1;
viewport_state.scissor_count = 1;

// ── Rasterization ──────────────────────────────────────────────
let rasterizer = PipelineRasterizationStateCreateInfo::builder()
    .polygon_mode(PolygonMode::FILL)
    .cull_mode(CullModeFlags::BACK)
    .front_face(FrontFace::CLOCKWISE)
    .line_width(1.0);

// ── Multisampling: off ─────────────────────────────────────────
let multisampling = PipelineMultisampleStateCreateInfo::builder()
    .rasterization_samples(SampleCountFlagBits::_1);

// ── Color blending: no blending, write all channels ────────────
let blend_attachment = PipelineColorBlendAttachmentState {
    blend_enable: 0,
    color_write_mask: ColorComponentFlags::R
        | ColorComponentFlags::G
        | ColorComponentFlags::B
        | ColorComponentFlags::A,
    ..unsafe { core::mem::zeroed() }
};

let color_blending = PipelineColorBlendStateCreateInfo::builder()
    .attachments(std::slice::from_ref(&blend_attachment));

// ── Dynamic state ──────────────────────────────────────────────
let dynamic_states = [DynamicState::VIEWPORT, DynamicState::SCISSOR];
let dynamic_state = PipelineDynamicStateCreateInfo::builder()
    .dynamic_states(&dynamic_states);

// ── Assemble the pipeline ──────────────────────────────────────
let pipeline_info = GraphicsPipelineCreateInfo::builder()
    .stages(&stages)
    .vertex_input_state(&vertex_input)
    .input_assembly_state(&input_assembly)
    .viewport_state(&viewport_state)
    .rasterization_state(&rasterizer)
    .multisample_state(&multisampling)
    .color_blend_state(&color_blending)
    .dynamic_state(&dynamic_state)
    .layout(pipeline_layout)
    .render_pass(render_pass)
    .subpass(0);

let pipeline = unsafe {
    device.create_graphics_pipelines(
        PipelineCache::null(),
        &[*pipeline_info],
        None,
    )
}
.expect("Failed to create graphics pipeline")[0];

// ── Shader modules are no longer needed ────────────────────────
unsafe {
    device.destroy_shader_module(vert_module, None);
    device.destroy_shader_module(frag_module, None);
};

Before reading on: we set cull_mode to BACK and front_face to CLOCKWISE. What happens if the triangle vertices are wound counter-clockwise? What would you see?

The triangle would be culled (invisible). Back-face culling discards triangles whose vertices appear in the wrong winding order from the camera’s perspective. If your triangle is invisible, try switching to COUNTER_CLOCKWISE or disabling culling with CullModeFlags::NONE.
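
You can verify the winding of the hard-coded vertices with a few lines of plain Rust (an illustrative helper, not part of vulkan_rust). Remember that Vulkan's framebuffer Y axis points down, which flips the sign convention you may know from Y-up math:

```rust
// Signed area test for triangle winding (illustrative helper).
// In a Y-down coordinate system like Vulkan's framebuffer space,
// a positive value means the vertices run clockwise on screen.
fn signed_area(p: [[f32; 2]; 3]) -> f32 {
    let [a, b, c] = p;
    (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
}

fn main() {
    // The vertices hard-coded in triangle.vert: top, bottom-right,
    // bottom-left. With Y down, this order is clockwise, matching
    // FrontFace::CLOCKWISE in the pipeline above.
    let tri = [[0.0, -0.5], [0.5, 0.5], [-0.5, 0.5]];
    assert!(signed_area(tri) > 0.0);

    // Swapping two vertices reverses the winding: the triangle
    // would be back-face culled.
    let reversed = [[0.0, -0.5], [-0.5, 0.5], [0.5, 0.5]];
    assert!(signed_area(reversed) < 0.0);
}
```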

Step 6: Create framebuffers

A framebuffer binds specific image views to a render pass. We need one per swapchain image.

use vulkan_rust::vk;
use vk::*;

let framebuffers: Vec<Framebuffer> = swapchain_image_views
    .iter()
    .map(|&view| {
        let views = [view];
        let fb_info = FramebufferCreateInfo::builder()
            .render_pass(render_pass)
            .attachments(&views)
            .width(extent.width)
            .height(extent.height)
            .layers(1);

        unsafe { device.create_framebuffer(&fb_info, None) }
            .expect("Failed to create framebuffer")
    })
    .collect();

Where we are now

Render Pass        "clear to black, store the result, present"
     │
Pipeline           "use these shaders, fill triangles, no blending"
     │
Framebuffers       [swapchain image 0, swapchain image 1, ...]

We have everything needed to describe what to draw and how. In Part 4, we record commands that use the pipeline and render pass, submit them, and present the result.

Clean up (new objects)

Add these to the cleanup sequence from Part 2, before device destruction:

unsafe {
    for &fb in &framebuffers {
        device.destroy_framebuffer(fb, None);
    }
    device.destroy_pipeline(pipeline, None);
    device.destroy_pipeline_layout(pipeline_layout, None);
    device.destroy_render_pass(render_pass, None);
    // ... then image views, swapchain, device, surface, instance
}

What we learned

| Step | What | Why |
|------|------|-----|
| Shaders | GLSL → SPIR-V → ShaderModule | GPU programs that position and color pixels |
| Render pass | `create_render_pass` | Declares attachments and how they are loaded/stored |
| Pipeline layout | `create_pipeline_layout` | Declares what resources shaders expect (none for now) |
| Graphics pipeline | `create_graphics_pipelines` | Bakes all rendering state into one compiled object |
| Framebuffers | `create_framebuffer` | Binds specific images to a render pass |

Concepts to explore

Exercises

  1. Change the clear color. Modify the render pass begin info (in Part 4) to clear to a different color. The clear value is passed when beginning the render pass, not when creating it.
  2. Add a depth attachment. Create a depth image and image view, add a second attachment to the render pass, and enable depth testing in the pipeline.
  3. Try PolygonMode::LINE. Change the polygon mode to LINE to see the triangle as wireframe. (Requires the fillModeNonSolid device feature.)

Next

Part 4: Command Buffers & Drawing records the draw commands, submits them, and presents the triangle to the screen.

Hello Triangle, Part 4: Command Buffers & Drawing

This is the final part. In Part 3 we created the render pass, pipeline, and framebuffers. Now we record commands, submit them, and present a triangle to the screen.

What we build in this part:

Create sync objects ──> Create command pool/buffers
       │                        │
       └──> Render loop: acquire image ──> record commands ──> submit ──> present

This part ties together every concept from the previous three parts. When you see the triangle, you will have written a complete Vulkan application.

Step 1: Create synchronization objects

We need fences and semaphores to coordinate CPU and GPU work. See Synchronization for the full concept.

use vulkan_rust::vk;
use vk::*;

// ── Semaphores: GPU-to-GPU synchronization ─────────────────────
let sem_info = SemaphoreCreateInfo::builder();

// "The swapchain image is ready to render into."
let image_available = unsafe { device.create_semaphore(&sem_info, None) }
    .expect("Failed to create semaphore");

// "Rendering is done, safe to present."
let render_finished = unsafe { device.create_semaphore(&sem_info, None) }
    .expect("Failed to create semaphore");

// ── Fence: GPU-to-CPU synchronization ──────────────────────────
//
// SIGNALED so the first frame doesn't block forever waiting for
// a "previous frame" that never existed.
let fence_info = FenceCreateInfo::builder()
    .flags(FenceCreateFlags::SIGNALED);

let in_flight_fence = unsafe { device.create_fence(&fence_info, None) }
    .expect("Failed to create fence");

Before reading on: why do we create the fence with SIGNALED? What would happen on the first frame if we didn’t?

The render loop starts by waiting for the fence. On the first frame, no GPU work has been submitted yet, so an unsignaled fence would block forever. Starting it signaled lets the first frame pass through immediately.
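
The deadlock can be sketched with a toy model in plain Rust (illustrative only; `Fence` and `try_wait` here are made-up names, not Vulkan or vulkan_rust API):

```rust
// A CPU-side analogy: a fence is a flag that pending GPU work will
// eventually set. `try_wait` answers "would a wait ever return?" --
// only if the fence is already signaled or work is still in flight.
struct Fence {
    signaled: bool,
}

impl Fence {
    fn new(signaled: bool) -> Self {
        Self { signaled }
    }

    fn try_wait(&self, work_pending: bool) -> bool {
        self.signaled || work_pending
    }

    fn reset(&mut self) {
        self.signaled = false;
    }
}

fn main() {
    // First frame: created SIGNALED, no work submitted yet.
    let mut fence = Fence::new(true);
    assert!(fence.try_wait(false)); // wait passes immediately
    fence.reset();

    // Without SIGNALED, the first frame's wait would never return:
    // nothing was submitted, so nothing will ever signal the fence.
    let stuck = Fence::new(false);
    assert!(!stuck.try_wait(false));
}
```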

Step 2: Create a command pool and command buffer

use vulkan_rust::vk;
use vk::*;

// ── Command pool ───────────────────────────────────────────────
let pool_info = CommandPoolCreateInfo::builder()
    .flags(CommandPoolCreateFlags::RESET_COMMAND_BUFFER)
    .queue_family_index(graphics_family_index);

let command_pool = unsafe { device.create_command_pool(&pool_info, None) }
    .expect("Failed to create command pool");

// ── Allocate one command buffer ────────────────────────────────
let alloc_info = CommandBufferAllocateInfo::builder()
    .command_pool(command_pool)
    .level(CommandBufferLevel::PRIMARY)
    .command_buffer_count(1);

let command_buffer = unsafe {
    device.allocate_command_buffers(&alloc_info)
}
.expect("Failed to allocate command buffer")[0];

Step 3: Record drawing commands

This function records all the commands needed to draw one frame. We call it every frame with the correct framebuffer for the current swapchain image.

use vulkan_rust::vk;
use vk::*;

unsafe fn record_commands(
    device: &vulkan_rust::Device,
    command_buffer: CommandBuffer,
    render_pass: RenderPass,
    framebuffer: Framebuffer,
    pipeline: Pipeline,
    extent: Extent2D,
) {
    unsafe {
    // ── Begin recording ────────────────────────────────────────
    let begin_info = CommandBufferBeginInfo::builder();
    device.begin_command_buffer(command_buffer, &begin_info)
        .expect("Failed to begin command buffer");

    // ── Begin render pass ──────────────────────────────────────
    let clear_value = ClearValue {
        color: ClearColorValue {
            float32: [0.0, 0.0, 0.0, 1.0],  // black
        },
    };

    let clear_values = [clear_value];
    let rp_begin = RenderPassBeginInfo::builder()
        .render_pass(render_pass)
        .framebuffer(framebuffer)
        .render_area(Rect2D {
            offset: Offset2D { x: 0, y: 0 },
            extent,
        })
        .clear_values(&clear_values);

    device.cmd_begin_render_pass(
        command_buffer,
        &rp_begin,
        SubpassContents::INLINE,
    );

    // ── Bind the pipeline ──────────────────────────────────────
    device.cmd_bind_pipeline(
        command_buffer,
        PipelineBindPoint::GRAPHICS,
        pipeline,
    );

    // ── Set dynamic viewport and scissor ───────────────────────
    let viewport = Viewport {
        x: 0.0,
        y: 0.0,
        width: extent.width as f32,
        height: extent.height as f32,
        min_depth: 0.0,
        max_depth: 1.0,
    };
    device.cmd_set_viewport(command_buffer, 0, &[viewport]);

    let scissor = Rect2D {
        offset: Offset2D { x: 0, y: 0 },
        extent,
    };
    device.cmd_set_scissor(command_buffer, 0, &[scissor]);

    // ── Draw the triangle ──────────────────────────────────────
    //
    // 3 vertices, 1 instance, starting at vertex 0, instance 0.
    // The vertex data is hard-coded in the shader.
    device.cmd_draw(command_buffer, 3, 1, 0, 0);

    // ── End render pass and recording ──────────────────────────
    device.cmd_end_render_pass(command_buffer);
    device.end_command_buffer(command_buffer)
        .expect("Failed to end command buffer");
    }
}

This is the core of every Vulkan frame: begin recording, begin render pass, bind pipeline, set state, draw, end render pass, end recording.

Step 4: The render loop

Now we tie everything together in the event loop. Each frame:

  1. Wait for the previous frame’s fence (CPU waits for GPU).
  2. Acquire the next swapchain image (GPU signals image_available).
  3. Record commands into the command buffer.
  4. Submit the command buffer (waits on image_available, signals render_finished and the fence).
  5. Present the image (waits on render_finished).

use winit::application::ApplicationHandler;
use winit::event::WindowEvent;
use winit::event_loop::{ActiveEventLoop, EventLoop};
use winit::window::WindowId;

impl ApplicationHandler for App {
    fn resumed(&mut self, _event_loop: &ActiveEventLoop) {
        // Window and Vulkan setup already done (see Part 2).
    }

    fn window_event(
        &mut self,
        event_loop: &ActiveEventLoop,
        _id: WindowId,
        event: WindowEvent,
    ) {
        match event {
            WindowEvent::CloseRequested => {
                event_loop.exit();
            }
            WindowEvent::RedrawRequested => {
                unsafe { self.draw_frame() };
                // Request the next frame immediately.
                self.window.as_ref().unwrap().request_redraw();
            }
            _ => {}
        }
    }
}

// In main:
let event_loop = EventLoop::new().expect("Failed to create event loop");
event_loop.run_app(&mut app).expect("Event loop error");

The draw_frame function:

use vulkan_rust::vk;
use vk::*;

unsafe fn draw_frame(
    device: &vulkan_rust::Device,
    swapchain: SwapchainKHR,
    in_flight_fence: Fence,
    image_available: Semaphore,
    render_finished: Semaphore,
    command_buffer: CommandBuffer,
    framebuffers: &[Framebuffer],
    render_pass: RenderPass,
    pipeline: Pipeline,
    extent: Extent2D,
    graphics_queue: Queue,
) {
    unsafe {
    // ── 1. Wait for previous frame ─────────────────────────────
    device.wait_for_fences(&[in_flight_fence], true, u64::MAX)
        .expect("Failed to wait for fence");
    device.reset_fences(&[in_flight_fence])
        .expect("Failed to reset fence");

    // ── 2. Acquire next swapchain image ────────────────────────
    let image_index = device
        .acquire_next_image_khr(
            swapchain,
            u64::MAX,
            image_available,
            Fence::null(),
        )
        .expect("Failed to acquire swapchain image");

    // ── 3. Record commands ─────────────────────────────────────
    device.reset_command_buffer(
        command_buffer,
        CommandBufferResetFlags::empty(),
    )
    .expect("Failed to reset command buffer");

    record_commands(
        device,
        command_buffer,
        render_pass,
        framebuffers[image_index as usize],
        pipeline,
        extent,
    );

    // ── 4. Submit ──────────────────────────────────────────────
    let wait_sems = [image_available];
    let wait_stages = [PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];
    let cmd_bufs = [command_buffer];
    let signal_sems = [render_finished];

    let submit_info = SubmitInfo::builder()
        .wait_semaphores(&wait_sems)
        .wait_dst_stage_mask(&wait_stages)
        .command_buffers(&cmd_bufs)
        .signal_semaphores(&signal_sems);

    device.queue_submit(graphics_queue, &[*submit_info], in_flight_fence)
        .expect("Failed to submit draw command buffer");

    // ── 5. Present ─────────────────────────────────────────────
    let present_wait = [render_finished];
    let swapchains = [swapchain];
    let indices = [image_index];
    let present_info = PresentInfoKHR::builder()
        .wait_semaphores(&present_wait)
        .swapchains(&swapchains)
        .image_indices(&indices);

    device.queue_present_khr(graphics_queue, &present_info)
        .expect("Failed to present");
    }
}

The synchronization flow each frame:

CPU: wait_for_fences ───────────────────────────────> (free to continue)
       │
       v
GPU: acquire_next_image ──signals──> image_available
                                        │  (submitted work waits here,
                                        v   at COLOR_ATTACHMENT_OUTPUT)
GPU: queue_submit work runs ─┬─signals──> render_finished ──> queue_present
                             └─signals──> in_flight_fence
                                               │
CPU: (next frame) wait_for_fences <────────────┘

Step 5: Wait before cleanup

Before destroying anything, wait for the GPU to finish all work:

// After the event loop exits:
unsafe { device.device_wait_idle() }
    .expect("Failed to wait for device idle");

Then destroy everything in reverse creation order:

unsafe {
    device.destroy_fence(in_flight_fence, None);
    device.destroy_semaphore(render_finished, None);
    device.destroy_semaphore(image_available, None);
    device.destroy_command_pool(command_pool, None);

    for &fb in &framebuffers {
        device.destroy_framebuffer(fb, None);
    }
    device.destroy_pipeline(pipeline, None);
    device.destroy_pipeline_layout(pipeline_layout, None);
    device.destroy_render_pass(render_pass, None);

    for &view in &swapchain_image_views {
        device.destroy_image_view(view, None);
    }
    device.destroy_swapchain_khr(swapchain, None);

    device.destroy_device(None);
    instance.destroy_surface(surface, None);
    instance.destroy_instance(None);
}

You did it

Run cargo run. You should see a window with a colored triangle on a black background:

┌──────────────────────────────────────┐
│                                      │
│              ▲ (red)                 │
│             ╱ ╲                      │
│            ╱   ╲                     │
│  (blue)   ╱─────╲  (green)           │
│                                      │
└──────────────────────────────────────┘

If you see a black window with no triangle, check these common issues:

  1. Validation errors in the console. Read them. They usually point directly at the problem.
  2. Front face winding. If your triangle vertices are wound counter-clockwise but you set CLOCKWISE, the triangle is culled. Try CullModeFlags::NONE to test.
  3. Missing SPIR-V files. include_bytes! panics at compile time if the file is not found.

What we built across all four parts

Part 1: Entry ──> Instance ──> PhysicalDevice ──> Device ──> Queue
Part 2: Window ──> Surface ──> Swapchain ──> ImageViews
Part 3: Shaders ──> RenderPass ──> Pipeline ──> Framebuffers
Part 4: Sync objects ──> CommandPool/Buffer ──> Render loop

Every Vulkan application follows this structure. The details change (more pipelines, more buffers, more complex synchronization), but the architecture is the same.

What we skipped

This tutorial focused on getting a triangle on screen. A production application would add:

  • Multiple frames in flight to avoid the CPU waiting for the GPU every frame. See Double Buffering.
  • Window resize handling to recreate the swapchain when the window size changes. See Handle Window Resize.
  • Vertex buffers to pass vertex data from CPU memory to the GPU. See Memory Management.
  • Descriptor sets to pass uniforms and textures to shaders. See Descriptor Sets.
  • Depth testing for 3D rendering.

Exercises

  1. Change the triangle color. Modify the fragment shader (or the vertex shader’s color array) and recompile the SPIR-V.
  2. Draw a rectangle. Change the shader to output 6 vertices (two triangles) and update the cmd_draw vertex count.
  3. Add frames in flight. Create two sets of sync objects and command buffers. Alternate between them each frame so the CPU can record frame N+1 while the GPU renders frame N.
  4. Handle resize. When the window is resized, recreate the swapchain, image views, and framebuffers. The Handle Window Resize guide covers this.

Where to go from here

How to Read This Section

This section explains how Vulkan works, not as a tutorial to follow, but as a set of mental models you can carry with you while writing any Vulkan code.

Structure of each chapter

Every concept chapter follows the same four-part structure:

| Part | Purpose | How to use it |
|------|---------|---------------|
| Motivation | Why this concept exists | Read first, it tells you what problem you’re solving |
| Intuition | Analogy, diagram, or informal explanation | Build a mental picture before touching code |
| Worked example | Annotated code showing the concept in practice | Read the annotations, not just the code |
| Formal reference | Spec terminology, edge cases, API links | Come back to this when you need precision |

You do not need to memorize the formal reference on first reading. The intuition and worked example are enough to start writing code. The formal section is there for when your intuition hits an edge case and you need to know exactly what the spec says.

Threshold concepts

Some ideas in Vulkan are threshold concepts: once they click, they permanently change how you understand the API. These are flagged with a marker:

Threshold concept. This idea transforms how you think about Vulkan. If it feels confusing, that is normal; it means your mental model is being restructured. Stay with it.

The three biggest threshold concepts in Vulkan are:

  1. Explicit memory management: you allocate GPU memory yourself and decide what goes where.
  2. Synchronization is your responsibility: the GPU runs asynchronously, and Vulkan gives you no implicit ordering guarantees.
  3. State is baked into pipeline objects: you cannot change rendering state on the fly like in OpenGL.

Reading order

The chapters are ordered by dependency: each builds on the ones before it. If a concept doesn’t make sense, check the dependency map to see which prerequisite you might need to revisit.

Two chapters are independent and can be read at any time:

Active reading

Throughout each chapter, you will find questions like:

Before reading on: why do you think Vulkan requires explicit synchronization instead of handling it automatically?

These are retrieval prompts. Pausing to answer, even briefly, even wrong, significantly improves retention. You are not expected to know the answer. The act of thinking about it before reading the explanation is what matters.

The Vulkan Object Model

Motivation

Every Vulkan API call operates on handles, opaque references to objects that live on the GPU or in the driver. Before you can do anything useful in Vulkan, you need to understand what these handles are, how they relate to each other, and who is responsible for destroying them.

If you have used file descriptors on Unix, database connections, or COM objects on Windows, the concept is the same: you request a resource, you get back an opaque identifier, you use that identifier in every subsequent call, and you close it when you are done. Vulkan has roughly 59 different handle types, but they all follow this pattern.

Intuition

Handles are opaque identifiers, not objects

A Vulkan handle is not a pointer to a struct you can inspect. It is an opaque number the driver gives you. You pass it back to the driver in later calls, and the driver uses it to look up the real resource internally. You never dereference a handle or read its fields.

In vulkan_rust, every handle is a #[repr(transparent)] newtype over either usize or u64:

// This is the entire definition of a Buffer handle.
// There is nothing inside it except a number.
#[repr(transparent)]
pub struct Buffer(u64);

Handles form a parent-child tree

Vulkan objects are not independent. They form a hierarchy where each object is created from (and belongs to) a parent:

Instance                          (your connection to the Vulkan driver)
├── PhysicalDevice                (a GPU on the system, enumerated, not created)
│   └── Device                    (your logical interface to that GPU)
│       ├── Queue                 (a submission endpoint, retrieved, not created)
│       ├── CommandPool
│       │   └── CommandBuffer     (allocated from a pool, not created directly)
│       ├── Buffer
│       ├── Image
│       ├── Fence
│       ├── Semaphore
│       ├── Pipeline
│       ├── DescriptorPool
│       │   └── DescriptorSet     (allocated from a pool, not created directly)
│       └── ... (~50 more types)
└── SurfaceKHR                    (a window's rendering target)

This hierarchy determines two things:

  1. Creation order. You cannot create a Buffer without a Device, and you cannot create a Device without a PhysicalDevice, which requires an Instance.
  2. Destruction order. You must destroy children before their parent. If you destroy a Device while it still has live Buffer handles, that is undefined behavior.
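
The child-before-parent rule maps nicely onto Rust's own semantics: locals drop in reverse declaration order, so declaring objects in creation order gives the correct destruction order for free. A toy illustration in plain Rust (not vulkan_rust's API):

```rust
// Toy drop-order demo: each Tracker logs its name when dropped.
use std::cell::RefCell;
use std::rc::Rc;

struct Tracker(&'static str, Rc<RefCell<Vec<&'static str>>>);

impl Drop for Tracker {
    fn drop(&mut self) {
        self.1.borrow_mut().push(self.0);
    }
}

fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        // Declared in creation order: parent first, children after.
        let _instance = Tracker("instance", log.clone());
        let _device = Tracker("device", log.clone());
        let _buffer = Tracker("buffer", log.clone());
    } // scope ends: locals drop in reverse declaration order
    let order = log.borrow().clone();
    order
}

fn main() {
    // Children destroyed before their parents, as Vulkan requires.
    assert_eq!(drop_order(), ["buffer", "device", "instance"]);
}
```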

Before reading on: look at the tree above. Why do you think CommandBuffer and DescriptorSet are “allocated from a pool” instead of “created directly” like Buffer or Image?

The creation-destruction lifecycle

Almost every Vulkan object follows the same lifecycle:

1. Fill a CreateInfo struct     (describe what you want)
2. Call create_xxx()            (driver creates it, gives you a handle)
3. Use the handle               (pass it to other API calls)
4. Call destroy_xxx()           (you are done, release it)

The exception is objects that are enumerated (PhysicalDevice, Queue) or allocated from pools (CommandBuffer, DescriptorSet). These have slightly different creation/destruction patterns, covered below.

Dispatchable vs non-dispatchable handles

Vulkan has two categories of handle, and the difference matters for understanding how the driver works internally.

Dispatchable handles (Instance, PhysicalDevice, Device, CommandBuffer, Queue) are pointer-sized (usize). Internally, the driver stores a dispatch table at the address the handle points to. When you call a Vulkan function, the loader uses this dispatch table to route the call to the correct driver. There are only 5 dispatchable handle types.

Non-dispatchable handles (Buffer, Image, Fence, Pipeline, and all the rest) are 64-bit integers (u64). They are opaque identifiers that the driver interprets however it likes. There are roughly 54 of these.

You rarely need to think about this distinction in application code. It matters when you are doing interop (passing handles between processes or APIs) or when you are debugging driver internals.

Worked example: the complete lifecycle of a Buffer

This example shows the full create-use-destroy lifecycle. Each step is labeled with its purpose.

use vulkan_rust::vk;
use vulkan_rust::vk::*;
use vulkan_rust::vk::Handle;
use vulkan_rust::Device;

unsafe fn buffer_lifecycle(device: &Device) {
    // ── Step 1: Describe what you want ──────────────────────────
    //
    // Every create call takes a CreateInfo struct. The builder
    // fills in sType automatically and provides a typed API
    // for each field.
    let buffer_info = BufferCreateInfo::builder()
        .size(1024)                           // 1 KiB
        .usage(BufferUsageFlags::VERTEX_BUFFER)
        .sharing_mode(SharingMode::EXCLUSIVE);

    // ── Step 2: Create the object ───────────────────────────────
    //
    // The driver allocates the resource and returns a handle.
    // This can fail (out of memory, invalid parameters), so it
    // returns a Result.
    let buffer: Buffer = device
        .create_buffer(&buffer_info, None)
        .expect("Failed to create buffer");

    // The handle is just a number. You can copy it, compare it,
    // hash it, or check if it is null.
    assert!(!buffer.is_null());

    // ── Step 3: Use the handle ──────────────────────────────────
    //
    // You would normally bind memory to this buffer, then use
    // it in command buffer recording. For this example, we just
    // show that the handle is a lightweight Copy type.
    let buffer_copy = buffer;  // handles are Copy
    assert_eq!(buffer, buffer_copy);

    // ── Step 4: Destroy the object ──────────────────────────────
    //
    // You must destroy the buffer before destroying the Device
    // that created it. vulkan_rust does not track this for you.
    // There is no Drop implementation. You are responsible.
    device.destroy_buffer(buffer, None);

    // After this point, using `buffer` is undefined behavior.
    // Rust's type system does not prevent this, the handle is
    // still a valid Copy value. Vulkan's validation layers
    // will catch use-after-destroy if you enable them.
}

Before reading on: the code above calls device.destroy_buffer(buffer, None). What do you think the second argument (None) is for? Hint: it relates to custom memory allocation, not GPU memory.

Objects that come from pools

CommandBuffers and DescriptorSets are not created individually. They are allocated in bulk from a pool, and freed back to that pool (or the entire pool is reset/destroyed at once):

use vulkan_rust::vk::*;
use vulkan_rust::Device;

// Pool-based lifecycle (simplified)
unsafe fn pool_lifecycle(
    device: &Device,
    graphics_queue_family: u32,
) -> VkResult<()> {
    // Create the pool (this is a normal create/destroy object).
    let pool_info = CommandPoolCreateInfo::builder()
        .queue_family_index(graphics_queue_family);
    let pool = device.create_command_pool(&pool_info, None)?;

    // Allocate command buffers FROM the pool.
    let alloc_info = CommandBufferAllocateInfo::builder()
        .command_pool(pool)
        .level(CommandBufferLevel::PRIMARY)
        .command_buffer_count(2);
    let command_buffers = device.allocate_command_buffers(&alloc_info)?;

    // Use command_buffers[0], command_buffers[1], ...

    // Option A: Free individual command buffers back to the pool.
    device.free_command_buffers(pool, &command_buffers);

    // Option B: Reset the entire pool (returns all buffers to the initial state).
    device.reset_command_pool(pool, CommandPoolResetFlags::empty())?;

    // Destroy the pool (implicitly frees all remaining command buffers).
    device.destroy_command_pool(pool, None);
    Ok(())
}

This pool pattern exists for performance: allocating and freeing individual small objects is expensive, so Vulkan amortizes the cost by batching them through pools.

Objects that are enumerated, not created

PhysicalDevices and Queues are not created by you. They are discovered:

unsafe {
    // PhysicalDevices: the driver tells you what GPUs exist.
    let physical_devices = instance.enumerate_physical_devices()?;

    // Queues: retrieved from a Device after creation.
    let queue = device.get_device_queue(queue_family_index, 0);
}

You do not destroy enumerated objects. Their lifetime is tied to their parent (PhysicalDevice lives as long as the Instance, Queue lives as long as the Device).

Formal reference

The Handle trait

Every handle type in vulkan_rust implements the Handle trait:

pub trait Handle: Copy + Eq + Hash {
    type Repr;                       // usize or u64
    fn null() -> Self;               // the null handle (0)
    fn from_raw(raw: Self::Repr) -> Self;
    fn as_raw(self) -> Self::Repr;
    fn is_null(self) -> bool;
}

All handles also derive Copy, Clone, PartialEq, Eq, Hash, Default (returns null), and Debug (prints the type name and hex value).

Handle categories

| Category | Repr | Examples | Count |
|---|---|---|---|
| Dispatchable | usize | Instance, PhysicalDevice, Device, CommandBuffer, Queue | 5 |
| Non-dispatchable | u64 | Buffer, Image, Fence, Semaphore, Pipeline, … | ~54 |

Destruction rules

  1. You must destroy what you create. vulkan_rust has no Drop implementations on handles. This is deliberate: automatic destruction would require tracking creation order, reference counting, and deferred destruction (the GPU might still be using the object). That complexity belongs in your application, not in the bindings.

  2. Destroy children before parents. The tree above defines the order. Validation layers will warn you if you get it wrong.

  3. The GPU must be done with an object before you destroy it. If a command buffer references a Buffer that you then destroy, the GPU will read freed memory. Use fences or device_wait_idle() to ensure GPU work has completed.

  4. Pool destruction frees all children. Destroying a CommandPool implicitly frees all CommandBuffers allocated from it. Same for DescriptorPool and DescriptorSets.

  5. Enumerated objects are not destroyed. PhysicalDevice and Queue handles are valid for the lifetime of their parent.

Interop: from_raw_parts

If another system creates Vulkan objects for you (OpenXR, a C library, a test harness), you can wrap them:

// Wrap an externally-created Instance.
let instance = unsafe {
    Instance::from_raw_parts(raw_instance_handle, get_instance_proc_addr_fn)
};

// Wrap an externally-created Device.
let device = unsafe {
    Device::from_raw_parts(raw_device_handle, get_device_proc_addr_fn)
};

The wrapped objects load all function pointers from the provided get_*_proc_addr function, so they work identically to objects created through Entry::create_instance.

Key takeaways

  • Vulkan handles are opaque numbers, not pointers to inspectable structs.
  • Handles form a parent-child tree. Create top-down (parents before children), destroy bottom-up (children before parents).
  • Most objects follow create → use → destroy. Pools and enumerated objects are the two exceptions.
  • vulkan_rust gives you Copy handles with no Drop. You manage lifetimes. Validation layers are your safety net during development.

Memory Management

Threshold concept. Vulkan memory management permanently changes how you think about GPU resources. In OpenGL, the driver decided where your data lived. In Vulkan, you decide, and that decision affects performance more than almost anything else.

Motivation

A GPU has multiple memory pools with different properties: some are fast for the GPU but invisible to the CPU, some are accessible to both but slower, some are special-purpose. OpenGL hid this complexity behind glBufferData and hoped the driver would make good choices. Sometimes it did. Often it didn’t.

Vulkan exposes this hardware reality directly because the “right” memory choice depends on your workload, and only you know your workload. A mesh that never changes after upload needs different memory than a uniform buffer you update every frame.

Intuition

The warehouse analogy

Think of GPU memory like a warehouse with different storage areas:

  • Device-local memory is the high-speed shelving right next to the assembly line (GPU cores). Fast to access, but the front office (CPU) can’t reach it directly.
  • Host-visible memory is the loading dock, both the warehouse workers (GPU) and the delivery trucks (CPU) can access it, but it’s slower for the assembly line.
  • Host-coherent memory is a special loading dock where changes are immediately visible to both sides, without needing to shout “new stuff here!” (flush/invalidate).
  • Host-cached memory is a loading dock with a clipboard: the CPU reads are fast because they come from a cache, but you need to invalidate before reading to make sure the clipboard is up to date.

The two-step binding model

In Vulkan, creating a Buffer and allocating memory for it are separate operations. This is different from most APIs and is often the first surprise:

1. Create a Buffer         (describes shape and usage, no memory yet)
2. Query memory requirements (driver tells you: size, alignment, compatible types)
3. Allocate DeviceMemory   (reserve a block from a memory pool)
4. Bind memory to buffer   (connect the two)

This separation exists because multiple buffers can share a single memory allocation (sub-allocation), which is far more efficient than allocating individually. Production Vulkan applications almost always use a memory allocator (like VMA) to manage sub-allocation, but understanding the raw API is essential before using one.

Before reading on: why do you think Vulkan separates “create buffer” from “allocate memory”? What advantage does this give you that a single create_buffer_with_memory() call would not?

Memory types and heaps

Every Vulkan device exposes a set of memory heaps (physical pools of VRAM or system RAM) and memory types (combinations of properties that describe how a heap can be used).

┌─────────────────────────────────────────────────────────┐
│ Physical Device Memory Properties                       │
│                                                         │
│  Heaps:                                                 │
│  ┌──────────────────────┐  ┌──────────────────────────┐ │
│  │ Heap 0: 8 GiB        │  │ Heap 1: 16 GiB           │ │
│  │ flags: DEVICE_LOCAL  │  │ flags: (none)            │ │
│  │ (dedicated GPU VRAM) │  │ (system RAM)             │ │
│  └──────────────────────┘  └──────────────────────────┘ │
│                                                         │
│  Memory Types (each points to a heap):                  │
│  ┌─────────────────────────────────────────────┐        │
│  │ Type 0: heap 0, DEVICE_LOCAL                │        │
│  │ Type 1: heap 1, HOST_VISIBLE | HOST_COHERENT│        │
│  │ Type 2: heap 0, DEVICE_LOCAL | HOST_VISIBLE │  ←BAR  │
│  │ Type 3: heap 1, HOST_VISIBLE | HOST_CACHED  │        │
│  └─────────────────────────────────────────────┘        │
│                                                         │
└─────────────────────────────────────────────────────────┘

The number and properties of heaps and types vary between GPUs. A discrete GPU typically has separate heaps for VRAM and system RAM. An integrated GPU often has a single heap that is both device-local and host-visible. Your code must query these at runtime and choose accordingly.

The decision tree

When allocating memory for a resource, follow this logic:

Is this data written by the CPU every frame?
├── Yes → HOST_VISIBLE | HOST_COHERENT
│         (uniform buffers, dynamic vertex data)
│
└── No → Is this data uploaded once and never touched again?
         ├── Yes → DEVICE_LOCAL (use a staging buffer to upload)
         │         (static meshes, textures)
         │
         └── No → Is this data read back by the CPU?
                  ├── Yes → HOST_VISIBLE | HOST_CACHED
                  │         (readback buffers, screenshots)
                  │
                  └── No → DEVICE_LOCAL
                           (render targets, compute output)

Worked example: uploading a mesh to the GPU

This is the most common memory operation in Vulkan: getting vertex data from the CPU into fast GPU memory. It uses the staging buffer pattern.

Step 1: Create the destination buffer

use vulkan_rust::vk;
use vk::*;

// The buffer that will hold the mesh on the GPU.
// TRANSFER_DST means "this buffer can receive data from a copy command."
let buffer_info = BufferCreateInfo::builder()
    .size(vertex_data_size)
    .usage(
        BufferUsageFlags::VERTEX_BUFFER
        | BufferUsageFlags::TRANSFER_DST
    )
    .sharing_mode(SharingMode::EXCLUSIVE);

let gpu_buffer = unsafe { device.create_buffer(&buffer_info, None)? };

Step 2: Query what memory this buffer needs

// The driver tells us: how many bytes, what alignment, and which
// memory types are compatible with this buffer.
let mem_requirements = unsafe {
    device.get_buffer_memory_requirements(gpu_buffer)
};

// mem_requirements.size            → minimum allocation size
// mem_requirements.alignment       → byte alignment requirement
// mem_requirements.memory_type_bits → bitmask of compatible memory types

Step 3: Find the right memory type

use vulkan_rust::vk;
use vk::*;

// Query what memory the hardware offers.
let mem_properties = unsafe {
    instance.get_physical_device_memory_properties(physical_device)
};

// Find a memory type that is:
//   1. Compatible with the buffer (listed in memory_type_bits)
//   2. Device-local (fast GPU access)
let desired = MemoryPropertyFlags::DEVICE_LOCAL;

let memory_type_index = (0..mem_properties.memory_type_count)
    .find(|&i| {
        let type_compatible =
            mem_requirements.memory_type_bits & (1 << i) != 0;
        let properties_match =
            mem_properties.memory_types[i as usize]
                .property_flags & desired == desired;
        type_compatible && properties_match
    })
    .expect("No suitable memory type found");

Before reading on: the code above iterates memory types in order (0, 1, 2, …). The Vulkan spec recommends that drivers list memory types from most preferred to least preferred. Why does picking the first match give you the best performance?

Step 4: Allocate and bind

use vulkan_rust::vk;
use vk::*;

let alloc_info = MemoryAllocateInfo::builder()
    .allocation_size(mem_requirements.size)
    .memory_type_index(memory_type_index);

let gpu_memory = unsafe { device.allocate_memory(&alloc_info, None)? };

// Bind the memory to the buffer. After this, the buffer is backed
// by real memory and can be used.
unsafe { device.bind_buffer_memory(gpu_buffer, gpu_memory, 0)? };

Step 5: Upload via staging buffer

Device-local memory is usually not host-visible, so you can’t write to it directly from the CPU. The solution: create a temporary staging buffer in host-visible memory, write your data there, then copy to the GPU buffer.

use vulkan_rust::vk;
use vk::*;

// Create a temporary staging buffer in host-visible memory.
let staging_info = BufferCreateInfo::builder()
    .size(vertex_data_size)
    .usage(BufferUsageFlags::TRANSFER_SRC)
    .sharing_mode(SharingMode::EXCLUSIVE);

let staging_buffer = unsafe { device.create_buffer(&staging_info, None)? };
let staging_reqs = unsafe {
    device.get_buffer_memory_requirements(staging_buffer)
};

// Find HOST_VISIBLE | HOST_COHERENT memory for the staging buffer.
let staging_desired =
    MemoryPropertyFlags::HOST_VISIBLE
    | MemoryPropertyFlags::HOST_COHERENT;

let staging_type_index = (0..mem_properties.memory_type_count)
    .find(|&i| {
        let type_ok = staging_reqs.memory_type_bits & (1 << i) != 0;
        let props_ok =
            mem_properties.memory_types[i as usize]
                .property_flags & staging_desired == staging_desired;
        type_ok && props_ok
    })
    .expect("No host-visible memory type found");

let staging_alloc = MemoryAllocateInfo::builder()
    .allocation_size(staging_reqs.size)
    .memory_type_index(staging_type_index);

let staging_memory = unsafe {
    device.allocate_memory(&staging_alloc, None)?
};
unsafe { device.bind_buffer_memory(staging_buffer, staging_memory, 0)? };

// Map the staging memory, copy vertex data in, then unmap.
unsafe {
    let data_ptr = device.map_memory(
        staging_memory,
        0,
        vertex_data_size,
        MemoryMapFlags::empty(),
    )?;

    core::ptr::copy_nonoverlapping(
        vertices.as_ptr() as *const u8,
        data_ptr as *mut u8,
        vertex_data_size as usize,
    );

    // Because we chose HOST_COHERENT, we do not need to call
    // flush_mapped_memory_ranges. The write is automatically
    // visible to the GPU.
    device.unmap_memory(staging_memory);
};

// Record a command to copy from staging → gpu buffer.
// (Command buffer recording is covered in the Command Buffers chapter.)
// ... cmd_copy_buffer(staging_buffer, gpu_buffer, &[region]) ...

// After the copy completes on the GPU, clean up the staging buffer.
unsafe {
    device.destroy_buffer(staging_buffer, None);
    device.free_memory(staging_memory, None);
};

Why not skip the staging buffer?

On some GPUs (especially integrated GPUs and GPUs with Resizable BAR), there is a memory type that is both DEVICE_LOCAL and HOST_VISIBLE. In that case, you can map device-local memory directly and skip the staging buffer. But this memory is often limited in size and not available on all hardware. The staging buffer pattern works everywhere.

Formal reference

Memory property flags

| Flag | Meaning |
|---|---|
| DEVICE_LOCAL | Fastest for GPU access. Usually not host-visible on discrete GPUs. |
| HOST_VISIBLE | Can be mapped with map_memory for CPU read/write. |
| HOST_COHERENT | Mapped writes are automatically visible to the GPU (no flush needed). |
| HOST_CACHED | Mapped reads come from the CPU cache (fast reads). Requires invalidate before reading GPU-written data. |
| LAZILY_ALLOCATED | Memory may not be allocated until used. For transient attachments only. |
| PROTECTED | For DRM-protected content. |

The memory type selection algorithm

use vulkan_rust::vk;
use vk::*;

fn find_memory_type(
    mem_properties: &PhysicalDeviceMemoryProperties,
    type_bits: u32,       // from MemoryRequirements.memory_type_bits
    desired: MemoryPropertyFlags,
) -> Option<u32> {
    (0..mem_properties.memory_type_count).find(|&i| {
        let compatible = type_bits & (1 << i) != 0;
        let has_properties =
            mem_properties.memory_types[i as usize].property_flags
            & desired == desired;
        compatible && has_properties
    })
}

This function appears in nearly every Vulkan application. It finds the first memory type that is compatible with the resource and has the properties you need.

Flush and invalidate

If you use memory that is HOST_VISIBLE but not HOST_COHERENT:

  • After writing from the CPU, call flush_mapped_memory_ranges to make your writes visible to the GPU.
  • Before reading on the CPU (after the GPU has written), call invalidate_mapped_memory_ranges to refresh the CPU’s view.

With HOST_COHERENT memory, neither call is needed. Most applications use coherent memory for simplicity.

Key structs

| Struct | Purpose |
|---|---|
| PhysicalDeviceMemoryProperties | Describes all heaps and types on the hardware |
| MemoryType | One entry: property flags + which heap it draws from |
| MemoryHeap | One pool: total size in bytes + heap flags |
| MemoryRequirements | What a buffer/image needs: size, alignment, compatible types |
| MemoryAllocateInfo | Input to allocate_memory: how many bytes, which type |
| MappedMemoryRange | Range for flush/invalidate when not using coherent memory |

Destruction order

1. Ensure GPU is not using the buffer/image (fence or device_wait_idle)
2. Destroy the buffer/image    (device.destroy_buffer / device.destroy_image)
3. Free the memory             (device.free_memory)

Destroy all buffers and images bound to a DeviceMemory before freeing it. There is no explicit unbind operation; a resource whose backing memory has been freed must never be used again.

Key takeaways

  • Vulkan separates buffer/image creation from memory allocation. You create the resource, ask what memory it needs, allocate, then bind.
  • Memory types have different properties (device-local, host-visible, coherent, cached). Choose based on your access pattern.
  • The staging buffer pattern (host-visible temp → device-local permanent) is the standard way to upload data on discrete GPUs.
  • Query memory properties at runtime. Never assume a specific memory layout; it varies between GPUs.
  • In production, use a sub-allocator (like VMA). Allocating per-buffer is correct but slow.

Command Buffers

Motivation

In OpenGL, calling glDrawArrays immediately sends work to the GPU (or at least, the driver pretends it does). In Vulkan, you record commands into a buffer, then submit that buffer to a queue. The GPU processes the queue asynchronously while your CPU moves on.

This separation exists for three reasons:

  1. Batching. One submission of many commands is cheaper than many individual calls. Each submission has overhead (kernel transitions, driver bookkeeping), so bundling hundreds of draw calls into a single command buffer and submitting once is dramatically faster.
  2. Reuse. You can record a command buffer once and submit it many times. If a scene doesn’t change, why re-record every frame?
  3. Multi-threading. Different CPU threads can record into different command buffers simultaneously, then submit them all on one thread. This is how modern engines scale across CPU cores.

Intuition

The shopping list analogy

A command buffer is a shopping list. You write down everything you need (“bind this pipeline”, “draw 36 vertices”, “copy this image”), then hand the list to someone else (a GPU queue) who goes and does it all. You don’t stand in the store waiting for each item, you hand off the list and do other work.

The lifecycle looks like this:

┌────────────┐     ┌────────────┐     ┌────────────┐
│   Record   │────>│   Submit   │────>│  Execute   │
│  (CPU)     │     │  (CPU→GPU) │     │  (GPU)     │
│            │     │            │     │            │
│ "bind X"   │     │ hand off   │     │ GPU reads  │
│ "draw Y"   │     │ to queue   │     │ the list   │
│ "copy Z"   │     │            │     │ and acts   │
└────────────┘     └────────────┘     └────────────┘

The CPU is free to do other work (including recording the next frame’s command buffer) while the GPU executes.

Command pools: why they exist

Allocating command buffers one at a time would be like allocating individual bytes from the OS. It’s correct, but the overhead per allocation is huge. Command pools solve this by pre-allocating a chunk of memory, then handing out command buffers from that pool cheaply.

┌──────────── Command Pool ────────────┐
│                                      │
│  ┌──────────┐  ┌──────────┐          │
│  │ CmdBuf 0 │  │ CmdBuf 1 │  ...     │
│  └──────────┘  └──────────┘          │
│                                      │
│  (all allocated from one pool)       │
│  (pool is tied to one queue family)  │
└──────────────────────────────────────┘

Each pool is tied to a single queue family. This lets the driver optimize the memory layout for that queue type.

Before reading on: if command pools are tied to a single queue family, and you want to record commands for both a graphics queue and a transfer queue, how many pools do you need?

Primary vs secondary command buffers

Primary command buffers are what you submit to queues. They can contain any command.

Secondary command buffers cannot be submitted directly. Instead, they are executed from within a primary command buffer using cmd_execute_commands. Think of them as subroutines: you record reusable chunks of work (like “render the UI”) into secondary buffers, then call them from your primary buffer.

Primary command buffer:
  begin render pass
  bind pipeline A
  draw meshes
  execute_commands(secondary_ui_buffer)   ← calls the secondary
  end render pass

Most applications start with primary buffers only and add secondary buffers when they need multi-threaded recording or reusable sub-passes.

Worked example: record and submit

This example creates a command pool, allocates a command buffer, records a simple buffer copy, and submits it.

Step 1: Create a command pool

use vulkan_rust::vk;
use vk::*;

// Create a pool for the graphics queue family.
// RESET_COMMAND_BUFFER lets us reset individual command buffers
// instead of resetting the entire pool.
let pool_info = CommandPoolCreateInfo::builder()
    .flags(CommandPoolCreateFlags::RESET_COMMAND_BUFFER)
    .queue_family_index(graphics_queue_family);

let command_pool = unsafe {
    device.create_command_pool(&pool_info, None)?
};

Step 2: Allocate a command buffer

use vulkan_rust::vk;
use vk::*;

// Allocate one primary command buffer from the pool.
let alloc_info = CommandBufferAllocateInfo::builder()
    .command_pool(command_pool)
    .level(CommandBufferLevel::PRIMARY)
    .command_buffer_count(1);

// allocate_command_buffers returns a Vec of handles.
let command_buffer = unsafe {
    device.allocate_command_buffers(&alloc_info)?
}[0];

Step 3: Record commands

use vulkan_rust::vk;
use vk::*;

// Begin recording. ONE_TIME_SUBMIT tells the driver this buffer
// will be submitted once and then reset or freed, enabling
// driver-side optimizations.
let begin_info = CommandBufferBeginInfo::builder()
    .flags(CommandBufferUsageFlags::ONE_TIME_SUBMIT);

unsafe {
    device.begin_command_buffer(command_buffer, &begin_info)?;
};

// Record a buffer copy command.
// This does NOT execute the copy. It records the instruction
// into the command buffer for later execution.
let copy_region = BufferCopy {
    src_offset: 0,
    dst_offset: 0,
    size: 1024,
};

unsafe {
    device.cmd_copy_buffer(
        command_buffer,
        src_buffer,
        dst_buffer,
        &[copy_region],
    );
};

// Finish recording.
unsafe { device.end_command_buffer(command_buffer)? };

Before reading on: between begin_command_buffer and end_command_buffer, the command buffer is in the “recording” state. What do you think happens if you try to submit a command buffer that is still in the recording state?

Step 4: Submit to a queue

use vulkan_rust::vk;
use vk::*;

// Build a submit info. This describes:
//   - which command buffers to execute
//   - which semaphores to wait on before starting
//   - which semaphores to signal when done
let submit_info = SubmitInfo::builder()
    .command_buffers(&[command_buffer]);

// Submit to the graphics queue.
// The Fence (here Fence::null()) will be signaled when the GPU
// finishes all commands in this submission. Passing null means
// "I don't need to know when it's done from the CPU."
unsafe {
    device.queue_submit(
        graphics_queue,
        &[*submit_info],
        Fence::null(),
    )?;
};

// For this example, we wait for the queue to finish before
// continuing. In a real application, you would use a fence
// instead of blocking the CPU.
unsafe { device.queue_wait_idle(graphics_queue)? };

Step 5: Clean up

use vulkan_rust::vk;
use vk::*;

// Option A: Free the command buffer back to the pool.
unsafe {
    device.free_command_buffers(command_pool, &[command_buffer]);
};

// Option B: Reset for reuse (only if pool was created with
// RESET_COMMAND_BUFFER flag).
unsafe {
    device.reset_command_buffer(
        command_buffer,
        CommandBufferResetFlags::empty(),
    )?;
};

// When you're done with the pool entirely:
unsafe { device.destroy_command_pool(command_pool, None) };
// This implicitly frees all command buffers allocated from it.

Command buffer states

A command buffer is always in one of these states:

                  allocate
                     │
                     v
  Initial ──begin──> Recording ──end──> Executable ──submit──> Pending
     ^                                       │                    │
     │                                       │             (GPU finishes;
     └─────────────── reset ─────────────────┘              back to
                                                            Executable)

A command buffer recorded with ONE_TIME_SUBMIT does not return to Executable after execution; it becomes invalid and must be reset before it can be recorded again.

| State | What you can do |
|---|---|
| Initial | Nothing useful. Call begin_command_buffer to start recording. |
| Recording | Record commands (cmd_* methods). Call end_command_buffer when done. |
| Executable | Submit to a queue. Or reset to record again. |
| Pending | The GPU is executing it. Do not touch it. Wait for completion. |

The most common mistake is trying to re-record or reset a command buffer while it is still pending (the GPU hasn’t finished yet). Validation layers will catch this.

Common patterns

One-shot command buffer for transfers

Many operations (uploading textures, transitioning image layouts) need a command buffer just once. The pattern:

use vulkan_rust::vk;
use vk::*;

unsafe fn one_shot_submit(
    device: &Device,
    pool: CommandPool,
    queue: Queue,
    record: impl FnOnce(CommandBuffer),
) -> VkResult<()> {
    // Allocate
    let alloc_info = CommandBufferAllocateInfo::builder()
        .command_pool(pool)
        .level(CommandBufferLevel::PRIMARY)
        .command_buffer_count(1);
    let cmd = unsafe { device.allocate_command_buffers(&alloc_info)? }[0];

    // Record
    let begin = CommandBufferBeginInfo::builder()
        .flags(CommandBufferUsageFlags::ONE_TIME_SUBMIT);
    unsafe { device.begin_command_buffer(cmd, &begin)? };
    record(cmd);
    unsafe { device.end_command_buffer(cmd)? };

    // Submit and wait
    let submit = SubmitInfo::builder()
        .command_buffers(&[cmd]);
    unsafe {
        device.queue_submit(queue, &[*submit], Fence::null())?;
        device.queue_wait_idle(queue)?;
    };

    // Free
    unsafe { device.free_command_buffers(pool, &[cmd]) };
    Ok(())
}

This is the pattern used for staging buffer uploads in the Memory Management chapter.

Per-frame command buffers

For rendering, you typically have one command buffer per frame in flight:

Frame 0: [record on CPU] ──submit──> [execute on GPU]
Frame 1: [record on CPU] ──submit──> [execute on GPU]
          ↑                              ↑
          recording while               executing the
          GPU runs the                  commands we
          previous frame                just submitted

Each frame waits for its fence before re-recording. See Synchronization for how fences and semaphores coordinate this.

Formal reference

Command pool creation flags

| Flag | Meaning |
|---|---|
| TRANSIENT | Hint: command buffers from this pool are short-lived. Lets the driver optimize allocation. |
| RESET_COMMAND_BUFFER | Allows individual command buffers to be reset. Without this, you can only reset the entire pool. |
| PROTECTED | Command buffers allocated from this pool can operate on protected resources. |

Command buffer begin flags

| Flag | Meaning |
|---|---|
| ONE_TIME_SUBMIT | This buffer will be submitted once, then reset or freed. Enables driver optimizations. |
| RENDER_PASS_CONTINUE | Secondary command buffer: this buffer will execute entirely inside a render pass. |
| SIMULTANEOUS_USE | This buffer can be submitted to multiple queues or resubmitted while still pending. |

Recording methods on Device

All recording methods follow the pattern device.cmd_*(command_buffer, ...). The device dispatches to the correct function pointer, the command_buffer identifies which buffer to record into. Examples:

Method                                                      Purpose
cmd_bind_pipeline(cb, bind_point, pipeline)                 Set the active pipeline
cmd_draw(cb, vertices, instances, first_vert, first_inst)   Draw without an index buffer
cmd_copy_buffer(cb, src, dst, &[regions])                   Copy between buffers
cmd_begin_render_pass(cb, &begin_info, contents)            Start a render pass
cmd_end_render_pass(cb)                                     End the current render pass

The full list has ~150 cmd_* methods covering every Vulkan command.

Destruction rules

  1. Wait for the GPU before freeing. A command buffer in the Pending state must not be freed or reset. Use a fence or device_wait_idle.
  2. Destroying a pool frees all its buffers. You do not need to free command buffers individually before destroying their pool.
  3. Pools are not thread-safe. If two threads record command buffers from the same pool, you must synchronize externally. The typical solution: one pool per thread.
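The first two rules in code form, a sketch assuming the device and pool handles from the earlier examples:

use vulkan_rust::vk;
use vk::*;

unsafe {
    // Rule 1: nothing from this pool may still be Pending.
    // (A per-submission fence is more precise; device_wait_idle is
    // the blunt instrument, fine at shutdown.)
    device.device_wait_idle()?;

    // Rule 2: destroying the pool frees every command buffer
    // allocated from it; no individual free_command_buffers needed.
    device.destroy_command_pool(pool, None);
};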

SubmitInfo structure

SubmitInfo connects command buffers to synchronization primitives:

SubmitInfo {
    wait_semaphores    + wait_dst_stage_mask   ← "wait for these before starting"
    command_buffers                             ← "execute these"
    signal_semaphores                           ← "signal these when done"
}

The wait_dst_stage_mask specifies which pipeline stages must wait, not the entire submission. This enables the GPU to start early stages while still waiting for a semaphore on a later stage.

Key takeaways

  • Commands are recorded, not executed. Recording is cheap CPU work; execution happens asynchronously on the GPU.
  • Command pools amortize allocation cost. One pool per queue family, typically one pool per thread.
  • Command buffers have states: Initial → Recording → Executable → Pending. Never touch a Pending buffer.
  • Use ONE_TIME_SUBMIT for throw-away work (uploads, transitions). Use per-frame buffers with fences for rendering.
  • The SubmitInfo struct is where command buffers meet synchronization. That connection is the topic of the next chapter.

Synchronization

Threshold concept. Synchronization is the single most confusing aspect of Vulkan for newcomers. Once you understand it, you understand Vulkan’s execution model. If this chapter takes you three reads, that is completely normal.

Motivation

The GPU does not execute your commands in the order you recorded them. Not between queues, not between submissions, and not even between draw calls within the same command buffer. The GPU pipelines work: while one draw call is running its fragment shader, the next draw call might already be running its vertex shader.

Vulkan gives you zero implicit ordering guarantees.

This sounds terrifying, but it is also why Vulkan is fast. The GPU can overlap operations, reorder for efficiency, and keep all its hardware units busy. The price is that you must tell the driver when ordering matters, because only you know which operations depend on each other.

Intuition: the factory

Imagine a factory with multiple assembly lines (queue families). Each line has workers at different stations (pipeline stages) who process items one after another.

Without synchronization, the factory runs at full speed: items flow through stations as fast as possible, and different lines operate independently. This is great, until you have a dependency: “station B needs the output from station A before it can start.”

Vulkan gives you four tools to express these dependencies:

┌─────────┬────────────────────────────┬──────────────────────────┐
│ Tool    │ What it synchronizes       │ Analogy                  │
├─────────┼────────────────────────────┼──────────────────────────┤
│ Fence   │ GPU → CPU                  │ A sign on the factory    │
│         │ "is the GPU done yet?"     │ door: "batch complete"   │
├─────────┼────────────────────────────┼──────────────────────────┤
│ Sema-   │ Queue → Queue              │ A conveyor belt between  │
│ phore   │ "queue B waits for queue A"│ two assembly lines       │
├─────────┼────────────────────────────┼──────────────────────────┤
│ Barrier │ Command → Command          │ A supervisor on one      │
│         │ (within a command buffer)  │ line: "wait for station  │
│         │                            │ A before station B"      │
├─────────┼────────────────────────────┼──────────────────────────┤
│ Event   │ Split barrier              │ A sticky note: "I'll     │
│         │ (signal now, wait later)   │ leave this here, check   │
│         │                            │ for it later"            │
└─────────┴────────────────────────────┴──────────────────────────┘

Each tool solves a different problem. Using the wrong tool for the job is a common source of bugs.

Before reading on: you submit two command buffers to the same queue, one after the other. Does the second one wait for the first to finish before it starts executing?

Answer: No. Submissions to the same queue begin in order, but their execution can overlap. The second submission might start its vertex shader while the first is still running its fragment shader. If you need the first to fully complete before the second starts, you need explicit synchronization.

Worked example 1: CPU waits for GPU (Fence)

Problem: You submitted a command buffer. You need to know when the GPU is done so you can read back the results, or so you can safely re-record the command buffer for the next frame.

Solution: A fence. You pass it to queue_submit, and the GPU signals it when all commands in that submission finish.

use vulkan_rust::vk;
use vk::*;

// ── Create a fence ──────────────────────────────────────────────
//
// SIGNALED means the fence starts in the signaled state.
// This matters for the first frame: wait_for_fences on an
// unsignaled fence with no prior submission would block forever.
let fence_info = FenceCreateInfo::builder()
    .flags(FenceCreateFlags::SIGNALED);

let fence = unsafe { device.create_fence(&fence_info, None)? };
use vulkan_rust::vk;
use vk::*;

// ── The render loop ─────────────────────────────────────────────

// Step 1: Wait for the previous frame's GPU work to finish.
//   timeout = u64::MAX means "wait forever"
//   wait_all = true means "wait for ALL fences in the slice"
unsafe {
    device.wait_for_fences(&[fence], true, u64::MAX)?;
};

// Step 2: Reset the fence so it can be signaled again.
//   A fence can only be signaled once. You must reset it
//   before reusing.
unsafe { device.reset_fences(&[fence])? };

// Step 3: Record and submit a command buffer.
// ...record commands...
let submit = SubmitInfo::builder()
    .command_buffers(&[command_buffer]);

// Pass the fence to queue_submit. The GPU will signal it
// when this submission completes.
unsafe {
    device.queue_submit(queue, &[*submit], fence)?;
};

// The CPU continues immediately. The GPU works in parallel.
// Next iteration, wait_for_fences will block until the GPU
// signals this fence.

The fence lifecycle:

    create (SIGNALED)
         │
         v
    ┌─> wait ──> reset ──> submit (with fence) ──> GPU signals ─┐
    │                                                           │
    └───────────────────────────────────────────────────────────┘

When to use fences

  • Waiting for a frame to finish before re-recording its command buffer
  • Waiting for a transfer to complete before reading the result on the CPU
  • Throttling the CPU so it doesn’t race too far ahead of the GPU
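The second bullet in code, a sketch that assumes cmd already records a copy into a host-visible, HOST_COHERENT staging buffer bound to staging_memory (names illustrative):

use vulkan_rust::vk;
use vk::*;

// Submit the transfer with a fence attached.
let submit = SubmitInfo::builder().command_buffers(&[cmd]);
unsafe { device.queue_submit(queue, &[*submit], fence)? };

// Block until the GPU signals the fence; only then is the copy
// complete and the staging memory safe to read.
unsafe {
    device.wait_for_fences(&[fence], true, u64::MAX)?;
    let ptr = device.map_memory(staging_memory, 0, WHOLE_SIZE, MemoryMapFlags::empty())?;
    // ...read the results through ptr...
    // (non-coherent memory would also need an invalidate call here)
    device.unmap_memory(staging_memory);
};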

Worked example 2: Queue-to-queue sync (Semaphore)

Problem: The swapchain gives you an image to render into, but the image might not be ready yet (the display might still be reading it). After rendering, you need to present the image, but only after the render commands finish.

Solution: Two semaphores. One says “the image is ready to render into.” The other says “rendering is done, safe to present.”

use vulkan_rust::vk;
use vk::*;

// Create two semaphores (no flags needed).
let sem_info = SemaphoreCreateInfo::builder();
let image_available = unsafe { device.create_semaphore(&sem_info, None)? };
let render_finished = unsafe { device.create_semaphore(&sem_info, None)? };
use vulkan_rust::vk;
use vk::*;

// ── Acquire a swapchain image ───────────────────────────────────
//
// This signals image_available when the image is ready.
let image_index = unsafe {
    device.acquire_next_image_khr(
        swapchain,
        u64::MAX,          // timeout
        image_available,   // semaphore to signal
        Fence::null(), // no fence needed here
    )?
};

// ── Submit rendering commands ───────────────────────────────────
//
// Wait on image_available (at the COLOR_ATTACHMENT_OUTPUT stage,
// because that's when we actually write to the image).
// Signal render_finished when done.
let wait_stages = [PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];

let submit = SubmitInfo::builder()
    .wait_semaphores(&[image_available])
    .wait_dst_stage_mask(&wait_stages)
    .command_buffers(&[command_buffer])
    .signal_semaphores(&[render_finished]);

unsafe {
    device.queue_submit(queue, &[*submit], frame_fence)?;
};

// ── Present the image ───────────────────────────────────────────
//
// Wait on render_finished before the display reads the image.
let present_info = PresentInfoKHR::builder()
    .wait_semaphores(&[render_finished])
    .swapchains(&[swapchain])
    .image_indices(&[image_index]);

unsafe { device.queue_present_khr(queue, &present_info)? };

The semaphore flow:

acquire_next_image ──signals──> image_available
                                      │
                                      │ (GPU waits at COLOR_ATTACHMENT_OUTPUT)
                                      v
                               queue_submit ──signals──> render_finished
                                                               │
                                                               │ (presentation waits)
                                                               v
                                                        queue_present

Semaphores vs fences

                    Fence                           Semaphore
Who waits?          The CPU (via wait_for_fences)   The GPU (another queue operation)
Who signals?        The GPU (via queue_submit)      The GPU (via queue_submit)
Use case            CPU needs to know when the      One GPU operation depends
                    GPU is done                     on another
Can you query it?   Yes (get_fence_status)          No (GPU-only)

Before reading on: why does the submit wait at COLOR_ATTACHMENT_OUTPUT specifically, instead of waiting at TOP_OF_PIPE (the very beginning)? What work can the GPU do before it needs the swapchain image?

Answer: The vertex shader, tessellation, and geometry stages do not write to the swapchain image. They can run while the image is still being read by the display. Only the color attachment output stage needs the image, so we delay waiting until that point. This lets the GPU overlap more work.

Worked example 3: Image layout transition (Pipeline Barrier)

Problem: You want to copy data into an image, then sample it in a fragment shader. The image must be in TRANSFER_DST_OPTIMAL layout for the copy, then transitioned to SHADER_READ_ONLY_OPTIMAL for sampling. The GPU must finish the copy before the transition, and finish the transition before the shader reads.

Solution: A pipeline barrier with an image memory barrier.

use vulkan_rust::vk;
use vk::*;

// Transition image from TRANSFER_DST to SHADER_READ_ONLY.
//
// This barrier says:
//   "All TRANSFER_WRITE operations in the TRANSFER stage must
//    complete before any SHADER_READ operations in the
//    FRAGMENT_SHADER stage can begin. Also, change the image
//    layout."
let barrier = ImageMemoryBarrier::builder()
    .src_access_mask(AccessFlags::TRANSFER_WRITE)
    .dst_access_mask(AccessFlags::SHADER_READ)
    .old_layout(ImageLayout::TRANSFER_DST_OPTIMAL)
    .new_layout(ImageLayout::SHADER_READ_ONLY_OPTIMAL)
    .src_queue_family_index(QUEUE_FAMILY_IGNORED)
    .dst_queue_family_index(QUEUE_FAMILY_IGNORED)
    .image(texture_image)
    .subresource_range(ImageSubresourceRange {
        aspect_mask: ImageAspectFlags::COLOR,
        base_mip_level: 0,
        level_count: 1,
        base_array_layer: 0,
        layer_count: 1,
    });

unsafe {
    device.cmd_pipeline_barrier(
        command_buffer,
        PipelineStageFlags::TRANSFER,          // src stage
        PipelineStageFlags::FRAGMENT_SHADER,    // dst stage
        DependencyFlags::empty(),
        &[],           // no memory barriers
        &[],           // no buffer memory barriers
        &[*barrier],   // one image memory barrier
    );
};

Understanding the three parts of a barrier

A pipeline barrier has three components that work together:

 1. Stage mask:    WHEN must things happen?
                   "Transfer stage must finish before fragment shader starts"

 2. Access mask:   WHAT memory operations are involved?
                   "Writes from transfers must be visible to shader reads"

 3. Layout:        HOW should the image be reorganized?
                   "Convert from transfer-optimal to shader-read-optimal tiling"

All three are needed. The stage mask creates an execution dependency (ordering of operations). The access mask creates a memory dependency (visibility of writes). The layout transition physically reorganizes how the image data is stored in memory.

A common mistake is setting the stage masks correctly but forgetting the access masks, or vice versa. Both are required for correctness.

Common barrier recipes

Transfer → Shader read
    layouts:  TRANSFER_DST → SHADER_READ_ONLY
    stages:   TRANSFER → FRAGMENT_SHADER
    access:   TRANSFER_WRITE → SHADER_READ

Undefined → Transfer dst
    layouts:  UNDEFINED → TRANSFER_DST
    stages:   TOP_OF_PIPE → TRANSFER
    access:   NONE → TRANSFER_WRITE

Undefined → Color attachment
    layouts:  UNDEFINED → COLOR_ATTACHMENT
    stages:   TOP_OF_PIPE → COLOR_ATTACHMENT_OUTPUT
    access:   NONE → COLOR_ATTACHMENT_WRITE

Color attachment → Present
    layouts:  COLOR_ATTACHMENT → PRESENT_SRC
    stages:   COLOR_ATTACHMENT_OUTPUT → BOTTOM_OF_PIPE
    access:   COLOR_ATTACHMENT_WRITE → NONE

Color attachment → Shader read
    layouts:  COLOR_ATTACHMENT → SHADER_READ_ONLY
    stages:   COLOR_ATTACHMENT_OUTPUT → FRAGMENT_SHADER
    access:   COLOR_ATTACHMENT_WRITE → SHADER_READ

Keep this table handy. Most applications only need these transitions.
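The recipes fit naturally into a small helper so you don't retype the barrier boilerplate for each row. The function and its signature are illustrative, not part of vulkan_rust:

use vulkan_rust::vk;
use vk::*;

// Each tuple argument corresponds to one pair of columns in the
// recipe table: (old, new) layouts, (src, dst) stages, (src, dst) access.
fn transition_image(
    device: &Device,
    cmd: CommandBuffer,
    image: Image,
    (old_layout, new_layout): (ImageLayout, ImageLayout),
    (src_stage, dst_stage): (PipelineStageFlags, PipelineStageFlags),
    (src_access, dst_access): (AccessFlags, AccessFlags),
) {
    let barrier = ImageMemoryBarrier::builder()
        .src_access_mask(src_access)
        .dst_access_mask(dst_access)
        .old_layout(old_layout)
        .new_layout(new_layout)
        .src_queue_family_index(QUEUE_FAMILY_IGNORED)
        .dst_queue_family_index(QUEUE_FAMILY_IGNORED)
        .image(image)
        .subresource_range(ImageSubresourceRange {
            aspect_mask: ImageAspectFlags::COLOR,
            base_mip_level: 0,
            level_count: 1,
            base_array_layer: 0,
            layer_count: 1,
        });
    unsafe {
        device.cmd_pipeline_barrier(
            cmd, src_stage, dst_stage, DependencyFlags::empty(),
            &[], &[], &[*barrier],
        );
    }
}

// Applying the "Undefined → Transfer dst" recipe:
transition_image(
    &device, cmd, image,
    (ImageLayout::UNDEFINED, ImageLayout::TRANSFER_DST_OPTIMAL),
    (PipelineStageFlags::TOP_OF_PIPE, PipelineStageFlags::TRANSFER),
    (AccessFlags::NONE, AccessFlags::TRANSFER_WRITE),
);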

Pipeline stages: the execution order

To understand stage masks, you need to know the order the GPU processes work. Here is the graphics pipeline, simplified:

TOP_OF_PIPE                    (pseudo-stage: "before anything")
    │
    v
DRAW_INDIRECT                  (read indirect draw parameters)
    │
    v
VERTEX_INPUT                   (read vertex/index buffers)
    │
    v
VERTEX_SHADER                  (run vertex shader)
    │
    v
EARLY_FRAGMENT_TESTS           (depth/stencil test before fragment shader)
    │
    v
FRAGMENT_SHADER                (run fragment shader)
    │
    v
LATE_FRAGMENT_TESTS            (depth/stencil test after fragment shader)
    │
    v
COLOR_ATTACHMENT_OUTPUT        (write to color attachments)
    │
    v
BOTTOM_OF_PIPE                 (pseudo-stage: "after everything")


Special stages (not in the pipeline order):
  TRANSFER                     (copy/blit/clear operations)
  COMPUTE_SHADER               (compute dispatch)
  HOST                         (CPU reads/writes to mapped memory)
  ALL_GRAPHICS                 (shorthand for all graphics stages)
  ALL_COMMANDS                 (shorthand for everything)

When you set src_stage = TRANSFER and dst_stage = FRAGMENT_SHADER, you are saying: “everything in the TRANSFER stage that came before this barrier must finish before anything in the FRAGMENT_SHADER stage that comes after this barrier can start.”

Events: split barriers

Events are an advanced optimization. A pipeline barrier creates a dependency at a single point in the command buffer. An event lets you split the barrier: signal it at one point, wait for it at a later point. This gives the GPU more room to reorder work between the signal and the wait.

use vulkan_rust::vk;
use vk::*;

// Signal the event after the transfer completes.
unsafe {
    device.cmd_set_event(
        command_buffer,
        event,
        PipelineStageFlags::TRANSFER,
    );
};

// ... other commands that don't depend on the transfer ...

// Wait for the event before the fragment shader reads.
unsafe {
    device.cmd_wait_events(
        command_buffer,
        &[event],
        PipelineStageFlags::TRANSFER,
        PipelineStageFlags::FRAGMENT_SHADER,
        &[], &[], &[*image_barrier],
    );
};

Most applications do not need events. Use pipeline barriers until profiling shows you need the extra overlap.

Formal reference

Synchronization primitives

Fence (GPU → CPU)
    signal: queue_submit          wait: wait_for_fences
    create: create_fence          destroy: destroy_fence
Semaphore (Queue → Queue)
    signal: queue_submit (signal side)    wait: queue_submit (wait side)
    create: create_semaphore              destroy: destroy_semaphore
Pipeline Barrier (within a command buffer)
    no signal/wait or create/destroy calls; recorded inline
    with cmd_pipeline_barrier
Event (split barrier)
    signal: cmd_set_event         wait: cmd_wait_events
    create: create_event          destroy: destroy_event

FenceCreateFlags

Flag       Meaning
SIGNALED   Fence starts in the signaled state. Use this for the first frame
           so wait_for_fences doesn’t block forever.

PipelineStageFlags (most used)

Flag                      When it runs
TOP_OF_PIPE               Before any work. Used as src when there’s nothing
                          to wait for.
VERTEX_INPUT              Reading vertex/index buffers.
VERTEX_SHADER             Running the vertex shader.
FRAGMENT_SHADER           Running the fragment shader.
COLOR_ATTACHMENT_OUTPUT   Writing to color attachments.
TRANSFER                  Copy, blit, and clear operations.
COMPUTE_SHADER            Running compute shaders.
BOTTOM_OF_PIPE            After all work. Used as dst when nothing needs
                          to wait.
ALL_COMMANDS              Shorthand for every stage. Correct but may be
                          slower than a precise mask.

AccessFlags (most used)

Flag                     What it protects
VERTEX_ATTRIBUTE_READ    Vertex shader reads from vertex buffers.
UNIFORM_READ             Shader reads from uniform buffers.
SHADER_READ              Shader reads (sampled images, storage buffers).
SHADER_WRITE             Shader writes (storage images, storage buffers).
COLOR_ATTACHMENT_READ    Reading color attachments (e.g., blending).
COLOR_ATTACHMENT_WRITE   Writing to color attachments.
TRANSFER_READ            Source of a copy/blit.
TRANSFER_WRITE           Destination of a copy/blit.
HOST_READ                CPU reads from mapped memory.
HOST_WRITE               CPU writes to mapped memory.

ImageLayout values

Layout                             Optimized for
UNDEFINED                          Nothing. Contents are discarded. Use as
                                   old_layout when you don’t care about
                                   existing data.
GENERAL                            Anything, but not optimal for anything.
                                   Last resort.
COLOR_ATTACHMENT_OPTIMAL           Writing as a color attachment (rendering).
DEPTH_STENCIL_ATTACHMENT_OPTIMAL   Writing as a depth/stencil attachment.
SHADER_READ_ONLY_OPTIMAL           Sampling in a shader.
TRANSFER_SRC_OPTIMAL               Source of a copy/blit.
TRANSFER_DST_OPTIMAL               Destination of a copy/blit.
PRESENT_SRC                        Presentation to the display (swapchain).

The happens-before relationship

Vulkan defines ordering through execution dependencies and memory dependencies:

  • An execution dependency guarantees that operations in the first synchronization scope (src) complete before operations in the second synchronization scope (dst) begin.
  • A memory dependency guarantees that writes in the first access scope are visible to reads in the second access scope.

Both are needed. Without the execution dependency, operations might overlap. Without the memory dependency, caches might serve stale data even after the operation completes.
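Both halves map directly onto the API: a global MemoryBarrier carries the access scopes (the memory dependency), while the stage arguments to cmd_pipeline_barrier carry the synchronization scopes (the execution dependency). A sketch for the common case of one compute dispatch writing a storage buffer that the next dispatch reads:

use vulkan_rust::vk;
use vk::*;

// Memory dependency: make SHADER_WRITE results visible to SHADER_READ.
let barrier = MemoryBarrier::builder()
    .src_access_mask(AccessFlags::SHADER_WRITE)
    .dst_access_mask(AccessFlags::SHADER_READ);

unsafe {
    device.cmd_pipeline_barrier(
        command_buffer,
        PipelineStageFlags::COMPUTE_SHADER,   // execution dependency:
        PipelineStageFlags::COMPUTE_SHADER,   // earlier dispatches finish first
        DependencyFlags::empty(),
        &[*barrier],   // one global memory barrier
        &[], &[],      // no buffer/image barriers needed here
    );
};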

Common mistakes

  1. Forgetting access masks. Stage masks alone create execution dependencies, but GPU caches can still serve stale data. You need access masks for memory visibility.

  2. Using ALL_COMMANDS / ALL_GRAPHICS everywhere. Correct, but overly broad. The GPU can’t overlap anything across a full-pipeline barrier. Use precise stages for better performance.

  3. Reusing a fence without resetting it. A signaled fence stays signaled forever. wait_for_fences returns immediately on an already-signaled fence. Always reset_fences before resubmitting.

  4. Submitting while a command buffer is still pending. If the GPU hasn’t finished with a command buffer, you cannot re-record it. Wait for its fence first.

  5. Missing the initial fence SIGNALED flag. On the first frame, there is no prior submission to signal the fence. Creating with SIGNALED avoids an infinite wait.

Key takeaways

  • The GPU does not execute commands in order. You must add explicit synchronization where ordering matters.
  • Fences let the CPU wait for the GPU. Semaphores let one GPU operation wait for another. Barriers order commands within a command buffer.
  • Barriers have three parts: when (stage masks), what (access masks), and how (layout transitions). All three are needed.
  • Start with the common barrier recipes table. Most applications only need a handful of transitions.
  • When in doubt, use broader stages (ALL_COMMANDS) to get correct behavior first, then narrow down for performance later.

Render Passes & Framebuffers

Motivation

A render pass tells Vulkan the structure of your rendering: what attachments you use (color, depth), how they are loaded and stored, and how subpasses depend on each other. This information lets the driver make hardware-specific optimizations, especially on tile-based GPUs (mobile) where the render pass boundaries determine what fits in on-chip memory.

If you skip this concept and just try to render, the validation layers will immediately tell you: “you need a render pass.” Understanding why it exists will save you from cargo-culting boilerplate you don’t understand.

Intuition

Blueprint and canvas

A render pass is a blueprint for a painting session. It describes:

  • What surfaces you’ll paint on (attachments: color, depth, stencil)
  • How each surface is prepared before painting (load ops)
  • What happens to each surface after painting (store ops)
  • If there are multiple phases (subpasses) and how they depend on each other

A framebuffer is the specific canvas, the actual images that match the blueprint’s description.

Render Pass (blueprint)              Framebuffer (canvas)
┌───────────────────────┐            ┌────────────────────────┐
│ Attachment 0:         │            │ Attachment 0:          │
│   format: B8G8R8A8    │───matches──│   swapchain_image_view │
│   load: CLEAR         │            │                        │
│   store: STORE        │            │ Attachment 1:          │
│   layout: → PRESENT   │───matches──│   depth_image_view     │
│                       │            │                        │
│ Attachment 1:         │            │ width: 1920            │
│   format: D32_SFLOAT  │            │ height: 1080           │
│   load: CLEAR         │            │ layers: 1              │
│   store: DONT_CARE    │            └────────────────────────┘
│   layout: → DEPTH_OPT │
│                       │
│ Subpass 0:            │
│   color: [0]          │
│   depth: [1]          │
└───────────────────────┘

You create the render pass once. You create a framebuffer for each set of images you render to (typically one per swapchain image).

Load and store ops: why they matter

When a render pass begins, the driver needs to know what to do with each attachment’s existing contents:

Load Op     Meaning                          When to use
CLEAR       Fill with a clear value          Start of frame, you want a
                                             clean slate
LOAD        Preserve the existing contents   Continuing previous rendering
DONT_CARE   Contents are undefined           You will overwrite every pixel
                                             anyway

When the render pass ends:

Store Op    Meaning                    When to use
STORE       Write results to memory    You need the results (color for
                                       present, etc.)
DONT_CARE   Results may be discarded   Transient data (depth buffer you
                                       won’t read later)

Before reading on: on a tile-based mobile GPU, rendering happens in small tiles stored in fast on-chip memory. The load op controls whether tile data is loaded from main memory, and the store op controls whether it is written back. Why would DONT_CARE be significantly faster than LOAD on such hardware?

Answer: DONT_CARE lets the driver skip the expensive memory transfer entirely. On a mobile GPU, loading a full-screen depth buffer from main memory into tile memory can take milliseconds. If you are clearing it anyway, CLEAR tells the driver to fill tiles on-chip without touching main memory. DONT_CARE is even cheaper: it does nothing at all.

Worked example: a single-subpass render pass

This is the most common setup: one color attachment (the swapchain image) and one depth attachment.

Step 1: Describe the attachments

use vulkan_rust::vk;
use vk::*;

// Color attachment: the swapchain image we render into.
let color_attachment = AttachmentDescription {
    flags: AttachmentDescriptionFlags::empty(),
    format: swapchain_format,           // e.g. B8G8R8A8_SRGB
    samples: SampleCountFlagBits::_1,
    load_op: AttachmentLoadOp::CLEAR,       // clear at start
    store_op: AttachmentStoreOp::STORE,      // keep the result
    stencil_load_op: AttachmentLoadOp::DONT_CARE,
    stencil_store_op: AttachmentStoreOp::DONT_CARE,
    initial_layout: ImageLayout::UNDEFINED,  // we don't care about previous contents
    final_layout: ImageLayout::PRESENT_SRC,  // ready for presentation after the pass
};

// Depth attachment: used for depth testing, discarded after.
let depth_attachment = AttachmentDescription {
    flags: AttachmentDescriptionFlags::empty(),
    format: Format::D32_SFLOAT,
    samples: SampleCountFlagBits::_1,
    load_op: AttachmentLoadOp::CLEAR,
    store_op: AttachmentStoreOp::DONT_CARE,  // we won't read it later
    stencil_load_op: AttachmentLoadOp::DONT_CARE,
    stencil_store_op: AttachmentStoreOp::DONT_CARE,
    initial_layout: ImageLayout::UNDEFINED,
    final_layout: ImageLayout::DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
};

Step 2: Define the subpass

use vulkan_rust::vk;
use vk::*;

// Subpass 0 uses attachment 0 as color output and attachment 1 as depth.
let color_ref = AttachmentReference {
    attachment: 0,    // index into the attachments array
    layout: ImageLayout::COLOR_ATTACHMENT_OPTIMAL,
};
let depth_ref = AttachmentReference {
    attachment: 1,
    layout: ImageLayout::DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
};

let subpass = SubpassDescription {
    flags: SubpassDescriptionFlags::empty(),
    pipeline_bind_point: PipelineBindPoint::GRAPHICS,
    input_attachment_count: 0,
    p_input_attachments: core::ptr::null(),
    color_attachment_count: 1,
    p_color_attachments: &color_ref,
    p_resolve_attachments: core::ptr::null(),
    p_depth_stencil_attachment: &depth_ref,
    preserve_attachment_count: 0,
    p_preserve_attachments: core::ptr::null(),
};

Step 3: Add a subpass dependency

use vulkan_rust::vk;
use vk::*;

// This dependency delays the implicit layout transition
// (initial_layout UNDEFINED → COLOR_ATTACHMENT_OPTIMAL at render pass
// begin) until the attachment is actually available for writing,
// instead of letting it happen before the image is ready.
let dependency = SubpassDependency {
    src_subpass: SUBPASS_EXTERNAL,  // operations before the render pass
    dst_subpass: 0,                      // our subpass
    src_stage_mask: PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT
        | PipelineStageFlags::EARLY_FRAGMENT_TESTS,
    dst_stage_mask: PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT
        | PipelineStageFlags::EARLY_FRAGMENT_TESTS,
    src_access_mask: AccessFlags::NONE,
    dst_access_mask: AccessFlags::COLOR_ATTACHMENT_WRITE
        | AccessFlags::DEPTH_STENCIL_ATTACHMENT_WRITE,
    dependency_flags: DependencyFlags::empty(),
};

Step 4: Create the render pass

use vulkan_rust::vk;
use vk::*;

let attachments = [color_attachment, depth_attachment];

let render_pass_info = RenderPassCreateInfo::builder()
    .attachments(&attachments)
    .subpasses(&[subpass])
    .dependencies(&[dependency]);

let render_pass = unsafe {
    device.create_render_pass(&render_pass_info, None)?
};

Step 5: Create framebuffers (one per swapchain image)

use vulkan_rust::vk;
use vk::*;

let framebuffers: Vec<Framebuffer> = swapchain_image_views
    .iter()
    .map(|&view| {
        // Each framebuffer uses a different swapchain image view
        // but the same depth image view (shared across frames).
        let attachments = [view, depth_image_view];

        let fb_info = FramebufferCreateInfo::builder()
            .render_pass(render_pass)     // must be compatible
            .attachments(&attachments)
            .width(swapchain_extent.width)
            .height(swapchain_extent.height)
            .layers(1);

        unsafe { device.create_framebuffer(&fb_info, None).unwrap() }
    })
    .collect();

Step 6: Use in command recording

use vulkan_rust::vk;
use vk::*;

let clear_values = [
    ClearValue {
        color: ClearColorValue {
            float32: [0.0, 0.0, 0.0, 1.0],  // black
        },
    },
    ClearValue {
        depth_stencil: ClearDepthStencilValue {
            depth: 1.0,
            stencil: 0,
        },
    },
];

let begin_info = RenderPassBeginInfo::builder()
    .render_pass(render_pass)
    .framebuffer(framebuffers[image_index as usize])
    .render_area(Rect2D {
        offset: Offset2D { x: 0, y: 0 },
        extent: swapchain_extent,
    })
    .clear_values(&clear_values);

unsafe {
    // INLINE means we record drawing commands directly in this
    // primary command buffer (not via secondary command buffers).
    device.cmd_begin_render_pass(
        command_buffer,
        &begin_info,
        SubpassContents::INLINE,
    );

    // ... bind pipeline, bind descriptor sets, draw ...

    device.cmd_end_render_pass(command_buffer);
};

Dynamic rendering (Vulkan 1.3)

Vulkan 1.3 introduced cmd_begin_rendering / cmd_end_rendering, which lets you skip render pass and framebuffer objects entirely. You specify attachments inline at recording time:

use vulkan_rust::vk;
use vk::*;

let color_attachment = RenderingAttachmentInfo::builder()
    .image_view(swapchain_image_view)
    .image_layout(ImageLayout::COLOR_ATTACHMENT_OPTIMAL)
    .load_op(AttachmentLoadOp::CLEAR)
    .store_op(AttachmentStoreOp::STORE)
    .clear_value(ClearValue {
        color: ClearColorValue {
            float32: [0.0, 0.0, 0.0, 1.0],
        },
    });

let rendering_info = RenderingInfo::builder()
    .render_area(Rect2D {
        offset: Offset2D { x: 0, y: 0 },
        extent: swapchain_extent,
    })
    .layer_count(1)
    .color_attachments(&[*color_attachment]);

unsafe {
    device.cmd_begin_rendering(command_buffer, &rendering_info);
    // ... draw ...
    device.cmd_end_rendering(command_buffer);
};

Dynamic rendering is simpler for most use cases. Use traditional render passes when you need subpass dependencies, input attachments, or compatibility with Vulkan 1.0/1.1/1.2.

Formal reference

Key structs

Struct                  Purpose
AttachmentDescription   Describes one attachment: format, samples,
                        load/store ops, layouts
AttachmentReference     Points a subpass to an attachment by index +
                        desired layout
SubpassDescription      Lists which attachments a subpass uses (color,
                        depth, input, preserve)
SubpassDependency       Synchronization between subpasses (same fields as
                        a pipeline barrier)
RenderPassCreateInfo    Combines attachments + subpasses + dependencies
FramebufferCreateInfo   Binds specific image views to a render pass
RenderPassBeginInfo     Starts a render pass instance with a framebuffer +
                        clear values

Subpass dependencies are barriers

A SubpassDependency has the same fields as a pipeline barrier: src_stage_mask, dst_stage_mask, src_access_mask, dst_access_mask. The special value SUBPASS_EXTERNAL refers to commands outside the render pass (before it starts or after it ends).

If you understood Synchronization, subpass dependencies will feel familiar. They are barriers that the driver inserts automatically at subpass transitions.

Layout transitions are automatic

The render pass handles image layout transitions for you. Each attachment has an initial_layout and final_layout. The driver transitions the image at render pass begin/end. Within a subpass, the image is in the layout specified by the AttachmentReference.

This is one of the render pass’s biggest conveniences: you do not need to insert manual cmd_pipeline_barrier calls for attachment layout transitions inside a render pass.

Key takeaways

  • A render pass is a blueprint describing attachments, subpasses, and dependencies. A framebuffer binds specific images to that blueprint.
  • Load and store ops tell the driver how to handle attachment data at the start and end of the pass. Choosing DONT_CARE or CLEAR over LOAD can dramatically improve performance on mobile GPUs.
  • Most applications need only a single subpass. Multiple subpasses are for advanced techniques (deferred rendering, input attachments).
  • Vulkan 1.3 dynamic rendering (cmd_begin_rendering) eliminates the need for render pass and framebuffer objects in simple cases.
  • Render passes handle layout transitions automatically. You do not need manual barriers for attachment images inside a render pass.

Pipelines

Threshold concept. In OpenGL, you set rendering state one call at a time (blend mode here, depth test there) and the driver compiles the final state lazily. In Vulkan, all state is compiled into a pipeline object up front. This removes driver guesswork and hitching, at the cost of more explicit setup.

Motivation

A GPU is not a general-purpose processor. It is a configurable state machine with fixed-function stages (vertex input, rasterization, blending) and programmable stages (vertex shader, fragment shader). A pipeline object captures the full configuration of this machine (every stage, every setting) so the driver can compile it to hardware instructions once and reuse it many times.

This is why OpenGL applications sometimes stutter when a new material appears: the driver has to compile a new internal pipeline on the fly. In Vulkan, you create all your pipelines at load time and switch between them during rendering with zero compilation cost.

Intuition

The mixing console preset

A pipeline is like a preset on a mixing console. Instead of adjusting every knob during a live performance (and risking a pop or crackle), you save the full board state as a preset and recall it instantly. You can have many presets and switch between them, but you cannot twiddle individual knobs mid-song.

(Dynamic state relaxes this: certain knobs can be adjusted at draw time, and Vulkan 1.3 expanded the list considerably. But the core idea holds: most state is baked.)

What goes into a graphics pipeline

A graphics pipeline is the largest create info in the Vulkan API. It bundles together every stage of the rendering process:

GraphicsPipelineCreateInfo
│
├── Shader stages            (vertex shader, fragment shader, ...)
├── Vertex input state       (what vertex data looks like)
├── Input assembly state     (triangles, lines, points)
├── Viewport state           (viewport + scissor rectangle)
├── Rasterization state      (polygon mode, culling, depth bias)
├── Multisample state        (MSAA settings)
├── Depth/stencil state      (depth test, stencil test)
├── Color blend state        (blending per attachment)
├── Dynamic state            (which of the above can change at draw time)
├── Pipeline layout          (what resources the shaders expect)
└── Render pass + subpass    (which render pass this pipeline is used in)

Nearly every one of these must be specified; there are no implicit defaults. This is verbose, but it means the driver has complete information at creation time and can optimize aggressively.

Before reading on: if you need to render some objects with blending and some without, how many pipeline objects do you need?

Answer: Two. Each pipeline bakes its blend state. You cmd_bind_pipeline to switch between them during command recording. Dynamic state (Vulkan 1.3) can make some of these switches cheaper, but you still need separate pipelines for fundamental differences like different shaders.

Pipeline layout: the bridge to resources

A pipeline layout declares what resources the shaders expect:

  • Descriptor set layouts: “binding 0 is a uniform buffer, binding 1 is a sampled image” (covered in Descriptor Sets)
  • Push constant ranges: small inline data passed at draw time (covered in Push Constants)

The pipeline layout is shared between pipeline creation and command recording, ensuring the resources you bind match what the shaders expect.
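A sketch of a non-empty layout, assuming a descriptor_layout created as in Descriptor Sets (the 64-byte push constant range is a hypothetical mat4):

```rust
use vulkan_rust::vk::*;

let push_range = PushConstantRange {
    stage_flags: ShaderStageFlags::VERTEX,
    offset: 0,
    size: 64,  // one mat4
};

// Bind the arrays to locals so they outlive the builder.
let set_layouts = [descriptor_layout];
let push_ranges = [push_range];
let layout_info = PipelineLayoutCreateInfo::builder()
    .set_layouts(&set_layouts)
    .push_constant_ranges(&push_ranges);

let pipeline_layout = unsafe {
    device.create_pipeline_layout(&layout_info, None)?
};
```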

Worked example: creating a graphics pipeline

This is a minimal pipeline for rendering colored triangles.

Step 1: Load shaders

use vulkan_rust::vk;
use vulkan_rust::vk::*;

// SPIR-V bytecode, compiled from GLSL with glslc or shaderc.
let vert_code: &[u32] = /* load from file or include_bytes! */;
let frag_code: &[u32] = /* load from file or include_bytes! */;

let vert_info = ShaderModuleCreateInfo::builder()
    .code(vert_code);
let frag_info = ShaderModuleCreateInfo::builder()
    .code(frag_code);

let vert_module = unsafe { device.create_shader_module(&vert_info, None)? };
let frag_module = unsafe { device.create_shader_module(&frag_info, None)? };

// Shader stage descriptions.
let entry_name = c"main";  // GLSL entry point

let stages = [
    *PipelineShaderStageCreateInfo::builder()
        .stage(ShaderStageFlags::VERTEX)
        .module(vert_module)
        .name(entry_name),
    *PipelineShaderStageCreateInfo::builder()
        .stage(ShaderStageFlags::FRAGMENT)
        .module(frag_module)
        .name(entry_name),
];

Step 2: Define vertex input

use vulkan_rust::vk;
use vulkan_rust::vk::*;

// Describe how vertex data is laid out in memory.
let binding = VertexInputBindingDescription {
    binding: 0,
    stride: std::mem::size_of::<Vertex>() as u32,
    input_rate: VertexInputRate::VERTEX,
};

let attributes = [
    // position: vec3 at offset 0
    VertexInputAttributeDescription {
        location: 0,
        binding: 0,
        format: Format::R32G32B32_SFLOAT,
        offset: 0,
    },
    // color: vec3 at offset 12
    VertexInputAttributeDescription {
        location: 1,
        binding: 0,
        format: Format::R32G32B32_SFLOAT,
        offset: 12,
    },
];

// Bind the binding array to a local so it outlives the builder
// (a temporary `&[binding]` would be dropped at the end of the statement).
let bindings = [binding];
let vertex_input = PipelineVertexInputStateCreateInfo::builder()
    .vertex_binding_descriptions(&bindings)
    .vertex_attribute_descriptions(&attributes);

Step 3: Configure fixed-function state

use vulkan_rust::vk;
use vulkan_rust::vk::*;

let input_assembly = PipelineInputAssemblyStateCreateInfo::builder()
    .topology(PrimitiveTopology::TRIANGLE_LIST);

// Use dynamic viewport and scissor so we don't bake window size
// into the pipeline. Set them at draw time with cmd_set_viewport
// and cmd_set_scissor.
let viewport_state = PipelineViewportStateCreateInfo::builder()
    .viewport_count(1)
    .scissor_count(1);

let rasterizer = PipelineRasterizationStateCreateInfo::builder()
    .polygon_mode(PolygonMode::FILL)
    .cull_mode(CullModeFlags::BACK)
    .front_face(FrontFace::COUNTER_CLOCKWISE)
    .line_width(1.0);

let multisampling = PipelineMultisampleStateCreateInfo::builder()
    .rasterization_samples(SampleCountFlagBits::_1);

let depth_stencil = PipelineDepthStencilStateCreateInfo::builder()
    .depth_test_enable(1)
    .depth_write_enable(1)
    .depth_compare_op(CompareOp::LESS);

// No blending: write color directly.
let blend_attachment = PipelineColorBlendAttachmentState {
    blend_enable: 0,
    color_write_mask: ColorComponentFlags::R
        | ColorComponentFlags::G
        | ColorComponentFlags::B
        | ColorComponentFlags::A,
    ..unsafe { core::mem::zeroed() }
};

let blend_attachments = [blend_attachment];
let color_blending = PipelineColorBlendStateCreateInfo::builder()
    .attachments(&blend_attachments);

// Dynamic state: viewport and scissor are set at draw time.
let dynamic_states = [
    DynamicState::VIEWPORT,
    DynamicState::SCISSOR,
];
let dynamic_state = PipelineDynamicStateCreateInfo::builder()
    .dynamic_states(&dynamic_states);

Step 4: Create pipeline layout and pipeline

use vulkan_rust::vk;
use vulkan_rust::vk::*;
use vulkan_rust::vk::Handle;

// Empty layout (no descriptor sets, no push constants).
let layout_info = PipelineLayoutCreateInfo::builder();
let pipeline_layout = unsafe {
    device.create_pipeline_layout(&layout_info, None)?
};

// Assemble everything into one create info.
let pipeline_info = GraphicsPipelineCreateInfo::builder()
    .stages(&stages)
    .vertex_input_state(&vertex_input)
    .input_assembly_state(&input_assembly)
    .viewport_state(&viewport_state)
    .rasterization_state(&rasterizer)
    .multisample_state(&multisampling)
    .depth_stencil_state(&depth_stencil)
    .color_blend_state(&color_blending)
    .dynamic_state(&dynamic_state)
    .layout(pipeline_layout)
    .render_pass(render_pass)
    .subpass(0);

// create_graphics_pipelines can create multiple pipelines at once.
let pipeline = unsafe {
    device.create_graphics_pipelines(
        PipelineCache::null(),  // no cache for now
        &[*pipeline_info],
        None,
    )?
}[0];

// Shader modules can be destroyed after pipeline creation.
// The compiled code is baked into the pipeline.
unsafe {
    device.destroy_shader_module(vert_module, None);
    device.destroy_shader_module(frag_module, None);
};

Step 5: Use in command recording

use vulkan_rust::vk;
use vulkan_rust::vk::*;

unsafe {
    device.cmd_bind_pipeline(
        command_buffer,
        PipelineBindPoint::GRAPHICS,
        pipeline,
    );

    // Set dynamic state.
    device.cmd_set_viewport(command_buffer, 0, &[viewport]);
    device.cmd_set_scissor(command_buffer, 0, &[scissor]);

    // Draw.
    device.cmd_draw(command_buffer, vertex_count, 1, 0, 0);
};

Compute pipelines

Compute pipelines are dramatically simpler: just a shader stage and a pipeline layout. No vertex input, no rasterization, no blending.

use vulkan_rust::vk;
use vulkan_rust::vk::*;
use vulkan_rust::vk::Handle;

let compute_info = ComputePipelineCreateInfo::builder()
    .stage(*PipelineShaderStageCreateInfo::builder()
        .stage(ShaderStageFlags::COMPUTE)
        .module(compute_module)
        .name(c"main"))
    .layout(compute_layout);

// Like create_graphics_pipelines, this can create several pipelines
// at once and returns them as a Vec.
let compute_pipeline = unsafe {
    device.create_compute_pipelines(
        PipelineCache::null(),
        &[*compute_info],
        None,
    )?
}[0];

Pipeline cache

Creating pipelines involves compiling shaders to GPU-specific machine code. A pipeline cache stores this compiled output so subsequent creations (in the same run or across runs, if you save/load the cache) are faster.

use vulkan_rust::vk;
use vulkan_rust::vk::*;

// Create a cache (optionally seeded with data from a previous run).
let cache_info = PipelineCacheCreateInfo::builder();
let cache = unsafe { device.create_pipeline_cache(&cache_info, None)? };

// Pass the cache when creating pipelines.
let pipeline = unsafe {
    device.create_graphics_pipelines(cache, &[*pipeline_info], None)?
}[0];

// At shutdown, retrieve cache data and save to disk for next run.
// (use get_pipeline_cache_data)
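A sketch of the full save/load cycle, assuming get_pipeline_cache_data returns the raw bytes and the builder accepts seed data (the file name is arbitrary). The blob is opaque and driver-specific, and drivers validate it on load, so passing a stale file is safe:

```rust
use vulkan_rust::vk::*;

// Startup: seed the cache with last run's data, if any.
let seed = std::fs::read("pipeline_cache.bin").unwrap_or_default();
let cache_info = PipelineCacheCreateInfo::builder()
    .initial_data(&seed);
let cache = unsafe { device.create_pipeline_cache(&cache_info, None)? };

// ... create pipelines, passing `cache` ...

// Shutdown: retrieve the (possibly grown) cache and persist it.
let data = unsafe { device.get_pipeline_cache_data(cache)? };
std::fs::write("pipeline_cache.bin", &data)?;
```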

Dynamic state (Vulkan 1.3)

By default, every setting in the pipeline is baked. Dynamic state lets you mark specific settings as “set at draw time”:

| Dynamic State      | What it replaces                          |
|--------------------|-------------------------------------------|
| VIEWPORT           | Viewport in viewport state                |
| SCISSOR            | Scissor in viewport state                 |
| LINE_WIDTH         | Line width in rasterization state         |
| DEPTH_TEST_ENABLE  | Depth test enable in depth/stencil state  |
| CULL_MODE          | Cull mode in rasterization state          |
| FRONT_FACE         | Front face in rasterization state         |
| PRIMITIVE_TOPOLOGY | Topology in input assembly state          |

VIEWPORT and SCISSOR have been available as dynamic state since Vulkan 1.0, and almost everyone uses them dynamically. Vulkan 1.3 promoted the extended dynamic states (CULL_MODE, FRONT_FACE, PRIMITIVE_TOPOLOGY, DEPTH_TEST_ENABLE, and more) to core. More aggressive dynamic state lets you consolidate pipelines: instead of separate pipelines for different cull modes, use one pipeline with CULL_MODE dynamic.
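For example, with CULL_MODE in the pipeline's dynamic state list, one pipeline can serve both back-face-culled and double-sided geometry (a sketch; cmd_set_cull_mode is the Vulkan 1.3 command, and the vertex counts are placeholders):

```rust
use vulkan_rust::vk::*;

unsafe {
    device.cmd_bind_pipeline(cmd, PipelineBindPoint::GRAPHICS, pipeline);

    // Opaque meshes: cull back faces.
    device.cmd_set_cull_mode(cmd, CullModeFlags::BACK);
    device.cmd_draw(cmd, opaque_vertex_count, 1, 0, 0);

    // Foliage: double-sided, no culling, same pipeline.
    device.cmd_set_cull_mode(cmd, CullModeFlags::NONE);
    device.cmd_draw(cmd, foliage_vertex_count, 1, 0, 0);
};
```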

Formal reference

Graphics pipeline stages (in order)

| Stage          | State struct                          | Required?                           |
|----------------|---------------------------------------|-------------------------------------|
| Vertex input   | PipelineVertexInputStateCreateInfo    | Yes                                 |
| Input assembly | PipelineInputAssemblyStateCreateInfo  | Yes                                 |
| Tessellation   | PipelineTessellationStateCreateInfo   | Only with tessellation shaders      |
| Viewport       | PipelineViewportStateCreateInfo       | Yes (unless rasterizer discards)    |
| Rasterization  | PipelineRasterizationStateCreateInfo  | Yes                                 |
| Multisample    | PipelineMultisampleStateCreateInfo    | Yes                                 |
| Depth/stencil  | PipelineDepthStencilStateCreateInfo   | If render pass has depth attachment |
| Color blend    | PipelineColorBlendStateCreateInfo     | If render pass has color attachments |
| Dynamic        | PipelineDynamicStateCreateInfo        | Optional                            |

Destruction order

  1. Destroy pipelines before their pipeline layout.
  2. Destroy pipeline layouts before their descriptor set layouts.
  3. Shader modules can be destroyed immediately after pipeline creation.
  4. Pipeline caches can be destroyed at any time (they are independent of the pipelines created through them).
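Put together, a teardown for the worked example above might look like this (a sketch; always wait for the device to go idle before destroying objects the GPU may still be using):

```rust
use vulkan_rust::vk::*;

unsafe {
    device.device_wait_idle()?;

    device.destroy_pipeline(pipeline, None);               // before its layout
    device.destroy_pipeline_layout(pipeline_layout, None); // before set layouts
    // Shader modules were already destroyed right after pipeline creation.
    device.destroy_pipeline_cache(cache, None);            // any time
};
```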

Key takeaways

  • A graphics pipeline bakes all rendering state into one object: shaders, vertex layout, rasterization, blending, depth test, everything.
  • You create pipelines at load time and switch between them with cmd_bind_pipeline during rendering. Zero compilation cost at draw time.
  • Compute pipelines are much simpler: just a shader + layout.
  • Dynamic state lets you defer certain settings to draw time, reducing the number of pipeline objects you need.
  • Pipeline caches avoid redundant shader compilation across pipeline creations and across application runs.

Descriptor Sets & Resource Binding

Motivation

Shaders need access to resources: buffers containing transformation matrices, images to sample, storage buffers for compute output. Descriptors are Vulkan’s mechanism for connecting shader bindings (layout(binding = 0) uniform ...) to actual GPU resources.

The descriptor system is more complex than OpenGL’s glBindTexture, but it exists because binding resources one at a time is a bottleneck. Vulkan lets you bind sets of resources at once, and reuse those sets across multiple draw calls.

Intuition

The surgeon’s tray

Think of a descriptor set as a tray of tools laid out for a surgeon:

  • The descriptor set layout is the diagram showing which tool goes in which slot (“slot 0: scalpel, slot 1: forceps, slot 2: sutures”).
  • The descriptor pool is the sterilization room where trays are prepared (pre-allocated memory for many trays).
  • The descriptor set is one prepared tray, with actual tools in each slot.
  • Writing a descriptor set is placing specific tools into the slots.
  • Binding is sliding the tray under the surgeon’s hands during the operation.

The flow:

1. Define layout     →  "what slots exist and what types they hold"
2. Create pool       →  "how many trays can we prepare at once"
3. Allocate set      →  "give me an empty tray matching this layout"
4. Write descriptors →  "put this buffer in slot 0, this image in slot 1"
5. Bind set          →  "use this tray for the next draw calls"

Before reading on: why do you think Vulkan uses descriptor “pools” instead of allocating descriptors individually? What performance problem does this solve?

Answer: Same reason as command pools: individual allocations are expensive because each one requires driver bookkeeping and possibly a kernel call. Pools pre-allocate a block of memory and hand out descriptors cheaply from that block.

Descriptor types

Each slot in a descriptor set has a specific type:

| Type                   | What it binds                          | GLSL example                                      |
|------------------------|----------------------------------------|---------------------------------------------------|
| UNIFORM_BUFFER         | Read-only buffer (matrices, parameters) | layout(binding=0) uniform UBO { mat4 mvp; };     |
| STORAGE_BUFFER         | Read/write buffer (compute data)       | layout(binding=0) buffer SSBO { float data[]; };  |
| COMBINED_IMAGE_SAMPLER | Image + sampler together               | layout(binding=0) uniform sampler2D tex;          |
| SAMPLED_IMAGE          | Image without sampler                  | layout(binding=0) uniform texture2D tex;          |
| SAMPLER                | Sampler without image                  | layout(binding=0) uniform sampler s;              |
| STORAGE_IMAGE          | Read/write image (compute)             | layout(binding=0, rgba8) uniform image2D img;     |
| INPUT_ATTACHMENT       | Previous subpass output                | layout(input_attachment_index=0) uniform subpassInput; |

The most common are UNIFORM_BUFFER and COMBINED_IMAGE_SAMPLER.

Worked example: binding a uniform buffer and a texture

Step 1: Create a descriptor set layout

use vulkan_rust::vk;
use vk::*;

// Describe the bindings: slot 0 is a uniform buffer visible to
// the vertex shader, slot 1 is a combined image sampler visible
// to the fragment shader.
let bindings = [
    DescriptorSetLayoutBinding {
        binding: 0,
        descriptor_type: DescriptorType::UNIFORM_BUFFER,
        descriptor_count: 1,
        stage_flags: ShaderStageFlags::VERTEX,
        p_immutable_samplers: core::ptr::null(),
    },
    DescriptorSetLayoutBinding {
        binding: 1,
        descriptor_type: DescriptorType::COMBINED_IMAGE_SAMPLER,
        descriptor_count: 1,
        stage_flags: ShaderStageFlags::FRAGMENT,
        p_immutable_samplers: core::ptr::null(),
    },
];

let layout_info = DescriptorSetLayoutCreateInfo::builder()
    .bindings(&bindings);

let descriptor_layout = unsafe {
    device.create_descriptor_set_layout(&layout_info, None)?
};

// This layout is also passed to create_pipeline_layout, connecting
// the pipeline to the descriptor set structure.

Step 2: Create a descriptor pool

use vulkan_rust::vk;
use vk::*;

// The pool must have enough room for the descriptor types we need.
// If we want 10 sets, each with 1 uniform buffer and 1 image sampler:
let pool_sizes = [
    DescriptorPoolSize {
        r#type: DescriptorType::UNIFORM_BUFFER,
        descriptor_count: 10,
    },
    DescriptorPoolSize {
        r#type: DescriptorType::COMBINED_IMAGE_SAMPLER,
        descriptor_count: 10,
    },
];

let pool_info = DescriptorPoolCreateInfo::builder()
    .max_sets(10)
    .pool_sizes(&pool_sizes);

let descriptor_pool = unsafe {
    device.create_descriptor_pool(&pool_info, None)?
};

Step 3: Allocate a descriptor set

use vulkan_rust::vk;
use vk::*;

// Bind the layout array to a local so it outlives the builder.
let set_layouts = [descriptor_layout];
let alloc_info = DescriptorSetAllocateInfo::builder()
    .descriptor_pool(descriptor_pool)
    .set_layouts(&set_layouts);

let descriptor_set = unsafe {
    device.allocate_descriptor_sets(&alloc_info)?
}[0];

Step 4: Write descriptors (point slots to actual resources)

use vulkan_rust::vk;
use vk::*;

// Point binding 0 to our uniform buffer.
let buffer_info = DescriptorBufferInfo {
    buffer: uniform_buffer,
    offset: 0,
    range: std::mem::size_of::<UniformData>() as u64,
};

// Point binding 1 to our texture.
let image_info = DescriptorImageInfo {
    sampler: texture_sampler,
    image_view: texture_image_view,
    image_layout: ImageLayout::SHADER_READ_ONLY_OPTIMAL,
};

// Bind the info arrays to locals so the pointers stored in the writes
// stay valid until update_descriptor_sets is called.
let buffer_infos = [buffer_info];
let image_infos = [image_info];

let writes = [
    *WriteDescriptorSet::builder()
        .dst_set(descriptor_set)
        .dst_binding(0)
        .descriptor_type(DescriptorType::UNIFORM_BUFFER)
        .buffer_info(&buffer_infos),
    *WriteDescriptorSet::builder()
        .dst_set(descriptor_set)
        .dst_binding(1)
        .descriptor_type(DescriptorType::COMBINED_IMAGE_SAMPLER)
        .image_info(&image_infos),
];

// This updates the descriptor set immediately. No command buffer needed.
unsafe { device.update_descriptor_sets(&writes, &[]) };

Step 5: Bind during command recording

use vulkan_rust::vk;
use vk::*;

unsafe {
    device.cmd_bind_descriptor_sets(
        command_buffer,
        PipelineBindPoint::GRAPHICS,
        pipeline_layout,
        0,                       // first set index
        &[descriptor_set],       // sets to bind
        &[],                     // dynamic offsets (none)
    );

    // Now draw calls in this command buffer can access the
    // uniform buffer at binding 0 and the texture at binding 1.
    device.cmd_draw(command_buffer, vertex_count, 1, 0, 0);
};

Multiple descriptor sets

You can bind multiple descriptor sets at once. A common pattern:

Set 0: Per-frame data      (camera matrices, lighting, time)
Set 1: Per-material data   (textures, material properties)
Set 2: Per-object data     (model matrix)

This lets you update and bind sets at different frequencies. Set 0 changes once per frame, set 1 changes when you switch materials, set 2 changes per object. You only rebind the sets that changed.

use vulkan_rust::vk;
use vk::*;

// In pipeline layout creation:
let layouts = [per_frame_layout, per_material_layout, per_object_layout];
let layout_info = PipelineLayoutCreateInfo::builder()
    .set_layouts(&layouts);

// During rendering:
unsafe {
    // Bind set 0 once per frame.
    device.cmd_bind_descriptor_sets(
        cmd, PipelineBindPoint::GRAPHICS,
        pipeline_layout, 0, &[per_frame_set], &[],
    );

    for material in &materials {
        // Bind set 1 per material.
        device.cmd_bind_descriptor_sets(
            cmd, PipelineBindPoint::GRAPHICS,
            pipeline_layout, 1, &[material.descriptor_set], &[],
        );

        for object in &material.objects {
            // Bind set 2 per object.
            device.cmd_bind_descriptor_sets(
                cmd, PipelineBindPoint::GRAPHICS,
                pipeline_layout, 2, &[object.descriptor_set], &[],
            );
            device.cmd_draw(cmd, object.vertex_count, 1, 0, 0);
        }
    }
};

Before reading on: in the pattern above, when you bind set 1 for a new material, does set 0 (per-frame) stay bound or does it need to be rebound?

Answer: It stays bound. Binding set N only affects set N. Sets at other indices remain bound from their previous cmd_bind_descriptor_sets call, as long as the pipeline layout is compatible.

Formal reference

The descriptor set creation flow

DescriptorSetLayoutBinding[]
          │
          v
DescriptorSetLayoutCreateInfo ──> create_descriptor_set_layout ──> DescriptorSetLayout
                                                                          │
                    ┌─────────────────────────────────────────────────────┘
                    v
DescriptorPoolCreateInfo ──> create_descriptor_pool ──> DescriptorPool
                    │                                            │
                    v                                            v
DescriptorSetAllocateInfo ──────> allocate_descriptor_sets ──> DescriptorSet
                                                                    │
                                                                    v
WriteDescriptorSet[] ──────────> update_descriptor_sets   (set is now usable)
                                                            │
                                                            v
cmd_bind_descriptor_sets ──────> (shaders can access resources)

Descriptor types reference

| Type                   | Read/Write | Typical use                                      |
|------------------------|------------|--------------------------------------------------|
| UNIFORM_BUFFER         | Read       | Matrices, parameters (small, frequently updated) |
| UNIFORM_BUFFER_DYNAMIC | Read       | Same, with dynamic offset at bind time           |
| STORAGE_BUFFER         | Read/Write | Large data, compute buffers                      |
| STORAGE_BUFFER_DYNAMIC | Read/Write | Same, with dynamic offset                        |
| COMBINED_IMAGE_SAMPLER | Read       | Textures                                         |
| SAMPLED_IMAGE          | Read       | Image without sampler (separate sampler)         |
| SAMPLER                | N/A        | Sampler without image                            |
| STORAGE_IMAGE          | Read/Write | Compute shader image output                      |
| INPUT_ATTACHMENT       | Read       | Previous subpass output                          |
| INLINE_UNIFORM_BLOCK   | Read       | Small uniform data inline in the set             |

Destruction order

  1. Destroy pipeline layouts before descriptor set layouts.
  2. Destroying a descriptor pool frees all sets allocated from it.
  3. Descriptor set layouts can be destroyed after pipeline creation (the pipeline bakes a copy of the layout information).

Key takeaways

  • Descriptors connect shader bindings to GPU resources (buffers, images).
  • The flow is: define layout → create pool → allocate set → write → bind.
  • Use multiple descriptor sets (per-frame, per-material, per-object) to minimize rebinding. Only rebind sets that change.
  • Descriptor pools work like command pools: pre-allocate in bulk, hand out cheaply.
  • update_descriptor_sets is a CPU-side operation, not a GPU command. You can update sets between submissions without recording commands.

The pNext Extension Chain

Motivation

Vulkan evolves through extensions, and extensions often need to add fields to existing structs. But Vulkan structs are #[repr(C)] with a fixed layout; you cannot simply add fields. The solution is pNext: a linked list pointer in every extensible struct that lets you chain additional data structures onto it.

This is Vulkan’s most powerful extensibility mechanism and one of its most confusing features for newcomers. Once you understand it, enabling new Vulkan features and extensions becomes straightforward.

Intuition

The envelope analogy

Every Vulkan struct with a pNext field is an envelope. The main struct is the letter inside. The pNext chain lets you stuff additional pages into the same envelope.

The driver opens the envelope, reads the main page, then checks if there are more pages. Each extra page has a header (sType) that identifies what it is, so the driver knows how to interpret it. Pages it doesn’t recognize are silently skipped.

DeviceCreateInfo (envelope)
├── sType: DEVICE_CREATE_INFO          (header: "this is a device create info")
├── pNext ──────────────────────────┐
├── ... (normal fields)             │
│                                   v
│               PhysicalDeviceVulkan12Features (extra page)
│               ├── sType: PHYSICAL_DEVICE_VULKAN_1_2_FEATURES
│               ├── pNext ──────────────────────────┐
│               ├── ... (Vulkan 1.2 feature flags)  │
│                                                   v
│                           PhysicalDeviceVulkan13Features (another page)
│                           ├── sType: PHYSICAL_DEVICE_VULKAN_1_3_FEATURES
│                           ├── pNext: null (end of chain)
│                           ├── ... (Vulkan 1.3 feature flags)

Under the hood: two pointers

Every extensible Vulkan struct starts with the same two fields:

pub struct SomeCreateInfo {
    pub s_type: StructureType,           // identifies the struct type
    pub p_next: *const core::ffi::c_void, // pointer to next struct in chain
    // ... rest of the fields
}

The sType field is a discriminator, like a tagged union. The driver reads sType to know what struct it’s looking at, then casts the pointer to the correct type. This is the same pattern as COM’s QueryInterface or protobuf’s Any.
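To make this concrete, here is an illustrative (hypothetical) chain walk in the style a driver or debugging layer might use, built on the generic BaseInStructure header type:

```rust
use vulkan_rust::vk::*;

// Walk a pNext chain: read each sType, handle structs we recognize,
// and step past the ones we don't.
unsafe fn walk_chain(mut p: *const BaseInStructure) {
    while !p.is_null() {
        match (*p).s_type {
            StructureType::PHYSICAL_DEVICE_VULKAN_1_2_FEATURES => {
                let _features = p as *const PhysicalDeviceVulkan12Features;
                // ... read the Vulkan 1.2 feature flags ...
            }
            _ => { /* unrecognized struct: skip it */ }
        }
        p = (*p).p_next;
    }
}
```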

Worked example: enabling Vulkan 1.2 and 1.3 features

The most common use of pNext chains is enabling device features from newer Vulkan versions or extensions.

Without vulkan_rust builders (raw C-style)

use vulkan_rust::vk;
use vulkan_rust::vk::*;

// You would need to manually link the structs:
let mut features_13 = PhysicalDeviceVulkan13Features {
    s_type: StructureType::PHYSICAL_DEVICE_VULKAN_1_3_FEATURES,
    p_next: core::ptr::null(),  // end of chain
    dynamic_rendering: 1,   // enable dynamic rendering
    synchronization2: 1,    // enable synchronization2
    ..unsafe { core::mem::zeroed() }
};

let mut features_12 = PhysicalDeviceVulkan12Features {
    s_type: StructureType::PHYSICAL_DEVICE_VULKAN_1_2_FEATURES,
    p_next: &mut features_13 as *mut _ as *const _,  // link to next
    buffer_device_address: 1,
    descriptor_indexing: 1,
    ..unsafe { core::mem::zeroed() }
};

let device_info = DeviceCreateInfo {
    s_type: StructureType::DEVICE_CREATE_INFO,
    p_next: &mut features_12 as *mut _ as *const _,  // link to chain
    // ...
};

This is error-prone: wrong sType, dangling pointers, forgetting to link the chain. vulkan_rust builders fix all of these problems.

With vulkan_rust builders (type-safe)

use vulkan_rust::vk;
use vulkan_rust::vk::*;

let mut features_12 = *PhysicalDeviceVulkan12Features::builder()
    .buffer_device_address(1)
    .descriptor_indexing(1);

let mut features_13 = *PhysicalDeviceVulkan13Features::builder()
    .dynamic_rendering(1)
    .synchronization2(1);

let device_info = DeviceCreateInfo::builder()
    .push_next(&mut features_12)
    .push_next(&mut features_13)
    // ... other fields
    ;

The builder handles:

  • sType is set automatically by builder().
  • pNext linking is handled by push_next, which prepends each struct to the chain.
  • Type safety via marker traits: push_next only accepts types that the Vulkan spec says are valid extensions for that struct. Passing an invalid type is a compile error.

Before reading on: what do you think happens if you chain a struct that the driver doesn’t recognize (e.g., an extension struct the driver doesn’t support)?

Answer: The driver skips it. Every struct in the chain has an sType header. The driver reads each sType, processes structs it recognizes, and follows the pNext pointer past structs it doesn’t. This is how forward compatibility works: old drivers ignore new extension structs.

How push_next works

The push_next method prepends to the chain. Each call inserts the new struct at the front:

// push_next implementation (simplified):
pub fn push_next<T: ExtendsDeviceCreateInfo>(mut self, next: &'a mut T) -> Self {
    unsafe {
        let next_ptr = next as *mut T as *mut BaseOutStructure;
        // Point the new struct's pNext to the current chain head.
        (*next_ptr).p_next = self.inner.p_next as *mut _;
        // Make the new struct the chain head.
        self.inner.p_next = next_ptr as *const _;
    }
    self
}

After two push_next calls:

DeviceCreateInfo.pNext → features_13 → features_12 → null
                         (last pushed    (first pushed
                          is first)       is last)

The order in the chain does not matter to the driver. It walks the entire chain regardless of order.

The Extends marker traits

For each extensible struct, vulkan_rust generates an unsafe trait:

pub unsafe trait ExtendsDeviceCreateInfo {}

Types that the Vulkan spec says can appear in DeviceCreateInfo’s pNext chain implement this trait:

unsafe impl ExtendsDeviceCreateInfo for PhysicalDeviceVulkan12Features {}
unsafe impl ExtendsDeviceCreateInfo for PhysicalDeviceVulkan13Features {}
unsafe impl ExtendsDeviceCreateInfo for DevicePrivateDataCreateInfo {}
// ... hundreds more

These traits are generated from the structextends attribute in vk.xml, so they are always in sync with the Vulkan spec.

If you try to push_next a struct that doesn’t implement the trait:

use vulkan_rust::vk;
use vulkan_rust::vk::*;

// Compile error: PhysicalDeviceMemoryProperties does not implement
// ExtendsDeviceCreateInfo
let info = DeviceCreateInfo::builder()
    .push_next(&mut mem_props);  // ← won't compile

The builder Deref pattern

vulkan_rust builders implement Deref<Target = InnerStruct>, so you can pass a builder anywhere a reference to the inner struct is expected:

use vulkan_rust::vk;
use vulkan_rust::vk::*;

let info = DeviceCreateInfo::builder()
    .queue_create_infos(&queue_infos)
    .push_next(&mut features_12);

// No need to call .build(), just pass &info or *info.
let device = unsafe { instance.create_device(physical_device, &info, None)? };

The *info dereference gives you the inner DeviceCreateInfo. The &info auto-derefs to &DeviceCreateInfo through Deref.

Lifetime safety

Builders carry a lifetime parameter 'a to ensure that references passed to push_next (and slice methods like queue_create_infos) live long enough:

pub struct DeviceCreateInfoBuilder<'a> {
    inner: DeviceCreateInfo,
    _marker: PhantomData<&'a ()>,
}

This means the builder and everything chained into it must live in the same scope. The compiler enforces this:

use vulkan_rust::vk;
use vulkan_rust::vk::*;

let info = {
    let mut features = *PhysicalDeviceVulkan12Features::builder();
    DeviceCreateInfo::builder()
        .push_next(&mut features)
    // ← compile error: `features` does not live long enough
};

Common pNext patterns

Querying supported features

Chain feature structs into PhysicalDeviceFeatures2 and call get_physical_device_features2:

use vulkan_rust::vk;
use vulkan_rust::vk::*;

let mut features_12 = *PhysicalDeviceVulkan12Features::builder();
let mut features_13 = *PhysicalDeviceVulkan13Features::builder();

let mut features2 = PhysicalDeviceFeatures2::builder()
    .push_next(&mut features_12)
    .push_next(&mut features_13);

unsafe {
    instance.get_physical_device_features2(physical_device, &mut *features2);
};

// Now features_12 and features_13 are filled in by the driver.
if features_12.buffer_device_address != 0 {
    println!("Buffer device address is supported");
}

Enabling features at device creation

Pass the same structs (with your desired features set to 1) into DeviceCreateInfo via push_next, as shown in the worked example above.

Formal reference

Key types

| Type               | Purpose                                                          |
|--------------------|------------------------------------------------------------------|
| BaseInStructure    | Generic pNext chain traversal (const). Fields: s_type, p_next.   |
| BaseOutStructure   | Generic pNext chain traversal (mutable). Fields: s_type, p_next. |
| StructureType      | Enum identifying each struct type. Set automatically by builder(). |
| ExtendsXxx traits  | Marker traits generated from vk.xml structextends attribute.     |

Rules

  1. Never set sType manually. builder() does it for you.
  2. Never manipulate pNext directly. Use push_next.
  3. Order in the chain does not matter. The driver walks the full chain.
  4. Lifetimes must be valid. All chained structs must outlive the API call that consumes them.
  5. Unknown structs are skipped. Chaining an extension struct the driver doesn’t support is safe; it will be ignored.
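Rules 3 and 5 can be seen in a toy chain walker (plain stand-in types, nothing from the crate):

```rust
// Toy pNext chain: each node carries an sType tag and an owned next pointer.
struct Node {
    s_type: u32,
    next: Option<Box<Node>>,
}

// A driver-style walk: visit every node, act on the sTypes it recognizes,
// and silently skip the rest (rule 5). Because the whole chain is visited,
// node order never matters (rule 3).
fn known_stypes(head: Option<&Node>, known: &[u32]) -> Vec<u32> {
    let mut found = Vec::new();
    let mut cursor = head;
    while let Some(node) = cursor {
        if known.contains(&node.s_type) {
            found.push(node.s_type);
        }
        cursor = node.next.as_deref();
    }
    found
}

fn main() {
    // Chain: 13 -> 99 (unknown to this "driver") -> 12.
    let chain = Node {
        s_type: 13,
        next: Some(Box::new(Node {
            s_type: 99,
            next: Some(Box::new(Node { s_type: 12, next: None })),
        })),
    };
    let mut hits = known_stypes(Some(&chain), &[12, 13]);
    hits.sort();
    assert_eq!(hits, vec![12, 13]); // 99 was skipped
}
```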

Key takeaways

  • pNext is a linked list that lets extensions add data to existing structs without changing their layout.
  • vulkan_rust builders make pNext chains type-safe: push_next only accepts types the spec allows, sType is set automatically, and lifetimes are enforced by the compiler.
  • The most common use case is enabling device features from Vulkan 1.2, 1.3, or extensions at device creation time.
  • Chain order does not matter. Unknown structs are silently skipped.

Validation Layers & Debugging

Motivation

Vulkan does almost no error checking at runtime: calling a function incorrectly is undefined behavior, not an error message. This is fast but makes debugging brutal. A typo in a pipeline barrier’s access mask won’t crash immediately; it will cause a subtle rendering glitch three frames later on one specific GPU.

Validation layers are optional middleware that intercept every Vulkan call and check it against the spec. They catch invalid usage, report synchronization hazards, and point you to the exact spec section that explains what went wrong. You should always enable them during development.

Intuition

The strict code reviewer

Validation layers are a strict code reviewer sitting between your application and the driver. Every API call passes through the reviewer first. In development, the reviewer catches your mistakes before they reach the driver. In production, you remove the reviewer and calls go straight through.

Your app ──> Validation Layer ──> Vulkan Driver ──> GPU
              │
              │ "ERROR: Buffer 0x42 was not created with
              │  TRANSFER_DST usage, but you're using it
              │  as a copy destination. See spec section 7.4."
              v
            Callback (your code logs or prints this)

Without validation layers:

Your app ──────────────────────> Vulkan Driver ──> GPU
                                                    │
                                                    │ (undefined behavior,
                                                    │  maybe works, maybe
                                                    │  corrupts memory,
                                                    │  maybe crashes later)

Before reading on: why do you think Vulkan chose to make error checking optional instead of always-on?

Answer: Performance. Validation checking every API call adds measurable overhead (sometimes 2-5x slower). For a shipped game running at 60fps, that cost is unacceptable. By making validation optional, development builds get thorough checking while release builds get maximum performance.

Worked example: enabling validation with a debug messenger

Step 1: Enable the validation layer at instance creation

use vulkan_rust::vk;
use vk::*;

// The standard validation layer name.
let validation_layer = c"VK_LAYER_KHRONOS_validation";
let layer_names = [validation_layer.as_ptr()];

// The debug utils extension lets us receive callbacks.
use vk::extension_names::EXT_DEBUG_UTILS_EXTENSION_NAME;
let extension_names = [
    EXT_DEBUG_UTILS_EXTENSION_NAME.as_ptr(),
];

let instance_info = InstanceCreateInfo::builder()
    .enabled_layer_names(&layer_names)
    .enabled_extension_names(&extension_names);

let instance = unsafe { entry.create_instance(&instance_info, None)? };

Step 2: Set up a debug messenger

The debug messenger calls your function whenever validation finds a problem.

use std::ffi::CStr;
use vulkan_rust::vk;
use vk::*;

// This callback receives validation messages.
// The signature must match PFN_vkDebugUtilsMessengerCallbackEXT.
unsafe extern "system" fn debug_callback(
    severity: DebugUtilsMessageSeverityFlagsEXT,
    message_type: DebugUtilsMessageTypeFlagsEXT,
    callback_data: *const DebugUtilsMessengerCallbackDataEXT,
    _user_data: *mut core::ffi::c_void,
) -> u32 {
    let message = if !callback_data.is_null() {
        let data = &*callback_data;
        if !data.p_message.is_null() {
            CStr::from_ptr(data.p_message).to_string_lossy()
        } else {
            std::borrow::Cow::Borrowed("(no message)")
        }
    } else {
        std::borrow::Cow::Borrowed("(no callback data)")
    };

    if severity & DebugUtilsMessageSeverityFlagsEXT::ERROR
        != DebugUtilsMessageSeverityFlagsEXT::empty()
    {
        eprintln!("[VULKAN ERROR] {message}");
    } else if severity & DebugUtilsMessageSeverityFlagsEXT::WARNING
        != DebugUtilsMessageSeverityFlagsEXT::empty()
    {
        eprintln!("[VULKAN WARNING] {message}");
    }

    0 // returning 1 would abort the Vulkan call that triggered this
}

use vulkan_rust::vk;
use vk::*;

// Create the messenger.
let messenger_info = DebugUtilsMessengerCreateInfoEXT::builder()
    .message_severity(
        DebugUtilsMessageSeverityFlagsEXT::WARNING
        | DebugUtilsMessageSeverityFlagsEXT::ERROR,
    )
    .message_type(
        DebugUtilsMessageTypeFlagsEXT::GENERAL
        | DebugUtilsMessageTypeFlagsEXT::VALIDATION
        | DebugUtilsMessageTypeFlagsEXT::PERFORMANCE,
    )
    .pfn_user_callback(Some(debug_callback));

let messenger = unsafe {
    instance.create_debug_utils_messenger_ext(&messenger_info, None)?
};

Step 3: Trigger an error (intentionally)

To verify validation is working, do something wrong on purpose:

use vulkan_rust::vk;
use vk::*;

// Create a buffer without TRANSFER_DST usage, then try to copy into it.
let bad_buffer_info = BufferCreateInfo::builder()
    .size(1024)
    .usage(BufferUsageFlags::VERTEX_BUFFER)  // no TRANSFER_DST!
    .sharing_mode(SharingMode::EXCLUSIVE);

let bad_buffer = unsafe { device.create_buffer(&bad_buffer_info, None)? };

// Recording a copy to this buffer will produce a validation error:
// "vkCmdCopyBuffer: dstBuffer was not created with VK_BUFFER_USAGE_TRANSFER_DST_BIT"

Step 4: Clean up

use vulkan_rust::vk;

// Destroy the messenger before destroying the instance.
unsafe {
    instance.destroy_debug_utils_messenger_ext(messenger, None);
};

Message severity levels

  • VERBOSE: diagnostic noise (loader info, layer status). Usually filtered out.
  • INFO: informational messages (resource creation, state changes). Useful for deep debugging.
  • WARNING: potential problem (suboptimal usage, deprecated behavior). Investigate.
  • ERROR: spec violation (undefined behavior if ignored). Fix immediately.

Filter severity in the messenger creation to control verbosity. Most applications enable WARNING | ERROR and only enable VERBOSE | INFO when debugging specific issues.

Message types

  • GENERAL: general events (loader, layer lifecycle).
  • VALIDATION: spec violations (the most important type).
  • PERFORMANCE: suboptimal API usage that may hurt performance.
  • DEVICE_ADDRESS_BINDING: buffer device address binding events.

Catching errors during instance creation

There is a bootstrap problem: you need an instance to create a debug messenger, but errors can occur during instance creation. The solution: chain the messenger create info into the instance create info via pNext. The validation layer will use it for messages during create_instance:

use vulkan_rust::vk;
use vk::*;

let mut debug_info = DebugUtilsMessengerCreateInfoEXT::builder()
    .message_severity(
        DebugUtilsMessageSeverityFlagsEXT::WARNING
        | DebugUtilsMessageSeverityFlagsEXT::ERROR,
    )
    .message_type(
        DebugUtilsMessageTypeFlagsEXT::GENERAL
        | DebugUtilsMessageTypeFlagsEXT::VALIDATION
        | DebugUtilsMessageTypeFlagsEXT::PERFORMANCE,
    )
    .pfn_user_callback(Some(debug_callback));

// Chain into instance creation via pNext.
// DebugUtilsMessengerCreateInfoEXT implements ExtendsInstanceCreateInfo.
let instance_info = InstanceCreateInfo::builder()
    .enabled_layer_names(&layer_names)
    .enabled_extension_names(&extension_names)
    .push_next(&mut debug_info);

// Validation errors during create_instance will now trigger the callback.
let instance = unsafe { entry.create_instance(&instance_info, None)? };

// After instance creation, create a persistent messenger for the
// rest of the application's lifetime.
let messenger = unsafe {
    instance.create_debug_utils_messenger_ext(&debug_info, None)?
};

This is a practical example of pNext in action (see The pNext Extension Chain).

Common validation errors and what they mean

  • “not created with … usage”: a resource is missing a usage flag. Add the required usage flag at creation.
  • “layout is UNDEFINED but expected …”: the image is in the wrong layout. Add a pipeline barrier to transition it.
  • “access mask … not supported by stage …”: the access mask doesn’t match the pipeline stage. Check the barrier recipes table.
  • “must not be in RECORDING state”: a command buffer was submitted without being ended. Call end_command_buffer before submitting.
  • “is still in use by the GPU”: an object the GPU is using was destroyed. Wait for the fence before destroying.
  • “extension not enabled”: an extension feature was used without enabling it. Add the extension to instance/device creation.

Performance impact

Validation layers add significant overhead:

  • CPU time: Every API call is checked against the spec. Expect 2-5x slower CPU-side Vulkan calls.
  • Memory: The layer tracks all objects and their state.
  • GPU time: Minimal, but synchronization validation may serialize GPU work.

Always disable validation in release builds. A common pattern:

use std::ffi::c_char;

let enable_validation = cfg!(debug_assertions);

let layer_names: Vec<*const c_char> = if enable_validation {
    vec![c"VK_LAYER_KHRONOS_validation".as_ptr()]
} else {
    vec![]
};

Formal reference

Key types

  • DebugUtilsMessengerEXT: handle to the debug messenger.
  • DebugUtilsMessengerCreateInfoEXT: configuration (severity filter, type filter, callback).
  • DebugUtilsMessageSeverityFlagsEXT: severity bitmask (VERBOSE, INFO, WARNING, ERROR).
  • DebugUtilsMessageTypeFlagsEXT: type bitmask (GENERAL, VALIDATION, PERFORMANCE, DEVICE_ADDRESS_BINDING).

Required extension

The debug messenger requires the VK_EXT_debug_utils instance extension. Enable it with vk::extension_names::EXT_DEBUG_UTILS_EXTENSION_NAME.

Destruction order

  1. Destroy the debug messenger before destroying the instance.
  2. The pNext-chained messenger (for instance creation) is temporary and does not need separate destruction.

Key takeaways

  • Always enable validation layers during development. They catch undefined behavior that would otherwise silently corrupt rendering.
  • Set up a debug messenger callback to receive errors in your code. Don’t rely on console output; some platforms don’t have one.
  • Chain DebugUtilsMessengerCreateInfoEXT into InstanceCreateInfo via pNext to catch errors during instance creation.
  • Filter by severity (WARNING + ERROR) and type (VALIDATION + PERFORMANCE) for the best signal-to-noise ratio.
  • Disable validation in release builds. The overhead is significant.

Load and Sample Textures

Task: Load an image from disk, upload it to GPU memory, and sample it in a fragment shader.

Prerequisites

You should be comfortable with:

Overview

Sampling a texture in Vulkan requires several steps that OpenGL handled behind the scenes: creating a staging buffer, allocating a device-local image, transitioning layouts with pipeline barriers, copying data, and finally binding the image through a descriptor set. This recipe walks through each step.

Step 1: Load pixels from disk

Use the image crate to decode an image file into raw RGBA pixels.

let img = image::open("assets/texture.png")
    .expect("Failed to open image")
    .to_rgba8();

let (width, height) = img.dimensions();
let pixels = img.as_raw();
let image_size = u64::from(width) * u64::from(height) * 4; // 4 bytes per RGBA pixel

Step 2: Create a staging buffer

The CPU cannot write directly to device-local memory on most hardware. Upload the pixels into a host-visible staging buffer first.

use vulkan_rust::vk;
use vk::*;

let staging_info = BufferCreateInfo::builder()
    .size(image_size)
    .usage(BufferUsageFlags::TRANSFER_SRC)
    .sharing_mode(SharingMode::EXCLUSIVE);

let staging_buffer = unsafe { device.create_buffer(&staging_info, None) }
    .expect("Failed to create staging buffer");
let staging_reqs = unsafe { device.get_buffer_memory_requirements(staging_buffer) };

let staging_memory = allocate_and_bind_buffer(
    device,
    staging_buffer,
    &staging_reqs,
    &mem_properties,
    MemoryPropertyFlags::HOST_VISIBLE | MemoryPropertyFlags::HOST_COHERENT,
);

// Map, copy pixels, unmap.
unsafe {
    let ptr = device.map_memory(
        staging_memory, 0, image_size,
        MemoryMapFlags::empty(),
    )
    .expect("Failed to map memory");
    core::ptr::copy_nonoverlapping(
        pixels.as_ptr(), ptr as *mut u8, image_size as usize,
    );
    device.unmap_memory(staging_memory);
}

See Memory Management for the allocate_and_bind_buffer helper and the find_memory_type algorithm.
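That chapter has the full helpers; as a standalone refresher, the heart of the memory-type search can be sketched with simplified stand-in types (the flag values and struct shape here are illustrative, not the crate's API):

```rust
// Simplified stand-in for a memory type and two property flag bits.
#[derive(Clone, Copy)]
struct MemoryType {
    property_flags: u32,
}

const HOST_VISIBLE: u32 = 0b01;
const HOST_COHERENT: u32 = 0b10;

// Pick the first memory type that is (a) allowed by the resource's
// memory_type_bits and (b) has every property we require.
fn find_memory_type(types: &[MemoryType], type_bits: u32, required: u32) -> Option<u32> {
    types.iter().enumerate().find_map(|(i, mt)| {
        let allowed = type_bits & (1u32 << i) != 0;
        let has_props = mt.property_flags & required == required;
        (allowed && has_props).then_some(i as u32)
    })
}

fn main() {
    let types = [
        MemoryType { property_flags: 0 },                            // device-local only
        MemoryType { property_flags: HOST_VISIBLE },                 // visible, not coherent
        MemoryType { property_flags: HOST_VISIBLE | HOST_COHERENT }, // what staging wants
    ];
    // The resource allows types 1 and 2; we need visible + coherent -> index 2.
    assert_eq!(
        find_memory_type(&types, 0b110, HOST_VISIBLE | HOST_COHERENT),
        Some(2)
    );
}
```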

Step 3: Create the device-local image

The image needs TRANSFER_DST (we will copy into it) and SAMPLED (the fragment shader will sample it).

use vulkan_rust::vk;
use vk::*;

let image_info = ImageCreateInfo::builder()
    .image_type(ImageType::_2D)
    .format(Format::R8G8B8A8_SRGB)
    .extent(Extent3D { width, height, depth: 1 })
    .mip_levels(1)
    .array_layers(1)
    .samples(SampleCountFlagBits::_1)
    .tiling(ImageTiling::OPTIMAL)
    .usage(
        ImageUsageFlags::TRANSFER_DST
        | ImageUsageFlags::SAMPLED
    )
    .sharing_mode(SharingMode::EXCLUSIVE)
    .initial_layout(ImageLayout::UNDEFINED);

let texture_image = unsafe { device.create_image(&image_info, None) }
    .expect("Failed to create image");

// Allocate DEVICE_LOCAL memory and bind it to the image.
let img_reqs = unsafe { device.get_image_memory_requirements(texture_image) };
let texture_memory = allocate_and_bind_image(
    device, texture_image, &img_reqs, &mem_properties,
    MemoryPropertyFlags::DEVICE_LOCAL,
);

Step 4: Transition layout UNDEFINED to TRANSFER_DST_OPTIMAL

Before copying into the image, transition it to a layout the transfer engine can write to. This requires a pipeline barrier.

Before reading on: why can’t we just copy into an image that is in UNDEFINED layout? What does the layout tell the driver?

Answer: the layout tells the driver how the image’s pixels are arranged in memory (for example, tiled for fast sampling). UNDEFINED promises nothing about that arrangement, so the driver cannot know how writes should land; transitioning to TRANSFER_DST_OPTIMAL commits the image to a concrete arrangement the transfer engine knows how to write.

use vulkan_rust::vk;
use vk::*;
use vk::constants::QUEUE_FAMILY_IGNORED;

let barrier_to_transfer = ImageMemoryBarrier::builder()
    .old_layout(ImageLayout::UNDEFINED)
    .new_layout(ImageLayout::TRANSFER_DST_OPTIMAL)
    .src_queue_family_index(QUEUE_FAMILY_IGNORED)
    .dst_queue_family_index(QUEUE_FAMILY_IGNORED)
    .image(texture_image)
    .subresource_range(ImageSubresourceRange {
        aspect_mask: ImageAspectFlags::COLOR,
        base_mip_level: 0,
        level_count: 1,
        base_array_layer: 0,
        layer_count: 1,
    })
    // No prior access to wait for (image was UNDEFINED).
    .src_access_mask(AccessFlags::NONE)
    // The transfer write must wait until the transition completes.
    .dst_access_mask(AccessFlags::TRANSFER_WRITE);

unsafe {
    device.cmd_pipeline_barrier(
        cmd,
        PipelineStageFlags::TOP_OF_PIPE,   // src stage: nothing before
        PipelineStageFlags::TRANSFER,       // dst stage: transfer write
        DependencyFlags::empty(),
        &[],             // memory barriers
        &[],             // buffer memory barriers
        &[*barrier_to_transfer],
    );
}

See Synchronization for a deeper explanation of pipeline barriers and access masks.

Step 5: Copy staging buffer to image

use vulkan_rust::vk;
use vk::*;

let region = BufferImageCopy {
    buffer_offset: 0,
    // 0 means tightly packed (no padding between rows).
    buffer_row_length: 0,
    buffer_image_height: 0,
    image_subresource: ImageSubresourceLayers {
        aspect_mask: ImageAspectFlags::COLOR,
        mip_level: 0,
        base_array_layer: 0,
        layer_count: 1,
    },
    image_offset: Offset3D { x: 0, y: 0, z: 0 },
    image_extent: Extent3D { width, height, depth: 1 },
};

unsafe {
    device.cmd_copy_buffer_to_image(
        cmd,
        staging_buffer,
        texture_image,
        ImageLayout::TRANSFER_DST_OPTIMAL,
        &[region],
    );
}

Step 6: Transition layout TRANSFER_DST to SHADER_READ_ONLY

After the copy, transition the image to a layout the shader can read.

use vulkan_rust::vk;
use vk::*;
use vk::constants::QUEUE_FAMILY_IGNORED;

let barrier_to_shader = ImageMemoryBarrier::builder()
    .old_layout(ImageLayout::TRANSFER_DST_OPTIMAL)
    .new_layout(ImageLayout::SHADER_READ_ONLY_OPTIMAL)
    .src_queue_family_index(QUEUE_FAMILY_IGNORED)
    .dst_queue_family_index(QUEUE_FAMILY_IGNORED)
    .image(texture_image)
    .subresource_range(ImageSubresourceRange {
        aspect_mask: ImageAspectFlags::COLOR,
        base_mip_level: 0,
        level_count: 1,
        base_array_layer: 0,
        layer_count: 1,
    })
    .src_access_mask(AccessFlags::TRANSFER_WRITE)
    .dst_access_mask(AccessFlags::SHADER_READ);

unsafe {
    device.cmd_pipeline_barrier(
        cmd,
        PipelineStageFlags::TRANSFER,
        PipelineStageFlags::FRAGMENT_SHADER,
        DependencyFlags::empty(),
        &[], &[],
        &[*barrier_to_shader],
    );
}

Step 7: Create image view and sampler

The shader does not access images directly. It reads through an image view (which selects format, mip levels, and array layers) and a sampler (which controls filtering and addressing).

use vulkan_rust::vk;
use vk::*;

let view_info = ImageViewCreateInfo::builder()
    .image(texture_image)
    .view_type(ImageViewType::_2D)
    .format(Format::R8G8B8A8_SRGB)
    .subresource_range(ImageSubresourceRange {
        aspect_mask: ImageAspectFlags::COLOR,
        base_mip_level: 0,
        level_count: 1,
        base_array_layer: 0,
        layer_count: 1,
    });

let texture_view = unsafe { device.create_image_view(&view_info, None) }
    .expect("Failed to create image view");

let sampler_info = SamplerCreateInfo::builder()
    .mag_filter(Filter::LINEAR)
    .min_filter(Filter::LINEAR)
    .address_mode_u(SamplerAddressMode::REPEAT)
    .address_mode_v(SamplerAddressMode::REPEAT)
    .address_mode_w(SamplerAddressMode::REPEAT)
    // Requires the samplerAnisotropy device feature to be enabled.
    // Set anisotropy_enable(false) if the feature is not available.
    .anisotropy_enable(true)
    .max_anisotropy(16.0)
    .border_color(BorderColor::INT_OPAQUE_BLACK)
    .mipmap_mode(SamplerMipmapMode::LINEAR)
    .min_lod(0.0)
    .max_lod(0.0);

let sampler = unsafe { device.create_sampler(&sampler_info, None) }
    .expect("Failed to create sampler");

Step 8: Bind via descriptor set

Update a descriptor set so the shader can access the combined image/sampler pair at a binding point.

use vulkan_rust::vk;
use vk::*;

let image_descriptor = DescriptorImageInfo {
    sampler,
    image_view: texture_view,
    image_layout: ImageLayout::SHADER_READ_ONLY_OPTIMAL,
};

// Bind the array to a local so it outlives the builder (see Lifetime safety
// in the builders chapter); a temporary `&[image_descriptor]` passed inline
// would not live long enough to use `write` afterward.
let image_infos = [image_descriptor];

let write = WriteDescriptorSet::builder()
    .dst_set(descriptor_set)
    .dst_binding(1) // must match the binding in the shader
    .dst_array_element(0)
    .descriptor_type(DescriptorType::COMBINED_IMAGE_SAMPLER)
    .image_info(&image_infos);

unsafe { device.update_descriptor_sets(&[*write], &[]) };

In the fragment shader (GLSL):

layout(set = 0, binding = 1) uniform sampler2D texSampler;

void main() {
    outColor = texture(texSampler, fragTexCoord);
}

See Descriptor Sets for descriptor pool creation and layout setup.

Cleanup

Because vulkan_rust handles do not implement Drop, you must destroy resources manually when they are no longer needed.

// Wait for the GPU to finish using these resources first.
unsafe {
    device.device_wait_idle()
        .expect("Failed to wait for device idle");
    device.destroy_sampler(sampler, None);
    device.destroy_image_view(texture_view, None);
    device.destroy_image(texture_image, None);
    device.free_memory(texture_memory, None);
    // Staging buffer should already be destroyed after the upload.
}

Notes

  • Format choice. R8G8B8A8_SRGB is correct for most color textures. Use R8G8B8A8_UNORM for data textures (normal maps, roughness) where sRGB gamma correction would be wrong.
  • Mipmaps. This recipe creates a single mip level. For proper texture filtering at a distance, generate a full mip chain using cmd_blit_image in a loop, with a barrier between each level.
  • One-shot command buffer. Steps 4 through 6 are typically recorded into a short-lived command buffer that is submitted and waited on immediately. Reuse command buffers from a transient pool for this.
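The mip-chain arithmetic behind the Mipmaps note is easy to get wrong, so here it is as a self-contained sketch (assumes a nonzero extent):

```rust
// Number of levels in a full mip chain: floor(log2(max(w, h))) + 1.
fn mip_level_count(width: u32, height: u32) -> u32 {
    32 - width.max(height).leading_zeros()
}

// Extent of a given mip level: halve per level, clamped to 1.
fn mip_extent(width: u32, height: u32, level: u32) -> (u32, u32) {
    ((width >> level).max(1), (height >> level).max(1))
}

fn main() {
    assert_eq!(mip_level_count(1024, 1024), 11); // 1024, 512, ..., 1
    assert_eq!(mip_level_count(800, 600), 10);   // floor(log2(800)) + 1
    assert_eq!(mip_extent(800, 600, 3), (100, 75));
    assert_eq!(mip_extent(800, 600, 12), (1, 1)); // clamped at 1x1
}
```

Each cmd_blit_image in the generation loop reads level N and writes level N+1 at these extents, with a barrier between levels.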

Implement Double Buffering

Task: Set up frames-in-flight so the CPU records frame N+1 while the GPU renders frame N.

Prerequisites

The problem

In a single-buffered render loop, the CPU submits a frame and then waits for the GPU to finish before it can start recording the next frame. This means the CPU sits idle during GPU rendering, and the GPU sits idle during CPU recording. You get roughly half the throughput the hardware could deliver.

Single buffered:
CPU: [record 0]...........[record 1]...........[record 2]...
GPU: ...........[render 0]...........[render 1]...........[render 2]
     └── idle ──┘         └── idle ──┘

With double buffering (two frames in flight), the CPU records the next frame while the GPU is still rendering the current one:

Double buffered:
CPU: [record 0][record 1][record 2][record 3]...
GPU: .......[render 0][render 1][render 2][render 3]...

The overlap keeps both processors busy.
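The diagrams reduce to simple arithmetic. Under a naive pipeline model with hypothetical timings, single buffering costs record + render per frame, while frames in flight approach max(record, render):

```rust
// Steady-state frame time under a simple pipeline model (hypothetical
// timings; ignores presentation and driver overhead).
fn frame_time_ms(record_ms: f64, render_ms: f64, frames_in_flight: usize) -> f64 {
    if frames_in_flight <= 1 {
        record_ms + render_ms // CPU and GPU take turns
    } else {
        record_ms.max(render_ms) // the slower side sets the pace
    }
}

fn main() {
    // 5 ms of CPU recording, 10 ms of GPU rendering:
    assert_eq!(frame_time_ms(5.0, 10.0, 1), 15.0); // ~66 fps
    assert_eq!(frame_time_ms(5.0, 10.0, 2), 10.0); // 100 fps
}
```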

Step 1: Define the frame count

Two frames in flight is the standard choice. Three is occasionally used, but adds latency without much throughput gain on most hardware.

const MAX_FRAMES_IN_FLIGHT: usize = 2;

Step 2: Create per-frame synchronization objects

Each frame in flight needs its own set of sync primitives:

  • Fence: the CPU waits on this before reusing the frame’s resources.
  • Image-available semaphore: signals when the swapchain image is ready to be rendered into.
  • Render-finished semaphore: signals when rendering is done and the image can be presented.

Before reading on: why does each frame need its own fence? What would go wrong if all frames shared a single fence?

Answer: with one shared fence the CPU could not tell which frame’s submission had finished. It would either have to wait for all outstanding work every frame (destroying the CPU/GPU overlap) or risk resetting and reusing a command buffer the GPU is still executing.

use vulkan_rust::vk;
use vk::*;

struct FrameSync {
    in_flight_fence: Fence,
    image_available: Semaphore,
    render_finished: Semaphore,
}

let fence_info = FenceCreateInfo::builder()
    .flags(FenceCreateFlags::SIGNALED); // start signaled so frame 0 doesn't deadlock

let semaphore_info = SemaphoreCreateInfo::builder();

let mut frame_sync = Vec::with_capacity(MAX_FRAMES_IN_FLIGHT);
for _ in 0..MAX_FRAMES_IN_FLIGHT {
    let sync = FrameSync {
        in_flight_fence: unsafe { device.create_fence(&fence_info, None) }
            .expect("Failed to create fence"),
        image_available: unsafe { device.create_semaphore(&semaphore_info, None) }
            .expect("Failed to create semaphore"),
        render_finished: unsafe { device.create_semaphore(&semaphore_info, None) }
            .expect("Failed to create semaphore"),
    };
    frame_sync.push(sync);
}

Note the SIGNALED flag on the fences. The render loop starts by waiting on the fence, so frame 0 needs the fence to be signaled already or the first wait_for_fences call will block forever.

Step 3: Create per-frame command buffers

Each frame in flight needs its own command buffer so the CPU can record into one while the GPU executes the other.

use vulkan_rust::vk;
use vk::*;

let alloc_info = CommandBufferAllocateInfo::builder()
    .command_pool(command_pool)
    .level(CommandBufferLevel::PRIMARY)
    .command_buffer_count(MAX_FRAMES_IN_FLIGHT as u32);

let command_buffers = unsafe {
    device.allocate_command_buffers(&alloc_info)
}
.expect("Failed to allocate command buffers");

Step 4: The render loop

The frame index cycles through 0..MAX_FRAMES_IN_FLIGHT. Each iteration uses only the resources belonging to that frame index.

use vulkan_rust::vk;
use vk::*;

let mut current_frame: usize = 0;

loop {
    // Handle window events (poll_events, etc.)
    // ...

    let sync = &frame_sync[current_frame];
    let cmd = command_buffers[current_frame];

    unsafe {
    // --- 1. Wait for this frame's previous submission to finish ---
    device.wait_for_fences(&[sync.in_flight_fence], true, u64::MAX)
        .expect("Failed to wait for fence");

    // --- 2. Acquire the next swapchain image ---
    let image_index = device.acquire_next_image_khr(
        swapchain,
        u64::MAX,
        sync.image_available,  // signaled when image is ready
        Fence::null(),
    )
    .expect("Failed to acquire swapchain image");

    // --- 3. Reset the fence only after we know we will submit work ---
    // Resetting before acquire_next_image could deadlock if acquire fails.
    device.reset_fences(&[sync.in_flight_fence])
        .expect("Failed to reset fence");

    // --- 4. Record commands ---
    device.reset_command_buffer(cmd, CommandBufferResetFlags::empty())
        .expect("Failed to reset command buffer");

    let begin_info = CommandBufferBeginInfo::builder();
    device.begin_command_buffer(cmd, &begin_info)
        .expect("Failed to begin command buffer");

    // ... record render pass, draw calls, etc. ...

    device.end_command_buffer(cmd)
        .expect("Failed to end command buffer");

    // --- 5. Submit ---
    let wait_semaphores = [sync.image_available];
    let wait_stages = [PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];
    let signal_semaphores = [sync.render_finished];
    let command_buffers_to_submit = [cmd];

    let submit_info = SubmitInfo::builder()
        .wait_semaphores(&wait_semaphores)
        .wait_dst_stage_mask(&wait_stages)
        .command_buffers(&command_buffers_to_submit)
        .signal_semaphores(&signal_semaphores);

    device.queue_submit(
        graphics_queue,
        &[*submit_info],
        sync.in_flight_fence,  // signal this fence when done
    )
    .expect("Failed to submit");

    // --- 6. Present ---
    let swapchains = [swapchain];
    let image_indices = [image_index];
    let present_info = PresentInfoKHR::builder()
        .wait_semaphores(&signal_semaphores)
        .swapchains(&swapchains)
        .image_indices(&image_indices);

    device.queue_present_khr(graphics_queue, &present_info)
        .expect("Failed to present");
    }

    // --- 7. Advance frame index ---
    current_frame = (current_frame + 1) % MAX_FRAMES_IN_FLIGHT;
}

Step 5: Clean shutdown

Before destroying anything, wait for all frames to finish.

unsafe {
    device.device_wait_idle()
        .expect("Failed to wait for device idle");

    for sync in &frame_sync {
        device.destroy_fence(sync.in_flight_fence, None);
        device.destroy_semaphore(sync.image_available, None);
        device.destroy_semaphore(sync.render_finished, None);
    }
}

The synchronization flow

Each frame follows this dependency chain:

wait_for_fences(in_flight_fence)     CPU blocks until frame N-2 is done
        │
acquire_next_image(image_available)  GPU signals when image is ready
        │
reset_fences(in_flight_fence)        Safe to reset now
        │
record commands                      CPU work, no GPU dependency
        │
queue_submit(                        GPU work begins
    wait: image_available,           Wait for image before color output
    signal: render_finished,         Signal when rendering is done
    fence: in_flight_fence           Signal fence when fully complete
)
        │
queue_present(                       Present to screen
    wait: render_finished            Wait for rendering before presenting
)

Common mistakes

Fence reset before acquire. If you reset the fence before acquire_next_image, and the acquire call returns an error (e.g. ERROR_OUT_OF_DATE), the fence stays unsignaled. The next iteration will wait on it forever. Always reset the fence after a successful acquire.

Sharing command buffers. If two frames in flight use the same command buffer, the CPU might overwrite it while the GPU is still reading it. Always use one command buffer per frame in flight.

Forgetting SIGNALED on initial fences. The loop starts with wait_for_fences. If the fence starts unsignaled, the first frame deadlocks.

Notes

  • Triple buffering. Setting MAX_FRAMES_IN_FLIGHT = 3 adds one more frame of latency but can help if the CPU or GPU has variable frame times. Measure before committing to it.
  • Swapchain images vs frames in flight. The number of swapchain images (typically 2 or 3) is independent of MAX_FRAMES_IN_FLIGHT. Frames in flight control CPU/GPU overlap; swapchain image count controls how many images the presentation engine juggles.
  • Resize handling. When the swapchain is recreated after a window resize, you need to wait for all in-flight frames to finish first. See Handle Window Resize.

Handle Window Resize

Task: Detect window resize events and recreate the swapchain without crashing or leaking resources.

Prerequisites

The problem

When the window is resized, the swapchain images no longer match the window dimensions. Vulkan tells you this has happened through two mechanisms:

  1. acquire_next_image_khr or queue_present_khr returns ERROR_OUT_OF_DATE, meaning the swapchain is no longer compatible with the surface.
  2. queue_present_khr returns SUBOPTIMAL, meaning the swapchain still works but no longer matches the surface properties perfectly.

In either case, you must recreate the swapchain (and everything that depends on its images) before rendering can continue.

Step 1: Detect the resize

Track resize events from your windowing library and from Vulkan return codes.

let mut framebuffer_resized = false;

// In your window event handler (winit example):
match event {
    WindowEvent::Resized(_) => {
        framebuffer_resized = true;
    }
    _ => {}
}

In the render loop, check both the flag and the Vulkan result codes:

use vulkan_rust::vk;
use vk::*;
use vk::Result as VkError;

let acquire_result = unsafe {
    device.acquire_next_image_khr(
        swapchain, u64::MAX, image_available_semaphore, Fence::null(),
    )
};

let image_index = match acquire_result {
    Ok(index) => index,
    Err(VkError::ERROR_OUT_OF_DATE) => {
        recreate_swapchain(/* ... */);
        continue; // restart this loop iteration
    }
    Err(e) => panic!("Failed to acquire swapchain image: {e:?}"),
};

// ... record and submit ...

let present_result = unsafe {
    device.queue_present_khr(graphics_queue, &present_info)
};

match present_result {
    Ok(_) => {}
    Err(VkError::ERROR_OUT_OF_DATE | VkError::SUBOPTIMAL) => {
        framebuffer_resized = false;
        recreate_swapchain(/* ... */);
    }
    Err(e) => panic!("Failed to present: {e:?}"),
}

// Also check the manual flag (some platforms don't always return OUT_OF_DATE).
if framebuffer_resized {
    framebuffer_resized = false;
    recreate_swapchain(/* ... */);
}

Before reading on: why do we check framebuffer_resized separately from the Vulkan error codes? Why not rely on ERROR_OUT_OF_DATE alone?

Some window systems (notably X11) do not always report out-of-date when the window is resized. The manual flag from the window event handler catches those cases.

Step 2: Wait for the GPU

Before destroying any swapchain-related resources, all in-flight work must finish.

unsafe { device.device_wait_idle() }
    .expect("Failed to wait for device idle");

This is simple and correct. For higher performance you could track individual fences per swapchain image, but device_wait_idle is the right choice for a resize path that runs infrequently.

Step 3: Destroy old resources

Destroy everything that depends on the swapchain images, in reverse creation order.

// Destroy framebuffers (one per swapchain image).
for &fb in &swapchain_framebuffers {
    unsafe { device.destroy_framebuffer(fb, None); }
}

// Destroy image views (one per swapchain image).
for &view in &swapchain_image_views {
    unsafe { device.destroy_image_view(view, None); }
}

// Do NOT destroy the old swapchain yet, we pass it to the new one.

You do not need to destroy the swapchain images themselves. They are owned by the swapchain and will be cleaned up when the old swapchain is destroyed.

Step 4: Query new surface capabilities

The surface extent may have changed, so re-query it.

use vulkan_rust::vk;
use vk::*;

let surface_caps = unsafe {
    instance.get_physical_device_surface_capabilities_khr(physical_device, surface)
}
.expect("Failed to query surface capabilities");

let new_extent = if surface_caps.current_extent.width != u32::MAX {
    // The surface has a defined size, use it.
    surface_caps.current_extent
} else {
    // The surface size is undefined (e.g. Wayland), clamp to limits.
    let window_size = window.inner_size();
    Extent2D {
        width: window_size.width.clamp(
            surface_caps.min_image_extent.width,
            surface_caps.max_image_extent.width,
        ),
        height: window_size.height.clamp(
            surface_caps.min_image_extent.height,
            surface_caps.max_image_extent.height,
        ),
    }
};

Step 5: Handle minimized windows

When a window is minimized, the surface extent can be (0, 0). You cannot create a swapchain with zero dimensions. Pause the render loop until the window is restored.

if new_extent.width == 0 || new_extent.height == 0 {
    // Window is minimized. Skip rendering until the window is restored.
    // With winit, set ControlFlow::Wait so the event loop sleeps until
    // the next event instead of busy-waiting.
    return Ok(());
}

Step 6: Create the new swapchain

Pass the old swapchain handle to old_swapchain. This lets the driver reuse internal resources and can make the transition smoother.

use vulkan_rust::vk;
use vk::*;

let old_swapchain = swapchain; // save the handle

let swapchain_info = SwapchainCreateInfoKHR::builder()
    .surface(surface)
    .min_image_count(desired_image_count)
    .image_format(surface_format.format)
    .image_color_space(surface_format.color_space)
    .image_extent(new_extent)
    .image_array_layers(1)
    .image_usage(ImageUsageFlags::COLOR_ATTACHMENT)
    .image_sharing_mode(SharingMode::EXCLUSIVE)
    .pre_transform(surface_caps.current_transform)
    .composite_alpha(CompositeAlphaFlagBitsKHR::OPAQUE)
    .present_mode(present_mode)
    .clipped(true)
    .old_swapchain(old_swapchain); // <-- reuse hint

swapchain = unsafe { device.create_swapchain_khr(&swapchain_info, None) }
    .expect("Failed to create swapchain");

// Now destroy the old swapchain.
unsafe { device.destroy_swapchain_khr(old_swapchain, None); }

Step 7: Recreate image views and framebuffers

The new swapchain has new images, so create fresh image views and framebuffers.

use vulkan_rust::vk;
use vk::*;

let swapchain_images = unsafe { device.get_swapchain_images_khr(swapchain) }
    .expect("Failed to get swapchain images");

swapchain_image_views = swapchain_images
    .iter()
    .map(|&image| {
        let view_info = ImageViewCreateInfo::builder()
            .image(image)
            .view_type(ImageViewType::_2D)
            .format(surface_format.format)
            .subresource_range(ImageSubresourceRange {
                aspect_mask: ImageAspectFlags::COLOR,
                base_mip_level: 0,
                level_count: 1,
                base_array_layer: 0,
                layer_count: 1,
            });
        unsafe { device.create_image_view(&view_info, None) }
            .expect("Failed to create image view")
    })
    .collect();

swapchain_framebuffers = swapchain_image_views
    .iter()
    .map(|&view| {
        let attachments = [view];
        let fb_info = FramebufferCreateInfo::builder()
            .render_pass(render_pass)
            .attachments(&attachments)
            .width(new_extent.width)
            .height(new_extent.height)
            .layers(1);
        unsafe { device.create_framebuffer(&fb_info, None) }
            .expect("Failed to create framebuffer")
    })
    .collect();

Putting it all together

A helper function that bundles the recreation logic:

use vulkan_rust::vk;
use vk::*;

fn recreate_swapchain(
    instance: &vulkan_rust::Instance,
    device: &vulkan_rust::Device,
    physical_device: PhysicalDevice,
    surface: SurfaceKHR,
    window: &winit::window::Window,
    render_pass: RenderPass,
    swapchain: &mut SwapchainKHR,
    swapchain_image_views: &mut Vec<ImageView>,
    swapchain_framebuffers: &mut Vec<Framebuffer>,
    surface_format: SurfaceFormatKHR,
    present_mode: PresentModeKHR,
) -> Extent2D {
    unsafe {
        device.device_wait_idle()
            .expect("Failed to wait for device idle");

        // Destroy old framebuffers and image views.
        for &fb in swapchain_framebuffers.iter() {
            device.destroy_framebuffer(fb, None);
        }
        for &view in swapchain_image_views.iter() {
            device.destroy_image_view(view, None);
        }
    }

    // Query the new extent, then create the new swapchain, views, and
    // framebuffers (Steps 4 through 7 above; Step 4 defines `new_extent`).

    new_extent
}

Common mistakes

Forgetting to update the viewport and scissor. If you use dynamic viewport/scissor state (which you should), update them to the new extent each frame. If you baked them into the pipeline, you need to recreate the pipeline too.

Leaking old image views. Every create_image_view must have a matching destroy_image_view. If you overwrite the Vec without destroying the old views first, those handles leak.

Not handling SUBOPTIMAL_KHR. SUBOPTIMAL_KHR from queue_present_khr is not a fatal error, but ignoring it means you keep rendering at the wrong resolution until something else triggers an ERROR_OUT_OF_DATE_KHR.

Notes

  • Depth buffers. If your render pass uses a depth attachment, you must also recreate the depth image, its memory, and its image view when the swapchain extent changes.
  • Render pass compatibility. The render pass itself does not depend on the swapchain extent, only on the image format. You do not need to recreate it unless the surface format changes (which is extremely rare).
  • Dynamic state. Using DynamicState::VIEWPORT and DynamicState::SCISSOR avoids having to recreate the pipeline on resize. This is the recommended approach.

Use Push Constants

Task: Pass small, frequently-changing data (like a model matrix) to shaders without descriptor sets or buffer allocations.

Prerequisites

What push constants are

Push constants are a small block of data written directly into the command buffer. Unlike uniform buffers, they require no buffer allocation, no memory binding, and no descriptor set update. You declare a range in the pipeline layout, record the data inline during command recording, and the shader reads it.

The tradeoff is size: the Vulkan spec guarantees at least 128 bytes of push constant storage. Most desktop GPUs offer 256 bytes. This is enough for a 4x4 matrix (64 bytes) plus a handful of scalar parameters, but not enough for large data sets.

When to use push constants vs uniform buffers

| Criterion | Push constants | Uniform buffers |
| --- | --- | --- |
| Size | Up to 128-256 bytes | Limited only by buffer size |
| Setup cost | None (inline in command buffer) | Allocate buffer, bind memory, write descriptor |
| Per-draw update | Free (just cmd_push_constants) | Requires dynamic offsets or multiple descriptors |
| Best for | Model matrix, time, material index | Large arrays, shared view/projection data |

Rule of thumb: if the data changes per draw call and fits in 128 bytes, use push constants. For anything larger or shared across many draws, use a uniform buffer.

Step 1: Define the push constant data

Create a #[repr(C)] struct that matches the layout the shader expects.

#[repr(C)]
#[derive(Clone, Copy)]
struct PushConstants {
    model: [f32; 16],  // 4x4 matrix, 64 bytes
    time: f32,          // 4 bytes
    _padding: [f32; 3], // align to 16 bytes if needed
}

Before reading on: why does the struct need #[repr(C)]? What would happen if Rust reordered the fields?

#[repr(C)] guarantees that the fields are laid out in declaration order with C-compatible alignment. Without it, the Rust compiler may reorder fields, and the shader would read garbage.

Step 2: Declare push constant range in the pipeline layout

The push constant range tells Vulkan how many bytes of push constant data your shaders use and which stages access them.

use vulkan_rust::vk;
use vk::*;

let push_constant_range = PushConstantRange {
    stage_flags: ShaderStageFlags::VERTEX,
    offset: 0,
    size: std::mem::size_of::<PushConstants>() as u32,
};

let push_ranges = [push_constant_range];
let layout_info = PipelineLayoutCreateInfo::builder()
    .set_layouts(&descriptor_set_layouts) // can be empty if you have no descriptors
    .push_constant_ranges(&push_ranges);

let pipeline_layout = unsafe {
    device.create_pipeline_layout(&layout_info, None)
}
.expect("Failed to create pipeline layout");

If both vertex and fragment shaders read push constants, you have two options:

  • One range with stage_flags: VERTEX | FRAGMENT if both stages read the same bytes.
  • Two ranges at different offsets if each stage reads different data.

use vulkan_rust::vk;
use vk::*;

// Example: vertex reads bytes 0..64, fragment reads bytes 64..80.
let ranges = [
    PushConstantRange {
        stage_flags: ShaderStageFlags::VERTEX,
        offset: 0,
        size: 64,
    },
    PushConstantRange {
        stage_flags: ShaderStageFlags::FRAGMENT,
        offset: 64,
        size: 16,
    },
];

Step 3: Declare push constants in the shader

In GLSL, push constants appear as a uniform block with the push_constant layout qualifier.

Vertex shader:

#version 450

layout(push_constant) uniform PushConstants {
    mat4 model;
    float time;
} pc;

layout(location = 0) in vec3 inPosition;

void main() {
    gl_Position = pc.model * vec4(inPosition, 1.0);
}

There can be only one push_constant block per shader stage. The block members must match the byte layout of your Rust struct.
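If you use the two-range layout from Step 2 (vertex reads bytes 0..64, fragment reads bytes 64..80), the fragment-stage block declares its members at an explicit byte offset. A sketch, assuming the 16 fragment bytes hold a color (the name tint is illustrative):

```glsl
#version 450

// Fragment-stage block for the second range (bytes 64..80).
// layout(offset = 64) must match the range's `offset` field.
layout(push_constant) uniform FragPush {
    layout(offset = 64) vec4 tint;
} pc;

layout(location = 0) out vec4 outColor;

void main() {
    outColor = pc.tint;
}
```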

Step 4: Record push constants during command recording

Use cmd_push_constants to write the data into the command buffer. This is typically called once per draw, right before the draw command.

use vulkan_rust::vk;
use vk::*;

let push_data = PushConstants {
    model: compute_model_matrix(entity),
    time: elapsed_seconds,
    _padding: [0.0; 3],
};

unsafe {
    device.cmd_push_constants(
        cmd,
        pipeline_layout,
        ShaderStageFlags::VERTEX,
        0, // offset in bytes
        std::slice::from_raw_parts(
            &push_data as *const PushConstants as *const u8,
            std::mem::size_of::<PushConstants>(),
        ),
    );

    device.cmd_draw(cmd, vertex_count, 1, 0, 0);
}

For a scene with many objects, you push new constants before each draw:

use vulkan_rust::vk;
use vk::*;

for entity in &scene.entities {
    let push_data = PushConstants {
        model: entity.transform,
        time: elapsed_seconds,
        _padding: [0.0; 3],
    };

    unsafe {
        device.cmd_push_constants(
            cmd, pipeline_layout,
            ShaderStageFlags::VERTEX,
            0,
            std::slice::from_raw_parts(
                &push_data as *const PushConstants as *const u8,
                std::mem::size_of::<PushConstants>(),
            ),
        );

        device.cmd_draw_indexed(
            cmd, entity.index_count, 1, entity.first_index, 0, 0,
        );
    }
}

A helper for safe byte casting

The std::slice::from_raw_parts pattern is error-prone. A small helper makes it clearer:

use vulkan_rust::vk;
use vk::*;

/// Reinterpret a reference to a `Copy` type as a `&[u8]` slice
/// suitable for `cmd_push_constants`.
///
/// # Safety
/// The type must be `#[repr(C)]` with no padding that contains
/// uninitialized bytes.
unsafe fn as_push_bytes<T: Copy>(data: &T) -> &[u8] {
    std::slice::from_raw_parts(
        data as *const T as *const u8,
        std::mem::size_of::<T>(),
    )
}

// Usage:
unsafe {
    device.cmd_push_constants(
        cmd, pipeline_layout,
        ShaderStageFlags::VERTEX,
        0,
        as_push_bytes(&push_data),
    );
}

Common mistakes

Exceeding the size limit. If your push constant struct is larger than the device’s max_push_constants_size (query from PhysicalDeviceLimits), pipeline layout creation will fail. Check the limit at startup.

Mismatched stage flags. The stage_flags in cmd_push_constants must match the flags declared in the push constant range. If your range says VERTEX | FRAGMENT but you push with VERTEX only, the validation layer will warn.

Incorrect offset. The offset parameter in cmd_push_constants is a byte offset into the push constant block. If you update only part of the block (e.g. fragment-only data at offset 64), the vertex portion retains its previously pushed values.

Forgetting #[repr(C)]. Without it, Rust may reorder struct fields. The GPU will read bytes at fixed offsets, so reordered fields mean corrupted data with no obvious error.

Notes

  • Alignment. GLSL push_constant blocks follow std430 layout rules. A vec3 takes 12 bytes (not 16) but the next member aligns to its own size. Prefer vec4/mat4 to avoid alignment surprises, or add explicit padding in your Rust struct.
  • Performance. Push constants are the fastest way to pass small per-draw data. On most architectures they live in GPU registers or a small on-chip cache, not in memory.
  • Compatibility. 128 bytes is the guaranteed minimum. If you need more, check max_push_constants_size in PhysicalDeviceLimits. Most desktop drivers report 256 bytes.
  • Combining with descriptors. Push constants and descriptor sets are complementary. A typical setup uses push constants for per-draw data (model matrix) and uniform buffers via descriptors for per-frame data (view/projection matrices, lighting).

Port from ash to vulkan_rust

Task: Migrate an existing ash-based project to vulkan_rust (published as vulkan-rust on crates.io).

If you already have a working ash project, switching to vulkan_rust is mostly mechanical. The Vulkan concepts are identical, and the API surface maps one-to-one. This guide covers every difference you will encounter.

What stays the same

Before diving into differences, note what does not change:

  • All Vulkan functions are unsafe.
  • You must explicitly destroy every object you create (no RAII/Drop on handles).
  • Handles are lightweight Copy types.
  • The same Vulkan mental model applies: instances, devices, queues, command buffers, pipelines, descriptor sets, synchronization primitives.

Key differences at a glance

| Aspect | ash | vulkan_rust |
| --- | --- | --- |
| Crate name | ash | vulkan-rust |
| Command style | Inherent core methods; extension commands on loader structs | Inherent methods on Device / Instance |
| Extra imports | One loader struct per extension (older ash: version traits like DeviceV1_0) | None needed |
| Raw types | ash::vk::* | vulkan_rust::vk::* |
| Builders | ::builder() returns Builder, call .build() | ::builder() returns Builder that derefs to inner struct |
| Extensions | Manual loader structs (ash::khr::swapchain::Device) | All loaded automatically, call methods on Device directly |
| Interop | Limited from_raw on some types | Instance::from_raw_parts / Device::from_raw_parts |
| Error type | ash::vk::Result with separate success/error enums | VkResult<T> wrapping vk::Result |

Step 1: Replace the Cargo dependency

# Before (ash)
[dependencies]
ash = "0.38"

# After (vulkan_rust)
[dependencies]
vulkan-rust = "0.10"

Step 2: Remove trait imports

This is the single biggest ergonomic difference. In ash, extension commands live on per-extension loader structs that you construct and keep alongside the Device (and ash versions before 0.33 also required one trait import per core API version):

// ash: core commands are inherent methods on Device, but every
// extension command needs its own loader struct (see Step 8):
use ash::vk;
use ash::Device;
// Swapchain commands live on ash::khr::swapchain::Device, which must
// be constructed separately. Older ash versions additionally required
// `use ash::version::DeviceV1_0;` before core methods were visible.

In vulkan_rust, every command is an inherent method on Device or Instance. No trait imports, no extension loader structs:

// vulkan_rust: this is all you need
use vulkan_rust::vk;
use vulkan_rust::Device;
// device.create_buffer() and device.create_swapchain_khr()
// are both available immediately.

Migration action: Delete all extension loader struct construction, along with any legacy use ash::version::* or use ash::extensions::* imports. Replace use ash::vk with use vulkan_rust::vk.

Step 3: Replace Entry, Instance, and Device creation

Entry and Instance

// ── ash ─────────────────────────────────────────────────
let entry = ash::Entry::linked();
let app_info = vk::ApplicationInfo::builder()
    .api_version(vk::make_api_version(0, 1, 3, 0))
    .build();
let create_info = vk::InstanceCreateInfo::builder()
    .application_info(&app_info)
    .build();
let instance = unsafe { entry.create_instance(&create_info, None)? };

// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;

let loader = vulkan_rust::LibloadingLoader::new()
    .expect("Failed to load Vulkan");
let entry = unsafe { vulkan_rust::Entry::new(loader) }
    .expect("Failed to create entry");

let app_info = ApplicationInfo::builder()
    .api_version((1 << 22) | (3 << 12));  // Vulkan 1.3
let create_info = InstanceCreateInfo::builder()
    .application_info(&app_info);
let instance = unsafe { entry.create_instance(&create_info, None) }
    .expect("Failed to create instance");

The main changes: Entry is loaded through LibloadingLoader instead of linked(), make_api_version becomes a raw u32 expression, and the .build() calls are removed. The builder derefs to the inner struct, so you can pass &create_info directly where a &InstanceCreateInfo is expected.

Device

// ── ash ─────────────────────────────────────────────────
let queue_info = vk::DeviceQueueCreateInfo::builder()
    .queue_family_index(0)
    .queue_priorities(&[1.0])
    .build();
let device_info = vk::DeviceCreateInfo::builder()
    .queue_create_infos(std::slice::from_ref(&queue_info))
    .build();
let device = unsafe {
    instance.create_device(physical_device, &device_info, None)?
};

// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;

let queue_info = DeviceQueueCreateInfo::builder()
    .queue_family_index(0)
    .queue_priorities(&[1.0]);
let device_info = DeviceCreateInfo::builder()
    .queue_create_infos(std::slice::from_ref(&queue_info));
let device = unsafe {
    instance.create_device(physical_device, &device_info, None)
}
.expect("Failed to create device");

Step 4: Update builders (drop .build())

In ash, builders require .build() to produce the final struct. In vulkan_rust, builders implement Deref<Target = T>, so the conversion is implicit:

// ── ash ─────────────────────────────────────────────────
let info = vk::BufferCreateInfo::builder()
    .size(1024)
    .usage(vk::BufferUsageFlags::VERTEX_BUFFER)
    .sharing_mode(vk::SharingMode::EXCLUSIVE)
    .build();  // <-- required in ash

// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;

let info = BufferCreateInfo::builder()
    .size(1024)
    .usage(BufferUsageFlags::VERTEX_BUFFER)
    .sharing_mode(SharingMode::EXCLUSIVE);
    // No .build(), pass &info directly to create_buffer()

Migration action: Search your codebase for .build() and remove every occurrence on Vulkan builder types.

Step 5: Command buffer recording

The pattern is identical, just without trait imports:

// ── ash ─────────────────────────────────────────────────
// (ash versions before 0.33 also needed `use ash::version::DeviceV1_0;`)

let begin_info = vk::CommandBufferBeginInfo::builder()
    .flags(vk::CommandBufferUsageFlags::ONE_TIME_SUBMIT)
    .build();
unsafe {
    device.begin_command_buffer(cmd, &begin_info)?;
    device.cmd_bind_pipeline(cmd, vk::PipelineBindPoint::GRAPHICS, pipeline);
    device.cmd_draw(cmd, 3, 1, 0, 0);
    device.end_command_buffer(cmd)?;
}

// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;

let begin_info = CommandBufferBeginInfo::builder()
    .flags(CommandBufferUsageFlags::ONE_TIME_SUBMIT);
unsafe {
    device.begin_command_buffer(cmd, &begin_info)
        .expect("Failed to begin command buffer");
    device.cmd_bind_pipeline(cmd, PipelineBindPoint::GRAPHICS, pipeline);
    device.cmd_draw(cmd, 3, 1, 0, 0);
    device.end_command_buffer(cmd)
        .expect("Failed to end command buffer");
}

Step 6: Queue submission

// ── ash ─────────────────────────────────────────────────
let cmd_bufs = [cmd];
let wait_sems = [image_available];
let wait_stages = [vk::PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];
let signal_sems = [render_finished];
let submit_info = vk::SubmitInfo::builder()
    .command_buffers(&cmd_bufs)
    .wait_semaphores(&wait_sems)
    .wait_dst_stage_mask(&wait_stages)
    .signal_semaphores(&signal_sems)
    .build();
unsafe { device.queue_submit(queue, &[submit_info], fence)? };

// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::*;

let wait_stages = [PipelineStageFlags::COLOR_ATTACHMENT_OUTPUT];
let cmd_bufs = [cmd];
let wait_sems = [image_available];
let signal_sems = [render_finished];
let submit_info = SubmitInfo::builder()
    .command_buffers(&cmd_bufs)
    .wait_semaphores(&wait_sems)
    .wait_dst_stage_mask(&wait_stages)
    .signal_semaphores(&signal_sems);
unsafe {
    device.queue_submit(queue, &[*submit_info], fence)
        .expect("Failed to submit");
};

Step 7: Error handling

ash splits Vulkan results into success codes and error codes. vulkan_rust uses a single VkResult<T> type:

// ── ash ─────────────────────────────────────────────────
match unsafe { device.create_buffer(&info, None) } {
    Ok(buffer) => { /* ... */ }
    Err(vk::Result::ERROR_OUT_OF_DEVICE_MEMORY) => { /* ... */ }
    Err(e) => panic!("Unexpected: {:?}", e),
}

// ── vulkan_rust ───────────────────────────────────────────
use vulkan_rust::vk;
use vk::Result as VkError;

match unsafe { device.create_buffer(&info, None) } {
    Ok(buffer) => { /* ... */ }
    Err(VkError::ERROR_OUT_OF_DEVICE_MEMORY) => { /* ... */ }
    Err(e) => panic!("Unexpected: {e:?}"),
}

The match arms look the same. The difference is that vulkan_rust’s error type implements std::error::Error, so VkResult<T> works with anyhow, eyre, and the ? operator out of the box.

Step 8: Extensions

In ash, extensions require separate loader structs:

// ash: manual extension loading
let swapchain_loader = ash::khr::swapchain::Device::new(&instance, &device);
let swapchain = unsafe {
    swapchain_loader.create_swapchain(&create_info, None)?
};

In vulkan_rust, all extension functions are loaded automatically when the Device or Instance is created. You call them as regular methods:

// vulkan_rust: no loader, just call the method
let swapchain = unsafe {
    device.create_swapchain_khr(&create_info, None)
}
.expect("Failed to create swapchain");

Migration action: Delete all extension loader struct construction. Replace loader.method() with device.method() or instance.method().

Step 9: Interop with from_raw_parts

If another library (OpenXR, a C plugin, a test harness) gives you raw Vulkan handles, vulkan_rust provides from_raw_parts to wrap them:

// Wrap an externally-created VkInstance
let instance = unsafe {
    vulkan_rust::Instance::from_raw_parts(raw_instance, get_instance_proc_addr)
};

// Wrap an externally-created VkDevice
let device = unsafe {
    vulkan_rust::Device::from_raw_parts(raw_device, get_device_proc_addr)
};

This loads all function pointers from the provided get_*_proc_addr, so the wrapped object works identically to one created through Entry.

Quick-reference migration checklist

  • Replace ash with vulkan-rust in Cargo.toml
  • Replace use ash::vk with use vulkan_rust::vk
  • Delete all use ash::version::* trait imports
  • Delete all extension loader struct construction
  • Remove every .build() on Vulkan builder types
  • Replace ash::Entry / ash::Instance / ash::Device with vulkan_rust::*
  • Replace extension loader method calls with direct device.method() calls
  • Update error handling if you matched on ash-specific error types
  • Compile and fix any remaining type mismatches

Map C Vulkan Calls to vulkan_rust

Task: You have C Vulkan code (or you are reading the Vulkan spec) and want to find the equivalent vulkan_rust API.

This page is a translation reference. It covers the naming rules, the structural patterns that differ between C and Rust, and a lookup table for the most common API calls.

Naming conventions

Functions

Strip the vk prefix, convert to snake_case, and call as a method on the parent object (Device or Instance):

| C | vulkan_rust |
| --- | --- |
| vkCreateBuffer(device, ...) | device.create_buffer(...) |
| vkCmdDraw(commandBuffer, ...) | device.cmd_draw(command_buffer, ...) |
| vkEnumeratePhysicalDevices(instance, ...) | instance.enumerate_physical_devices() |
| vkDestroyPipeline(device, ...) | device.destroy_pipeline(...) |

Note that vkCmd* functions take the CommandBuffer as a parameter but are still called on Device, not on the command buffer handle.

Types

Strip the Vk prefix. All types are re-exported at the vk root:

| C | vulkan_rust |
| --- | --- |
| VkBuffer | vk::Buffer |
| VkBufferCreateInfo | vk::BufferCreateInfo |
| VkPhysicalDeviceProperties | vk::PhysicalDeviceProperties |
| VkInstance | vk::Instance (the raw handle) |

Use use vk::* to bring them into scope without the module prefix.

Enum variants

Strip the type prefix and keep SCREAMING_CASE:

| C | vulkan_rust |
| --- | --- |
| VK_FORMAT_R8G8B8A8_SRGB | vk::Format::R8G8B8A8_SRGB |
| VK_IMAGE_LAYOUT_UNDEFINED | vk::ImageLayout::UNDEFINED |
| VK_PRESENT_MODE_FIFO_KHR | vk::PresentModeKHR::FIFO |
| VK_SUCCESS | vk::Result::SUCCESS |

Bitmask flags

Strip the type prefix and the _BIT suffix:

| C | vulkan_rust |
| --- | --- |
| VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | vk::BufferUsageFlags::VERTEX_BUFFER |
| VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT | vk::ImageUsageFlags::COLOR_ATTACHMENT |
| VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT | vk::PipelineStageFlags::FRAGMENT_SHADER |

Combine flags with the | operator, just like in C:

use vulkan_rust::vk;
use vk::*;

let usage = BufferUsageFlags::VERTEX_BUFFER
    | BufferUsageFlags::TRANSFER_DST;

Extension names

// C:    VK_KHR_SWAPCHAIN_EXTENSION_NAME
// Rust: generated constants in vk::extension_names
use vulkan_rust::vk::extension_names::KHR_SWAPCHAIN_EXTENSION_NAME;
let device_extensions = [KHR_SWAPCHAIN_EXTENSION_NAME.as_ptr()];

Structural patterns

Struct initialization

C uses designated initializers. vulkan_rust uses the builder pattern, which auto-fills sType and zeroes all other fields:

// C
VkBufferCreateInfo info = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .pNext = NULL,
    .size = 1024,
    .usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
};
VkBuffer buffer;
vkCreateBuffer(device, &info, NULL, &buffer);

// vulkan_rust
use vulkan_rust::vk;
use vk::*;

let info = BufferCreateInfo::builder()
    .size(1024)
    .usage(BufferUsageFlags::VERTEX_BUFFER)
    .sharing_mode(SharingMode::EXCLUSIVE);
let buffer = unsafe { device.create_buffer(&info, None) }
    .expect("Failed to create buffer");

Key differences:

  • sType is set automatically by ::builder().
  • pNext defaults to null (use push_next() to chain extensions).
  • The result is returned, not written through an output pointer.
  • The allocator callback (NULL in C) becomes None.

pNext extension chains

In C, you manually link structs through pNext:

// C
VkPhysicalDeviceVulkan12Features features12 = {
    .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VULKAN_1_2_FEATURES,
    .pNext = NULL,
    .bufferDeviceAddress = VK_TRUE,
};
VkDeviceCreateInfo info = {
    .sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
    .pNext = &features12,
    // ...
};

In vulkan_rust, use push_next():

// vulkan_rust
use vulkan_rust::vk;
use vk::*;

let mut features12 = *PhysicalDeviceVulkan12Features::builder()
    .buffer_device_address(1);  // VkBool32: 1 = true
let info = DeviceCreateInfo::builder()
    .push_next(&mut features12)
    .queue_create_infos(&queue_infos);

push_next is type-safe: you can only chain structs the Vulkan spec allows for that parent struct.

The two-call enumerate pattern

Many C Vulkan functions require two calls: one to get the count, one to fill the array:

// C: two calls to enumerate physical devices
uint32_t count = 0;
vkEnumeratePhysicalDevices(instance, &count, NULL);
VkPhysicalDevice* devices = malloc(count * sizeof(VkPhysicalDevice));
vkEnumeratePhysicalDevices(instance, &count, devices);

In vulkan_rust, these return a Vec directly:

// vulkan_rust: one call, returns Vec
let devices = unsafe { instance.enumerate_physical_devices() }
    .expect("Failed to enumerate devices");

The crate handles the two-call pattern internally.

Output parameters

C Vulkan uses pointer parameters for output values. vulkan_rust returns them as VkResult<T> or plain T:

// C
VkBuffer buffer;
VkResult result = vkCreateBuffer(device, &info, NULL, &buffer);
if (result != VK_SUCCESS) { /* handle error */ }

// vulkan_rust
use vulkan_rust::vk;
use vk::*;

let buffer: Buffer = unsafe { device.create_buffer(&info, None) }
    .expect("Failed to create buffer");

Functions that output multiple handles (like vkAllocateCommandBuffers) return a Vec directly:

use vulkan_rust::vk;
use vk::*;

let cmd_buffers = unsafe {
    device.allocate_command_buffers(&alloc_info)
}
.expect("Failed to allocate command buffers");

Search tip: #[doc(alias)]

All vulkan_rust types and functions carry #[doc(alias = "vkOriginalName")] attributes. If you know the C name, type it into the rustdoc search bar and it will find the Rust equivalent. For example, searching for VkBufferCreateInfo will find vk::BufferCreateInfo.

Common API mapping table

| C function | vulkan_rust method | Returns |
| --- | --- | --- |
| vkCreateInstance | entry.create_instance(&info, None) | VkResult<Instance> |
| vkDestroyInstance | instance.destroy_instance(None) | () |
| vkEnumeratePhysicalDevices | instance.enumerate_physical_devices() | VkResult<Vec<PhysicalDevice>> |
| vkGetPhysicalDeviceProperties | instance.get_physical_device_properties(phys) | PhysicalDeviceProperties |
| vkGetPhysicalDeviceQueueFamilyProperties | instance.get_physical_device_queue_family_properties(phys) | Vec<QueueFamilyProperties> |
| vkCreateDevice | instance.create_device(phys, &info, None) | VkResult<Device> |
| vkDestroyDevice | device.destroy_device(None) | () |
| vkGetDeviceQueue | device.get_device_queue(family, index) | Queue |
| vkCreateBuffer | device.create_buffer(&info, None) | VkResult<Buffer> |
| vkDestroyBuffer | device.destroy_buffer(buffer, None) | () |
| vkAllocateMemory | device.allocate_memory(&info, None) | VkResult<DeviceMemory> |
| vkFreeMemory | device.free_memory(memory, None) | () |
| vkBindBufferMemory | device.bind_buffer_memory(buffer, memory, offset) | VkResult<()> |
| vkMapMemory | device.map_memory(memory, offset, size, flags) | VkResult<*mut c_void> |
| vkUnmapMemory | device.unmap_memory(memory) | () |
| vkCreateImage | device.create_image(&info, None) | VkResult<Image> |
| vkDestroyImage | device.destroy_image(image, None) | () |
| vkCreateImageView | device.create_image_view(&info, None) | VkResult<ImageView> |
| vkCreateRenderPass | device.create_render_pass(&info, None) | VkResult<RenderPass> |
| vkCreateGraphicsPipelines | device.create_graphics_pipelines(cache, &infos, None) | VkResult<Vec<Pipeline>> |
| vkCreateCommandPool | device.create_command_pool(&info, None) | VkResult<CommandPool> |
| vkAllocateCommandBuffers | device.allocate_command_buffers(&info) | VkResult<Vec<CommandBuffer>> |
| vkBeginCommandBuffer | device.begin_command_buffer(cmd, &info) | VkResult<()> |
| vkEndCommandBuffer | device.end_command_buffer(cmd) | VkResult<()> |
| vkCmdBeginRenderPass | device.cmd_begin_render_pass(cmd, &info, contents) | () |
| vkCmdEndRenderPass | device.cmd_end_render_pass(cmd) | () |
| vkCmdBindPipeline | device.cmd_bind_pipeline(cmd, bind_point, pipeline) | () |
| vkCmdDraw | device.cmd_draw(cmd, vertices, instances, first_v, first_i) | () |
| vkCmdCopyBuffer | device.cmd_copy_buffer(cmd, src, dst, &regions) | () |
| vkQueueSubmit | device.queue_submit(queue, &submits, fence) | VkResult<()> |
| vkQueuePresentKHR | device.queue_present_khr(queue, &info) | VkResult<()> |
| vkDeviceWaitIdle | device.device_wait_idle() | VkResult<()> |
| vkCreateFence | device.create_fence(&info, None) | VkResult<Fence> |
| vkWaitForFences | device.wait_for_fences(&fences, wait_all, timeout) | VkResult<()> |
| vkResetFences | device.reset_fences(&fences) | VkResult<()> |
| vkCreateSemaphore | device.create_semaphore(&info, None) | VkResult<Semaphore> |
| vkCreateDescriptorSetLayout | device.create_descriptor_set_layout(&info, None) | VkResult<DescriptorSetLayout> |
| vkAllocateDescriptorSets | device.allocate_descriptor_sets(&info) | VkResult<Vec<DescriptorSet>> |
| vkUpdateDescriptorSets | device.update_descriptor_sets(&writes, &copies) | () |

Worked example: full translation

C version

// Create a vertex buffer, bind memory, copy data
VkBufferCreateInfo buf_info = {
    .sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
    .size = sizeof(vertices),
    .usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
    .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
};
VkBuffer buffer;
vkCreateBuffer(device, &buf_info, NULL, &buffer);

VkMemoryRequirements mem_req;
vkGetBufferMemoryRequirements(device, buffer, &mem_req);

VkMemoryAllocateInfo alloc_info = {
    .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
    .allocationSize = mem_req.size,
    .memoryTypeIndex = find_memory_type(mem_req.memoryTypeBits,
        VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
        VK_MEMORY_PROPERTY_HOST_COHERENT_BIT),
};
VkDeviceMemory memory;
vkAllocateMemory(device, &alloc_info, NULL, &memory);
vkBindBufferMemory(device, buffer, memory, 0);

void* data;
vkMapMemory(device, memory, 0, buf_info.size, 0, &data);
memcpy(data, vertices, sizeof(vertices));
vkUnmapMemory(device, memory);

vulkan_rust version

use vulkan_rust::vk;
use vk::*;

unsafe {
    let buf_info = BufferCreateInfo::builder()
        .size(std::mem::size_of_val(&vertices) as u64)
        .usage(BufferUsageFlags::VERTEX_BUFFER)
        .sharing_mode(SharingMode::EXCLUSIVE);
    let buffer = device.create_buffer(&buf_info, None)
        .expect("Failed to create buffer");

    let mem_req = device.get_buffer_memory_requirements(buffer);

    let alloc_info = MemoryAllocateInfo::builder()
        .allocation_size(mem_req.size)
        .memory_type_index(find_memory_type(
            mem_req.memory_type_bits,
            MemoryPropertyFlags::HOST_VISIBLE
                | MemoryPropertyFlags::HOST_COHERENT,
        ));
    let memory = device.allocate_memory(&alloc_info, None)
        .expect("Failed to allocate memory");
    device.bind_buffer_memory(buffer, memory, 0)
        .expect("Failed to bind buffer memory");

    let data = device.map_memory(
        memory, 0, buf_info.size, MemoryMapFlags::empty(),
    )
    .expect("Failed to map memory");
    std::ptr::copy_nonoverlapping(
        vertices.as_ptr() as *const u8,
        data as *mut u8,
        std::mem::size_of_val(&vertices),
    );
    device.unmap_memory(memory);
}

The structure is the same: create, query requirements, allocate, bind, map, copy, unmap. The differences are syntactic, not conceptual.
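The renaming itself is mechanical: drop the vk prefix, convert CamelCase to snake_case, and call the result as a method on instance or device. A rough sketch of that rule (the function below is hypothetical, written only for illustration; the real crate generates its names from vk.xml at build time):

```rust
/// Turn a C command name like "vkCreateBuffer" into the wrapper's
/// snake_case method name ("create_buffer").
/// Note: acronym suffixes like KHR need extra special-casing,
/// which is omitted here for brevity.
fn wrapper_method_name(c_name: &str) -> String {
    let stripped = c_name.strip_prefix("vk").unwrap_or(c_name);
    let mut out = String::new();
    for (i, ch) in stripped.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

fn main() {
    assert_eq!(wrapper_method_name("vkCreateBuffer"), "create_buffer");
    assert_eq!(wrapper_method_name("vkBindBufferMemory"), "bind_buffer_memory");
    println!("ok");
}
```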

Design Decisions & Safety Model

This page explains the major design decisions in vulkan_rust and why they were made. Each section addresses a common “why not do it the other way?” question.

Why two crates?

vulkan_rust is split into two crates with distinct roles:

  • vulkan-rust-sys is machine-generated from vk.xml. It contains ~40,000 lines of #[repr(C)] structs, #[repr(transparent)] enum newtypes, bitmask types, handle types, and function pointer typedefs. It is #![no_std].
  • vulkan-rust is hand-written. It provides Entry, Instance, Device, command loading, builders, surface helpers, and the error types.

Users depend on vulkan-rust and access raw types via vulkan_rust::vk::*.

This separation exists for three reasons:

  1. Build speed. Regenerating vulkan-rust-sys only happens when a new Vulkan spec version arrives. Day-to-day development in vulkan-rust does not trigger a rebuild of 40k lines of generated code.
  2. Reviewability. Generated code is validated by the generator’s test suite, not by human review. Hand-written code gets normal review. Mixing them in one crate blurs that boundary.
  3. no_std compatibility. vulkan-rust-sys has zero dependencies and can be used in environments without std. vulkan-rust requires std for library loading and allocation.

Why inherent methods instead of traits?

All Vulkan commands are inherent methods on Device or Instance:

use vulkan_rust::vk;

// No trait import needed, just call the method.
let buffer = unsafe { device.create_buffer(&info, None) }?;

Some Vulkan wrappers split commands across extension traits (e.g. KhrSwapchainExtension). This forces callers to import the right trait before calling the method, and IDE autocomplete only works when the trait is already in scope.

With inherent methods, every command appears in autocomplete on Device immediately, and there is nothing to import.
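The difference is easy to reproduce in miniature. Here is a sketch of the extension-trait style, with all types reduced to stand-ins: the method only becomes callable once the trait is imported.

```rust
mod ext {
    pub struct Device;

    // Extension-trait style: the command lives on a trait,
    // not on the type itself.
    pub trait KhrSwapchainExtension {
        fn acquire_next_image_khr(&self) -> u32;
    }

    impl KhrSwapchainExtension for Device {
        fn acquire_next_image_khr(&self) -> u32 {
            0 // stand-in for the real call
        }
    }
}

// Without this import, `device.acquire_next_image_khr()` does not
// compile, and autocomplete will not suggest the method.
use ext::KhrSwapchainExtension;

fn main() {
    let device = ext::Device;
    assert_eq!(device.acquire_next_image_khr(), 0);
    println!("ok");
}
```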

Why complete command loading?

When Device or Instance is created, vulkan_rust loads every function pointer from every enabled extension in a single pass. Some wrappers require callers to explicitly request which extension command tables to load.

Complete loading avoids that bookkeeping. The cost is negligible: resolving a few hundred function pointers takes microseconds at startup, and each slot occupies only the size of an Option<fn> in memory.
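Internally, this amounts to a table of optional function pointers filled once at creation time. A simplified sketch, with field names and types that are illustrative rather than the real generated code:

```rust
// Stand-in for a loaded Vulkan command's function pointer type.
type PfnDestroyBuffer = fn(buffer: u64);

// Simplified dispatch table: one Option<fn> per command.
struct DeviceCommands {
    destroy_buffer: Option<PfnDestroyBuffer>,
    // ... hundreds more in the real generated table
}

impl DeviceCommands {
    // In the real crate, loading resolves every name through
    // vkGetDeviceProcAddr in a single pass at device creation.
    fn load() -> Self {
        fn fake_destroy_buffer(_buffer: u64) {}
        DeviceCommands {
            destroy_buffer: Some(fake_destroy_buffer),
        }
    }

    fn destroy_buffer(&self, buffer: u64) {
        // Commands from extensions that were never enabled stay None;
        // calling one panics with a descriptive message.
        (self.destroy_buffer.expect("vkDestroyBuffer not loaded"))(buffer)
    }
}

fn main() {
    let commands = DeviceCommands::load();
    commands.destroy_buffer(42);
    println!("ok");
}
```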

Why from_raw_parts?

Both Instance and Device provide an unsafe fn from_raw_parts constructor that wraps an externally-owned Vulkan handle:

use vulkan_rust::Device;

let device = unsafe {
    Device::from_raw_parts(raw_vk_device, Some(get_device_proc_addr_fn))
};

This exists for three use cases:

  1. OpenXR interop. The XR runtime creates the VkInstance and VkDevice. Your code receives raw handles and needs to wrap them.
  2. Middleware. Profiling layers and debug tools may hand you raw handles.
  3. Testing. Unit tests can construct wrapper objects without a real GPU.

Why no Drop on handles?

Instance and Device do not implement Drop. Destruction is explicit:

use vulkan_rust::vk;

unsafe { device.destroy_device(None) };

Automatic destruction via Drop is tempting, but breaks in several real scenarios:

  • from_raw_parts and shared ownership. If two wrappers hold the same raw handle (e.g. your code and an OpenXR runtime), a Drop impl would double-destroy it.
  • GPU-async lifetimes. The GPU may still be using resources when Rust drops a handle. Correct destruction requires calling device_wait_idle or using fences first. A Drop impl cannot know when the GPU is done.
  • Destruction order. Vulkan objects have strict parent-child destruction ordering. Rust’s drop order (reverse declaration order within a scope) may not match what Vulkan requires.

Explicit destruction makes the caller responsible, which matches Vulkan’s own model.
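The drop-order point is checkable without a GPU: Rust drops locals in reverse declaration order, and it does so with no knowledge of fences or Vulkan's parent-child rules. A self-contained demonstration (no Vulkan involved):

```rust
use std::cell::RefCell;

thread_local! {
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

struct Tracked(&'static str);

impl Drop for Tracked {
    fn drop(&mut self) {
        LOG.with(|log| log.borrow_mut().push(self.0));
    }
}

fn drop_order_demo() -> Vec<&'static str> {
    LOG.with(|log| log.borrow_mut().clear());
    {
        let _instance = Tracked("instance");
        let _device = Tracked("device");
        // Scope ends here: locals drop in reverse declaration order.
    }
    LOG.with(|log| log.borrow().clone())
}

fn main() {
    // The device happens to drop before the instance here, but that
    // ordering is a property of source layout, not of Vulkan's rules:
    // move a handle into a struct or a Vec and the order changes.
    assert_eq!(drop_order_demo(), vec!["device", "instance"]);
    println!("ok");
}
```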

Why builders Deref to the inner struct?

Every builder dereferences to its inner vk::* struct:

use vulkan_rust::vk;
use vk::*;

let info = BufferCreateInfo::builder()
    .size(1024)
    .usage(BufferUsageFlags::VERTEX_BUFFER);

// Pass the builder directly where a &BufferCreateInfo is expected.
let buffer = unsafe { device.create_buffer(&info, None) }?;

Because BufferCreateInfoBuilder implements Deref<Target = BufferCreateInfo>, there is no .build() call. The builder is the struct, with a convenient setter API on top. This means you can pass a builder reference anywhere a struct reference is expected.
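The pattern is small enough to sketch in full; the struct names below are local stand-ins for the generated ones:

```rust
use std::ops::Deref;

// Stand-in for a generated #[repr(C)] info struct.
#[derive(Default)]
struct BufferCreateInfo {
    size: u64,
}

// Stand-in for the generated builder: it owns the struct directly.
#[derive(Default)]
struct BufferCreateInfoBuilder {
    inner: BufferCreateInfo,
}

impl BufferCreateInfoBuilder {
    fn size(mut self, size: u64) -> Self {
        self.inner.size = size;
        self
    }
}

// Deref is what lets &builder coerce to &BufferCreateInfo.
impl Deref for BufferCreateInfoBuilder {
    type Target = BufferCreateInfo;
    fn deref(&self) -> &BufferCreateInfo {
        &self.inner
    }
}

// A function that, like create_buffer, wants the plain struct.
fn takes_info(info: &BufferCreateInfo) -> u64 {
    info.size
}

fn main() {
    let info = BufferCreateInfoBuilder::default().size(1024);
    // Deref coercion: no .build() call needed.
    assert_eq!(takes_info(&info), 1024);
    println!("ok");
}
```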

Why #[repr(transparent)] newtypes for enums?

Vulkan “enums” are integer constants, not closed sets. Drivers and extensions can return values that did not exist when your code was compiled, and constructing a native Rust enum from such an unknown discriminant is immediate undefined behavior.

Instead, vulkan-rust-sys represents each Vulkan enum as a #[repr(transparent)] newtype around i32:

// Inside vulkan-rust-sys (simplified):

#[repr(transparent)]
pub struct Format(i32);

impl Format {
    pub const UNDEFINED: Self = Self(0);
    pub const R8G8B8A8_UNORM: Self = Self(37);
    // ... hundreds more
}

Unknown values are perfectly legal; they just lack a named constant. Pattern matching uses associated constants, and the compiler does not assume the set is exhaustive.
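Matching on such a newtype looks like this; because the compiler does not treat the set as exhaustive, a wildcard arm is always required. The example defines its own miniature Format so it stands alone:

```rust
// Miniature version of the generated newtype. The derives make the
// associated constants usable in match patterns.
#[derive(Clone, Copy, PartialEq, Eq)]
struct Format(i32);

impl Format {
    pub const UNDEFINED: Self = Self(0);
    pub const R8G8B8A8_UNORM: Self = Self(37);
}

fn describe(format: Format) -> &'static str {
    match format {
        Format::UNDEFINED => "undefined",
        Format::R8G8B8A8_UNORM => "rgba8",
        // A driver can hand back a value with no named constant;
        // this arm keeps that perfectly legal instead of UB.
        _ => "unknown",
    }
}

fn main() {
    assert_eq!(describe(Format::R8G8B8A8_UNORM), "rgba8");
    assert_eq!(describe(Format(9999)), "unknown");
    println!("ok");
}
```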

The safety model

All Vulkan command wrappers are unsafe fn. The caller is responsible for meeting every precondition the Vulkan spec defines: valid handles, correct synchronization, matching lifetimes, and so on.

vulkan_rust does not attempt to encode Vulkan’s safety rules in the Rust type system. The spec is too large and too nuanced for compile-time enforcement to be practical without severe ergonomic cost.

Instead, the safety strategy is:

  1. Validation layers during development. Enable VK_LAYER_KHRONOS_validation in debug builds. The validation layer catches spec violations, use-after-free, missing synchronization, and more. It is the primary safety net.
  2. Type-safe newtypes. You cannot accidentally pass a Buffer where a Pipeline is expected. This catches a class of handle mixups at compile time.
  3. Builder push_next with marker traits. The push_next method on builders is generic over an Extends* marker trait, so only structs that the spec allows in a given pNext chain can be appended.
  4. Panic on missing function pointers. If you call a command from an extension that was not enabled, the stub panics with a descriptive message (e.g. "VK_KHR_surface not loaded"). This catches misconfiguration early.
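The push_next mechanism from point 3 can be sketched in miniature; the trait and struct names below are simplified stand-ins for the generated ones:

```rust
// Stand-in marker trait: only structs the spec allows in a
// BufferCreateInfo pNext chain implement it.
trait ExtendsBufferCreateInfo {}

#[derive(Default)]
struct ExternalMemoryBufferCreateInfo {
    _handle_types: u32,
}
impl ExtendsBufferCreateInfo for ExternalMemoryBufferCreateInfo {}

#[derive(Default)]
struct BufferCreateInfoBuilder {
    chain_len: usize, // stand-in for the real pNext pointer chain
}

impl BufferCreateInfoBuilder {
    // The trait bound rejects, at compile time, any struct the spec
    // does not allow in this particular chain.
    fn push_next<T: ExtendsBufferCreateInfo>(mut self, _next: &mut T) -> Self {
        self.chain_len += 1;
        self
    }
}

fn main() {
    let mut ext = ExternalMemoryBufferCreateInfo::default();
    let info = BufferCreateInfoBuilder::default().push_next(&mut ext);
    assert_eq!(info.chain_len, 1);
    // BufferCreateInfoBuilder::default().push_next(&mut 42_u32);
    // ^ would not compile: u32 does not implement ExtendsBufferCreateInfo.
    println!("ok");
}
```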

What the generator handles vs what is hand-written

Generated (vulkan-rust-sys) | Hand-written (vulkan-rust)
--- | ---
#[repr(C)] struct definitions | Entry, Instance, Device wrappers
#[repr(transparent)] enum newtypes | Command loading and dispatch tables
Bitmask types and flag constants | from_raw_parts constructors
Handle newtypes | Error types (VkResult, LoadError)
Function pointer typedefs | Surface creation (SurfaceError)
Builder structs with Deref | SPIR-V bytecode loading
push_next methods + Extends* traits | Version parsing
Wrapper methods on Device/Instance | Loader trait and library loading

Error Handling Philosophy

This page explains how vulkan_rust maps Vulkan’s C-style error model into idiomatic Rust, and where the boundaries between error types lie.

Vulkan’s error model

Every Vulkan command that can fail returns a VkResult, which is a plain int32_t. The spec defines named constants for it:

  • Success codes are non-negative: VK_SUCCESS (0), VK_INCOMPLETE (5), VK_SUBOPTIMAL_KHR (1000001003), and a few others.
  • Error codes are negative: VK_ERROR_OUT_OF_HOST_MEMORY (-1), VK_ERROR_DEVICE_LOST (-4), etc.

There is no exception system, no errno, no callback. The caller checks the return value after every call.

The VkResult<T> type alias

vulkan-rust defines a single result type for all Vulkan command wrappers:

use vulkan_rust::vk;

pub type VkResult<T> = std::result::Result<T, vk::Result>;

Here vk::Result is the #[repr(transparent)] i32 newtype from vulkan-rust-sys. The Err variant holds any negative value. The Ok variant holds the command’s output (a handle, a vector of properties, or just ()).

A helper function performs the conversion:

use vulkan_rust::vk;

pub(crate) fn check(result: vk::Result) -> VkResult<()> {
    if result.as_raw() >= 0 {
        Ok(())
    } else {
        Err(result)
    }
}

This means all non-negative codes, including INCOMPLETE and SUBOPTIMAL, are treated as success by default.

Success codes that are not SUCCESS

Some Vulkan commands return positive success codes that carry meaning:

  • INCOMPLETE from enumeration commands means the output buffer was too small. vulkan-rust’s two-call helpers handle this internally by querying the count first, so callers rarely see it.
  • SUBOPTIMAL_KHR from vkAcquireNextImageKHR means the swapchain still works but no longer matches the surface optimally. You should recreate the swapchain, but the current frame is still valid.

Because check maps all non-negative codes to Ok(()), these success codes do not propagate as errors. Wrapper methods that need to distinguish them (e.g. swapchain acquisition) inspect the raw code explicitly after the check call.
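A sketch of how such a wrapper might distinguish the two outcomes, using the real numeric codes but stand-in types; the function itself is hypothetical:

```rust
// Real Vulkan codes: VK_SUCCESS = 0, VK_SUBOPTIMAL_KHR = 1000001003.
const SUCCESS: i32 = 0;
const SUBOPTIMAL_KHR: i32 = 1000001003;

/// Stand-in for a swapchain-acquire wrapper: non-negative codes are
/// success, but SUBOPTIMAL_KHR is surfaced as a flag rather than
/// swallowed by the generic check.
fn interpret_acquire(code: i32, image_index: u32) -> Result<(u32, bool), i32> {
    if code < 0 {
        Err(code)
    } else {
        Ok((image_index, code == SUBOPTIMAL_KHR))
    }
}

fn main() {
    assert_eq!(interpret_acquire(SUCCESS, 2), Ok((2, false)));
    assert_eq!(interpret_acquire(SUBOPTIMAL_KHR, 2), Ok((2, true)));
    // VK_ERROR_OUT_OF_DATE_KHR (-1000001004) is a genuine error.
    assert_eq!(interpret_acquire(-1000001004, 0), Err(-1000001004));
    println!("ok");
}
```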

LoadError for library loading

Before any Vulkan command runs, the shared library (vulkan-1.dll, libvulkan.so) must be loaded and vkGetInstanceProcAddr resolved. Failures here are not Vulkan API errors; they mean the Vulkan runtime is not available at all.

LoadError captures these:

use vulkan_rust::vk;

pub enum LoadError {
    /// The Vulkan shared library could not be found or opened.
    Library(libloading::Error),
    /// vkGetInstanceProcAddr could not be resolved from the library.
    MissingEntryPoint,
}

LoadError implements std::error::Error and is returned from Entry::new. It is entirely separate from vk::Result.

SurfaceError for surface creation

Creating a window surface involves platform-specific logic and raw-window-handle integration. Three distinct failure modes exist:

use vulkan_rust::vk;

pub enum SurfaceError {
    /// The display/window handle combination is not supported.
    UnsupportedPlatform,
    /// raw-window-handle returned an error.
    HandleError(raw_window_handle::HandleError),
    /// Vulkan error from the surface creation call.
    Vulkan(vk::Result),
}

SurfaceError unifies platform detection failures, handle errors, and the underlying Vulkan error into one type, so callers of Instance::create_surface have a single Result to handle.

When vulkan_rust panics

Panics are reserved for programmer mistakes, never for runtime failures that a correct program could encounter:

  • Calling an unloaded function pointer. If you call a command from an extension that was not enabled at instance or device creation, the function pointer is None. The generated wrapper calls .expect() with a message like "VK_KHR_surface not loaded". This is a configuration error, not a recoverable failure.
  • Internal invariant violations. These should never happen. If they do, a panic with a descriptive message is the right response.

Vulkan runtime errors (out of memory, device lost, surface lost) are always returned as Err(vk::Result), never panicked.

The standard pattern

Most application code follows the same pattern: call the command, propagate errors with ?, handle them at the boundary.

use vulkan_rust::vk;
use vulkan_rust::Device;
use vk::*;

unsafe fn create_pipeline(
    device: &Device,
    layout: PipelineLayout,
    render_pass: RenderPass,
    // ...
) -> VkResult<Pipeline> {
    let shader = device.create_shader_module(&shader_info, None)?;
    let pipelines = device.create_graphics_pipelines(
        PipelineCache::null(),
        &[pipeline_info],
        None,
    );
    // Destroy the shader module before propagating any pipeline error,
    // so an early `?` return does not leak it.
    device.destroy_shader_module(shader, None);
    Ok(pipelines?[0])
}

Individual commands propagate errors upward. The top-level caller (your main loop or initialization function) decides whether to retry, fall back, or exit.
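At that boundary, a match on the error code is the usual shape. Below is a sketch using real Vulkan code values but stand-in types; what each arm does is application policy, not part of vulkan_rust:

```rust
// Real Vulkan error codes.
const ERROR_OUT_OF_DATE_KHR: i32 = -1000001004;
const ERROR_DEVICE_LOST: i32 = -4;

#[derive(Debug, PartialEq)]
enum Recovery {
    RecreateSwapchain,
    Shutdown,
}

// Policy at the top of the call stack: a few errors are recoverable,
// everything else ends the program gracefully.
fn plan_recovery(code: i32) -> Recovery {
    match code {
        ERROR_OUT_OF_DATE_KHR => Recovery::RecreateSwapchain,
        ERROR_DEVICE_LOST => Recovery::Shutdown,
        _ => Recovery::Shutdown,
    }
}

fn main() {
    assert_eq!(plan_recovery(ERROR_OUT_OF_DATE_KHR), Recovery::RecreateSwapchain);
    assert_eq!(plan_recovery(ERROR_DEVICE_LOST), Recovery::Shutdown);
    println!("ok");
}
```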

Validation layers vs error codes

These are complementary, not overlapping:

Concern | Mechanism
--- | ---
Spec violations (wrong usage, missing sync) | Validation layers (VK_LAYER_KHRONOS_validation)
Recoverable runtime failures (OOM, device lost) | vk::Result error codes via VkResult<T>
Missing Vulkan runtime | LoadError
Platform surface issues | SurfaceError
Programmer misconfiguration (extension not enabled) | Panic

Validation layers are a development-time tool. They intercept every Vulkan call, check it against the spec, and report violations via debug callbacks. They have significant overhead and are typically disabled in release builds.

Error codes are a production-time mechanism. They report conditions the application can respond to: allocate less memory, recreate the swapchain, or shut down gracefully.

A well-structured vulkan_rust application uses both: validation layers to catch bugs during development, error propagation to handle failures in production.