An Introduction to the Per-Frame Rendering Flow with Rust and Vulkan

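For the full list of documents, see: Summary of the Rust Mobile Cross-Platform Complex Graphics Rendering Project Development Series (Table of Contents)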

Updated 2019.5.10: rendering to a view

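The gfx-hal API mirrors Vulkan 1:1, so the explanation below uses Vulkan's terms. Because the Vulkan API is so fine-grained, it is harder to learn than OpenGL / ES. In my experience, explaining Vulkan by analogy with the OpenGL ES API lowers the learning curve for mobile graphics developers. Starting with per-frame rendering, and skipping the initialization of the supporting data structures, makes it easier to grasp Vulkan's core flow.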

OpenGL / ES per-frame rendering flow example

// Prepare the render target
glBindFramebuffer();
glFramebufferTexture2D(); glCheckFramebufferStatus(); // only when rendering to a texture
glViewport(x, y, width, height);
// Prepare the data the shader needs to read
glUseProgram(x);
glBindBuffer(i);
loop i in 0..VertexVarCount {
    glEnableVertexAttribArray(i);
    glVertexAttribPointer(i, ...);
}
loop i in 0..UniformVarCount {
    switch UniformType {
        case NoTexture: glUniformX(i, data); break;
        case Texture: {
            glActiveTexture(j);
            glBindTexture(type, texture_name);
            glUniform1i(location, j);
            break;
        }
        default: ERROR();
    }
}
// Configure the other fragment operations, e.g. glBlend*, glStencil*
glDrawArrays/Elements/ArraysInstanced...
// The draw call is now complete. Where applicable, call the EGL function
// (not a GL function) to swap the front and back buffers;
// this step does not apply when rendering to a texture.
// To avoid interfering with later draws, restore the Framebuffer state set above to its defaults.
eglSwapBuffers()/[EAGLContext presentRenderbuffer];

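As you can see, the OpenGL / ES API hides the vast majority of the details, so the overall amount of code looks small. It is still hard to understand when you first learn it, but after long use it becomes routine and you assume this is simply how things are done. The result is that on first contact with Vulkan you find many details you never knew existed, which is rather disorienting.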

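The downside of settling into the OpenGL / ES routine is that when something goes wrong, it is hard to tell right away whether the fault is in the project code (for example, the state machine was not set up correctly) or in the driver. On iOS this is manageable, but on Android you are truly working in the dark. I expect you will point to the vendors' profiling tools and Google's gapid, but none of them is any good to work with; Qualcomm's technical support advised us to use a rooted device running Android 8.0 or later. Yet the devices that actually have problems are usually running Android 4.x.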

Rendering to a view

The core call flow for per-frame rendering to a view in gfx-hal (Vulkan) is as follows:

-> CommandPool -> CommandBuffer

-> Submit -> Submission-> QueueGroup -> CommandQueue -> GraphicsHardware

Notes:

  • CommandQueue: a queue that executes one type of work, such as rendering tasks or compute tasks.
  • QueueGroup: a collection of CommandQueues
  • GraphicsHardware: the graphics hardware
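To make the chain above easier to picture, here is a toy model in plain Rust. None of these types are the gfx-hal ones: `Command`, `CommandPool`, `Submit`, `run_frame`, and the rest are made up for illustration, and the "queue" only copies the recorded commands into a log. The one design point it borrows is that `finish` consumes the command buffer, so nothing can be recorded after it.

```rust
// Toy model of CommandPool -> CommandBuffer -> Submit -> Submission -> CommandQueue.
// Every type here is a hypothetical stand-in, not a gfx-hal type.

/// A recorded GPU command (only two kinds, for illustration).
#[derive(Debug, Clone, PartialEq)]
enum Command {
    BindPipeline(&'static str),
    Draw { vertices: u32 },
}

/// Hands out command buffers.
struct CommandPool;

/// A command buffer that is still being recorded.
struct CommandBuffer {
    commands: Vec<Command>,
}

/// A finished command buffer; recording is no longer possible.
struct Submit(Vec<Command>);

/// A batch of finished command buffers headed for one queue.
struct Submission {
    submits: Vec<Submit>,
}

/// Stands in for a hardware queue; "executing" just appends to a log.
struct CommandQueue {
    executed: Vec<Command>,
}

impl CommandPool {
    fn acquire_command_buffer(&mut self) -> CommandBuffer {
        CommandBuffer { commands: Vec::new() }
    }
}

impl CommandBuffer {
    fn record(&mut self, command: Command) {
        self.commands.push(command);
    }

    /// Consumes the buffer, so recording after `finish` does not compile.
    fn finish(self) -> Submit {
        Submit(self.commands)
    }
}

impl CommandQueue {
    fn submit(&mut self, submission: Submission) {
        for Submit(commands) in submission.submits {
            self.executed.extend(commands);
        }
    }
}

/// Records two commands, submits them, and returns what the queue "executed".
fn run_frame() -> Vec<Command> {
    let mut pool = CommandPool;
    let mut queue = CommandQueue { executed: Vec::new() };

    let mut cmd_buffer = pool.acquire_command_buffer();
    cmd_buffer.record(Command::BindPipeline("main"));
    cmd_buffer.record(Command::Draw { vertices: 6 });
    let submit = cmd_buffer.finish();

    queue.submit(Submission { submits: vec![submit] });
    queue.executed
}

fn main() {
    println!("{:?}", run_frame());
}
```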

The flow in code:

  • Reset the Fence so that it can be used later when the Submission is submitted to the queue.
    device.reset_fence(&frame_fence);
  • Reset the CommandPool, which resets every CommandBuffer created from this pool. If a CommandBuffer is still in use, the developer has to implement the resource synchronization.
    command_pool.reset();
  • Acquire an image index from the Swapchain
    let frame = swap_chain.acquire_image(!0, FrameSync::Semaphore(&mut frame_semaphore));
  • Create and configure a CommandBuffer through the CommandPool; finishing the command recording yields a valid Submit object
    let mut cmd_buffer = command_pool.acquire_command_buffer(false);
    // A series of configuration calls, similar to OpenGL / ES fragment operations and binding data to a Program
    // Two Pipeline operations worth noting
    cmd_buffer.bind_graphics_pipeline(&pipeline);
    cmd_buffer.bind_graphics_descriptor_sets(&pipeline_layout, 0, Some(&desc_set), &[]);
    // Operations tied to the RenderPass
    let mut encoder = cmd_buffer.begin_render_pass_inline(&render_pass,...);
    let submit = cmd_buffer.finish();
  • Create a Submission from the Submit
    let submission = Submission::new()
        .wait_on(&[(&frame_semaphore, PipelineStage::BOTTOM_OF_PIPE)])
        .submit(Some(submit));
  • Submit the Submission to the queue
    queue.submit(submission, Some(&mut frame_fence));
  • Wait for the GPU to finish executing
    device.wait_for_fence(&frame_fence, !0);
  • Swap the front and back buffers, equivalent to eglSwapBuffers
    swap_chain.present(&mut queue_group.queues[0], frame, &[]);
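The reset, submit, wait, present ordering in the steps above is easier to see with a toy fence. The sketch below is plain Rust with a worker thread playing the GPU; `Fence` and `render_frame` are hypothetical names, not gfx-hal types, and the semaphore used for image acquisition is left out entirely. It only shows why the CPU can safely present after `wait` returns.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Toy fence: a flag guarded by a mutex, plus a condvar to wait on.
/// A hypothetical stand-in; real fences are driver objects.
#[derive(Clone)]
struct Fence(Arc<(Mutex<bool>, Condvar)>);

impl Fence {
    fn new() -> Self {
        Fence(Arc::new((Mutex::new(false), Condvar::new())))
    }

    /// Back to unsignalled, as `reset_fence` does at the top of a frame.
    fn reset(&self) {
        let (flag, _) = &*self.0;
        *flag.lock().unwrap() = false;
    }

    /// Called by the "GPU" once the submitted work is done.
    fn signal(&self) {
        let (flag, condvar) = &*self.0;
        *flag.lock().unwrap() = true;
        condvar.notify_all();
    }

    /// Blocks the CPU until the fence is signalled, like `wait_for_fence`.
    fn wait(&self) {
        let (flag, condvar) = &*self.0;
        let mut done = flag.lock().unwrap();
        while !*done {
            done = condvar.wait(done).unwrap();
        }
    }
}

/// One frame: reset, submit to a worker thread, wait, then present.
fn render_frame(fence: &Fence, frame_index: u32) -> Vec<String> {
    let log = Arc::new(Mutex::new(Vec::new()));
    fence.reset();
    log.lock().unwrap().push(format!("frame {}: submit", frame_index));

    let gpu_fence = fence.clone();
    let gpu_log = Arc::clone(&log);
    let gpu = thread::spawn(move || {
        gpu_log.lock().unwrap().push(format!("frame {}: gpu done", frame_index));
        gpu_fence.signal();
    });

    fence.wait();
    log.lock().unwrap().push(format!("frame {}: present", frame_index));
    gpu.join().unwrap();

    let result = log.lock().unwrap().clone();
    result
}

fn main() {
    let fence = Fence::new();
    for frame_index in 0..2 {
        for line in render_frame(&fence, frame_index) {
            println!("{}", line);
        }
    }
}
```

Waiting on the fence every frame, as the flow above does, keeps the CPU and GPU in lockstep; it is the simplest correct ordering, not necessarily the fastest.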

Configuring the CommandBuffer in detail

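OpenGL / ES 2/3.x has no CommandPool or CommandBuffer data structures. Only the latest OpenGL minor versions added SPIR-V and Command, and OpenGL ES has not been updated to match. Metal's CommandBuffer interface is defined differently from Vulkan's. In Metal you create an MTLCommandBuffer, create an Encoder from that buffer together with a RenderPassDescriptor, pack in the resources for the current render, and finally commit the buffer to the queue for the GPU to execute. Vulkan essentially moves Metal's Encoder operations into the CommandBuffer and leaves only a very thin Encoder.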

Overall flow:

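  • Allocate an available Command Buffer from the Command Pool
  • Configure the viewport and related state
  • Configure the vertex data buffers
  • Configure the mapping between Uniforms and Buffers
  • Set the output target RenderPass
  • Set the draw method: draw / draw_indexed / draw_indirect and so on
  • Finish the configuration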

Example code:

let submit = {
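    // Take an instance that is really a RawCommandBuffer out of the pool,
    // add a thread-safety wrapper, and assemble a thread-safe CommandBuffer instance.
    // This is a standard HAL programming pattern; many of its data structures work this way.
    let mut cmd_buffer = command_pool.acquire_command_buffer(false);
    cmd_buffer.begin();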

    cmd_buffer.set_viewports(0, &[viewport]);
    cmd_buffer.set_scissors(0, &[viewport.rect]);
    cmd_buffer.bind_graphics_pipeline(&pipeline);
    cmd_buffer.bind_vertex_buffers(0, pso::VertexBufferSet(vec![(&vertex_buffer, 0)]));
    cmd_buffer.bind_graphics_descriptor_sets(&pipeline_layout, 0, Some(&desc_set)); //TODO

    // encoder
    cmd_buffer.begin_render_pass_inline(
            &render_pass,
            &framebuffers[frame.id()],
            viewport.rect,
            &[command::ClearValue::Color(command::ClearColor::Float([0.8, 0.8, 0.8, 1.0]))],
        ).draw(0..6, 0..1);

    cmd_buffer.finish()
};

This code shows two key CommandBuffer operations: bind_graphics_pipeline(GraphicsPipeline) and bind_graphics_descriptor_sets(PipelineLayout, DescriptorSet). A GraphicsPipeline is the equivalent of a Program in OpenGL / ES, while the PipelineLayout and DescriptorSet describe how the shader's Uniform variables read data from Buffers.
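As a rough mental model of that last sentence, the layout can be read as "which binding slots the shader expects" and the descriptor set as "what is actually plugged into those slots". The sketch below is plain Rust under that assumption. `DescriptorKind`, `Resource`, `validate`, and the struct fields are all hypothetical; the real gfx-hal and Vulkan objects carry far more state.

```rust
use std::collections::HashMap;

/// The kind of resource a binding slot expects (a tiny subset, for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DescriptorKind {
    UniformBuffer,
    SampledTexture,
}

/// Something that can be plugged into a binding slot.
#[derive(Debug, Clone, PartialEq)]
enum Resource {
    Buffer(Vec<f32>),
    Texture(&'static str),
}

impl Resource {
    fn kind(&self) -> DescriptorKind {
        match self {
            Resource::Buffer(_) => DescriptorKind::UniformBuffer,
            Resource::Texture(_) => DescriptorKind::SampledTexture,
        }
    }
}

/// Toy pipeline layout: binding index -> kind the shader expects there.
struct PipelineLayout {
    bindings: HashMap<u32, DescriptorKind>,
}

/// Toy descriptor set: binding index -> the resource actually bound.
struct DescriptorSet {
    resources: HashMap<u32, Resource>,
}

/// Checks that the set fills every slot the layout declares, with the right kind.
fn validate(layout: &PipelineLayout, set: &DescriptorSet) -> Result<(), String> {
    for (&binding, &expected) in &layout.bindings {
        match set.resources.get(&binding) {
            None => return Err(format!("binding {} is not bound", binding)),
            Some(resource) if resource.kind() != expected => {
                return Err(format!("binding {} has the wrong kind", binding));
            }
            Some(_) => {}
        }
    }
    Ok(())
}

fn main() {
    let layout = PipelineLayout {
        bindings: HashMap::from([
            (0, DescriptorKind::UniformBuffer),
            (1, DescriptorKind::SampledTexture),
        ]),
    };
    let mut set = DescriptorSet { resources: HashMap::new() };
    set.resources.insert(0, Resource::Buffer(vec![1.0, 0.0, 0.0, 1.0]));
    println!("{:?}", validate(&layout, &set));

    set.resources.insert(1, Resource::Texture("albedo"));
    println!("{:?}", validate(&layout, &set));
}
```

Compared with OpenGL / ES, where each glUniform* call sets one value on the currently bound Program, this groups all of a draw's inputs into one object that is bound in a single call.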

Rendering to a texture

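A render-to-texture (RTT) scenario has no Swapchain, so RenderPass.Attachment.format must be set to the texture's pixel format. After that, submit to the Queue and the flow is complete; there is no need to call swap_chain.present(), nor any way to. To receive the completion event or the elapsed time of this CommandBuffer's GPU work, add the corresponding callback to the CommandBuffer.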

Configuring RenderPass.Attachment.format

let render_pass = {
    let attachment = Attachment {
        format: Some(format), // pixel format of the texture
        samples: 1,
        ops: AttachmentOps::new(AttachmentLoadOp::Clear, AttachmentStoreOp::Store),
        stencil_ops: AttachmentOps::DONT_CARE,
        layouts: Layout::Undefined..Layout::Present,
    };

    let subpass = SubpassDesc {
        colors: &[(0, Layout::ColorAttachmentOptimal)], // matches what Vulkan requires
        depth_stencil: None,  // as above, an Optimal layout should be used
        inputs: &[],
        resolves: &[],
        preserves: &[],
    };

    let dependency = SubpassDependency {
        passes: SubpassRef::External..SubpassRef::Pass(0),
        stages: PipelineStage::COLOR_ATTACHMENT_OUTPUT..PipelineStage::COLOR_ATTACHMENT_OUTPUT,
        accesses: Access::empty()..(Access::COLOR_ATTACHMENT_READ | Access::COLOR_ATTACHMENT_WRITE),
    };

    device.create_render_pass(&[attachment], &[subpass], &[dependency])
          .expect("Can't create render pass")
};

Submitting to the CommandQueue

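Submit exactly as when rendering to a view, minus the swap_chain.present() step. How can you verify that stopping here is enough? Reading the source is one option. On Metal, Xcode's Capture GPU Frame is another. For how to run Capture GPU Frame on a Cargo project, see my other document: A Failed Capture GPU Frame Attempt with the Xcode External Build System: What Happened, the Solution, and a Retrospective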

// ... lots of previous stuff
queue.submit(submission, Some(&mut frame_fence));
device.wait_for_fence(&frame_fence, !0);
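To put the two paths side by side, here is a small sketch in plain Rust that lists the per-frame steps for each target. `RenderTarget` and `frame_steps` are hypothetical helpers and the step names are just labels taken from the flows above; the only point is that the Swapchain steps drop out when the target is a texture.

```rust
/// Where the frame ends up (hypothetical enum, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum RenderTarget {
    View,
    Texture,
}

/// Lists the per-frame steps for a target, following the flows described above.
fn frame_steps(target: RenderTarget) -> Vec<&'static str> {
    let mut steps = vec!["reset_fence", "command_pool.reset"];
    if target == RenderTarget::View {
        // Only a view has a Swapchain to acquire an image from.
        steps.push("acquire_image");
    }
    steps.extend_from_slice(&["record commands", "queue.submit", "wait_for_fence"]);
    if target == RenderTarget::View {
        // Presenting only makes sense when there is a Swapchain.
        steps.push("present");
    }
    steps
}

fn main() {
    println!("view:    {:?}", frame_steps(RenderTarget::View));
    println!("texture: {:?}", frame_steps(RenderTarget::Texture));
}
```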

Texture-dependent rendering

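For example, the output texture of DrawCall 1 is one of the input textures of DrawCall 2, and the output texture of DrawCall 2 is one of the input textures of DrawCall 3. This kind of scenario is very common in live-streaming and short-video products.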

// **************** Prepare the RenderPass ***************
// Release the old RenderPass before creating a new one
vkDestroyRenderPass(vkDevice, mVkRenderPass, nullptr);

VkSubpassDescription subpass;
subpass.pipelineBindPoint       = mVkPipelineBindPoint; // initialized to VK_PIPELINE_BIND_POINT_GRAPHICS
subpass.flags                   = 0;
subpass.colorAttachmentCount    = colorFormat        != VK_FORMAT_UNDEFINED ? 1             : 0;
subpass.pColorAttachments       = colorFormat        != VK_FORMAT_UNDEFINED ? &color        : nullptr;
subpass.pDepthStencilAttachment = depthstencilFormat != VK_FORMAT_UNDEFINED ? &depthstencil : nullptr;
subpass.pResolveAttachments     = nullptr;
subpass.inputAttachmentCount    = 0;
subpass.pInputAttachments       = nullptr;
subpass.preserveAttachmentCount = 0;
subpass.pPreserveAttachments    = nullptr;

VkRenderPassCreateInfo info;
info.sType            = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
info.pNext            = nullptr;
info.flags            = 0;
info.attachmentCount  = static_cast<uint32_t>(attachments.size());
info.pAttachments     = attachments.data();
info.subpassCount     = 1;
info.pSubpasses       = &subpass;
info.dependencyCount  = 0;
info.pDependencies    = nullptr;

VkResult err = vkCreateRenderPass(vkDevice, &info, nullptr, &mVkRenderPass);
// Error handling (err != VK_ERROR_OUT_OF_HOST_MEMORY && err != VK_ERROR_OUT_OF_DEVICE_MEMORY);

// **************** Prepare the Framebuffer ***************
// Release the old Framebuffer before creating a new one
vkDestroyFramebuffer(vkDevice, mVkFramebuffer, nullptr);

VkFramebufferCreateInfo info;
info.sType           = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO;
info.pNext           = nullptr;
info.flags           = 0;
info.renderPass      = *renderpass;
info.width           = width;
info.height          = height;
info.layers          = 1;
info.attachmentCount = static_cast<uint32_t>(imageViews->size());
info.pAttachments    = imageViews->data();
VkResult err = vkCreateFramebuffer(vkDevice, &info, nullptr, &mVkFramebuffer);
// Error handling (err != VK_ERROR_OUT_OF_HOST_MEMORY && err != VK_ERROR_OUT_OF_DEVICE_MEMORY);

// **************** Begin the CommandBuffer ***************
VkCommandBufferBeginInfo info;
info.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
info.pNext = nullptr;
info.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT; // glDraw*: used only once?
info.pInheritanceInfo = nullptr;

VkResult err = vkBeginCommandBuffer(mVkCommandBuffers.commandBuffer[mActiveCmdBuffer], &info);
// Error handling (err != VK_SUCCESS)
mVkCommandBuffers.commandBufferState[mActiveCmdBuffer] = CMD_BUFFER_RECORDING_STATE;

// **************** Begin the RenderPass ***************
VkRenderPassBeginInfo info;
info.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO;
info.pNext = nullptr;
info.framebuffer = *framebuffer;
info.renderPass = mVkRenderPass;
info.renderArea = mVkRenderArea;
info.clearValueCount = 2;
info.pClearValues = mVkClearValues;

const VkSubpassContents subpassContents = VK_SUBPASS_CONTENTS_SECONDARY_COMMAND_BUFFERS;
vkCmdBeginRenderPass(*activeCmdBuffer, &info, subpassContents);     


// **************** Allocate the Secondary CommandBuffers ***************
VkCommandBuffer *commandBuffers = new VkCommandBuffer;
VkCommandBufferAllocateInfo cmdAllocInfo;
cmdAllocInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
cmdAllocInfo.pNext = nullptr;
cmdAllocInfo.commandPool = mVkCmdPool;
cmdAllocInfo.level = VK_COMMAND_BUFFER_LEVEL_SECONDARY;
cmdAllocInfo.commandBufferCount = numOfBuffers; // the value is 1

VkResult err = vkAllocateCommandBuffers(vkDevice, &cmdAllocInfo, commandBuffers);
// Error handling (err != VK_SUCCESS)
      
// **************** Begin the Secondary CommandBuffers ***************
VkCommandBufferInheritanceInfo inheritanceInfo;
inheritanceInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_INHERITANCE_INFO;
inheritanceInfo.pNext = nullptr;
inheritanceInfo.renderPass = renderPass;
inheritanceInfo.subpass = 0;
inheritanceInfo.framebuffer = framebuffer;
inheritanceInfo.occlusionQueryEnable = VK_FALSE;
inheritanceInfo.queryFlags = 0;
inheritanceInfo.pipelineStatistics = 0;

VkCommandBufferBeginInfo cmdBeginInfo;
cmdBeginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
cmdBeginInfo.pNext = nullptr;
cmdBeginInfo.flags = VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT;
cmdBeginInfo.pInheritanceInfo = &inheritanceInfo;

VkResult err = vkBeginCommandBuffer(*cmdBuffer, &cmdBeginInfo);
// Error handling (err != VK_SUCCESS)
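Stripped of the Vulkan setup, the dependency between the draw calls is a plain chain: each pass reads the texture the previous pass wrote. The sketch below shows just that shape in Rust. A "texture" is only a `Vec<f32>`, and `brighten`, `invert`, `halve`, and `run_chain` are made-up names; it deliberately ignores the synchronization a real GPU needs between a write to a texture and the later read of it.

```rust
/// A toy "texture": just a row of pixel intensities.
type Texture = Vec<f32>;

/// A toy draw call: reads one input texture and produces one output texture.
type DrawCall = fn(&[f32]) -> Texture;

fn brighten(input: &[f32]) -> Texture {
    input.iter().map(|pixel| pixel + 0.25).collect()
}

fn invert(input: &[f32]) -> Texture {
    input.iter().map(|pixel| 1.0 - pixel).collect()
}

fn halve(input: &[f32]) -> Texture {
    input.iter().map(|pixel| pixel * 0.5).collect()
}

/// Runs the passes in order; the output of pass N is the input of pass N + 1.
fn run_chain(source: Texture, passes: &[DrawCall]) -> Texture {
    passes.iter().fold(source, |texture, pass| pass(&texture))
}

fn main() {
    let passes: [DrawCall; 3] = [brighten, invert, halve];
    let result = run_chain(vec![0.0, 0.5], &passes);
    println!("{:?}", result);
}
```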

Definitions of the related HAL data structures

FrameSync definition

/// Synchronization primitives which will be signalled once a frame got retrieved.
///
/// The semaphore or fence _must_ be unsignalled.
pub enum FrameSync<'a, B: Backend> {
    /// Semaphore used for synchronization.
    ///
    /// Will be signaled once the frame backbuffer is available.
    Semaphore(&'a B::Semaphore),

    /// Fence used for synchronization.
    ///
    /// Will be signaled once the frame backbuffer is available.
    Fence(&'a B::Fence),
}

CommandBuffer (key data structure)

/// A strongly-typed command buffer that will only implement methods that are valid for the operations
/// it supports.
pub struct CommandBuffer<'a, B: Backend, C, S: Shot = OneShot, L: Level = Primary> {
    pub(crate) raw: &'a mut B::CommandBuffer,
    pub(crate) _marker: PhantomData<(C, S, L)>
}
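The `PhantomData<(C, S, L)>` marker is what makes the buffer "strongly typed": the capability lives only in the type, and methods are implemented only for the capabilities that allow them. Below is a reduced sketch of that idea with a single marker parameter. `TypedCommandBuffer`, `Graphics`, `Compute`, `draw`, and `dispatch` here are my own toy names, not the gfx-hal definitions.

```rust
use std::marker::PhantomData;

/// Capability markers: zero-sized types that exist only at the type level.
struct Graphics;
struct Compute;

/// Toy command buffer tagged with a capability `C`.
struct TypedCommandBuffer<C> {
    log: Vec<String>,
    _marker: PhantomData<C>,
}

impl<C> TypedCommandBuffer<C> {
    fn new() -> Self {
        TypedCommandBuffer { log: Vec::new(), _marker: PhantomData }
    }

    /// Available for every capability.
    fn finish(self) -> Vec<String> {
        self.log
    }
}

impl TypedCommandBuffer<Graphics> {
    /// Only a graphics buffer can draw.
    fn draw(&mut self, vertices: u32) {
        self.log.push(format!("draw {}", vertices));
    }
}

impl TypedCommandBuffer<Compute> {
    /// Only a compute buffer can dispatch.
    fn dispatch(&mut self, groups: u32) {
        self.log.push(format!("dispatch {}", groups));
    }
}

fn main() {
    let mut graphics = TypedCommandBuffer::<Graphics>::new();
    graphics.draw(6);
    // graphics.dispatch(1); // does not compile: no `dispatch` on a Graphics buffer

    let mut compute = TypedCommandBuffer::<Compute>::new();
    compute.dispatch(4);

    println!("{:?} {:?}", graphics.finish(), compute.finish());
}
```

The marker costs nothing at runtime, and an invalid call is rejected by the compiler instead of being reported by the driver.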

Submit

/// Thread-safe finished command buffer for submission.
pub struct Submit<B: Backend, C, S, L>(pub(crate) B::CommandBuffer, pub(crate) PhantomData<(C, S, L)>);
impl<B: Backend, C, S, L> Submit<B, C, S, L> {
    fn new(buffer: B::CommandBuffer) -> Self {
        Submit(buffer, PhantomData)
    }
}
unsafe impl<B: Backend, C, S, L> Send for Submit<B, C, S, L> {}

Submission

/// Submission information for a command queue, generic over a particular
/// backend and a particular queue type.
pub struct Submission<'a, B: Backend, C> {
    cmd_buffers: SmallVec<[Cow<'a, B::CommandBuffer>; 16]>,
    wait_semaphores: SmallVec<[(&'a B::Semaphore, pso::PipelineStage); 16]>,
    signal_semaphores: SmallVec<[&'a B::Semaphore; 16]>,
    marker: PhantomData<C>,
}

/////////////////////////////// submit interface /////////////////////////////////
/// Append a new list of finished command buffers to this submission.
///
/// All submits for this call must be of the same type.
/// Submission will be automatically promoted to to the minimum required capability
/// to hold all passed submits.
pub fn submit<I, K>(mut self, submits: I) -> Submission<'a, B, <(C, K) as Upper>::Result>
where
    I: IntoIterator,
    I::Item: Submittable<'a, B, K, Primary>,
    (C, K): Upper
{
    self.cmd_buffers.extend(submits.into_iter().map(
        |s| { unsafe { s.into_buffer() } }
    ));
    Submission {
        cmd_buffers: self.cmd_buffers,
        wait_semaphores: self.wait_semaphores,
        signal_semaphores: self.signal_semaphores,
        marker: PhantomData,
    }
}

Discussion of CommandBuffer reuse, performance, and related issues in Vulkan and Metal

To be updated continuously, based on practical experience.

CommandBuffer reuse

Once a Metal CommandBuffer has been committed to the Queue, it cannot be used again. A Vulkan CommandBuffer can be submitted multiple times.

After a command buffer has been committed for execution, the only valid operations on the command buffer are to wait for it to be scheduled or completed (using synchronous calls or handler blocks) and to check the status of the command buffer execution. When used, scheduled and completed handlers are blocks that are invoked in execution order. These handlers should perform quickly; if expensive or blocking work needs to be scheduled, defer that work to another thread.

In a multithreaded app, it’s advisable to break your overall task into subtasks that can be encoded separately. Create a command buffer for each chunk of work, then call the enqueue() method on these command buffer objects to establish the order of execution. Fill each buffer object (using multiple threads) and commit them. The command queue automatically schedules and executes these command buffers as they become available.

developer.apple.com/documentati…
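Rust's ownership rules happen to express this difference neatly, so here is a toy sketch of it. `MetalStyleBuffer` and `VulkanStyleBuffer` are hypothetical types and the queue is just a `Vec`. In real Vulkan, resubmitting also assumes the buffer was not begun with VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, which this model does not represent.

```rust
/// Toy Metal-style buffer: `commit` takes `self`, so the value is gone afterwards.
struct MetalStyleBuffer {
    commands: Vec<&'static str>,
}

impl MetalStyleBuffer {
    fn commit(self, queue: &mut Vec<&'static str>) {
        queue.extend(self.commands);
    }
}

/// Toy Vulkan-style buffer: `submit` only borrows, so the same recording
/// can be queued again. (Assumes it was not recorded as one-time-submit.)
struct VulkanStyleBuffer {
    commands: Vec<&'static str>,
}

impl VulkanStyleBuffer {
    fn submit(&self, queue: &mut Vec<&'static str>) {
        queue.extend(self.commands.iter().copied());
    }
}

fn main() {
    let mut queue = Vec::new();

    let vulkan = VulkanStyleBuffer { commands: vec!["draw"] };
    vulkan.submit(&mut queue);
    vulkan.submit(&mut queue); // fine: the buffer is still usable

    let metal = MetalStyleBuffer { commands: vec!["blit"] };
    metal.commit(&mut queue);
    // metal.commit(&mut queue); // does not compile: `metal` was moved by the first commit

    println!("{:?}", queue);
}
```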

Difference in the function names for submitting to the queue

Metal and Vulkan use different words for submitting a CommandBuffer to the Queue: Metal uses commit(), Vulkan uses submit().
