WebGL Fundamentals

WebGL is often thought of as a 3D API. People think "I'll use WebGL and some magic and I'll get cool 3D." In reality WebGL is just a rasterization engine: it draws points, lines, and triangles based on the code you supply. Getting WebGL to do anything more complicated is up to you, and depends on you providing code that builds what you want out of points, lines, and triangles.

WebGL runs on your computer's GPU, so you have to provide code that can run on the GPU. That code comes in pairs of functions. One of each pair is called a vertex shader, the other a fragment shader, and both are written in GLSL (GL Shader Language), a strictly typed language similar to C or C++. Each pair together is called a program (shader program).

The vertex shader's job is to compute vertex positions. From the positions it outputs, WebGL can rasterize various kinds of primitives, including points, lines, and triangles. When rasterizing these primitives it calls the second function, the fragment shader, whose job is to compute a color for each pixel of the primitive currently being drawn.

Nearly all of the WebGL API is about setting up state for these pairs of functions and then running them. For each thing you want to draw, you set up a bunch of state, then execute a shader pair on the GPU by calling gl.drawArrays or gl.drawElements.

Any data those functions need has to be provided to the GPU. There are four ways a shader can receive data:

* Attributes and buffers. Buffers are sequences of binary data uploaded to the GPU. Usually buffers contain things like positions, normals, texture coordinates, and vertex colors, although you can put anything in them. Attributes specify how to pull data out of a buffer and supply it to the vertex shader. For example, you might store positions in a buffer as three 32-bit floats each. For a particular attribute you tell WebGL which buffer to pull the data from, what type of data to pull (three 32-bit floats), the byte offset at which the data starts, and how many bytes to advance to get to the next position. Buffers are not random access. Instead, the vertex shader is run a specified number of times, and each time it runs the attribute pulls the next value out of its buffer according to those rules.
* Uniforms. Uniforms are effectively global values you set before you execute the shader program; they stay the same for the whole draw.
* Textures. Textures are arrays of data the shader can read at random while it runs. Most commonly they contain image data, but a texture is just data, so it can just as well hold something other than colors.
* Varyings. Varyings are a way for the vertex shader to pass values to the fragment shader. Depending on whether the primitive being rendered is points, lines, or triangles, the values a vertex shader writes to a varying are interpolated differently by the time the fragment shader runs.

WebGL only cares about two things: clip-space coordinates and colors. Your job, using WebGL, is to provide those two things, via the two shaders: a vertex shader that provides clip-space coordinates and a fragment shader that provides colors. Clip-space coordinates always range from -1 to +1 no matter what size your canvas is.

Here is a small example showing WebGL in its simplest form. Let's start with the vertex shader:

```glsl
// an attribute, which will receive data from a buffer
attribute vec4 a_position;

// all shaders have a main function
void main() {
  // gl_Position is the special variable a vertex shader is responsible for setting
  gl_Position = a_position;
}
```

If this were written in JavaScript instead of GLSL, you could imagine that when it ran it did something like this:

// *** PSEUDO CODE!!
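To make the attribute-and-buffer description above concrete, here is a hedged sketch of the JavaScript/TypeScript side that would actually feed that vertex shader: it uploads the shader text, fills a buffer with three clip-space positions, tells the attribute how to read them, and draws. The canvas id, the triangle coordinates, and the fragment color are my own choices (and error checking is omitted); this is an illustration, not code from the text.

```ts
// Minimal WebGL "hello triangle", assuming an HTML page containing <canvas id="c">.
const canvas = document.getElementById("c") as HTMLCanvasElement;
const gl = canvas.getContext("webgl")!;

const vertexSource = `
  attribute vec4 a_position;
  void main() {
    gl_Position = a_position;
  }`;
const fragmentSource = `
  precision mediump float;
  void main() {
    gl_FragColor = vec4(1.0, 0.0, 0.5, 1.0); // every pixel of the primitive gets this color
  }`;

function compile(type: number, source: string): WebGLShader {
  const shader = gl.createShader(type)!;
  gl.shaderSource(shader, source); // upload the program text...
  gl.compileShader(shader);        // ...and let the driver compile it at runtime
  return shader;
}

const program = gl.createProgram()!;
gl.attachShader(program, compile(gl.VERTEX_SHADER, vertexSource));
gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fragmentSource));
gl.linkProgram(program);

// A buffer: raw bytes on the GPU. Here, three 2D positions in clip space (-1..+1).
const positionBuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, positionBuffer);
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([0, 0, 0, 0.5, 0.7, 0]), gl.STATIC_DRAW);

// The attribute: how to pull data out of the bound buffer on each vertex shader run.
const loc = gl.getAttribLocation(program, "a_position");
gl.enableVertexAttribArray(loc);
gl.vertexAttribPointer(loc, 2, gl.FLOAT, false, 0, 0); // 2 floats per vertex, tightly packed

gl.useProgram(program);
gl.drawArrays(gl.TRIANGLES, 0, 3); // run the vertex shader 3 times, rasterize, run the fragment shader
```

Almost all of this is the state-setting the text describes; only the final call actually runs the shader pair.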
I want to talk about WebGPU
WebGPU is the new WebGL. That means it is the new way to draw 3D in web browsers. It is, in my opinion, very good actually. It is so good I think it will also replace Canvas and become the new way to draw 2D in web browsers. In fact it is so good I think it will replace Vulkan as well as normal OpenGL, and become just the standard way to draw, in any kind of software, from any programming language. This is pretty exciting to me. WebGPU is a little bit irritating— but only a little bit, and it is massively less irritating than any of the things it replaces. WebGPU goes live… today, actually. Chrome 113 shipped in the final minutes of me finishing this post and should be available in the "About Chrome" dialog right this second. If you click here [https://data.runhello.com/j/wgpu/1/], and you see a rainbow triangle, your web browser has WebGPU. By the end of the year WebGPU will be everywhere, in every browser. (All of this refers to desktop computers. On phones, it won't be in Chrome until later this year; and Apple I don't know. Maybe one additional year after that.) If you are not a programmer, this probably doesn't affect you. It might get us closer to a world where you can just play games in your web browser as a normal thing like you used to be able to with Flash. But probably not because WebGL wasn't the only problem there. If you are a programmer, let me tell you what I think this means for you. Sections below: * A history of graphics APIs (You can skip this) * What's it like? * How do I use it? * Typescript / NPM world * I don't know what a NPM is I Just wanna write CSS and my stupid little script tags * Rust / C++ / Posthuman Intersecting Tetrahedron ---------------------------------------- A HISTORY OF GRAPHICS APIS (YOU CAN SKIP THIS) Yo! Yogi [https://staging.cohostcdn.org/attachment/823fc920-3542-45ad-a90d-a59e8d486631/1991.jpg] 1991 Back in the dawn of time there were two ways to make 3D on a computer: You did a bunch of math; or you bought an SGI machine. SGI were the first people who were designing circuitry to do the rendering parts of a 3D engine for you. They had this C API for describing your 3D models to the hardware. At some point it became clear that people were going to start making plugin cards for regular desktop computers that could do the same acceleration as SGI's big UNIX boxes, so SGI released a public version of their API so it would be possible to write code that would work both on the UNIX boxes and on the hypothetical future PC cards. This was OpenGL. `color()` and `rectf()` in IRIS GL became `glColor()` and `glRectf()` in OpenGL. "Waterfalls" by TLC [https://staging.cohostcdn.org/attachment/88a68e08-9a4f-48c9-9f21-92ef3a64ca92/1995-min.jpg] 1995 When the PC 3D cards actually became a real thing you could buy, things got real messy for a bit. Instead of signing on with OpenGL Microsoft had decided to develop their own thing (Direct3D) and some of the 3D card vendors also developed their own API standards, so for a while certain games were only accelerated on certain graphics cards and people writing games had to write their 3D pipelines like four times, once as a software renderer and a separate one for each card type they wanted to support. My perception is it was Direct3D, not OpenGL, which eventually managed to wrangle all of this into a standard, which really sucked if you were using a non-Microsoft OS at the time. 
It really seemed like DirectX (and the "X Box" standalone console it spawned) were an attempt to lock game companies into Microsoft OSes by getting them to wire Microsoft exclusivity into their code at the lowest level, and for a while it really worked. Shrek [https://staging.cohostcdn.org/attachment/fde78cb9-8aab-47d4-9c5c-2ebebeb04e3f/2000-min.jpg] 2000 It is the case though it wasn't very long into the Direct3D lifecycle before you started hearing from Direct3D users that it was much, much nicer to use than OpenGL, and OpenGL quickly got to a point where it was literally years behind Direct3D in terms of implementing critical early features like shaders, because the Architecture Review Board of card vendors that defined OpenGL would spend forever bickering over details whereas Microsoft could just implement stuff and expect the card vendor to work it out. Let's talk about shaders. The original OpenGL was a "fixed function renderer", meaning someone had written down the steps in a 3D renderer and it performed those steps in order. [API] → Primitive Processing → (1) Transform and Lighting → Primitive Assembly → Rasterizer → (2) Texture Environment → (2) Color sum → (2) Fog → (2) Alpha Test → Depth/Stencil → Color-buffer Blend → Dither → [Frame Buffer] [https://staging.cohostcdn.org/attachment/dc77a277-6ca9-4d8a-bdf9-e96d6a89982c/FIXED-FUNCTION-FIXED.png] Modified Khronos Group image Each box in the "pipeline" had some dials on the side so you could configure how each feature behaved, but you were pretty much limited to the features the card vendor gave you. If you had shadows, or fog, it was because OpenGL or an extension had exposed a feature for drawing shadows or fog. What if you want some other feature the ARB didn't think of, or want to do shadows or fog in a unique way that makes your game look different from other games? Sucks to be you. This was obnoxious, so eventually "programmable shaders" were introduced. Notice some of the boxes above are yellow? Those boxes became replaceable. The (1) boxes got collapsed into the "Vertex Shader", and the (2) boxes became the "Fragment Shader"². The software would upload a computer program in a simple C-like language (upload the actual text of the program, you weren't expected to compile it like a normal program)³ into the video driver at runtime, and the driver would convert that into configurations of ALUs (or whatever the card was actually doing on the inside) and your program would become that chunk of the pipeline. This opened things up a lot, but more importantly it set card design on a kinda strange path. Suddenly video cards weren't specialized rendering tools anymore. They ran software. Time Magazine, "What kind of President would John Kerry be?" [https://staging.cohostcdn.org/attachment/e51a906b-af08-41a6-87bd-b1bbfbaf1455/2004.jpg] 2004 Pretty shortly after this was another change. Handheld devices were starting to get to the point it made sense to do 3D rendering on them (or at least, to do 2D compositing using 3D video card hardware like desktop machines had started doing). DirectX was never in the running for these applications. But implementing OpenGL on mid-00s mobile silicon was rough. OpenGL was kind of… large, at this point. 
It had all these leftover functions from the SGI IRIX era, and then it had this new shiny OpenGL 2.0 way of doing things with the shaders and everything and not only did this mean you basically had two unrelated APIs sitting side by side in the same API, but also a lot of the OpenGL 1.x features were traps. The spec said that every video card had to support every OpenGL feature, but it didn't say it had to support them in Hardware, so there were certain early-90s features that 00s card vendors had decided nobody really uses, and so if you used those features the driver would render the screen, copy the entire screen into regular RAM, perform the feature on the CPU and then copy the results back to the video card. Accidentally activating one of these trap features could easily move you from 60 FPS to 1 FPS. All this legacy baggage promised a lot of extra work for the manufacturers of the new mobile GPUs, so to make it easier Khronos (which is what the ARB had become by this point) introduced an OpenGL "ES", which stripped out everything except the features you absolutely needed. Instead of being able to call a function for each polygon or each vertex you had to use the newer API of giving OpenGL a list of coordinates in a block in memory⁴, you had to use either the fixed function or the shader pipeline with no mixing (depending on whether you were using ES 1.x or ES 2.x), etc. This partially made things simpler for programmers, and partially prompted some annoying rewrites. But as with shaders, what's most important is the long-term strange-ing this change presaged: Starting at this point, the decisions of Khronos increasingly were driven entirely by the needs and wants of hardware manufacturers, not programmers. The Apple iPhone [https://staging.cohostcdn.org/attachment/a48dd6a7-3d2e-4a17-bd0f-6950ce16523f/2008-min.jpg] 2008 With OpenGL ES devices in the world, OpenGL started to graduate from being "that other graphics API that exists, I guess" and actually take off. The iPhone, which used OpenGL ES, gave a solid mass-market reason to learn and use OpenGL. Nintendo consoles started to use OpenGL or something like it. OpenGL had more or less caught up with DirectX in features, especially if you were willing to use extensions. Browser vendors, in that spurt of weird hubris that gave us the original WebAudio API, adapted OpenGL ES into JavaScript as "WebGL", which makes no sense because as mentioned OpenGL ES was all about packing bytes into arrays full of geometry and JavaScript doesn't have direct memory access or even integers, but they added packed binary arrays to the language [https://web.dev/webgl-typed-arrays/#history-of-typed-arrays] and did it anyway. So with all this activity, sounds like things are going great, right? Steven Universe [https://staging.cohostcdn.org/attachment/ca876a6c-6753-4de1-bc29-703b0c82bab1/2013.webp] 2013 No! Everything was terrible! As it matured, OpenGL fractured into a variety of slightly different standards with varying degrees of cross-compatibility. OpenGL ES 2.0 was the same as OpenGL 3.3, somehow. WebGL 2.0 is very almost OpenGL ES 3.0 but not quite. Every attempt to resolve OpenGL's remaining early mistakes seemed to wind up duplicating the entire API as new functions with slightly different names and slightly different signatures. 
A big usability issue with OpenGL was even after the 2.0 rework it had a lot of shared global state, but the add-on systems that were supposed to resolve this (VAOs and VBOs) only wound up being even more global state you had to keep track of. A big trend in the 10s was "GPGPU" (General Purpose GPU); programmers started to realize that graphics cards worked as well as, but were slightly easier to program than, a CPU's vector units, so they just started accelerating random non-graphics programs by doing horrible hacks like stuffing them in pixel shaders and reading back a texture containing an encoded result. Before finally resolving on compute shaders (in other words: before giving up and copying DirectX's solution), Khronos's original steps toward actually catering to this were either poorly adopted (OpenCL) or just plain bad ideas (geometry shaders). It all built up. Just like in the pre-ES era, OpenGL had basically become several unrelated APIs sitting in the same header file, some of which only worked on some machines. Worse, nothing worked quite as well as you wanted it to; different video card vendors botched the complexity, implementing features slightly differently (especially tragically, implementing slightly different versions of the shader language) or just badly, especially in the infamously bad Windows OpenGL drivers. The way out came from, this is how I see it anyway, a short-lived idea called "AZDO [https://www.youtube.com/watch?v=GiDsLRQg_g4]". This technically consisted of a single GDC talk⁵, and I have no reason to believe the GDC talk originated the idea, but what the talk did do is give a name to the idea that underlies Vulkan, DirectX 12, and Metal. "Approaching Zero Driver Overhead". Here is the idea: By 2015 video cards had pretty much standardized on a particular way of working and that way was known and that way wasn't expected to change for ten years at least. Graphics APIs were originally designed around the functionality they exposed, but that functionality hadn't been a 1:1 map to how GPUs look on the inside for ten years at least. Drivers had become complex beasts that rather than just doing what you told them tried to intuit what you were trying to do and then do that in the most optimized way, but often they guessed wrong, leaving software authors in the ugly position of trying to intuit what the driver would intuit in any one scenario. AZDO was about threading your way through the needle of the graphics API in such a way your function calls happened to align precisely with what the hardware was actually doing, such that the driver had nothing to do and stuff just happened. Star Wars: The Force Awakens [https://staging.cohostcdn.org/attachment/b735d6be-49e1-4b9b-b446-aa23aab526f2/tfa-alt.jpg] 2016 Or we could just design the graphics API to be AZDO from the start. That's Vulkan. (And DirectX 12, and Metal.) The modern generation of graphics APIs are about basically throwing out the driver, or rather, letting your program be the driver. The API primitives map directly to GPU internal functionality⁶, and the GPU does what you ask without second guessing. This gives you an incredible amount of power and control. Remember that "pipeline" diagram up top? The modern APIs let you define "pipeline objects"; while graphics shaders let you replace boxes within the diagram, and compute shaders let you replace the diagram with one big shader program, pipeline objects let you draw your own diagram. 
You decide what blocks of GPU memory are the sources, and which are the destinations, and how they are interpreted, and what the GPU does with them, and what shaders get called. All the old sources of confusion get resolved. State is bound up in neatly defined objects instead of being global. Card vendors always designed their shader compilers different, so we'll replace the textual shader language with a bytecode format that's unambiguous to implement and easier to write compilers for. Vulkan goes so far as to allow⁷ you to write your own allocator/deallocator for GPU memory. So this is all very cool. There is only one problem, which is that with all this fine-grained complexity, Vulkan winds up being basically impossible for humans to write. Actually, that's not really fair. DX12 and Metal offer more or less the same degree of fine-grained complexity, and by all accounts they're not so bad to write. The actual problem is that Vulkan is not designed for humans to write. Literally. Khronos does not want you to write Vulkan, or rather, they don't want you to write it directly. I was in the room when Vulkan was announced, across the street from GDC in 2015, and what they explained to our faces was that game developers were increasingly not actually targeting the gaming API itself, but rather targeting high-level middleware, Unity or Unreal or whatever, and so Vulkan was an API designed for writing middleware. The middleware developers were also in the room at the time, the Unity and Epic and Valve guys. They were beaming as the Khronos guy explained this. Their lives were about to get much, much easier. My life was about to get harder. Vulkan is weird— but it's weird in a way that makes a certain sort of horrifying machine sense. Every Vulkan call involves passing in one or two huge structures which are themselves a forest of other huge structures, and every structure and sub-structure begins with a little protocol header explaining what it is and how big it is. Before you allocate memory you have to fill out a structure to get back a structure that tells you what structure you're supposed to structure your memory allocation request in. None of it makes any sense— unless you've designed a programming language before, in which case everything you're reading jumps out to you as "oh, this is contrived like this because it's designed to be easy to bind to from languages with weird memory-management techniques" "this is a way of designing a forward-compatible ABI while making no assumptions about programming language" etc. The docs are written in a sort of alien English that fosters no understanding— but it's also written exactly the way a hardware implementor would want in order to remove all ambiguity about what a function call does. In short, Vulkan is not for you. It is a byzantine contract between hardware manufacturers and middleware providers, and people like… well, me, are just not part of the transaction. Khronos did not forget about you and me. They just made a judgement, and this actually does make a sort of sense, that they were never going to design the perfectly ergonomic developer API anyway, so it would be better to not even try and instead make it as easy as possible for the perfectly ergonomic API to be written on top, as a library. Khronos thought within a few years of Vulkan⁸ being released there would be a bunch of high-quality open source wrapper libraries that people would use instead of Vulkan directly. These libraries basically did not materialize. 
It turns out writing software is work and open source projects do not materialize just because people would like them to⁹. Star Wars: The Rise of Skywalker [https://staging.cohostcdn.org/attachment/4ce3c81e-f2d3-479e-aa25-1c60e89a1e65/tros-alt.jpg] 2019 This leads us to the other problem, the one Vulkan developed after the fact. The Apple problem. The theory on Vulkan was it would change the balance of power where Microsoft continually released a high-quality cutting-edge graphics API and OpenGL was the sloppy open-source catch up. Instead, the GPU vendors themselves would provide the API, and Vulkan would be the universal standard while DirectX would be reduced to a platform-specific oddity. But then Apple said no. Apple (who had already launched their own thing, Metal) announced not only would they never support Vulkan, they would not support OpenGL, anymore¹⁰. From my perspective, this is just DirectX again; the dominant OS vendor of our era, as Microsoft was in the 90s, is pushing proprietary graphics tech to foster developer lock-in. But from Apple's perspective it probably looks like— well, the way DirectX probably looked from Microsoft's perspective in the 90s. They're ignoring the jagged-metal thing from the hardware vendors and shipping something their developers will actually want to use. With Apple out, the scene looked different. Suddenly there was a next-gen API for Windows, a next-gen API for Mac/iPhone, and a next-gen API for Linux/Android. Except Linux has a severe driver problem with Vulkan and a lot of the Linux devices I've been checking out don't support Vulkan even now after it's been out seven years. So really the only platform where Vulkan runs natively is Android. This isn't that bad. Vulkan does work on Windows and there are mostly no problems, though people who have the resources to write a DX12 backend seem to prefer doing so. The entire point of these APIs is that they're flyweight things resting very lightly on top of the hardware layer, which means they aren't really that different, to the extent that a Vulkan-on-Metal emulation layer named MoltenVK exists and reportedly adds almost no overhead. But if you're an open source kind of person who doesn't have the resources to pay three separate people to write vaguely-similar platform backends, this isn't great. Your code can technically run on all platforms, but you're writing in the least pleasant of the three APIs to work with and you get the advantage of using a true-native API on neither of the two major platforms. You might even have an easier time just writing DX12 and Metal and forgetting Vulkan (and Android) altogether. In short, Vulkan solves all of OpenGL's problems at the cost of making something that no one wants to use and no one has a reason to use. The way out turned out to be something called ANGLE. Let me back up a bit. Super Meat Boy [https://staging.cohostcdn.org/attachment/d7c61730-2d91-44ae-a57d-2efa441e4ba2/SuperMeatBoy_cover.png] 2010, again WebGL was designed around OpenGL ES. But it was never exactly the same as OpenGL ES, and also technically OpenGL ES never really ran on desktops, and also regular OpenGL on desktops had Problems. So the browser people eventually realized that if you wanted to ship an OpenGL compatibility layer on Windows, it was actually easier to write an OpenGL emulator in DirectX than it was to use OpenGL directly and have to negotiate the various incompatibilities between OpenGL implementations of different video card drivers. 
The browser people also realized that if slight compatibility differences between different OpenGL drivers was hell, slight incompatibility differences between four different browsers times three OSes times different graphics card drivers would be the worst thing ever. From what I can only assume was desperation, the most successful example I've ever seen of true cross-company open source collaboration emerged: ANGLE, a BSD-licensed OpenGL emulator originally written by Google but with honest-to-goodness contributions from both Firefox and Apple, which is used for WebGL support in literally every web browser. But nobody actually wants to use WebGL, right? We want a "modern" API, one of those AZDO thingies. So a W3C working group sat down to make Web Vulkan, which they named WebGPU. I'm not sure my perception of events is to be trusted, but my perception of how this went from afar was that Apple was the most demanding participant in the working group, and also the participant everyone would naturally by this point be most afraid of just spiking the entire endeavor, so reportedly Apple just got absolutely everything they asked for and WebGPU really looks a lot like Metal. But Metal was always reportedly the nicest of the three modern graphics APIs to use, so that's… good? Encouraged by the success with ANGLE (which by this point was starting to see use as a standalone library in non-web apps¹¹), and mindful people would want to use this new API with WebASM [https://webassembly.org/], they took the step of defining the standard simultaneously as a JavaScript IDL and a C header file, so non-browser apps could use it as a library. WGPU [https://staging.cohostcdn.org/attachment/7fcaa405-7420-4db7-bc18-8204dd47cef0/logo.min.svg] 2023 WebGPU is the child of ANGLE and Metal. WebGPU is the missing open-source "ergonomic layer" for Vulkan. WebGPU is in the web browser, and Microsoft and Apple are on the browser standards committee, so they're "bought in", not only does WebGPU work good-as-native on their platforms but anything WebGPU can do will remain perpetually feasible on their OSes regardless of future developer lock-in efforts. (You don't have to worry about feature drift like we're already seeing with MoltenVK.) WebGPU will be on day one (today) available with perfectly equal compatibility for JavaScript/TypeScript (because it was designed for JavaScript in the first place), for C++ (because the Chrome implementation is in C, and it's open source) and for Rust (because the Firefox implementation is in Rust, and it's open source). I feel like WebGPU is what I've been waiting for this entire time. ---------------------------------------- WHAT'S IT LIKE? I can't compare to DirectX or Metal, as I've personally used neither. But especially compared to OpenGL and Vulkan, I find WebGPU really refreshing to use. I have tried, really tried, to write Vulkan, and been defeated by the complexity each time. By contrast WebGPU does a good job of adding complexity only when the complexity adds something. There are a lot of different objects to keep track of, especially during initialization (see below), but every object represents some Real Thing that I don't think you could eliminate from the API without taking away a useful ability. (And there is at least the nice property that you can stuff all the complexity into init time and make the process of actually drawing a frame very terse.) 
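As a hedged illustration of that last point (not code from the post): assuming init has already produced a `device`, a configured canvas `context`, and a `pipeline` whose vertex shader doesn't need any vertex buffers, drawing a frame can be this short.

```ts
// One frame, assuming `device: GPUDevice`, `context: GPUCanvasContext`,
// and `pipeline: GPURenderPipeline` all came out of the init-time boilerplate.
function frame() {
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginRenderPass({
    colorAttachments: [{
      view: context.getCurrentTexture().createView(), // this frame's canvas texture
      loadOp: "clear",
      clearValue: { r: 0, g: 0, b: 0, a: 1 },
      storeOp: "store",
    }],
  });
  pass.setPipeline(pipeline);
  pass.draw(3); // three vertices in, one triangle out
  pass.end();
  device.queue.submit([encoder.finish()]);
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
```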
WebGPU caters to the kind of person who thinks it might be fun to write their own raymarcher, without requiring every programmer to be the kind of person who thinks it would be fun to write their own implementation of malloc. THE PROBLEMS There are three Problems. I will summarize them thusly: * Text * Lines * The Abomination Text and lines are basically the same problem. WebGPU kind of doesn't… have them. It can draw lines, but they're only really for debugging– single-pixel width and you don't have control over antialiasing. So if you want a "normal looking" line you're going to be doing some complicated stuff with small bespoke meshes and an SDF shader. Similarly with text, you will be getting no assistance– you will be parsing OTF font files yourself and writing your own MSDF shader, or more likely finding a library that does text for you. This (no lines or text unless you implement it yourself) is a totally normal situation for a low-level graphics API, but it's a little annoying to me because the web browser already has a sophisticated anti-aliased line renderer (the original Canvas API) and the most advanced text renderer in the world. (There is some way to render text into a Canvas API texture and then transfer the Canvas contents into WebGPU as a texture, which should help for some purposes.) Then there's WGSL, or as I think of it, The Abomination. You will probably not be as annoyed by this as I am. Basically: One of the benefits of Vulkan is that you aren't required to use a particular shader language. OpenGL uses GLSL, DirectX uses HLSL. Vulkan used a bytecode, called SPIR-V, so you could target it from any shader language you wanted. WebGPU was going to use SPIR-V, but then Apple said no¹². So now WebGPU uses WGSL, a new thing developed just for WebGPU, as its only shader language. As far as shader languages go, it is fine. Maybe it is even good. I'm sure it's better than GLSL. For pure JavaScript users, it's probably objectively an improvement to be able to upload shaders as text files instead of having to compile to bytecode. But gosh, it would have been nice to have that choice! (The "desktop" versions of WebGPU still keep SPIR-V as an option.) ---------------------------------------- HOW DO I USE IT? You have three choices for using WebGPU: Use it in JavaScript in the browser, use it in Rust/C++ in WebASM inside the browser, or use it in Rust/C++ in a standalone app. The Rust/C++ APIs are as close to the JavaScript version as language differences will allow; the in-browser/out-of-browser APIs for Rust and C++ are identical (except for standalone-specific features like SPIR-V). In standalone apps you embed the WebGPU components from Chrome or Firefox as a library; your code doesn't need to know if the WebGPU library is a real library or if it's just routing through your calls to the browser. Regardless of language, the official WebGPU spec document [https://www.w3.org/TR/webgpu/] on w3.org is a clear, readable reference guide to the language, suitable for just reading in a way standard specifications sometimes aren't. (I haven't spent as much time looking at the WGSL spec [https://www.w3.org/TR/WGSL/] but it seems about the same.) If you get lost while writing WebGPU, I really do recommend checking the spec. Most of the "work" in WebGPU, other than writing shaders, consists of the construction (when your program/scene first boots) of one or more "pipeline" objects, one per "pass", which describe "what shaders am I running, and what kind of data can get fed into them?"¹³. 
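For concreteness, here is a hedged sketch of what building one of those pipeline objects can look like, with the WGSL uploaded as plain text. The shader itself, the entry point names, and the hard-coded triangle are mine, not from the post; it pairs with the frame-drawing sketch earlier, since this pipeline needs no vertex buffers.

```ts
// Init-time sketch: compile a WGSL shader module and build one render pipeline.
// Assumes `device: GPUDevice` and a canvas `format` such as
// navigator.gpu.getPreferredCanvasFormat().
const shaderModule = device.createShaderModule({
  code: /* wgsl */ `
    @vertex
    fn vs_main(@builtin(vertex_index) i: u32) -> @builtin(position) vec4f {
      // Hard-coded clip-space triangle, so this pipeline needs no vertex buffers.
      var positions = array<vec2f, 3>(
        vec2f(0.0, 0.5), vec2f(-0.5, -0.5), vec2f(0.5, -0.5));
      return vec4f(positions[i], 0.0, 1.0);
    }

    @fragment
    fn fs_main() -> @location(0) vec4f {
      return vec4f(1.0, 0.0, 0.5, 1.0);
    }
  `,
});

const pipeline = device.createRenderPipeline({
  layout: "auto", // derive bind group layouts from the shader
  vertex:   { module: shaderModule, entryPoint: "vs_main" },
  fragment: { module: shaderModule, entryPoint: "fs_main", targets: [{ format }] },
  primitive: { topology: "triangle-list" },
});
```

The descriptor keys are where the "what can get fed into this" part lives: if the vertex shader read per-vertex attributes, their buffer layouts would be declared under `vertex.buffers`.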
You can chain pipelines end-to-end within a queue: have a compute pass generate a vertex buffer, have a render pass render into a texture, do a final render pass which renders the computed vertices with the rendered texture.

Here, in diagram form, are all the things you need to create to initially set up WebGPU and then draw a frame. This might look a little overwhelming. Don't worry about it! In practice you're just going to be copying and pasting a big block of boilerplate from some sample code. However at some point you're going to need to go back and change that copypasted boilerplate, and then you'll want to come back and look up what the difference between any of these objects is.

At init:

* Context: One <canvas> or window. Exists at boot.
* WebGPU instance: navigator.gpu. Exists at boot.
* Adapter: If there's more than one video card, you can pick one. Feed this to Canvas Configuration. Vends a Device.
* Canvas Configuration: You make this. Feed to Context.
* Queue: Executes work batches in order. You'll use this later.
* Device: An open connection to the adapter. Vends a Queue. Gives color format to the Canvas Configuration. Vends Buffers, Textures, and Pipelines and compiles code to Shaders.
* Buffer: A chunk of GPU memory. You'll use this later.
* Texture: GPU memory formatted as an image. You'll use this later.
* Shader: Vertex, Fragment, or Compute program. Feed to Pipeline.
* Buffer Layout: Describes how to interpret bytes in a Buffer. Like a C struct definition. Describes a Buffer. Feed to Pipeline.
* Vertex Layout: Buffer layout specialized for meshes/triangle lists. Describes a Buffer. Feed to Pipeline.

[https://staging.cohostcdn.org/attachment/45fea200-d670-4fab-9788-6462930f8eba/wgpu1-2.0.png]

For each frame:

* Step one: Take a Buffer which you wish to update this frame. This will vend a Mapped Range, which is a Typed array that can read/write data from part of a GPU buffer. When you "unmap" the mapped range, the changes are automatically synchronized with the appropriate queue at that moment.
* Step two: Device vends a Command Encoder. Context vends the Current Texture for this frame. Feed this to the Command Encoder and get a Render Pass. (The Command Encoder can also vend Compute Passes.) Feed Viewport and Scissor rects (these are just numbers) to the Render Pass. Feed a Pipeline to the Render Pass. Feed Buffers (uniforms, vertices, indices) to the Render Pass. Feed Textures (inputs to shaders) to the Render Pass. Feed Render Passes and Compute Passes to the Queue.

[https://staging.cohostcdn.org/attachment/0f2e871b-f37b-48cb-847f-d908704d4c39/wgpu2-2.0.png]
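Here is a hedged sketch of step one (the buffer names and the single float value are mine): the simplest path is `queue.writeBuffer`; the mapped-range path from the diagram is the second half, and because a mappable buffer's usage can only be MAP_WRITE plus COPY_SRC, it typically acts as a staging buffer you copy out of.

```ts
// Sketch: getting per-frame data into a GPU buffer. Assumes `device: GPUDevice`
// from init and a module context (for top-level await).
const uniformBuffer = device.createBuffer({
  size: 4, // one 32-bit float
  usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
});

// Easiest path: hand the bytes to the queue and let it schedule the copy.
device.queue.writeBuffer(uniformBuffer, 0, new Float32Array([performance.now() / 1000]));

// Mapped-range path, as in the diagram: map, write through a typed array, unmap.
const staging = device.createBuffer({
  size: 4,
  usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC,
});
await staging.mapAsync(GPUMapMode.WRITE);
new Float32Array(staging.getMappedRange()).set([performance.now() / 1000]);
staging.unmap(); // after unmap, the bytes belong to the GPU side again

const encoder = device.createCommandEncoder();
encoder.copyBufferToBuffer(staging, 0, uniformBuffer, 0, 4);
device.queue.submit([encoder.finish()]);
```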
Some observations in no particular order:

* When describing a "mesh" (a 3D model to draw), a "vertex" buffer is the list of points in space, and the "index" is an optional buffer containing the order in which to draw the points. Not sure if you knew that.
* Right now the "queue" object seems a little pointless because there's only ever one global queue. But someday WebGPU will add threading and then there might be more than one.
* A command encoder can only be working on one pass at a time; you have to mark one pass as complete before you request the next one. But you can make more than one command encoder and submit them all to the queue at once.
* Back in OpenGL when you wanted to set a uniform, attribute, or texture on a shader, you did it by name. In WebGPU you have to assign these things numbers in the shader and you address them by number.¹⁴
* Although textures and buffers are two different things, you can instruct the GPU to just turn a texture into a buffer or vice versa.
* I do not list "pipeline layout" or "bind group layout" objects above because I honestly don't understand what they do. I've only ever set them to default/blank.
* In the Rust API, a "Context" is called a "Surface". I don't know if there's a difference.

Getting a little more platform-specific:

TYPESCRIPT / NPM WORLD

The best way to learn WebGPU for TypeScript I know is Alain Galvin's "Raw WebGPU" tutorial [https://alain.xyz/blog/raw-webgpu]. It is a little friendlier to someone who hasn't used a low-level graphics API before than my sandbag introduction above, and it has a list of further resources at the end.

Since code snippets don't get you something runnable, Alain's tutorial links a completed source repo with the tutorial code, and also I have a sample repo [https://github.com/mcclure/ts-hello/tree/canvas-gpu] which is based on Alain's tutorial code and adds simple animation as well as Preact¹⁵. Both my and Alain's examples use NPM and WebPack¹⁶.

If you don't like TypeScript: I would recommend using TypeScript anyway for WGPU. You don't actually have to add types to anything except your WGPU calls, you can type everything "any". But building that pipeline object involves big trees of descriptors containing other descriptors, and it's all just plain JavaScript dictionaries, which is nice, until you misspell a key, or forget a key, or accidentally pass the GPUPrimitiveState table where it wanted the GPUVertexState table. Your choices are to let TypeScript tell you what errors you made, or be forced to reload over and over watching things break one at a time.

I DON'T KNOW WHAT A NPM IS I JUST WANNA WRITE CSS AND MY STUPID LITTLE SCRIPT TAGS

If you're writing simple JS embedded in web pages rather than joining the NPM hivemind, honestly you might be happier using something like three.js [https://threejs.org/]¹⁷ in the first place, instead of putting up with WebGPU's (relatively speaking) hyper-low-level verbosity. You can include three.js directly in a script tag using existing CDNs [https://cdnjs.com/libraries/three.js] (although I would recommend putting in a subresource SHA hash [https://developer.mozilla.org/en-US/docs/Web/Security/Subresource_Integrity] to protect yourself from the CDN going rogue).

But! If you want to use WebGPU, Alain Galvin's tutorial [https://alain.xyz/blog/raw-webgpu], or renderer.ts from his sample code, still gets you what you want. Just go through and anytime there's a little : GPUBlah wart on a variable delete it and the TypeScript is now JavaScript. And as I've said, the complexity of WebGPU is mostly in pipeline init. So I could imagine writing a single <script> that sets up a pipeline object that is good for various purposes, and then including that script in a bunch of small pages that each import¹⁸ the pipeline, feed some floats into a buffer mapped range, and draw. You could do the whole client page in like ten lines probably.

RUST

So as I've mentioned, one of the most exciting things about WebGPU to me is you can seamlessly cross-compile code that uses it without changes for either a browser or for desktop. The desktop code uses library-ized versions of the actual browser implementations so there is low chance of behavior divergence.
If "include part of a browser in your app" makes you think you're setting up for a code-bloated headache, not in this case; I was able to get my Rust "Hello World" down to 3.3 MB, which isn't much worse than SDL, without even trying. (The browser hello world is like 250k plus a 50k autogenerated loader, again before I've done any serious minification work.) If you want to write WebGPU in Rust¹⁹, I'd recommend checking out this official tutorial from the wgpu project [https://sotrh.github.io/learn-wgpu/], or the examples in the wgpu source repo [https://github.com/gfx-rs/wgpu/tree/trunk/wgpu/examples/]. As of this writing, it's actually a lot easier to use Rust WebGPU on desktop than in browser; the libraries seem to mostly work fine on web, but the Rust-to-wasm build experience is still a bit rough. I did find a pretty good tutorial for wasm-pack here [https://developer.mozilla.org/en-US/docs/WebAssembly/Rust_to_wasm]²⁰. However most Rust-on-web developers seem to use (and love) something called "Trunk [https://trunkrs.dev/]". I haven't used Trunk yet but it replaces wasm-pack as a frontend, and seems to address all the specific frustrations I had with wasm-pack. I do have also a sample Rust repo I made for WebGPU [https://github.com/mcclure/rs-hello], since the examples in the wgpu repo don't come with build scripts. My sample repo is very basic²¹ and is just the "hello-triangle" sample from the wgpu project but with a Cargo.toml added. It does come with working single-line build instructions for web, and when run on desktop with --release it minimizes disk usage. (It also prints an error message when run on web without WebGPU, which the wgpu sample doesn't.) You can see this sample's compiled form running in a browser here [https://data.runhello.com/j/wgpu/2/]. C++ If you're using C++, the library you want to use is called "Dawn". I haven't touched this but there's an excellently detailed-looking Dawn/C++ tutorial/intro here [https://eliemichel.github.io/LearnWebGPU/]. Try that first. POSTHUMAN INTERSECTING TETRAHEDRON I have strange, chaotic daydreams of the future [https://pbs.twimg.com/media/E7Rqz1CXMBYcbOD?format=jpg&name=medium]. There's an experimental project called rust-gpu [https://github.com/EmbarkStudios/rust-gpu] that can compile Rust to SPIR-V. SPIR-V to WGSL compilers already exist, so in principle it should already be possible to write WebGPU shaders in Rust, it's just a matter of writing build tooling that plugs the correct components together. (I do feel, and complained above, that the WGSL requirement creates a roadblock for use of alternate shader languages in dynamic languages, or languages like C++ with a broken or no build system— but Rust is pretty good at complex pre-build processing, so as long as you're not literally constructing shaders on the fly then probably it could make this easy.) I imagine a pure-Rust program where certain functions are tagged as compile-to-shader, and I can share math helper functions between my shaders and my CPU code, or I can quickly toggle certain functions between "run this as a filter before writing to buffer" or "run this as a compute shader" depending on performance considerations and whim. I have an existing project that uses compute shaders and answering the question "would this be faster on the CPU, or in a compute shader?"²² involved writing all my code twice and then writing complex scaffold code to handle switching back and forth. That could have all been automatic. Could I make things even weirder than this? 
I like Rust for low-level engine code, but sometimes I'd prefer to be writing TypeScript for business logic/"game" code. In the browser I can already mix Rust and TypeScript, there's copious example code for that. Could I mix Rust and TypeScript on desktop too? If wgpu is already my graphics engine, I could shove in Servo or QuickJS or something, and write a cross-platform program that runs in browser as TypeScript with wasm-bindgen Rust embedded inside or runs on desktop as Rust with a TypeScript interpreter inside. Most Rust GUI/game libraries work in wasm already, and there's this pure Rust WebAudio implementation [https://github.com/orottier/web-audio-api-rs] (it's currently not a drop-in replacement for wasm-bindgen WebAudio but that could be fixed). I imagine creating a tiny faux-web game engine that is all the benefits of Electron without any of the downsides. Or I could just use Tauri [https://tauri.app/] for the same thing and that would work now without me doing any work at all.

Could I make it weirder than that? WebGPU's spec is available as a machine-parseable WebIDL file; would that make it unusually easy to generate bindings for, say, Lua? If I can compile Rust to WGSL and so write a pure-Rust-including-shaders program, could I compile TypeScript, or AssemblyScript or something, to WGSL and write a pure-TypeScript-including-shaders program? Or if what I care about is not having to write my program in two languages and not so much which language I'm writing, why not go the other way? Write an LLVM backend for WGSL, compile it to native+wasm and write an entire-program-including-shaders in WGSL. If the w3 thinks WGSL is supposed to be so great, then why not?

Okay that's my blog post.

----------------------------------------

¹ 113 or newer

² "Fragment" is OpenGL for "Pixel".

³ I am still trying to figure out whether modern video cards are simply based on the internal architecture of Quake 3.

⁴ And those coordinates HAD to describe triangles, now. Want to draw a rectangle? Fuck you, apparently!

⁵ (And a series of OpenGL techniques and extensions no one seems to have really got the chance to use before OpenGL was sunset.)

⁶ Why is a "push constant" different from a "uniform", in Vulkan/WebGPU? Well, because those are two different things inside of the GPU chip. Why would you use one rather than the other? Well, learn what the GPU chip is doing, and then you'll understand why either of these might be more appropriate in certain situations. Does this sound like a lot of mental overhead? Well, sometimes, but honestly, it's less mental overhead than trying to understand whatever "VAO"s were.

⁷ Require

⁸ By the way, have you noticed the cheesy Star Trek joke yet? The companies with seats on the Khronos board have a combined market capitalization of 6.1 trillion dollars. This is the sense of humor that 6.1 trillion dollars buys you.

⁹ There are decent Vulkan-based OSS game engines, though. LÖVR [https://lovr.org/], the Lua-based game engine I use for my job [https://mermaid.industries/], has a very nice pared-down Lua frontend on top of its Vulkan backend that is usable by beginners but exposes most of the GPU flexibility you actually care about. (The Lua API is also itself a thin wrapper atop a LÖVR-specific C API, and the graphics module is designed to be separable from LÖVR in principle, so if I didn't have WebGPU I'd actually probably be using LÖVR's C frontend even outside Lua now.)
¹⁰ This made OpenGL's fragmentation problem even worse, as the "final" form of OpenGL is basically version 4.4-4.6 somewheres, whereas Apple got to 4.1 and simply stopped. So if you want to release OpenGL software on a Mac, for however longer that's allowed, you are targeting something that is almost, but not quite, the final full-featured version of the API. This sucks! There is some important stuff in 4.3.

¹¹ Microsoft shipped ANGLE in Windows 11 as the OpenGL component of their Android compatibility layer, and ANGLE has also been shipped as the graphics engine in a small number of games such as, uh… [checking Wikipedia] Shovel Knight?! You might see it used more if ANGLE had been designed for library reuse from day one like WebGPU was, or if anyone wanted to use OpenGL.

¹² If I were a cynical, paranoid conspiracy theorist, I would float the theory here that Apple at some point decided they wanted to leave open the capability to sue the other video card developers on the Khronos board, so they are aggressively refusing to let their code touch anything that has touched the Vulkan patent pool to insulate themselves from counter-suits. Or that is what I would say if I were a cynical, paranoid conspiracy theorist. Hypothetically.

¹³ If you pay close attention here you'll notice something weird: Pipelines combine buffer interfaces with specific shaders, so you can use a single pipeline with many different buffers but only one shader or shader pair. What early users of both WebGPU and Vulkan have found is that you wind up needing a lot of pipeline objects in a fair-sized program, and although the pipeline objects themselves are lightweight, creating the pipeline objects can be kind of slow, especially if you have to create more than one of them on a single frame. So this is an identified pain point, having to think ahead to all the pipeline objects you'll need and cache them ahead of time, and Vulkan has already tried to address this by introducing something called "shader objects" like one month ago. Hopefully the WebGPU WG will look into doing something similar in the next revision.

¹⁴ This annoys me, but I've talked to people who like it better, I guess because they had problems with typo'ing their uniform names.

¹⁵ This sample is a little less complete than I hoped to have it by the time I posted this. Known problems as of this second: It comes with a Preact Canvas wrapper that enforces aspect ratio and integer-multiple size requirements for the canvas, but it doesn't have an option to run full screen; there are unnecessary scroll bars that appear if you open the sample in a non-WebGPU browser (and possibly under other circumstances as well); there is an unused file named "canvas2image.ts", which was supposed to be used to let you download the state as a PNG and ought to be either wired up or removed; if you do add canvas2image back in it doesn't work, and I don't know if the problem is at my end or Chrome's [https://bugs.chromium.org/p/chromium/issues/detail?id=1431714&q=&can=4]; the comments refer to some concepts from 2021 WebGPU, like swapchains.

¹⁶ If you don't like WebPack, that implies you know enough about JavaScript you already know how to replace the WebPack in the example with something else.

¹⁷ Not a specific three.js endorsement. I've never used it. People seem to like it. There [https://www.babylonjs.com/] (BabylonJS) are [https://github.com/redcamel/RedGPU] (RedGPU) alternatives [https://playcanvas.com/] (PlayCanvas, which by the way is incredibly cool).

¹⁸ Wait, do JS modules/import just work in browsers now? I don't even know lol

¹⁹ If you're using Rust, it's quite possible that you are using WebGPU already. The Rust library quickly got far ahead of its Firefox parent software and has for some time now already been adopted as the base graphics layer in emerging GUI libraries such as Iced [https://iced.rs/]. So you could maybe just use Iced or Bevy for high-level stuff and then do additional drawing in raw WebGPU. I haven't tried.

²⁰ Various warnings if you go this way: If you're on Windows I recommend installing the wasm-pack binary package [https://rustwasm.github.io/wasm-pack/installer/] instead of trying to install it through cargo. If you're making a web build from scratch instead of using my sample, note the slightly alarming "as of 2022-9-20" note here [https://github.com/gfx-rs/wgpu/wiki/Running-on-the-Web-with-WebGPU-and-WebGL] in the wgpu wiki.

²¹ This sample also has as of this writing some caveats: It can only fill the window, it can't do aspect ratios or integer-multiple restrictions; it has no animation; in order to get the fill-the-window behavior, I had to base it on a winit PR [https://github.com/rust-windowing/winit/pull/2074], so the version of winit used is a little older than it could be; there are outstanding warnings; I am unclear on the license status of the wgpu sample code I used, so until I can get clarification or rewrite it you should probably follow the wgpu MIT license even when using this sample on web. I plan to eventually expand this example to include controller support and sound.

²² Horrifyingly, the answer turned out to be "it depends on which device you're running on".