[Alyssa Rosenzweig] has been working tirelessly to reverse engineer the GPU built into Apple’s M1 architecture as part of the Asahi Linux effort. If you are not familiar with it, the project adds support to the Linux kernel and userspace for Apple’s M1 line of products. She has made great strides, and even got primitive rendering working with her own open source code more than a year ago.
Trying to mature the driver, however, hit a snag. For complex renders, something breaks in the GPU and parts of the content are simply missing from the frame. Some clever testing pinned down the exact failure trigger: too much total vertex data. Simply put, it is “the number of vertices (geometric complexity) times the amount of data per vertex (‘shading’ complexity)”. That sounds almost like a buffer filling up, but on the GPU itself. This is not a buffer the driver interacts with directly, so all this sleuthing has to be done blind. Apple’s driver doesn’t produce corrupted renders like this, so what’s going on?
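As a rough illustration of that trigger, the quoted product can be written out directly. This is only a sketch: the capacity constant and the function names are made up for the example, and the real buffer size on the M1 is not stated here.

```python
# Hypothetical capacity in bytes, chosen only for illustration.
BUFFER_CAPACITY = 1_000_000


def total_vertex_data(num_vertices: int, bytes_per_vertex: int) -> int:
    """Total vertex data for a frame: vertex count times data per vertex."""
    return num_vertices * bytes_per_vertex


def overflows(num_vertices: int, bytes_per_vertex: int,
              capacity: int = BUFFER_CAPACITY) -> bool:
    """True if the frame's total vertex data exceeds the assumed capacity."""
    return total_vertex_data(num_vertices, bytes_per_vertex) > capacity


# Either factor alone can push a frame over the limit.
print(overflows(10_000, 64))    # modest geometry, modest shading
print(overflows(100_000, 16))   # more geometry, less data per vertex
print(overflows(10_000, 128))   # same geometry, more data per vertex
```

The point of the product is that neither geometry nor shading complexity is the culprit on its own. Only their combination matters.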
[Alyssa] offers a quick crash course on GPU design, distinguishing between desktop GPUs, which primarily use dedicated memory, and mobile GPUs with unified memory. The M1 falls into that second category, using a tilebuffer to cache render results while building a frame. That tilebuffer is a fixed size, and a complex frame render can overflow it. So how does the driver handle that? The traditional answer is simply to allocate a larger buffer, but that is not how the M1 works. Instead, when the buffer fills, the GPU triggers a partial render, which consumes the buffer’s data. The problem is that the partial render is being sent to the screen instead of being properly blended with the rest of the render. Why? Time to go back to a command capture from Apple’s driver.
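The flush-on-full behavior described above can be modeled with a toy simulation. This is a sketch under loud assumptions: `ToyGpu`, its fields, and the byte sizes are all hypothetical, and it only mimics the idea of a fixed-size buffer that is drained by a partial render, not the actual hardware.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGpu:
    capacity: int                                    # hypothetical buffer size in bytes
    used: int = 0
    pending: list = field(default_factory=list)      # draws waiting in the buffer
    framebuffer: list = field(default_factory=list)  # draws already rendered
    partial_renders: int = 0

    def submit(self, name: str, size: int) -> None:
        """Queue a draw; trigger a partial render first if it would not fit."""
        if size > self.capacity:
            raise ValueError("a single draw larger than the buffer is not modeled")
        if self.used + size > self.capacity:
            self._flush()
            self.partial_renders += 1
        self.pending.append(name)
        self.used += size

    def _flush(self) -> None:
        # Blend onto what is already in the framebuffer instead of replacing it.
        self.framebuffer.extend(self.pending)
        self.pending.clear()
        self.used = 0

    def finish(self) -> list:
        """Final render: drain whatever is left and return the full frame."""
        self._flush()
        return list(self.framebuffer)


gpu = ToyGpu(capacity=100)
for name, size in [("a", 60), ("b", 60), ("c", 30)]:
    gpu.submit(name, size)
frame = gpu.finish()
print(frame, gpu.partial_renders)
```

If `_flush` replaced the framebuffer instead of extending it, the earlier draws would vanish from the final frame, which is loosely analogous to the missing content described above.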
The driver does something odd: it sets up two separate load and store programs. Knowing that the render buffer gets moved around mid-render, this starts to make sense. One function is for the partial render, the other for the final one. Omit one of them, and when the GPU needs the missing function, it dereferences a null pointer and the rendering explodes. So, supply the missing function, get the configuration right, and the rendering completes correctly. Finally! The taste of victory is never so sweet as when it comes after chasing down such a mysterious bug.
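To see why a missing program only bites on complex frames, here is a minimal sketch. `PassConfig`, `GpuFault`, and `run_pass` are invented names for illustration, and a Python exception stands in for the null pointer dereference. None of this reflects the real driver interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class GpuFault(Exception):
    """Stand-in for the GPU dereferencing a null program pointer."""


@dataclass
class PassConfig:
    # Hypothetical slots: one program for partial renders, one for the final.
    final_program: Optional[Callable[[str], str]] = None
    partial_program: Optional[Callable[[str], str]] = None


def store(chunk: str) -> str:
    return f"stored:{chunk}"


def run_pass(config: PassConfig, chunks: list) -> list:
    """Render chunks; every chunk except the last models a partial render."""
    results = []
    for i, chunk in enumerate(chunks):
        is_last = i == len(chunks) - 1
        program = config.final_program if is_last else config.partial_program
        if program is None:
            kind = "final" if is_last else "partial"
            raise GpuFault(f"missing {kind} program")
        results.append(program(chunk))
    return results


only_final = PassConfig(final_program=store)
both = PassConfig(final_program=store, partial_program=store)

print(run_pass(only_final, ["simple"]))  # a simple frame never needs the partial program
print(run_pass(both, ["part1", "part2", "rest"]))
```

With only the final program configured, simple frames render fine and the fault appears only once a frame is big enough to need a partial render, which matches the symptom of the bug showing up solely on complex scenes.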
Need more Asahi Linux in your life? [Hector Martin] gave an interview on FLOSS Weekly this past week, detailing the project.