Switching the Linux graphics stack from GLX to EGL

Hi there! This is a guest post from Robert Mader, who contributed enormous improvements to Firefox’s graphics stack on Linux.

TL;DR

In the upcoming Firefox 94 release we will enable the EGL backend for a big group of our Linux users. This will increase WebGL performance, reduce resource consumption and make our life as developers easier going forward.

Background

In order to use hardware accelerated APIs like OpenGL with windowing systems like X11 or Wayland there needs to be an interface bringing them together. For OpenGL on X11 most programs use GLX, while its successor, EGL, gets used on Wayland, Android and in the embedded space. While EGL has some major advantages compared to GLX and, in theory, can be used on X11 just as well, its adoption there has been very slow.

I can only speculate why exactly that is, but I think it comes down to the following reasons:

  1. Games and similar applications barely benefit from the switch
  2. Applications and toolkits that would benefit from it often don’t enable hardware accelerated rendering on X11 in the first place, likely because of the bad and complex driver situation in the past.
  3. Because of the slow adoption, X11 EGL implementations remained buggy and incomplete → back to 2.

What changed?

Firefox is an application that benefits heavily from hardware acceleration in many areas. However, until recently, software rendering remained the default. It was only this year that WebRender, Firefox’s new rendering engine, finally got enabled for most Linux users.
There is a very long list of developments that made this step easier and thus possible. To name a few:

  1. OpenGL drivers got better
  2. Xorg DDX drivers got better (e.g. the “modesetting” driver becoming the standard for Intel)
  3. Composited desktops became more common
  4. Plugin support (Flash Player) was dropped from Firefox
  5. WebRender made hardware acceleration much more desirable compared to the old OpenGL layers backend
  6. New technologies such as Wayland and DMABUF emerged

The last point was crucial for the topic of this post. When Martin Stránský implemented Wayland hardware acceleration support in Firefox, he could not reuse the GLX code, but instead used the Android EGL code. From there, an interesting dynamic started.

Improving the EGL backend and sharing code

Step by step, a number of improvements were made to the EGL/Wayland backend which had effects on other platforms as well:

  1. In order to improve WebGL performance and allow efficient hardware video decoding, Martin implemented zero-copy GPU buffer sharing via DMABUF. This is much easier on EGL than on GLX. And while Firefox did have a similar buffer sharing implementation for X11 (using Xrender), that one was never stable enough to get turned on by default.
  2. I improved the EGL backend to not only support OpenGL ES but also “desktop” OpenGL, making sure it’s not lagging behind the GLX backend.
  3. I went on and made it possible to use the EGL backend on X11 as well.
  4. Martin extended the DMABUF and VAAPI support to X11.
  5. Greg, an independent Wayland contributor, wrote an initial implementation for partial damage on EGL.
  6. Jamie Nicol extended the partial damage support to properly work on Android – and thus on X11 as well.
  7. Greg made sure our GPU detection (and smoke test from days when drivers would often crash) works on Wayland without requiring Xwayland to be present, making it not require GLX any more.

This is just a very small selection of examples, but maybe it gives you an idea of what I’m trying to say: more and more code gets shared between Wayland, X11/EGL and Android. This improves code quality, increases available time to spend on features and bugs, reduces the maintenance burden – you name it.

Making EGL the default

Over the last year, more and more users found out about the possibility to use EGL on X11 – likely because it’s a prerequisite for hardware video decoding. Lots of bugs got fixed, in Firefox but also in other components. Now we finally feel ready to let it ride the trains. As of Firefox 94, users of Mesa drivers >= 21 will get it by default. Users of the proprietary Nvidia driver will need to wait a little bit longer as the currently released drivers lack an important extension. However, most likely we’ll be able to enable EGL on the 470 series onwards. DMABUF support (and thus better WebGL performance) requires GBM support and will be limited to the 495 series onwards.
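
To make the rollout rules above easier to follow, here is a minimal Python sketch of the decision. It is purely illustrative and not Firefox’s actual driver detection code: the function names `egl_expected` and `nvidia_dmabuf_expected` are hypothetical, and the Nvidia thresholds encode the “most likely” expectations stated above rather than anything that has shipped.

```python
def egl_expected(vendor: str, driver_major: int) -> bool:
    """Return True if the post expects the EGL backend for this driver.

    Hypothetical helper that only mirrors the prose above.
    """
    vendor = vendor.lower()
    if vendor == "mesa":
        # Mesa >= 21 gets EGL by default as of Firefox 94.
        return driver_major >= 21
    if vendor == "nvidia":
        # Assumption taken from the post: "most likely" the 470 series onwards.
        return driver_major >= 470
    # Anything else is not covered by the post, so stay conservative.
    return False


def nvidia_dmabuf_expected(driver_major: int) -> bool:
    """DMABUF on the proprietary Nvidia driver needs GBM, i.e. 495 onwards."""
    return driver_major >= 495


if __name__ == "__main__":
    for vendor, major in [("Mesa", 20), ("Mesa", 21), ("Nvidia", 470), ("Nvidia", 495)]:
        print(vendor, major, "EGL expected:", egl_expected(vendor, major))
    print("Nvidia 470 DMABUF expected:", nvidia_dmabuf_expected(470))
    print("Nvidia 495 DMABUF expected:", nvidia_dmabuf_expected(495))
```

In other words, an Nvidia user on the 470 series would be expected to get EGL but not yet the DMABUF-based WebGL improvements.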

Benefits for users

So what exactly can you expect, and why? Mainly:

  1. Improved WebGL performance. Thanks to DMABUF zero-copy buffer sharing, WebGL can be done both sandboxed and without a round-trip to system RAM. WebGL is not only used in obvious places such as games, but also in more subtle ways, e.g. on Google Maps.
  2. Reduced power consumption. With partial damage we don’t need to redraw the whole window any more if only a small part of the content changed. Common examples here are small animations on websites or when loading tabs.
  3. Fewer bugs. EGL is more modern, much better suited for complex hardware accelerated desktop applications and used on more platforms, compared to GLX.
  4. Hardware video decoding by default is another crucial step closer – in fact for most users it should now be only one preference away (but beware, it still has a couple of bugs).
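
To give a feeling for why partial damage (point 2) saves work, here is a small Python sketch of the arithmetic. It is an illustration only, not how Firefox or any EGL implementation tracks damage: the `Rect` layout and the helper names `clip`, `damage_bounds` and `redraw_fraction` are made up for this example, and merging everything into a single bounding rectangle is a deliberate simplification.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical layout


def clip(rect: Rect, width: int, height: int) -> Rect:
    """Clip a damaged rectangle to the window bounds."""
    x, y, w, h = rect
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, width), min(y + h, height)
    return (x0, y0, max(x1 - x0, 0), max(y1 - y0, 0))


def damage_bounds(rects: List[Rect]) -> Rect:
    """Smallest rectangle covering all non-empty damaged rectangles."""
    rects = [r for r in rects if r[2] > 0 and r[3] > 0]
    if not rects:
        return (0, 0, 0, 0)
    x0 = min(r[0] for r in rects)
    y0 = min(r[1] for r in rects)
    x1 = max(r[0] + r[2] for r in rects)
    y1 = max(r[1] + r[3] for r in rects)
    return (x0, y0, x1 - x0, y1 - y0)


def redraw_fraction(rects: List[Rect], width: int, height: int) -> float:
    """Fraction of the window's pixels inside the damage bounding box."""
    clipped = [clip(r, width, height) for r in rects]
    _, _, w, h = damage_bounds(clipped)
    return (w * h) / (width * height)


if __name__ == "__main__":
    # A 64x64 loading spinner in a 1920x1080 window.
    spinner = [(100, 100, 64, 64)]
    print(f"{redraw_fraction(spinner, 1920, 1080):.4%} of the window is redrawn")
```

For a single 64x64 spinner in a 1920x1080 window, the damaged area is roughly 0.2% of the pixels, which is the kind of saving the power consumption point is about.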

Special thanks

There is a long list of people who have contributed to this step. To name a few: Martin Stránský, Andrew Osmond, Jamie Nicol, Greg V, Jan Ikenmeyer (Darkspirit), Michel Dänzer, the Firefox GFX-Team, the Mesa project and contributors, the Nvidia drivers team, the GTK team.

Finally: thanks a lot to all users who filed bugs and helped us fix them!

About the author

Hi, I’m Robert Mader, a free time FOSS contributor, mainly working on Firefox and Mutter/Gnome-Shell.

28 thoughts on “Switching the Linux graphics stack from GLX to EGL”

  1. Great news! Can’t wait to improve my testing on webgl applications and throw away chrome! Big THANKS to all contributors!

  2. Great.

    I hope this work will entirely fix Video playback in Firefox.

    Firefox is excellent on big, established, media sites like Youtube or Facebook, but on minor or obscure ones, I find myself, more often than not, trying to resolve weird issues like playback stopping and (sometimes) not resuming unless I refresh the page…, videos not playing at all…, etc.

      1. I thought the laggy video playback is due to a bug in kwin_x11. When I restart kwin_x11 the lagging stops. And this isn’t specific to Firefox. Also happens on Chrome.

  3. Would be nice if you backported patch 1712665 to ESR Firefox, so corporate users could also benefit from EGL on X11 without hanging window.

  4. Testing this even on FF 93 makes several sites I have open every day about 10x more usable on a 4k monitor. Awesome work!

  5. This is really great news to hear, thanks to all involved for this great work.
    Now the only thing I’m eagerly waiting for, is VAAPI enabled by default on Wayland.

  6. >Hardware video decoding by default is another crucial step closer – in fact for most users it should now be only one preference away

    I think there are four relevant preferences actually, if you also want VP8/9 (or has that changed):

    – media.ffmpeg.vaapi.enabled=true
    – media.ffvpx.enabled=false
    – media.rdd-ffvpx.enabled=false
    – media.rdd-vpx.enabled=false

    1. Those prefs disabling RDD aren’t a good idea: they are moving acceleration, which is more privileged, to the content process, which is more exposed.

      MOZ_DISABLE_RDD_SANDBOX=1 might be better. It removes some useful sandboxing, but it maintains the separation between GPU interfaces and web content.

    2. From FF 97 on “media.ffmpeg.vaapi.enabled=true” should be enough for VP8/9 as well – and AV1 may also make it. The sandbox issues around the RDD process have apparently all got resolved and ffmpeg will also run inside it now (media.rdd-ffmpeg.enabled is enabled by default).

  7. How about fixing the annoying bug in EGL on linux/x11 – after suspend the screen is garbled and unreadable if you enable EGL on X11.. How is this still not fixed before enabling EGL by default in linux? What is going on, i was just trying ff94 and this bug is still here!!!!!

    1. Is this on Mesa? Or, more likely, on prop. Nvidia? In the latter case, please read the “Making EGL the default” section in the blog post again ;)

  8. Just want to give a huge thank you to Robert Mader et al. I just recently updated to Firefox 94.0 today and noticed it was running noticeably smoother on my Wayland install. Looked up the updates and your improvements are doing wonders (from the naked eye). Thanks again!

    1. In address bar go to about:support and search for X11_EGL (for X, don’t know how it is for Wayland) and DMABUF – they should be at the end of Graphics section. Also for nouveau they are available and on by default.

  9. I have Ubuntu 21.10, with Firefox 94 and nVidia drivers 495.29.05. I have a Dell XP 9560 which has Intel 7th gen CPU (integrated GPU) with nVidia 1050 GPU.

    Does this mean I can make use of hardware accelerated video decoding or not? If so, how?

  10. [Disclaimer: I know very little about GPUs and computer graphics.]

    Thank you for that exciting blog post! You write: “Reduced power consumption. With partial damage we don’t need to redraw the whole window any more”.

    From a 2017 blog post introducing WebRender [1], the lesson I took home was:
    (A) in the past, we were being clever about redrawing areas as small as possible, first by containing “damages” or “invalidation” into small rectangles, then by using layer composition; all that work taking place (at least initially) on our general-purpose CPU;
    (B) now in 2017, we’ll ditch all that logic and batch everything to the GPU. Even though we are not clever anymore, this is faster because nowadays GPUs are designed precisely for that kind of brutal work.

    I quote: “What if we stopped trying to guess what layers we need? What if we removed this boundary between painting and compositing and just went back to painting every pixel on every frame? This may sound like a ridiculous idea, but […]”

    That’s apparently less trivial than it sounds, because you must carefully craft your batches to make a productive use of the GPU. So this new paradigm still has some amount of optimization, only it is different in nature.

    Still, I was skeptical at that time: even if you *can* do that much work with a GPU and it is fast enough compared to the previous implementation, doing *less* work certainly wouldn’t hurt? What about power consumption? Resources for neighbors?

    [Also, I was a bit worried that we compared web browsers (mostly 2D, still mostly black-on-white textual content, running casually alongside other and heavier software) to video games (ever-changing textured 3D graphics, played full-screen with massive electrical power and computing power drain on high-end GPU-propelled desktop computers)… And text rendering was still to be done on the CPU!… But anyway.]

    So I assumed that even if sub-optimal in terms of work done, we were already happy with the speed gain we were getting with that new strategy (WebRender), and we were not interested (at least not immediately) by smarter logic, that would lead to more complex code. (Also, by reading that blog post, it was not clear to me whether we had actually ditched damage bounding altogether, or if it was still done to some extent by the earlier steps—those that the author calls Style and Layout.)

    So, back to the present blog post: is this a step back from then? Are we now re-introducing a kind of optimization (damage bounding) that we were doing in the past, on CPU, but that we had stopped doing when introducing WebRender and switching to full GPU rendering? Or am I understanding it all wrong?

    [1]: https://hacks.mozilla.org/2017/10/the-whole-web-at-maximum-fps-how-webrender-gets-rid-of-jank/

  11. Rip this blog, Mozilla decided to kill off much of its engineering resources and offer 3rd party services as a way to bring in new revenue :'(
