Security isn’t exactly a strong point of X11, and improving it is one of the main reasons why Wayland is such a vast improvement over X11. Just one of the many examples of X11 being inherently insecure is that keyloggers are entirely trivial on X11, because keylogger functionality is effectively built into it. Of course, this isn’t exactly news, and as Peter Hofmann details, there is an old X11 extension that adds somewhat rudimentary security to X11: the X11 SECURITY extension.
This extension is part of every X.org installation, but it hasn’t seen any meaningful work in a long, long time. What it does is allow you to mark X11 clients as “trusted” or “untrusted”, where untrusted clients cannot interact with trusted ones. This provides some basic security – it actually prevents keylogging! – but only very basic, as Hofmann notes:
The thing is that it’s immediately clear that this extension — in its current state — is not the answer to “X11 is insecure”: You only have two classes, trusted and untrusted. That’s not enough. For example: When you run your browser as untrusted, you can’t simultaneously run some sandboxed program (Snap, Flatpak, …) in a meaningful way, because those two clients can spy on each other again. You want a proper per-client isolation instead.
Sandboxing plays an important role here. If you run programs “the traditional way” (i.e., full access to the filesystem and network), then an attacker can do all kinds of things and X11 keylogging is just one of a million concerns.
↫ Peter Hofmann
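The two-class limitation can be shown with a toy model. This is only a sketch of the rule as described above, not X.Org code: the `Client` type, both functions, and the client names are invented for illustration, and the model assumes trusted clients are unrestricted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    name: str
    trusted: bool


def can_observe_two_class(observer: Client, target: Client) -> bool:
    """Toy version of the trusted/untrusted split."""
    if observer.trusted:
        return True  # assumption: trusted clients are not restricted
    # Untrusted clients are blocked from trusted ones, but not from each other.
    return not target.trusted


def can_observe_isolated(observer: Client, target: Client) -> bool:
    """Per-client isolation: a client can only observe itself."""
    return observer == target


browser = Client("browser", trusted=False)
sandboxed = Client("sandboxed-app", trusted=False)
terminal = Client("terminal", trusted=True)

print(can_observe_two_class(browser, terminal))   # False: the keylogging case is blocked
print(can_observe_two_class(sandboxed, browser))  # True: two untrusted clients can still spy
print(can_observe_isolated(sandboxed, browser))   # False: what per-client isolation would give
```

The second result is the gap Hofmann points at: once more than one client is untrusted, two classes are not enough to keep them apart.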
The extension also happens to break a lot of things, and many applications simply don’t work with it at all. Oddly enough, Firefox has no issues with it, and will happily run in untrusted mode.
The biggest problem, however, is that untrusted clients only have access to exactly two other X11 extensions, which leads to a whole host of problems, like no scaling, broken keyboard layouts, no 3D acceleration, and so on. On top of all of that, it breaks clipboard functionality, as anything copied in an untrusted client cannot be pasted anywhere else.
As such, Hofmann concludes:
In its current state, I’d say the SECURITY extension is “somewhat useful”, but more work would have to be done. Both in X.Org and in the clients. You would have to come up with a new clipboard protocol, for example. And the list goes on. (See where I’m going with this?) It’s not that simple.
↫ Peter Hofmann
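For readers who want to see the breakage first-hand, the usual way to start a client in untrusted mode is to have `xauth` generate an untrusted cookie and point the client at it. The sketch below wraps that in Python. The helper names and the example paths are made up for illustration; it assumes `xauth` is installed and the X server has the SECURITY extension enabled, and it leaves the cookie lifetime at xauth's default.

```python
import os
import subprocess
from typing import List


def untrusted_cookie_command(display: str, authfile: str) -> List[str]:
    """Build the xauth call that asks the server for an untrusted cookie."""
    # "." is xauth shorthand for the MIT-MAGIC-COOKIE-1 protocol.
    return ["xauth", "-f", authfile, "generate", display, ".", "untrusted"]


def run_untrusted(program: List[str], display: str, authfile: str) -> int:
    """Generate an untrusted cookie, then run `program` using only that cookie."""
    subprocess.run(untrusted_cookie_command(display, authfile), check=True)
    env = dict(os.environ, DISPLAY=display, XAUTHORITY=authfile)
    return subprocess.run(program, env=env).returncode


# Only prints the command; nothing is executed here.
print(" ".join(untrusted_cookie_command(":0", "/tmp/untrusted.xauth")))
```

Calling something like `run_untrusted(["firefox"], ":0", "/tmp/untrusted.xauth")` would then start the browser as an untrusted client, which is where the limitations described above become visible.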
Since pretty much nobody adopted it when this extension came out in the ’90s, and it hasn’t seen much work since, the amount of work that would be required to bring it up to modern standards would be astronomical, and trying to get clients to adopt it would probably prove fruitless considering Wayland already exists, and offers all of the potential security benefits and then some. People often claim it would be “easy” to modernise X11, but just this one particular issue – security, kind of important – shows just how quickly the X11 house of cards comes crashing down if you try to do anything to drag it out of its ’80s and ’90s mindset.
Thom Holwerda,
“Astronomical” is an exaggeration here. It would not be that much work, but most people don’t see that much value either, because 1) they want to see X11 replaced with something more modern, and 2) the security threat was never actually that big a problem.
Yes, it’s true that X11 allows one program to keylog another program. Ideally X11 would require software to have permission to do that. But even so, many distros only let one user access the desktop at a time. Everyone go test this right now: ssh or su into a different account and then try to run an X11 program from a different user. X11 typically blocks such requests, even from root!
Now, while X11 lets processes launched by the same user watch each other’s events, these processes are already in the same privilege domain. All processes running as a user are able to modify the user’s profile even if Wayland is running! They could change the environment variables and libraries to hook into various functions across processes. X11 treating all processes under the same account as one security domain mirrors the behavior of the OS itself.
Here’s another thing everyone can test:
Anyone surprised?
To the extent that X11 does not sandbox processes with the same UID from one another, I can agree there ought to be a way to do that, but the fact of the matter is this is consistent with the security model that underpins the entirety of Linux. Wayland preventing inadvertent communications through the compositor, while a good thing, is a bit of a moot point when those processes already share the same security domain.
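As a separate illustration of the same-UID point, here is a minimal Python sketch in which one process reads a value out of another process's environment through `/proc`, with no display server involved. It is a sketch under assumptions: Linux only, no sandbox or security module restricting `/proc` access, and `DEMO_TOKEN`, `read_environ_of` and `demo` are made-up names. The "victim" is started by the script for convenience, but the read itself needs only the PID and a matching UID.

```python
import os
import subprocess
import sys
import time
from typing import Dict, Optional


def read_environ_of(pid: int) -> Dict[str, str]:
    """Return the startup environment of another process via /proc (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as fh:
        raw = fh.read()
    pairs = (item.split(b"=", 1) for item in raw.split(b"\0") if b"=" in item)
    return {k.decode(errors="replace"): v.decode(errors="replace") for k, v in pairs}


def demo() -> Optional[str]:
    """Start a same-UID 'victim' and read a token out of its environment."""
    if not sys.platform.startswith("linux"):
        return None  # /proc/<pid>/environ is Linux-specific
    env = dict(os.environ, DEMO_TOKEN="not-a-real-secret")  # hypothetical variable
    victim = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"], env=env
    )
    try:
        for _ in range(50):  # brief retry in case the process is still starting
            token = read_environ_of(victim.pid).get("DEMO_TOKEN")
            if token is not None:
                return token
            time.sleep(0.02)
        return None
    finally:
        victim.kill()
        victim.wait()


if __name__ == "__main__":
    print(demo())
```

Nothing here asks for extra privileges; the kernel allows the read because both processes run under the same account.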
@Alfman
You are of course correct and a very nice demonstration.
That said, the dangers in the GUI are amplified. How often do you do your banking from the command line? And if I am going to log into some critical cloud infrastructure from the command-line, I am probably using something like SSH. If I am doing that, my password is not getting echoed to standard out to be captured via your method above. And I hope I have locked down ptrace.
But if I am doing my SSH from a terminal in X11, my password can be easily read as I type it. And a connection could be established without showing it to me. Or, you could wait for me to generate a private key and take that instead. Or, my credentials can be read as I access my banking in a web browser. You can take a screenshot while you are at it to capture account numbers and the like if they are silly enough to show them on the screen (my bank does).
I run Microsoft Edge on Linux fairly often. It is only available as a binary. When I run it on Xorg, it could be running the Linux equivalent of Microsoft Recall in the background for all I know. It is not like I am going to be alarmed to see it sending data over the network. And sometimes it uses gigabytes of RAM, a lot of CPU or even GPU, and shows up 100 times in my process list. So, lots of room to hide substantial background shenanigans.
Even if you could lock Xorg down, X11 has no mediated way for a client to ask the X server for a screenshot or stream. None of the X11 screen capture tools that I am aware of would work. Lots of stuff would break.
Now, I doubt Microsoft is taking screenshots on my system and processing them with an AI. But they could. Well, they cannot because I am a Wayland user as you know. So, they can only do it if I log into my bank from within Edge itself. That said, I do authenticate to Teams and Outlook from Edge. But I imagine that Microsoft is already mining that stuff on the other side.
LeFantome,
To be clear, it was just meant as an illustrative example that could be performed in less than ten steps. But the point I hoped people would take away is that processes running in the same security domain are not protected from each other at the OS level. It doesn’t matter whether you use Wayland or not.
It’s inherently unsafe to install untrusted software as the same user as trusted software. If you happen to install/run malware downloaded from Steam under the same account you use for banking, which TBH is probably pretty common, then in principle that malware could find a way to get your credentials without escalating privileges!! In other words, from the operating system’s perspective there’s nothing to restrict because it’s the same UID.
What’s needed would be sandboxing beyond the UID, but this is relatively novel for desktop operating systems. I’ve been critical of this security model for decades: just because one user is running two applications doesn’t mean those two applications should be in the same security domain!!
I don’t really accept this argument that “securing X11 would break XYZ and therefore X11 can’t be fixed”. The solution is to add new permissions that X11 could check for. A screenshot program would have to pass a permission check for it to succeed, otherwise it would fail. You could add a notification to inform the user that an application attempted an unauthorized operation. Naturally the distro would be responsible for setting the right permissions out of the box. Permissions of manually installed software would need to be manually assigned. And not for nothing, this would have been better than Xwayland’s approach of just breaking X11 features they don’t want software to have.
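A rough sketch of what such a permission check could look like follows. It is purely hypothetical: nothing like `PermissionBroker` exists in X.Org, and the permission names and client names are invented for illustration. It shows only the shape of the idea: a grant table set by the distro or the user, a check before a sensitive operation, and a notification when the check fails.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, List, Set


class Permission(Enum):
    SCREEN_CAPTURE = auto()
    GLOBAL_KEY_EVENTS = auto()
    INPUT_INJECTION = auto()


@dataclass
class PermissionBroker:
    """Hypothetical server-side table of per-client grants."""
    grants: Dict[str, Set[Permission]] = field(default_factory=dict)
    notifications: List[str] = field(default_factory=list)

    def grant(self, client: str, permission: Permission) -> None:
        self.grants.setdefault(client, set()).add(permission)

    def check(self, client: str, permission: Permission) -> bool:
        allowed = permission in self.grants.get(client, set())
        if not allowed:
            # Instead of failing silently, record something the desktop could show.
            self.notifications.append(f"{client} was denied {permission.name}")
        return allowed


broker = PermissionBroker()
broker.grant("screenshot-tool", Permission.SCREEN_CAPTURE)  # e.g. preset by the distro

print(broker.check("screenshot-tool", Permission.SCREEN_CAPTURE))  # True
print(broker.check("game-overlay", Permission.GLOBAL_KEY_EVENTS))  # False
print(broker.notifications)  # ['game-overlay was denied GLOBAL_KEY_EVENTS']
```

The design choice is that denial is the default and every grant is explicit, so whether a given program gets screen capture or global key events is a decision for the owner of the machine.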
I want to push back on “they cannot because I am a Wayland user as you know”. If you 1) don’t trust Edge, and 2) run Edge as the same UID as other software you do trust, then you’ve still got an unresolved security problem even with Wayland. Obviously I’m not asserting Microsoft is doing this, but running as your UID without a sandbox means they could install trojan horses into everything without needing a privilege escalation.
Incidentally this is exactly why PHP completely gave up from trying to protect PHP processes from each other (like it originally did) and now requires security separation via new UID security domains that front ends like nginx launch scripts into. Using mature OS isolation mechanisms was deemed the only sane approach to robust security for web hosting. We should be applying the same mentality around desktop software too.
@Alman
> the point I hoped people would take away is that processes running in the same security domain are not protected from each other at the OS level
I completely agree. You made your point well.
> I don’t really accept this argument that “securing X11 would break XYZ and therefore X11 can’t be fixed”.
I did not make that argument and that is not at all what I am saying. Of course Xorg could be fixed. Perhaps that is the best way forward. There is an Xorg fork that claims it is trying to do this as we speak. My point is that “fixing” Xorg will break existing applications. As you say, the solution is simply to “fix” the applications (update them to the new methods). In the meantime though, things will break. The breakage does not stem from incompetence. It is a symptom of the fix.
> If you 1) don’t trust edge, and 2) run edge as the same UID as other software you do trust, then you’ve still got an unresolved security problem
Again, I completely agree. That does not imply, though, that the two systems represent the same level of risk. Wayland uses sockets and DMA. Unlike on Xorg, reading the screen, input, or memory of another application is not so easily done on Wayland. The filesystem and network however…
> this would have been better than Xwayland’s approach of just breaking X11 features they don’t want software to have
Ok, my turn to push back.
Wayland considers security to be more important. That is the issue. This means that, in order for some things to work, there have to be explicit mechanisms put in place first. I share your frustration with how hard it has been and how long it has taken to convince some in the Wayland world (especially the GNOME crew) which of these mechanisms need to exist. Though it has been a slog, most of these mechanisms have now been built or will be soon. Few remaining gaps impact many people. But things were not taken away. Security was added.
Xwayland does not restrict anything more than any other X server, which makes sense since Xwayland is Xorg really. 95% of Xwayland code is in Xorg too (though that 95% is only about 2/3 of the Xorg code). Xwayland is Xorg without the Xorg DDX (replaced by code that talks to a Wayland compositor instead). This is why Wayback works. Wayback is not having to build any new functionality into Xwayland or to add back any functionality. Wayback just runs Xwayland as a Wayland app in its own window (the only one on the system). The Xwayland server, the window manager, and all the X11 clients (apps) run inside that one Wayland application. So none of the security restrictions between Wayland apps has any impact in that setting. The X11 apps have as much access to each other as they usually do and all the old tricks work.
What you may be thinking of is the more typical use case of using Xwayland on a normal Wayland compositor where each X11 app is run rootless in its own window. There, things break. In that case, each X11 app is running inside its own Wayland window and so all the normal restrictions between Wayland apps are in play. It has nothing to do with Xwayland though. The Wayland compositor is what “breaks” stuff like key logging, input automation, screen capture, screensavers, lock-screens, and global hot-keys. It restricts these things between the Xwayland windows just as it does between other Wayland apps. No prejudice.
Anyway, nobody took away “X11 features they don’t want software to have”. They just added security. Yes, they have been too slow or too resistant to build fixes for the things that broke. But that is not the same thing.
@Alfman
So mad. I saw the typo in your name and I tried immediately to fix it. But the system just “spun”. I went to three different computers but none of them would accept the edit. And now too much time has passed. Apologies.
LeFantome,
I apologize. I must be misunderstanding some of your comments that seem a bit contradictory to me. For example: “You cannot fix X11 without ‘breaking everything’ as the famous Wayland post proclaims.”
I disagree because there wouldn’t have been the need to break either X or Xwayland if they had implemented a sensible permission check instead of culling features. Screen sharing in video conferencing software, screen capture and key injection in remote xvnc sessions, Steam hotkey overlays, etc. would not have been broken by Xwayland. Years of breakages could have been completely avoided. Having a permission check lets the user make the decision “Do I want Steam to be able to install an X keylogger and perform screenshots of other applications”. The answer could legitimately be yes or no, but when it comes to open computing I will always be in favor of putting owners in control rather than having security policy forced on us. I think it is fair to say Wayland is guilty of this.
We probably agree that X11 is an old bloated code base; I’ve never been opposed to replacing it. I am just critical of Wayland’s choices and management. I wish they had more respect for user requirements. They were too eager to break stuff rather than giving users control over permissions. This ended up tarnishing their own brand.
But every broken feature is an example of this. Like the ability to get window and mouse coordinates. They forcefully removed those features and consequently some multiwindow applications get all messed up. This is directly attributable to Wayland’s decision to deny us the feature. This wouldn’t be such a big deal if Wayland had kept the features as privileged permissions. The criticism is that Wayland would rather dictate our needs to us than give us the option.
This is why people are working on sandboxing apps. Unfortunately, it requires apps to be designed to work in a sandboxed environment, so it will happen gradually.
Magnusmaster,
I don’t know how long it will take, but yep I agree.
This is what some fail to grasp. You cannot fix X11 without “breaking everything” as the famous Wayland post proclaims. This is why there is no X12. This is why Wayland was started as a new initiative and Xorg development stalled. It allowed more conservative users to keep using things without disruption while the other ecosystem matured.
Any attempt to modernize Xorg will break things. There has been a recent Xorg fork and this is already happening. The list of things that the Xorg security extension breaks sounds an awful lot like the list of things that detractors have to say about Wayland. No coincidence. He did not talk about it but it is obvious that you would have the same issues with screen capture, streaming, global hotkeys, and everything else that relies on random apps accessing other apps without restriction.
Thankfully, we are finally close to putting all this behind us.