Linked by Thom Holwerda on Wed 3rd Jan 2018 00:42 UTC
Intel

A fundamental design flaw in Intel's processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug.

Programmers are scrambling to overhaul the open-source Linux kernel's virtual memory system. Meanwhile, Microsoft is expected to publicly introduce the necessary changes to its Windows operating system in an upcoming Patch Tuesday: these changes were seeded to beta testers running fast-ring Windows Insider builds in November and December.

Crucially, these updates to both Linux and Windows will incur a performance hit on Intel products. The effects are still being benchmarked; however, we're looking at a ballpark figure of a five to 30 per cent slowdown, depending on the task and the processor model. More recent Intel chips have features - such as PCID - that reduce the performance hit.

That's one hell of a bug.

Thread beginning with comment 652439
RE: Overhyped
by Alfman on Wed 3rd Jan 2018 05:21 UTC in reply to "Overhyped"
Alfman
Member since:
2011-01-28

Brendan,

The minor timing quirk in Intel CPUs (that does not break documented behaviour, expected behaviour or any guarantee, and therefore can NOT be considered a bug in the CPU); allows an attacker to determine which areas of kernel space are used and which aren't.
...
Note that the insane hackery to avoid this non-issue adds significant overhead to kernel system calls; ironically, making the performance of monolithic kernels worse than the performance of micro-kernels (while still providing inferior security to micro-kernels). The insane hackery doesn't entirely fix the "problem" either (a small part of the kernel must remain mapped, and an attacker can still find out where in kernel space that small part is and use this information to infer where the rest of the kernel is).


Regarding address space layout randomization, that is my position as well; it is nothing more than security by obscurity under a different name. Security that depends on keeping (non-cryptographic) secrets like memory addresses is inherently flawed. To play devil's advocate, though, its proponents would argue it's intended as a secondary security measure, not a primary one. Security by obscurity, as flawed as it is, is arguably better than having no second line of defense at all once an exploit for the primary security measure is found. ASLR was introduced as a very low overhead way to make code exploits more likely to fail. However, given the growing list of demonstrated ASLR failures on real hardware and the cost of repairing ASLR's broken assumptions, I'd say its days as an effective countermeasure are numbered. In other words, I don't think the community can justify the high costs that would be required to make ASLR robust against side-channel attacks (like cache-hit testing).
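
To make that concrete, here's a trivial sketch (my own, not taken from any of the work cited below) of the only "secret" ASLR relies on - run it a few times and the addresses move around, and leaking any one of them is a big step towards unravelling the rest:

/* Tiny demo (my own sketch): print a stack, a heap and a code address.
 * With ASLR enabled (and a PIE build, for the code address) they change
 * on every run, and those addresses are the only "secret" ASLR relies on. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int on_stack = 0;
    void *on_heap = malloc(16);

    printf("stack: %p\n", (void *)&on_stack);
    printf("heap : %p\n", on_heap);
    printf("code : %p\n", (void *)main);   /* randomised for PIE builds */

    free(on_heap);
    return 0;
}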

This was the topic of a 2016 Black Hat paper:
https://www.blackhat.com/docs/us-16/materials/us-16-Jang-Breaking-Ke...

Countermeasures are discussed at the very end...
Modifying CPU to eliminate timing channels
- Difficult to be realized

Using separated page tables for kernel and user processes
- High performance overhead (~30%) due to frequent TLB flush

Fine-grained randomization
- Difficult to implement and performance degradation

Coarse-grained timer?
- Always suggested, but no one adopts it.
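
To illustrate what that last item actually means, here's a rough sketch (mine, not code from the paper): if the only clock an attacker can read is rounded down to a few microseconds, the roughly hundred-nanosecond gap between a cache hit and a cache miss simply disappears.

/* Sketch of a "coarse-grained timer" (my own illustration, not from the
 * paper): round every timestamp down to a fixed grain so sub-microsecond
 * differences - like a cache hit vs. a cache miss - can no longer be
 * observed. The 5 microsecond grain is an arbitrary assumption. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define GRAIN_NS 5000ull    /* assumed rounding step: 5 microseconds */

uint64_t coarse_now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    uint64_t ns = (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
    return ns - (ns % GRAIN_NS);    /* everything inside one grain looks the same */
}

int main(void)
{
    /* Two back-to-back reads usually land in the same grain, so this
     * almost always prints 0 - exactly the point of the countermeasure. */
    uint64_t a = coarse_now_ns();
    uint64_t b = coarse_now_ns();
    printf("delta: %llu ns\n", (unsigned long long)(b - a));
    return 0;
}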


As we know, nothing has been done to address side-channel attacks against ASLR. Even a JavaScript version of the exploit was demonstrated last year, just to illustrate how effective side-channel attacks can be.

https://www.theregister.co.uk/2017/02/14/aslr_busting_javascript_hac...
This, the VU team says, is what makes this vulnerability so significant. An attacker could embed the malicious JavaScript code in a webpage, and then run the assault without any user notification or interaction.

Because the assault does not rely on any special access or application flaws, it works on most browsers and operating systems. The researchers say that when tested on up-to-date web browsers, the exploit could fully unravel ASLR protections on 64-bit machines in about 90 seconds.

Infecting a system does not end there, of course: it just means ASLR has been defeated.




Fortunately the "malicious performance degradation attack" (the ineffective work-around for the non-issue) is easy for end users to disable.


The problem is there's very little published info on this newest attack. The little bits that are around suggest to me that this is much more significant than merely broken ASLR. It sounds like Intel's out-of-order execution and branch prediction may be speculatively executing code before the permission checks are complete, in a way that lets an attacker exploit that deferment, and which does not happen on AMD processors. Apparently the temporary software fix is to reload the page tables on every kernel invocation. This flushes the cached translations (TLB) and happens to fix ASLR as well, but I think fixing ASLR was just a side effect - there's not enough information to know for sure.

I could be completely wrong, but this media hush would make very little sense if they had merely broken ASLR again, given that ASLR has already been publicly cracked for ages. I believe the sense of urgency, the deployment of high performance-cost workarounds in macOS, Windows and Linux, and the planned service outages at Amazon strongly suggest something much more critical was found that directly compromises kernel security on Intel processors.

Hopefully Thom will post an update when all is finally revealed.

Edited 2018-01-03 05:27 UTC

Reply Parent Score: 7

RE[2]: Overhyped
by Brendan on Wed 3rd Jan 2018 06:14 in reply to "RE: Overhyped"
Brendan Member since:
2005-11-16

Hi,

The problem is there's very little published info on this newest attack. The little bits that are around suggest to me that this is much more significant than merely broken ASLR. It sounds like Intel's out-of-order execution and branch prediction may be speculatively executing code before the permission checks are complete, in a way that lets an attacker exploit that deferment, and which does not happen on AMD processors. Apparently the temporary software fix is to reload the page tables on every kernel invocation. This flushes the cached translations (TLB) and happens to fix ASLR as well, but I think fixing ASLR was just a side effect - there's not enough information to know for sure.

I could be completely wrong, but this media hush would make very little sense if they had merely broken ASLR again, given that ASLR has already been publicly cracked for ages. I believe the sense of urgency, the deployment of high performance-cost workarounds in macOS, Windows and Linux, and the planned service outages at Amazon strongly suggest something much more critical was found that directly compromises kernel security on Intel processors.


As I understand it:

a) The program tries to read from an address in kernel space.

b) The CPU speculatively executes the read and tags it as "will generate page fault" (so that a page fault occurs at retirement); but, likely in parallel with the permission checks rather than after them, it also either speculatively reads the data into a temporary register (if the page is present) or pretends the data being read is zero (if the page is not present), for performance reasons (so that later instructions can be speculatively executed after the read). Note that the data (if any) in the temporary register cannot be accessed directly (it never becomes "architecturally visible" when the instruction retires).

c) The program does a second read, from an address that depends on the temporary register set by the first read. This read is also speculatively executed, and because it is speculative it uses the "speculatively assumed" value in the temporary register. That causes a cache line to be fetched, again for performance reasons (to avoid a full cache miss penalty if the speculatively executed instruction ends up being committed rather than discarded).

d) The program "eats" the page fault caused by step a somehow, so that it can continue (e.g. via a signal handler).

e) The program detects whether the cache line corresponding to "temporary register was zero" was fetched at step c, by measuring how long a read from that cache line takes (cache hit vs. cache miss).

In this way (or at least, something vaguely like it), the program determines whether a virtual address in kernel space corresponds to a "present" page or a "not present" page (without any clue about what the page contains, why it's present, whether it's read-only, read/write or executable, or even whether it's just free/unused space on the kernel heap).
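
For what it's worth, here is a minimal sketch of the timing probe in step e (my own illustration - it leaves out the speculative kernel read from steps a to c entirely, and only shows how a cache hit is told apart from a cache miss):

/* Minimal sketch of the cache-timing probe from step e (my own
 * illustration; the speculative kernel access itself is omitted).
 * x86 only; build with gcc or clang. */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>              /* _mm_clflush, _mm_lfence, __rdtscp */

static uint8_t probe[4096] __attribute__((aligned(4096)));

static uint64_t time_read(volatile uint8_t *p)
{
    unsigned aux;
    _mm_lfence();                   /* keep earlier work out of the measurement */
    uint64_t t0 = __rdtscp(&aux);
    (void)*p;                       /* the access being timed */
    uint64_t t1 = __rdtscp(&aux);
    _mm_lfence();
    return t1 - t0;
}

int main(void)
{
    _mm_clflush(probe);                     /* make sure the line starts cold */
    uint64_t miss = time_read(probe);       /* first read: cache miss (slow)  */
    uint64_t hit  = time_read(probe);       /* second read: cache hit (fast)  */

    printf("miss: %llu cycles, hit: %llu cycles\n",
           (unsigned long long)miss, (unsigned long long)hit);
    return 0;
}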

- Brendan

Reply Parent Score: 5

RE[3]: Overhyped
by galvanash on Wed 3rd Jan 2018 06:41 in reply to "RE[2]: Overhyped"
galvanash Member since:
2006-01-25

There has to be more to it than that. I'm not saying your analysis is wrong, but it has to be incomplete. Either someone has demonstrated a reliable attack using this exploit to compromise and/or crash affected systems from low-privilege user-space code, or there is more to it than there appears to be.

No way would everyone issue fixes like this in such a cloak-and-dagger fashion, especially fixes that cause a significant performance regression, if it wasn't scaring the crap out of some people...

Reply Parent Score: 8

RE[3]: Overhyped
by le_c on Wed 3rd Jan 2018 07:23 in reply to "RE[2]: Overhyped"
le_c Member since:
2013-01-02

Would it be possible to slow down page fault notifications? For example, if the page fault was not on a kernel-space address, halt the application for as long as a fault on a kernel-space address would have taken, so that all segfaults get reported after the same delay.
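
To show which number would need padding, here's a rough sketch (mine, Linux/x86 only) of what an attacker can measure from user space - the round-trip time of a fault:

/* Rough sketch (my own) of the quantity that would have to be equalised:
 * the time from a faulting access until user space regains control.
 * Address 0x1 is assumed to be unmapped on a normal Linux setup
 * (mmap_min_addr keeps page zero unmapped). x86 only. */
#include <signal.h>
#include <setjmp.h>
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>              /* __rdtsc */

static sigjmp_buf env;

static void on_segv(int sig)
{
    (void)sig;
    siglongjmp(env, 1);             /* jump back out of the faulting access */
}

int main(void)
{
    signal(SIGSEGV, on_segv);

    volatile uint8_t *bad = (volatile uint8_t *)0x1;
    uint64_t t0 = __rdtsc();
    if (sigsetjmp(env, 1) == 0)
        (void)*bad;                 /* faults; the handler longjmps back here */
    uint64_t t1 = __rdtsc();

    printf("fault round trip: %llu cycles\n", (unsigned long long)(t1 - t0));
    return 0;
}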

Are there any sane apps that depend on timely segfault handling and thus would be affected by such a workaround?

Reply Parent Score: 2