Linked by Thom Holwerda on Wed 3rd Jan 2018 00:42 UTC
Intel

A fundamental design flaw in Intel's processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug.

Programmers are scrambling to overhaul the open-source Linux kernel's virtual memory system. Meanwhile, Microsoft is expected to publicly introduce the necessary changes to its Windows operating system in an upcoming Patch Tuesday: these changes were seeded to beta testers running fast-ring Windows Insider builds in November and December.

Crucially, these updates to both Linux and Windows will incur a performance hit on Intel products. The effects are still being benchmarked, but we're looking at a ballpark figure of five to 30 per cent slowdown, depending on the task and the processor model. More recent Intel chips have features - such as PCID - to reduce the performance hit.

That's one hell of a bug.

Thread beginning with comment 652441
RE[2]: Overhyped
by Brendan on Wed 3rd Jan 2018 06:14 UTC in reply to "RE: Overhyped"
Brendan
Member since:
2005-11-16

Hi,

The problem is there's very little published info on this newest attack. The little bits that are around suggest to me this is much more significant than merely broken ASLR. It sounds like Intel's out-of-order branch prediction may be executing speculative code prior to checking the full credentials, in such a way that they found a way to exploit the deferment, which does not happen on AMD processors.

Apparently the temporary software fix is to reload the page table on every kernel invocation. This invalidates the caches and happens to fix ASLR as well, but I think fixing ASLR was just a side effect - there's not enough information to know for sure.

I could be completely wrong, but this media hush would make very little sense if they had merely broken ASLR again, given that ASLR has already been publicly cracked for ages. I believe the sense of urgency, the deployment of high-performance-cost workarounds in macOS, Windows, and Linux, and the planned service outages at Amazon strongly suggest something much more critical was found that directly compromises kernel security on Intel processors.


As I understand it:

a) Program tries to do a read from an address in kernel space

b) The CPU speculatively executes the read and tags it as "will generate page fault" (so that a page fault will occur at retirement). At the same time, without regard to permission checks (and likely in parallel with them), it either speculatively reads the data into a temporary register (if the page is present) or pretends that the data being read will be zero (if the page is not present), for performance reasons - so that other instructions can be speculatively executed after the read. Note that the data (if any) in the temporary register cannot be accessed directly (it won't become "architecturally visible" when the instruction retires).

c) The program does a read from an address that depends on the temporary register set by the first read. This second read is also speculatively executed, so it uses the "speculatively assumed" value in the temporary register. That causes a cache line to be fetched, again for performance reasons (to avoid a full cache-miss penalty if the speculatively executed instruction is committed rather than discarded).

d) Program "eats" the page fault (caused by step a) somehow so that it can continue (e.g. signal handler).

e) Program detects if the cache line corresponding to "temporary register was zero" was pre-fetched (at step c) by measuring the amount of time a read from this cache line takes (a cache hit or cache miss).

In this way (or at least, something vaguely like it), the program determines if a virtual address in kernel space corresponds to a "present" page or a "not present" page (without any clue what the page contains or why it's present or if the page is read-only or read/write or executable or even if the page is free/unused space on the kernel heap).

- Brendan

Score: 5

RE[3]: Overhyped
by galvanash on Wed 3rd Jan 2018 06:41 in reply to "RE[2]: Overhyped"
galvanash Member since:
2006-01-25

There has to be more to it than that. I mean I'm not saying your analysis is wrong, but it has to be incomplete. Someone has either demonstrated a reliable attack using this exploit to compromise and/or crash affected systems from low privilege user space code, or there is more to it than there appears to be.

No way would everyone issue fixes like this in such a cloak and dagger fashion, especially a fix that causes a significant performance regression, if it wasn't scaring the crap out of some people...

Score: 8

RE[4]: Overhyped
by Brendan on Wed 3rd Jan 2018 07:40 in reply to "RE[3]: Overhyped"
Brendan Member since:
2005-11-16

Hi,

There has to be more to it than that. I mean I'm not saying your analysis is wrong, but it has to be incomplete. Someone has either demonstrated a reliable attack using this exploit to compromise and/or crash affected systems from low privilege user space code, or there is more to it than there appears to be.

No way would everyone issue fixes like this in such a cloak and dagger fashion, especially a fix that causes a significant performance regression, if it wasn't scaring the crap out of some people...


You're right - there's something I've overlooked.

For a sequence like "movzx edi,byte [kernelAddress]" then "mov rax,[buffer+edi*8]", if the page is present, the attacker could find out which cache line (in their buffer) got fetched and use that to determine 5 bits of the byte at "kernelAddress".

With 3 more individual attempts (e.g. with "mov rax,[buffer+4+edi*8]", "mov rax,[buffer+2+edi*8]" and "mov rax,[buffer+1+edi*8]") the attacker could determine the other 3 bits and end up knowing the whole byte.

Note: It's not that easy - with a single CPU the kernel's page fault handler would pollute the caches a little (and could deliberately invalidate or completely pollute caches as a defence) before you can measure which cache line was fetched. To prevent that the attacker would probably want/need to use 2 CPUs that share caches (and some fairly tight synchronisation between the CPUs so the timing isn't thrown off too much).

- Brendan

Edited 2018-01-03 07:40 UTC

Score: 3

RE[4]: Overhyped
by Alfman on Wed 3rd Jan 2018 09:27 in reply to "RE[3]: Overhyped"
Alfman Member since:
2011-01-28

Brendan,

In this way (or at least, something vaguely like it), the program determines if a virtual address in kernel space corresponds to a "present" page or a "not present" page (without any clue what the page contains or why it's present or if the page is read-only or read/write or executable or even if the page is free/unused space on the kernel heap).


The thing is, similar results have been achieved in the past using different techniques, with hardly anyone blinking an eye.

For instance:
https://gruss.cc/files/prefetch.pdf
Indeed, prefetch instructions leak timing information on the exact translation level for every virtual address. More severely, they lack a privilege check and thus allow fetching inaccessible privileged memory into various CPU caches. Using these two properties, we build two attack primitives: the translation-level oracle and the address-translation oracle. Building upon these primitives, we then present three different attacks. Our first attack infers the translation level for every virtual address, effectively defeating ASLR. Our second attack resolves virtual addresses to physical addresses on 64-bit Linux systems and on Amazon EC2 PVM instances in less than one minute per gigabyte of system memory. This allows an attacker to perform ret2dir-like attacks. On modern systems, this mapping can only be accessed with root or kernel privileges to prevent attacks that rely on knowledge of physical addresses. Prefetch Side-Channel Attacks thus render existing approaches to KASLR ineffective. Our third attack is a practical KASLR exploit. We provide a proof-of-concept on a Windows 10 system that enables return-oriented programming on Windows drivers in memory. We demonstrate our attacks on recent Intel x86 and ARM Cortex-A CPUs, on Windows and Linux operating systems, and on Amazon EC2 virtual machines.



Incidentally, this paper from 2016 recommends the exact same (slow) countermeasures that the Linux kernel and others are now implementing. So if merely getting this meta information was such a big deal on its own, why didn't they act sooner, when ASLR weaknesses first started being published?


I believe/assume the answer to this lies in the possibility that someone actually managed to breach a kernel boundary to either read or write memory without proper access. I don't know if this is true, but IMHO it would explain the urgency we're seeing.



galvanash,

There has to be more to it than that. I mean I'm not saying your analysis is wrong, but it has to be incomplete. Someone has either demonstrated a reliable attack using this exploit to compromise and/or crash affected systems from low privilege user space code, or there is more to it than there appears to be.


We'll have to wait and see, but as long as we're speculating, here's mine ;) Assuming that the speculative engine doesn't check page permissions before executing speculative branches (as some sources have mentioned), I wonder if maybe Intel is leaking protected memory through a side-channel attack on indirect jumps?



Pseudo code example:

// 4GB filled with nothing but "ret" opcodes
char *dummy = malloc(0x100000000ULL);
memset(dummy, 0xC3, 0x100000000ULL);   // (1<<32 would overflow a 32-bit int)

// ignore the page faults...
sigaction(SIGSEGV, IGNORE);

long best_clocks = LONG_MAX;   // start high so the first probe always wins
long best_x;
long clocks;

clearcache();
asm(
    "mov eax, [FORBIDDEN_MEMORY_ADDRESS]"   // faults, but only at retirement
    "call [$dummy+eax]");                   // speculatively fetches dummy+eax

for (long long x = 0; x < 0x100000000ULL; x += CACHE_WIDTH) {

    clocks = cpuclocks();
    char ignore = dummy[x];
    clocks = cpuclocks() - clocks;

    if (best_clocks > clocks) {   // fastest read = line already in cache
        best_clocks = clocks;
        best_x = x;
    }

}

printf("The value at %lx is near %ld!\n", FORBIDDEN_MEMORY_ADDRESS, best_x);


Bear in mind this is just a very rough idea, but the theory is that if the branch predictor speculatively follows the branch, then the page corresponding to the hidden kernel value should get loaded into cache. The call will inevitably trigger a fault, which is expected, but the state of the cache will not get reverted and will therefore leak information about the value in kernel memory. Scanning the dummy memory under clock analysis should reveal which pages are in cache. Different variations of this idea could provide more information.

Edited 2018-01-03 09:39 UTC

Score: 4

RE[3]: Overhyped
by le_c on Wed 3rd Jan 2018 07:23 in reply to "RE[2]: Overhyped"
le_c Member since:
2013-01-02

Would it be possible to slow down page fault notifications? For example, if the page fault was not in kernel space, halt the application for the time offset of a kernel read, so that all segfaults would be reported after the same delay.

Are there any sane apps that depend on timely segfault handling and would thus be affected by such a workaround?

Score: 2