Linked by Thom Holwerda on Tue 8th Jan 2013 23:27 UTC
Windows So, a rudimentary jailbreak for Windows RT made its way onto the web these past few days. Open source applications were ported right away, and it was confirmed that Windows RT is the full Windows - it's exactly the same as regular Windows, except that it runs on ARM. Microsoft responded to the jailbreak as well.
Thread beginning with comment 548272
RE[9]: x86
by Alfman on Thu 10th Jan 2013 23:34 UTC in reply to "RE[8]: x86"
Alfman Member since:
2011-01-28

viton,

"It is safe to assume that code in question is already optimized."

It's optimized for x86, not necessarily for ARM.

"So it should be enough to just move ARM instructions (according to target CPU issue possibilities) up to nearest branch entry point. Modern OoO ARMs will do the rest for you."


If the instruction sets were 1:1 then the translation would be trivial. But the trouble is that x86 flags & registers are very unique to that architecture. Also, ARM has conditional instructions, so the optimal code path could be very dissimilar to x86's, which needs a lot more jumps. Even different processors in the x86 family can have different optimal code paths.

Maybe a direct translation is good enough for some applications, but it would be like compiling a binary with no optimization for the target processor.

Score: 3

RE[10]: x86
by viton on Fri 11th Jan 2013 03:15 in reply to "RE[9]: x86"
viton Member since:
2005-08-09

"It's optimized for x86, not necessarily for ARM."

It hardly matters for a quick-and-dirty JIT translator.
Translation overhead is worse than small inefficiencies in code.

"But the trouble is that x86 flags & registers are very unique to that architecture"

That is not exposed to C or any language above asm.
Check some executables. Usually only basic flags are used.

"Also ARM has conditional instructions, the optimal code path could be very dissimilar to x86's which needs a lot more jumps."


Do not confuse modern OoO ARMs with old in-order ones.
Conditional execution is actually bad for an OoO engine, as it adds an unwanted dependency between instructions. I heard some rumors that the Cortex-A9 replaces CE with branches internally.
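To make the dependency point concrete, here is a minimal Python sketch. It is only an illustration of the argument, not a model of any real core: `Instr` and `input_deps` are hypothetical names made up for this example. It lists what a plain instruction and a conditionally executed one must wait for before producing a result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str
    srcs: tuple
    predicated: bool = False  # True for e.g. ADDEQ, False for plain ADD


def input_deps(instr: Instr) -> set:
    """Values that must be ready before the instruction can produce its result."""
    deps = set(instr.srcs)
    if instr.predicated:
        # The condition needs the flags, and if the condition fails the old
        # destination value must be kept, so it becomes an input as well.
        deps |= {"flags", instr.dest}
    return deps


plain = Instr("r0", ("r1", "r2"))                   # ADD   r0, r1, r2
cond = Instr("r0", ("r1", "r2"), predicated=True)   # ADDEQ r0, r1, r2

print(sorted(input_deps(plain)))  # ['r1', 'r2']
print(sorted(input_deps(cond)))   # ['flags', 'r0', 'r1', 'r2']
```

The conditional form picks up two extra inputs (the flags and the previous value of its own destination), which is the added dependency being described.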

"Maybe a direct translation is good enough for some applications, but it would be like compiling a binary with no optimization for the target processor."


You don't get it. Nobody cares about such small inefficiencies.
And it has already been done. There is an app that does static binary translation of Windows games for Android (Winulator).

Score: 2

RE[11]: x86
by Alfman on Fri 11th Jan 2013 05:31 in reply to "RE[10]: x86"
Alfman Member since:
2011-01-28

viton,

"It hardly does matter for quick and dirty Jit translator. Translation overhead is worse than small inefficiencies in code."

I'm not sure if it was clear before, so I'll make it explicit now: the goal from the outset would be low-overhead execution of modern x86 software on ARM (with the same API on each, not a reimplementation like Wine). I'm convinced this is possible, but it would require optimization for the target platform.


"That is not exposed to C or any language above asm.
Check some executables. Usually only basic flags are used."


Yeah, but x86 has some funny behavior sometimes: some instructions set the flags while others do not, and some flag results are even undefined, as with the bit-test opcodes, so compilers have to use different opcodes to handle these cases than they would on ARM processors. I'm sure a thorough analysis would reveal instruction-level incompatibilities that I'm hard pressed to come up with on the spot. I expect they'd all be solvable, but a target optimizer would have to be part of the design.
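One well-known instance of this kind of mismatch: after a subtract or compare, x86 sets CF when a borrow occurs, while ARM sets C when no borrow occurs, and ARM's NZCV flags have no counterpart to the x86 parity flag. A rough Python sketch of those two differences follows. The function names are made up for illustration and this is not taken from any real translator.

```python
MASK32 = 0xFFFFFFFF


def x86_sub_carry(a: int, b: int) -> int:
    """x86 SUB/CMP: CF is set when an unsigned borrow occurs (a < b)."""
    return 1 if (a & MASK32) < (b & MASK32) else 0


def arm_subs_carry(a: int, b: int) -> int:
    """ARM SUBS/CMP: C is set when NO borrow occurs (a >= b)."""
    return 1 if (a & MASK32) >= (b & MASK32) else 0


def x86_parity_flag(result: int) -> int:
    """x86 PF: set when the low byte of the result has an even number of 1 bits.

    ARM has no matching flag, so a translator would have to compute it by hand
    whenever guest code actually reads it.
    """
    return 1 if bin(result & 0xFF).count("1") % 2 == 0 else 0


for a, b in [(5, 3), (3, 5), (7, 7)]:
    print(a, b, "x86 CF =", x86_sub_carry(a, b), "ARM C =", arm_subs_carry(a, b))
```

For every input pair the two carry values are opposite, so a translator cannot simply reuse the host carry after a subtract; it has to invert it or pick a different condition code.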


"Do not confuse modern OoO ARMs with old in-order ones."

I don't think I am, but even on x86 OOO doesn't work miracles. Branch misses are very costly, and if we can optimize the code to avoid jumps on ARM then we'll avoid the penalty.


"Conditional execution is actually bad for OoO engine, as it is adding unwanted dependency between instructions"

I think it depends. OOO obviously works best with sets of instructions having no dependencies. On x86, the limited number of registers very commonly gets reused for independent calculations, so x86 designers use OOO plus register renaming to get more work done with so few registers. On ARM, there are many more registers, such that there is no need to "time share" registers, so OOO is inherently going to work differently there.
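For anyone unfamiliar with register renaming, here is a toy Python sketch of the idea. It is a simplification under my own assumptions (unlimited physical registers, no flags, hypothetical function name `rename_registers`), not how any particular CPU implements it. Each write to an architectural register gets a fresh physical register, so two independent calculations that both reuse `eax` stop looking dependent.

```python
from itertools import count


def rename_registers(instrs):
    """Rename (dest, [sources]) instructions onto fresh physical registers."""
    fresh = count()
    mapping = {}  # architectural register -> current physical register

    def current(reg):
        # First read of a register that was never written in this snippet.
        if reg not in mapping:
            mapping[reg] = f"p{next(fresh)}"
        return mapping[reg]

    renamed = []
    for dest, srcs in instrs:
        phys_srcs = [current(s) for s in srcs]  # read mappings before the write
        mapping[dest] = f"p{next(fresh)}"       # every write gets a new register
        renamed.append((mapping[dest], phys_srcs))
    return renamed


# Two independent calculations that both "time share" eax.
code = [
    ("eax", ["ebx"]),  # eax = f(ebx)
    ("ecx", ["eax"]),  # ecx = g(eax)
    ("eax", ["edx"]),  # eax = f(edx)   reuses eax for unrelated work
    ("esi", ["eax"]),  # esi = g(eax)
]

for line in rename_registers(code):
    print(line)
```

After renaming, the first pair of instructions only touches `p0`..`p2` and the second pair only `p3`..`p5`, so nothing stops them from running in parallel.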

I'm not willing to draw a conclusion here about performance characteristics without having performed some actual benchmarks.


"You don't get it. Nobody cares about such a small inefficiencies."

People in this thread complained about it already. And even if they hadn't, ARM devices in particular have much more stringent power requirements than the typical x86 computer.

QEMU already does what you're talking about, yet it suffers from the opcode replication problems I'm referring to because it lacks a target optimizer and might need multiple instructions to replicate x86 instruction semantics on ARM. And beyond that, directly translated x86 code can't possibly make good use of ARM's registers. Do you think ARM would have them if they were not important?
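To illustrate what I mean by a target optimizer, here is a small Python sketch of one such pass: dropping flag updates that get overwritten before anything reads them. This is a hypothetical toy, not QEMU's actual design. `Op`, `naive_translate` and `optimized_translate` are invented names, and the emitted "host ops" are just strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    writes_flags: bool = False
    reads_flags: bool = False


def naive_translate(guest_ops):
    """Expand each guest op into host ops, materialising flags every time."""
    out = []
    for op in guest_ops:
        out.append(f"do_{op.name}")
        if op.writes_flags:
            out.append(f"save_flags_{op.name}")
    return out


def optimized_translate(guest_ops):
    """Same expansion, but skip flag updates that are dead (never read)."""
    needed = [False] * len(guest_ops)
    live = True  # conservatively assume flags are read after the block
    for i in range(len(guest_ops) - 1, -1, -1):
        op = guest_ops[i]
        if op.writes_flags:
            needed[i] = live  # only needed if someone later reads them
            live = False
        if op.reads_flags:
            live = True       # an op reads its input flags before writing
    out = []
    for op, keep in zip(guest_ops, needed):
        out.append(f"do_{op.name}")
        if keep:
            out.append(f"save_flags_{op.name}")
    return out


block = [
    Op("add", writes_flags=True),
    Op("sub", writes_flags=True),
    Op("cmp", writes_flags=True),
    Op("jcc", reads_flags=True),
]

print(len(naive_translate(block)), naive_translate(block))
print(len(optimized_translate(block)), optimized_translate(block))
```

On this block the naive version emits 7 host ops and the optimized one 5, because only the flags from `cmp` are ever consumed. How much this matters on real code is exactly what would need benchmarking.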

This next link has the same goal, and identifies some of the same code conversion pitfalls. They're thinking of using LLVM's optimizer instead of GCC's.

http://www.winehq.org/pipermail/wine-devel/2011-April/089638.html


"There is an app that does static binary translation of Windows games for Android. (Winulator)"

Ah, that's a decent example, good tip! The website has no information about how it works though, and doesn't even show any benchmarks. The FAQ just talks about running games from the '90s, so its efficiency is unknown, but yes, there are similar projects to be found.

Edited 2013-01-11 05:47 UTC

Score: 3