Linked by Hadrien Grasland on Fri 28th Jan 2011 20:37 UTC
OSNews, Generic OSes
It's recently been a year since I started working on my pet OS project, and I often end up looking backwards at what I have done, wondering what made things difficult in the beginning. One of my conclusions is that while there's a lot of documentation on OS development from a technical point of view, more should be written about the project management aspect of it. Namely, how to go from a blurry "I want to code an OS" vision to either a precise vision of what you want to achieve, or the decision to stop following this path before you hit a wall. This article series aims at putting those interested in hobby OS development on the right track, while keeping this aspect of things in mind.
Thread beginning with comment 460246
HLL FTL
by Innominandum on Mon 31st Jan 2011 02:32 UTC
Innominandum Member since:
2005-11-18

Just in case you guys don't believe me: I compiled and disassembled a small segment of code that happened to be on my screen, under GCC 4.5.2:

x^=(x<<13), x^=(x>>17), x^=(x<<5)

Which resulted in:

8B45F0 mov eax,[rbp-0x10]
C1E00D shl eax,0xd
3145F0 xor [rbp-0x10],eax
8B45F0 mov eax,[rbp-0x10]
C1E811 shr eax,0x11
3145F0 xor [rbp-0x10],eax
8B45F0 mov eax,[rbp-0x10]
C1E005 shl eax,0x5
3145F0 xor [rbp-0x10],eax

Ouch. Six memory references. It runs at an average of 20 cycles on my AMD Phenom 8650. The obvious two-memory-reference replacement runs at an average of 8 cycles, more than twice as fast.

This is basic stuff that even a neophyte ASM programmer would not miss.
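
For anyone following along in C, the "two memory reference" version described above might be sketched roughly as follows. This is only an illustration: the function name `xorshift_step` and the `uint32_t` type are assumptions (the `shr` in the listing suggests an unsigned 32-bit value), and whether a compiler really keeps `x` in a register depends on the optimization level.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the original post.
 * The state is read from memory once, updated in a local
 * variable, and written back once. */
void xorshift_step(uint32_t *state)
{
    uint32_t x = *state;   /* memory reference 1: load  */
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;            /* memory reference 2: store */
}
```

Starting from a state of 1, one call leaves 270369 in `*state`. Without optimization flags GCC will typically still keep the local `x` on the stack, so the source alone does not guarantee the two-reference form.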

Edited 2011-01-31 02:35 UTC

Score: 1

RE: HLL FTL
by WereCatf on Mon 31st Jan 2011 02:47 in reply to "HLL FTL"
WereCatf Member since:
2006-02-15

And what were the compiler parameters for GCC then?

Score: 2

RE[2]: HLL FTL
by WereCatf on Mon 31st Jan 2011 03:13 in reply to "RE: HLL FTL"
WereCatf Member since:
2006-02-15

Replying to myself: I got the same code _without any kind of compiler parameters_, i.e. you are comparing optimized code against completely unoptimized i486-compatible code. The reason why you get such code is quite obvious...

Score: 2

RE: HLL FTL
by Alfman on Mon 31st Jan 2011 03:48 in reply to "HLL FTL"
Alfman Member since:
2011-01-28

Here is what I get, with and without -O3 in GCC 4.4.1.
(I hope this output doesn't get clobbered.)
Edit: It did get clobbered; I needed to fix it manually.

They are both pretty bad; I am actually quite surprised at how poorly GCC handled it. But for the record, I never doubted your claims about being able to do better than the compiler. Does someone have ICC on hand to see its output?


GCC with -O3:
08048410 <func>:
8048410: push %ebp
8048411: mov %esp,%ebp
8048413: mov 0x8(%ebp),%edx
8048416: pop %ebp
8048417: mov %edx,%eax
8048419: shl $0xd,%eax
804841c: xor %edx,%eax
804841e: mov %eax,%edx
8048420: shr $0x11,%edx
8048423: xor %eax,%edx
8048425: mov %edx,%eax
8048427: shl $0x5,%eax
804842a: xor %edx,%eax
804842c: ret

Normal (without -O3):
080483e4 <func>:
80483e4: push %ebp
80483e5: mov %esp,%ebp
80483e7: mov 0x8(%ebp),%eax
80483ea: shl $0xd,%eax
80483ed: xor %eax,0x8(%ebp)
80483f0: mov 0x8(%ebp),%eax
80483f3: shr $0x11,%eax
80483f6: xor %eax,0x8(%ebp)
80483f9: mov 0x8(%ebp),%eax
80483fc: shl $0x5,%eax
80483ff: xor %eax,0x8(%ebp)
8048402: mov 0x8(%ebp),%eax
8048405: pop %ebp
8048406: ret
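
The source that was compiled is not shown in the thread. A function of roughly this shape, taking the value as an argument and returning the result, would be consistent with both listings (argument read from 0x8(%ebp), result left in %eax). The `uint32_t` type is an assumption inferred from the logical `shr`; only the name `func` comes from the disassembly label.

```c
#include <stdint.h>

/* Assumed reconstruction of the test function, for illustration only.
 * The body is the expression quoted at the top of the thread. */
uint32_t func(uint32_t x)
{
    x ^= (x << 13), x ^= (x >> 17), x ^= (x << 5);
    return x;
}
```

Compiling this once with no flags and once with -O3, then running objdump -d on the result, is one way to reproduce the comparison; the exact instructions will vary with the GCC version and target.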

Edited 2011-01-31 03:57 UTC

Score: 1

RE: HLL FTL
by Alfman on Mon 31st Jan 2011 03:52 in reply to "HLL FTL"
Alfman Member since:
2011-01-28

Innominandum,

I see your disassembly is in Intel x86 syntax. How did you generate that? All the GNU tools at my disposal generate AT&T syntax, which I find very annoying.

Score: 1

RE[2]: HLL FTL
by jal_ on Mon 31st Jan 2011 10:16 in reply to "RE: HLL FTL"
jal_ Member since:
2006-11-02

Check the objdump parameters, especially --disassembler-options with value intel-mnemonic.

Score: 2