Linked by Thom Holwerda on Mon 12th Mar 2012 19:00 UTC, submitted by yoni
Privacy, Security, Encryption "And just when you thought the whole Stuxnet/Duqu trojan saga couldn't get any crazier, a security firm who has been analyzing Duqu writes that it employs a programming language that they've never seen before." Pretty crazy, especially when you consider what some think the mystery language looks like: "The unknown c++ looks like the older IBM compilers found in OS400 SYS38 and the oldest sys36. The C++ code was used to write the tcp/ip stack for the operating system and all of the communications."
Thread beginning with comment 510463
acobar
Member since:
2005-11-15

Indeed, you raised very interesting points about the drawbacks of having a calling convention (CC).

Disclaimer: it has been more than 15 years since I last coded in asm.

About the multiple data return (MDR), perhaps, it would create a nightmare for compiler writers for, perhaps, not so much benefit? We also should note that one of the key points of a CC is also to allow code efficiency. For example, if a function returns an integer, the only thing you need to do before calling it is save the return register, for example, eax.

You do:
push eax ; save it as eax will be used as rval
push ff0 ; 2nd arg - 8 bytes
push ebx ; 1st arg
call randomf
add esp, 12
mov [edi], eax ; get rval
pop eax ; restore eax

Suppose you had a MDR operator, like =* for example, and you could declare a function to be like int : float getboth(int i, float f).

You write:
m:q =* getboth(1, 2.0);


Everything is nice so far, but what are the implications if you write:
m : q =* getboth(1, 2.0) * getboth(2, 1.2);
?

You now must extend the syntax of the whole language so that this kind of construction can be useful and, to make the code efficient, you would need to reserve two registers to cope with the return values. Now, imagine you would like to return, say, 16 values on processors with few resources. You would run out of registers.

Also, on C compilers now you just use a reference and the compiler may altogether try to eliminate the associated pushes and pops.
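
A minimal sketch of that workaround, assuming a hypothetical getboth that hands back its two results through caller-supplied pointers (C has pointers rather than references). The arithmetic inside getboth is made up just to have something to return, and getboth_product is an equally hypothetical helper showing what the m : q =* getboth(...) * getboth(...) case turns into when written by hand:

```c
/* Hypothetical getboth(): C has no multiple-return syntax, so both
 * results travel through pointers supplied by the caller.
 * The computation is arbitrary, it only gives us two values. */
void getboth(int i, float f, int *m, float *q)
{
    *m = i * 2;      /* first "return value"  */
    *q = f * 0.5f;   /* second "return value" */
}

/* Hand-written version of: m : q =* getboth(1, 2.0) * getboth(2, 1.2)
 * Call twice, keep four temporaries, then combine component-wise. */
void getboth_product(int *m, float *q)
{
    int m1, m2;
    float q1, q2;

    getboth(1, 2.0f, &m1, &q1);
    getboth(2, 1.2f, &m2, &q2);

    *m = m1 * m2;
    *q = q1 * q2;
}
```

Note that the component-wise meaning of * here is my own choice; a language with an MDR operator would have to define that somewhere, which is the syntax problem described above.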

Score: 2

Alfman Member since:
2011-01-28

acobar,

"About the multiple data return (MDR), perhaps, it would create a nightmare for compiler writers for, perhaps, not so much benefit?"

I guess the multiple return has pros and cons on two fronts:
1. What would be the necessary impact on compiler implementations and calling conventions under the hood?
2. How would this language feature change the way high level software is written?

I won't speak towards #1 since that would deserve a much more in-depth analysis than either of us can commit to for this conversation.

As for #2, there's at least one extremely common use case that crops up over and over again, and that's a function which returns both a status and a data value. This pattern is so common I wouldn't mind a language addressing it specifically.


long pos=ftell(file);
if (pos<0) {
    printf("error %s\n", strerror(errno)); // strerror() is in <string.h>, errno in <errno.h>
}

This convention, which is so common on Linux, has some problems. For one, a set high bit in pos cannot be told apart from an ftell error, so it can never represent a legitimate position. Therefore, because of this overloading, the range is half of what it should be. Another is the use of the TLS global errno to return a status. Maybe it's a necessary evil, but it's still not pretty, and it is still compulsory to flag the error using the returned value. Other times the return value/type cannot be overloaded for the error case at all.
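
To put numbers on the "half the range" point, here is a small sketch using only <limits.h>. The two function names are hypothetical, made up for illustration. With a 32-bit long the first limit is 2^31 - 1 (just under 2 GiB) and the second is 2^32 - 1 (just under 4 GiB):

```c
#include <limits.h>

/* Largest offset a signed long, which is ftell()'s return type, can report. */
unsigned long max_signed_offset(void)
{
    return (unsigned long)LONG_MAX;
}

/* Largest offset the same number of bits could hold if the sign bit
 * were not reserved for signalling the error case. */
unsigned long max_unsigned_offset(void)
{
    return ULONG_MAX;
}
```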

All these problems can be easily/efficiently solved in ASM using output flags and registers, but as you rightly observed the question is how to create a clear syntax to deal with it.

One approach is to adapt the Perl error syntax, which I find pretty clear.

my $pos = ftell(FILE) or die($!); # what to do on error

Of course Perl supports multi-value returns directly too.
my ($a, $b) = ftell(FILE); # this requires just one input register to be occupied, leaving the rest free
# The syntax may be rough for "one-liners", but the $a and $b temp variables can reference the registers as is without any copying.

C forces us to offload the value temporarily into memory:
int pos;
if (!ftell(FILE, &pos)) { error... } // This burns one more input register than the prior version, and also wastes two memory accesses.
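
For comparison, a sketch of the closest thing standard C offers to a status-plus-value return: a small struct returned by value. The names tell_result and tell_both are hypothetical, and only ftell and errno come from the standard library. Whether the pair really stays in registers depends on the ABI and the compiler. On x86-64 System V a 16-byte struct like this one is returned in two registers, but I would not assume that elsewhere.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical result type: status and value travel together,
 * so the position no longer has to double as an error flag. */
struct tell_result {
    int  err;   /* 0 on success, otherwise the errno value set by ftell() */
    long pos;   /* meaningful only when err == 0 */
};

/* Hypothetical wrapper around the standard ftell(). */
struct tell_result tell_both(FILE *file)
{
    struct tell_result r = { 0, 0 };

    errno = 0;
    r.pos = ftell(file);
    if (r.pos == -1L) {   /* ftell() reports failure as -1L and sets errno */
        r.err = errno;
        r.pos = 0;
    }
    return r;
}
```

The caller then writes struct tell_result r = tell_both(file); and checks r.err before touching r.pos, with no out-parameter and no separate read of errno.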


"Also, on C compilers now you just use a reference and the compiler may altogether try to eliminate the associated pushes and pops."

I think C only has leeway to do this for inlined functions. Inter-procedural code cannot be optimized without breaking calling conventions.

Score: 2