An int is 2 bytes on some architectures but 4 bytes on most. I never understood this inconsistency. What if a program written for a 4-byte arch is moved to a 2-byte arch?
I would have chosen names like
int8, uint8, int16, uint16, int32, etc.
or
int1, uint1, int2, uint2, int4, etc.
similar to what Fortran does, instead of char, int, short, long, long long, etc.
But, at the time, nobody knew that a byte = 8 bits would be where the dust settled.
An exercise that I don’t know the answer to. What’s the most efficient way of inserting the masked bits of some value into a result at some bit offset? Is it this:
result ^= (result ^ value<<offset) & mask<<offset
If not – what is? There should be a macro or library function to do this sort of thing.
An exercise that I don’t know the answer to. What’s the most efficient way of inserting a mask of bits of some value into a result at some bit offset.
Not sure what you mean by this.
The new C99 standard includes a header, stdint.h, that does actually declare int16_t and more.
The list includes int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, intptr_t, uintptr_t, int_least32_t.. int_fast32_t.. intmax_t
Incidentally, a byte isn’t necessarily eight bits: it’s defined as the amount of storage needed to hold a char, which is usually 8 bits, but on older or more esoteric systems could be 9 or even 16 bits (C requires CHAR_BIT to be at least 8). The unambiguous name for exactly 8 bits is an octet.
Your exercise is a bit odd. I’d normally just go
result = input & mask;
where result, input and mask were unsigned ints. I don’t see why you need to XOR stuff so much. If you want to merge two masks then it’s just
supermask = mask1 | mask2;
To apply two masks at once it’s
result = input & (mask1 | mask2);
Simple.
Note that neither Visual Studio 6 nor 2003 (.NET) supports inttypes.h/stdint.h.
In part 3: “For example, 7.0 might be stored as a 6.99999 float value–more about precision later.” That isn’t right: the standard guarantees at least 6 decimal digits of precision for float, so any integer (such as 7) shorter than that can be represented exactly, regardless of FLT_RADIX.
In part 4: “If the smallest type on a particular system were 8 bits, the int8_t type would not be defined.” This should probably be “were 16 bits”.
An int is 2 bytes on some architectures but 4 bytes on most. I never understood this inconciseness.
AFAIK, an int in C corresponds at least to the base integer type of the architecture, i.e. it is efficient for computation (16 or 32 bits on a 16-bit architecture, for example). The specification requires that int be at least 16 bits. So, if you are sure that a loop index will never exceed 32767 and you want portable and efficient code, then you should use unsigned int for the type. BTW, long is guaranteed to be at least 32 bits.
Sorry for not being clear. What I meant – by an example
int result=0x12345678;
int value=0xA;
Insert “value” at, say, bit offset 8 in “result” such that “result” becomes 0x12345A78. What’s the most efficient general expression for such an operation?
Claus –
Your follow-up example is still ambiguous at best. Assuming a 32-bit architecture, ‘inserting’ the value as you described would give 0x00000A78. Even if value were only 8 bits, it would be inserted as 0x12340A78. Masking (using an AND operation and assuming 8 bits) would give 0x12340278; with OR you’d get 0x12345D78.
There is not really a macro or piece of code to do what you describe in an efficient way. To do this generically, you would need 4 parameters:
original number, ‘insert’ value, bit offset, length (in bits) of significant bits of insert.
What is such an operation used for?
P.J.
This is rather off topic; in the future, providing an e-mail address would be nice.
Short summary though: replacing a single hex digit (a nibble, not a byte) in a number of _any_ unsigned type could use something like the following:
/*
 * n: the target nibble, counted from the right starting at zero
 * 1) Zero the target nibble
 * 2) OR in the value
 */
result &= ~(0xFu << (n*4));
result |= value << (n*4);
The use of ‘n’ just simplifies alignment; count nibbles from the right starting at zero.
Yes, I know it’s off topic. And then, C is used where such nitty-gritty stuff is needed. The entire problem being that bits are not individually addressable. I needed it a couple of days ago at work (process control). Google suggests that it is also applied in, say, TCP/IP implementations, but did not provide any definitive answers. I think my expression is very similar to AC’s with mask=0xF. Thanks for the replies.
C is used where such nitty gritty stuff is needed.
It’s a misconception that C is ugly and therefore only used for doing ugly stuff. The language itself is great; I think most people get frustrated by the pesky preprocessor/header files, and blame it on C.
Chasing #defines through 4 different header files is not amusing nor elegant. To every OSS programmer out there: Please stop redefining the same thing xxx times!