This is the fourth post in my Bit fields series, which describes how not to use bit fields, how to use them, the limitations imposed by the architecture and the compiler’s implementation, the use of volatile, and finally a show-stopper together with a proposal to fix it.
Compiler
Unlike the architecture-defined limitations, K&R also explicitly left some of the compiler’s implementation details unspecified. This doesn’t directly cause a problem with bit field usage (the implementation would at least be internally consistent), but how the implementation interacts with external definitions makes for compatibility issues that need to be addressed. Indeed, this explicit lack of code portability between compilers is one reason touted for not using struct bit fields.
Field position
One aspect that was left explicitly unspecified by K&R was the position of bit fields within the larger struct. “Whole” struct fields are defined to be in ascending (but not necessarily contiguous) memory locations. Bit fields are also specified to be in the defined order, but not where in the encompassing word they should be positioned: specifically, whether the first bit field uses the least significant or most significant bit(s).
In the LCR example in my previous post, the specification is clear that wordLength occupies the two least significant bits. That field appears first in the struct, but the C and C++ specifications equally allow a compiler implementation to interpret that as the two most significant bits. In that case the given example merely has to sequence the fields in the reverse order, without changing their widths. A quick experiment with the compiler will determine whether it uses LSb or MSb ordering, and one can only hope that an update to the compiler won’t for some reason decide to reverse this!
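The quick experiment mentioned above could look something like the following minimal sketch. The struct OrderProbe and the function first_field_is_lsb() are hypothetical names invented for this illustration. The probe assumes that the compiler packs both fields into a single unsigned int storage unit.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical probe: a 1-bit field declared first, with the remaining
   bits of an unsigned int named explicitly so nothing is left as padding. */
struct OrderProbe {
    unsigned int first : 1;
    unsigned int rest  : sizeof(unsigned int) * CHAR_BIT - 1;
};

/* Returns 1 if the first declared field occupies the least significant bit,
   0 if it occupies the most significant bit, and -1 for any other layout. */
int first_field_is_lsb(void)
{
    struct OrderProbe probe;
    unsigned int raw;

    assert(sizeof probe >= sizeof raw);  /* the memcpy below must stay in bounds */

    memset(&probe, 0, sizeof probe);
    probe.first = 1;
    memcpy(&raw, &probe, sizeof raw);    /* view the storage unit as a plain word */

    if (raw == 1u)
        return 1;                        /* LSb-first ordering */
    if (raw == ~(UINT_MAX >> 1))
        return 0;                        /* MSb-first ordering */
    return -1;
}
```

Calling first_field_is_lsb() once at start-up (or in a unit test) gives an early warning if a compiler change ever reverses the ordering. gcc on little-endian targets such as x86 typically reports LSb-first.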
Access size
Also unspecified by K&R was what word access size the compiler should use to encompass the bit field. The language specifies that a bit field cannot cross an int boundary, and it appears they assumed that the compiler would access all bit fields as an int. But with today’s architectures commonly supporting 8-, 16-, 32- and even 64-bit accesses, the language does not determine what word size the compiler will use to access a field.
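As a rough illustration, the sketch below compares the sizes of three hypothetical structs that differ only in the declared type of their single bit field. This shows which storage unit the compiler allocates, not which access width it actually emits; for that, the generated assembly has to be inspected. The struct and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Three hypothetical probes that differ only in the declared field type.
   Bit field types other than int are implementation-defined, although
   gcc, clang and MSVC all accept them. */
struct InChar  { unsigned char  flag : 3; };
struct InShort { unsigned short flag : 3; };
struct InInt   { unsigned int   flag : 3; };

/* Fills out[0..2] with the struct sizes for the char, short and int probes. */
void bitfield_storage_sizes(size_t out[3])
{
    assert(out != NULL);
    out[0] = sizeof(struct InChar);
    out[1] = sizeof(struct InShort);
    out[2] = sizeof(struct InInt);
}
```

If the three sizes differ, the declared type is steering the storage unit on that compiler. If they are all the same, the compiler is ignoring the declared type for layout purposes.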
Example
Given the following definition, there are a number of questions as to how the compiler could (or should) arrange and access the fields.
    struct Example {
        int  loInt  : 3;
        int  hiInt  : 4;
        char loByte : 3;
        char hiByte : 4;
        int         : 0;   // Padding
        int  next   : 8;
    }; // Example
- What is sizeof(Example)?
  K&R specified that an unnamed field of length 0 was a signal to the compiler to align the next field on an int boundary. Thus the size has to be at least two ints: a minimalist packing of the struct could put the char inside the int too. But the change of types might signal a request to the compiler to start a new int, thus it would be three ints.
  To avoid problems, depending on what was desired, either an explicit padding field should be between hiInt and loByte, or the chars should be replaced by ints.

- What if bool was used?
  bool is a “natural” for struct bit fields, since many (but not all) fields are simply flags. Thus while the compiler might represent a single bool with the size of a char or int, a series of bool fields would certainly be implemented by the compiler as single bits.

- What if enums were used?
  In the LCR example it was assumed that WordLengths, Parities and StopBits would be jammed together within the same byte, and indeed gcc does exactly that. But a strict adherence to a rule that “different types should start a new int” would obviate this usage. Perhaps the compiler implements that “different type sizes should start a new int”, but mostly it’s because all enums are expressed as ints, so there is no type change.

- What access size should be used for loByte and hiByte?
  A programmer might expect an 8-bit access would be used, but this isn’t actually required by the specification. The compiler is permitted to use an int-sized access instead, as long as that access doesn’t overlap an adjacent variable. (Note that the compiler may very well externally pad a struct bit field to an int boundary to prevent this, while leaving the struct’s size as 1.)

- What access size should be used for loInt and hiInt?
  Similarly, a programmer might expect an int-sized access; after all, that’s the declared type! But again, since the bit field is small enough for a byte access, it might be more size or speed efficient for the compiler to use one.
In short, there are a lot of implementation variations permitted with the above example. And usually, the compiler’s decisions are invisible to the programmer, unless the access is to external hardware rather than simple memory. Then, each and every usage of any field may not behave as the hardware requires, so there needs to be a way to signal to the compiler what sort of access is required. And of course, the signal isn’t defined in the specification, so it becomes another non-portable compiler issue, and another argument to avoid struct bit fields.
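Some of the layout questions about struct Example can be answered for a particular compiler with a small probe like the sketch below. The two helper functions are hypothetical names for this illustration. The second one assumes that storing to one bit field leaves the zeroed bytes of the other storage units untouched, which holds in practice on common compilers but is not promised by the language.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The struct from the example, repeated so this block stands alone. */
struct Example {
    int  loInt  : 3;
    int  hiInt  : 4;
    char loByte : 3;
    char hiByte : 4;
    int         : 0;   /* Padding: align the next field to an int boundary */
    int  next   : 8;
};

/* Total size the compiler chose for the struct. */
size_t example_size(void)
{
    return sizeof(struct Example);
}

/* Byte offset of the first byte that changes when only `next` is set.
   Returns sizeof(struct Example) if no byte changed. */
size_t first_byte_touched_by_next(void)
{
    struct Example e;
    unsigned char bytes[sizeof e];
    size_t i;

    memset(&e, 0, sizeof e);
    e.next = 1;
    assert(e.next == 1);                 /* sanity check on the store */
    memcpy(bytes, &e, sizeof e);

    for (i = 0; i < sizeof e; ++i) {
        if (bytes[i] != 0)
            return i;
    }
    return sizeof e;
}
```

Comparing example_size() against multiples of sizeof(int) shows whether the compiler packed the char fields into the first int or started a new one, and first_byte_touched_by_next() shows where the zero-width field pushed next.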
Getting the definition of the struct bit fields correct, to map to the hardware’s specification, may require changing the types of the bit fields to get the final size and field positions correct. But whereas an architecture may allow unaligned or byte-sized accesses to ordinary memory, the hardware peripheral may impose restrictions such as fully aligned and/or full-size accesses. Any deviation may cause anything from an ignored access to a full-blown bus error!
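One common workaround for a peripheral that insists on full-size accesses is to read the whole register into a local copy, change the field there, and write the whole word back. The sketch below is only an illustration under stated assumptions, not a method taken from this series: the CtrlReg layout, its field names and the function ctrl_set_mode() are all hypothetical, the register is assumed to be 32 bits wide, and uint32_t is assumed to be acceptable as a bit field type (implementation-defined, but accepted by the common compilers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit control register. Names and widths are illustrative
   only; the bit positions depend on the compiler's field ordering. */
typedef union {
    struct {
        uint32_t enable : 1;
        uint32_t mode   : 3;
        uint32_t        : 28;  /* unused bits */
    } bits;
    uint32_t word;
} CtrlReg;

/* Updates the mode field using exactly one 32-bit read and one 32-bit
   write through the volatile pointer. The bit field manipulation happens
   on a local copy, so the compiler's choice of access size there never
   reaches the hardware. */
void ctrl_set_mode(volatile uint32_t *reg, unsigned int mode)
{
    CtrlReg shadow;

    assert(reg != NULL);
    shadow.word = *reg;              /* single full-width read */
    shadow.bits.mode = mode & 7u;    /* modify the local copy only */
    *reg = shadow.word;              /* single full-width write */
}
```

The cost is that every update becomes an explicit read-modify-write, and the convenience of assigning directly to a field of the register is lost.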
Importantly
But even including all of the above, there’s one consideration that is more important than all others when accessing hardware registers.
Comments are welcome. I suggest that generic comments on the whole “Bit fields” series and concepts go on the main page, while comments specific to this sub-page are written here.