This is the first post in my Bit fields series, which describes how not to use bit fields, how to use them, the limitations imposed by the architecture and the compiler’s implementation, the use of volatile, and finally a show-stopper along with a proposal to fix it.
Specification
Consider the following documentation for the Line Control Register on the National Semiconductor 8250/16×50 Universal Asynchronous Receiver/Transmitter (UART):
Bit(s) | Field | Bit values | Meaning
---|---|---|---
7 | Divisor Latch Access Bit (DLAB) | |
6 | Set Break Enable (BREAK) | |
5, 4, 3 | Parity Select | X X 0 | No Parity
5, 4, 3 | Parity Select | 0 0 1 | Odd Parity
5, 4, 3 | Parity Select | 0 1 1 | Even Parity
5, 4, 3 | Parity Select | 1 0 1 | Mark (1)
5, 4, 3 | Parity Select | 1 1 1 | Space (0)
2 | Stop Bits | 0 | 1 Stop Bit
2 | Stop Bits | 1 | 1.5 Stop Bits (5-bit words) or 2 Stop Bits (6- to 8-bit words)
1, 0 | Word Length | 0 0 | 5 Bits
1, 0 | Word Length | 0 1 | 6 Bits
1, 0 | Word Length | 1 0 | 7 Bits
1, 0 | Word Length | 1 1 | 8 Bits
I chose this real-world example for a number of reasons:
- It’s quite small;
- It has both one-bit and multi-bit fields;
- The interpretation of one field (Stop Bits) depends on another field (Word Length);
- Not all bit patterns are used (there are “dead” encodings);
- The register is used for more than just setting the character format (DLAB and BREAK fields are for other functions).
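To make the inter-field dependency and the “dead” encodings concrete, here is a minimal sketch of decoding a raw LCR value according to the table above. The function names are hypothetical and the masks are written as plain hex constants; this is an illustration of the specification, not code from any particular driver.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned char byte;

/* The register layout in the table assumes 8-bit bytes. */
static_assert(CHAR_BIT == 8, "LCR decoding assumes 8-bit bytes");

/* Word length in bits (5 to 8), taken from bits 1 and 0. */
static unsigned LCR_WordLengthBits(byte lcr)
{
    return 5u + (lcr & 0x03u);
}

/* Stop time in half-bit units: 2 (1 bit), 3 (1.5 bits) or 4 (2 bits).
 * With bit 2 set, the answer depends on the Word Length field. */
static unsigned LCR_StopHalfBits(byte lcr)
{
    if ((lcr & 0x04u) == 0u)
        return 2u;
    return (LCR_WordLengthBits(lcr) == 5u) ? 3u : 4u;
}

/* Parity is enabled only when bit 3 is set; with bit 3 clear,
 * bits 5 and 4 are "don't care", so several patterns mean the same thing. */
static int LCR_ParityEnabled(byte lcr)
{
    return (lcr & 0x08u) != 0u;
}
```

Note that the same Stop Bits value (bit 2 set) decodes to two different stop times, which is exactly the kind of dependency a simple list of constants cannot express.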
Definitions
The following code uses #defines for each of the definitions. Note that I used “binary literal” syntax (a compiler extension prior to C23) to highlight the bit values:
    //
    // UART/8250/LCR.h
    //
    // These are the definitions for the Line Control Register (LCR) of the
    // National Semiconductor 8250 UART and its derivatives (16x50 etc.)
    //
    #ifndef UART_8250_LCR_h
    #define UART_8250_LCR_h

    #define LCR_WORD_MASK       0b00000011  // Use to clear field first
    #define LCR_WORD_5          0b00000000
    #define LCR_WORD_6          0b00000001
    #define LCR_WORD_7          0b00000010
    #define LCR_WORD_8          0b00000011
    #define LCR_WORD_BAUDOT     LCR_WORD_5  // Baudot code (ITA-1), not ASCII
    #define LCR_WORD_MURRAY     LCR_WORD_5  // (also used for Murray code, ITA-2)

    #define LCR_STOP_MASK       0b00000100
    #define LCR_STOP_1          0b00000000
    #define LCR_STOP_2          0b00000100  // When Word Length is 6-8 bits
    #define LCR_STOP_1_5        LCR_STOP_2  // When Word Length is 5 bits

    #define LCR_PARITY_MASK     0b00111000  // Use to clear field first
    #define LCR_PARITY_NONE     0b00000000
    #define LCR_PARITY_ODD      0b00001000
    #define LCR_PARITY_EVEN     0b00011000
    #define LCR_PARITY_MARK     0b00101000
    #define LCR_PARITY_SPACE    0b00111000

    #define LCR_BREAK_MASK      0b01000000
    #define LCR_BREAK_DISABLE   0b00000000
    #define LCR_BREAK_ENABLE    0b01000000

    #define LCR_DLAB_MASK       0b10000000
    #define LCR_DLAB_DISABLE    0b00000000
    #define LCR_DLAB_ENABLE     0b10000000

    #endif // UART_8250_LCR_h
Usage
The above definitions have the advantage that it is easy to form a complete value for the LCR by simply ORing together the appropriate values:
    // Useful typedef. It does assume an 8-bit-byte architecture!
    typedef unsigned char byte;

    // The most common RS-232 character format:
    // No parity, 8-bit ASCII, with one stop bit
    static const byte N81 = (LCR_PARITY_NONE | LCR_WORD_8 | LCR_STOP_1);

    // A really old teletype format:
    // No parity, 5-bit Baudot, one and a half stop bit times
    static const byte Baudot = (LCR_PARITY_NONE | LCR_WORD_5 | LCR_STOP_1_5);

    byte LCR_GetEncoding(byte parity, byte wordLength, byte stopBits) {
        return (byte)(parity | wordLength | stopBits);
    } // LCR_GetEncoding(parity, wordLength, stopBits)
Complications
That is all well and good for easy initial configuration of the LCR—as long as the programmer doesn’t try to OR in multiple values from the same field-group (which the above naming convention highlights)!
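To see why mixing values from one field-group is dangerous, consider this small sketch. It repeats a few of the LCR values in hex (an assumption made only so the snippet compiles without the binary-literal extension) and uses compile-time assertions to show that the mistake produces another perfectly valid encoding rather than an obvious error.

```c
#include <assert.h>

/* Values copied from the LCR.h listing, rewritten in hex. */
#define LCR_WORD_6        0x01
#define LCR_WORD_7        0x02
#define LCR_WORD_8        0x03
#define LCR_PARITY_EVEN   0x18
#define LCR_PARITY_MARK   0x28
#define LCR_PARITY_SPACE  0x38

/* ORing two word lengths silently yields a third, valid word length. */
static_assert((LCR_WORD_6 | LCR_WORD_7) == LCR_WORD_8,
              "6 bits | 7 bits aliases 8 bits");

/* The same happens in the parity field. */
static_assert((LCR_PARITY_EVEN | LCR_PARITY_MARK) == LCR_PARITY_SPACE,
              "even | mark aliases space");
```

Neither the compiler nor the hardware can flag this: the result is a legal register value, just not the one intended.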
But what if the compound value needs to have one of the fields modified? For example, say the word length needs to be changed from 8 bits to 7 bits? Then the old value needs to be masked out before the new value can be ORed in:
    lcr = (byte)((lcr & ~LCR_WORD_MASK) | LCR_WORD_7);
Forgetting to mask out the old value (don’t forget the ~ operator!) is a common error when dealing with bit fields. Indeed, helper macros are often defined to avoid errors like that.
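As a minimal sketch of such a helper, the macro below bundles the clear-then-set steps so the ~ cannot be forgotten. The names LCR_SET_FIELD and LCR_SetWordLength are hypothetical, and the LCR values are repeated in hex only so the snippet is self-contained.

```c
#include <assert.h>

typedef unsigned char byte;

/* Values copied from the LCR.h listing, rewritten in hex. */
#define LCR_WORD_MASK    0x03
#define LCR_WORD_7       0x02
#define LCR_WORD_8       0x03
#define LCR_PARITY_MASK  0x38
#define LCR_PARITY_NONE  0x00

/* Hypothetical helper: clear the field selected by mask, then OR in value.
 * The value is also masked so it cannot spill into a neighbouring field. */
#define LCR_SET_FIELD(reg, mask, value) \
    ((byte)(((reg) & ~(mask)) | ((value) & (mask))))

/* Hypothetical wrapper that also checks the argument belongs to the field. */
static byte LCR_SetWordLength(byte lcr, byte wordLength)
{
    assert((wordLength & ~LCR_WORD_MASK) == 0);
    return LCR_SET_FIELD(lcr, LCR_WORD_MASK, wordLength);
}
```

With this in place, the change from 8 bits to 7 bits becomes `lcr = LCR_SetWordLength(lcr, LCR_WORD_7);`, and the other fields in lcr are left untouched.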
Note that sometimes it isn’t necessary to do both the & masking and the | setting: if the desired value is all 1s for the bit field, the masking can be omitted, and if it is all 0s, the setting can be omitted. This is not recommended, however! If a later update to the code changes either the definition or the required value, then this “optimisation” will end up being an error. It is best to let the compiler determine that an operation is not required by examining the values, rather than being clever at the moment of writing the code.
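The following sketch shows how the shortcut breaks when the required value changes. Both function names are hypothetical, and the word-length values are repeated in hex only to keep the snippet self-contained.

```c
#include <assert.h>

typedef unsigned char byte;

/* Values copied from the LCR.h listing, rewritten in hex. */
#define LCR_WORD_MASK  0x03
#define LCR_WORD_6     0x01
#define LCR_WORD_7     0x02
#define LCR_WORD_8     0x03

/* Shortcut: OR only. Correct only while the new value is all 1s
 * within its field (as LCR_WORD_8 happens to be). */
static byte LCR_SetWord_OrOnly(byte lcr, byte wordLength)
{
    return (byte)(lcr | wordLength);
}

/* Full form: always clear the field, then set it. */
static byte LCR_SetWord_Full(byte lcr, byte wordLength)
{
    assert((wordLength & ~LCR_WORD_MASK) == 0);
    return (byte)((lcr & ~LCR_WORD_MASK) | wordLength);
}
```

Starting from LCR_WORD_6, the OR-only version gives the right answer for LCR_WORD_8, but asking it for LCR_WORD_7 also yields LCR_WORD_8. The full form returns LCR_WORD_7 as intended, and when the new value is a compile-time constant an optimising compiler can usually remove any redundant operation by itself.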
Extension
Another commonly used #define construct applies when the same set of values is shared by multiple fields within the one struct. For this contrived example, assume a register that controls the clock divisor input to a number of peripherals. All values are the same, since it’s the position that dictates which peripheral is being configured:
Bits | Field | Bit values | Divisor
---|---|---|---
7, 6 | I²C clock divisor | 0 0 | MCU clock ÷1
7, 6 | I²C clock divisor | 0 1 | MCU clock ÷2
7, 6 | I²C clock divisor | 1 0 | MCU clock ÷4
7, 6 | I²C clock divisor | 1 1 | MCU clock ÷8
5, 4 | SPI clock divisor | 0 0 | MCU clock ÷1
5, 4 | SPI clock divisor | 0 1 | MCU clock ÷2
5, 4 | SPI clock divisor | 1 0 | MCU clock ÷4
5, 4 | SPI clock divisor | 1 1 | MCU clock ÷8
3, 2 | UART2 clock divisor | 0 0 | MCU clock ÷1
3, 2 | UART2 clock divisor | 0 1 | MCU clock ÷2
3, 2 | UART2 clock divisor | 1 0 | MCU clock ÷4
3, 2 | UART2 clock divisor | 1 1 | MCU clock ÷8
1, 0 | UART1 clock divisor | 0 0 | MCU clock ÷1
1, 0 | UART1 clock divisor | 0 1 | MCU clock ÷2
1, 0 | UART1 clock divisor | 1 0 | MCU clock ÷4
1, 0 | UART1 clock divisor | 1 1 | MCU clock ÷8
This specification could easily be implemented as previously, with repetitive values for each of the divisors, but subtly different names.
An alternative would be to define the clock divisor values just once, and a shift value for each of the different peripherals:
    //
    // Clock/CDR.h
    //
    // These are the definitions for a Clock Divisor Register (CDR)
    // for a contrived MCU.
    //
    #ifndef Clock_CDR_h
    #define Clock_CDR_h

    #define CDR_DIVISOR_MASK    0b11
    #define CDR_DIVISOR_1       0b00
    #define CDR_DIVISOR_2       0b01
    #define CDR_DIVISOR_4       0b10
    #define CDR_DIVISOR_8       0b11

    #define CDR_SHIFT_I2C       6
    #define CDR_SHIFT_SPI       4
    #define CDR_SHIFT_UART2     2
    #define CDR_SHIFT_UART1     0

    #endif // Clock_CDR_h
Notice how the divisor values are correct for their field, but need to be shifted into position by the CDR_SHIFT_XXX values, like so:
    // Set CDR for SPI to ÷4
    // (the shifted mask must be inverted with ~ to clear the old field)
    cdr = (byte)((cdr & ~(CDR_DIVISOR_MASK << CDR_SHIFT_SPI))
                 | (CDR_DIVISOR_4 << CDR_SHIFT_SPI));
This saves repetition, but makes for some ugly code. Again, macros are usually used to help with this.
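A sketch of what such macros might look like follows. CDR_FIELD, CDR_SET, CDR_GET and CDR_SetDivisor are hypothetical names, and the CDR values are repeated in hex only so the snippet compiles without the binary-literal extension.

```c
#include <assert.h>

typedef unsigned char byte;

/* Values copied from the CDR.h listing, rewritten in hex. */
#define CDR_DIVISOR_MASK  0x3
#define CDR_DIVISOR_1     0x0
#define CDR_DIVISOR_2     0x1
#define CDR_DIVISOR_4     0x2
#define CDR_DIVISOR_8     0x3

#define CDR_SHIFT_I2C     6
#define CDR_SHIFT_SPI     4
#define CDR_SHIFT_UART2   2
#define CDR_SHIFT_UART1   0

/* Hypothetical helper: move a field-relative value into position. */
#define CDR_FIELD(value, shift)  ((value) << (shift))

/* Hypothetical helper: clear the peripheral's field, then set the divisor. */
#define CDR_SET(reg, shift, divisor) \
    ((byte)(((reg) & ~CDR_FIELD(CDR_DIVISOR_MASK, shift)) \
            | CDR_FIELD((divisor) & CDR_DIVISOR_MASK, shift)))

/* Hypothetical helper: read a peripheral's divisor back out. */
#define CDR_GET(reg, shift) \
    ((byte)(((reg) >> (shift)) & CDR_DIVISOR_MASK))

/* Hypothetical checked wrapper around CDR_SET. */
static byte CDR_SetDivisor(byte cdr, unsigned shift, byte divisor)
{
    assert(shift <= 6u && (shift % 2u) == 0u);   /* one of the four fields */
    assert((divisor & ~CDR_DIVISOR_MASK) == 0);  /* value fits in the field */
    return CDR_SET(cdr, shift, divisor);
}
```

Setting the SPI divisor to ÷4 then reads `cdr = CDR_SetDivisor(cdr, CDR_SHIFT_SPI, CDR_DIVISOR_4);`, with the mask inversion and the shifting hidden in one place.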
A better(?) alternative
Most of the ugly code is generic, and the compiler already has an implementation of exactly this to hide it all, if only it were invoked. That’s the subject of the next post: using struct bit fields.
Comments are welcome. I suggest that generic comments on the whole “Bit fields” series and concepts go on the main page, while comments specific to this sub-page are written here.