Exploration 64: Working with Bits

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

EXPLORATION 64

Working with Bits

This Exploration begins a series of Explorations that cover more advanced topics in the C++ type system. The series kicks off with an examination of how to work with individual bits. This Exploration begins with operators that manipulate integers at the bit level, then introduces bitfields—a completely different way of working with bits. The final topic is the bitset class template, which lets you work with bitsets of any size.

Integer As a Set of Bits

A common idiom in computer programming is to treat an integer as a bitmask. The bits can represent a set of small integers, such that a value n is a member of the set if the bit at position n is one; n is not in the set if the corresponding bit is zero. An empty set has the numeric value zero, because all bits are zero. To better understand how this works, consider the I/O stream formatting flags (introduced in Exploration 39).

Typically, you use manipulators to set and clear flags. For example, Exploration 17 introduced the skipws and noskipws manipulators . These manipulators set and clear the std::ios_base::skipws flag by calling the setf and unsetf member functions. In other words, the following statement:

std::cin >> std::noskipws >> read >> std::skipws;

is exactly equivalent to

std::cin.unsetf(std::ios_base::skipws);
std::cin >> read;
std::cin.setf(std::ios_base::skipws);

Other formatting flags include boolalpha (introduced in Exploration 12), showbase (Exploration 54), showpoint (display a decimal point even when it would otherwise be suppressed), and showpos (show a plus sign for positive numbers). Consult a C++ reference to learn about the remaining formatting flags.

A simple implementation of the formatting flags is to store the flags in an int and assign a specific bit position to each flag. A common way to write flags that you define in this manner is to use hexadecimal notation, as shown in Listing 64-1. Write a hexadecimal integer literal with 0x or 0X, followed by the base 16 value. Letters A through F in upper- or lowercase represent 10 through 15. (The C++ standard does not mandate any particular implementation of the formatting flags. Your library probably implements the formatting flags differently.)

Listing 64-1. An Initial Definition of Formatting Flags

typedef int fmtflags;
fmtflags const showbase  = 0x01;
fmtflags const boolalpha = 0x02;
fmtflags const skipws    = 0x04;
fmtflags const showpoint = 0x08;
fmtflags const showpos   = 0x10;
// etc. for other flags . . .

The next step is to write the setf and unsetf functions . The former function sets specific bits in a flags_ data member (of the std::ios_base class), and the latter clears bits. To set and clear bits, C++ provides some operators that manipulate individual bits in an integer. Collectively, they are called the bitwise operators.

The bitwise operators perform the usual arithmetic promotions and conversions (Exploration 25). The operators then perform their operation on successive bits in their arguments. The & operator implements bitwise and; the | operator implements bitwise inclusive or; and the ~ operator is a unary operator to perform bitwise complement. Figure 64-1 illustrates the bitwise nature of these operators (using & as an example).

Figure 64-1. How the & (bitwise and) operator works

Operator Abuse

It may seem strange to you (it certainly does to me) that C++ uses the same operator to obtain the address of an object and to perform bitwise and. There are only so many characters to go around. The logical and operator is used for rvalue references. The asterisk does double duty for multiplication and dereferencing a pointer or iterator. The difference is whether the operator is unary (one operand) or binary (two operands). So there is no ambiguity, just unusual syntax. Later in this Exploration, you will learn that the I/O operators are repurposed shift operators. But don’t worry, you will get used to it. Eventually.

Implement the setf function. This function takes a single fmtflags argument and sets the specified flags in the flags_ data member. Listing 64-2 shows a simple solution.

Listing 64-2. A Simple Implementation of the setf Member Function

void setf(fmtflags f)
{
   flags_ = flags_ | f;
}

The unsetf function is slightly more complicated. It must clear flags, which means setting the corresponding bits to zero. In other words, the argument specifies a bitmask in which each 1 bit means to clear (set to 0) the bit in flags_. Write the unsetf function. Compare your solution with Listing 64-3.

Listing 64-3. A Simple Implementation of the unsetf Member Function

void unsetf(fmtflags f)
{
   flags_ = flags_ & ~f;
}

Recall from Exploration 46 that various assignment operators combine an arithmetic operator with assignment. Assignment operators also exist for the bitwise functions, so you can write these functions even more succinctly, as shown in Listing 64-4.

Listing 64-4. Using Assignment Operators in the Flags Functions

void setf(fmtflags f)
{
   flags_ |= f;
}
 
void unsetf(fmtflags f)
{
   flags_ &= ~f;
}

Recall from Exploration 56 that the | operator combines I/O mode flags. Now you know that the flags are bits, and the I/O mode is a bitmask. Should the need arise, you can use any of the bitwise operators on I/O modes.

Bitmasks

Not all the flags are individual bits. The alignment flags, for example, can be left, right, or internal. The floating-point style can be fixed, scientific, hexfloat, or general. To represent three or four values, you need two bits. For these situations, C++ has a two-argument form of the setf function. The first argument specifies a mask of bits to set within the field, and the second argument specifies a mask of which bits to affect.

Using the same bitwise operators, you can define adjustfield as a two-bit-wide bitmask, for example, 0x300. If both bits are clear, that could mean left-adjustment; one bit set means right-adjustment; the other bit could mean “internal” alignment (align after a sign or 0x in a hexadecimal value). That leaves one more possible value (both bits set), but the standard library defines only three different alignment values.

Listing 64-5 hows one possible implementation of the adjustfield and floatfield masks and their associated values.

Listing 64-5. Declarations for Formatting Fields

fmtflags static constexpr adjustfield = 0x300;
fmtflags static constexpr left        = 0x000;
fmtflags static constexpr right       = 0x100;
fmtflags static constexpr internal    = 0x200;
fmtflags static constexpr floatfield  = 0xC00;
fmtflags static constexpr scientific  = 0x400;
fmtflags static constexpr fixed       = 0x800;
fmtflags static constexpr hexfloat    = 0xC00;
// general does not have a name; its value is zero

Thus, to set the alignment to right, one calls setf(right, adjustfield). Write the two-argument form of the setf function. Compare your solution with Listing 64-6.

Listing 64-6. Two-Argument Form of the setf Function

void setf(fmtflags flags_to_set, fmtflags field)
{
   flags_ &= ~field;
   flags_ |= flags_to_set;
}

One difficulty with defining bitfields in this fashion is that the numeric values can be hard to read, unless you’ve spent a lot of time working with hexadecimal values. Another solution is to use more familiar integers for all flags and fields and let the computer do the hard work by shifting those values into the correct positions.

Shifting Bits

Listing 64-7 shows another way to define the formatting fields. They represent the exact same values as shown in Listing 64-1, but they are a little easier to proofread.

Listing 64-7. Using Shift Operators to Define the Formatting Fields

int static constexpr boolalpha_pos = 0;
int static constexpr showbase_pos  = 1;
int static constexpr showpoint_pos = 2;
int static constexpr showpos_pos   = 3;
int static constexpr skipws_pos    = 4;
int static constexpr adjust_pos    = 5;
int static constexpr adjust_size   = 2;
int static constexpr float_pos     = 7;
int static constexpr float_size    = 2;
 
fmtflags static constexpr boolalpha   = 1 << boolalpha_pos;
fmtflags static constexpr showbase    = 1 << showbase_pos;
fmtflags static constexpr showpos     = 1 << showpos_pos;
fmtflags static constexpr showpoint   = 1 << showpoint_pos;
fmtflags static constexpr skipws      = 1 << showpoint_pos;
fmtflags static constexpr adjustfield = 3 << adjust_pos;
fmtflags static constexpr floatfield  = 3 << float_pos;
 
fmtflags static constexpr left     = 0 << adjust_pos;
fmtflags static constexpr right    = 1 << adjust_pos;
fmtflags static constexpr internal = 2 << adjust_pos;
 
fmtflags static constexpr fixed      = 1 << float_pos;
fmtflags static constexpr scientific = 2 << float_pos;
fmtflags static constexpr hexfloat   = 3 << float_pos;

The << operator (which looks just like the output operator) is the left-shift operator. It shifts its left-hand operator (which must be an integer) by the number of bit positions specified by the right-hand operator (also an integer). Vacated bits are filled with zero.

1 << 2  == 4
10 << 3 == 80

Although this style is more verbose, you can clearly see that the bits are defined with adjacent values. You can also easily see the size of multi-bit masks. If you have to add a new flag, you can do so without the need to recompute any other fields or flags.

What is the C++ right-shift operator? ________________. That’s right: >>, which is also the input operator.

If the right-hand operand is negative, that reverses the direction of the shift. That is, a left shift by a negative amount is the same as right-shifting by a positive amount and vice versa. You can use the shift operators on integers but not on floating-point numbers. The right-hand operand cannot be greater than the number of bits in the left-hand operand. (Use the numeric_limits class template, introduced in Exploration 25, to determine the number of bits in a type, such as int.)

The C++ standard library overloads the shift operators for the I/O stream classes to implement the I/O operators. Thus, the >> and << operators were designed for shifting bits in an integer and were later usurped for I/O. As a result, the operator precedence is not quite right for I/O. In particular, the shift operators have a higher precedence than the bitwise operators, because that makes the most sense for manipulating bits. As a consequence, if, for instance, you want to print the result of a bitwise operation, you must enclose the expression in parentheses.

std::cout << "5 & 3 = " << (5 & 3) << '
';

One caution when using the right-shift operator: The value of the bits that are filled in is implementation-defined. This can be particularly problematic with negative numbers. The value -1 >> 1 may be positive on some implementations and negative on others. Fortunately, C++ has a way to avoid this uncertainty, as the next section explains.

Safe Shifting with Unsigned Types

Every primitive integer type has a corresponding type that you declare with the unsigned keyword. These types are known—not surprisingly—as unsigned types. One key difference between ordinary (or signed) integer types and their unsigned equivalents is that unsigned types always shift in a zero when right-shifting. For this reason, unsigned types are preferable to signed types for implementing bitmasks.

typedef unsigned int fmtflags;

Write a program to determine how your C++ environment right-shifts negative values. Compare this with shifting unsigned values. Your program will certainly look different from mine, which is shown in Listing 64-8, but you should be able to recognize the key similarities.

Listing 64-8. Exploring How Negative and Unsigned Values Are Shifted

#include <iostream>
#include <string>
 
template<class T>
void print(std::string const& label, T value)
{
   std::cout << label << " = ";
   std::cout << std::dec << value << " = ";
   std::cout.width(8);
   std::cout.fill('0'),
   std::cout << std::hex << std::internal << std::showbase << value << '
';
}
 
int main()
{
   int i{~0}; // all bits set to 1; on most systems, ~0 == -1
   unsigned int u{~0u}; // all bits set to 1
   print("int >> 15", i >> 15);
   print("unsigned >> 15", u >> 15);
}

On my Linux x86 system, I see the following output:

int >> 15 = -1 = 0xffffffff
unsigned >> 15 = 131071 = 0x01ffff

which means right-shifting a signed value fills in the vacated bits with copies of the sign bit (a process known as sign extension, and that right-shifting an unsigned value works correctly by shifting in zero bits.

Signed and Unsigned Types

The plain int type is shorthand for signed int. That is, the int type has two sign flavors: signed int and unsigned int, the default being signed int. Similarly, short int is the same as signed short int, and long int is the same as signed long int. Thus, you have no reason to use the signed keyword with the integer types.

Like too many rules, however, this one has an exception: signed char. The char type comes in three flavors, not two: char, signed char, and unsigned char. All three types occupy the same amount of space (one byte). The plain char type has the same representation as either signed char or unsigned char, but it remains a distinct type. The choice is left to the compiler; consult your compiler’s documentation to learn the equivalent char type for your implementation. Thus, the signed keyword has a use for the signed char type; the most common use for signed char is to represent a tiny, signed integer when conserving memory is important. Use plain char for text, signed char for tiny integers, and unsigned char for tiny bitmasks.

Unfortunately, the I/O stream classes treat signed char and unsigned char as text, not tiny integers or bitmasks. Thus, reading and writing tiny integers is harder than it should be, as demonstrated in the following:

signed char real_value{-42};
std::cout << static_cast<int>(real_value) << '
';
 
int tmp{};
std::cin >> tmp;
if (tmp >= std::numeric_limits<signed char>::min() and
    tmp <= std::numeric_limits<signed char>::max())
{
  real_value = static_cast<signed char>(tmp);
  // do something with real_value
}
else // handle the error

Unsigned Literals

If an integer literal does not fit in a signed int, the compiler tries to make it fit into an unsigned int. If that works, the literal’s type is unsigned int. If the value is too big for unsigned int, the compiler tries long, and then unsigned long, then long long, and finally unsigned long long, before giving up and issuing an error message.

You can force an integer to be unsigned with the u or U suffix. The U and L suffixes can appear in any order for an unsigned long literal. Use ULL for unsigned long long. (Remember that C++ permits lowercase l, but I recommend uppercase L to avoid confusion with numeral 1.)

1234u
4321UL
0xFFFFLu

One consequence of this flexibility is that you can’t always know the type of an integer literal. For example, the type of 0xFFFFFFFF might be int on a 64-bit system. On some 32-bit systems, the type might be unsigned int, and on others, it might be unsigned long. The moral is to make sure you write code that works correctly, regardless of the precise type of an integer literal, which isn’t difficult. For example, all the programs and fragments in this book work on any C++ compiler, regardless of the size of an int.

Type Conversions

A signed type and its unsigned counterpart always occupy the same amount of space. You can use static_cast (Exploration 25) to convert one to the other, or you can let the compiler implicitly perform the conversion, which can result in surprises, if you aren’t careful. Consider the example in Listing 64-9.

Listing 64-9. Mixing Signed and Unsigned Integers

void show(unsigned u)
{
   std::cout << u << '
';
}
 
int main()
{
   int i{-1};
   std::cout << i << '
';
   show(i);
}

This results in the following output on my system:

-1
4294967295

If you mix signed and unsigned values in an expression (usually a bad idea), the compiler converts the signed value to unsigned, which often results in more surprises. This kind of surprise often arises in comparisons. Most compilers will at least warn you about the problem.

Listing 64-10. Mystery Program

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
 
template<class T>
void append(std::vector<T>& data, const T& value, int max_size)
{
  if (data.size() < max_size - 1)
    data.push_back(value);
}
 
int main()
{
  std::vector<int> data{};
  append(data, 10, 3);
  append(data, 20, 2);
  append(data, 30, 1);
  append(data, 40, 0);
  append(data, 50, 0);
  std::copy(data.begin(), data.end(),
          std::ostream_iterator<int>(std::cout, " "));
  std::cout << '
';
}

Before you run the program, predict what Listing 64-10 will print.

_____________________________________________________________

Try it. Were you correct? ________________ Explain what the program does.

_____________________________________________________________

The program succeeds in appending 10 to data because the vector size is zero, which is less than 2. The next call to append, however, does nothing, because the vector size is 1, and max_size - 1 is also 1. The next call fails for a similar reason. So why does the next call succeed in appending 40 to data? Because max_size is 0, you might think the comparison would be with -1, but -1 is signed, and data.size() is unsigned. Therefore, the compiler converts -1 to unsigned, which is an implementation-defined conversion. On typical workstations, -1 converts to the largest unsigned integer, so the test succeeds.

The first moral of the story is to avoid expressions that mix signed and unsigned values. Your compiler might help you here by issuing warnings when you mix signed and unsigned values. A common source for unsigned values is from the size() member functions in the standard library, which all return an unsigned result. You can reduce the chances for surprises by using one of the standard typedefs for sizes, such as std::size_t (defined in <cstdlib> and <cstddef>), which is an implementation-defined unsigned integer type. The standard containers all define a member type, size_type, to represent sizes and similar values for that container. Use these typedefs for your variables when you know you have to store sizes, indices, or counts.

“That’s easy!” you say. “Just change the declaration of max_size to std::vector<T>::size_type, and problem solved!” Maybe you can avoid this kind of problem by sticking with the standard member typedefs, such as size_type and difference_type (Exploration 43). Take a gander at Listing 64-11 and see what you think.

Listing 64-11. Another Mystery Program

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
 
/** Return the index of a value in a range.
 * Look for the first occurrence of @p value in the range
 * [<tt>first</tt>, <tt>last</tt>), and return the zero-based
 * index or -1 if @p value is not found.
 * @param first The start of the range to search
 * @param last One past the end of the range to search
 * @param value The value to search for
 * @return [0, size), such that size == last-first, or -1
 */
template<class InputIter>
typename std::iterator_traits<InputIter>::difference_type
index_of(InputIter first, InputIter last,
         typename std::iterator_traits<InputIter>::value_type const& value)
{
   InputIter iter{std::find(first, last, value)};
   if (iter == last)
      return -1;
   else
      return std::distance(first, iter);
}
 
/** Determine whether the first occurrence of a value in a container is
 * in the last position in the container.
 * @param container Any standard container
 * @param value The value to search for.
 * @return true if @p value is at the last position,
 *         or false if @p value is not found or at any other position.
 */
template<class T>
bool is_last(T const& container, typename T::value_type const& value)
{
    return index_of(container.begin(), container.end(), value) ==
           container.size() - 1;
}
 
int main()
{
   std::vector<int> data{};
   if (is_last(data, 10))
      std::cout << "10 is the last item in data
";
}

Predict the output before you run the program in Listing 64-11.

_____________________________________________________________

Try it. What do you actually get?

_____________________________________________________________

I get “10 is the last item in data,” even though data is clearly empty. Can you spot the conceptual error that I committed? In a standard container, the difference_type typedef is always a signed integral type. Thus, index_of() always returns a signed value. I made the mistake of thinking that the signed value -1 would always be less than any unsigned value because they are always 0 or more. Thus, is_last() would not have to check for an empty container as a special case.

What I failed to take into account is that when a C++ expression mixes signed and unsigned values, the compiler converts the signed value to unsigned. Thus, the signed result from index_of becomes unsigned, and -1 becomes the largest possible unsigned value (on a typical two’s complement system, such as most desktop computers). If the container is empty, size() is zero, and size() - 1 (which the compiler interprets as size() - 1u) is also the largest possible unsigned integer.

If you are fortunate, your compiler issues a warning about comparing signed and unsigned values. That gives you a hint that something is wrong. Fix the program. Compare your solution with Listing 64-12.

Listing 64-12. Fixing the Second Mystery Program

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>
 
/** Return the index of a value in a range.
 * Look for the first occurrence of @p value in the range
 * [<tt>first</tt>, <tt>last</tt>), and return the zero-based
 * index or -1 if @p value is not found.
 * @param first The start of the range to search
 * @param last One past the end of the range to search
 * @param value The value to search for
 * @return [0, size), such that size == last-first, or -1
 */
template<class InputIter>
typename std::iterator_traits<InputIter>::difference_type
index_of(InputIter first, InputIter last,
         typename std::iterator_traits<InputIter>::value_type const& value)
{
   InputIter iter{std::find(first, last, value)};
   if (iter == last)
      return -1;
   else
      return std::distance(first, iter);
}
 
/** Determine whether the first occurrence of a value in a container is
 * in the last position in the container.
 * @param container Any standard container
 * @param value The value to search for.
 * @return true if @p value is at the last position,
 *    or false if @p value is not found or at any other position.
 */
template<class T>
bool is_last(T const& container, typename T::value_type const& value)
{
   typename T::difference_type
        pos(index_of(container.begin(), container.end(), value));
  auto last_index(static_cast<typename T::difference_type>(container.size() - 1));
   return pos != -1 and pos == last_index;
}
 
int main()
{
   std::vector<int> data{};
   if (is_last(data, 10))
      std::cout << "10 is the last item in data
";
}

The second moral of the story is not to use unsigned types if you don’t have to. Most of the time, signed types work just as well. Just because a type’s range of legal values happens to be non-negative is not a reason to use an unsigned type. Doing so only complicates any code that must cooperate with the unsigned type.

Tip When using the standard library, make use of the typedefs and member typedefs that it provides. When you have control over the types, use signed types for all numeric types, including sizes, and reserve the unsigned types for bitmasks. And always be very, very careful every time you write an expression that uses an unsigned type with other integers.

Overflow

Until now, I’ve told you to ignore arithmetic overflow. That’s because it’s a difficult topic. Strictly speaking, if an expression involving signed integers or floating-point numbers overflows, the results are undefined. In reality, your typical desktop system wraps integer overflow (so adding two positive numbers can yield a negative result). Overflow of floating-point numbers can yield infinity, or the program may terminate.

If you must prevent overflow, you should check values before evaluating an expression, to ensure the expression will not overflow. Use std::numeric_limits<> to check the min() and max() values for the type.

If you explicitly cast a signed value to a type, such that the value overflows the destination type, the results are not so dire. Instead of undefined behavior, the results are defined by the implementation. Most implementations simply discard the excess bits. Therefore, for maximum safety and portability, you should check for overflow. Use numeric_limits (Exploration 25) to learn the maximum or minimum value of a type.

Unsigned integers are different. The standard explicitly permits unsigned arithmetic to overflow. The result is to discard any extra high-order bits. Mathematically speaking, this means unsigned arithmetic is modulo 2ⁿ, where n is the number of bits in the unsigned type. If you have to perform arithmetic that you know may overflow, and you want the values to wrap around without reporting an error, use static_cast to cast to the corresponding unsigned types, perform the arithmetic, and static_cast back the original types. The static_casts have no impact on performance, but they clearly tell the compiler and the human what’s going on.

Introducing Bitfields

A bitfield is a way to partition an integer within a class into individual bits or masks of adjacent bits. Declare a bitfield using an unsigned integer type or bool, the field name, a colon, and the number of bits in the field. Listing 64-13 shows how you might store the I/O formatting flags using bitfields.

Listing 64-13. Declaring Formatting Flags with Bitfields

struct fmtflags
{
   bool skipws_f :         1;
   bool boolalpha_f:       1;
   bool showpoint_f:       1;
   bool showbase_f:        1;
   bool showpos_f:         1;
   unsigned adjustfield_f: 2;
   unsigned floatfield_f:  2;
 
   static unsigned constexpr left     = 0;
   static unsigned constexpr right    = 1;
   static unsigned constexpr internal = 2;
 
   static unsigned constexpr fixed      = 1;
   static unsigned constexpr scientific = 2;
   static unsigned constexpr hexfloat   = 3;
};

Use a bitfield member the way you would use any other data member. For example, to set the skipws flag, use

flags.skipws_f = true;

and to clear the flag, use the following:

flags.skipws_f = false;

To select scientific notation, try the line that follows:

flags.floatfield_f = fmtflags::scientific;

As you can see, code that uses bitfields is easier to read and write than the equivalent code using shift and bitwise operators. That’s what makes bitfields popular. On the other hand, it is hard to write functions such as setf and unsetf. It is hard to get or set multiple, nonadjacent bits at one time. That’s why your library probably doesn’t use bitfields to implement I/O formatting flags.

Another limitation is that you cannot take the address of a bitfield (with the & operator), because an individual bit is not directly addressable in the C++ memory model.

Nonetheless, the clarity that bitfields offer puts them at the top of the list when choosing an implementation. Sometimes, other factors knock them off the list, but you should always consider bitfields first. With bitfields, you don’t have to be concerned with bitwise operators, shift operators, mixed-up operator precedence, and so on.

Portability

The C++ standard leaves several details up to each implementation. In particular, the order of bits in a field is left up to the implementation. A bitfield cannot cross a word boundary where the definition of a word is also left up to the implementation. Popular desktop and workstation computers often use 32 bits or 64 bits, but there is no guarantee that a word is the same size as an int. An unnamed bitfield of size zero tells the compiler to insert pad bits, so that the subsequent declaration aligns on a word boundary.

class demo {
  unsigned bit0 : 1;
  unsigned bit1 : 1;
  unsigned bit2 : 3;
  unsigned      : 0;
  unsigned word1: 2;
};

The size of a demo object depends on the implementation. Whether bit0 is the least or most significant bit of a demo’s actual implementation also varies from one system to another. The number of pad bits between bit2 and word1 also depends on the implementation.

Most code does not have to know the layout of the bits in memory. On the other hand, if you are writing code that interprets the bits in a hardware control register, you have to know the order of bits, the exact nature of padding bits, and so on. But you probably aren’t expecting to write highly portable code, anyway. In the most common case, when you are trying to express a compact set of individual set members or small bitmasks, bitfields are wonderful. They are easy to write and easy to read. They are limited, however, to a single word, often 32 bits. For larger bitfields, you must use a class, such as std::bitset.

The bitset Class Template

Sometimes you have to store more bits than can fit in an integer. In that case, you can use the std::bitset class template, which implements a fixed-size string of bits of any size.

The std::bitset class template takes one template argument: the number of bits in the set. Use a bitset object the way you would any other value. It supports all the bitwise and shift operators, plus a few member functions for further convenience. Another nifty trick that bitset can perform is the subscript operator, which lets you access individual bits in the set as discrete objects. The right-most (least significant) bit is at index zero. Construct a bitset from an unsigned long (to set the least-significant bits of the bitset, initializing the remaining bits to zero) or from a string of ‘0’ and ‘1’ characters, as illustrated in Listing 64-14.

Listing 64-14. Example of Using std::bitset

#include <bitset>
#include <cstdlib>
#include <iostream>
#include <string>
 
/** Find the first 1 bit in a bitset, starting from the most significant bit.
 * @param bitset The bitset to examine
 * @return A value in the range [0, bitset.size()-1) or
 *    size_t(-1) if bitset.none() is true.
 */
template<std::size_t N>
std::size_t first(std::bitset<N> const& bitset)
{
   for (std::size_t i{bitset.size()}; i-- != 0;)
      if (bitset.test(i))
         return i;
   return std::size_t(-1);
}
 
int main()
{
   std::bitset<50> lots_o_bits{std::string{"1011011101111011111011111101111111"}};
   std::cout << "bitset: " << lots_o_bits << '
';
   std::cout << "first 1 bit: " << first(lots_o_bits) << '
';
   std::cout << "count of 1 bits: " << lots_o_bits.count() << '
';
   lots_o_bits[first(lots_o_bits)] = false;
   std::cout << "new first 1 bit: " << first(lots_o_bits) << '
';
   lots_o_bits.flip();
   std::cout << "bitset: " << lots_o_bits << '
';
   std::cout << "first 1 bit: " << first(lots_o_bits) << '
';
}

In Exploration 25, I presented static_cast<> as a way to convert one integer to a different type. Listing 64-14 demonstrates another way to convert integer types, using constructor and initializer syntax: std::size_t(-1) or std::size{-1}. For a simple type conversion, this syntax is often easier to read than static_cast<>. I recommend using this syntax only when converting literals; use static_cast<> for more complicated expressions.

Unlike working with bitfields, most of the behavior of bitset is completely portable. Thus, every implementation gives the same results when running the program in Listing 64-14. The following output displays those results:

bitset: 00000000000000001011011101111011111011111101111111
first 1 bit: 33
count of 1 bits: 28
new first 1 bit: 31
bitset: 11111111111111111100100010000100000100000010000000
first 1 bit: 49

Write a function template, find_pair, that takes two arguments: a bitset to search, and a bool value to compare. The function searches for the first pair of adjacent bits that are equal to the second argument and returns the index of the most significant bit of the pair. What should the function return if it cannot find a matching pair of bits? Write a simple test program too.

Compare your solution with mine, which is presented in Listing 64-15.

Listing 64-15. The find_pair Function and Test Program

#include <bitset>
#include <cassert>
#include <cstdlib>
#include <iostream>
 
template<std::size_t N>
std::size_t find_pair(std::bitset<N> const& bitset, bool value)
{
   if (bitset.size() >= 2)
      for (std::size_t i{bitset.size()}; i-- != 1; )
         if (bitset[i] == value and bitset[i-1] == value)
            return i;
   return std::size_t(-1);
}
 
int main()
{
   std::size_t const not_found{~0u};
   std::bitset<0> bs0{};
   std::bitset<1> bs1{};
   std::bitset<2> bs2{};
   std::bitset<3> bs3{};
   std::bitset<100> bs100{};
 
   assert(find_pair(bs0, false) == not_found);
   assert(find_pair(bs0, true) == not_found);
   assert(find_pair(bs1, false) == not_found);
   assert(find_pair(bs1, true) == not_found);
   assert(find_pair(bs2, false) == 1);
   assert(find_pair(bs2, true) == not_found);
   bs2[0] = true;
   assert(find_pair(bs2, false) == not_found);
   assert(find_pair(bs2, true) == not_found);
   bs2.flip();
   assert(find_pair(bs2, false) == not_found);
   assert(find_pair(bs2, true) == not_found);
   bs2[0] = true;
   assert(find_pair(bs2, false) == not_found);
   assert(find_pair(bs2, true) == 1);
   assert(find_pair(bs3, false) == 2);
   assert(find_pair(bs3, true) == not_found);
   bs3[2].flip();
   assert(find_pair(bs3, false) == 1);
   assert(find_pair(bs3, true) == not_found);
   bs3[1].flip();
   assert(find_pair(bs3, false) == not_found);
   assert(find_pair(bs3, true) == 2);
   assert(find_pair(bs100, false) == 99);
   assert(find_pair(bs100, true) == not_found);
   bs100[50] = true;
   assert(find_pair(bs100, true) == not_found);
   bs100[51] = true;
   assert(find_pair(bs100, true) == 51);
}

Although bitset is not widely used, when you need it, it can be extremely helpful. The next Exploration covers a language feature that is much more widely used than bitset: enumerations.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Exploration 64: Working with Bits

Create new playlist

Sign In

Sign Up

Table of Contents for
Exploration 64: Working with Bits