Iterative development

It can be useful to develop in a series of small steps. At each point trying to write the code that you need but not too much more. Building a huge library that you think you’ll need might be a waste of time but building something small that lets you simplify you current code is worthwhile.

I’ve often do this at work but that code isn’t available to me here. So a direct demonstration isn’t possible. Instead I’ve have done some prototyping and I’ll use that to tell the story. I’m going to use the example of a bit manipulation library because I’ve done one before. Lets imagine we’re working on some sort of packet inspector on a network. In particular I’m going to reference the TCP segment structure in the code.

Direct manipulation

The first way to access bit flags is directly with basic bit manipulation and some masks.

bool IsECE(const std::array<std::byte, 56>& segmentHeader) {
    const auto flags = segmentHeader[13];
    return ((flags >> 1) & 1u) && ((flags >> 6) & 1u);
}

This works but for me this is just not enough. If someone comes across this later it will mean nothing. My minimum would be to have a comment or some helpfully named types and constants, perhaps both.

// https://en.wikipedia.org/wiki/Transmission_Control_Protocol

struct TcpSegmentHeader {
...
    uint8_t DataOffsetAndReserved;
    uint8_t Flags;
...
};

const uint8_t gc_DataOffsetBitIndex(4);
const uint8_t gc_DataOffsetBitSize(4);
...
const uint8_t gc_ExplicitCongestionNotificationBitIndex(6);
...
const uint8_t gc_SynchronizeSequenceNumbersBitIndex(1);
...

I could have used the abbreviated form of the flag names but gc_CongenstionWindowReducedBitIndex appeals to me more than gc_SwrBitIndex. That lets me re-write the function.

bool IsExplicitCongestionNotification(const TcpSegmentHeader& segmentHeader) {
    const auto flags = segmentHeader.Flags;
    return ((flags >> gc_SynchronizeSequenceNumbersBitIndex) & 1u) &&
           ((flags >> gc_ExplicitCongestionNotificationBitIndex) & 1u);
}

Now someone looking at the code has a hope of understanding it. Next you might be asked to start updating the flags.

void ClearExplicitCongestionNotificationFlag(TcpSegmentHeader& segmentHeader) {
    const auto& flags = segmentHeader.Flags;
    flags = flags & ~(1u << gc_ExplicitCongestionNotificationBitIndex);
}

This isn’t hard but making a mistake becomes more likely. Did you mask the right thing? Was that the right operator to use? If you do it enough times one of them will be wrong. To me it is also less obvious what the code is doing. If I was skimming the code I could tell it’s doing bit manipulation but what sort? I mean, it says it’s “clearing the explicit congestion notification flag” but is that what it’s actually doing?

Bit manipulation library

So when you have a few of these it’s useful to make a functions to help. For the real thing I’d use template functions based on a Value type but I’ll skip that here for simplicity.

namespace Bits
{
    Value MaskUpTo(uint8_t index);
    Value Mask(uint8_t index, uint8_t size = 1);
    
    bool IsSet(Value value, uint8_t index);
    bool IsClear(Value value, uint8_t index);
    
    bool IsAllSet(Value value, uint8_t index, uint8_t size = 1);
    bool IsAnySet(Value value, uint8_t index, uint8_t size = 1);
    bool IsNoneSet(Value value, uint8_t index, uint8_t size = 1);
    
    Value Clear(Value value, uint8_t index, uint8_t size = 1);
    Value Set(Value value, uint8_t index, uint8_t size = 1);
    Value Flip(Value value, uint8_t index, uint8_t size = 1);
    
    Value GetAt(Value value, uint8_t index, uint8_t size);
    Value SetAt(Value value, uint8_t index, uint8_t size, Value subValue);
} // namespace Bits

And I’ll take advantage of my new library and add some more functions.

uint8_t GetDataOffset(const TcpSegmentHeader& segmentHeader) {
    return Bits::GetAt(segmentHeader.DataOffsetAndReserved,
        gc_DataOffsetBitIndex, gc_DataOffsetBitSize);
}

void SetDataOffset(TcpSegmentHeader& segmentHeader, uint8_t size) {
    return Bits::SetAt(segmentHeader.DataOffsetAndReserved,
        gc_DataOffsetBitIndex, gc_DataOffsetBitSize, size);
}

bool IsExplicitCongestionNotification(const TcpSegmentHeader& segmentHeader) {
    const auto flags = segmentHeader.Flags;
    return Bits::IsSet(flags, gc_SynchronizeSequenceNumbersBitIndex) &&
           Bits::IsSet(flags, gc_ExplicitCongestionNotificationBitIndex);
}

void ClearExplicitCongestionNotificationFlag(TcpSegmentHeader& segmentHeader) {
    const auto& flags = segmentHeader.Flags;
    flags = Bits::Clear(flags, gc_ExplicitCongestionNotificationBitIndex);
}

This is much better. The function names tell you what’s happening. It’s easier to write and easier to read. The type constraints from the functions make it a safer as well. Unit tests can make sure at least this part of your logic is right.

Bit manipulation classes

If you continue to expand your codebase you can make it even easier. The API is functional but the code it produces could be more concise.

namespace Bits
{
    template <typename Value>
    class Index
    {
    public:
        Index(uint8_t index);

        bool IsSet(Value value) const;
        bool IsClear(Value value) const;

        Value Set(Value value) const;
        Value Clear(Value value) const;
        Value Flip(Value value) const;
    ...
    };

    template <typename Value>
    class Range
    {
    public:
        Range(uint8_t index, uint8_t size);

        bool IsAllSet(Value value) const;
        bool IsAnySet(Value value) const;
        bool IsNoneSet(Value value) const;

        Value Set(Value value) const;
        Value Clear(Value value) const;
        Value Flip(Value value) const;

        Value GetAt(Value value) const;
        Value SetAt(Value value, Value subValue) const;
    ...
} // namespace Bits

Which requires some new constants.

const Bits::Range<uint32_t> gc_DataOffsetBitRange(4, 4);
...
const Bits::Index<uint32_t> gc_ExplicitCongestionNotificationBitIndex(6);
...
const Bits::Index<uint32_t> gc_SynchronizeSequenceNumbersBitIndex(1);
...

And I can simplify the functions some more.

uint8_t GetDataOffset(const TcpSegmentHeader& segmentHeader) {
    return gc_DataOffsetBitRange.GetAt(segmentHeader.DataOffsetAndReserved);
}

void SetDataOffset(TcpSegmentHeader& segmentHeader, uint8_t size) {
    return gc_DataOffsetBitRange.SetAt(segmentHeader.DataOffsetAndReserved, size);
}

bool IsExplicitCongestionNotification(const TcpSegmentHeader& segmentHeader) {
    const auto flags = segmentHeader.Flags;
    return gc_SynchronizeSequenceNumbersBitIndex.IsSet(flags) &&
           gc_ExplicitCongestionNotificationBitIndex.IsSet(flags);
}

void ClearExplicitCongestionNotificationFlag(TcpSegmentHeader& segmentHeader) {
    const auto& flags = segmentHeader.Flags;
    flags = gc_ExplicitCongestionNotificationBitIndex.Clear(flags);
}

Library performance

While there are benefits to this code: easy to write, easy to read, easy to test. You might have concerns about performance. Direct bit manipulation is about fast as you can get. There are assemble equivalents for most of this. Is this library going to use them?

Performance is often less of an issue than people think but, if it is, you can check. Modern optimisers are pretty good but if the compiler isn’t using the best assembly then you can do something about it. Everything that uses these functions will gain the benefit. I’ve done some quick benchmarking for this and my library looks neck and neck with direct manipulation.

Skipping to the end

I’ve suggested building up this library in a series of small steps. However if you know you’ll be using something a lot it might be better to go for a good interface with a bad implementation. I could have gone straight for the Bits::Index and Bits::Range classes but implementing it with std::bitset. It would probably have involved a lot of conversion between types and looping through arrays of bits. That means it would have been slow but it would have still worked. I could have let me get on to the urgent packet inspection work. If you can get a good interface the implementation can always be fixed later on.

In the end

It matters less whether you take it in small steps or skip to the end. That comes down to how much you can predict about where the code is going. It’s about noticing patterns that occur in your codebase and then simplifying those patterns. I’ve posted about this before when thinking about DRY. There is a bit of extra work to notice and formalise the pattern but future work is then easier in many ways.


Posted

in

by

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *