API lock-in

When C# first came out it didn’t have generics; they didn’t arrive until version 2.0. That meant that, say, the user interface APIs either had to deal with collections of unknown objects or use custom-written collections. If they did the first, you don’t know what type of object you have; if they did the second, you can’t process the collections with standard tools. I’d like to be more specific but Microsoft hasn’t kept all the documentation available that far back. It’s probably somewhere on the Wayback Machine but I couldn’t chase it down. Then generics did come out, but the existing APIs were already written and we were locked in to the old way of doing things.

Old-fashioned find

I started on this track thinking about the interface of std::find. If you want to search a collection in C++ you might well do something like this:

    const auto i = std::ranges::find(numbers, 7);
    if (i != numbers.end()) {
        std::cout << "Your lucky number was at position " <<
            std::distance(numbers.begin(), i) << std::endl;
    }

This works but I feel the i != numbers.end() is clunky. Another language might do it differently:

    int index = numbers.IndexOf(7);
    if (index != -1) {
        Console.WriteLine("Your lucky number was at position {0}", index);
    }

That’s less clunky but you’re still left comparing against a magic number. It’s what we’ve always done. However, it leaves room for writing this:

    int index = numbers.IndexOf(7);
    Console.WriteLine("Your lucky number was at position {0}", index);

Which might work or it might not. If the developer has thought everything through then the list always contains the lucky number. Maybe the list did always contain the lucky number until something else in the codebase changed. The worst this will do is print out -1, but if you’re using the index to access data then you’re out of bounds.

In any case I thought the interface to std::find looked a bit old fashioned. What might it look like if they were writing it today:

    template <typename Range, typename T>
    std::optional<typename Range::const_iterator> find(const Range& range, const T& value);

So rather than returning an iterator directly it returns an optional iterator (a const_iterator here, because the range is taken by const reference). That means you can know directly whether it was successful or not:

    const auto i = std::ranges::find(numbers, 7);
    if (i.has_value()) {
        const auto index = std::distance(numbers.cbegin(), i.value());
        std::cout << "Your lucky number was at position " << index << std::endl;
    }

I think that’s less clunky than the original. We’re not doing a mysterious comparison; we’re asking a specific question about our result. That example is overly wordy, though. Today I’d probably write it like this:

    if (const auto i = std::ranges::find(numbers, 7)) {
        std::cout << "Your lucky number was at position " <<
            std::distance(numbers.cbegin(), i.value()) << std::endl;
    }

The optional type is automatically converted to a boolean when evaluating the if-statement. Of course there is still room to be careless:

    const auto i = std::ranges::find(numbers, 7);
    std::cout << "Your lucky number was at position " <<
        std::distance(numbers.cbegin(), i.value()) << std::endl;

But i isn’t a dumb iterator here, it’s an optional iterator. It “knows” whether you found something or not. It’s never going to print out -1; calling value() on an empty optional throws std::bad_optional_access instead. You can still shoot yourself in the foot using *i, which doesn’t perform any checks.
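To make the idea concrete, here is a minimal sketch of how such a find might be written on top of the existing std::find. The optionally namespace and the describe_position helper are names made up for illustration; nothing like them exists in the standard library.

```cpp
#include <algorithm>
#include <iterator>
#include <optional>
#include <string>
#include <vector>

// Hypothetical namespace, not part of the standard library.
namespace optionally {

// Returns an iterator to the first element equal to value, or std::nullopt
// if there is none. It wraps std::find rather than reimplementing it.
template <typename Range, typename T>
std::optional<typename Range::const_iterator> find(const Range& range, const T& value) {
    const auto i = std::find(range.begin(), range.end(), value);
    if (i == range.end()) {
        return std::nullopt;
    }
    return i;
}

}  // namespace optionally

// Hypothetical helper: describes where value sits in numbers.
std::string describe_position(const std::vector<int>& numbers, int value) {
    if (const auto i = optionally::find(numbers, value)) {
        return "found at position " +
               std::to_string(std::distance(numbers.begin(), *i));
    }
    return "not found";
}
```

Inside the if-statement the optional has already been checked, so *i is safe there. Outside such a check, value() is the accessor that throws on an empty result.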

Are there any downsides? I don’t know how std::optional is implemented but there’s probably a slight space overhead. It probably stores an extra boolean to determine whether the value is present or not. We’re probably going to use the iterator immediately so that’s not too much cost. Having an extra check with every value() access might matter more. I tried to do some performance tests but the old and new style seem neck and neck. It could be that as long as the check is done once the optimiser can get rid of any repetitions. Looking at the assembly on Compiler Explorer that seems to be right.
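One way to check the space overhead on a given compiler is to compare sizes directly. This is only a sketch: the Iterator alias and the two constant names are my own, and the actual numbers depend on the standard library implementation.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Illustrative alias for the iterator type being wrapped.
using Iterator = std::vector<int>::const_iterator;

// Bytes occupied by the plain iterator and by the optional wrapper around it.
constexpr std::size_t plain_size = sizeof(Iterator);
constexpr std::size_t optional_size = sizeof(std::optional<Iterator>);

// The optional stores the iterator inside itself, so it cannot be smaller.
static_assert(optional_size >= plain_size);
```

Printing plain_size and optional_size shows the real figures for your platform. I would expect any difference to be the presence flag plus alignment padding, but that is an assumption about the implementation.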

This new style seems to have potential. I’ve added it to my own library of code so I can see how it works out long term.

Locked in

This post isn’t really about a new-style find. It’s about the more general problem of fixed APIs. Maybe the standards group for C++ had the same idea last year, but there are already millions of lines of code using the old-style std::find; they can’t just change it.

There are more traditional ways of extending APIs:

Adding new functions – No problems there.
Adding extended versions of existing functions – This brings to mind some Windows APIs which are scattered with Ex versions of functions. It often means the function has too many parameters for easy reading as well.
Overloading existing functions – If your language allows this it’s probably the easiest way to extend an existing function. In C++ you cannot overload functions purely on return type but you can change the return type if the input arguments are different.
Extending a structure which is an argument to the function – As long as the structure has default values this could be the cleanest way. Existing users are unaffected and new users can supply extra options for new behaviour.
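
The last option in that list can be sketched as follows. FindOptions and find_index are hypothetical names invented for this example, and the from_back member stands in for an option added in a later release.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical options structure. Every member has a default, so callers
// written before a member existed keep compiling unchanged.
struct FindOptions {
    std::size_t start = 0;   // original option: index to start searching from
    bool from_back = false;  // added later: search from the end instead
};

// Hypothetical search function returning the index of value, if present.
std::optional<std::size_t> find_index(const std::vector<int>& numbers, int value,
                                      const FindOptions& options = {}) {
    if (options.from_back) {
        // Walk from the last element down to options.start.
        for (std::size_t i = numbers.size(); i > options.start; --i) {
            if (numbers[i - 1] == value) return i - 1;
        }
        return std::nullopt;
    }
    // Walk from options.start up to the last element.
    for (std::size_t i = options.start; i < numbers.size(); ++i) {
        if (numbers[i] == value) return i;
    }
    return std::nullopt;
}
```

A call such as find_index(numbers, 7) written against the first release keeps its meaning, while newer code can fill in a FindOptions and set from_back to get the new behaviour.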

They’ve dealt with a similar situation recently. The original std::find takes in a start and end point. That’s flexible but most of the time I want to iterate over an entire collection. You end up endlessly repeating yourself with values.begin() and values.end(). They have finally introduced std::ranges::find which means you can just use values. Thank goodness.

So, I suppose we could use a namespace to distinguish things, std::optionally::find? But let’s say I think that std::optionally::find or std::ranges::find are superior to the original. I’m stuck adding optionally and ranges all over my code. Maybe the standard library could have been designed with more change in mind.

What if, instead of std, they had used release-based namespaces? So there would be a std11 namespace, std14, std17 and so on. At the same time a namespace alias, say namespace std = std17;, simplifies things most of the time. If our new-style find comes in, it can be std26::find while the previous version, std23::find, is still available. If you want to use just one version of find you can. If you want or need to use both versions because of an existing codebase, the conversion is just a search & replace.
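
Here is a rough sketch of that scheme using an ordinary library in place of std, since user code can’t add namespaces like std26 to the real standard library. The names lib23, lib26, lib and contains are all hypothetical.

```cpp
#include <algorithm>
#include <optional>
#include <vector>

// Hypothetical release-based namespaces; "lib" stands in for "std" here.
namespace lib23 {
// Old style: returns an iterator, equal to end() when nothing is found.
inline std::vector<int>::const_iterator find(const std::vector<int>& v, int value) {
    return std::find(v.begin(), v.end(), value);
}
}  // namespace lib23

namespace lib26 {
// New style: returns an empty optional when nothing is found.
inline std::optional<std::vector<int>::const_iterator> find(const std::vector<int>& v,
                                                            int value) {
    const auto i = std::find(v.begin(), v.end(), value);
    if (i == v.end()) return std::nullopt;
    return i;
}
}  // namespace lib26

// A namespace alias picks the release that plain "lib" refers to.
namespace lib = lib26;

// Uses whichever release the alias selects; older code can still name
// lib23::find explicitly.
inline bool contains(const std::vector<int>& v, int value) {
    return lib::find(v, value).has_value();
}
```

Both versions coexist in the same translation unit, so a codebase could migrate call by call rather than all at once.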

In the end

I don’t think std26::find is a perfect solution but change is going to happen and having some means of adapting things is a good idea. If you have a file format then you, hopefully, use a version number. Same for a web API. Most people anticipate that an API might be extended but realising that it might be good to change seems much rarer.
