I was recently talking about clang-tidy. One of the performance checks it can run is performance-unnecessary-value-param. This makes sense: accidentally passing a big array by value rather than by reference can invisibly ruin the code's performance. The rule used to be pretty easy: pass simple types by value and everything else by reference. Then along came std::string_view, which the recommendation says to pass by value.

Function parameters

There are a few different ways of passing data to a function, actually more than I normally think about. C++ might be one of the more complex common languages here.

  • Value parameters are copied in.
    • uint32_t value: Perhaps the most common sort of parameter, fast if the value is simple.
    • const uint32_t value: More often than not the parameter is not going to change, whether it is marked in that way or not.
  • Reference parameters are just pointers under the hood but can be accessed as if they were values.
    • std::string& value: This is great if you have a value you want to change, but some people don’t like this because it may not be obvious that the function can change things.
    • const std::string& value: Perhaps the next most common sort of parameter, fast if the value is complex but slightly less good for simple values.
  • Array parameters are just pointers but for a specific use case.
    • uint32_t values[]: An array where you can change the content.
    • const uint32_t values[]: An array you can’t change.
  • Pointer parameters are just values which happen to be memory addresses but that gives them extra complexity. I tend to avoid these because they can be null which requires extra error processing.
    • char* value: An address which you can both change directly and change what it points to.
    • const char* value (also written char const* value): An address which you can change but not change what it points to.
    • char* const value: An address which you can’t change but can change what it points to.
    • const char* const value: An address which you can’t change, pointing to something you can’t change either.
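The placement of const around the `*` is easy to get wrong, so here is a minimal sketch of the rule: const to the left of `*` applies to what is pointed to, const to the right applies to the pointer itself. The function name PointerKinds is made up for illustration.

```cpp
#include <type_traits>

// const to the left of '*' applies to the pointee, on either side of the type name.
static_assert(std::is_same_v<const char*, char const*>, "two spellings, one type");

// const to the right of '*' applies to the pointer itself.
static_assert(!std::is_same_v<const char*, char* const>, "different types");

// Hypothetical function taking all four combinations.
inline void PointerKinds(char* a, const char* b, char* const c, const char* const d) {
    *a = 'x';     // allowed: pointee is writable
    a = nullptr;  // allowed: pointer is writable
    b = nullptr;  // allowed: pointer is writable, but '*b = ...' is an error
    *c = 'y';     // allowed: pointee is writable, but 'c = ...' is an error
    (void)a;      // silence unused-value warnings in this sketch
    (void)b;
    (void)d;      // neither '*d = ...' nor 'd = ...' compiles
}
```

A const applied to the pointer itself only matters inside the function body. It is not part of the function's type, so callers see no difference.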

That seems like a lot. It’s okay if you’ve got an easy rule to follow but I don’t want to have to learn a lot of special cases. Wouldn’t it be better if we could just leave it to the compiler or library developers? I want to be able to have a simple interface. All I should have to care about is… whether I want to send something in to the function or get something out. I wish I could just write a function like this:

void Test(In<Type> input, Out<Type> output);

And then call it like this:

    Test(input, output);

I want it to just work:

  • If Type is simple:
    • In<Type> is passed by value, let’s say as a const.
    • Out<Type> is passed by reference.
  • If Type is complex:
    • In<Type> is passed by const reference.
    • Out<Type> is passed by reference.
  • If Type is a special case then the library developer specifies the behaviour.

Meta-programming

Simple usage of generics or templates gives you the ability to use a class or function with different types. However more powerful systems also give you the ability to manipulate and make choices based on those types. I think I can make my wish come true. Unfortunately in C++ meta-programming has a separate set of library functionality to learn and I get far less practice at this.

My first thought is a templated alias declaration. Something like this:

template <typename Type, std::enable_if_t<IsSimpleParameter<Type>, bool> = true>
using In = const Type;

template <typename Type, std::enable_if_t<IsComplexParameter<Type>, bool> = true>
using In = const Type&;

Here std::enable_if_t is meant to enable or disable each definition based on the type: one should be chosen for simple types and the other for complex types. However the compiler complains that In already exists in the current scope. I also tried to specialise this for std::string_view:

template <>
using In<std::string_view> = const std::string_view;

But apparently you can’t specialise an alias template in C++ yet.
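A common way around this limitation, sketched here as an assumption rather than something I tried at the time, is to let the alias forward to a class template, because class templates can be specialised. The names InSelector and InVia are hypothetical.

```cpp
#include <string>
#include <string_view>
#include <type_traits>

// Primary template: by default pass by const reference.
template <typename Type>
struct InSelector {
    using type = const Type&;
};

// Class templates can be specialised, so special cases go here.
template <>
struct InSelector<std::string_view> {
    using type = const std::string_view;
};

// The alias itself is never specialised; it just forwards to the struct.
template <typename Type>
using InVia = typename InSelector<Type>::type;

static_assert(std::is_same_v<InVia<std::string>, const std::string&>);
static_assert(std::is_same_v<InVia<std::string_view>, const std::string_view>);
```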

Next I looked at something more complicated:

template <typename Type, std::enable_if_t<IsSimpleParameter<Type>, bool> = true>
struct In {
public:
    In(const Type value) :
        m_Value(value) {}

    operator Type() {
        return m_Value;
    }

private:
    const Type m_Value;
};

With something similar for complex types that used const Type&. As I had to double up on classes it was a bit repetitive. It also didn’t work because “template parameter '__formal' is incompatible with the declaration”. I wish the compiler was a bit clearer about template issues.

Then I tried std::conditional_t to choose between the two alternative types:
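The underlying problem is that class templates cannot be overloaded, so two primary templates with the same name cannot coexist. A rough sketch of how the two-class idea could be made to compile is to declare the template once with an extra bool parameter and partially specialise on it. The names InParam and AddOne are hypothetical, and IsSimpleParameter is assumed to be the size test described later in the post.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

template <typename Type>
constexpr bool IsSimpleParameter = sizeof(Type) <= sizeof(std::size_t);

// Declared once; the defaulted bool picks which partial specialisation is used.
template <typename Type, bool Simple = IsSimpleParameter<Type>>
struct InParam;

// Simple types: hold a copy.
template <typename Type>
struct InParam<Type, true> {
    using Stored = const Type;
    constexpr InParam(const Type value) : m_Value(value) {}
    constexpr operator Type() const { return m_Value; }
private:
    const Type m_Value;
};

// Complex types: hold a const reference.
template <typename Type>
struct InParam<Type, false> {
    using Stored = const Type&;
    constexpr InParam(const Type& value) : m_Value(value) {}
    constexpr operator const Type&() const { return m_Value; }
private:
    const Type& m_Value;
};

// Hypothetical function using the wrapper as a parameter.
constexpr int AddOne(InParam<int> value) {
    const int plain = value;  // uses the conversion operator
    return plain + 1;
}

static_assert(std::is_same_v<InParam<int>::Stored, const int>);
static_assert(std::is_same_v<InParam<std::string>::Stored, const std::string&>);
static_assert(AddOne(41) == 42);
```

This is still repetitive, which is part of why a single class built on std::conditional_t is more attractive.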

template <typename Type>
struct In {
public:
    using ParameterType = std::conditional_t<IsSimpleParameter<Type>, const Type, const Type&>;

    In(ParameterType value) :
        m_Simple(value) {}

    operator ParameterType() {
        return m_Simple;
    }

private:
    ParameterType m_Simple;
};

Notionally a simple type will be copied into the class and copied back out when the parameter is used in a function; for a complex type it’s just a reference that is passed back and forth. This actually works. It automatically picks the right way to pass simple or complex parameters.

Now that I’ve got this working I realise I can simplify things:

template <typename Type>
using In = std::conditional_t<IsSimpleParameter<Type>, const Type, const Type&>;

If I use In<uint32_t> then it knows this is const uint32_t and if I use In<std::string> then it knows this is const std::string&.
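To check that claim at compile time, here is a small self-contained sketch. It assumes the size-based IsSimpleParameter described further down, and RepeatedLength is a made-up function purely to show In used in a signature. The std::string assertion relies on std::string being larger than size_t, which holds on the mainstream standard libraries I know of.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

template <typename Type>
constexpr bool IsSimpleParameter = sizeof(Type) <= sizeof(std::size_t);

template <typename Type>
using In = std::conditional_t<IsSimpleParameter<Type>, const Type, const Type&>;

// The alias resolves to a different parameter type depending on the size of Type.
static_assert(std::is_same_v<In<std::uint32_t>, const std::uint32_t>);
static_assert(std::is_same_v<In<std::string>, const std::string&>);

// Hypothetical function: count arrives by value, text by const reference.
inline std::size_t RepeatedLength(In<std::uint32_t> count, In<std::string> text) {
    return count * text.size();
}
```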

I could do it more simply for Out:

template <typename Type>
using Out = Type&;

You could get away without this: it always passes by non-const reference, so it’s not complicated. Still, there is a nice symmetry to having In and Out parameters.

However maybe it’s better to use a class for this one:

template <typename Type>
struct Out {
public:
    explicit Out(Type& value) :
        m_Value(value) {}

    operator Type&() {
        return m_Value;
    }

    Type& operator=(const Type& value) {
        m_Value = value;
        return m_Value;
    }

private:
    Type& m_Value;
};

By using the explicit keyword we stop the compiler from using the constructor automatically. This means you must mark any out parameter when you call a function:

    Test(input, Out(output));

That would help people who think a simple reference doesn’t indicate the potential change of the parameter.
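As a rough end-to-end sketch of that call style: the class below is the Out class from above with constexpr added so the result can be checked at compile time. DoubleInto and CallDoubleInto are hypothetical names, and writing Out(result) without a template argument assumes C++17 class template argument deduction.

```cpp
template <typename Type>
struct Out {
public:
    constexpr explicit Out(Type& value) : m_Value(value) {}

    constexpr operator Type&() { return m_Value; }

    constexpr Type& operator=(const Type& value) {
        m_Value = value;
        return m_Value;
    }

private:
    Type& m_Value;
};

// Hypothetical function that writes its result through an Out parameter.
constexpr void DoubleInto(const int input, Out<int> output) {
    output = input * 2;
}

constexpr int CallDoubleInto(int input) {
    int result = 0;
    DoubleInto(input, Out(result));  // the caller has to mark the out parameter
    // DoubleInto(input, result);    // would not compile: the constructor is explicit
    return result;
}

static_assert(CallDoubleInto(21) == 42);
```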

I haven’t talked about IsSimpleParameter yet. This is just a variable template:

template <class Type>
constexpr bool IsSimpleParameter = sizeof(Type) <= sizeof(size_t);

This means that any type no larger than a size_t (which is pointer-sized on most platforms) is considered “simple” and will be passed by value as an In parameter. Anything larger than that is “complex” and will be passed by reference as an In parameter. However this can be specialised for any type:

template<>
constexpr bool IsSimpleParameter<std::string_view> = true;

So std::string_view, despite taking up more space, is still passed by value as an In parameter.
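Putting the pieces together, the following sketch checks the effect of the specialisation. Point3 and Matrix4 are hypothetical types added for illustration: one opts in to being passed by value, the other is left to the default size rule.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>

template <class Type>
constexpr bool IsSimpleParameter = sizeof(Type) <= sizeof(std::size_t);

// Special case: larger than a pointer, but cheap to copy.
template <>
constexpr bool IsSimpleParameter<std::string_view> = true;

// Hypothetical user type that opts in to pass-by-value in the same way.
struct Point3 { float x, y, z; };
template <>
constexpr bool IsSimpleParameter<Point3> = true;

// Hypothetical user type left to the default size rule.
struct Matrix4 { float m[16]; };

template <typename Type>
using In = std::conditional_t<IsSimpleParameter<Type>, const Type, const Type&>;

static_assert(std::is_same_v<In<std::string_view>, const std::string_view>);
static_assert(std::is_same_v<In<Point3>, const Point3>);
static_assert(std::is_same_v<In<Matrix4>, const Matrix4&>);
```

Each specialisation has to be visible before the first use of In with that type. If these definitions live in a header, marking them inline constexpr is probably the safer choice.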

Meta-programming downsides

While being able to do this is great, it’s not all roses. It’s extra complexity. Writing this is harder than normal coding, it has different tools and the compiler’s error messages just aren’t as helpful. If someone else is using it and it breaks, that’s going to create problems. It certainly took more time to do this than remembering that std::string_view needed some special treatment, at least for one library. If there were lots of special cases, or you’re willing to commit to having your entire system work like this, it will bring more value in the long run.

There could be a slight performance hit to doing something like this. It has the potential to require extra copying and conversions. However I expect the optimizer will be able to bypass all that in almost all cases. The class based approaches don’t actually change any of the data so the final assembly code may well be unchanged.

The biggest problem I had was while writing unit tests. I use this function to help determine which parameter type has been chosen:

template <typename Type>
static const Type* GetAddressOf(In<Type> value) {
    return &value;
}

I want to be able to write these tests:

    uint32_t integer{};
    EXPECT_NE(GetAddressOf(integer), &integer);

    std::string string{};
    EXPECT_EQ(GetAddressOf(string), &string);

The integer is passed by value so the address returned from the function won’t match. The string is passed by reference so the address returned from the function will match. However the compiler tells me “no matching overloaded function found”. It does know that GetAddressOf is a potential match but it can’t work out what Type should be. The reason is that Type only appears inside std::conditional_t, which is a non-deduced context: the compiler would have to try every possible Type to find one whose In<Type> matches the argument. This is frustrating. It is obvious to me what Type should be but not to the compiler. If you tell the compiler explicitly then it’s fine:

    uint32_t integer{};
    EXPECT_NE(GetAddressOf<uint32_t>(integer), &integer);

    std::string string{};
    EXPECT_EQ(GetAddressOf<std::string>(string), &string);

So In can cause problems for template functions that need to deduce Type from it, but it is fine for non-template functions.
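One possible workaround, sketched here as an assumption rather than what my tests actually do, is to give the function a second parameter that mentions Type directly. The compiler can then deduce Type from that parameter and apply it to In<Type>. The helper names PassedByReference and CheckParameterPassing are hypothetical. Comparing addresses inside the function also avoids returning the address of a by-value parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

template <typename Type>
constexpr bool IsSimpleParameter = sizeof(Type) <= sizeof(std::size_t);

template <typename Type>
using In = std::conditional_t<IsSimpleParameter<Type>, const Type, const Type&>;

// Hypothetical helper. Type cannot be deduced from In<Type>,
// but it can be deduced from the second parameter.
template <typename Type>
bool PassedByReference(In<Type> value, const Type& original) {
    return &value == &original;
}

// Mirrors the two unit tests without spelling out the template argument.
inline void CheckParameterPassing() {
    std::uint32_t integer{};
    assert(!PassedByReference(integer, integer));  // copied: addresses differ

    std::string string{};
    assert(PassedByReference(string, string));     // referenced: same address
}
```

The cost is an extra argument at every call site, so this only suits helpers like the test function.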

In the end

Once I’d worked out that something like this was possible it was an interesting, if slightly frustrating, programming challenge. It would be nice to have a language which treated types in the same way as other data so the same language constructs could be used for both. I have taken a quick look at Lisp but it’s very different from what I’m used to.

I’m not going to recommend you start using my In and Out templates; I don’t know if I’m going to use them myself. I did want to try getting a system which would “just work” with these new requirements for std::string_view.

It’s not uncommon for interfaces to come with special rules: initialise this first, always remember to destroy this afterwards, and so on. Having an interface that does “just work” means developers don’t have to remember extra rules and they can’t end up getting them wrong. If you’re designing interfaces try to make them simple and reliable so that no one needs to patch them up afterwards.

