Benchmarking tool

I’ve written before about when and how to optimise code. I think there are often more important aspects to a project than code performance. Servers, desktops, laptops and even new mobile phones are many times more powerful than an old computer. Instead you need to decide whether the performance is good enough. That could be how long a batch process takes to run, is a user interface responsive enough, or are the cloud compute costs too high.

If you have decided that a specific piece of code needs to be optimised then Quick C++ Benchmark might help. It’s an online tool that lets you write C++, run benchmark tests and get a lovely graph at the end. There are various options so you can select a compiler, language version and what optimisation level to use.

Handy example

I was experimenting with something recently and needed a basic string join function. This is not something that needed optimising, it was just a handy example. Something like this:

std::string Join(const std::vector<std::string>& strings, const std::string& separator) {
  std::string result;
  for (size_t i = 0; i < strings.size(); i++) {
    if (i != 0) {
      result += separator;
    }
    result += strings[i];
  }
  return result;
}

Which you benchmark online. I thought the most obvious thing that might be slowing things down was re-allocation of the string. If I pre-calculated the required size I can avoid that:

size_t JoinSize(const std::vector<std::string>& strings, const std::string& separator) {
  size_t size = separator.size() * (strings.size() - 1);
  for (size_t i = 0; i < strings.size(); i++) {
    size += strings[i].size();
  }
  return size;
}

std::string Join(const std::vector<std::string>& strings, const std::string& separator) {
  std::string result;
  result.reserve(JoinSize(strings, separator));
  for (size_t i = 0; i < strings.size(); i++) {
    if (i != 0) {
      result += separator;
    }
    result += strings[i];
  }
  return result;
}

Hmm, that’s only fractionally faster. This might seem oddly similar but what about:

size_t JoinSize(const std::vector<std::string>& strings, const std::string& separator) {
  size_t size = separator.size() * (strings.size() - 1);
  for (const auto& string : strings) {
    size += string.size();
  }
  return size;
}

It’s a tiny change but it is still faster. Maybe try using something better suited to building a string. There isn’t a StringBuffer but what about this:

std::string Join(const std::vector<std::string>& strings, const std::string& separator) {
  std::stringstream stream;
  for (size_t i = 0; i < strings.size(); i++) {
    if (i != 0) {
      stream << separator;
    }
    stream << strings[i];
  }
  return stream.str();
}

That’s terrible, just terrible. Forget that and go back. Can we use the looping trick again? Not as easily because we still have to check whether to add the separator:

std::string Join(const std::vector<std::string>& strings, const std::string& separator) {
  std::string result;
  result.reserve(JoinSize(strings, separator));
  for (auto i = strings.begin(); i != strings.end(); i++) {
    if (i == strings.begin()) {
      result += separator;
    }
    result += *i;
  }
  return result;
}

Again slightly faster, overall we’ve gained 20% compared to the initial implementation.

Lessons

What can we learn from all this:

  • Simple changes can bring small performance boosts. However you might have to experiment to find out what’s best.
  • This probably wasn’t a good target to choose. Unless it’s in a tight loop it probably won’t make much of a difference. However a library function is a good place to get lots of little boosts.
  • Exactly what loops you use matters. Fortunately the ranged-based for loop is easy to read, understand and is fast as well.
  • Overall this code is more complex. Unless it’s a problem I’d just let the string grow automatically.
  • Looking back why did my loop in JoinSize not start out as a ranged-based for loop? It’s because I used copy-paste, which is problematic.
  • If I was persuing this further I’d want to know why some loops were slower and some were faster.

You could do all of this locally but being able to show your findings can be handy.
If you want to persuade someone to change their loops, do it here and show them it is faster.


Posted

in

by

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *