I’ve used a few different error handling techniques since university. Over the last few years a couple of other options have appeared. Lets take a simple problem and see how it works with all of them.
A simple problem
Let’s say we want to read a text file containing a list of dates and return them. The file looks something like this:
Date list 5 2001-04-13 2005-03-23 2011-11-01 2022-03-30 2024-07-15
That is: a fixed header, the number of dates to expect, all of the dates line by line. This covers the basics that you might find in any file type: fixed items, values that control later parsing, values with detailed formats. This is probably a file produced by a program, it outputs the number of items to expect from the list. For a human produced file I would have an arbitrary sized list with a terminator, that’s easier for someone to write.
In these example programs I’m going to assume I have access to whatever functions I want. I’ll try to make them obvious and describe their behaviour if necessary.
No error checking
Ignoring the problem is the easiest way to go about things.
std::vector<Date> ReadDates(const std::string& filename) { auto* file = FileOpen(filename, "r"); std::ignore = FileGetLine(file); uint32_t count = ParseUint32(FileGetLine(file)); std::vector<Date> dates; for (uint32_t i = 0; i < count; i++) { dates.push_back(ParseDate(FileGetLine(file))); } FileClose(file); return dates; }
This is as simple as we can get. Each part of the header is read and assumed to be correct. If there are any problems it’s unclear what the result would be. The best you could hope for is an empty date vector, at worst a crash. You don’t really know.
I’ve deliberately used a old-fashioned approach to FileOpen
.
If you’re using C++
then you should use a std::filestream
which will close automatically in the destructor.
It will make this and every other example easier and shorter.
However some languages don’t have destructors.
They might use with-statements or defer keywords instead.
I went with the FileOpen
approach just to leave something annoying that had to be cleared up at the end.
That’s often the way with error handling.
Single return
An easy way of dealing with errors is to have your functions return an error code or simple a success / fail boolean. One way to do is to focus on the successes first and leave the failures until the end.
bool ReadDates(const std::string& filename, std::vector<Date>& dates) { bool success = true; auto* file = FileOpen(filename, "r"); if (file != nullptr) { std::string headerLine; if (FileGetLine(file, headerLine) && headerLine == "Date list") { std::string countLine; uint32_t count; if (FileGetLine(file, countLine) && ParseUint32(countLine, count)) { for (uint32_t i = 0; i < count; i++) { std::string dateLine; Date date; if (FileGetLine(file, dateLine) && ParseDate(dateLine, date)) { dates.push_back(date); } else { LogInfo("Failed to load date line {}", i); success = false; } } } else { LogInfo("Failed to load header"); success = false; } } else { LogInfo("Failed to load date count"); success = false; } FileClose(file); } else { LogInfo("Failed to open file {}", filename); success = false; } return success; }
I really don’t like this. The successful control flow keeps getting indented more and more. This is a really small example and it’s already further in than I’d like. It’s hard to match success / fail pairs so you could end up with the wrong message. You have to try and fit the whole thing in your head at once. I think it makes the code hard to understand.
Logging error messages is both good and bad. If you expect the function to succeed then in will help with debugging. If you expect the function might fail, for example if you’re trying to parse random files, then it’s going to produce a lot of noise. I’m normally going to try and include it because sometimes that’s what you want. This isn’t meant to be about exploring the easiest case.
Return early
We can use the same sort of functions but change the order.
bool ReadDates(const std::string& filename, std::vector<Date>& dates) { auto* file = FileOpen(filename, "r"); if (file == nullptr) { LogInfo("Failed to open file {}", filename); return false; } std::string headerLine; if (!FileGetLine(file, headerLine) || headerLine != "Date list") { FileClose(file); LogInfo("Failed to load header"); return false; } std::string countLine; uint32_t count; if (!FileGetLine(file, countLine) || !ParseUint32(countLine, count)) { FileClose(file); LogInfo("Failed to load date count"); return false; } for (uint32_t i = 0; i < count; i++) { std::string dateLine; Date date; if (!FileGetLine(file, dateLine) || !ParseDate(dateLine, date)) { FileClose(file); LogInfo("Failed to load date line {}", i); return false; } dates.push_back(date); } FileClose(file); return true; }
To me this is much easier to read.
At the moment this is my standard coding style.
It does something and, if it fails, tidies up and returns immediately.
Once you’ve dealt with the error case you can assume it’s just going to work afterwards.
There is less to hold in your head at any one time.
You do have to watch out for the return
keyword.
However, the format normally makes them easy to spot.
Having to repeat FileClose
in several places is annoying.
I can easily imagine situations where memory allocations early on would also have to be deleted
in each failure case.
Again, much easier to do with std::stream
and std::unique_ptr
.
Try-catch
Exception handling use to be the big new thing. I hear less about it nowadays and don’t use it often.
std::vector<Date> ReadDates(const std::string& filename) { FILE* file = nullptr; try { file = FileOpen(filename, "r"); std::ignore = FileGetLine(file); uint32_t count = ParseUint32(FileGetLine(file)); std::vector<Date> dates; for (uint32_t i = 0; i < count; i++) { dates.push_back(ParseDate(FileGetLine(file))); } FileClose(file); return dates; } catch (const std::runtime_error& exception) { if (file != nullptr) { FileClose(file); } throw; } }
It’s certainly much clearer than the previous two examples,
almost as simple just assuming it’s going to work.
If we assume that the file processing and parsing functions can throw useful std::runtime_error
then it will also provide debug information.
However, requiring FileClose
has produced some awkwardness.
It would be a bit easier if a finally
keyword was available.
I think it’s apparent clearness is one of the reasons we hear less about it now. It’s easy to program without thinking too much about it. Almost every one of those lines could throw an exception. That’s a lot of invisible control flow. I know that there are edge cases in C++ which it can end up leaking memory if you’re not careful. Other languages might have garbage collection but can have other resource problems.
Short-circuit logic
I have often used a variant with function success / fail that relies on the short-circuit behaviour of &&
.
bool ReadDates(const std::string& filename, std::vector<Date>& dates) { bool success = true; auto* file = FileOpen(filename, "r"); success = success && (file != nullptr); std::string headerLine; success = success && FileGetLine(file, headerLine); success = success && headerLine == "Date list"; std::string countLine; success = success && FileGetLine(file, countLine); uint32_t count; success = success && ParseUint32(countLine, count); for (uint32_t i = 0; success && i < count; i++) { std::string dateLine; success = success && FileGetLine(file, dateLine); Date date; success = success && ParseDate(dateLine, date); if (success) { dates.push_back(date); } } if (file != nullptr) { FileClose(file); } return success; }
If success
remains true
then the functions will be executed.
If it becomes false
then all remaining functions will be skipped.
Unfortunately there’s no allowance to generate error messages here.
If you want to know more you’ll have to add logging or step through it.
Optional values
Using an std::optional
we can represent a success with it’s value or a fail together.
It provides an and_then
method which calls a given function if it’s valid or returns another optional value if not.
std::optional<std::vector<Date>> ReadDates(const std::string& filename) { auto file = FileOpen(filename, "r"); if (!file.has_value()) { LogInfo("Failed to open file {}", filename); return nullopt; } const auto headerLine = file.and_then(FileGetLine); if (!headerLine.has_value() || headerLine != "Date list") { FileClose(file.value()); LogInfo("Failed to load header"); return nullopt; } const auto count = file.and_then(FileGetLine).and_then(ParaseUint32); if (!count.has_value()) { FileClose(file.value()); LogInfo("Failed to load date count"); return nullopt; } std::vector<Date> dates; for (uint32_t i = 0; i < count; i++) { const auto date = file.and_then(FileGetLine).and_then(ParseDate); if (!date.has_value()) { FileClose(file.value()); LogInfo("Failed to load date line {}", i); return nullopt; } dates.push_back(date); } FileClose(file.value()); return dates; }
I’ve not used and_then
before. It ends up being very similar to the “return early” version before. Rather than a success and values we combine the two. However the if-expression has to be a separate line so we don’t save any space. I’m using the has_value()
and value()
methods for the optional values. It could be written more compactly using automatic convertion and dereferencing.
Maybe it could be better if we were willing to give up on error messages?
Optional<std::vector<Date>> ReadDates(const std::string& filename) { auto file = FileOpen(filename, "r"); const auto header = file.AndThen(FileGetLine).AndThen(ParseString, "Date list"); const auto count = file.And(header).AndThen(FileGetLine).AndThen(ParaseUint32); auto dates = MakeOptional<std::vector<Date>>(); for (const uint32_t i : MakeRange(count.or_else(0))) { dates = file.AndThen(FileGetLine).AndThen(ParseDate).AndThen(AppendValue, dates); } file.AndThen(FileClose); return dates; }
I to switch away from std::optional
to a custom Optional
class because
I wanted more from and_then
.
It just takes a function but I want to give it a function and some additional arguments.
I also added an And
method so that count
can be conditional on both file
and header
.
This has the same number of lines as “no error checking”.
I’m impressed but also unsure about it.
This is definitely feels more complicated.
Execution flows from top to bottom and all the if-statements have been hidden away inside AndThen
.
As each value is Optional
you are forced to deal with errors,
you can’t just try to pull data from the file if it hasn’t opened properly.
I think the biggest potential problem is failing to cascade the success status through properly.
Here it would have been easy to code it so that a bad header didn’t matter.
It has to be deliberately included using file.And(header)
and that could be missed.
Result values
Similarly to Optional
you can create a Result
type that either has a value or error details.
Result<std::vector<Date>> ReadDates(const std::string& filename) { auto file = FileOpen(filename, "r"); if (file.IsError()) { return file.AddError("Failed to open file {}", filename); } const auto header = file.AndThen(FileGetLine).AndThen(ParseString, "Date list"); if (header.IsError()) { FileClose(file.Value()); return header.AddError("Failed to load header"); } const auto count = file.AndThen(FileGetLine).AndThen(ParaseUint32); if (count.IsError()) { FileClose(file.Value()); return count.AddError("Failed to load date count"); } std::vector<Date> dates; for (uint32_t i = 0; i < count; i++) { const auto date = file.AndThen(FileGetLine).AndThen(ParseDate); if (date.IsError()) { FileClose(file.Value()); return date.AddError("Failed to load date line {}", i); } dates.push_back(date); } FileClose(file.Value()); return dates; }
This manages to have a more compact yet potentially richer set of error messages.
The error details generate by, say, ParseUint32
could be preserved and amended by this functions error message.
Can this be re-written in a similar compact form as Optional
?
Optional<std::vector<Date>> ReadDates(const std::string& filename) { auto file = FileOpen(filename, "r"). IfErrorAdd("Failed to open file {}", filename); const auto header = file.AndThen(FileGetLine).AndThen(ParseString, "Date list"). IfErrorAdd("Failed to load header"); const auto count = file.And(header).AndThen(FileGetLine).AndThen(ParaseUint32). IfErrorAdd("Failed to load date count"); auto dates = MakeOptional<std::vector<Date>>(); for (const uint32_t i : MakeRange(count.or_else(0))) { dates = file.AndThen(FileGetLine).AndThen(ParseDate).AndThen(AppendValue, dates). IfErrorAdd("Failed to load date line {}", i); } file.AndThen(FileClose); return dates; }
So yes it can but this might be too far for me.
IfErrorAdd
lets non-error values through without change but amends error messages with more information.
However it feels like the error messages are getting in my way of understanding the code.
For now I prefer the less compact form.
On balance
I think std::optional
, Optional
and Result
all have potential.
I need a bit more practice with them.
The early return technique still has a lot to recommend it, simple to do and understand.
The biggest frustration with all this code was FileClose
. If you’re dealing with code that can have a number of error states minimise the amount of extra work required. Have objects that manage their own state or use statements that can tidy up for you.
This can be a particular problem if different libraries want different error handling techniques. Some might return raw values, some optional ones, some results and other use exception handling. That would be a mess. If a library doesn’t provide a compatible system it could be worth wrapping it up. A thin layer around the library to translate into your preferred error handling technique. Here I could have written a templated Defer
class that can take a lambda and run it when the object was destroyed. That way I could have setup the FileClose
once and then forgotten about it. Much better.
P.S.
Did anyone spot the problem with these examples? I thought of it after the post had been scheduled but before it went live. None of them check that the file is actually finished before closing. According to the format there is a fixed number of dates and that’s it, extra dates are an error. While I could go back and fix it without anyone knowing there’s also a lesson here. I go on about it enough, we all make mistakes. If these were going in to a code base then they should have had unit tests written and, ideally, a code reviews as well. Here all the examples with error checking are valid but need an extra check which will be, say, 5 lines in the worst case and just 1 in the best case.
Leave a Reply