Blocked

You might want to be working but sometimes it’s not possible. Let’s think about why that might be and what to do about it. I have a feeling that this post could also be titled “Avoid single points of failure”.

Illness

People get ill. It’s going to happen sometimes. This is obviously a problem on some level for the person involved. Hopefully they get better soon. Does their work need to get done in the meantime? For a software engineer things aren’t normally time critical; a day or two often isn’t going to make much of a difference. It will slow things down slightly but shouldn’t be too bad. If it’s longer, a week or more, then it might start to impact schedules.

What’s more serious is when the work does still need to be done. Maybe you’re managing a team and the next sprint hasn’t been planned. Maybe you have the passwords for the server that someone really needs to access. If you’re a bit snuffly it might not be too bad to check your messages and do a bit of scheduling. If you’re coughing up blood then you really shouldn’t have to bother with that. In either case it’s better if the team is able to cope with someone being ill:

  • Have documentation on how to do things or automate them so they’re really simple.
  • Give multiple people the right permissions to fix problems.
  • Share knowledge around when people are here so you can get by when they’re not here.
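One way to spot these gaps before someone is off sick is to write down who can do what and look for jobs that only one person covers. The following is a minimal sketch of that idea; the names, tasks and the `single_points_of_failure` helper are all made up for illustration.

```python
from collections import defaultdict


def single_points_of_failure(access):
    """Return {task: person} for every task that only one person can do.

    `access` maps a person's name to the set of tasks they can perform.
    """
    people_per_task = defaultdict(set)
    for person, tasks in access.items():
        for task in tasks:
            people_per_task[task].add(person)
    # A task covered by exactly one person stops when that person is away.
    return {
        task: next(iter(people))
        for task, people in people_per_task.items()
        if len(people) == 1
    }


# Hypothetical team data, for illustration only.
team = {
    "alice": {"deploy", "sprint planning", "server passwords"},
    "bob": {"deploy"},
    "carol": {"sprint planning"},
}

print(single_points_of_failure(team))
```

With this example data the only task reported is "server passwords", which only alice can handle. That is the one to document or share first.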

All of the above also applies to holidays. They’re going to happen and it shouldn’t stop normal activity from happening. People deserve to be able to take a break and shouldn’t have to keep an eye on the office while they’re away. Think of it as practice for them being ill.

Services

When I started working this wasn’t so much of a problem. You had a computer in front of you and that’s all you really needed. Not so any more. On a given day I’m using source control to access and change code, package management systems to build libraries, search engines to look up documentation and advice, and project management systems for tasks, and the end product could be running in the cloud. If any one of these goes down you might be able to limp on for a bit, you might not.

Some of these are pretty hard to deal with. I’ll have a copy of the code locally. I might not be able to check in any commits to the server but I can still prepare them. All the rest might turn into blockers for me.

If you’ve got a build system to make external libraries hopefully it’s cached a copy of the download. I had a situation recently where the host site of one dependency was down for maintenance. It was cached locally on my machine but not on the build server. The easiest solution was just to delay the merge until the site was back up. That’s fine if you have time but terrible if you don’t. Try to avoid a build system that has to connect to the internet every time it runs.
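As a rough illustration of the caching idea, here is a sketch of a download step that only touches the network on a cache miss. It is not how any particular build system works; `fetch_dependency`, the cache layout and the example URL are assumptions for the sake of the example, and using the last part of the URL as the file name is a simplification.

```python
from pathlib import Path
import urllib.request


def _download(url):
    """Fetch the raw bytes at `url` over the network."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_dependency(url, cache_dir, download=_download):
    """Return the path of a local copy of `url`, downloading only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Simplification: name the cached file after the last part of the URL.
    cached = cache_dir / url.rsplit("/", 1)[-1]
    if cached.exists():
        return cached  # cache hit: the host site can be down and we don't care
    cached.write_bytes(download(url))
    return cached


# Hypothetical usage:
# path = fetch_dependency("https://example.com/libfoo-1.0.tar.gz", "deps-cache")
```

The important part is that the cache lives somewhere the build server can see, not just on one developer’s machine.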

Being cut off from search engines seems very punishing nowadays. Unless you’re dealing with a familiar topic it seems to be part of the bread and butter of development. I mean, you know roughly how the API works but details matter. However, if I’m looking up C++ language details then 90% of the time it’s from cppreference.com. I could actually download the entire reference and follow the links manually. It’s still a step down from typing into a search box and just getting the right page.

If your services live in the cloud and only in the cloud then… you’ll have to find something else to do.

IT issues

While external services are more relevant now, you’ve always needed a working machine and local network. Over the years I’ve had hard drives fail, networks go down and machines lock up. What to do and how to prevent it very much depends on the specifics.

Hopefully you use source control and check in regularly. I talked about the problem of source control having multiple jobs. It’s not uncommon for me to have experimental branches living locally on my machine for a long time. They don’t feel in the right state to share. However, that’s a risk and running a separate backup system would help mitigate that.
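For local-only branches, one cheap option is `git bundle`, which packs a repository’s commits and branches into a single file that can be copied to a backup location. Below is a minimal sketch that builds and runs that command; the helper names and the dated file naming scheme are my own assumptions. Note that a bundle only contains committed work, so uncommitted changes still need something else.

```python
import subprocess
from datetime import date
from pathlib import Path


def bundle_command(backup_dir, repo_name, today=None):
    """Build the `git bundle` command that packs every branch into one dated file."""
    today = today or date.today()
    target = Path(backup_dir) / f"{repo_name}-{today.isoformat()}.bundle"
    return ["git", "bundle", "create", str(target), "--all"]


def backup_repo(repo_path, backup_dir):
    """Run the bundle command inside `repo_path` (requires git on the PATH)."""
    repo_path = Path(repo_path).resolve()
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    command = bundle_command(Path(backup_dir).resolve(), repo_path.name)
    subprocess.run(command, cwd=repo_path, check=True)


# Hypothetical usage:
# backup_repo(".", "/mnt/backup/bundles")
```

A bundle can later be cloned from like any other remote, so restoring is just a `git clone` of the file.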

Other business areas might be able to get away with using, say, OneDrive for all their backup. Spreadsheets, Word documents and PowerPoint presentations don’t tend to take up that much room. A few source files don’t take up that much room either but the build artefacts can add up and change constantly.

How long does it take to get a new development machine ready if one fails? What applications do you need? What does it take to build your product? Is this documented somewhere or are you going to have to try and remember? If possible the best solution might be to have an installation or configuration script to do the work. Hopefully you just need to clone the repository, run the script and sit back.
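Even before a full installation script exists, a small check script in the repository can act as living documentation of what a machine needs. This is only a sketch: the tool list is a made-up example, and `missing_tools` is a hypothetical helper built on the standard library’s `shutil.which`.

```python
import shutil

# Hypothetical list of tools needed to build the product; adjust per project.
REQUIRED_TOOLS = ["git", "cmake", "python3"]


def missing_tools(required, which=shutil.which):
    """Return the tools from `required` that cannot be found on the PATH."""
    return [tool for tool in required if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```

Running it on a fresh machine gives you the to-do list straight away, and the list in the script is a natural starting point for automating the installs later.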

Dependencies

Maybe you’re fit as a fiddle and your machine is working but you need something from someone else. Working within your team isn’t normally too bad; everyone is easy to get in touch with. However, if they are in another part of the company or a different company it can be a lot harder.

If there’s a problem with my code, a bug or a small missing feature, I like to be able to fix the problem quickly. It might not be on my schedule but I don’t want to hold up the other team’s schedule. However, this isn’t always possible. Having a “small” problem derail you for days can mess up the schedule in other ways. Often it’s possible to produce a quick fix or a workaround. A proper solution can wait for a better time.

If you need an interface between systems then it’s best if both sides can give it time together. I’ve been in situations where one side has come up with an interface and it’s only much later that the other side realises the inherent problems with it. It’s very easy to skim over something and think it seems fine without working through all the details. This can be especially true if it’s not clear which team has to deal with which parts of the problem.

In the end

You can try and build resiliency into your systems. You should. However, there are a lot of situations when you might not be able to continue with a particular task. Ideally I like to have a main task but have several other alternate tasks available. This could be my next task or longer term low priority ones. That way when one of these problems crops up there’s still something to do.

