This problem is sometimes referred to as communication complexity and can be a problem for teams. It also has implications for system architecture and for other kinds of connected systems.
Team problem
One person by themselves doesn’t have to worry about synchronising their work with anyone. There is no communication overhead. On the other hand, it’s just one person, so big tasks will take a long time.
Growing the team means there are more people to do the work, but checks must be made between people to ensure everything stays compatible.
People | Checks |
---|---|
1 | 0 |
2 | 1 |
3 | 3 |
4 | 6 |
5 | 10 |
6 | 15 |
7 | 21 |
8 | 28 |
9 | 36 |
More people means quadratically more checks, and more time being “wasted” on coordination. Of course it’s not really wasted, but it is a cost that comes from having larger teams.
The exact nature of the checks depends on the work that’s being done. It could be one-on-one conversations between developers. It could be the number of people who have to be involved in code reviews. It could be how many people attend each team meeting. More people means more potential interactions, and so more cost.
A small team of, say, 3 or 4 people can work on a tough problem. Even if they have to work closely together regularly, the cost of those interactions isn’t too high. A larger team has to work differently, whether that’s breaking the problem into smaller, less connected problems, changing the team structure to reduce communication channels, or ideally both. This problem is the basis of the recommended team sizes for Scrum: depending on who you listen to, that might be 3-9 members or 5-9 members.
By splitting teams into smaller groups, and coordinating between groups rather than between every individual, you can reduce the overhead.
People | Groups | Checks |
---|---|---|
1 | 1 | 0 |
2 | 2 | 1 |
3 | 3 | 3 |
4 | 2/2 | 3 |
5 | 3/2 | 5 |
6 | 2/2/2 | 6 |
7 | 3/2/2 | 8 |
8 | 3/3/2 | 10 |
9 | 3/3/3 | 12 |
So with 3 groups of 3 people there are 3 checks inside each group and 3 checks between the groups. That’s 3 × 3 + 3 = 12 checks in total, a lot less than the 36 checks needed to coordinate everyone at once. If you’re working on a bigger project then maybe it will be 5 groups of 5 people, or something completely different. The right balance will depend on the exact work you’re doing.
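As a rough sketch, the counting scheme behind these tables might look like this in code, assuming one check per pair of people within a group plus one check per pair of groups:

```cpp
#include <cstdio>
#include <vector>

// Pairwise checks needed when everyone must coordinate with everyone else.
unsigned pairwiseChecks(unsigned people)
{
    return people * (people - 1) / 2;
}

// Checks when the team is split into groups: pairwise checks inside each
// group, plus one check per pair of groups.
unsigned groupedChecks(const std::vector<unsigned>& groupSizes)
{
    unsigned checks = 0;
    for (unsigned size : groupSizes)
        checks += pairwiseChecks(size);
    checks += pairwiseChecks(static_cast<unsigned>(groupSizes.size()));
    return checks;
}

int main()
{
    std::printf("9 people, one team:      %u checks\n", pairwiseChecks(9));        // 36
    std::printf("9 people, 3 groups of 3: %u checks\n", groupedChecks({3, 3, 3})); // 12
}
```

Plugging in other splits lets you compare different team structures the same way.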
System architecture problem
The same sort of argument can be made about system architectures. With tightly connected modules in a project you have to make checks whenever you make a change. The more modules you have, the more checks you need to make. As the number of modules grows, the specifics of the original change matter less and less; instead you spend more and more time checking all the different interactions.
The first step here is to try and break the system up into modules that don’t have to interact much. Badly organised modules mean interwoven dependencies. Well organised modules mean each module has its own responsibilities and limited connections to others. That means, say, putting the complexities of database access into one module rather than having them everywhere. Other modules still have to interact with the database module, but a lot of the intricacies are hidden inside it.
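As a minimal sketch of what that might look like, here is a hypothetical storage interface (the UserStore and UserRecord names are invented for illustration):

```cpp
#include <string>

// Hypothetical storage interface: the rest of the system talks to the
// database module only through this narrow surface. Connection handling,
// SQL and retries stay hidden behind it.
struct UserRecord
{
    std::string id;
    std::string name;
};

class UserStore
{
public:
    virtual ~UserStore() = default;
    virtual UserRecord loadUser(const std::string& id) = 0;
    virtual void saveUser(const UserRecord& user) = 0;
};

// Other modules depend on UserStore, not on the database library itself, so
// swapping the database (or faking it in tests) doesn't ripple through them.
```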
The second step might be to organise modules or libraries into layers or tiers. Higher layers use lower layers to perform tasks, but the lower layers don’t have to care about higher layers. Sometimes these layers have official names, e.g. Presentation, Business, Database, but you can have many more layers for technical reasons. Unit tests for each layer can be written assuming that lower layers have already been tested and are working. This reduces the need to duplicate functionality inside tests; you can safely just use code from lower layers.
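A very small sketch of that layering, with made-up names, might look like this:

```cpp
#include <cstdio>

// Data layer: the lowest tier, tested on its own.
namespace data
{
    inline int loadCount() { return 42; }   // stand-in for real storage access
}

// Business layer: depends only on the data layer below it.
namespace business
{
    inline bool isOverLimit(int limit) { return data::loadCount() > limit; }
}

// Presentation layer: depends only on the business layer below it.
namespace presentation
{
    inline const char* statusText(int limit)
    {
        return business::isOverLimit(limit) ? "Over limit" : "OK";
    }
}

int main()
{
    // A test at this level can assume the lower layers already work, so it
    // just calls them instead of re-implementing storage behaviour.
    std::printf("%s\n", presentation::statusText(10));
}
```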
It’s also important to remember that dependencies between modules can be both explicit and implicit.
If two modules both use a constant, `const uint32_t MaximumTokenCount = 1024;`, then they are explicitly linked to that definition. However, if two modules both have their own constant, or just use 1024 directly, but these have to be kept in sync, then they are still implicitly linked. It’s much better for the compiler to take care of an explicit link than to rely on manually updating values with an implicit link.
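A short illustration of the difference (makeTokenBuffer and tokenCountValid are invented for the example):

```cpp
#include <cstdint>
#include <vector>

// Explicit link: both modules include this single definition, so the
// compiler keeps them in sync if the value ever changes.
const uint32_t MaximumTokenCount = 1024;

// "Module A" (invented for the example)
std::vector<uint32_t> makeTokenBuffer()
{
    return std::vector<uint32_t>(MaximumTokenCount);
}

// "Module B" (invented for the example)
bool tokenCountValid(uint32_t count)
{
    // The implicit version would hard-code 1024 here and could silently
    // drift out of sync with module A.
    return count <= MaximumTokenCount;
}

int main()
{
    return tokenCountValid(static_cast<uint32_t>(makeTokenBuffer().size())) ? 0 : 1;
}
```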
Network problem
If you are choosing between a peer-to-peer architecture, a client-server model or something else, there is a communication overhead between nodes. A networked video game might use a peer-to-peer system if there were just a few players. That would minimise the round trip time to synchronise data. For a large number of players that becomes impractical, as more and more data has to be passed around between more and more pairs of machines. There a client-server model would reduce the amount of data that had to be transmitted, but would increase the round trip time to distribute it. For massive numbers of players, extra layers would be required to stop a single server from becoming overloaded.
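As a rough back-of-envelope comparison, assuming a full mesh of connections for peer-to-peer and one connection per client for client-server:

```cpp
#include <cstdio>

// Links needed to connect n players in each topology.
unsigned peerToPeerLinks(unsigned players)   { return players * (players - 1) / 2; }
unsigned clientServerLinks(unsigned players) { return players; }   // one link per client

int main()
{
    const unsigned counts[] = {4, 8, 64};
    for (unsigned players : counts)
        std::printf("%3u players: peer-to-peer %4u links, client-server %3u links\n",
                    players, peerToPeerLinks(players), clientServerLinks(players));
}
```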
General problem
The number of connections in any tightly coupled system grows quickly. Technically, for a system with n nodes that’s:

connections = n × (n − 1) / 2

If there is a cost to those connections then that cost also grows quickly. Even if the cost starts small, it can become a problem.
This is a general problem that can affect all sorts of systems, and maybe we can learn lessons in one field by looking at others. Generally, if you have any complicated system, try to isolate the parts. Make it so that changes to one part don’t impact all the other parts.