There are many well-documented reasons to limit our work-in-progress (at the individual, team and organisation levels), as well as our work-in-process*.
However, perhaps the most compelling reasons for limiting both forms of WIP from a business perspective are:
* Work-in-process refers to work items which we ourselves haven't started working on yet but which are "in flight" (being worked on in some way by someone upstream) or otherwise expected to be worked on or delivered. We will have to work on them at some point, so they constrain our capacity to start or finish other things. Work items on our backlog which we are expected to deliver are work-in-process, even if no one has actually done anything other than add them to the backlog.
Let's say I am given 10 things my boss wants me to complete, and let's assume they would each take me exactly the same amount of time to get done if I focused on each individually, one at a time.
Let's also assume I am not dependent on anyone or anything else to get these things done, so the only delay or waiting time would be self-imposed or the result of unforeseen circumstances. Finally, let's assume that working on more than one item at the same time would make all of the items worked on in parallel take longer, because of the well-documented costs of context switching.
If each of the 10 tasks will take me 1 hour to complete, in theory I can be done in 10 hours. No problem, I tell the boss they will all be done by 12 noon tomorrow. However, after I'd completed 4 of the tasks, my son got sent home from school with sickness so I had to go and pick him up and take him home. Luckily I have an understanding boss, and when I told her I wouldn't be done with all the tasks until later in the afternoon, she was sympathetic.
However, because I told her I would be done by 12 noon, she scheduled a presentation to her boss at 12.30. So she said to me "I know you won't get all the tasks done by then, but if you could get as many done as possible it would be much appreciated. This is your top priority; if anyone else tries to give you any work, send them to me. Also, I've put the tasks in priority order so I have the most important things to show the boss in the presentation".
In this scenario, if I didn't limit my focus to the 10 tasks given to me by my boss (who has ultimate authority for setting priorities for the company), and instead accepted work from elsewhere, I would not be honouring the business priorities. In this simple scenario we haven't even considered the variability of individual tasks which exists in real-life work situations. This task variability, along with other natural and uncontrollable volatility in our work environments, makes it even more important to focus and deliver in short, repeating cycles.
Limiting WIP forces us to understand what the highest priority work items are, and ensures we have delivered the highest priority work when a deadline arrives, even if we haven't completed everything that was asked for or estimated. Put another way, limiting WIP is a crucial aspect of mitigating the business risks we call schedule risk and building the wrong thing.
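As a rough illustration of the deadline scenario, the Python sketch below counts how many of the 6 remaining one-hour tasks are fully finished when the deadline arrives, for different WIP limits. The function name, the assumption that attention is shared equally between the tasks in progress, and the figure of 3 hours left before the presentation are all hypothetical and not from the story above. Context-switching overhead is ignored, which flatters the parallel case.

```python
def completed_by_deadline(task_hours, hours_available, wip_limit):
    """Count tasks fully finished before the deadline.

    task_hours: remaining effort per task, highest priority first.
    wip_limit: maximum number of tasks worked on at the same time.
    Assumption: attention is split equally between the tasks in progress
    and switching between them costs nothing.
    """
    queue = list(task_hours)
    active = []
    finished = 0
    time_left = float(hours_available)
    while time_left > 0 and (queue or active):
        # Pull new work only when a WIP slot is free.
        while queue and len(active) < wip_limit:
            active.append(queue.pop(0))
        # Elapsed time until the smallest active task is finished.
        step = min(active) * len(active)
        if step > time_left:
            break  # the deadline arrives before anything else completes
        share = step / len(active)
        active = [t - share for t in active]
        finished += sum(1 for t in active if t <= 1e-9)
        active = [t for t in active if t > 1e-9]
        time_left -= step
    return finished


remaining_tasks = [1] * 6   # six one-hour tasks still to do
hours_left = 3              # hypothetical time left before the presentation

one_at_a_time = completed_by_deadline(remaining_tasks, hours_left, wip_limit=1)
all_at_once = completed_by_deadline(remaining_tasks, hours_left, wip_limit=6)
print(one_at_a_time, all_at_once)
```

Under these assumptions, working one task at a time in priority order finishes the three most important tasks before the deadline, while spreading the same three hours across all six tasks finishes none of them.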
Going back to the previous scenario, imagine I didn't have the deadline of noon tomorrow, and thus was never told by my boss "this is the number one priority" when my son was taken ill. Suddenly it seems I have more time; more capacity to take on other work.
Brad from Finance approaches me with some tasks he'd like me to do. If I accept those tasks, not only am I not honouring the business priorities (or, at least, I am not sure that I am), I am also taking longer delivering the highest priority tasks than I would if I focused on them and got them done before accepting other work.
Limiting our batch size (the number of work items we accept and commit to at the same time) and then limiting our WIP (the number of work items we actually work on at the same time) enables us to optimise the delivery time for the highest priority work items, assuming someone with the authority and know-how to do so has correctly prioritised these items.
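To make the cost of accepting Brad's work concrete, here is a small Python sketch comparing when the boss's tasks finish if I focus on them first versus alternating with Brad's tasks. The task names, the count of three one-hour tasks from Brad, and the strict alternation are assumptions made purely for illustration. Switching overhead is again ignored.

```python
def completion_times(schedule):
    """Map each task name to the hour at which its last slice of work ends.

    schedule: list of (task_name, hours) slices in the order they are worked.
    """
    clock = 0
    finished_at = {}
    for name, hours in schedule:
        clock += hours
        finished_at[name] = clock  # a later slice overwrites an earlier one
    return finished_at


boss_tasks = [f"boss-{i}" for i in range(1, 7)]   # six remaining tasks
brad_tasks = [f"brad-{i}" for i in range(1, 4)]   # hypothetical extra work

# Option 1: finish the boss's tasks first, then start Brad's.
focused = [(task, 1) for task in boss_tasks + brad_tasks]

# Option 2: alternate between the two sources of work.
interleaved = []
for i, task in enumerate(boss_tasks):
    interleaved.append((task, 1))
    if i < len(brad_tasks):
        interleaved.append((brad_tasks[i], 1))

focused_done = completion_times(focused)
interleaved_done = completion_times(interleaved)

boss_focused = [focused_done[t] for t in boss_tasks]
boss_interleaved = [interleaved_done[t] for t in boss_tasks]
print(boss_focused)      # hours at which each boss task is delivered
print(boss_interleaved)
```

In both options all nine tasks are finished after 9 hours, but when the work is interleaved the last of the boss's tasks is delivered at hour 9 instead of hour 6, and every boss task after the first arrives later than it would have with focus.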
If we do not have a single ordered priority list to work from then we are always at risk of not working on, or optimising the delivery time of, the highest priority items for the business. We will consequently miss deadlines and market windows. Limiting WIP forces us to prioritise and sequence, or at least it forces the conversation.
When we accept the truth about context switching, both in terms of literally juggling/"focusing" on two tasks at the same time and in terms of pausing unfinished work to switch our focus to something else, the path to becoming "more efficient" becomes clear: if we create single prioritised lists of work items (representing deliverables, not tasks) for individuals and teams to pull from, we optimise how many things will be delivered (all other things being equal). In other words, we will deliver more things than we otherwise would in the same amount of time.
When we pause work and switch our focus to something else, the first deliverable is in "waiting" mode. It is deemed "in progress" from a cycle time perspective because we started work on it, but it is not actually making any progress. The clock is ticking, and this waiting time is pure delay from the point of view of the requestor of that deliverable (the customer).
We may think we are being "efficient" because we are working on more than one thing at the same time, but that is efficiency from our own point of view rather than the customer's. Efficiency seen from the worker's point of view is called resource efficiency. Efficiency seen from the customer's point of view is called flow efficiency.
Getting more stuff done, if by "done" we mean we have delivered something useful to somebody, is achieved through improving flow efficiency, not resource efficiency. Resource efficiency improvements mean people's time is utilised more, not that more stuff is being delivered. And to make matters worse, as a general rule, the higher the resource efficiency, the lower the flow efficiency.
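The text above does not give a formula, but flow efficiency is commonly calculated as the share of a work item's elapsed time during which it was actually being worked on. The Python sketch below uses that common definition. The function name and the example figures are assumptions for illustration only.

```python
def flow_efficiency(active_hours, elapsed_hours):
    """Fraction of elapsed time in which the item was actually worked on.

    active_hours: time spent making real progress on the item.
    elapsed_hours: time from starting the item to delivering it,
                   including all the time it spent waiting.
    """
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return active_hours / elapsed_hours


# Hypothetical deliverable: 8 hours of real work, delivered 40 working
# hours after it was started because it kept being paused.
print(f"{flow_efficiency(8, 40):.0%}")
```

In this made-up example the customer waited 40 hours for 8 hours of work, so the item was idle for most of its time "in progress", however busy the person working on it was.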
A great advantage of using a numerical lever like WIP is that we can treat it as a unit of capacity which we can see and control. Consider a simple scenario where we start the week with 5 work items in progress (and we have a WIP limit of 5). At the end of the week, the same 5 items are in progress. One week later, 2 of those items are complete and 3 of them are still in progress.
Putting aside how effective or efficient we are being, let's assume for now that working on 5 items in parallel is "about right" for our team (which is why we have chosen a WIP limit of 5). The fact that we are now working on only 3 means that 2 "slots" have opened up for more work to be started. These slots represent real capacity: space in our schedule. This enables us to start new work at the cadence with which we complete it (aka match demand with capacity) without pulling time and focus away from work items already in progress.
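The idea of slots can be sketched as a tiny Python class. The class and method names are hypothetical, and this is only a minimal model of the scenario above, not a description of any particular tool.

```python
class WipBoard:
    """Tracks items in progress against a fixed WIP limit."""

    def __init__(self, wip_limit):
        self.wip_limit = wip_limit
        self.in_progress = set()

    def free_slots(self):
        # Capacity that is available for starting new work.
        return self.wip_limit - len(self.in_progress)

    def start(self, item):
        # Refuse to start new work when there is no free slot.
        if self.free_slots() <= 0:
            raise RuntimeError("WIP limit reached: finish something first")
        self.in_progress.add(item)

    def finish(self, item):
        self.in_progress.remove(item)


board = WipBoard(wip_limit=5)
for item in ["A", "B", "C", "D", "E"]:
    board.start(item)
slots_at_start = board.free_slots()      # the board is full

board.finish("A")
board.finish("B")
slots_after_two_done = board.free_slots()
print(slots_at_start, slots_after_two_done)
```

Starting a sixth item while the board is full raises an error in this sketch, which mirrors the rule that new work is only pulled in when a slot opens up.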
If we had started more work during the first week and gone over our WIP limit, we might still have completed 2 items (or perhaps more) in the following week, but we would not know what our optimum capacity is for getting things done sustainably, and we would become less predictable. If we feel we are working under capacity, it is better from a predictability point of view to experiment with increasing the WIP limit by 1 and seeing how that affects our throughput (how many things we get done in a time period) than to break our WIP limit arbitrarily.
A team with a WIP limit of 5 can say "give us 5 things of similar size/effort, and we will complete them in x amount of time, on average". The person setting the priorities knows to always pick the top 5, and the cadence with which they need to do that.
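One way to estimate the "x amount of time" is Little's Law, which is not named in the text above but is a standard result: in a stable system, average cycle time equals average WIP divided by average throughput. The Python sketch below applies it. The function name and the throughput of 2 items per week are assumptions for illustration, and the result is a long-run average rather than a promise for any single item.

```python
def average_cycle_time(average_wip, throughput_per_week):
    """Little's Law: average cycle time = average WIP / average throughput.

    Only meaningful as a long-run average for a reasonably stable system,
    where work is started at roughly the rate at which it is finished.
    """
    if throughput_per_week <= 0:
        raise ValueError("throughput_per_week must be positive")
    return average_wip / throughput_per_week


# Hypothetical team: WIP limit of 5, finishing 2 items per week on average.
weeks_at_limit_5 = average_cycle_time(5, 2)

# Raising the limit to 6 without any gain in throughput only makes each
# item take longer on average.
weeks_at_limit_6 = average_cycle_time(6, 2)
print(weeks_at_limit_5, weeks_at_limit_6)
```

This is also a simple way to read the result of a WIP limit experiment: if the limit goes up by 1 and throughput does not, the extra item has added waiting time rather than output.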
NOTE - a Scrum development team's Sprint forecast uses the same principle. Because the team works in a consistent way each Sprint (which is a fixed length of time) and only pulls in the amount of work they know they can complete based on historical data, the team becomes quite predictable over time, despite variances in the size/duration of individual work items.
We can also use this visualised WIP lever to allocate capacity to different types of work. For example, we can reduce our WIP limit (and thus capacity) for planned work items from the product backlog such that we have explicitly created more time to perform BAU work or address quality issues.
Within the team, individuals can also abide by WIP limits to avoid context switching and to become more predictable to their teammates in what they are working on and how it is progressing. This enables the team to focus on the work itself rather than the worker, and on getting that work complete by helping each other, rather than optimising for how many tasks are in progress.
In summary, the primary business case for limiting WIP is that it gives faster outcomes (delivery of the right things) AND better predictability, both in the short and longer term, as well as reducing stress on individuals and teams because they are controlling the work to their own pace and capacity (which in turn is likely to lead to higher productivity).