This case study describes how Siemens Health Services improved quality, increased overall predictability, and decreased cycle times by moving from traditional agile metrics (story points, velocity) to actionable flow metrics (Work in Progress, Cycle Time, Throughput).
Siemens Health Services provides sophisticated software for the healthcare industry. We had been using traditional "agile" metrics (e.g., story points and velocity) but never experienced the transparency and predictability that those metrics promised. By moving to the simpler, more actionable metrics of flow (Work in Progress, Cycle Time, and Throughput), we were able to achieve a 42% reduction in Cycle Time and a very significant improvement in operational efficiency. Implementing Kanban also produced significant improvements in collaboration and quality, all of which have persisted across numerous releases. This article explains how Siemens increased its agility by switching to a continuous flow model, and how predictability is a systemic behavior that must be managed by understanding and acting in accordance with the assumptions of Little's Law and the effects of resource utilization.
Product Lifecycle Management (PLM) is the development organization of Siemens HS. It comprises about 50 teams, primarily based in Malvern, Pennsylvania, with significant development resources in India and Europe. In 2003 the company launched a very ambitious project to create Soarian®, a completely new line of healthcare enterprise solutions.
The healthcare industry is very complicated and is constantly changing, being restructured, and being regulated. Given our industry, it should come as no surprise that the quality of our products is of the utmost importance; in fact, one could argue that quality is mission critical. Additionally, the systems we develop must be scalable to accommodate both the smallest and largest multi-facility healthcare systems in the world. Our performance must be of the highest caliber, and we must abide by all applicable FDA, ISO, Sarbanes-Oxley, patient safety, auditability, and reporting regulations.
Our main business challenge is to develop functionality quickly enough to compete with established, mature systems on the market. Our newer systems give us technological capabilities that enable us to outperform the competition. To accomplish this, we adopted agile development methodology and, more specifically, Scrum/XP practices.
Our development teams transitioned to agile in 2005. We took an expedited approach to assimilating and incorporating new practices, enlisting many of the most well-known experts and coaches in the community. We almost immediately noticed a significant improvement over our previous waterfall techniques, and our enthusiasm for agile kept growing. We had a mature agile development program by September 2011, having incorporated the majority of Scrum and XP practices.
All roles (product owners, scrum masters, business analysts, developers, and testers) were represented on our Scrum teams. We ran 30-day sprints with formal sprint planning, reviews, and retrospectives, and had an established product backlog. We regularly released significant new batches of features and improvements once a year (primarily because that is how frequently we had always released). Generally speaking, our teams and process were well integrated, with techniques like continuous integration (CI), test-driven development (TDD), story-driven development, continuous customer interaction, pair programming, planning poker, and relative point-based estimation. Our experience demonstrated that Scrum and agile practices greatly enhanced cross-role collaboration, customer functionality, code quality, and speed. All feature analysis, development, and testing are part of our Scrum process. A feature is deemed "done" only after passing validation testing conducted by a Test Engineer within each Scrum team in a fully integrated environment. After all release features are complete, Siemens conducts customer beta testing before announcing general availability and shipping to all of our customers.
Despite numerous advancements and substantial advantages brought about by our adoption of agile, our overall success was limited. It was difficult for us to predict and meet committed release dates. A high level of certainty and predictability is necessary to satisfy regulatory requirements and customer expectations. Our internal decision checkpoints and quality gates required firm commitments. Accurate release scope and delivery forecasts with a significant penalty for delay were necessary to uphold our commitment to customers, internal stakeholder expectations, and revenue projections.
In practice, our teams were forced to plan and finish stories in time-boxed sprint increments. The final week of every sprint was marked by a mad dash to complete as many points as possible, which led to hurried and overburdened story testing. Although velocity at sprint reviews frequently appeared high, the truth was that many stories were blocked or unfinished, and many features were in development with few, if any, finishing before the release. Teams starting too many features and/or stories led to a discrepancy between velocity (the number of points completed in a sprint) and reality. It had been customary to start several features simultaneously in order to reduce potential risks. Additionally, stories and features were frequently delayed for a variety of reasons: awaiting a dependency from another team, awaiting customer approval, being unable to test due to environment or build-break issues, and so on. When this happened, teams would simply move on to the next story or feature so that we could recoup the points we had committed to earn. As a result, even though velocity burn-ups might appear to be in line with expectations, many features were not finishing at a regular cadence, which caused bottlenecks, especially at the end of the release as teams worked to finish and test features. During this time, we worked under the belief that we would succeed if we mastered agile practices, planned better, and put in more effort. Heroic efforts were expected.
In order to coordinate and promote process improvement throughout the PLM organization, executive management appointed a small team of director-level managers in November 2011. The team's main objective was to finally achieve the predictability, operational efficiency, and quality gains that our agile approach had initially promised. After conducting research, the team concluded that any changes would need to be systemic. Past process improvements had been more narrowly focused on particular functional areas, like coding or testing, and had not produced significant improvements for the entire system or value stream. In this context, value stream refers to all development activities carried out by Scrum teams from "specifying" to "done." By reviewing the value stream from a Lean perspective, we discovered that our issues were indeed systemic and were caused by our preference for large batch sizes, such as large feature releases. We also gained an understanding of the effects of large, systemic queues from reading Goldratt (Goldratt, 2004) and Reinertsen (Reinertsen, 2009). It was an epiphany to realize that the overtime for which programmers were giving up their weekends might actually have been extending the release completion date.
This path inevitably led us to learn about Kanban. We saw in Kanban a way to maintain our fundamental agile development practices while still enforcing Lean and continuous improvement throughout the system. By implementing a pull system, Kanban would manage Work in Progress, Cycle Time, and Throughput, minimizing the negative effects of large batches and high capacity utilization. Additionally, we saw in Kanban the potential for metrics that could be both concrete (and easily understood by all corporate stakeholders) and give individual teams and program management highly transparent and useful data.
We selected our revenue-cycle application as our pilot; it consists of 15 Scrum teams located in Malvern, Pennsylvania; Brooklyn, New York; and Kolkata, India. Although each Scrum team focuses on a distinct business domain, the application itself requires integrating all of these domains into a single, comprehensive customer solution. At this level of systemic complexity, dependency management, and continuous integration, the entire program must be extremely consistent and cohesive. To address this, we developed a "big-bang" strategy that standardized all teams' policies, work units, workflows, doneness criteria, and metrics. We also concluded that we needed electronic boards: big screens in each team room that all of our local and remote developers could access in real time. An electronic board would also offer a method for collecting real-time metrics and an enterprise management view across the program.
We had to modify our metrics once we made the decision to implement Kanban at Siemens HS in order to better match our newly discovered emphasis on flow. The metrics of flow are very different from conventional metrics used in scrum teams. As previously mentioned, our teams now paid attention to Work in Progress (WIP), Cycle Time, and Throughput rather than factors like story points and velocity. These flow metrics are preferred over conventional agile metrics because they are much more transparent and actionable. Transparent metrics give teams (and programs) a high level of visibility into their progress. By “actionable,” we mean that the metrics themselves will recommend the particular team interventions required to enhance the process’s overall performance.
We must first examine some definitions in order to understand how flow metrics might suggest improvement interventions. We defined WIP for Siemens HS as any work item (e.g., user story, defect) in our workflow situated between the "Specifying Active" step and the "Done" step (see Figure 1).
Cycle Time was defined as the overall amount of time required to complete a work item from "Specifying Active" to "Done." Throughput was defined as the number of work items that entered the "Done" step per unit of time (e.g., user stories per week). It is crucial to remember that Throughput and velocity differ in this situation. With velocity, a team measures story points per iteration. With Throughput, a team merely counts the number of work items finished per arbitrary unit of time; that unit could be an iteration, a week, a month, or even a day.
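To make these definitions concrete, here is a minimal sketch in Python of how the three flow metrics can be computed from item start and finish dates. The data and layout are hypothetical illustrations, not Siemens' actual tooling:

    from datetime import date

    # Hypothetical work items as (entered "Specifying Active", entered "Done");
    # a None finish date means the item is still in progress.
    items = [
        (date(2012, 9, 3), date(2012, 9, 24)),
        (date(2012, 9, 5), date(2012, 10, 1)),
        (date(2012, 9, 10), None),
    ]

    as_of = date(2012, 9, 28)

    # WIP: items started but not yet done as of a given day.
    wip = sum(1 for start, done in items
              if start <= as_of and (done is None or done > as_of))

    # Cycle Time: days from "Specifying Active" to "Done" for finished items.
    cycle_times = [(done - start).days for start, done in items if done]

    # Throughput: finished items per week over a four-week reporting window.
    window_start, window_end = date(2012, 9, 1), date(2012, 9, 28)
    finished = sum(1 for _, done in items
                   if done and window_start <= done <= window_end)
    throughput = finished / 4.0  # items per week

    print(wip, cycle_times, throughput)  # 2 [21, 26] 0.25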
These three metrics are tied together by Little's Law, which (in the form relevant here) states that Average Cycle Time = Average WIP / Average Throughput. That is to say, if one of these metrics changes, it is almost certain that at least one of the others will as well. This relationship identifies the precise lever or levers to pull to correct any one of these metrics that is not where we want it to be. In essence, Little's Law is what turns flow metrics into actionable data.
Think for a second about how profound this relationship is. Little's Law reveals most (but not all) of what we need to know about the relationship between WIP and Cycle Time. Specifically, for the purposes of this paper, Little's Law formalizes the fact that a decrease in average WIP will result in a decrease in average Cycle Time, provided certain assumptions are met. You don't need to undergo a difficult agile transformation or perform more rigorous estimation and planning to make positive changes to your overall process. Most of the time, all you have to do is manage how much work you have in progress at once. Simple, but true.
Little’s Law cannot be fully discussed in this article; instead, please see the References and Appendix A at the end of this paper for a fuller, though still incomplete, treatment of Little’s Law. The purpose of bringing it up here is to increase awareness of the relationship between these important metrics. Most agile teams don’t consider WIP, Cycle Time, and Throughput, but knowing these metrics and how they interact is essential to applying agile management in a practical way. The rest of this article will explain such an approach.
The Cumulative Flow Diagram (CFD) and the Cycle Time Scatterplot were the two primary chart types we used to display these metrics. A thorough discussion of what these charts are and how to interpret them is well outside the scope of this article, as it was with Little’s Law. These charts are among the best tools available for managing flow, so we encourage you to look into them further on your own. The references section at the end of this article contains some links to resources you may find useful because, regrettably, there is a lot of false and misleading information about how these charts operate.
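As one illustration of the mechanics (not a reproduction of Siemens' actual charts), a basic CFD can be drawn from cumulative arrival and departure counts; the vertical gap between the bands is WIP, and the horizontal gap approximates Cycle Time. A minimal sketch in Python with made-up numbers:

    import matplotlib.pyplot as plt

    # Hypothetical cumulative counts per day: items that have *entered*
    # the workflow vs. items that have reached "Done" (bands only grow).
    days = list(range(10))
    arrived = [3, 5, 8, 10, 13, 15, 18, 20, 23, 25]
    done = [0, 1, 2, 4, 5, 7, 8, 10, 12, 14]

    plt.fill_between(days, done, arrived, label="In progress (WIP)")
    plt.fill_between(days, 0, done, label="Done")
    plt.xlabel("Day")
    plt.ylabel("Cumulative work items")
    plt.legend()
    plt.title("Cumulative Flow Diagram (illustrative)")
    plt.show()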
We have emphasized throughout this paper that Siemens HS places a high priority on predictability. How did the organization perform prior to the implementation of Kanban? A scatterplot of Cycle Times for completed stories in the Financials organization for the entire release prior to the introduction of Kanban is shown in Figure 3.
What this scatterplot tells us is that in this release, 50% of all stories finished in 21 days or less. But remember that Siemens HS was running 30-day sprints. That means any story started at the beginning of a sprint had little better than a 50% chance of finishing within that sprint. Furthermore, 85% of stories were finishing in 71 days or less: well over two full sprints. What's worse, Figure 3 demonstrates that the general trend of story Cycle Times was getting longer and longer as the release progressed (see Figure 4).
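These percentiles are straightforward to compute from raw Cycle Time data; the figures below are invented for illustration, not Siemens' numbers:

    import numpy as np

    # Hypothetical per-story cycle times (in days) for one release.
    cycle_times = np.array([12, 18, 21, 25, 30, 44, 60, 71, 85])

    p50, p85 = np.percentile(cycle_times, [50, 85])
    print(f"50% of stories finished in {p50:.0f} days or less")
    print(f"85% of stories finished in {p85:.0f} days or less")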
What was going on here? According to a condensed version of Little's Law, we essentially have two choices when Cycle Times are too long: reduce WIP or boost Throughput. Inexplicably, most managers opt for the latter. They make teams stay late each day. They mandate weekend work. They try to pull resources from other projects. Some businesses even hire temporary or permanent employees. The issue with these attempts to increase Throughput is that most businesses actually end up increasing WIP more quickly than Throughput. Returning to Little's Law, we can see that when WIP rises more quickly than Throughput, Cycle Times rise too. If long Cycle Times are the problem, increasing WIP faster than Throughput only makes it worse.
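A quick worked example (with made-up numbers) shows the trap: a program with 30 items in progress and a Throughput of 1 item per day has an average Cycle Time of 30 / 1 = 30 days. If overtime boosts Throughput by 20% (to 1.2 items per day) but the push to "start more work" doubles WIP to 60 items, average Cycle Time climbs to 60 / 1.2 = 50 days, even though the teams are measurably producing more.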
How We Learned to Reduce Cycle Time

We eventually made the much more logical and cost-effective decision to reduce Cycle Times by limiting WIP through the use of Kanban. What most people don't realize is that reducing WIP can be as easy as making sure that work isn't started at a faster rate than it is finished (see Figure 5 for an example of how a process whose arrival and departure rates are out of sync accumulates WIP). Matching arrival and departure rates is the first step in stabilizing a system, and only a stable system gave us any hope of achieving our objective of predictability. Unfortunately for us, we delayed limiting WIP for the first release after implementing Kanban (there is a case to be made that at that point we weren't actually using "Kanban"). Why? Because the teams and management resisted imposing WIP limits early in our adoption. This was expected, since imposing restrictions on work ran counter to the beliefs prevalent at the time. We therefore decided to postpone the implementation of WIP limits until the third month of the release, which made management and the teams more accustomed to the strategy and receptive to it. We paid a price for that delay, and looking back, we should have pushed harder to impose WIP limits right away. As you might expect, the absence of WIP limits caused the same issues we saw in the previous (pre-Kanban) release to emerge again: Cycle Times were too long and, generally speaking, getting longer. The CFD for the first Kanban release (Figure 5) shows how our teams were starting work on items more quickly than we were finishing them.
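The effect of mismatched rates is easy to demonstrate. A minimal sketch in Python (with illustrative rates, not our actual data) shows WIP growing without bound whenever arrivals outpace departures:

    # Items arrive (are started) faster than they depart (finish).
    arrival_rate = 5    # items started per week
    departure_rate = 4  # items finished per week

    wip = 0
    for week in range(1, 11):
        wip += arrival_rate - departure_rate
        print(f"week {week}: WIP = {wip}")

    # WIP climbs by one item every week; by Little's Law, Cycle Time
    # climbs with it. Matching the rates (5 and 5) holds WIP flat.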
The 85th percentile of story cycle time had decreased from 71 days to 43 days during our first release with Kanban. The teams also experienced significantly less variability, as shown by comparing Figures 4 and 7 (the release before Kanban and the first release using Kanban, respectively). Less variability resulted in more predictability. In other words, Cycle Times did not continue to rise indefinitely as they had during the previous release once we limited WIP in early September 2012. At around 41 days, they almost immediately reached a stable state and persisted there for the remainder of the release.
We hope it is clear that we were able to achieve predictability by acting on the metrics our system provided. Figure 9 illustrates that 85% of stories in our first Kanban release finished in 43 days or less, and in our second Kanban release in 40 days or less. This result is the very definition of predictability.
With predictable and stable Cycle Times, we could now use these metrics as input to future projections. Additionally, the shorter Cycle Times and reduced variability greatly improved quality. Figure 10 shows how Kanban reduced both the time between when defects were created and when they were fixed during the release and the number of defects created during release development.
By controlling queues, limiting work in progress and batch sizes, and developing a cadence using a pull system (limited WIP) rather than a push system (unlimited WIP), we were able to uncover more flaws and resolve them more promptly. Because defects hide in incomplete requirements and incomplete code, "pushing" a large batch of requirements, or starting too many requirements at once, delays the discovery of errors and other problems.
By understanding Little's Law and observing how flow appears in charts like CFDs and scatterplots, Siemens HS was able to determine what adjustments were required to take control of its system. In particular, the organization had a WIP problem that was affecting Cycle Time and quality. After taking action to limit WIP, Siemens saw an immediate decrease in Cycle Time and an immediate improvement in quality.
What is cycle time?
Cycle time is the amount of time required to complete an entire production cycle for a particular good, service, or activity, from sourcing raw materials to shipping the finished product to customers. Examining the cycle times of the processes that go into producing a product is typically an efficient way to evaluate the effectiveness of the entire manufacturing operation. Cycle time is also a reliable metric for repetitive tasks that don't necessarily involve creating a product, such as loading cargo, completing forms, or taking customer calls.
There are two main ways to express cycle time:
What is throughput?
Throughput, also known as manufacturing cycle time, is a method of gauging business performance by examining the amount of time it takes a product to complete all necessary manufacturing processes. Depending on what you want to analyze, throughput can be expressed either as the number of units produced in a specific amount of time or as the amount of time it takes to complete a process. You can calculate throughput time by adding the following: processing time, inspection time, move time, and queue (wait) time.
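For example (with illustrative figures), a part that spends 2 hours being machined, 30 minutes in inspection, 30 minutes moving between stations, and 5 hours waiting in queues has a throughput time of 2 + 0.5 + 0.5 + 5 = 8 hours, of which only the 2 hours of processing adds value.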
Differences between throughput vs. cycle time
Cycle time and throughput both typically refer to the length of time needed to manufacture a product from beginning to end. The primary difference between the two is that throughput breaks the entire process down into a number of smaller components, whereas cycle time measures the process as a whole. Although the terms are sometimes used interchangeably, they usually refer to slightly different aspects of production efficiency.
Here are two additional differences:
Differences in usage
Typically, throughput is used to evaluate manufacturing performance because the way it breaks the process down into different parts makes it simpler to spot the areas that could be improved. Cycle time, which measures the total amount of time from the start of a product’s manufacturing process until it is delivered to the customer, is more frequently used to gauge overall responsiveness.
Differences in calculation
Throughput and cycle time are calculated in different ways. Cycle time can be expressed as an equation:

Cycle Time = Net Production Time / Units Produced
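For example (with illustrative numbers), a line with 480 minutes of net production time that produces 60 units has a cycle time of 480 / 60 = 8 minutes per unit.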
Additionally, there are numerous ways to calculate throughput volume, including:
Ways to improve throughput and cycle time
The primary factor in improving both throughput and cycle time is reducing waste as much as possible, which usually entails eliminating steps in the manufacturing process that don't add value to the finished product.
Some of the ways of improving cycle time are:
Some of the ways of improving throughput are:
FAQ
Is throughput the inverse of cycle time?
Not exactly. In a stable system, Little's Law holds: the average number of work items in the system equals the average completion rate (throughput) multiplied by the average time each item spends in the system (cycle time). For a given level of WIP, throughput and average cycle time are therefore inversely related, rather than being strict inverses of one another.
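For example (illustrative numbers), a process that holds a steady 20 work items in progress and completes 2 items per day has an average cycle time of 20 / 2 = 10 days; if throughput rises to 4 items per day at the same WIP, average cycle time falls to 5 days.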
Is throughput related to cycle time?
Cycle time and throughput are closely related but distinct metrics you can use to boost manufacturing performance. Regardless of your position, using the two measures together within a manufacturing or development team can help increase effectiveness.
What is meant by throughput time?
Throughput time is the total amount of time needed to complete a specific process from beginning to end. A manufacturer, for instance, can determine how long it takes to produce a product from the time the customer places the initial order to the time it is manufactured and sold.
What is the difference between throughput time and lead time?
While lead time concentrates on the period of time between a customer’s order and delivery, throughput time concentrates on the amount of time that products take to move through your system.