
At Microsoft, we operate one of the world’s largest IT infrastructures. So, when we embarked on the journey nearly a decade ago to move from a primarily on-premises network of physical servers to one that now operates almost entirely in the Azure cloud, it was a mammoth undertaking.
And like all long and rewarding journeys, this one led to many important insights. We’d like to share five overarching principles that we learned along the way with our customers, most of whom are somewhere in the midst of their own organizational transformation into a cloud-first company, or who may be contemplating such a move.
By delineating our guiding principles and major takeaways from our own journey to the cloud, we at Microsoft Digital—the company’s IT organization—hope that other companies can learn from our experience and have a smoother and more efficient transition of their own, saving time, money, and effort.
“Our customers can learn from us having gone through it,” says Pete Apple, a technical program manager and cloud architect in Microsoft Digital. “Because we didn’t do it right the first time, at all. And so that learning process of, ‘This is what we did, this is how we did it, this is what you should think about’ can help them consider their own options.”
Stages in our journey to the cloud

- 10% migrated: retire unused workloads; small apps; IaaS (infrastructure as a service) lift and shift
- 28% migrated: reduce multiple environments; small and mid-sized apps; IaaS and PaaS (platform as a service)
- 74% migrated: large, more complex apps; focus on PaaS
- 90%+ migrated: largest, most complex apps; design cloud-native apps
Our journey to transform our on-premises IT infrastructure to a system based in the Microsoft Azure cloud took roughly four years, and we continue to innovate and refine our approach today.
Be vision-led and metric-driven
When setting off on a years-long journey, you don’t just walk out the door with a vague idea of where you’re going. As we embarked on this project, our leadership laid out the overall vision that guided our project plans.
“Our leadership was critical; they gave us the vision of, ‘We’re going to migrate to the cloud, and we want to be first and best. We’re going to be an example for the rest of the industry,’” Apple says. “They made a big bet on it, and then they put the support behind it to hold the teams accountable, tracking against the goals and metrics. This directive went all the way up to (Microsoft CEO) Satya Nadella; it was an absolute priority from his point of view.”

Martin O’Flaherty, a principal PM manager at Microsoft Digital, explained how important it was that senior leadership stuck to the vision and remained patient during the long journey to the cloud.
“Our executive vice president took the long view of this project, and he backed us as we took the time to work through all the issues and all the times when things failed,” O’Flaherty says. “We had to ‘embrace the red’ by talking about those failures rather than cover things up, in order to keep learning throughout the process. Leadership made it clear that doing the job right was the priority, and that trust gave us the confidence to stay focused and deliver.”
As far as metrics are concerned, consider the size of the Microsoft digital landscape: more than 220,000 employees in over 100 countries using more than 750,000 devices. Moving a supporting infrastructure of this size to the cloud required close attention to specific metrics throughout the process, both to measure progress and to understand the biggest challenges and potential obstacles along the way.
“We had something like 800-plus different services across the company that we had to deal with in our journey to the cloud, which I like to call the total footprint,” Apple says. “We had to track how many of them were in the cloud, how many were on-premises, and how many were hybrid. And we kept track of that quarter by quarter. We also had to monitor things like the spend for on-prem versus the cloud, and our quality metrics such as service-level agreements and customer satisfaction ratings. We had to keep an eye on all of it.”
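To make the quarter-by-quarter footprint tracking that Apple describes more concrete, here is a minimal Python sketch. It is an illustration only, not the tooling Microsoft Digital used: the `Service` type, the `footprint_summary` helper, the service names, and the two quarterly snapshots are all hypothetical. It counts how many services are in the cloud, on-premises, or hybrid, and reports the share that is fully in the cloud.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str     # hypothetical service name
    hosting: str  # one of "cloud", "on-prem", "hybrid"


def footprint_summary(services):
    """Count services per hosting model and compute the share fully in the cloud."""
    counts = Counter(s.hosting for s in services)
    total = len(services)
    cloud_share = counts["cloud"] / total if total else 0.0
    return {
        "total": total,
        "cloud": counts["cloud"],
        "on_prem": counts["on-prem"],
        "hybrid": counts["hybrid"],
        "cloud_share": cloud_share,
    }


# Made-up inventory snapshots for two quarters (not Microsoft's real data).
snapshots = {
    "Q1": [Service("payroll", "on-prem"), Service("intranet", "hybrid"),
           Service("helpdesk", "cloud"), Service("badging", "on-prem")],
    "Q2": [Service("payroll", "hybrid"), Service("intranet", "cloud"),
           Service("helpdesk", "cloud"), Service("badging", "on-prem")],
}

for quarter, services in snapshots.items():
    summary = footprint_summary(services)
    print(f"{quarter}: {summary['cloud']}/{summary['total']} in cloud "
          f"({summary['cloud_share']:.0%}), {summary['hybrid']} hybrid, "
          f"{summary['on_prem']} on-prem")
```

In practice, the same summary could sit alongside the other metrics mentioned above, such as on-premises versus cloud spend, service-level agreements, and customer satisfaction ratings, so that each quarter’s snapshot can be compared with the last.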
Pay attention to people, processes, and technology
Moving a large IT infrastructure to the Azure cloud is a technology challenge, but it’s just as important to think about the people and the processes involved.
“It’s not just about getting everything moved from on-premises servers to a cloud solution,” Apple says. “Once you have it there, it’s about what your staff should look like, the different roles and skills you’ll need to run things in the cloud. Then, how do you plan for the day-to-day operation of it? What kind of processes and monitoring do you need?”
O’Flaherty notes that of these three considerations, transforming your people resources for the move to the cloud might be the biggest task.
“When we talk about ‘people change,’ we mean how people do their work—and frankly, that’s usually the hardest challenge,” O’Flaherty says. “Once we had good momentum in moving our technology to the cloud, we needed to change how people do their work. We needed to modernize.”
Apple says that transitioning the people skills of the organization was a deliberate process.
“We provided training, and we made it very clear that everyone needed to learn to work with infrastructure as code, rather than physical machines,” he says. “And whenever we had the ability to hire new people, we prioritized those DevOps skills. We invested in that, because that was the direction we were going.”
Sometimes, technology decisions are also what enable more effective processes. O’Flaherty explains how one specific decision during the cloud journey made it possible to implement best-practice processes that ensured quality standards were met.
“We decided to use one single instance of Azure DevOps. So, all of our teams—across more than 800 applications—and all our code repos were in one Azure DevOps account,” O’Flaherty says. “This setup allowed us to implement consistent engineering standards, like requiring every code change to be reviewed by two people. Because we could enforce these policies across the board, we achieved a new level of consistency, accountability, and confidence in our development process.”
Confront legacy applications and technical debt
A major technological transformation, like moving an on-premises infrastructure to the cloud, provides the perfect opportunity to deal with aging legacy applications and the technical debt that has accumulated within the organization.
Dealing with legacy applications up front means you can reduce the total load of what you end up moving to the cloud.
“The first thing we asked was, ‘What do we not need anymore?’” O’Flaherty says. “We were able to identify something like 30% of tools and services that could be retired or consolidated. We also looked at other SaaS solutions as replacements for things we were building ourselves, which removed about 15% more of the portfolio. So we had almost halved the total burden at that point.”
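The arithmetic behind “almost halved” can be sketched in a few lines of Python. This assumes that both figures in the quote (about 30% retired or consolidated, about 15% replaced by SaaS solutions) are percentages of the original portfolio, which the quote implies but does not state outright. The function name and parameters are hypothetical.

```python
def remaining_percent(retired_pct: int = 30, replaced_by_saas_pct: int = 15) -> int:
    """Percent of the original portfolio still left to migrate.

    Assumes both inputs are percentages of the ORIGINAL portfolio,
    so they can simply be added together.
    """
    removed_pct = retired_pct + replaced_by_saas_pct
    if not 0 <= removed_pct <= 100:
        raise ValueError("combined reduction must be between 0 and 100 percent")
    return 100 - removed_pct


remaining = remaining_percent()
print(f"Removed: {100 - remaining}%, remaining to migrate: {remaining}%")
```

Under that assumption, roughly 45% of the portfolio never has to be moved at all, which is the reduction O’Flaherty rounds to “almost halved.”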
Strategic approach for moving our IT infrastructure to the cloud

Apple explained the benefits of starting with a clean slate when you move to the cloud.
“There’s always that backlog of work items and legacy things, and the idea is that you don’t want to bring your bad habits with you to the cloud,” Apple says. “So, if you’ve got a solution that is still using COBOL or Windows 2008, maybe it’s time to pull off the Band-Aid? That’s a good investment of your developer capacity.”
Microsoft also faced significant challenges in addressing years of technical debt during the early stages of the journey to the cloud. O’Flaherty describes technical debt as technical issues resulting from past development decisions that weren’t as robust or maintainable as they could have been.
“We knew the scale of the technical debt we had—it was kind of like an iceberg, with a huge amount of work below the surface. And we knew it was going to take several years to get through it all,” he says. “The key was understanding that we were going to have to invest a significant amount of engineering time to get there—that we needed to put 30% to 40% of our engineering resources behind this effort for well over a year just to get on top of the problem. We had to take that hit up front, or we’d still be in the same boat today.”
Transform your operations with end-to-end thinking
In the old world of on-premises network infrastructure, services were often siloed. Different departments ran their own systems and tools, and employees couldn’t always access data and technologies that were needed to gain a bigger picture or develop cross-disciplinary solutions.
Enter the cloud-based network, which makes end-to-end thinking and working possible.
“In the old days, the interactions between applications were pretty monolithic,” Apple says. “With the move to the cloud and engineering modernization, you open up new kinds of compute and access to data. Developers can use APIs, containers, Power Apps and more to access the various data lakes we have across the company. There’s a lot more flexibility, and they can work much faster.”
Another area where having a cloud-based network allows us to take more of an end-to-end approach is security, which has become a major priority at Microsoft in recent years.
“End-to-end thinking means I can do a multi-layer defense and comprehensive security implementation in the cloud,” says Basma Basem, a senior program manager in Microsoft Security. “I can make sure that there’s a security implementation from an architecture and design standpoint on each layer of the services I’m building in the Azure cloud. And you have such a wide variety of security solutions in the cloud, it makes it much easier to find the right solution and ensure that you have good security posture management.”
Consistently prioritize your goals and metrics
When it comes to tackling such a large project, it’s vital to understand your priorities and keep them front and center as you move through the process.
“We had a lot of priorities around financial considerations in moving away from the physical infrastructure model,” Apple says. “That was number one. Then we had priorities around efficiency and modernization. And we had to find ways to measure those priorities and ensure we were hitting our targets.”
Of course, prioritization also means that you can’t take on all your challenges at once. Your leads have to make sure that they communicate effectively so everyone understands the priorities, the pace of progress, and when different issues will be addressed.
“There’s a tendency to kind of try to boil the ocean and fix everything at once,” O’Flaherty says. “We really had to temper people’s expectations, even within our own leadership, and say that this is going to take a while. If there were 50 compliance problems, we couldn’t tackle all 50 at the same time—the leads would identify the top 3, and we’d do those 3, then move on to the next batch. We really had to set specific goals and follow our metrics along the way.”
And there’s one overall metric that Apple likes to keep top-of-mind when discussing what moving our network to the Azure cloud has meant for Microsoft—cost.
“We’re spending 20% less on our infrastructure costs than we did when we were operating on-premises,” Apple says. “When you look at what we were spending on physical infrastructure versus today, in the cloud, it’s a significant savings.”
Every cloud journey has its own path
Today, we operate roughly 98% of the Microsoft corporate infrastructure in the cloud, and we are continually looking for strategies to be more efficient, more automated, and less costly. Apple notes that the company decided to push hard to get to this level (“to the point of heartburn for some people”) and show what was possible, but that not every organization will need or want to go this far in its own cloud transition.
“We are the extreme in terms of pushing the bar,” Apple says. “We’ve been very innovative in this space, because we wanted to prove our point in terms of how much we could put on the cloud. We realize every business has to make tradeoffs, and some may want to keep a certain percentage of their infrastructure still on-premises. But the flexibility of the cloud and the cost savings are real, and we want our customers to understand that and take advantage of it.”

Here are some of the major insights we took from the process of moving our network into the Azure cloud:
- Confront your technical debt. Be prepared to do the upfront work of addressing your technical backlog and getting into a better state before you make the transition to the cloud. You’ll not only avoid major headaches—you’ll also reduce the total network footprint that you’ll be moving.
- Invest for the long term. Leadership has to be willing to devote significant resources over the course of the project, and to understand that the results might not be realized in the short term. But the overall payoff will be worth it once you’ve completed the work.
- Get employees on board. Make training and upskilling a priority as you transition your workforce to a cloud-first mindset. Incorporate the shift into individual reviews and goal-setting so that everyone is pointed in the right direction.
- Take the opportunity to instill a “secure by default” philosophy. As you move to the cloud, you can proactively create and deploy strong security architecture, keeping compliance requirements top of mind, continuously monitoring your organization’s security posture, and fostering a culture where everyone factors security risk into their work and decision making.
- Embrace “the red.” Create a culture where teams are comfortable with revealing when they are falling short on their metrics (being “in the red”). Being open about those issues will help others avoid the same pitfalls in their own areas and significantly increase overall quality.
- Keep your goals and metrics front and center. On a long and complicated journey, it’s vital to keep everyone focused on the destination—your goals, sometimes called objectives and key results (or OKRs). Defining and carefully tracking the right metrics (also known as key performance indicators, or KPIs) is another essential part of this process.
