Chapter 5

Application migration

When companies think about how to take advantage of the cloud, the question of application migration inevitably raises its head. This subject alone could have thousands of pages written on it. In this chapter, we look at the framework Fourth Coffee used for thinking about application migration. This chapter provides tangible guidance to help you achieve efficiencies while evolving to the cloud. We include some of the tools that you can leverage to accelerate your journey.

The Five R’s

The new CIO of Fourth Coffee, Charlotte, found herself responsible for hundreds of small and large line of business (LOB) applications. Keeping these running was critical to Fourth Coffee, but it was also consuming the bulk of her budget and most of her people's time. To pursue digital transformation and save the company, she needed to reduce the resources she spent on these LOB applications so that she could invest in new applications.

In discussing the situation with her staff, Charlotte was greeted with all the reasons why change was impossible and why things had to stay the way they were. When she brought up the possibility of rewriting an application using a cloud architecture, her staff picked an example application and explained why rewriting it was nearly impossible. They recounted previous efforts to improve things and articulated all the reasons why those had failed. She quickly realized what she was dealing with, something psychologists call "learned helplessness," which is a condition in which people feel powerless due to a history of previous failures.

Charlotte ended the meeting early and said to the team, “I understand there will be challenges and difficulties, but we can, and we will, succeed with this transformation. We don’t control everything, but we control the critical things. We control our thinking, our attitude, and our actions. I’m going to send you the Gartner Five R’s model for moving to the cloud. I want you to read it and internalize the fact that we will be moving ahead with changes. Our challenge is to prioritize our actions and determine the right path forward for our applications.”

Charlotte diagnosed two misconceptions her team had that would lead them to failure if left uncorrected:

  • They considered moving to the cloud to be an all-or-nothing proposition.

  • They thought that all applications would move to the cloud in the same way.

The Gartner 5 R's model provided Charlotte's team with a conceptual framework for migrating applications to cloud computing. Not every application needed to move to the cloud, and the ones that did could move there using one of the 5 R's: Rehost, Refactor, Revise, Rebuild, and Replace. Each of these approaches provides different costs and benefits.

Let’s explore these different approaches in more detail.

Rehost

Rehosting allows us to redeploy the application to a different environment—that is, physical to virtual (IaaS) or on-premises IaaS to cloud IaaS. Sometimes this is called lift-and-shift. The advantage of rehosting is that it allows very quick migration to the cloud because almost no changes to the application are required. The disadvantage is that it does not take full advantage of the cloud's capabilities.

As Isaac Newton taught us, a body at rest wants to remain at rest. One of the smartest things Charlotte did was to insist that the team take one LOB application and rehost it in the cloud as soon as possible. That got things moving. Her team got their cloud accounts set up, started using the portal, and got connected to the documentation and the community. Charlotte made a point to start small and celebrate each victory the team had. Fourth Coffee was now in the cloud.

After the application was running in the cloud for a while, one of the team members pointed out that they could take advantage of several free management capabilities that the cloud provider made available. Fourth Coffee used Azure, which includes inventory management, change management, update management, and other capabilities. When something went wrong with the application, the team went to the change management tool and discovered a modification that someone had made that broke the application. They'd never had a function like that before; now they not only had it, but it was free, and they didn't have to evaluate, purchase, or deploy a management tool. The team started warming to the cloud.
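
If you're curious what querying that change data can look like, here is a minimal sketch using the Az PowerShell modules. The workspace ID and computer name are placeholders, and the ConfigurationChange table is an assumption that holds only when the Change Tracking solution is enabled in a Log Analytics workspace.

    # A minimal sketch of pulling recent configuration changes from Azure Change
    # Tracking data with the Az.OperationalInsights module. Placeholders: the
    # workspace ID and the computer name.
    Connect-AzAccount

    $workspaceId = "00000000-0000-0000-0000-000000000000"
    $query = "ConfigurationChange " +
             "| where Computer == 'lob-app-vm01' and TimeGenerated > ago(24h) " +
             "| project TimeGenerated, ConfigChangeType, ChangeCategory " +
             "| order by TimeGenerated desc"

    $changes = Invoke-AzOperationalInsightsQuery -WorkspaceId $workspaceId -Query $query
    $changes.Results | Format-Table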

Figure 5-1 shows an example of the rehost scenario and the changes that happen in moving from an on-premises system to IaaS. The management of the hypervisor and host becomes the responsibility of the cloud provider, but the rest of the stack does not change.

This diagram shows the changes when moving with the rehost scenario. In this scenario the main changes are focused on the hypervisor and the hardware. They ultimately become the responsibility of the cloud provider.
Figure 5-1 Rehost.

Refactor

Refactoring an application moves it from its existing infrastructure to a PaaS cloud infrastructure. Developers can keep using the frameworks they have previously leveraged with the application and take advantage of updated versions of those frameworks, which can extend the application to use native cloud capabilities. This R scenario highlights a few inherent problems with PaaS; for example, a feature implemented on premises may have no matching cloud-native feature. It also highlights vendor lock-in because you have to code for cloud platform–specific features. The end result of this scenario uses the cloud in the most cost-effective manner; however, the journey through the refactor scenario is often the most expensive one.

Figure 5-2 highlights what changes, transformations, updates, and new components apply in this scenario.

This diagram shows the changes when moving with the refactor scenario. In this scenario, the hypervisor and hardware layers remain the responsibility of the cloud provider. However, the customer now evaluates the containers, frameworks, and programming languages that the cloud provider supports in its environment.
Figure 5-2 Refactor.

Revise

The revise scenario attempts to take existing code bases and extend them to support modern cloud concepts. This can be a costly procedure. Depending on the complexity of the application, it also can be lengthy.

For example, breaking apart a monolithic application can be complicated, and the level of complication depends on how it was coded. It may require rearchitecting the code base into smaller applications with well-defined interfaces so that the code can be extended to support cloud concepts and become optimized for the cloud.

Figure 5-3 highlights the changes, updates, and new elements for the revise scenario.

This diagram shows the changes required for the revise scenario. The key areas are refocusing on the application itself and modifying the source code to be more cloud native.
Figure 5-3 Revise.

Rebuild

The rebuild scenario involves moving away from any existing code base and starting from scratch. This involves rearchitecting the application to support all the innovative features of the cloud provider. Vendor lock-in is a key drawback of this scenario, but it also ensures that the application will fully capitalize on the capabilities of the cloud.

Figure 5-4 highlights the changes required in the rebuild scenario for the application.

This diagram shows the changes required for the rebuild scenario. In this scenario, we focus on rebuilding the application from scratch so it will be built to support cloud-native technologies.
Figure 5-4 Rebuild.

Replace

In replace, you again move away from an existing code base (as in the rebuild scenario); however, instead of rewriting an application, you adopt an application that meets the needs of the organization but is delivered as a service (that is, SaaS). Replace also can mean retire when there is no other path; sometimes it's necessary to make the hard choice to phase out old applications.

For example, let's say you're using a CRM system that was custom built 15 years ago. The system was built on old .NET Framework versions and still receives functional patches from the development team. Its architecture is tightly coupled, with no well-defined interfaces. Adopting any of the other 5 R patterns would be a significant undertaking for any development team. Instead, the organization can adopt a cloud-based CRM system, avoid modernizing the old system, and synchronize the old data to the new system.

Figure 5-5 highlights the transformation the application and management will undergo in the replace scenario.

This diagram shows the changes required for the replace scenario. In this scenario, enterprises should focus on replacing their applications with SaaS alternatives.
Figure 5-5 Replace.

Getting to the cloud to drive optimization

In this section we discuss moving to the cloud to help you drive optimization. When you look at the 5 R's and the paths they present, it's immediately obvious that you need to spend a lot of time deciding which one best fits the application and the organization's needs.

In general, the business needs to evaluate each application with an in-depth analysis and determine the best course of action.

But what if we suggested that you focus on the rehost scenario, at least initially? This may seem like a crazy idea at first, but moving an application to the cloud, no matter which scenario you choose, is no trivial task. There are still many steps to get ready for the move to the cloud, but regardless of the steps or tasks involved, moving to the cloud via the rehost scenario does one explicit thing.

One of the key drivers for the cloud is reducing cost. At first, lifting-and-shifting an application doesn't seem like much of a cost reduction because you're moving a virtual machine from one virtualization platform to a cloud platform and changing your cost model from capital expenditure (CAPEX) to operating expenditure (OPEX).

Now a bill arrives every month for that application's virtual machine. However, you no longer have the overhead of managing the physical infrastructure the virtual machine used to run on, or the overhead of upgrading the hardware, so it saves money.

While you're saving this money, you could do more. When a machine is running in the cloud and its cost is visible through the monthly bill, you'll be more inclined to examine ways to optimize the deployment and capitalize on PaaS- or SaaS-based services. In essence, this simple act of moving a virtual machine focuses the organization's approach to modernizing the application.
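
To make that monthly bill concrete, the team can pull consumption data with the Az.Billing module. The following is a rough sketch only; the date parameters and the property names (InstanceName, PretaxCost) reflect the module at the time of writing and may differ in your version.

    # A rough sketch: pull the last 30 days of consumption records and total the
    # pre-tax cost per resource, so the team can see what each rehosted VM costs.
    Connect-AzAccount

    Get-AzConsumptionUsageDetail -StartDate (Get-Date).AddDays(-30) -EndDate (Get-Date) |
        Group-Object InstanceName |
        ForEach-Object {
            [pscustomobject]@{
                Resource   = $_.Name
                PretaxCost = ($_.Group | Measure-Object PretaxCost -Sum).Sum
            }
        } |
        Sort-Object PretaxCost -Descending |
        Select-Object -First 10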

Moving the application into IaaS also helps the organization understand some of the other changes that need to happen when operating applications in the cloud. If we look at the data in an application and examine its classification, this dictates the governance that must be applied to secure the data and, in turn, what the application must implement. Additionally, we have to modify operations to ensure the application is properly monitored for performance and outages.

Let’s take another simple example. An organization has a three-tier LOB application that needs to move to the cloud, and the organization allocates one million dollars to the task of getting to the cloud. The team has established that the application cannot be replaced with a SaaS alternative.

The revise and rebuild scenarios are options for this project, but recoding this line-of-business application presents significant challenges. With a deadline of one year to move to the cloud, recoding the application to be cloud native while upskilling the development team does not seem feasible.

Undoubtedly, as the developers began to recode the application, they would find that translating some of the core code is significantly harder than anticipated, and the time required would almost double what had been budgeted. They would consume the million dollars allocated to replatform the code and need an additional investment before reaching the cloud. Let's say this additional investment is another 500,000 dollars to complete the project. They will have invested 1.5 million dollars, and only now will they begin to operate in the cloud and realize its benefits.

In the rehost scenario, they get the application to the cloud in a matter of weeks, they begin to learn how to operate in the cloud, and they can start building abstract interfaces that begin to decouple the application. The application can start leveraging the right pieces of the cloud, further driving down the cost of running it. For example, they can use native cloud backup, native cloud disaster recovery, and native cloud monitoring.

As we previously mentioned, it may seem a little crazy not to push toward the revise or refactor scenarios first, but it begins to make sense when you consider the additional benefits the rehost scenario lets you realize, and the other challenges it lets you address, while you upskill the development team and implement the organizational changes needed to support the move to the cloud.

DevOps: the cornerstone of application migration

A cornerstone of moving your application to the cloud is to ensure you get DevOps right. We discuss DevOps in greater detail in Chapter 7, “Supporting innovation.”

In the context of this chapter, we need to quickly discuss DevOps and ensure you have the right models in place to support moving to the cloud. It’s no secret things change when you move to the cloud no matter what R scenario you ultimately choose, but it’s also clear that you need to drive efficiency and automate as much as possible, from deployment to monitoring to remediation.

DevOps provides this foundation for organizations to move their application estates to the cloud. It redefines the processes, the tools, and the people to support applications in moving to the cloud. DevOps doesn’t necessarily have to start with the cloud; you can begin long before any application move to the cloud is considered, but DevOps is an absolute requirement before you begin the journey of migration or modernization.

Take DevOps in stages. It's a life cycle that drives improvement and learning from mistakes. For example, in the case of the three-tier application we mentioned earlier, we could begin simply by automating the build of the first and second tiers from source code into a staging environment and then releasing them into production. As we examine what we learn from applying DevOps to each tier, we begin to understand that no matter where the application tiers are to be deployed, they can be deployed by the DevOps process.
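
As a flavor of what that first automation step might look like, here is a hedged sketch that builds one tier from source, packages it, deploys it to a staging slot of an Azure web app, and then swaps staging into production. The project paths, resource group, app name, and slot names are all hypothetical; a real pipeline would run these steps from a build agent rather than interactively.

    # A minimal sketch of an automated build-and-release flow for one tier:
    # build and test from source, package, deploy to a staging slot, then swap
    # the slot into production once checks pass.
    dotnet build .\src\WebTier\WebTier.csproj -c Release
    dotnet test .\test\WebTier.Tests\WebTier.Tests.csproj
    dotnet publish .\src\WebTier\WebTier.csproj -c Release -o .\publish
    Compress-Archive -Path .\publish\* -DestinationPath .\webtier.zip -Force

    Connect-AzAccount
    Publish-AzWebApp -ResourceGroupName "rg-lob-app" -Name "fourthcoffee-webtier" -Slot "staging" `
        -ArchivePath (Resolve-Path .\webtier.zip).Path -Force

    # Promote the validated build by swapping the staging slot into production.
    Switch-AzWebAppSlot -ResourceGroupName "rg-lob-app" -Name "fourthcoffee-webtier" `
        -SourceSlotName "staging" -DestinationSlotName "production"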

We can improve on this by integrating application telemetry and infrastructure monitoring data so that operations and developers can look at the information together and make more intelligent decisions about how to progress. If they choose to evolve the first tier into a web app or a container running a web app, they would change the source code and follow the same automated deployment process, which would target the new environment of choice.

DevOps is the cornerstone of the application migration process. Without it in place, you end up bringing archaic practices to a modern cloud infrastructure, which hinders an organization's ability to fully attain the benefits of the cloud.

The migration process

In this section we discuss the process for application migration, including some of the tooling that helps you collect information about your environment and some of the tooling that helps you move your applications to the cloud. The migration process is broken down into various phases, as shown in Figure 5-6. Each phase's data is used as input to the next phase to help guide decisions and deliver a successful migration project.

This diagram shows the end-to-end life cycle for application migration. The phases of discover, assess, and convert on the left feed into a life cycle covering functional and performance tests, optimize, migrate, and remediate. The final steps are a final sync of the data and then a cutover from on-premises to the cloud.
Figure 5-6 Application migration life cycle.

Each phase's collection methods and tooling should drive a standardized process, from performing individual application migrations to ultimately building an application migration factory.

As each phase is designed, it should define the data that needs to be collected from all key stakeholder teams. Examples of key stakeholders include the operations, development, and security teams. These teams can define the "important" information they need to assess an application, not only from an application migration standpoint but also in terms of performance and security, which will dictate some of your choices in the migration life cycle.

You also can use the information that the teams require to guide how you build your cloud infrastructure and how you construct the migration factory. It allows you to set a minimum bar that already meets the requirements of these teams, so when an application goes through the migration factory you know things will be in the correct order, and you don't expect any blockers to taking these applications to production.

Discover

In the discover phase of the application migration life cycle, you need to gather as much information as possible to make the appropriate decisions for migrating your applications. This involves creating a set of standardized workshops and questionnaires and selecting the right tooling to gather the appropriate information for the organization's decision-making.

Application selection

Application selection is paramount. Selecting the wrong application at the start of this life cycle can considerably hinder the migration to the cloud. At the same time, selecting an application that doesn't represent or touch enough points to make the migration life cycle appropriately comprehensive could lead to disaster down the line.

Factors for looking at applications can be broken into two main categories:

  • Business factors Business factors look at the mission criticalness of the application, any regulatory governance it must meet, and potentially the sensitive nature of the data contained in the application. A great place to start would be applications that aren’t mission critical, have low regulatory governance, and contain a low level of sensitive data. These would be the low-hanging fruit and would enable you to begin building the standardization process required to achieve a successful application migration.

  • Technical factors Technical factors look at the hybrid requirements of the application (that is, does it require access to data sources that have to remain on premises), its monitoring requirements, the location of the monitoring tooling, any custom connections to other business software, how “chatty” the application is, and how latency sensitive the application is. A great starting point would be with applications that have no custom integration, little or no latency requirements, low monitoring requirements, little or no hybrid requirements, and are not very chatty.

As we mentioned earlier, you need to find an application that's representative of the application estate and has a nice blend so that you can build the standardization procedures. However, since this is also a life cycle, you can start with the simple applications and evolve the procedures as you cycle through the application estate.

Information Gathering

There are three main methods for information gathering in relation to application migration. It's preferable to perform them in the following sequence: questionnaires, tooling, and then workshops. We discuss each method in the following sections.

Questionnaires

It’s necessary to build a robust questionnaire to obtain the data you require from an application. This helps you make decisions about that application—for example, whether you should migrate it at all and which R scenario best fits with the application today and its future state.

In Table 5-1 we show you a list of basic sample questions (and some example answers) that you could use as the start of your questionnaire.

Table 5-1 Basic questions for the questionnaire

BASICS | ANSWERS
Application name | Time Management
Who is this for? | HR
How long has this application been around? | > 8 years
Are there applications serving similar needs in your portfolio? | Yes, potential to consolidate
Are there SaaS options in the market that might meet your needs with or without customization? | Yes
What's your team's timeline for the cloud journey? | 2 years
Are you looking to actively leverage and contribute to the open source community? | No
What's the expected number of concurrent users per month? | 2,000

Table 5-2 highlights the basis of some business driver questions, which further help you to prioritize and select the best type of applications to start with.

Table 5-2 Business driver questions for the questionnaire

BUSINESS DRIVERS | ANSWERS
What is the primary objective to migrate to the cloud for this application? | Provide multichannel access, including mobile
What is the secondary objective to migrate to the cloud for this application? | Free up datacenter space quickly
Is this application critical to your business? | No
Do you expect this application to handle large traffic? | No
How often do you plan to update the app? | Once every 1 to 3 years
Do you expect this app to add breakthrough capabilities like intelligence, IoT, Bots? | No
Do you have a pressing timeline (DC shutdown, EoL licensing, DC contract expiration, M&A)? | No
How important is it to leverage your existing code and data? | Important
If you were to decide on a migration/modernization strategy, which one would you pick? | Refactor: Minimally alter to take better advantage of the cloud
What are the least efficient aspects of this application? | Infrastructure

Table 5-3 includes some sample questions that are more aligned to the development of the application and its evolution.

Table 5-3 Development and architectural questions for the questionnaire

ARCHITECTURAL AND DEV PROCESS CONSIDERATIONS | ANSWERS
What's the next architectural milestone you want to achieve for this app? | Good with monolithic for this app
Does this app require you to access the underlying VM (that is, to install custom software)? | No
Does this application involve extensive business processes and messaging? Is it chatty? | No
Does this application involve custom integration with other web and cloud apps via APIs or connectors? | Yes
Have you adopted SOA for this application? | No
Are you interested in moving your application's database to the cloud as well? | No
What is the primary objective you want to achieve with data storage for this application? | Ease of management
How important is Big Data/AI capability for this application? | Nice to have
Is this application highly connected with or dependent on on-premises applications/systems? | No
If you were to assess the level of changes you are willing to take to move this application to the cloud, what would they be? | Moderate: no core code change required
Is your app sensitive to latency? | Yes

Table 5-4 shows sample questions relating to regulatory requirements.

Table 5-4 Regulatory questions for the questionnaire

REGULATORY, COMPLIANCE AND SECURITY REQUIREMENTS | ANSWERS
Are there specific compliance or country-specific data requirements that can affect your migration and architectural strategies? | No
Does the application need secure authorization and authentication? | Yes
Does the application require firewall, app gateway, or advanced virtual network and related components? | Yes

These questions are the basics. You can go deeper and gather more information relating to the application and the surrounding infrastructure. Some additional areas for which you may want to build questionnaires include

  • Compute

  • Networking

  • Storage

  • Database

Questionnaires on these topics give a more robust picture than the information we provide in the tables.

Baselining

Application baselining collects data on the normal state of the application and how it operates during normal and off-peak business hours so that you can establish a pattern and provide a comparison when you move to the cloud. Baselining really should be an ongoing practice rather than a point-in-time exercise for migration, because it helps you understand application characteristics that are useful for diagnosing problems during operations.

Gathering this data enables you to make accurate predictions of the types of services you will consume within your cloud provider and helps you build budgeting and financial models from that data. If you don't understand what the normal state looks like, you can greatly affect the IT budget by making inaccurate decisions.

Baselining requires collecting two main types of data: performance and configuration.

Performance data Performance data gives you a view on how the application is responding to the workload it's being put under. Reviewing this data over a wide period rather than using a single capture of only a couple of hours gives you a more accurate profile of how the application works. Ultimately, you can compare this performance data against the application after you move it to the cloud.

You can handle collecting performance data in a variety of ways:

  • Performance monitor Performance monitor is natively built into Windows and helps you capture and visualize performance data for any performance counter available in Windows and its applications.

    Logman is a command-line utility that allows you to build performance counter sets (a collection of performance counters in a file) and invoke them to collect performance data.

    Here’s a simple way of creating a file and invoking a timed capture for performance data.

    1. Open Notepad and copy the following content into it:

      "\System\Processor Queue Length"
      "\Memory\Pages/sec"
      "\Memory\Available MBytes"
      "\Processor(*)\% Processor Time"
      "\Network Interface(*)\Bytes Received/sec"
      "\Network Interface(*)\Bytes Sent/sec"
      "\LogicalDisk(C:)\% Free Space"
      "\LogicalDisk(*)\Avg. Disk Queue Length"
    2. Save the file and call it windowsperf.conf.

    3. Open an administrative command prompt and copy the following code into the window. This code requires you to change the date and time windows between the -b parameter and the -E parameter:

      logman create counter baseperf -f bin -b 03/04/2018 09:40:00 -E 03/04/2018 09:45:00 -si 05 -v mmddhhmm -o "c:\perf\baseperf" -cf "c:\perf\windowsperf.conf"
    4. Open Performance Monitor (type perfmon and press Enter). As shown in Figure 5-7, you will see the data set in the left-hand menu after it's created. Notice the data set baseperf.

      This diagram shows Performance Monitor with the data set baseperf after it’s been created.
      Figure 5-7 Performance Monitor data collection.
    5. Go back to the command prompt window and type the following:

      logman start baseperf

      This will begin the data collection for the prescribed time period you entered in the previous steps.

      After the data is collected, you can use Performance Monitor to view the data. The same procedure can be used across multiple machines, and you could wrap these in scripts to further automate the process.

  • PowerShell PowerShell can help automate the collection of performance counter data using the Get-Counter cmdlet. Here is a sample script for collecting performance counter data via PowerShell:

    $CtrList = @(
        "\System\Processor Queue Length",
        "\Memory\Pages/sec",
        "\Memory\Available MBytes",
        "\Processor(*)\% Processor Time",
        "\Network Interface(*)\Bytes Received/sec",
        "\Network Interface(*)\Bytes Sent/sec",
        "\LogicalDisk(C:)\% Free Space",
        "\LogicalDisk(*)\Avg. Disk Queue Length"
    )
    Get-Counter -Counter $CtrList -SampleInterval 5 -MaxSamples 5 |
        Export-Counter -Path C:\Perf\Example.blg -FileFormat BLG -Force
  • SCOM Microsoft System Center Operations Manager (SCOM) is traditionally an on-premises monitoring tool created to give deep insight into your IT environment. You can use SCOM to collect long-term data about your environment (using management packs and/or scripts), and it gives you insight into the application's normal state.

Note

For more information regarding SCOM, please visit this page: https://docs.microsoft.com/en-us/system-center/scom/welcome?view=sc-om-2016.

  • OMS&S Microsoft Operations Management Suite and Security (OMS&S) is a cloud-based tool that can span multiple environments (cloud and on-premises) via agent-based collection to aggregate large amounts of data—including performance data—into a single repository for analysis. You can collect any performance counter on Windows and Linux and determine an application's normal state from this and a variety of other data.

    OMS&S can collect multiple sources of data and provide service mapping to help identify interdependencies and communication paths you may not be aware of. Figure 5-8 shows an example of adding Windows performance counters in OMS&S.

The figure shows the performance counter blade in OMS&S for Windows. It shows the default counters that are captured once you deploy the configuration. However, it also shows a search bar in which you can type any counter name available in Windows and begin collecting data for that specific counter.
Figure 5-8 Performance counter configuration in OMS&S for Windows.

Note

For more information regarding collecting performance data in OMS&S, please visit https://docs.microsoft.com/en-us/azure/log-analytics/log-analytics-data-sources-performance-counters.

  • Third-party tools

    There are many third-party tools beyond the scope of this book. Here’s a small sample:

    • Splunk

    • Nagios

    • Spiceworks

    • HP OpenView

Configuration data Configuration data enables you to look at how the application is configured at a point in time. This point-in-time view enables you to correlate the application's normal state of operations with the captured performance data. Application vendors often expose many settings that you can tweak for improved performance depending on the environment where the application is deployed.

When you capture the configuration data, you can review it before moving to the cloud to determine any potential problems. A simple example involves authentication: if the application requires Kerberos authentication but you haven't planned your network to support a Kerberos authentication method in the cloud, the application will fail.

Additionally, if you choose to step a little into modernization of the application—for example, containers—you need to ensure the containers can connect securely to other systems using some identity model. If it is a Windows container and you haven't modernized the authentication method to something like OAuth 2.0, the application still requires Kerberos and the supporting infrastructure that allows Kerberos to operate successfully.

You can capture configuration data in a variety of different ways. Manually capturing it is an option, of course. However, using automated tools, as with performance data, simplifies the process. We discuss some of these tools in the next section because some of the discovery tools or migration assistants also capture the configuration data you need.
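
As an illustration, here is a simple sketch of a point-in-time configuration snapshot for a Windows application server, written to JSON so it can sit alongside the performance baseline. The output path is arbitrary, Get-WindowsFeature assumes a server SKU, and you would extend the list with application-specific settings (IIS bindings, connection strings, and so on).

    # A sketch of capturing a point-in-time configuration snapshot and saving it
    # as JSON next to the performance baseline.
    $snapshot = [ordered]@{
        Computer  = $env:COMPUTERNAME
        Collected = (Get-Date).ToString("o")
        OS        = (Get-CimInstance Win32_OperatingSystem).Caption
        Hotfixes  = Get-HotFix | Select-Object HotFixID, InstalledOn
        Services  = Get-Service | Where-Object { $_.Status -eq 'Running' } |
                        Select-Object Name, StartType
        Features  = Get-WindowsFeature | Where-Object { $_.Installed } |
                        Select-Object -ExpandProperty Name
    }

    $snapshot | ConvertTo-Json -Depth 4 |
        Out-File "C:\perf\config-$($env:COMPUTERNAME).json"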

Service Mapping

When looking at applications, you need to gather as much information as possible about the application to have a successful migration project. However, the information needs to be validated, as is always the case when you’re collecting information.

Applications that are candidates for migration usually have an original architecture diagram and protocol flows drawn up at the start, but as development proceeds this information tends not to be updated as regularly as you require. (For application developers who do keep it updated, that's great!) Missing or inaccurate updates can still lead to the scenario in which the app was "designed" to do X but actually does Y. This becomes more important when you're about to migrate an application.

Take a simple example of a LOB application. The vendor's architecture and supporting documentation describe the app as using a built-in authentication process. When you move it to the cloud, even in a rehost scenario, the application breaks! No one can log on.

You troubleshoot and discover that the application requires Kerberos authentication. Now you must either change the network configuration to support hybrid connections to your on-premises network or you must use Azure Active Directory Domain Services (if your application can support the associated Kerberos versions!).

Service mapping is the technique of taking the architecture and protocol flows and discovering whether your application is indeed operating in the capacity the specifications have defined. It helps you identify how your application is working, what processes it instantiates, and how it communicates with other systems; you end up with almost all the information you need to understand how the application behaves on the network.

Service mapping not only helps you understand your application and how it communicates; you can also use it to design your network infrastructure in a cloud environment. From the information you can determine whether you need to open additional ports or firewall rules, or whether you can move the application to the cloud at all.

During your service mapping, if you discover an application that communicates using multicast or broadcast technologies, you will have problems moving to a cloud platform like Azure, because at the time of writing, Azure doesn't support those technologies.

Oracle RAC is an example of a database platform that currently uses multicast in its clustering mechanisms. So, for now at least, you can't bring it through application migration in the rehost scenario.

You can handle service mapping in a variety of ways, and it builds on the information you have collected in the previous sections of this chapter. You could use network traces in tools like Network Monitor or Wireshark. Figure 5-9 shows a sample network trace in Network Monitor. We show the communications that happen normally (you will see a lot), and then we show that we can map this traffic to a process.

The figure shows the output for network monitor. It includes the traffic data and the process that generated the data.
Figure 5-9 Network monitoring.
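
If you don't have a dedicated mapping tool available, you can get a crude point-in-time approximation on a single Windows server with built-in cmdlets. This sketch simply joins established TCP connections to their owning processes; it's no substitute for continuous dependency mapping.

    # A lightweight approximation of service mapping: list established TCP
    # connections together with the owning process, so you can see which
    # processes talk to which remote systems and ports right now.
    Get-NetTCPConnection -State Established |
        Select-Object LocalAddress, LocalPort, RemoteAddress, RemotePort,
            @{Name = 'Process'; Expression = {
                (Get-Process -Id $_.OwningProcess -ErrorAction SilentlyContinue).ProcessName }} |
        Sort-Object Process |
        Format-Table -AutoSize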

Figure 5-10 shows the Service Map solution built into OMS. It shows the application server, where it's talking, and on what ports it's talking. On the right side we drill into process information and event details down to the DLL versions. Service Map uses a dependency (helper) agent that runs alongside the OMS agent and injects this data into our Log Analytics workspace, so we can produce these views!

The figure shows a sample dashboard for a service map. The middle shows the name of the virtual machine, and on either side the dashboard shows the ports on which a process is talking and to which other systems it is connected.
Figure 5-10 Service map.

Data

Data is usually the most important part of the application. If we lose the application, we can restore the front-end components and connect back to the data stores. That makes it all the more important to profile our data in the correct way. Identifying our data stores and understanding their sizing, security, and performance requirements will help us build an appropriate map to the cloud.

For example, an Azure standard storage account can hold up to 500 TB of data and has a 20,000 IOPS limit. If an application has 400 GB of data, the storage account is perfect, but if it requires 30,000 IOPS, the storage account won't meet its needs. An application that requires an NFS interface to the storage also rules out the Azure storage account. Finally, if the data stored in that data store has been classified as High Business Impact or Personal Information, we potentially need to enable encryption or take further measures to host the data in that storage account, even if its other needs have been met.

Similarly, if the data happens to be a database, the same information about performance, sizing, and so on is still required, but support also comes into play: does the vendor support the data stores that are available in the cloud, and what are its conditions on items like blob storage and latency, for example?
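
The capacity-and-IOPS reasoning above is easy to encode as a quick check during data profiling. This sketch treats the 500 TB and 20,000 IOPS figures quoted here as inputs rather than constants, because published limits change over time.

    # A small helper that applies the reasoning above: does a data store fit
    # within a standard storage account's published limits?
    function Test-StorageAccountFit {
        param(
            [double]$DataSizeGB,
            [int]$RequiredIops,
            [double]$AccountCapacityGB = 500 * 1024,   # 500 TB, per the text
            [int]$AccountIopsLimit     = 20000
        )

        [pscustomobject]@{
            FitsCapacity = $DataSizeGB -le $AccountCapacityGB
            FitsIops     = $RequiredIops -le $AccountIopsLimit
        }
    }

    # The example from the text: 400 GB fits, but 30,000 IOPS does not.
    Test-StorageAccountFit -DataSizeGB 400 -RequiredIops 30000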

Tools

An organization needs to select the correct tools to gather the in-depth information related to an application to aid in its decision-making process. In this section, we discuss some of the free tools available today. These tools help you gather a lot of the information we have talked about in your environment.

Microsoft Assessment and Planning Toolkit

The Microsoft Assessment and Planning Toolkit (MAPS) is a powerful inventory, assessment, and reporting tool that gathers the information required to help you migrate your applications to the cloud.

The installed tool collects an inventory of the machines you want to target. Targets are called scenarios, and they include the following:

  • Windows

  • Linux/Unix

  • VMware

  • SQL

  • Oracle

An inventory contains the operating system, configuration, and performance data for each machine so that it can be assessed for scenarios later. One of the scenarios we are most interested in is the Azure Migration Platform scenario, which builds an inventory that consists of the hardware information, OS information, IIS instances, SQL information, and web applications. Figure 5-11 shows the inventory discovery process in MAPS.

The figure shows the inventory collection in MAPS. It shows a sample of a collection running with an information screening showing the progress.
Figure 5-11 Inventory collection in MAPS.

After the inventory is collected, the next step is to collect the performance data. You can run some reports before doing this, but collecting the performance data gives you a complete picture of how your system performs over a period of time. The performance data collection should run for as long a period as possible during normal business operations to get an accurate representation of how the system performs and to contribute to the sizing report for an Azure VM. Table 5-5 shows the output of a sizing report after the inventory and performance data have been collected.

Table 5-5 Output of an Azure VM sizing report

MACHINE NAME | dc01.fourthcoffee.com
OS | Microsoft Windows Server 2012 R2 Datacenter
TYPE | Virtual
AZURE VM SIZE | B1ms
EST. MONTHLY SMALL COMPUTE HOURS | 11520
EST. MONTHLY NETWORK USE-OUTGOING (GB) | 5.19
EST. MONTHLY STORAGE USE (GB) | 15.63
VM CPU UTILIZATION (%) | 0.72
VM MEMORY UTILIZATION (MB) | 1371.3
VM DISK I/O UTILIZATION (IOPS) | 2.69
VM NETWORK UTILIZATION-OUT (MB/S) | 0
VM NETWORK UTILIZATION-IN (MB/S) | 0
DATA DISKS | 1

Azure Migrate

Azure Migrate is a cloud-based service used to discover, assess, and migrate from on-premises environments to the cloud. This tool is targeted toward virtual machine or database migration and leverages existing Microsoft tools in Azure Site Recovery and the Database Migration Service. As of March 2019, this tool is scoped only to VMware environments, with a roadmap to support additional environments; Hyper-V support is currently planned.

Azure Migrate runs an appliance that you can download via the Azure Marketplace. It connects to vCenter and scans the environment to build an inventory and assess it for migrating to the cloud.

After analysis, Azure Migrate guides you through the process of migrating your applications to the cloud.

Note

For more information on Azure Migrate, please visit https://azure.microsoft.com/en-us/migrate/.

Azure App Service Migration Assistant

Although it's not an official Microsoft tool, the Azure App Service Migration Assistant looks at a variety of information related to an on-premises web app and determines its suitability for the cloud, specifically the Azure App Service. The assistant inventories and collects configuration data around an app's bindings, authentication, extensions, and so on. You receive an assessment and report based on the findings from the inventory, which helps you determine whether you can easily move to the Azure App Service. The tool also performs the migration if you want.

Even though it isn't officially supported by Microsoft, it's a valuable tool in the arsenal.

Note

For more information, please visit https://www.movemetothecloud.net/.

Third-party tooling

There are plenty of third-party tools available today. It's outside the scope of this book to highlight all the available tools or provide in-depth guidance about them, but the major ones are easy to find, and you can visit their websites for further information.

Workshops

The workshop is the final point in the chain of the discovery phase. It requires the participation of all the teams—infrastructure, support, operations, security, and development. Using all the information collected in the previous stages of the discovery phase, you can review each application and have a complete end-to-end discussion to determine each app's viability for the cloud. You can use the information to agree on the appropriate R scenario for migrating the application to the cloud.

These workshops can also cover the assessment work that we describe in more detail in the next section, but that assessment work is not the primary purpose of the workshop.

Assess

The assessment phase builds on top of the data collected in the discovery phase. There is overlap between the two phases, especially regarding the applications you begin to target and how you prioritize which application goes first.

In this section, we introduce two items that help you prioritize the applications and map them to the cloud in the appropriate way.

Prioritization tables

You can break prioritization into three areas to help you define a weighting system for determining which applications move first:

  • By type

  • By business value or criticality

  • By complexity and risk

By Type

Breaking applications into their respective types can help you discover whether there are SaaS alternatives or native PaaS frameworks that you can leverage. Figure 5-12 shows a breakdown of how you might segment applications by type.

This diagram shows the segments you might break the application types into. For example, we broke the applications in our example into buckets of Microsoft apps, custom apps, third-party apps, and commercial off-the-shelf apps.
Figure 5-12 Segmenting applications into types.
By Business Value or Criticality

Understanding an application's business value and how critical it is to the organization also helps you prioritize it for migration. If it is a mission-critical application that the business cannot operate without and that requires absolute stability, you might delay bringing it to the cloud until lessons have been learned on less critical applications.

Figure 5-13 shows an example of how you might segment the applications by business value.

This diagram shows the segments into which you might divide apps by business value and criticality. The buckets in the image are Mission Critical, Important, Marginal, and Can Be Retired.
Figure 5-13 Segment applications into business value and criticality.
By Complexity and Risk

Finally, you can break applications into segments of complexity and risk. If an app has low risk and low complexity for the migration, it receives a high priority score. Figure 5-14 shows how you might segment based on complexity and risk.

This diagram shows the segments for dividing apps by complexity and risk. In this scenario there are three levels: Low, Medium, and High.
Figure 5-14 Segment applications into complexity and risk.

You can score each area and bucket and use the sum of those scores to build a prioritization table. Figure 5-15 shows a sample prioritization table derived from the segments and the discovery data. It also includes other potential factors you may weigh in the prioritization decision. The weight factor runs from 1 to 5: 1 is not important or low, and 5 is very important or high.

This diagram shows a sample prioritization table. The table shows various different elements to consider for an application with ratings of 1 to 5 against them.
Figure 5-15 Prioritization table.

This table holds a list of all the applications in an organization. You use one like it to determine the path and timeline of events for migrating the application estate to the cloud.
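
A sketch of how the scoring might be automated follows. The category names, weights, and sample applications are illustrative only; use whatever your own prioritization table defines.

    # A sketch of turning the 1-to-5 ratings from the prioritization table into a
    # single weighted score per application.
    $weights = @{ BusinessValue = 2; Complexity = 3; Risk = 3; TypeFit = 1 }

    $apps = @(
        [pscustomobject]@{ Name = 'Time Management'; BusinessValue = 2; Complexity = 1; Risk = 1; TypeFit = 4 }
        [pscustomobject]@{ Name = 'CRM';             BusinessValue = 5; Complexity = 4; Risk = 5; TypeFit = 2 }
    )

    $apps | ForEach-Object {
        $app = $_
        # Low complexity and low risk should score high, so invert those ratings (6 - rating).
        $score = ($app.BusinessValue * $weights.BusinessValue) +
                 ((6 - $app.Complexity) * $weights.Complexity) +
                 ((6 - $app.Risk) * $weights.Risk) +
                 ($app.TypeFit * $weights.TypeFit)
        [pscustomobject]@{ Application = $app.Name; PriorityScore = $score }
    } | Sort-Object PriorityScore -Descending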

Target

The final stage before you enter the migration factory is targeting the application to Azure Services. Some of the tools we previously mentioned present a report that maps the on-premises applications to services available in the target cloud. Target selection starts when you select the actual cloud environment you want to migrate the application to.

After you make the selection, the R scenario drives which services in the cloud environment you ultimately consume. Using the rehost scenario as an example, we target IaaS services in Azure. To target these services, you need to map the on-premises virtual machine to an Azure virtual machine size. These sizes determine the CPU cores, the memory, and the number of data disks you can attach.
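
As an example of that mapping step, the following sketch lists the VM sizes available in a region and filters them against the baselined requirements. The region and requirement values are placeholders.

    # A sketch of the size-mapping step: keep only the VM sizes in a region that
    # meet the baselined requirements for cores, memory, and data disks.
    Connect-AzAccount

    $requiredCores     = 2
    $requiredMemoryMB  = 8192
    $requiredDataDisks = 2

    Get-AzVMSize -Location "westeurope" |
        Where-Object {
            $_.NumberOfCores -ge $requiredCores -and
            $_.MemoryInMB -ge $requiredMemoryMB -and
            $_.MaxDataDiskCount -ge $requiredDataDisks
        } |
        Sort-Object NumberOfCores, MemoryInMB |
        Select-Object Name, NumberOfCores, MemoryInMB, MaxDataDiskCount -First 10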

For other services—that is, if you need to monitor the application inside the virtual machine—you need to identify the monitoring needs, which would have been established in the discovery phase, and which cloud services can meet them. If the requirements are for operating system information, you could leverage OMS&S. If the requirements are for deep application telemetry, you might leverage Application Insights.

Cloud Map

When you migrate to the cloud, remember that not all features may be available. Consequently, it's important to create a cloud map table. The table shown in Figure 5-16 maps the features consumed on premises to what's available in the cloud. We also include roadmap data on our map to help teams determine when support for the services they require will become available in the cloud. Figure 5-16 shows a table for database services consumed on premises and what they map to in the cloud. You could further iterate this to include the individual services each database source uses and what features map to Azure database services.

This diagram shows a sample cloud map table. It shows the application required in the cloud and whether it is available with the functionality required in the cloud. If the service is not available in the cloud or the functionality is not available, it provides a road map date on when the functionality is expected to be available.
Figure 5-16 Cloud map table.

Building a migration factory

Perhaps the most important part of migrating applications to the cloud is creating what we call a migration factory. The migration factory's principal outcome is that after we've collected the information in discovery and assessed it correctly, an application enters the "factory" and moves to the cloud, ready for cutover.

The factory has a lot of responsibility in the migration life cycle chain. It needs to ensure that the application works as expected on the cloud platform of choice when it exits the factory. It also produces an important output: the plan for the final migration over to the cloud.

To achieve these results, in this section we dive a little deeper into each phase of the migration factory and explain why the data we collected in the discovery, assess, and target phases is so important.

Testing

A core concept of DevOps is testing. Testing produces valuable feedback to determine whether the application running on the relevant cloud platform works. This testing should not be manual; it should be heavily automated. Building an automated test system ensures that the application is tested consistently against the rules that have been agreed upon by the business and IT organization.

Functional Tests

In the functional testing phase of the migration factory, you validate that the application operates as expected. You can run a variety of tests, including synthetic transactions that validate data queries or logon procedures. Other elements include cross-platform integration points and reports. You also test for component upgrades of an application. The tests are constructed with IT operations and end-user scenarios in mind, and the pass rate must be 100%. Any failure should be fed back into the remediation phase of the migration factory. Table 5-6 shows a sample definition of a functional test for an application.

Table 5-6 Functional Test Sample

ITEM | DESCRIPTION
[Test Case Name] | Log in Test
[Test Case ID] | 12334234
[Test Case Author] | Joe Bloggs
[Testing Phase] | Functional / User Testing
[Description] | This test validates that a user can log in to the system via the front-end portal.
[Test Case Steps] | (1) Open the web page for the application: http://crm. (2) Enter the username and password for the end user. (3) Validate the default status page of Dashboard A. (4) Log out. (5) Repeat steps 1–4 using an administration account and verify that Dashboard B is the default.
Screenshots | <Dashboard A Screenshot>, <Dashboard B Screenshot>
[Test Case Results] | Pass/Fail
[Test Case Feedback] | Test passed for end-user scenario; administration scenario loads Dashboard C

Note

You can find further samples at http://download.microsoft.com/download/8/D/9/8D995CB3-2C3E-43B4-97D3-B372FBF6C7EF/STARTS%20Quality%20Bar%20FY2016.pdf.

The sample details functional testing originally designed for Windows Phone.
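
To give a sense of what automating such a test can look like, here is a hedged sketch using Pester, PowerShell's test framework. The URL and the dashboard marker text are hypothetical, and a real test would drive the actual login flow rather than just fetching pages.

    # A sketch of automating a functional test like the one in Table 5-6 with
    # Pester. Save as something like Crm.Tests.ps1 and run with Invoke-Pester.
    Describe "CRM front-end functional tests" {

        It "serves the login page" {
            $response = Invoke-WebRequest -Uri "http://crm" -UseBasicParsing
            $response.StatusCode | Should -Be 200
        }

        It "shows Dashboard A for an end user after login" {
            # Placeholder: call whatever login helper your application exposes.
            $page = Invoke-WebRequest -Uri "http://crm/dashboard" -UseBasicParsing
            $page.Content | Should -Match "Dashboard A"
        }
    }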

Performance Tests

In the performance testing phase of the migration factory, you validate that the application performs as well as or better than it did on the on-premises infrastructure. Performance testing covers standard end-user scenarios, including reporting, login times, querying for new data, and creating records. Testing also should cover internal application metrics that aren't traditionally visible to a user but can be used to correlate reported events. The performance tests can be validated against the baselines previously captured in the discovery phase. Application architectures may change as a result of the performance testing during the remediation phase.

Performance tests can be defined similarly to functional tests. They should focus on performance metrics for the application. Table 5-7 shows a sample performance test you could start with to document the test and then build the automation from.

Table 5-7 Performance test sample

ITEM | DESCRIPTION
[Test Case Name] | Log in Test
[Test Case ID] | 12334234
[Test Case Author] | Joe Bloggs
[Testing Phase] | Performance Test / Login Time
[Description] | This test validates that a user login takes less than 3 seconds.
[Test Case Steps] | (1) Open the web page for the application: http://crm. (2) Enter the username and password for the end user. (3) Collect metrics from Application Insights. (4) Validate that the login metric is less than 3 seconds. (5) Repeat steps 1–4 using an administration account.
Performance Requirements | < 3 seconds
[Test Case Results] | Pass/Fail
[Test Case Feedback] | Login was within specified parameters.

Performance testing will also describe load scenarios and expected performance. Here are some sample expectations under load testing of a web app with a load of 500 users; a simple sketch for checking the response-time target follows the list.

  • Throughput 100 requests per second (ASP.NET\Requests/sec performance counter)

  • Requests Executing 45 requests executing (ASP.NET\Requests Executing performance counter)

  • Avg. Response Time 2.5-second response time (TTLB on 100 megabits per second [Mbps] LAN)

  • Resource utilization thresholds

  • Processor\% Processor Time 75 percent
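
As a simple illustration of checking the response-time expectation, the following sketch issues a series of requests and compares the average against the 2.5-second target. It is a smoke check under stated assumptions (a reachable http://crm endpoint), not a substitute for a proper load-testing tool driving 500 concurrent users.

    # Measure a series of requests and compare the average response time against
    # the 2.5-second target from the load expectations above.
    $samples = 1..20 | ForEach-Object {
        (Measure-Command {
            Invoke-WebRequest -Uri "http://crm" -UseBasicParsing | Out-Null
        }).TotalSeconds
    }

    $average = ($samples | Measure-Object -Average).Average
    "Average response time: {0:N2}s (target: 2.50s)" -f $average

    if ($average -gt 2.5) { Write-Warning "Response time exceeds the performance bar." }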

Hopefully by now you can see that these tests can be duplicated from the original application build and testing process. This is the whole point of DevOps—to simplify and automate while maintaining the same or better coverage as before. You may also find that the testing from the migration factory feeds back into the original testing scenarios and improves them at the initial build, so that each functional upgrade undergoes a rigorous end-to-end test.

Remediation

In the remediation phase, the output of the functional and performance tests is analyzed for patterns and remediation takes place. The remediation phase sends output back to the teams running discovery and assessment in case there is additional data that should be collected up front before an application enters the migration factory. It also ensures that the application incorporates the changes so that the issues do not recur in subsequent testing. Remediation may identify areas or scenarios that the testing phases do not cover and drive change to ensure more complete coverage in the application testing.

Data migration

Data migration has its own phase in the migration factory because it identifies and details the most effective method of getting data to the cloud. In some cases, the tooling may migrate the data for you; in other cases, you may have to export the data to a storage unit, encrypt it, and ship it to a cloud datacenter. Whatever the method, the data migration phase highlights it for you and begins to shape the process you will require when it comes to switching over to production.

Data Sync

Data sync happens both during the migration factory and afterward as you prepare for cutover. In fact, one of the choices that will appear during the migration factory is whether you sync your data offline and incur an outage or use an automated method of keeping the data consistently in sync so you have no downtime.

There are many types of data, of course, and you must establish the methods of getting each to the cloud. For example, if the data lives in a SQL Server database, you can look at options like SQL Server replication or Always On availability groups, which give you a constant replication stream to the cloud and make cutover easier. Azure has tools like the Database Migration Service, which can replicate and cut over the data for an application's database.

If you have an application that doesn’t have native replication, you can look into third-party tools that can replicate the data. However, you also might look deeper into the cloud platform. Azure Site Recovery is a tool on the Azure Platform that can replicate a virtual machine and its data to Azure and allow for a smooth cutover.

Offline data sync is a potential option as well. With it, you back up and restore the data and schedule an outage for the period of the cutover.
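
For the offline approach, a common pattern is to back up during the agreed outage window and push the backup to blob storage with AzCopy (v10 syntax shown). The backup path, storage account, container, and SAS token below are placeholders.

    # A sketch of the offline approach: copy a backup taken during the outage
    # window up to blob storage with AzCopy.
    $backupPath  = "D:\backups\crmdb.bak"
    $destination = "https://fourthcoffeemigration.blob.core.windows.net/dbbackups?<SAS-token>"

    azcopy copy $backupPath $destination

    # After the copy completes, restore the backup on the target platform and
    # complete the cutover before allowing new writes on-premises.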

Cutover

The last phase of the migration life cycle is the cutover. The cutover involves bringing the application to production, and it's the final gate of application migration. This final gate gives an organization its last chance to stop a migration if there are problems, and a final chance to ensure all elements of the application are thoroughly tested and all supporting processes and operational technology are in place.

You may have noticed that at the start of the application migration life cycle we suggest you involve a variety of different teams to ensure that interests across the enterprise are represented and that an application is secure, stable, performant, and recoverable. However, we haven't directly called out those groups in the migration factory. We also haven't called out any specifics about support systems for the application—for example, how you monitor the application in the new cloud environment or how you back it up.

This is one of the reasons we have the cutover gate. The cutover gate enables you to ensure that you have all the tools in place so that when the application is cut over you can monitor it appropriately. You make sure that you can back up and restore the application in the cloud and still recover using the legacy backup catalog. You also ensure that all the security criteria have been put in place, including governance, regulatory, and privacy requirements.

Building a detailed checklist to ensure all these requirements are met is essential! When components go live in Azure, they need to meet the fundamental requirements that were determined during the discovery phase. If a component doesn’t meet this quality bar for production, it can’t go live!

The quality bar for each application may be slightly different in terms of whether the application can scale, what availability mechanisms are in place, how you monitor the application telemetry, and so on. The pillars shown in Table 5-8 help define what that quality bar should be generically across all applications. If the quality bar can’t be met, you can raise and approve exceptions, but you should document them and revisit them periodically to rebalance the quality bar.

Table 5-8 Quality Bar Questionnaire

SCALABILITY

  • Can the application handle the load?

  • Does the application infrastructure increase the node count with increased demand?

AVAILABILITY

  • Do you have multiple instances of the application across regions and geographies?

  • Can the application sustain a node failure?

  • Can the application sustain a path failure?

RESILIENCY

  • If the application fails, how is a new instance instantiated?

  • If the application fails, how does the final transaction get replayed?

  • If you fail over to a different geo region, how do you recover from the last checkpoint?

MANAGEMENT

  • How do you back up the application?

  • How do you monitor the application telemetry?

  • How do you monitor the operating system telemetry?

  • How do you monitor performance problems?

  • How do you gather logging data?

DEVOPS

  • How do you capture fault information and integrate it into the bug/triage/remediation process?

  • How do you deploy new updates to the cloud deployment?

  • Have you translated your applications and infrastructure to use infrastructure as code?

SECURITY

  • Do you have encryption turned on for data at rest?

  • Do you have encryption turned on for data in transit?

  • How do you rotate encryption keys?

  • Do you have firewalls in place?

  • How do you monitor audit data from machines and cloud services?

The table itself is not a definitive checklist for creating that cutover quality bar, but it gives you some suggestions of the types of questions to ask to ensure all elements are in place for an application before it moves to production.
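One practical way to operationalize the questionnaire is to encode it as a gate check that every application must pass, or obtain a documented exception for, before cutover. The following Python sketch illustrates the idea; the pillar names mirror Table 5-8, but the abbreviated questions, data structures, and pass/exception logic are illustrative, not a prescribed implementation.

# Illustrative cutover quality gate driven by the Table 5-8 pillars.
# Only a subset of the questions is shown, and the structure is a sketch.
QUALITY_BAR = {
    "scalability": [
        "Can the application handle the load?",
        "Does the infrastructure add nodes with increased demand?",
    ],
    "availability": ["Can the application sustain a node failure?"],
    "resiliency": ["How is a new instance instantiated after a failure?"],
    "management": ["How do you back up the application?",
                   "How do you monitor application telemetry?"],
    "devops": ["Is the infrastructure expressed as infrastructure as code?"],
    "security": ["Is encryption enabled for data at rest and in transit?"],
}

def gate_passes(answers, approved_exceptions):
    """Return True only if every question is satisfied or is covered by a
    documented, approved exception (to be revisited periodically)."""
    passed = True
    for pillar, questions in QUALITY_BAR.items():
        for question in questions:
            satisfied = answers.get((pillar, question), False)
            excepted = (pillar, question) in approved_exceptions
            if not satisfied and not excepted:
                print(f"BLOCKED [{pillar}]: {question}")
                passed = False
    return passed

In practice, the answers would come from the functional and performance test results, and the exceptions from the documented waivers described above.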

The cutover process itself doesn’t end with the quality bar; the methods for bringing the application live must also be defined. Figure 5-17 shows a sample application that has been through the migration factory and is ready to be cut over so the cloud service takes the load.

This diagram shows a sample application in its on-premises state and after going through the migration factory, ready for cutover.
Figure 5-17 Sample application being prepared for cutover.

Figure 5-17 represents a web application in a traditional two-tier architecture (web and database). The application was accessed via a URL that presented its interface. After going through discovery, the web tier was migrated from a full web virtual machine into a container and deployed into a Kubernetes cluster. A DevOps pipeline was created to support the application being deployed directly from source code into a container and then into the Kubernetes cluster. The database was migrated to run in Azure SQL Database. The migration factory validated all the test scenarios for functionality and performance, and we met our cutover quality bar for moving to production.

To perform the final cutover for this application, we must consider a few steps.

Consumer access to the application

We must consider how consumers access the application. As we mentioned earlier, the consumers access the application via a URL, but knowing only this is not enough. We must understand what state that URL is in. Is it a short URL like http://app01, or is it an FQDN URL like http://app01.fourthcoffee.com? We need to understand where the application is being accessed from: internally or externally? Do we have any proxy servers in place? Do we access the URL over HTTP or HTTPS?

These are all relatively basic questions, but they can greatly affect cutting over. For example, if we have an internal-facing application, customers access it over HTTP, and we decide to use an approach like the one shown in Figure 5-15, how do we translate a short URL to a long URL? How does that impact our network? How do users authenticate to the application if we haven’t modernized authentication to OAuth 2.0? How does the application respond if we use HTTPS? Was it designed to handle HTTPS?

The questions keep building, but you will begin to see that they help us build additional functional tests that can be integrated into the migration factory. For now, let’s say we’re moving from a short URL to a long URL as previously detailed, and we maintain Kerberos authentication.

It will be necessary to explain to the staff that we will begin requiring them to access the application via the FQDN. We also can create a CNAME record for app01 that redirects to the FQDN. In our scenario, we can create a Traffic Manager URL that all clients are sent to before being appropriately directed back into the application. The Traffic Manager URL gives us some breathing space as well: we can expose the app on premises to the internet securely and have Traffic Manager send the stream to the on-premises network while we get ready to cut over. The Traffic Manager profile has a failover configuration that allows us to seamlessly redirect the traffic to Azure. The application gateway ensures that even if clients attempt to access the application via the unsecure HTTP endpoint, they are redirected to the secure HTTPS endpoint. In theory, we also could support http://app01 pointing to the internal VIP of the application on the Kubernetes cluster if we don’t want to expose the application or modify access policies.
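A simple pre-cutover smoke test can confirm that the HTTP-to-HTTPS redirection behaves as expected before any traffic is moved. The following Python sketch uses the third-party requests package; the hostname is the illustrative FQDN from this scenario, and the check itself is an assumption about how you might validate the gateway, not part of any Azure tooling.

# Pre-cutover smoke test: confirm the plain-HTTP endpoint redirects callers
# to HTTPS, as the application gateway is expected to do.
import requests

def check_https_redirect(host: str) -> bool:
    # Request the HTTP endpoint but do not follow the redirect, so we can
    # inspect the response the gateway actually returns.
    response = requests.get(f"http://{host}/", allow_redirects=False, timeout=10)
    location = response.headers.get("Location", "")
    redirected = response.status_code in (301, 302, 307, 308)
    return redirected and location.startswith("https://")

if __name__ == "__main__":
    host = "app01.fourthcoffee.com"  # illustrative FQDN from the scenario
    print("HTTP-to-HTTPS redirect in place:", check_https_redirect(host))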

In our case, we need to make DNS modifications to support this new traffic redirection. These include CNAME records, but they also involve reducing the TTL of the DNS records so clients don’t cache stale results for long and pick up the new site quickly. Ensuring the network configuration is in place to support these paths is also a requirement, but this can be completed as part of the infrastructure build and the final cutover testing.
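Because stale DNS caching is one of the most common causes of a messy cutover, it’s worth verifying the lowered TTL before the change window. The following Python sketch assumes the third-party dnspython package; the record name and the 300-second threshold are illustrative.

# Pre-cutover check: confirm the TTL on the application's DNS record has been
# lowered so clients pick up the new target quickly after cutover.
import dns.resolver

def record_ttl(name: str, record_type: str = "CNAME") -> int:
    answer = dns.resolver.resolve(name, record_type)
    return answer.rrset.ttl  # TTL, in seconds, as served by the resolver

if __name__ == "__main__":
    ttl = record_ttl("app01.fourthcoffee.com")  # illustrative record name
    # 300 seconds is an example threshold, not a prescribed value.
    print("TTL low enough for cutover:", ttl <= 300, f"({ttl}s)")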

The data sync in our case will be an offline cutover. The on-premises SQL Server will be drained of connections, and access will be restricted so that no tier or consumer other than the Database Migration Service can reach it. The data will be migrated, and then the Traffic Manager profile will be redirected to the cloud. Consumers then will begin to access the system as normal, and the cutover will be complete. We can begin to retire the on-premises system.
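Scripting the cutover sequence, even loosely, helps keep the outage window predictable and makes the rollback point explicit. The following Python sketch mirrors the steps described above; every helper function is a hypothetical placeholder for your own tooling, not a real library or Azure API.

# Sketch of the offline cutover sequence described above. The helper functions
# are hypothetical placeholders for your own scripts, not real APIs.

def drain_sql_connections():
    """Block new connections to the on-premises SQL Server and wait for
    in-flight transactions to complete (placeholder)."""

def run_dms_migration():
    """Kick off the Database Migration Service job and wait for it to report
    success (placeholder)."""

def switch_traffic_manager_profile():
    """Flip the Traffic Manager failover profile so traffic flows to the Azure
    endpoint instead of on premises (placeholder)."""

def verify_application_health() -> bool:
    """Run the post-cutover smoke tests; return True if they pass
    (placeholder)."""
    return True

def offline_cutover():
    drain_sql_connections()           # outage window starts here
    run_dms_migration()               # one-time offline data migration
    switch_traffic_manager_profile()  # consumers now land on the cloud service
    if not verify_application_health():
        raise RuntimeError("Cutover validation failed; invoke the rollback plan")

if __name__ == "__main__":
    offline_cutover()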

Creating the teams to support the migration factory

Creating a migration factory requires having a migration factory team in place to support the endeavor. The team is tasked with performing all the functions we have discussed so far in this chapter—from discovery to cutover. Building the overall teams to support this starts with building pods.

Pods

A pod represents a group of people who share a particular skill set. For example, you may create a pod of .NET developers, a pod of Java developers, or a pod of people who have VMware-specific skills. Figure 5-18 shows a sample of pods forming a bigger migration team.

This diagram shows sample pods, grouped by skill set, combining to form a bigger migration team.
Figure 5-18 Pods.

The pods combine to create bigger teams to support the migration of particular applications. The pods can flex based on demand during the application migration. For example, if this is a Java application, the Java pod might increase its membership to support migrating that application. Then, if the next application is .NET heavy, the Java pod will downsize and the .NET pod will scale accordingly. Notice that the infrastructure and security pods may stay the same size.

Teams

Pods form teams. Although there will be a core migration team to support the effort, there also will be a variety of other teams to support the overall migration factory. Figure 5-19 shows some of the additional teams (which can also be made up from pods).

This diagram shows a sample set of teams made up from the pods created previously. It also shows the interaction between the pods and their core responsibilities.
Figure 5-19 Teams.

Our approach is to have a core migration team that runs the factory and gets our application estate to the cloud as quickly as possible without sacrificing quality. Our app team is responsible for discovering and profiling our applications. Our advance team is responsible for ensuring the migration team knows what needs to change in its processes to support any new types of applications, especially when the advance team has predicted or identified a pattern in the applications still to come.

The exception team handles the one-offs: applications with unusual circumstances, or applications that entered the factory and were found to have characteristics that don’t conform to the standardized process. The QA teams ensure that the applications have been validated correctly in the cloud and are the guardians of the cutover.

The teams shown in Figure 5-19 work closely together. As in the DevOps life cycle, they feed into each other with valuable feedback to improve the overall migration factory.

Beyond migration

We have walked through the processes and systems you need to put in place to complete application migration. After they’re implemented and you successfully start moving applications to the cloud, the conversation will eventually turn to the question, “What next?”

Optimization

What’s next is optimization. At the start of this chapter, we highlighted that you often have to get to the cloud first and then drive optimization, rather than trying to drive optimization and then move to the cloud. Refactoring an application takes an enormous amount of effort; items usually pop up unexpectedly, the time to deliver the refactoring slips, and the whole project is delayed. If we move using the rehost scenario, we get to the cloud rapidly and can begin the process of optimization, with a focus on cost optimization and cloud efficiency.

Optimization covers a variety of different possibilities, some of which we discuss here.

Management optimization

Let’s start with how you can approach the management system that enterprises use to control their environments and how different areas within that management ecosystem can be optimized.

Backup

When moving your application to the cloud, especially in a rehost scenario, you have to consider how you back up the data. Traditionally, in an on-premises network you have a backup server or farm that deploys agents out to the application hosts you want to back up and then creates a schedule. These backup hosts are connected to storage and some archival tape. They also need relatively good network connectivity to the application hosts, especially if there are large amounts of data to back up.

To a degree, we could replicate this setup in the cloud by building a backup server virtual machine and attaching lots of data disks. We could then install a virtual tape library that emulates the tape functionality and back up to some other storage type.

Virtual machines consume resources in Azure, and each size has guidelines that dictate the CPUs, RAM, and disks it can support (among other things), which heavily impacts the cost.

For example, the maximum disk size supported in Azure today is 4 TB. We can attach up to 64 disks to a virtual machine, which gives us a maximum of 256 TB of space (that’s quite a lot). If we match this to a virtual machine size, such as Standard_L16s, to ensure we get the IOPS and throughput needed to meet the RPO and RTO for our backup, our estimated cost runs to about $33,000 USD per month. What’s interesting is that this is just for the data; it’s not a complete virtual machine backup!

Note

Pricing will vary. Please look for the latest information available on the Azure pricing calculator at https://azure.microsoft.com/en-us/pricing/calculator/.

That solution isn’t cost-effective, and we shouldn’t implement it. If we take an optimized approach to backups, we would look at a native cloud service. Azure has an integrated backup platform called Azure Backup. If we want to back up the virtual machines, with all the data and applications installed, in a snapshot-consistent manner, the cost for 256 TB of storage with Azure Backup is estimated to be $11,000 USD per month. If we still require guest OS–level backup, the cost rises because we must include a virtual machine to run Azure Backup Server. Figure 5-20 shows a sample virtual machine blade with Azure Backup integrated directly into it. You can see job information and the recovery points available. This shows the default policy of one backup per day.

This figure shows Azure Backup natively integrated into a virtual machine blade. The screenshot shows job information and recovery points available.
Figure 5-20 Azure Backup natively integrated into a virtual machine blade.
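To put the two approaches side by side, the back-of-the-envelope arithmetic looks like the following Python sketch. The capacity figures and monthly estimates are the illustrative numbers quoted above; check the Azure pricing calculator for current prices.

# Back-of-the-envelope comparison of the two backup approaches discussed
# above, using the illustrative figures quoted in this chapter.
MAX_DISK_TB = 4            # maximum data disk size cited in the text
MAX_DISKS_PER_VM = 64      # maximum attached disks cited in the text
capacity_tb = MAX_DISK_TB * MAX_DISKS_PER_VM   # 256 TB of raw backup space

DIY_BACKUP_VM_MONTHLY = 33_000   # Standard_L16s plus 64 x 4 TB disks (estimate)
AZURE_BACKUP_MONTHLY = 11_000    # Azure Backup for ~256 TB (estimate)

savings = DIY_BACKUP_VM_MONTHLY - AZURE_BACKUP_MONTHLY
print(f"Capacity: {capacity_tb} TB")
print(f"Estimated monthly saving with Azure Backup: ${savings:,} USD "
      f"({savings / DIY_BACKUP_VM_MONTHLY:.0%})")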

Another example scenario involves SQL Server backup. SQL Server can write backup data directly to an Azure Storage blob, so we could use the Azure Backup service to get a full virtual machine backup and then run a SQL Server backup job to export the data to Azure Storage. SQL Server can also restore from Azure Storage.
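As a rough illustration of that export path, a SQL Server backup can be written straight to blob storage with a BACKUP ... TO URL statement. The following Python sketch drives that statement through the third-party pyodbc package; the connection string, storage URL, database name, and the assumption that a SQL Server credential for the container already exists are all illustrative.

# Sketch: drive a SQL Server backup-to-URL job from Python. Assumes pyodbc is
# installed and that a SQL Server credential for the storage container has
# already been created. All names below are illustrative placeholders.
import pyodbc

CONNECTION_STRING = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=onprem-sql01;DATABASE=master;Trusted_Connection=yes;"
)
BACKUP_URL = "https://fourthcoffeebackups.blob.core.windows.net/sqlbackups/app01.bak"

def backup_to_blob(database="App01DB"):
    # BACKUP DATABASE ... TO URL streams the backup directly to blob storage.
    # autocommit is required because BACKUP cannot run inside a transaction.
    connection = pyodbc.connect(CONNECTION_STRING, autocommit=True)
    try:
        connection.execute(f"BACKUP DATABASE [{database}] TO URL = '{BACKUP_URL}'")
    finally:
        connection.close()

if __name__ == "__main__":
    backup_to_blob()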

Backup also changes when you consume Azure PaaS services. If you use Azure SQL Database, for example, you don’t have an agent-based backup available; you do not get access to the “host” server. Instead, you select the backup options that the PaaS service makes available with the appropriate SLAs.

In the case of Azure SQL Database, the native backup runs every five minutes and creates a restore point automatically. Depending on the service tier you select, you can retain data for up to 35 days and then integrate it into a recovery vault for longer-term retention.

Figure 5-21 shows a sample of the restore recovery point blade for an Azure SQL Database. Notice that we have the oldest recovery point available and then a restore point selected. You can modify the date and time of the restore point to any date going back to the oldest recovery point.

This diagram shows the restore blade for Azure SQL Database. It shows the information for a point in time backup and where it should be restored to.
Figure 5-21 Restore blade for Azure SQL Database.

Monitoring

Monitoring can be different in the cloud, with different technologies and different cost structures (for example, charges for egress traffic). You need to spend time thinking about what you need to know and how to achieve it in the cloud. If you choose the standard rehost scenario, you can use your existing tooling to monitor the virtual machines you have migrated, as long as you have the appropriate connectivity between the virtual machines and the monitoring system. If you choose a refactor scenario, the monitoring system that’s currently in place has to be examined to determine whether it’s still fit for purpose.

The deeper and more cloud-native your environment becomes, the less effective the existing tools become. For example, if you have an agent-based monitoring system and you move away from virtual machines to a serverless architecture, where do you deploy the agent? How do you determine the health state and provide alerts when problems occur?

Another problem with maintaining an existing monitoring system is that it doesn’t keep up with the latest services appearing in the cloud. Given the pace at which the cloud evolves, a vendor’s update cycle generally lags significantly behind where the services are today. Even with updates, you can’t take advantage of the powerful machine learning and AI algorithms that cloud monitoring solutions use to detect more sophisticated threats.

Figures 5-22 and 5-23 show two different security-related dashboards built from Microsoft rulesets; those rulesets were built from the knowledge Microsoft has gained while running global cloud services.

This figure shows the Security and Audit dashboard in Log Analytics. It shows four blades that summarize what the Security and Audit dashboard is capable of.
Figure 5-22 Security and Audit dashboard.
This figure shows the Security Baseline dashboard drill-down. It shows which rules have failed and more details about each failure; a user can use this information to go one step deeper.
Figure 5-23 Security baseline dashboard drill down.

OMS (Operations Management Suite) is an example of a cloud monitoring tool that can collect data natively from Azure virtual machines and native PaaS services.

Microsoft takes the information it has obtained from running its global services and creates rules and patterns to detect erroneous events. You can build customer-specific rules and export data to Power BI to build visually striking dashboards. This tool can collect data from any internet-connected source via a variety of methods, including agent-based collection.

Disaster Recovery

When you migrate your application to the cloud, how you recover the system in the event of a critical failure has to change. Traditionally, you would define a recovery point objective and a recovery time objective, and that information would dictate the type of system you had to build. This could be a costly endeavor, because you may have to duplicate the hardware of the production system and maintain the space for that duplicate hardware.

If you move the application with a rehost scenario, what do you duplicate in terms of hardware? How do you replicate the virtual machine to a different location?

Azure Site Recovery is a cloud-native tool that replicates a virtual machine to a different region and enables you to recover the virtual machine in the event of a disaster. Azure Migrate—a discovery, assessment, and migration tool—is built upon Azure Site Recovery, which also enables you to use Azure as the secondary site for disaster recovery for your on-premises systems. No duplicate hardware is required; in most cases only network connectivity needs to be put in place, typically with VPN technology (although we highly recommend a network assessment to determine the bandwidth required). Figure 5-24 shows an Azure-to-Azure disaster recovery scenario being configured.

This figure shows an Azure-to-Azure site recovery dashboard. The dashboard shows a picture of the world and which regions you will replicate between. The configuration items show what items to replicate and some specific configuration information which will change based on the virtual machine selected.
Figure 5-24 Azure-to-Azure site recovery.

Azure-to-Azure Site Recovery removes the cost of maintaining secondary datacenters and secondary hardware. If you use the “playbook” scenarios available in Azure Site Recovery, you can greatly reduce the human touchpoints required to recover from a disaster.

Application optimization

The ultimate goal of application migration is to maximize the benefits of running in the cloud. For most enterprises, the rehost scenario will be the first approach. After it’s performed, you can begin looking at how to optimize the application. In this section, we discuss a few of the options and approaches you might take when examining the next steps.

Azure Web Apps

If you have an application with a web-tier front end, an easy way to optimize it is to move the web tier to an Azure Web App. This lets you benefit from the scalability, reliability, and availability of the Azure App Service platform. It also removes another virtual machine that you must manage.

The authentication methods and frameworks the application consumes may need to be updated before it can be hosted on the platform. You can integrate Azure Web App deployment mechanisms into your DevOps frameworks, so you can make changes, deploy, and test rapidly.

Containers

If the application you migrated has a web tier, you also could migrate it to a container to reduce the overall footprint. With containers, you can integrate packaging and deployment into a DevOps pipeline and have end-to-end deployment. You can use a managed container service such as Azure Kubernetes Service (AKS) to run your container estate. Containers give you the chance to optimize in a lift-and-shift manner, moving from a virtual machine to a container.

When you use containers, you worry less about some management concerns, such as guest OS monitoring. Your monitoring footprint changes, and you can gather telemetry from the managed service rather than having hooks into a custom-built platform.

Azure Functions

Azure Functions provides even more opportunity for shrinking the size of a deployment and using native PaaS systems and serverless compute as much as possible. If your application had background web jobs or a “processing tier,” you could migrate the source code for those jobs to Azure Functions, which would execute them. You are charged only for the time a job runs, and Azure Functions provides all the resources necessary to execute it.

In comparison, if you had to run a virtual machine to execute these jobs, you would be paying for the virtual machine constantly, even when it had no work to do. You also could run into scale issues if the virtual machine ran out of resources.
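For example, a background job that used to run on a schedule inside a virtual machine might become a timer-triggered function. The following Python sketch assumes the Azure Functions Python v2 programming model (azure-functions package); the schedule, function name, and job logic are illustrative placeholders.

# Illustrative timer-triggered Azure Function, assuming the Python v2
# programming model. The schedule and job body are placeholders for the
# background "processing tier" work described above.
import logging

import azure.functions as func

app = func.FunctionApp()

@app.timer_trigger(schedule="0 */5 * * * *",  # every five minutes (example)
                   arg_name="timer",
                   run_on_startup=False,
                   use_monitor=False)
def process_pending_work(timer: func.TimerRequest) -> None:
    # Replace with the real job logic (for example, draining a work queue).
    logging.info("Processing tier job executed; past due: %s", timer.past_due)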

Power BI

This is one of the simplest optimizations you can make, and the benefits to an application can be significant. You can use Power BI to generate reports for your application or your business, which reduces the need for a reporting server. Power BI scales on demand because it’s a cloud service, and it can integrate directly with Azure SQL Database as a data source (it also can connect to many other data sources).

Azure SQL Managed Instance

If your application requires a SQL database, rather than provisioning a dedicated virtual machine you could optimize by using Azure SQL Managed Instance. A managed instance provides the functionality of a virtual machine with SQL Server installed but with none of the management overhead, because the platform handles it.
