Migrating Services from AWS to Azure
Laposa successfully helped a corporate client transition 32 critical web services from AWS to Azure, leveraging Azure Kubernetes Service for enhanced scalability and agility.
- Technologies | AWS · DC/OS · Kubernetes · Terraform
- Deliverables | Planning · Configuration · Deployment · Monitoring
- Completion Date | 2019
- Migrated | 32 web services
- Uptime | 100% during migration
- Completed migration in | 60 days
Client Background
A large corporate client approached Laposa to assist with moving multiple critical services from Amazon Web Services (AWS) to Microsoft Azure.
The client was running a range of containerized web applications that handled considerable traffic and depended on a highly available and scalable infrastructure in AWS. These comprised 32 custom-built web services hosted using DC/OS (Distributed Cloud Operating System) on self-managed EC2 instances. Key technologies included Marathon for orchestration, ELB for load balancing, and Amazon RDS for PostgreSQL databases.
Given the strategic decision to adopt Azure Kubernetes Service (AKS) for enhanced scalability and management simplicity, the migration plan was to move the entire workload from AWS infrastructure to Azure over two months, taking into consideration application dependencies, stability, and performance.
Key Goals of the Migration
- Transition from AWS to Azure while maintaining high availability and resiliency.
- Leverage AKS for simplifying container orchestration over the prior setup in DC/OS with Marathon.
- Avoid disruption by implementing both pre-production and production clusters for environment testing before live deployments.
- Migrate client services in phases based on customer impact, complexity, and application dependencies.
- Migrate data from Amazon RDS PostgreSQL to Azure Database for PostgreSQL – Flexible Server.
- Ensure the migrated setup is highly observable, disaster-resilient, and capable of handling traffic peaks.
- Fully decommission AWS resources after production cutover.
Challenges
Despite the successful outcome, the migration presented several challenges that required careful planning and execution to ensure a seamless transition.
- Complexity of Workloads: With 32 web services and custom Docker images, ensuring no downtime during the migration was a challenge since most of the services had customer-facing implications.
- Interdependent Applications: Many applications developed by the client were tightly coupled, mandating careful planning on which applications to migrate in tandem.
- New Architecture with AKS: Because AKS was relatively new at the time, using it effectively required a deep understanding of AKS best practices, including low-level networking, load balancing, and autoscaling configuration.
- Cost and Time Optimization: The migration had to proceed in phases to minimize risk, and the new architecture had to be deployed without increasing deployment or operational costs.
Migration Plan
The migration spanned two months and proceeded through the following six stages.
1. Planning
Laposa collaborated closely with the client to design a comprehensive migration plan that took into consideration application interdependencies, customer impact, and scalability needs. The planning included developing a detailed Application Dependency Map to identify services that required moving together, thus avoiding downtime or failures during the phased migration.
Microsoft Azure experts provided guidance by conducting a workshop at their Dublin office, enhancing familiarity with AKS and troubleshooting potential issues.
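The idea behind the Application Dependency Map can be illustrated with a small sketch. It is not the tooling used on the project: it assumes the map is available as a simple mapping from each service to the services it calls, and the service names are hypothetical. Services that are connected, directly or transitively, end up in the same group and would be candidates for moving together.

```python
from collections import defaultdict


def migration_groups(dependencies):
    """Group services that are linked by dependencies, directly or transitively.

    `dependencies` maps a service name to the list of services it calls.
    Returns a list of alphabetically sorted groups in a deterministic order.
    """
    # Treat dependencies as undirected: either direction couples two services.
    neighbours = defaultdict(set)
    for service, targets in dependencies.items():
        neighbours[service]  # register services that have no dependencies
        for target in targets:
            neighbours[service].add(target)
            neighbours[target].add(service)

    seen = set()
    groups = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk collects every service reachable from `start`.
        stack, group = [start], []
        seen.add(start)
        while stack:
            current = stack.pop()
            group.append(current)
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


# Hypothetical service names, for illustration only.
example = {
    "checkout": ["payments", "catalog"],
    "payments": [],
    "catalog": [],
    "newsletter": [],
    "reports": ["export"],
}
print(migration_groups(example))
```

Single-service groups are natural candidates for an early, low-risk phase, while larger groups need a coordinated cutover.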
2. Cluster Creation
Two distinct AKS clusters were set up in Azure:
- Pre-production Cluster: For staging, testing, and user acceptance testing (UAT) purposes.
- Production Cluster: Ready for live deployment after stress testing and security review.
Cluster setups were automated using Terraform, with scripts to provision resources such as node pools, storage accounts, Kubernetes services, database servers, and failover recovery features.
Cluster Components:
- Network Setup: Managed by the client’s internal IT team, including:
  - Creation of virtual networks (VNets) and subnets.
  - Configuration of User Defined Routes (UDRs).
  - Setting up NSGs (Network Security Groups) to restrict public access.
- Implementation of AKS: Managed by Laposa, which:
  - Prepared Terraform configuration files for infrastructure automation.
  - Configured an Ingress controller to manage client-facing web traffic.
  - Wrote Kubernetes deployment YAML files for each service.
  - Established private links for secure database connections.
- Security Setup: Handled by the client’s internal security team, including:
  - Web Application Firewall (WAF) and network firewall.
  - Public IP configuration and DNS zones.
  - Management of TLS certificates for data encryption in transit.
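To give a sense of what a per-service Kubernetes deployment definition contains, here is a minimal sketch that builds one as a Python dictionary. It is an illustration under assumptions, not the client's actual configuration: the service name, image reference, and port are hypothetical, and the output is JSON (which Kubernetes also accepts) rather than the YAML files used on the project. The default of three replicas follows the recommendation described in the Migration step.

```python
import json


def deployment_manifest(name, image, replicas=3, port=8080):
    """Build a minimal apps/v1 Deployment as a plain dictionary."""
    # The same labels tie the Deployment selector to the pod template.
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# Hypothetical service and registry names, for illustration only.
manifest = deployment_manifest(
    "orders-api", "exampleregistry.azurecr.io/orders-api:1.0.0"
)
print(json.dumps(manifest, indent=2))
```

Generating the definitions from one function keeps the 32 services structurally consistent; real manifests would typically add resource requests, health probes, and environment configuration.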
3. Migration
The migration was executed in four phases to minimize risk while ensuring a seamless transition:
Phase Process:
- Phase 1: Migrated less critical services that were independent of other apps. These services had less customer impact.
- Phase 2 and Phase 3: Migrated medium-complexity applications with light customer interaction, along with interdependent services, with service-specific downtime windows prepared in case they were needed.
- Phase 4: The final and most critical phase involving high-impact, high-dependency applications with strict service level agreements (SLAs).
Laposa recommended running three replicas per application to ensure continuous uptime, and critical applications were configured with horizontal pod autoscaling to handle peaks in traffic.
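The scaling rule that horizontal pod autoscaling applies can be sketched in a few lines. The core formula, desired replicas = ceil(current replicas × current metric / target metric), and the default 10% tolerance come from the Kubernetes documentation. The floor of three replicas mirrors the recommendation above, while the ceiling of ten and the metric values in the example are assumptions chosen purely for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=3, max_replicas=10, tolerance=0.1):
    """Approximate the replica count a Horizontal Pod Autoscaler would choose."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count unchanged.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (assumed values in this sketch).
    return max(min_replicas, min(max_replicas, desired))


# Assumed figures: 3 replicas at 90% average CPU against a 60% target.
print(desired_replicas(3, 90, 60))
```

Because the result is clamped to the minimum, a quiet period never reduces a service below its three baseline replicas, which is what keeps it available while individual pods restart.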
The client’s Docker images, stored in Amazon Elastic Container Registry (ECR), were copied to Azure Container Registry (ACR).
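Copying images between registries involves rewriting each image reference so that the repository path and tag stay the same while the registry host changes. The sketch below shows only that naming step, based on the standard ECR and ACR host formats; the account ID, region, registry name, and repository are hypothetical, and the source does not state which tool performed the actual copy.

```python
def acr_reference(ecr_reference, acr_name):
    """Rewrite an ECR image reference so it points at an ACR registry.

    ECR hosts look like <account>.dkr.ecr.<region>.amazonaws.com and
    ACR hosts look like <registry-name>.azurecr.io.
    """
    registry, _, remainder = ecr_reference.partition("/")
    if ".dkr.ecr." not in registry or not registry.endswith(".amazonaws.com"):
        raise ValueError(f"not an ECR image reference: {ecr_reference}")
    if not remainder:
        raise ValueError(f"missing repository path: {ecr_reference}")
    # Keep repository path and tag unchanged; only the registry host differs.
    return f"{acr_name}.azurecr.io/{remainder}"


# Hypothetical account, region, and registry names, for illustration only.
source = "123456789012.dkr.ecr.eu-west-1.amazonaws.com/shop/orders-api:1.4.2"
print(acr_reference(source, "exampleregistry"))
```

Keeping paths and tags identical means deployment definitions only need their registry host updated when a service moves.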
4. User Acceptance Testing
User acceptance testing (UAT) was conducted in full on the pre-production AKS cluster. The goal was to simulate real-world traffic and verify performance before promoting applications to the production cluster.
Simulations were also performed to understand the scaling behavior of the AKS cluster, leveraging the client’s usage patterns and traffic volumes.
5. Security Reviews
Security was a priority, with the internal security team conducting:
- Firewall and WAF setup review for enhanced threat protection against malicious traffic.
- Penetration testing to uncover any vulnerabilities while running in AKS.
- DNS updates and load balancing of public/private traffic to the appropriate cluster components.
6. Load Testing
Before full production cutover, load tests were conducted to verify the performance of applications under heavier-than-expected load conditions. This confirmed system resilience and efficient scaling under peak stress conditions.
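A load test of this kind boils down to sending many concurrent requests and summarizing failures and latency. The sketch below is a simplified stand-in, not the tooling used on the project: the `send_request` callable, the request count, and the concurrency level are all assumptions, and the example uses a fake request that sleeps instead of calling a real endpoint.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(send_request, total_requests, concurrency):
    """Call `send_request` repeatedly from a pool of worker threads.

    Returns the success count, failure count, and 95th percentile latency.
    """
    if total_requests < 1:
        raise ValueError("total_requests must be at least 1")

    def timed_call(_):
        started = time.perf_counter()
        try:
            send_request()
            ok = True
        except Exception:
            # Any exception raised by the request counts as a failure.
            ok = False
        return ok, time.perf_counter() - started

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    latencies = sorted(elapsed for _, elapsed in results)
    successes = sum(1 for ok, _ in results if ok)
    # Nearest-rank 95th percentile of the observed latencies.
    p95_index = math.ceil(0.95 * len(latencies)) - 1
    return {
        "successes": successes,
        "failures": len(results) - successes,
        "p95_seconds": latencies[p95_index],
    }


def fake_request():
    """Stand-in for an HTTP call to a service under test."""
    time.sleep(0.001)


print(run_load_test(fake_request, total_requests=50, concurrency=5))
```

In a real run, `send_request` would issue an HTTP call against the pre-production cluster, and the request volume would be set above the expected peak.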
Post-Migration Activities
After the successful migration of services to Azure, the following tasks were undertaken to finalize the migration:
- Decommissioning of AWS resources: All AWS services, including EC2 instances, ELBs, and Amazon RDS for PostgreSQL, were decommissioned once the Azure production cluster proved stable.
- Monitoring Setup: Datadog agents were deployed as a DaemonSet across all the AKS nodes to ensure continuity in monitoring and alerting capabilities.
- Disaster Recovery: Disaster recovery was fully tested by simulating a system failure in the primary region. The failover process to a secondary Azure region, including rebuilding and migrating all workloads, was successfully completed within 20 minutes, affirming the rapid recovery capabilities of the Azure setup.
Results
Since migrating from AWS to Azure, the client has experienced significant improvements in performance, efficiency, and reliability.
Successful Transition
All services were successfully moved from AWS to Azure with minimal downtime, thanks to a well-planned phased migration that moved lower-risk services first and the most critical applications last.
Improved Scalability
The move to Azure Kubernetes Service (AKS) enhanced the client’s ability to scale horizontally during peak times, reducing infrastructure spend while handling higher traffic levels efficiently.
Cost Savings
The use of infrastructure-as-code (Terraform) and cloud-native services on Azure ensured better cost optimization compared to the legacy setup on AWS.
Enhanced Resiliency
The disaster recovery simulation showed that all workloads could be rebuilt and running in a secondary Azure region within 20 minutes, significantly reducing downtime in the event of a regional failure.
Modernized Observability
With Datadog agents running on AKS as a DaemonSet, the client maintained full visibility over workloads, which aided in performance monitoring and incident detection.
Better Security Setup
Full integration with Azure security services, including comprehensive WAF and DNS zone implementations, ensured the client’s applications were well-protected from external threats.
Conclusion
The migration from AWS to Azure with Laposa allowed the client to modernize its infrastructure, enhancing scalability and reliability while simplifying deployment processes.
Azure Kubernetes Service (AKS) replaced the older DC/OS cluster, while testing and disaster recovery solutions further fortified the platform for future growth. The client has since seen improvements in not only cost efficiency but also business agility, allowing them to focus on innovation over infrastructure management.
Next Step
If you're looking to improve how your business manages its infrastructure, scales applications, or optimizes costs by transitioning to the cloud, Laposa has the expertise to guide you through every step of the process.
Whether you’re considering a migration to Azure, Kubernetes, or any cloud-native solution, we’re here to ensure a seamless transition that drives efficiency and performance improvements.