AWS Elastic Load Balancing and Application Load Balancer

High availability on Amazon Web Services (AWS) protects the legacy file data that businesses move to AWS and delivers reliable performance with minimal downtime and cost.

AWS Auto Scaling monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. With AWS Auto Scaling, you can set up application scaling for multiple resources across multiple services in minutes.

Amazon Web Services defines Elasticity as “The ability to acquire resources as you need them and release resources when you no longer need them.” Microsoft Azure defines Elasticity as “The ability to quickly expand or decrease computer processing, memory, and storage resources to meet changing demands without worrying about capacity planning and engineering for peak usage.”

Scalability means that an application or system can handle greater loads by adapting.
• There are two kinds of scalability: Vertical Scalability and Horizontal Scalability
• Scalability is linked to but different from High Availability

• Vertical Scalability means increasing the size of the instance
• For example, your application runs on a t2.micro
• Scaling that application vertically means running it on a t2.large
• Vertical scalability is very common for non-distributed systems, such as a database.
• There’s usually a limit to how much you can vertically scale (hardware limit)

• Horizontal Scalability means increasing the number of instances/systems for your application
• Horizontal scaling implies distributed systems.
• This is very common for web applications / modern applications
• It’s easy to scale horizontally thanks to cloud offerings such as Amazon EC2
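
The difference between the two kinds of scalability can be sketched in a few lines of Python. This is only an illustration: the Fleet class and both helper functions are hypothetical names, not part of any AWS SDK, and the instance types are the ones used in the example above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Fleet:
    """A group of identical instances serving one application (hypothetical model)."""
    instance_type: str
    count: int


def scale_vertically(fleet: Fleet, new_type: str) -> Fleet:
    """Vertical scaling: same number of instances, each one a different size."""
    return replace(fleet, instance_type=new_type)


def scale_horizontally(fleet: Fleet, extra_instances: int) -> Fleet:
    """Horizontal scaling: same instance size, more instances."""
    return replace(fleet, count=fleet.count + extra_instances)


start = Fleet(instance_type="t2.micro", count=1)
bigger = scale_vertically(start, "t2.large")  # still one instance, but larger
wider = scale_horizontally(start, 3)          # same size, now four instances
print(bigger)
print(wider)
```

Vertical scaling changes only instance_type and horizontal scaling changes only count, which is why horizontal scaling needs an application that can run on several machines at once.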

A company that runs a single Amazon EC2 instance can adopt a highly available architecture by scaling horizontally across multiple Availability Zones. Elasticity is a characteristic of the AWS Cloud that helps users eliminate underutilized CPU capacity. The benefits of Amazon EC2 Auto Scaling include improved application health and availability along with optimized performance and cost.
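
As a rough sketch of why spreading instances across Availability Zones improves availability, the Python below places instances in zones in round-robin order and then checks what survives the loss of one zone. The zone names, instance names, and both functions are made up for illustration. Real placement is handled by AWS.

```python
from itertools import cycle


def spread_across_zones(instance_ids, zones):
    """Assign instances to zones in round-robin order to keep zones balanced."""
    return dict(zip(instance_ids, cycle(zones)))


def survivors(placement, failed_zone):
    """Return the instances that are still running after one zone is lost."""
    return [inst for inst, zone in placement.items() if zone != failed_zone]


zones = ["zone-a", "zone-b", "zone-c"]  # hypothetical zone names
placement = spread_across_zones([f"web-{n}" for n in range(6)], zones)

# Six instances over three zones: losing one zone still leaves four running.
print(survivors(placement, "zone-a"))

# A single instance lives in one zone only, so losing that zone is an outage.
single = spread_across_zones(["web-0"], zones)
print(survivors(single, "zone-a"))
```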

AWS Elastic Load Balancing

Elastic Load Balancing automatically distributes your incoming traffic across multiple targets, such as EC2 instances, containers, and IP addresses, in one or more Availability Zones. It monitors the health of its registered targets and routes traffic only to the healthy targets.

There are several reasons why load balancing makes sense.

• Spread load across multiple downstream instances
• Expose a single point of access (DNS) to your application
• Seamlessly handle failures of downstream instances
• Do regular health checks on your instances
• Provide SSL termination (HTTPS) for your websites
• High availability across zones
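
Two of the ideas in this list, spreading load and routing only to healthy targets, can be illustrated with a toy round-robin balancer. This is a simplified sketch and not how Elastic Load Balancing is implemented internally. The class name, the instance IDs, and the health check callback are all assumptions made for the example.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotates over targets that pass a health check."""

    def __init__(self, targets, health_check):
        self.targets = list(targets)
        self.health_check = health_check  # callable: target -> bool
        self._next = 0

    def healthy_targets(self):
        """Run the health check against every registered target."""
        return [t for t in self.targets if self.health_check(t)]

    def route(self):
        """Pick the next healthy target, or fail if none are available."""
        healthy = self.healthy_targets()
        if not healthy:
            raise RuntimeError("no healthy targets available")
        target = healthy[self._next % len(healthy)]
        self._next += 1
        return target


# Hypothetical instance IDs mapped to their current health status.
status = {"i-a": True, "i-b": True, "i-c": True}
lb = RoundRobinBalancer(status, status.get)

before_failure = [lb.route() for _ in range(3)]
status["i-b"] = False  # simulate a failed instance
after_failure = [lb.route() for _ in range(4)]
print(before_failure, after_failure)
```

Callers only ever talk to the balancer, so a failed instance drops out of rotation without any change on the client side.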

AWS Gateway Load Balancer

Gateway Load Balancer helps you easily deploy, scale, and manage your third-party virtual appliances. It gives you one gateway for distributing traffic across multiple virtual appliances while scaling them up or down, based on demand.

AWS Application Load Balancer

The Application Load Balancer is a feature of Elastic Load Balancing that allows a developer to configure and route incoming end-user traffic to applications based in the AWS public cloud. In a cloud environment with multiple web services, load balancing is essential.

• An ELB (Elastic Load Balancer) is a managed load balancer and AWS guarantees that it will be working
• AWS takes care of all upgrades, maintenance, and high availability. You only worry about minimal configuration options
• Setting up your own load balancer costs less, but it takes a lot more effort on your end, with maintenance and integration challenges to consider.
• Besides the Gateway Load Balancer described above, AWS offers 3 kinds of load balancers:
—Application Load Balancer (HTTP / HTTPS only) – Layer 7
—Network Load Balancer (ultra-high performance, allows for TCP) – Layer 4
—Classic Load Balancer (slowly retiring) – Layer 4 & 7
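
As a rule of thumb, the choice between the two current load balancer types follows the protocol of the traffic. The helper below is a hypothetical sketch of that rule, not an AWS API. It assumes, as AWS documents, that the Network Load Balancer also handles UDP and TLS.

```python
def suggest_load_balancer(protocol: str) -> str:
    """Map a traffic protocol to the load balancer type usually suited to it."""
    protocol = protocol.upper()
    if protocol in {"HTTP", "HTTPS"}:
        return "Application Load Balancer (Layer 7)"
    if protocol in {"TCP", "UDP", "TLS"}:
        return "Network Load Balancer (Layer 4)"
    raise ValueError(f"no suggestion for protocol {protocol!r}")


for proto in ("https", "tcp"):
    print(proto, "->", suggest_load_balancer(proto))
```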

AWS Auto Scaling Groups

An Auto Scaling group contains a collection of EC2 instances that are treated as a logical grouping for the purposes of automatic scaling and management. An Auto Scaling group also enables you to use Amazon EC2 Auto Scaling features such as health check replacements and scaling policies.

• In the real world, the load on your websites and application can change quickly
• When leveraging AWS, you can create and get rid of servers almost instantly
• The goal of an Auto Scaling Group (ASG) is to:
—Scale-out (add EC2 instances) to match an increased load
—Scale in (remove EC2 instances) to match a decreased load
—Ensure we have a minimum and a maximum number of machines running

• Automatically register new instances to a load balancer
• Replace unhealthy instances
• Cost Savings: only run at an optimal capacity (principle of the cloud)
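
The goals above can be sketched as a small simulation. The class below is a hypothetical model, not the AWS implementation: it keeps the desired capacity between the minimum and maximum, replaces unhealthy instances, and launches or terminates instances until the group matches the desired capacity. Clamping an out-of-range request is a simplification chosen for this sketch.

```python
from itertools import count


class AutoScalingGroupSketch:
    """Toy model of an Auto Scaling group (illustration only)."""

    def __init__(self, min_size, max_size, desired_capacity):
        self.min_size = min_size
        self.max_size = max_size
        self.instances = {}      # instance id -> True if healthy
        self._ids = count(1)     # generator for made-up instance ids
        self.set_desired_capacity(desired_capacity)
        self.reconcile()

    def set_desired_capacity(self, value):
        """Keep the desired capacity between min_size and max_size."""
        self.desired_capacity = max(self.min_size, min(self.max_size, value))

    def reconcile(self):
        """Drop unhealthy instances, then scale out or in to the desired size."""
        self.instances = {i: ok for i, ok in self.instances.items() if ok}
        while len(self.instances) < self.desired_capacity:   # scale out
            self.instances[f"i-{next(self._ids):04d}"] = True
        while len(self.instances) > self.desired_capacity:   # scale in
            self.instances.popitem()


asg = AutoScalingGroupSketch(min_size=2, max_size=5, desired_capacity=2)
asg.set_desired_capacity(9)        # above the maximum, so capped at 5
asg.reconcile()
asg.instances["i-0001"] = False    # simulate a failed health check
asg.reconcile()                    # the failed instance is replaced
print(sorted(asg.instances))
```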

How To Scale In AWS

• Manual Scaling: At any time, you can change the size of an existing Auto Scaling group manually. You can either update the desired capacity of the Auto Scaling group or update the instances that are attached to the Auto Scaling group. Manually scaling your group can be useful when automatic scaling is not needed or when you need to hold capacity at a fixed number of instances.

• Dynamic Scaling: Dynamic scaling adjusts the capacity of your Auto Scaling group as traffic changes occur. Scaling the number of EC2 instances in or out automatically, based on demand, is one way to use Amazon EC2 Auto Scaling groups to scale capacity in the AWS Cloud. Amazon EC2 Auto Scaling supports the following types of dynamic scaling policies:

Target tracking scaling: Increase and decrease the current capacity of the group based on an Amazon CloudWatch metric and a target value. It works much like the thermostat that maintains the temperature of your home: you select a temperature and the thermostat does the rest.

Step scaling: Increase and decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.

Simple scaling: Increase and decrease the current capacity of the group based on a single scaling adjustment, with a cooldown period between each scaling activity.
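
The difference between step scaling and simple scaling can be sketched in Python. Everything here is an illustrative assumption: the step boundaries, the 70% threshold, the adjustment sizes, and the 300-second cooldown are example values, and the function and class names are hypothetical rather than AWS APIs.

```python
# Each step is (lower_bound, upper_bound, adjustment). Bounds are offsets above
# the alarm threshold; the lower bound is inclusive, the upper bound exclusive.
STEP_ADJUSTMENTS = [
    (0, 10, 1),      # small breach: add 1 instance
    (10, 20, 2),     # medium breach: add 2 instances
    (20, None, 4),   # large breach: add 4 instances
]


def step_adjustment(metric, threshold, steps=STEP_ADJUSTMENTS):
    """Step scaling: the adjustment depends on the size of the alarm breach."""
    breach = metric - threshold
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0  # metric is below the threshold, no change


class SimpleScalingPolicy:
    """Simple scaling: one fixed adjustment, then wait out a cooldown period."""

    def __init__(self, adjustment, cooldown_seconds):
        self.adjustment = adjustment
        self.cooldown_seconds = cooldown_seconds
        self._last_activity = None  # time of the last scaling activity

    def apply(self, capacity, alarm_in_breach, now):
        in_cooldown = (
            self._last_activity is not None
            and now - self._last_activity < self.cooldown_seconds
        )
        if not alarm_in_breach or in_cooldown:
            return capacity
        self._last_activity = now
        return capacity + self.adjustment


print(step_adjustment(metric=85, threshold=70))

policy = SimpleScalingPolicy(adjustment=2, cooldown_seconds=300)
capacity = policy.apply(4, alarm_in_breach=True, now=0)           # scales out
capacity = policy.apply(capacity, alarm_in_breach=True, now=120)  # in cooldown
print(capacity)
```

Step scaling reacts more strongly to a larger breach, while simple scaling always applies the same adjustment and ignores the alarm until the cooldown has passed.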

• Simple / Step Scaling example: when a CloudWatch alarm is triggered (for example, CPU > 70%), add 2 units
• Simple / Step Scaling example: when a CloudWatch alarm is triggered (for example, CPU < 30%), remove 1 unit
• Target Tracking Scaling example: I want the average ASG CPU to stay at around 40%
• Scheduled Scaling: anticipate scaling based on known usage patterns
• Scheduled Scaling example: increase the minimum capacity to 10 at 5 pm on Fridays
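
The 40% CPU target and the Friday 5 pm schedule from the examples above could be expressed as request parameters for the boto3 Auto Scaling client. The sketch below only builds the parameter dictionaries. The group name, policy name, and action name are hypothetical, and the cron expression is interpreted in UTC unless a time zone is supplied.

```python
def target_tracking_policy(group_name, target_cpu_percent):
    """Parameters for put_scaling_policy: keep average CPU near a target."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
        },
    }


def scheduled_min_size(group_name, min_size, recurrence):
    """Parameters for put_scheduled_update_group_action."""
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": f"{group_name}-scheduled-min",  # hypothetical
        "Recurrence": recurrence,  # cron format: minute hour day month weekday
        "MinSize": min_size,
    }


policy = target_tracking_policy("my-asg", 40)
action = scheduled_min_size("my-asg", 10, "0 17 * * 5")  # Fridays at 17:00

# With boto3 installed and AWS credentials configured, these could be sent as:
#   client = boto3.client("autoscaling")
#   client.put_scaling_policy(**policy)
#   client.put_scheduled_update_group_action(**action)
print(policy["PolicyType"], action["Recurrence"])
```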

ELB & ASG Summary

The main purpose of High Availability in the Cloud is to survive the loss of a data center, which entails running in at least two Availability Zones.

A Network Load Balancer can handle millions of requests per second with low latency. It operates at Layer 4 and is best suited for load-balancing TCP, UDP, and TLS traffic with ultra-high performance. Application Load Balancers are used for HTTP and HTTPS load balancing and are the best suited for this kind of traffic. Load Balancers cannot help with back-end autoscaling.

Vertical scaling means increasing the size of the instance. Changing from a t3a.medium to a t3a.2xlarge is an example of a size increase. An Auto Scaling Group (ASG) can automatically and quickly scale in and scale out, adding or removing instances to match the changing load on your applications and websites. Auto Scaling Groups add or remove instances rather than resize them: they cannot change the instance type of a running EC2 instance on the fly.

Auto Scaling strategies include Manual Scaling, Dynamic Scaling (Simple/Step Scaling and Target Tracking Scaling), Scheduled Scaling, and Predictive Scaling. Pay-as-you-go pricing and Auto Scaling policies can help an online retail company that wants to migrate its on-premises workload to AWS handle a seasonal workload increase automatically and cost-effectively.

• High Availability vs Scalability (vertical and horizontal) vs Elasticity vs Agility in the Cloud
• Elastic Load Balancers (ELB)
• Distribute traffic across backend EC2 instances, which can be Multi-AZ
• Supports health checks
• 3 types: Application LB (HTTP – L7), Network LB (TCP – L4), Classic LB (old)
• Auto Scaling Groups (ASG) – Auto Scaling groups are designed to handle the load for applications during periods of high demand, for example when a system administrator finds that CPU utilization on an Amazon EC2 instance reached 100% during business hours and users reported latency issues.
• Implement Elasticity for your application, across multiple AZ
• Scale EC2 instances based on the demand on your system and replace unhealthy instances – Amazon EC2 Auto Scaling gives a company steady, predictable performance from its Amazon EC2 instances at the lowest possible cost, while scaling resources so that the right resources are available at the right time.
• Integrated with the ELB

A company gains agility if, after migrating to the AWS Cloud, it can deploy an application in 2-3 days instead of the 3-4 weeks its on-premises deployment cycle took. High availability is an AWS Cloud benefit shown by an architecture’s ability to withstand failures with minimal downtime.
