Scalability & High Availability
Understanding Scalability and High Availability
In this lecture, we explored the fundamental concepts of scalability and high availability.
Scalability refers to a system's ability to handle increased load. It can be achieved through:
- Vertical Scaling: Increasing the capacity of individual components (e.g., upgrading a server).
- Horizontal Scaling: Adding more components to a system (e.g., adding more servers).
High Availability ensures that a system remains operational even in the face of failures. It often involves:
- Redundancy: Having multiple components that can take over if one fails.
- Load Balancing: Distributing workload across multiple components.
What is Load Balancing?
Load balancing distributes incoming traffic across multiple servers to optimize performance and reliability. AWS offers several types of load balancers to handle various use cases.
Types of AWS Load Balancers:
- Application Load Balancer (ALB): Supports HTTP, HTTPS, and WebSocket protocols.
- Network Load Balancer (NLB): Supports TCP, TLS, and UDP protocols.
- Gateway Load Balancer (GWLB): Operates at Layer 3 (network layer) on IP packets, routing traffic through fleets of third-party virtual appliances.
Key Benefits of Using AWS Load Balancers:
- Improved Performance: Distributes traffic across multiple servers to prevent overloading.
- Enhanced Reliability: Ensures high availability by redirecting traffic to healthy servers.
- Increased Security: Provides SSL termination and allows for fine-grained security group controls.
- Simplified Management: AWS manages load balancers, reducing operational overhead.
- Scalability: Easily scales to handle increased traffic.
- Integration with Other AWS Services: Works seamlessly with EC2, ECS, Certificate Manager, CloudWatch, Route 53, WAF, and more.
How Load Balancing Works:
- Traffic Arrival: Users connect to the load balancer.
- Load Distribution: The load balancer distributes traffic to available EC2 instances.
- Health Checks: The load balancer monitors the health of EC2 instances and removes unhealthy ones from the rotation.
- Traffic Routing: Healthy instances receive and process incoming traffic.
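To make this flow concrete, here is a minimal boto3 sketch (Python) that creates a target group with an HTTP health check and registers two EC2 instances behind it. All names, IDs, and the /health path are placeholders chosen for illustration, not values from the lecture.

```python
import boto3

elbv2 = boto3.client("elbv2")

# Create a target group with an HTTP health check (names/IDs are placeholders).
tg = elbv2.create_target_group(
    Name="demo-web-tg",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",
    HealthCheckProtocol="HTTP",
    HealthCheckPath="/health",
    HealthyThresholdCount=2,
    UnhealthyThresholdCount=2,
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# Register EC2 instances; the load balancer only forwards to targets that pass health checks.
elbv2.register_targets(
    TargetGroupArn=tg_arn,
    Targets=[{"Id": "i-0123456789abcdef0"}, {"Id": "i-0fedcba9876543210"}],
)

# Inspect current target health (unhealthy targets are taken out of rotation).
print(elbv2.describe_target_health(TargetGroupArn=tg_arn)["TargetHealthDescriptions"])
```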
Security Considerations:
- Security Groups: Configure security groups to allow traffic from the load balancer to EC2 instances and restrict access from other sources.
- SSL/TLS Termination: Use SSL/TLS certificates to encrypt traffic between the load balancer and clients.
- Web Application Firewall (WAF): Protect your applications from common web attacks.
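As a sketch of the security group point above, the following boto3 call allows HTTP traffic into the instances only when it originates from the load balancer's security group; both group IDs are placeholders.

```python
import boto3

ec2 = boto3.client("ec2")

# Allow HTTP into the instances only from the load balancer's security group
# (both group IDs are placeholders).
ec2.authorize_security_group_ingress(
    GroupId="sg-instances-placeholder",
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 80,
        "ToPort": 80,
        "UserIdGroupPairs": [{"GroupId": "sg-alb-placeholder"}],
    }],
)
```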
Understanding Application Load Balancers (ALBs) in AWS
ALBs operate at Layer 7 (application layer) and distribute HTTP/HTTPS traffic across various targets like EC2 instances, containers, and Lambda functions. They offer several advantages:
- Route traffic intelligently: ALB can route traffic based on URL paths, hostnames, query strings, and headers. Imagine having separate user and search applications behind a single ALB – it directs users to the appropriate application based on the URL path.
- Support for modern architectures: ALBs excel at handling microservices and container-based applications. They work seamlessly with Amazon ECS (container orchestration service) and can redirect traffic to dynamic ports on ECS instances.
- Multiple target groups: An ALB can manage multiple target groups, each containing various resources like EC2 instances or Lambda functions. This allows you to efficiently distribute traffic based on specific criteria.
- Health checks: ALB monitors the health of your targets and ensures traffic only reaches healthy ones.
- Security benefits: ALBs support modern SSL/TLS security policies, letting you enforce up-to-date protocol versions for your application.
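As an illustration of the path-based routing described in the first bullet above, here is a hedged boto3 sketch that adds a listener rule forwarding /search/* requests to a separate target group; the ARNs and priority are placeholders.

```python
import boto3

elbv2 = boto3.client("elbv2")

# Route /search/* to a dedicated target group; everything else follows the
# listener's default action (ARNs below are placeholders).
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:region:account:listener/app/demo/placeholder",
    Priority=10,
    Conditions=[{
        "Field": "path-pattern",
        "PathPatternConfig": {"Values": ["/search/*"]},
    }],
    Actions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:region:account:targetgroup/search/placeholder",
    }],
)
```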
Benefits of using ALBs:
- Simplified management: One ALB can manage multiple applications, eliminating the need for separate classic load balancers for each application.
- Scalability: ALBs can handle varying traffic volumes, ensuring your application remains responsive even during high demand.
- Flexibility: ALB's routing capabilities allow you to direct traffic based on diverse criteria, providing granular control over your application's behavior.
Beyond the Basics:
- ALBs can route traffic to on-premises servers through private IP addresses.
- They provide a fixed hostname for your application, similar to classic load balancers.
- The client's true IP address is hidden from your application servers and can be accessed through special headers like X-Forwarded-For.
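A minimal sketch of how an application might recover the original client IP from the X-Forwarded-For header; it assumes request headers arrive as a simple dictionary and is illustrative only.

```python
# X-Forwarded-For may contain a comma-separated chain of addresses; the
# left-most entry is the client as seen by the load balancer.
def client_ip(headers: dict) -> str:
    forwarded = headers.get("X-Forwarded-For", "")
    return forwarded.split(",")[0].strip() if forwarded else "unknown"

print(client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.1.12"}))  # -> 203.0.113.7
```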
Network Load Balancer (NLB) Summary
- A Layer 4 load balancer for TCP and UDP traffic.
- High-performance, handling millions of requests per second with ultra-low latency.
- Supports static IP addresses per Availability Zone (AZ) using Elastic IPs.
Key Use Cases
- Extreme Performance: When you need to handle a massive amount of TCP or UDP traffic.
- Static IP Addresses: When your application requires fixed IP addresses.
- Hybrid Environments: Can be used to front both EC2 instances and on-premises servers.
- Multi-Layer Load Balancing: Can be combined with Application Load Balancers (ALBs) for advanced routing and security.
NLB Target Groups
- EC2 Instances: Can direct traffic to your EC2 instances.
- IP Addresses: Can direct traffic to both private IP addresses of EC2 instances and on-premises servers.
Health Checks
- Supports TCP, HTTP, and HTTPS health checks.
Remember: NLB is not included in the AWS Free Tier.
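The static IP use case can be sketched with boto3 by attaching one Elastic IP per AZ through SubnetMappings when the NLB is created; the subnet and allocation IDs below are placeholders.

```python
import boto3

elbv2 = boto3.client("elbv2")

# Network Load Balancer with one Elastic IP per AZ via SubnetMappings
# (subnet and allocation IDs are placeholders).
nlb = elbv2.create_load_balancer(
    Name="demo-nlb",
    Type="network",
    Scheme="internet-facing",
    SubnetMappings=[
        {"SubnetId": "subnet-aaa111", "AllocationId": "eipalloc-aaa111"},
        {"SubnetId": "subnet-bbb222", "AllocationId": "eipalloc-bbb222"},
    ],
)
print(nlb["LoadBalancers"][0]["DNSName"])
```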
Introducing the Gateway Load Balancer
AWS's newest load balancer, the Gateway Load Balancer, is designed to streamline network traffic management and security. This powerful tool allows you to route all network traffic in your VPC through a series of third-party virtual appliances, such as firewalls or intrusion detection systems, before it reaches your applications.
Key Benefits:
- Enhanced Security: Inspect and filter traffic to mitigate potential threats.
- Improved Performance: Optimize network traffic flow and reduce latency.
- Simplified Management: Centrally manage and scale your network appliances.
How it Works:
- Traffic Ingestion: The Gateway Load Balancer intercepts incoming traffic.
- Traffic Distribution: It distributes the traffic across a target group of virtual appliances.
- Traffic Inspection: The appliances analyze the traffic for security and performance.
- Traffic Routing: Accepted traffic is forwarded to the destination application.
Key Points to Remember:
- Operates at Layer 3 (Network Layer)
- Uses GENEVE protocol on port 6081
- Supports EC2 instances and IP addresses as targets
By leveraging the Gateway Load Balancer, you can significantly enhance the security and performance of your AWS infrastructure.
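As a sketch under these assumptions, the following boto3 calls create a Gateway Load Balancer and a GENEVE target group on port 6081 for the appliance fleet; all IDs and names are placeholders.

```python
import boto3

elbv2 = boto3.client("elbv2")

# Gateway Load Balancer in the subnets where the appliances run (IDs are placeholders).
gwlb = elbv2.create_load_balancer(
    Name="demo-gwlb",
    Type="gateway",
    Subnets=["subnet-aaa111", "subnet-bbb222"],
)

# Target group for the virtual appliances, using GENEVE on port 6081.
appliances = elbv2.create_target_group(
    Name="demo-appliances",
    Protocol="GENEVE",
    Port=6081,
    VpcId="vpc-0123456789abcdef0",
    TargetType="instance",       # firewall / IDS appliance instances
    HealthCheckProtocol="TCP",   # GENEVE target groups health-check over TCP/HTTP(S)
)
```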
Sticky Sessions: A Deep Dive
Sticky sessions, also known as session affinity, ensure that a client's requests are consistently routed to the same backend server instance behind a load balancer. This is particularly useful for applications that keep session state, such as user login information or shopping cart items.
How It Works:
- Cookie-Based Mechanism:
- A cookie is generated and sent to the client's browser.
- This cookie is included in subsequent requests to the load balancer.
- The load balancer uses the cookie to identify the original server instance and route the request accordingly.
Cookie Types:
- Application-Based Cookie: Generated by the application and can contain custom attributes.
- Duration-Based Cookie: Generated by the load balancer with a specific expiration time.
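Both cookie types map to target group attributes. Here is a hedged boto3 sketch enabling a duration-based cookie, with the application-based variant shown in comments; the target group ARN and cookie name are placeholders.

```python
import boto3

elbv2 = boto3.client("elbv2")
TG_ARN = "arn:aws:elasticloadbalancing:region:account:targetgroup/demo/placeholder"  # placeholder

# Duration-based (load-balancer-generated) cookie, valid for one day.
elbv2.modify_target_group_attributes(
    TargetGroupArn=TG_ARN,
    Attributes=[
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "lb_cookie"},
        {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "86400"},
    ],
)

# Application-based cookie alternative (cookie name is hypothetical):
#   {"Key": "stickiness.type", "Value": "app_cookie"},
#   {"Key": "stickiness.app_cookie.cookie_name", "Value": "MYAPPCOOKIE"},
```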
Key Considerations:
- Balancing Trade-Off: While sticky sessions can improve user experience, they can also lead to load imbalance if some servers become overloaded with sticky clients.
- Cookie Expiration: Be mindful of cookie expiration times to avoid unexpected behavior.
- Security Implications: Ensure proper security measures are in place to protect sensitive information stored in session state.
Cross Zone Load Balancing in AWS
Cross Zone Load Balancing is a feature in AWS that lets each load balancer node distribute traffic evenly across all registered instances in every enabled Availability Zone (AZ), rather than only across the instances in its own AZ. This helps improve the availability and fault tolerance of your applications.
How Does it Work?
- Even Distribution: Traffic is evenly distributed across all instances, regardless of their AZ.
- Improved Fault Tolerance: If one AZ fails, traffic is automatically rerouted to instances in other AZs.
- Enhanced Performance: By distributing traffic across multiple AZs, you can reduce latency and improve overall performance.
Default Behavior:
- Application Load Balancer (ALB): Cross Zone Load Balancing is enabled by default, but you can disable it at the target group level.
- Network Load Balancer (NLB) and Gateway Load Balancer (GWLB): Cross Zone Load Balancing is disabled by default. You can enable it, but you pay for the resulting inter-AZ data transfer.
- Classic Load Balancer: Cross Zone Load Balancing is disabled by default; enabling it incurs no additional charges.
Key Points to Remember:
- Cross Zone Load Balancing can significantly improve the reliability and performance of your AWS applications.
- Carefully consider the default behavior and additional charges associated with enabling Cross Zone Load Balancing for different load balancer types.
- When configuring target groups for ALBs, you can choose to inherit the default Cross Zone Load Balancing setting or override it.
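Cross Zone Load Balancing is controlled through attributes. The boto3 sketch below enables it on an NLB and shows the ALB target-group-level override; the ARNs are placeholders.

```python
import boto3

elbv2 = boto3.client("elbv2")

# NLB/GWLB: cross-zone is a load balancer attribute (disabled by default).
elbv2.modify_load_balancer_attributes(
    LoadBalancerArn="arn:aws:elasticloadbalancing:region:account:loadbalancer/net/demo/placeholder",
    Attributes=[{"Key": "load_balancing.cross_zone.enabled", "Value": "true"}],
)

# ALB: override per target group with "true", "false",
# or "use_load_balancer_configuration" to inherit the default.
elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:region:account:targetgroup/demo/placeholder",
    Attributes=[{"Key": "load_balancing.cross_zone.enabled", "Value": "use_load_balancer_configuration"}],
)
```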
What are SSL Certificates?
- Secure communication between clients (users) and servers (websites).
- Encrypts data during transfer, protecting sensitive information like credit card details.
- Represented by a lock icon or "HTTPS" in the website address bar.
- Issued by trusted authorities like Comodo, Symantec, and GoDaddy.
- Require renewal to ensure authenticity.
How do they work with Load Balancers?
- Load balancers distribute traffic across multiple servers.
- When users connect over HTTPS, the load balancer uses an SSL certificate to secure the connection and performs "SSL termination," decrypting the traffic.
- The decrypted traffic then travels to backend servers over the private Virtual Private Cloud (VPC) network.
- AWS Certificate Manager (ACM) simplifies managing these certificates.
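Putting SSL termination together, here is a minimal boto3 sketch that creates an HTTPS listener on an ALB using an ACM certificate and forwards decrypted traffic to a target group; the ARNs are placeholders and the security policy name is only an example.

```python
import boto3

elbv2 = boto3.client("elbv2")

# HTTPS listener that terminates TLS at the ALB with an ACM certificate,
# then forwards to the target group inside the VPC (ARNs are placeholders).
elbv2.create_listener(
    LoadBalancerArn="arn:aws:elasticloadbalancing:region:account:loadbalancer/app/demo/placeholder",
    Protocol="HTTPS",
    Port=443,
    SslPolicy="ELBSecurityPolicy-TLS13-1-2-2021-06",  # example policy name
    Certificates=[{"CertificateArn": "arn:aws:acm:region:account:certificate/placeholder"}],
    DefaultActions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:region:account:targetgroup/demo/placeholder",
    }],
)
```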
Understanding Server Name Indication (SNI):
- Enables hosting multiple websites on a single load balancer.
- Client specifies the desired website during the initial connection.
- Load balancer uses the appropriate SSL certificate based on the requested hostname.
- SNI is not supported by older load balancers (Classic Load Balancer).
- Application Load Balancer (ALB) and Network Load Balancer (NLB) support SNI.
Supported Load Balancers for Multiple SSL Certificates:
- Application Load Balancer (v2): Supports multiple listeners with unique SSL certificates using SNI.
- Network Load Balancer: Similar to ALB, supports multiple listeners and certificates with SNI.
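A hedged boto3 sketch of SNI in practice: attaching additional ACM certificates to an existing HTTPS listener so the load balancer can present the right certificate per requested hostname; the ARNs are placeholders.

```python
import boto3

elbv2 = boto3.client("elbv2")
LISTENER_ARN = "arn:aws:elasticloadbalancing:region:account:listener/app/demo/placeholder"  # placeholder

# Extra certificates for additional hostnames served by the same listener;
# SNI selects the matching certificate during the TLS handshake.
elbv2.add_listener_certificates(
    ListenerArn=LISTENER_ARN,
    Certificates=[
        {"CertificateArn": "arn:aws:acm:region:account:certificate/www-example-com"},
        {"CertificateArn": "arn:aws:acm:region:account:certificate/api-example-com"},
    ],
)
```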
Key Takeaways:
- SSL certificates are crucial for secure communication online.
- Load balancers can leverage SSL certificates for secure traffic distribution.
- SNI allows hosting multiple websites with unique SSL certificates on a single load balancer (ALB or NLB).
A Smooth Transition for Your EC2 Instances
Have you ever wondered how to gracefully remove an EC2 instance from your load balancer without interrupting active user connections? Deregistration Delay is the answer.
Understanding Deregistration Delay
When an EC2 instance is marked for removal, Deregistration Delay (previously known as Connection Draining) tells the load balancer to stop sending new requests to that instance while existing, in-flight requests are given time to complete. This ensures a seamless transition and avoids abrupt connection drops.
Key Points to Remember:
- Draining Period: You can customize the draining period between 1 and 3,600 seconds.
- Default Value: The default draining period is 300 seconds (5 minutes).
- Short Requests: For short-lived requests, a shorter draining period is ideal.
- Long Requests: For long-running requests, a longer draining period is necessary.
- Disabling Draining: Setting the draining period to 0 disables the feature.
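The draining period is a target group attribute. A minimal boto3 sketch, with a placeholder target group ARN:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Shorten the draining window to 60 seconds for short-lived requests;
# a value of "0" would disable draining entirely (ARN is a placeholder).
elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:region:account:targetgroup/demo/placeholder",
    Attributes=[{"Key": "deregistration_delay.timeout_seconds", "Value": "60"}],
)
```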
Auto Scaling Groups: A Simplified Guide
An Auto Scaling Group (ASG) automatically adjusts the number of EC2 instances in a group based on demand. This ensures optimal performance and cost-efficiency.
- Setting Capacity Limits:
- Minimum Capacity: The fewest instances to maintain.
- Desired Capacity: The optimal number of instances.
- Maximum Capacity: The highest number of instances allowed.
- Scaling Actions:
- Scale Out: Adds instances to handle increased load.
- Scale In: Removes instances to reduce costs during low-demand periods.
- Integration with Load Balancers:
- Distributes traffic evenly across instances.
- Monitors instance health and replaces unhealthy ones.
Key Components of an ASG:
- Launch Template: Defines the configuration for new EC2 instances, including AMI, instance type, security groups, and more.
- Scaling Policies: Determine when and how much to scale based on metrics like CPU utilization or custom metrics.
- CloudWatch Alarms: Trigger scaling actions when specific thresholds are met.
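Tying these components together, here is a hedged boto3 sketch that creates an ASG from a launch template, attaches it to a load balancer target group, and uses ELB health checks so unhealthy instances are replaced; every name, ID, and ARN is a placeholder.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# ASG built from a launch template, attached to an ALB target group,
# with ELB health checks (all names/ARNs are placeholders).
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="demo-asg",
    LaunchTemplate={"LaunchTemplateName": "demo-web-template", "Version": "$Latest"},
    MinSize=2,
    DesiredCapacity=2,
    MaxSize=10,
    VPCZoneIdentifier="subnet-aaa111,subnet-bbb222",
    TargetGroupARNs=["arn:aws:elasticloadbalancing:region:account:targetgroup/demo/placeholder"],
    HealthCheckType="ELB",
    HealthCheckGracePeriod=120,
    DefaultCooldown=300,
)
```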
Benefits of Using ASGs:
- Improved Performance: Ensures optimal resource allocation to handle fluctuating loads.
- Cost Efficiency: Automatically scales resources up or down to minimize costs.
- High Availability: Maintains service availability by replacing failed instances.
- Simplified Management: Automates scaling processes, reducing manual effort.
Scaling Policies
- Dynamic Scaling:
- Target Tracking Scaling: Sets a target value for a metric (e.g., CPU utilization) and automatically scales to maintain that value.
- Simple or Step Scaling: Triggers scaling actions based on CloudWatch alarms, adding or removing instances as needed.
- Scheduled Scaling: Pre-schedules scaling actions based on anticipated usage patterns (e.g., increased load during peak hours).
- Predictive Scaling: Forecasts future load and schedules scaling actions proactively, ideal for cyclical patterns.
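As a sketch of two of these policies, the boto3 calls below create a target tracking policy on average CPU and a scheduled action ahead of a known peak; the group name, schedule, and capacity values are placeholders.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Target tracking: keep average CPU across the group at roughly 50%.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="demo-asg",
    PolicyName="keep-cpu-at-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 50.0,
    },
)

# Scheduled scaling: raise capacity ahead of an anticipated Friday-evening peak.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="demo-asg",
    ScheduledActionName="friday-evening-peak",
    Recurrence="0 17 * * 5",   # cron expression, evaluated in UTC
    MinSize=4,
    DesiredCapacity=6,
    MaxSize=12,
)
```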
Choosing the Right Metrics
Selecting appropriate metrics is crucial for effective scaling:
- CPU Utilization: A common metric to monitor, as increased CPU usage often indicates higher workload.
- RequestCountPerTarget: Tracks the number of requests per target instance, helping to optimize resource utilization.
- Network In/Out: Useful for network-bound applications to ensure sufficient bandwidth.
- Custom Metrics: Create custom metrics to monitor specific application-level performance indicators.
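For the custom-metric case, an application can publish its own CloudWatch metric that alarms and scaling policies then act on; the namespace, metric name, and dimension below are hypothetical.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish an application-level metric (e.g., pending jobs in a queue) that a
# CloudWatch alarm or scaling policy could react to (names are hypothetical).
cloudwatch.put_metric_data(
    Namespace="DemoApp",
    MetricData=[{
        "MetricName": "PendingJobs",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": "demo-asg"}],
        "Value": 42.0,
        "Unit": "Count",
    }],
)
```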
Cooldown Periods
Cooldown periods prevent excessive scaling actions. After a scaling event, the ASG waits for a specified duration before triggering another scaling action. This allows metrics to stabilize and new instances to become fully operational.
Tips for Optimal Scaling
- Use Ready-to-Use AMIs: Reduce instance boot time and shorten cooldown periods.
- Enable Detailed Monitoring: Collects EC2 metrics at 1-minute intervals (instead of 5) for more responsive scaling decisions.