Monitoring ELB

ELB measures and sends metrics to CloudWatch in 60-second intervals, but only while requests are flowing through the LB
If there are no requests, or no data for a given metric, that metric is not reported to CloudWatch
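
Because idle periods produce no datapoints, anything that reads these metrics has to decide how to treat the gaps. A minimal sketch, assuming datapoints arrive as (timestamp, value) pairs aligned to the minute. The function name `fill_gaps` and the choice to treat a missing minute as zero are illustrative assumptions, not ELB or CloudWatch behavior:

```python
from datetime import datetime, timedelta


def fill_gaps(datapoints, start, end, period_seconds=60, default=0.0):
    """Return one (timestamp, value) pair per period in [start, end).

    Periods for which nothing was reported get `default` (an assumption:
    zero is only a sensible stand-in for count-style metrics).
    """
    reported = dict(datapoints)
    step = timedelta(seconds=period_seconds)
    filled = []
    current = start
    while current < end:
        filled.append((current, reported.get(current, default)))
        current += step
    return filled


start = datetime(2024, 1, 1, 12, 0)
# Only two of the five minutes carried traffic, so only two datapoints exist.
sparse = [(start, 42.0), (start + timedelta(minutes=3), 7.0)]
series = fill_gaps(sparse, start, start + timedelta(minutes=5))
```

Zero is a reasonable default for Sum metrics such as RequestCount. For Average metrics such as Latency, passing `default=None` keeps the gap visible instead of inventing a value.
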
Available Metrics:

HealthyHostCount:
- The count of the number of healthy instances in each AZ
- Hosts are declared healthy once they meet the healthy threshold for the number of consecutive successful health checks
- Hosts that have failed more health checks than the value of the unhealthy threshold are considered unhealthy
- If cross-zone is enabled, the count of the number of healthy instances is calculated for all AZs
- Preferred Statistic: Average
UnHealthyHostCount:
- The count of the number of unhealthy instances in each AZ
- Hosts that have failed more health checks than the value of the unhealthy threshold are considered unhealthy
- If cross-zone is enabled, the count of the number of unhealthy instances is calculated for all AZs
- Instances may become unhealthy due to connectivity issues, health checks returning non-200 responses (in the case of HTTP or HTTPS health checks), or timeouts when performing the health check
- Preferred Statistic: Average
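
The threshold behavior behind HealthyHostCount and UnHealthyHostCount can be sketched as a small state machine. This is a simplified model, not the load balancer's actual implementation: the function name `track_health` is hypothetical, and it assumes an instance changes state as soon as the run of consecutive contradicting checks reaches the relevant threshold.

```python
def track_health(check_results, healthy_threshold=2, unhealthy_threshold=2,
                 initially_healthy=True):
    """Replay health check results (True = passed) and return the state
    (True = healthy) after each check."""
    healthy = initially_healthy
    streak = 0  # consecutive results that contradict the current state
    states = []
    for passed in check_results:
        if passed == healthy:
            streak = 0  # result agrees with the current state
        else:
            streak += 1
            # Assumption: flip once the streak reaches the threshold.
            threshold = unhealthy_threshold if healthy else healthy_threshold
            if streak >= threshold:
                healthy = not healthy
                streak = 0
        states.append(healthy)
    return states


checks = [True, False, True, False, False, True, True, True]
states = track_health(checks, healthy_threshold=3, unhealthy_threshold=2)
```

In this run a single failed check does not change the state. Two failures in a row mark the instance unhealthy, and it takes three passes in a row to bring it back.
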
RequestCount:
- The count of the number of completed requests that were received and routed to the back end instances
- Preferred Statistic: Sum
Latency:
- Measures the time elapsed in seconds after the request leaves the load balancer until the response is received
- Preferred Statistic: Average
HTTPCode_ELB_4XX:
- The count of the number of HTTP 4XX client error codes generated by the load balancer when the listener is configured to use HTTP or HTTPS protocols. Client errors are generated when a request is malformed or incomplete
- Preferred Statistic: Sum
HTTPCode_ELB_5XX:
- The count of the number of HTTP 5XX server error codes generated by the load balancer when the listener is configured to use HTTP or HTTPS protocols
- This metric does not include any responses generated by back end instances
- The metric is reported if there are no back-end instances that are healthy or registered to the load balancer, or if the request rate exceeds the capacity of the instances or the load balancers
- Preferred Statistic: Sum
HTTPCode_Backend_2XX:
HTTPCode_Backend_3XX:
HTTPCode_Backend_4XX:
HTTPCode_Backend_5XX:
- The count of the number of HTTP response codes generated by back-end instances
- Metric does not include any response codes generated by the load balancer
- The 2XX class status codes represent successful actions
- The 3XX class status codes indicate that the user agent must take further action (redirection)
- The 4XX class status codes represent client errors
- The 5XX class status codes represent back-end server errors
- Preferred Statistic: Sum
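
Since these four metrics are all summed counts, they combine naturally into error rates. A minimal sketch, assuming the Sum of each metric for one window has already been collected into a dictionary. The function name `backend_error_rates` and the sample numbers are illustrative:

```python
BACKEND_METRICS = [
    "HTTPCode_Backend_2XX",
    "HTTPCode_Backend_3XX",
    "HTTPCode_Backend_4XX",
    "HTTPCode_Backend_5XX",
]


def backend_error_rates(sums):
    """Return the 4XX and 5XX share of all back-end responses in a window.

    `sums` maps metric names to their Sum. A missing key counts as zero,
    because a metric with no data is simply not reported.
    """
    total = sum(sums.get(name, 0) for name in BACKEND_METRICS)
    if total == 0:
        return {"4XX": 0.0, "5XX": 0.0}
    return {
        "4XX": sums.get("HTTPCode_Backend_4XX", 0) / total,
        "5XX": sums.get("HTTPCode_Backend_5XX", 0) / total,
    }


# Made-up sample window for illustration.
window = {
    "HTTPCode_Backend_2XX": 940,
    "HTTPCode_Backend_3XX": 10,
    "HTTPCode_Backend_4XX": 40,
    "HTTPCode_Backend_5XX": 10,
}
rates = backend_error_rates(window)
```
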
BackendConnectionErrors:
- The count of the number of connections that were not successfully established between the LB and the registered instances
- The LB will retry when there are connection errors, so the count can exceed the request rate
- Preferred Statistic: Sum
SurgeQueueLength:
- A count of the total number of requests that are pending submission to a registered instance
- Preferred Statistic: Max
SpilloverCount:
- A count of the total number of requests that were rejected due to the queue being full
- Preferred Statistic: Sum
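
The metric list above can be turned into a lookup table that picks the preferred statistic when querying CloudWatch. A sketch under stated assumptions: the helper `build_query` and the load balancer name `my-load-balancer` are placeholders, the keys follow the parameter names of the CloudWatch GetMetricStatistics API, and "Max" from these notes is written as `Maximum`, the name CloudWatch uses for that statistic.

```python
from datetime import datetime, timedelta, timezone

# Preferred statistic per metric, as listed in these notes.
PREFERRED_STATISTIC = {
    "HealthyHostCount": "Average",
    "UnHealthyHostCount": "Average",
    "RequestCount": "Sum",
    "Latency": "Average",
    "HTTPCode_ELB_4XX": "Sum",
    "HTTPCode_ELB_5XX": "Sum",
    "HTTPCode_Backend_2XX": "Sum",
    "HTTPCode_Backend_3XX": "Sum",
    "HTTPCode_Backend_4XX": "Sum",
    "HTTPCode_Backend_5XX": "Sum",
    "BackendConnectionErrors": "Sum",
    "SurgeQueueLength": "Maximum",
    "SpilloverCount": "Sum",
}


def build_query(load_balancer_name, metric_name, end_time, minutes=60):
    """Build keyword arguments for a GetMetricStatistics request (hypothetical helper)."""
    return {
        "Namespace": "AWS/ELB",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "LoadBalancerName", "Value": load_balancer_name}],
        "StartTime": end_time - timedelta(minutes=minutes),
        "EndTime": end_time,
        "Period": 60,  # matches the 60-second reporting interval
        "Statistics": [PREFERRED_STATISTIC[metric_name]],
    }


query = build_query(
    "my-load-balancer",  # placeholder name
    "SurgeQueueLength",
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("cloudwatch").get_metric_statistics(**query)`. The sketch stops short of making that call so it runs without an AWS account.
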
Have an idea of what each metric does
Important metrics to note are SurgeQueueLength & SpilloverCount
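
One way to read these two metrics together: a growing SurgeQueueLength means requests are waiting for a registered instance, and any SpilloverCount means requests were already rejected. A minimal sketch of that reasoning. The function name `check_saturation` and the warning threshold of 512 are arbitrary illustrative choices, not AWS recommendations:

```python
def check_saturation(surge_queue_max, spillover_sum, queue_warn_threshold=512):
    """Classify one monitoring window from the two queue metrics.

    surge_queue_max: Max of SurgeQueueLength over the window
    spillover_sum:   Sum of SpilloverCount over the window
    """
    if spillover_sum > 0:
        # Requests were rejected because the queue was full.
        return "critical"
    if surge_queue_max >= queue_warn_threshold:
        # Requests are queuing; the back end is falling behind.
        return "warning"
    return "ok"


status = check_saturation(surge_queue_max=600, spillover_sum=0)
```
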
