Amazon CloudWatch recently gained log file monitoring and storage for application, operating system, and custom logs. AWS has since also enhanced its support for Microsoft Windows Server to cover a wider variety of log sources.
Amazon CloudWatch is a monitoring service for Amazon Web Services (AWS) cloud resources that offers "system-wide visibility into resource utilization, application performance, and operational health". It is readily available for most resource types such as EC2 instances, DynamoDB tables, and RDS DB instances and can also be used to collect and track custom application and service metrics. The metrics are retained for two weeks and can be observed in graphs and statistics.
Users can create alarms based on thresholds against these metrics to receive notifications or take automated actions when an alarm changes state. A noteworthy application of these capabilities is using Auto Scaling to scale EC2 capacity up or down dynamically based on user-defined conditions (see previous coverage for more details).
Amazon CloudWatch has now gained the ability to also monitor and store application, operating system, and custom log files. The retention period on "highly durable storage" is indefinite by default and can be reduced to a minimum of one day. Logs can be monitored for phrases, values, or patterns based on configurable filters. Any hits are surfaced as CloudWatch metrics, which in turn makes it possible to raise alarms based on log events. For example, an alarm might be based on specific literal terms (such as "NullReferenceException") or the number of occurrences of a literal term at a particular position in log data (such as "404" status codes in an Apache access log).
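The chain from log lines to a metric to an alarm can be illustrated with a minimal local sketch. It does not call CloudWatch; the function names, the sample log lines, and the "greater than or equal to" threshold rule are assumptions made for illustration only.

```python
def count_matches(log_lines, term):
    """Count the log lines that contain the literal term (the 'metric' value)."""
    return sum(1 for line in log_lines if term in line)


def alarm_state(metric_value, threshold):
    """Hypothetical alarm rule: ALARM once the metric reaches the threshold."""
    return "ALARM" if metric_value >= threshold else "OK"


# Made-up sample log lines, not taken from a real system.
sample_lines = [
    "2014-07-10 12:00:01 INFO request handled",
    "2014-07-10 12:00:02 ERROR System.NullReferenceException: Object reference not set",
    "2014-07-10 12:00:03 ERROR System.NullReferenceException: Object reference not set",
]

hits = count_matches(sample_lines, "NullReferenceException")
state = alarm_state(hits, threshold=2)
print(hits, state)
```

In the service itself the counting is done by a metric filter and the threshold check by a CloudWatch alarm; the sketch only mirrors the flow of data between the two.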
This initial focus on log monitoring with filters and alarms rather than searching is a notable difference from other Logging as a Service (LaaS) offerings, which mostly offered real-time searching first. While CloudWatch logs can be browsed on a stream-by-stream basis in the AWS Management Console, real-time searching in logs is in fact not available yet (though it is being considered by the AWS team).
Four concepts are central to understanding CloudWatch Logs monitoring:
- Log Events – a record of some activity captured by the application or resource being monitored
- Log Streams – sequence of log events that share the same source
- Log Groups – groups of log streams that share the same retention, monitoring, and access control settings
- Metric Filters – express how the service extracts metric observations from ingested events and transforms them into data points in a CloudWatch metric
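How the four concepts relate to each other can be sketched as a small data model. The class and field names below are illustrative assumptions, not the service's actual API types; the only facts taken from the text are that streams hold events, groups hold streams and share retention settings, and metric filters carry a name, pattern, metric name, namespace, and value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LogEvent:
    timestamp_ms: int  # when the activity was recorded
    message: str       # the raw log line


@dataclass
class LogStream:
    name: str  # one source, e.g. one instance's access log
    events: List[LogEvent] = field(default_factory=list)


@dataclass
class MetricFilter:
    name: str
    pattern: str
    metric_name: str
    metric_namespace: str
    metric_value: str


@dataclass
class LogGroup:
    name: str
    retention_days: Optional[int] = None  # None models "keep indefinitely"
    streams: List[LogStream] = field(default_factory=list)
    metric_filters: List[MetricFilter] = field(default_factory=list)


# Hypothetical example: one group holding a single stream with two events.
group = LogGroup(
    name="MyApp/access.log",
    streams=[
        LogStream(
            name="i-12345678",  # made-up stream name
            events=[LogEvent(1, "first line"), LogEvent(2, "second line")],
        )
    ],
)
print(group.retention_days, len(group.streams[0].events))
```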
Metric filters provide actionable insights into log events by searching for and matching terms, phrases, or values and counting each occurrence in a CloudWatch metric, which can in turn trigger alarms. They are made up of a metric name, a namespace, a value, and the filter pattern. Besides finding phrases or values by literal terms, it is also possible to extract values from space-delimited log events, such as the transferred bytes in an Apache HTTP log. Conditional operators and wildcards are available for string fields ("=, != and *") and numeric fields (">, <, >=, <=, =, and !=") to narrow down which events match during this value extraction.
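The operators listed above can be mimicked locally with a rough sketch. This is an approximation of the behaviour described in the text, not the service's actual matching engine: the `field_matches` helper and its condition syntax (`name`, operator, value) are assumptions, and wildcard matching is delegated to Python's `fnmatchcase`, which also treats `?` and `[...]` as special characters.

```python
import operator
import re
from fnmatch import fnmatchcase

NUMERIC_OPS = {
    ">": operator.gt, "<": operator.lt,
    ">=": operator.ge, "<=": operator.le,
    "=": operator.eq, "!=": operator.ne,
}

# Two-character operators come first so ">=" is not read as ">".
CONDITION_RE = re.compile(r"^(\w+)\s*(>=|<=|!=|=|>|<)\s*(.+)$")


def to_number(text):
    """Return text as a float, or None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def field_matches(condition, fields):
    """Evaluate one condition such as 'status_code=4*' against named fields."""
    parsed = CONDITION_RE.match(condition)
    if parsed is None:
        raise ValueError(f"unsupported condition: {condition!r}")
    name, op, expected = parsed.groups()
    actual = fields[name]

    actual_num, expected_num = to_number(actual), to_number(expected)
    if actual_num is not None and expected_num is not None:
        return NUMERIC_OPS[op](actual_num, expected_num)

    # String fields only support = and != (with * as a wildcard).
    if op == "=":
        return fnmatchcase(actual, expected)
    if op == "!=":
        return not fnmatchcase(actual, expected)
    return False


fields = {"status_code": "404", "size": "2326"}
print(field_matches("status_code=4*", fields), field_matches("size>=1000", fields))
```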
Filters can be created and validated for correctness in the CloudWatch console, or tested through the API via TestMetricFilter against a sample of up to 50 log event messages and submitted via PutMetricFilter. The following example uses the AWS CLI to create a filter that combines some of these features: it matches all 4xx HTTP status codes with the wildcard pattern "4*" and publishes the transferred bytes from an Apache log as the metric value through the variable "$size":
$ aws logs put-metric-filter \
    --log-group-name MyApp/access.log \
    --filter-name BytesTransferred \
    --filter-pattern '[ip, id, user, timestamp, request, status_code=4*, size]' \
    --metric-transformations \
    'metricName=BytesTransferred,metricNamespace=YourNamespace,metricValue=$size'
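To get a feel for which data points a filter like this would produce, the following sketch applies the same idea to a few made-up Apache access log lines locally. It is an approximation under stated assumptions: the tokenizer treats bracketed and quoted segments as single fields, the 4xx check is a simple prefix test, and the helper names and sample lines are hypothetical.

```python
import re

# Bracketed timestamps and quoted requests count as one field each.
TOKEN_RE = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')
FIELD_NAMES = ["ip", "id", "user", "timestamp", "request", "status_code", "size"]


def parse_line(line):
    """Split a log line into the seven named fields, or return None."""
    tokens = TOKEN_RE.findall(line)
    if len(tokens) != len(FIELD_NAMES):
        return None
    return dict(zip(FIELD_NAMES, tokens))


def bytes_for_4xx(lines):
    """Collect the size field of every event whose status code starts with 4."""
    values = []
    for line in lines:
        fields = parse_line(line)
        if fields is None or not fields["status_code"].startswith("4"):
            continue
        if fields["size"].isdigit():  # skip "-" placeholders
            values.append(int(fields["size"]))
    return values


# Made-up sample lines in Apache common log format.
sample = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
    '127.0.0.1 - frank [10/Oct/2000:13:55:37 -0700] "GET /missing.html HTTP/1.0" 404 512',
    '127.0.0.1 - - [10/Oct/2000:13:55:38 -0700] "GET /private HTTP/1.0" 403 199',
]

data_points = bytes_for_4xx(sample)
print(data_points, sum(data_points))
```

Only the two 4xx lines contribute, so each matching event yields one data point carrying its byte count rather than a plain occurrence count of 1.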
Log events can be actively ingested via the AWS CLI or through the API by means of the various AWS SDKs. Log events can also be passively ingested with the CloudWatch Logs agents for Linux and Windows, which are configurable to tail and process operating system, application, and custom log files. Agents send logs every five seconds by default.
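The effect of sending on a fixed interval can be pictured with a small sketch that groups timestamped lines into five-second batches. This is not the agent's implementation; the `batch_by_interval` function, its windowing rule, and the sample events are assumptions chosen purely to illustrate interval-based batching.

```python
def batch_by_interval(events, interval_s=5.0):
    """Group (timestamp_seconds, message) pairs, sorted by time, into batches.

    A new batch starts once an event arrives interval_s or more after the
    first event of the current batch.
    """
    batches = []
    current = []
    window_start = None
    for timestamp, message in events:
        if window_start is None:
            window_start = timestamp
        elif timestamp - window_start >= interval_s:
            batches.append(current)
            current = []
            window_start = timestamp
        current.append(message)
    if current:
        batches.append(current)
    return batches


# Made-up events: seconds since start, plus a placeholder message.
events = [(0.0, "a"), (1.5, "b"), (4.9, "c"), (5.0, "d"), (11.0, "e")]
batches = batch_by_interval(events)
print(batches)
```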
The Python-based Linux agent can be deployed through EC2 user data or direct command-line setup – a blog post provides an installation and configuration walkthrough. CloudWatch Logs is also integrated with Amazon's application management services AWS Elastic Beanstalk, AWS CloudFormation, and AWS OpsWorks.
The Windows agent is integrated into the EC2Config service and supports sending logs from sources such as Windows event logs, Performance Counters (PCW), Event Tracing (ETW) logs, IIS request logs, and custom log files, as well as exporting performance counters as CloudWatch metrics – a blog post provides a configuration walkthrough.
Amazon CloudWatch Logs is currently available in the us-east-1, us-west-2, and eu-west-1 regions, but logs can be ingested from other regions too. The CloudWatch documentation comprises a developer guide, dedicated API references for CloudWatch and CloudWatch Logs, a CloudWatch Logs agent reference, and a CloudWatch Logs section of the AWS CLI reference. Support is available via the Amazon CloudWatch forum. Pricing is usage-based and AWS offers a free tier for CloudWatch, which “many applications should be able to operate within”.