AWS Level 2 LAB 6 - Setting Up an EC2 Instance and CloudWatch Alarm. Possible bug?

Good day everyone,

I’ve been trying to complete this lab but for some reason, even though I’m following the official documentation from AWS, I’m unable to do so.

Lab description

The Nautilus DevOps team has been tasked with setting up an EC2 instance for their application. To ensure the application performs optimally, they also need to create a CloudWatch alarm to monitor the instance’s CPU utilization. The alarm should trigger if the CPU utilization exceeds 90% for one consecutive 5-minute period. To send notifications, use the SNS topic named datacenter-sns-topic which is already created.

  • Launch EC2 Instance: Create an EC2 instance named datacenter-ec2 using any appropriate Ubuntu AMI.
  • Create CloudWatch Alarm: Create a CloudWatch alarm named datacenter-alarm with the following specifications:
    • Statistic: Average
    • Metric: CPU Utilization
    • Threshold: >= 90% for 1 consecutive 5-minute period.
    • Alarm Actions: Send a notification to datacenter-sns-topic.

Initial Steps

I created an EC2 instance with an Ubuntu AMI.

  • Added a new key pair (datacenter-ec2-kp)
  • Added the default security group

Generated CLI for review

aws ec2 run-instances --image-id "ami-04b4f1a9cf54c11d0" --instance-type "t2.micro" --key-name "datacenter-ec2-kp" --block-device-mappings '{"DeviceName":"/dev/sda1","Ebs":{"Encrypted":false,"DeleteOnTermination":true,"Iops":3000,"SnapshotId":"snap-00cdccb3239896f89","VolumeSize":8,"VolumeType":"gp3","Throughput":125}}' --network-interfaces '{"AssociatePublicIpAddress":true,"DeviceIndex":0,"Groups":["sg-0df8afedb8dfeb7e0"]}' --credit-specification '{"CpuCredits":"standard"}' --tag-specifications '{"ResourceType":"instance","Tags":[{"Key":"Name","Value":"datacenter-ec2"}]}' --metadata-options '{"HttpEndpoint":"enabled","HttpPutResponseHopLimit":2,"HttpTokens":"required"}' --private-dns-name-options '{"HostnameType":"ip-name","EnableResourceNameDnsARecord":true,"EnableResourceNameDnsAAAARecord":false}' --count "1"

First attempt to set up the Alarm

Documentation Followed: “Create a CloudWatch alarm for an instance - Amazon Elastic Compute Cloud”

Current Output: Failed to Fetch

Desired Output: Alarm successfully created

Steps done:

  1. Select the instance > Actions, Monitor and troubleshoot > Manage CloudWatch alarms.

  2. Alarm Notification > datacenter-sns-topic

  3. Alarm Thresholds:

  4. Group Samples by: Average

  5. Metric: CPU Utilization

  6. Percent: 0.9

  7. Period: 5 minutes

  8. Consecutive Period: 1

  9. Alarm name: datacenter-alarm
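For comparison, the same alarm can be sketched with the AWS CLI, which does not depend on the console form loading. This is a rough sketch under stated assumptions, not the lab's official solution: the instance ID, the account ID and the us-east-1 region in the topic ARN are hypothetical placeholders, and the `put_alarm` helper is only an illustrative name. With `put-metric-alarm`, the threshold is given in the metric's own unit, and CPUUtilization is reported in percent, so 90 is used here.

```shell
# Sketch only: INSTANCE_ID and TOPIC_ARN are hypothetical placeholders,
# and the us-east-1 region in the ARN is an assumption.
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"
TOPIC_ARN="${TOPIC_ARN:-arn:aws:sns:us-east-1:123456789012:datacenter-sns-topic}"

# put_alarm [prefix...]: issues the put-metric-alarm call behind an optional
# command prefix, so `put_alarm echo` prints the command instead of running it.
put_alarm() {
  "$@" aws cloudwatch put-metric-alarm \
    --alarm-name datacenter-alarm \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=${INSTANCE_ID}" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 90 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "${TOPIC_ARN}"
}

# Dry run: print the command. Call `put_alarm` with no prefix to execute it.
put_alarm echo
```

The period of 300 seconds with one evaluation period corresponds to "1 consecutive 5-minute period" in the lab description.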

Second Attempt

At this point I surmised that something was off with the lab setup, so I looked into the prerequisites for creating a CloudWatch alarm for the instance.

Is the CloudWatch agent installed in the instance?

Prerequisites:

ssh into the instance:

$ touch datacenter-ec2-kp.pem

  • I pasted the private key that I downloaded when creating the instance's key pair into datacenter-ec2-kp.pem

$ chmod 400 "datacenter-ec2-kp.pem"

$ ssh -i "datacenter-ec2-kp.pem" [email protected],com

Is the SSM Agent installed on the instance?

$ sudo systemctl status snap.amazon-ssm-agent.amazon-ssm-agent.service

  • Running with errors

$ sudo cat /var/log/amazon/ssm/errors.log

2025-01-16 09:46:12.9001 ERROR [RemoteRetrieve @ ec2_role_provider.go.144] EC2RoleProvider Failed to connect to Systems Manager with SSM role credentials. error calling RequestManagedInstanceRoleToken: AccessDeniedException: Systems Manager's instance management role is not configured for account: 590183746102

status code: 400, request id: ad5d066b-8632-4bcd-a513-6c7bf1c36ef0

2025-01-16 09:46:12.9001 ERROR [minLog @ credentialrefresher.go.280] [CredentialRefresher] Retrieve credentials produced error: no valid credentials could be retrieved for ec2 identity. Default Host Management Err: error calling RequestManagedInstanceRoleToken: AccessDeniedException: Systems Manager's instance management role is not configured for account: 590183746102

status code: 400, request id: ad5d066b-8632-4bcd-a513-6c7bf1c36ef0

2025-01-16 10:15:55.0275 ERROR [RemoteRetrieve @ ec2_role_provider.go.144] EC2RoleProvider Failed to connect to Systems Manager with SSM role credentials. error calling RequestManagedInstanceRoleToken: AccessDeniedException: Systems Manager's instance management role is not configured for account: 590183746102

status code: 400, request id: 045d3096-6fb8-4c5a-86b7-3df376e3b9f8

2025-01-16 10:15:55.0275 ERROR [minLog @ credentialrefresher.go.280] [CredentialRefresher] Retrieve credentials produced error: no valid credentials could be retrieved for ec2 identity. Default Host Management Err: error calling RequestManagedInstanceRoleToken: AccessDeniedException: Systems Manager's instance management role is not configured for account: 590183746102

status code: 400, request id: 045d3096-6fb8-4c5a-86b7-3df376e3b9f8

In my opinion, this points to an IAM role issue.

Does the instance have the role with the policies CloudWatchAgentServerPolicy and AmazonSSMManagedInstanceCore attached?

  • No, the instance has no IAM role attached.
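The same check can be sketched from the AWS CLI by querying the instance profile attached to the instance. This is only a sketch: the instance ID is a hypothetical placeholder and `show_instance_profile` is an illustrative helper name, not part of the lab.

```shell
# Sketch only: the instance ID used below is a hypothetical placeholder.
# show_instance_profile INSTANCE_ID [prefix...]: prints the ARN of the
# instance profile attached to the instance, if there is one.
show_instance_profile() {
  profile_instance_id="$1"
  shift
  "$@" aws ec2 describe-instances \
    --instance-ids "$profile_instance_id" \
    --query 'Reservations[].Instances[].IamInstanceProfile.Arn' \
    --output text
}

# Dry run: print the command. Drop the trailing `echo` to execute it.
show_instance_profile i-0123456789abcdef0 echo
```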

Eventually, I tried to create an IAM role manually, but it failed.

On top of that, even though the role appears in the IAM role list, it does not show up when selecting a role in the instance options.

All in all,

Current Output: Role is not correctly created due to insufficient permissions and does not show up in the dropdown button within the “Modify IAM Role” of the instance.

Desired Output: IAM Role should be correctly created and attached to the instance.

Can you help me shed some light on how to solve this issue?

Kind Regards.

Additional documentation links I’ve used:

  • Checking SSM Agent status and starting the agent - AWS Systems Manager
  • Create IAM roles and users for use with the CloudWatch agent - Amazon CloudWatch
  • Install and run the CloudWatch agent on your servers - Amazon CloudWatch

Hi,

You don’t need to install the CloudWatch agent for this task. By default, Amazon CloudWatch automatically collects EC2 metrics every five minutes. This includes information about the health, performance, and status of your EC2 instances, such as:

  • CPU utilization: The percentage of CPU capacity being used
  • Disk read/write: The read and write operations and bytes on instance store volumes (disk space usage is not included)
  • Network metrics: The amount of data being sent and received over the network

All you need to do is create an EC2 instance with Ubuntu, then go to CloudWatch and create an alarm.
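To confirm that the default metric is available without any agent, a call along these lines can list it for the instance. It is a minimal sketch: the instance ID is a hypothetical placeholder, `list_cpu_metric` is an illustrative helper name, and a freshly launched instance may need several minutes before its metrics appear.

```shell
# Sketch only: the instance ID used below is a hypothetical placeholder.
# list_cpu_metric INSTANCE_ID [prefix...]: lists the CPUUtilization metric
# that CloudWatch records for the instance under the AWS/EC2 namespace.
list_cpu_metric() {
  metric_instance_id="$1"
  shift
  "$@" aws cloudwatch list-metrics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=${metric_instance_id}"
}

# Dry run: print the command. Drop the trailing `echo` to execute it.
list_cpu_metric i-0123456789abcdef0 echo
```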

Thanks for your help @raymond.baoly

I managed to complete the lab and it was my fault all along.

I have a custom DNS server provider configured in my Chrome browser (to avoid seeing ads) and it was interfering with the lab.

This post gave me the clue: “EC2: basic monitoring unavailable (load failed)” on AWS re:Post.

Have a nice day ahead!
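For anyone who hits the same "Failed to Fetch" error, a rough way to test for the DNS interference described above is to look up the regional CloudWatch API hostname, optionally through the suspect resolver. This is a sketch under several assumptions: the us-east-1 region is a guess, `cloudwatch_endpoint` and `check_dns` are illustrative helper names, the console may call other hostnames as well, and a DNS provider set inside the browser is not necessarily the one the system resolver uses.

```shell
# cloudwatch_endpoint REGION: prints the regional CloudWatch API hostname.
cloudwatch_endpoint() {
  printf 'monitoring.%s.amazonaws.com\n' "$1"
}

# check_dns REGION [RESOLVER]: looks up the endpoint with nslookup, optionally
# through a specific DNS server (for example, an ad-blocking resolver).
check_dns() {
  if [ -n "${2:-}" ]; then
    nslookup "$(cloudwatch_endpoint "$1")" "$2"
  else
    nslookup "$(cloudwatch_endpoint "$1")"
  fi
}

# The region is an assumption; prints monitoring.us-east-1.amazonaws.com
cloudwatch_endpoint us-east-1
```

Running `check_dns us-east-1` once with the default resolver and once with the custom resolver's address as the second argument shows whether the two give different answers.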