Nova Home Care — Infrastructure Docs

nova-infra/hipaa-compliance-aws-infra

EC2 Auto-Recovery

Overview

All 4 EC2 instances have CloudWatch alarms that trigger automatic recovery when the underlying hardware fails.

How It Works

CloudWatch monitors the StatusCheckFailed_System metric for each instance.
If the check fails 2 consecutive times within 2 minutes, the alarm triggers.
AWS migrates the instance to healthy hardware.
The instance retains its instance ID, private IP, EBS volumes, and Elastic IP.
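
The steps above can be sketched as a CloudWatch alarm definition. This is a minimal illustration, not the configuration actually deployed: the instance ID and region are hypothetical, and mapping "2 consecutive failures within 2 minutes" to `Period=60` with `EvaluationPeriods=2` is an assumption. The helper names `build_recovery_alarm` and `create_recovery_alarm` are made up for this example; `put_metric_alarm` is the boto3 CloudWatch call they feed.

```python
from typing import Any


def build_recovery_alarm(instance_name: str, instance_id: str, region: str) -> dict[str, Any]:
    """Build put_metric_alarm keyword arguments for an EC2 auto-recovery alarm."""
    return {
        "AlarmName": f"{instance_name}-auto-recover",
        "AlarmDescription": f"Recover {instance_name} when the system status check fails",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # assumed: one-minute evaluation window
        "EvaluationPeriods": 2,  # assumed: two failing periods in a row, about 2 minutes
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # The built-in EC2 recover action; no custom automation is needed.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


def create_recovery_alarm(instance_name: str, instance_id: str, region: str) -> None:
    """Create or update the alarm in CloudWatch (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**build_recovery_alarm(instance_name, instance_id, region))
```

Calling `create_recovery_alarm("prod-nhc-django", "i-0123456789abcdef0", "us-east-1")` with a real instance ID and region would create or overwrite the alarm with that name. Keeping the parameter builder separate from the API call makes the definition easy to review and test without touching AWS.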

Covered Instances

Instance	Alarm Name	Status
`prod-nhc-django`	`prod-nhc-django-auto-recover`	✅ Active
`prod-nhc-app`	`prod-nhc-app-auto-recover`	✅ Active
`prod-nhc-foursites`	`prod-nhc-foursites-auto-recover`	✅ Active
`prod-nhc-gitlab-runner`	`prod-nhc-gitlab-runner-auto-recover`	✅ Active
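
One way to keep the table above honest is to compare it against what CloudWatch reports. The sketch below assumes that "Active" means the alarm exists, has actions enabled, and carries an EC2 recover action; that reading is a guess, not something this page defines. `find_unhealthy_alarms` is a hypothetical helper that takes the dictionary returned by the boto3 `describe_alarms` call.

```python
INSTANCES = [
    "prod-nhc-django",
    "prod-nhc-app",
    "prod-nhc-foursites",
    "prod-nhc-gitlab-runner",
]


def expected_alarm_names(instances: list[str]) -> list[str]:
    """Derive alarm names using the <instance>-auto-recover convention from the table."""
    return [f"{name}-auto-recover" for name in instances]


def find_unhealthy_alarms(expected: list[str], response: dict) -> dict[str, str]:
    """Return {alarm_name: problem} for alarms that are missing or not armed."""
    found = {alarm["AlarmName"]: alarm for alarm in response.get("MetricAlarms", [])}
    problems: dict[str, str] = {}
    for name in expected:
        alarm = found.get(name)
        if alarm is None:
            problems[name] = "missing"
        elif not alarm.get("ActionsEnabled", False):
            problems[name] = "actions disabled"
        elif not any(a.endswith(":ec2:recover") for a in alarm.get("AlarmActions", [])):
            problems[name] = "no recover action"
    return problems
```

In practice the response would come from `boto3.client("cloudwatch").describe_alarms(AlarmNames=expected_alarm_names(INSTANCES))`; an empty result from `find_unhealthy_alarms` would mean every listed alarm is in place under the assumptions above.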

Limitations

Only recovers from system status check failures (underlying hardware)
Does not recover from instance status check failures (OS-level issues)
Instance must use EBS-backed storage (all our instances do)
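
To make the difference between the two kinds of status check concrete, the sketch below classifies one entry from the `InstanceStatuses` list returned by the boto3 EC2 `describe_instance_status` call. The function name `recovery_coverage` and its return labels are hypothetical; the `SystemStatus` and `InstanceStatus` fields and the `impaired` value come from that API's response.

```python
def recovery_coverage(status: dict) -> str:
    """Say whether a failing status check is one the auto-recovery alarm handles."""
    system = status["SystemStatus"]["Status"]
    instance = status["InstanceStatus"]["Status"]
    if system == "impaired":
        # Underlying hardware problem: this is what the recovery alarm reacts to.
        return "covered"
    if instance == "impaired":
        # OS-level problem: the recovery alarm does not fire for this.
        return "not covered"
    return "no failure"


# Example entry shaped like one element of describe_instance_status()["InstanceStatuses"].
example = {
    "InstanceId": "i-0123456789abcdef0",  # hypothetical instance ID
    "SystemStatus": {"Status": "ok"},
    "InstanceStatus": {"Status": "impaired"},
}
print(recovery_coverage(example))
```

An instance that reports `not covered` needs a separate response, such as manual investigation, because the alarms described on this page will stay quiet.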