Building a HIPAA Compliant Analytics Platform on AWS
Tim O'Guin
Introduction
- Who am I?
- What is Juice?
Me
- Husband and Father
- Director of Platform Engineering for Juice
- Obsessed with automation
- Permaculturist
- Interested in using systems to improve our lives and mental health
The feeling I seek...
The feeling I find...
Juice Analytics
- Founded in 2005
- Creators of Juicebox and other atrocities
- We help companies build data products
Our Role
- Analytics platform what?
- What does it mean to be a 3rd party?
- What are our responsibilities?
- What is our tooling?
Analytics Platform
- Acquire data from client
- Load it into Redshift
- Serve application to users
Business Associate
A business associate is an independent contractor or agent of a covered
entity that receives or obtains protected health information (“PHI”) in
connection with the services it provides for the covered entity.
Tooling
- Terraform for AWS Resources
- Credstash for Managing Secrets
- Salt for Config Mgmt & Orchestration
- Django and Flask for Apps
- AWS ECS for non-PHI Services
- AWS Lambda
Normal HIPAA Stuff
What is it?
The Basics
- Encrypt everything
- Log all access to PHI data
- Have change control processes
- Store audit data for 6 years
- Don't share accounts
- Have password policies
- Do backups
- Have a DR plan in place
Our Must Haves
- Infrastructure as Data
- Require MFA for AWS access
- Use separate AWS accounts per environment
- Use one central AWS account for logins and assume roles when accessing other accounts
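As a rough illustration of the last two points, the role in each environment account can carry a trust policy that only lets the central login account assume it, and only when the caller signed in with MFA. This is a minimal sketch, not the policy Juice actually uses; the function name `mfa_trust_policy` and the account ID are hypothetical.

```python
import json


def mfa_trust_policy(login_account_id: str) -> dict:
    """Trust policy for a role that lives in an environment account.

    Principals from the central login account may assume the role,
    but only if they authenticated with MFA.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Hypothetical central "logins" account.
                "Principal": {"AWS": f"arn:aws:iam::{login_account_id}:root"},
                "Action": "sts:AssumeRole",
                # Reject callers whose session was not established with MFA.
                "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID, for illustration only.
    print(json.dumps(mfa_trust_policy("111111111111"), indent=2))
```

The resulting JSON could be supplied as the assume-role policy of a role managed in Terraform.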
HIPAA on AWS
The skinny...
Basics
- Sign a BAA with Amazon
- Use specific services that are compliant
- Follow whitepaper guidelines
https://aws.amazon.com/compliance/hipaa-compliance/
http://aws.amazon.com/compliance/aws-whitepapers/
Services
- EC2
- EBS
- Redshift
- S3
- Glacier
- RDS for MySQL
- RDS for Oracle
- ELB
- EMR
- DynamoDB
Overview
- Use KMS with everything that supports it
- Enable logging for all services that support it
- Enable CloudTrail logging to track API calls
- Use IAM roles everywhere you can
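One way to back up "use KMS with everything" on the S3 side is a bucket policy that refuses uploads not requesting KMS server-side encryption and refuses plain HTTP. The sketch below only builds the policy document; the function name and bucket name are assumptions made for illustration, not something from the talk.

```python
import json


def require_kms_bucket_policy(bucket: str) -> dict:
    """Bucket policy that denies non-KMS uploads and non-TLS requests."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Matches when the encryption header is missing or is
                # anything other than aws:kms.
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name.
    print(json.dumps(require_kms_bucket_policy("example-phi-bucket"), indent=2))
```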
Differences Per Service
- Everything... Use encryption
- ELB - Don't terminate SSL here; pass TCP through and terminate on the instances
- EC2 - Dedicated tenancy ($$$), encrypted EBS
- Redshift - Enable user and connection logging
- S3 - Enable bucket logging
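The Redshift and S3 logging items could be switched on roughly as follows. This is a sketch under assumptions: the clients are passed in (in practice `boto3.client("redshift")` and `boto3.client("s3")`), and the cluster, parameter group, and bucket names are placeholders. In a Terraform-managed setup the same settings would more likely live in the resource definitions.

```python
def enable_audit_logging(redshift, s3, cluster_id, param_group,
                         data_bucket, log_bucket):
    """Enable Redshift audit logging and S3 access logging.

    `redshift` and `s3` are boto3-style clients supplied by the caller.
    """
    # Ship Redshift audit logs (connection and user logs) to S3.
    redshift.enable_logging(
        ClusterIdentifier=cluster_id,
        BucketName=log_bucket,
        S3KeyPrefix=f"redshift/{cluster_id}/",
    )
    # The user activity log additionally needs this cluster parameter.
    redshift.modify_cluster_parameter_group(
        ParameterGroupName=param_group,
        Parameters=[
            {
                "ParameterName": "enable_user_activity_logging",
                "ParameterValue": "true",
            }
        ],
    )
    # Record every request made against the data bucket.
    s3.put_bucket_logging(
        Bucket=data_bucket,
        BucketLoggingStatus={
            "LoggingEnabled": {
                "TargetBucket": log_bucket,
                "TargetPrefix": f"s3/{data_bucket}/",
            }
        },
    )
```

The log bucket must already allow Redshift and S3 log delivery to write to it; that setup is not shown here.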
But I want to use other services to process PHI...
No.
The Pipeline
Data > End User
API for Data Uploads
- No AWS account for clients
- CLI app paired with API
- Login with user account and retrieve token
- Retrieve scoped credentials from service
- Upload data directly from client to S3
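The "scoped credentials" step might look something like the sketch below: the API assumes an upload role with an inline session policy that narrows it to a single client's prefix, then hands the temporary credentials to the CLI. The talk does not say which STS call the service uses, so `assume_role`, the `uploads/<client_id>/` key layout, and every name here are assumptions.

```python
import json


def upload_policy(bucket: str, client_id: str) -> dict:
    """Session policy allowing writes only under one client's prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:PutObject",
                # Hypothetical key layout: uploads/<client_id>/...
                "Resource": f"arn:aws:s3:::{bucket}/uploads/{client_id}/*",
            }
        ],
    }


def issue_upload_credentials(sts, role_arn, bucket, client_id,
                             ttl_seconds=900):
    """Return short-lived credentials scoped to one client's uploads.

    `sts` is a boto3-style STS client supplied by the caller.
    """
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=f"upload-{client_id}",
        # The effective permissions are the intersection of the role's
        # own policy and this session policy.
        Policy=json.dumps(upload_policy(bucket, client_id)),
        DurationSeconds=ttl_seconds,  # 900 is the minimum STS accepts
    )
    return response["Credentials"]
```

The CLI can then upload straight to S3 with those credentials, so clients never need an AWS account of their own.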
Loading the Data
- File movers
- ETL jobs on EC2, not ECS
- Luigi-based ETL library
- Hands off!
Serving the Apps
- Dedicated tenancy
- Password policies
- Access logging
- Browser session expiration
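For the Django apps, password policies and session expiration can be expressed as settings along these lines. The setting names are standard Django; the specific values (15-minute idle timeout, 12-character minimum) are illustrative assumptions, not Juice's actual policy.

```python
# settings.py fragment (values are illustrative assumptions)

# Expire sessions after 15 minutes of inactivity.
SESSION_COOKIE_AGE = 15 * 60
# Refresh the expiry on every request so the timeout tracks idle time.
SESSION_SAVE_EVERY_REQUEST = True
# Drop the session when the browser is closed.
SESSION_EXPIRE_AT_BROWSER_CLOSE = True
# Only send session and CSRF cookies over HTTPS.
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True

# Password policy enforced by Django's built-in validators.
AUTH_PASSWORD_VALIDATORS = [
    {
        "NAME": "django.contrib.auth.password_validation.MinimumLengthValidator",
        "OPTIONS": {"min_length": 12},
    },
    {"NAME": "django.contrib.auth.password_validation.CommonPasswordValidator"},
    {"NAME": "django.contrib.auth.password_validation.NumericPasswordValidator"},
]
```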
The Difficulties
Where'd we struggle?
The Difficulties
- IAM is powerful and complex
- Cross-account policies and roles
- KMS policies
- AWS changing things out from under us
- Learning Terraform and Terraform bugs
Questions?
Email: tim.oguin@ohollowfarms.com
Twitter: @timoguin
Instagram: ohollowtech
Snapchat: timoguin
Slides: http://slides.com/timoguin/devopsdaysbna2016
DevOpsDays Nashville 2016: HIPAA on AWS
By Tim O'Guin