Building a HIPAA Compliant Analytics Platform on AWS

Tim O'Guin

Introduction

  • Who am I?
  • What is Juice?

Me

  • Husband and Father
  • Director of Platform Engineering for Juice
  • Obsessed with automation
  • Permaculturist
  • Interested in using systems to improve our lives and mental health

The feeling I seek...

The feeling I find...

Juice Analytics

  • Founded in 2005
  • Creators of Juicebox and other atrocities
  • We help companies build data products

Our Role

  • Analytics platform what?
  • What does it mean to be a 3rd party?
  • What are our responsibilities?
  • What is our tooling?

Analytics Platform

  • Acquire data from client
  • Load it into Redshift
  • Serve application to users

Business Associate

A business associate is an independent contractor or agent of a covered
entity that receives or obtains protected health information (“PHI”) in
connection with the services it provides for the covered entity.

Tooling

  • Terraform for AWS Resources
  • Credstash for Managing Secrets
  • Salt for Config Mgmt & Orchestration
  • Django and Flask for Apps
  • AWS ECS for non-PHI Services
  • AWS Lambda

Normal HIPAA Stuff

What is it?

The Basics

  • Encrypt everything
  • Log all access to PHI data
  • Have change control processes
  • Store audit data for 6 years
  • Don't share accounts
  • Have password policies
  • Do backups
  • Have a DR plan in place
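
One way to act on the six-year retention bullet is an S3 lifecycle rule on the bucket that holds audit logs. The sketch below only builds the configuration document; the `audit/` prefix, the 90-day Glacier transition, and the rule ID are assumptions for illustration, not details from the talk.

```python
import json

# Six years, rounded up so leap years never shorten the retention window.
AUDIT_RETENTION_DAYS = 6 * 366


def audit_log_lifecycle(prefix="audit/", glacier_after_days=90):
    """Build an S3 lifecycle configuration for audit logs.

    Objects under `prefix` move to Glacier after `glacier_after_days`
    and expire only after the full retention period has passed.
    """
    if glacier_after_days >= AUDIT_RETENTION_DAYS:
        raise ValueError("transition must happen before expiration")
    return {
        "Rules": [
            {
                "ID": "retain-audit-logs-six-years",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": AUDIT_RETENTION_DAYS},
            }
        ]
    }


print(json.dumps(audit_log_lifecycle(), indent=2))
```

The resulting dictionary has the shape boto3's `put_bucket_lifecycle_configuration` expects for its `LifecycleConfiguration` argument.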

Our Must Haves

  • Infrastructure as Data
  • Require MFA for AWS access
  • Use separate AWS accounts per environment
  • Use one central AWS account for logins and assume roles when accessing other accounts
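
The central-login pattern and the MFA requirement can be combined in the trust policy of each role in the environment accounts. This is a minimal sketch, assuming a hypothetical login account ID; it is not the policy Juice actually uses.

```python
import json


def mfa_trust_policy(login_account_id):
    """Trust policy for a role in an environment account.

    Only principals from the central login account may assume the role,
    and only when their session was authenticated with MFA.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::%s:root" % login_account_id
                },
                "Action": "sts:AssumeRole",
                "Condition": {
                    "Bool": {"aws:MultiFactorAuthPresent": "true"}
                },
            }
        ],
    }


# "111111111111" is a placeholder account ID, not a real one.
print(json.dumps(mfa_trust_policy("111111111111"), indent=2))
```

Users in the login account still need their own IAM permission to call `sts:AssumeRole` on the target role; the trust policy only covers the receiving side.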

HIPAA on AWS

The skinny...

Basics

  • Sign a BAA with Amazon
  • Use specific services that are compliant
  • Follow whitepaper guidelines

https://aws.amazon.com/compliance/hipaa-compliance/

http://aws.amazon.com/compliance/aws-whitepapers/

Services

  • EC2
  • EBS
  • Redshift
  • S3
  • Glacier
  • RDS for MySQL
  • RDS for Oracle
  • ELB
  • EMR
  • DynamoDB
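
A list like this is easy to turn into a guard in a design review or CI check. The sketch below treats the slide's list as the allowlist as of the talk; AWS revises its eligible services over time, so the set should be refreshed from the AWS compliance page rather than trusted as written. The function name is hypothetical.

```python
# Services named on the slide as eligible for PHI at the time of the talk.
ELIGIBLE_SERVICES = frozenset({
    "EC2", "EBS", "Redshift", "S3", "Glacier",
    "RDS for MySQL", "RDS for Oracle", "ELB", "EMR", "DynamoDB",
})


def ineligible_services(planned):
    """Return the planned services that must not touch PHI, sorted."""
    return sorted(s for s in planned if s not in ELIGIBLE_SERVICES)


# A hypothetical PHI pipeline design that leans on two ineligible services.
print(ineligible_services(["S3", "Lambda", "Redshift", "ECS"]))
# prints ['ECS', 'Lambda']
```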

Overview

  • Use KMS with everything that supports it
  • Enable logging for all services that support it
  • Enable CloudTrail logging to track API calls
  • Use IAM roles everywhere you can
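
For S3, "use KMS with everything" can be enforced rather than hoped for. The sketch below builds a bucket policy that rejects uploads without KMS server-side encryption and rejects any request made over plain HTTP. The bucket name is a placeholder, and this is one common pattern rather than the policy from the talk.

```python
import json


def phi_bucket_policy(bucket):
    """Bucket policy that denies unencrypted uploads and non-TLS access."""
    bucket_arn = "arn:aws:s3:::%s" % bucket
    object_arn = bucket_arn + "/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject PutObject calls that do not request SSE-KMS.
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": object_arn,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {
                # Reject every request that did not arrive over TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, object_arn],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


# "example-phi-bucket" is a hypothetical bucket name.
print(json.dumps(phi_bucket_policy("example-phi-bucket"), indent=2))
```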

Differences Per Service

  • Everything... Use encryption
  • ELB - No terminating SSL here; pass TCP through and terminate on the instances.
  • EC2 - Dedicated tenancy ($$$), encrypted EBS
  • Redshift - Enable user and connection logging
  • S3 - Enable bucket logging

But I want to use other services to process PHI...

No.

The Pipeline

Data > End User

API for Data Uploads

  • No AWS account for clients
  • CLI app paired with API
  • Login with user account and retrieve token
  • Retrieve scoped credentials from service
  • Upload data directly from client to S3
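
The "scoped credentials" step could look roughly like the following on the service side: build a policy limited to one client's upload prefix, then hand it to STS as a session policy. The `uploads/<client_id>/` key layout, the function names, and the use of `assume_role` are assumptions for illustration; the talk does not say how the real service issues credentials.

```python
import json


def upload_policy(bucket, client_id):
    """Session policy allowing uploads only under one client's prefix."""
    if not client_id or "/" in client_id or "*" in client_id:
        raise ValueError("client_id must be a plain identifier")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": "arn:aws:s3:::%s/uploads/%s/*" % (bucket, client_id),
            }
        ],
    }


def scoped_credentials(role_arn, bucket, client_id, duration_seconds=900):
    """Exchange the service's role for short-lived, prefix-scoped credentials.

    Requires boto3 and AWS credentials, so it is imported lazily and is not
    exercised by the demo call below.
    """
    import boto3

    response = boto3.client("sts").assume_role(
        RoleArn=role_arn,
        RoleSessionName="upload-%s" % client_id,
        Policy=json.dumps(upload_policy(bucket, client_id)),
        DurationSeconds=duration_seconds,
    )
    return response["Credentials"]


# Hypothetical bucket and client names.
print(json.dumps(upload_policy("example-phi-uploads", "acme"), indent=2))
```

Because a session policy can only narrow what the role already allows, the client ends up able to write into its own prefix and nothing else, without ever holding an AWS account.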

Loading the Data

  • File movers
  • ETL jobs on EC2, not ECS
  • Luigi-based ETL library
  • Hands off!

Serving the Apps

  • Dedicated tenancy
  • Password policies
  • Access logging
  • Browser session expiration
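
Since the apps are Django and Flask, the password and session bullets map onto a handful of settings. The fragment below is a sketch of a Django `settings.py` excerpt; the 15-minute timeout and 12-character minimum are assumed values, not figures from the talk or from HIPAA itself.

```python
# Excerpt for a Django settings module (Django 1.9 or later).

# Expire idle sessions after 15 minutes (assumed value).
SESSION_COOKIE_AGE = 15 * 60
# Refresh the expiry on every request so the timeout tracks inactivity.
SESSION_SAVE_EVERY_REQUEST = True
# Drop the session when the browser closes.
SESSION_EXPIRE_AT_BROWSER_CLOSE = True
# Only send session and CSRF cookies over HTTPS.
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True

# Password policy enforced by Django's built-in validators.
AUTH_PASSWORD_VALIDATORS = [
    {
        "NAME": "django.contrib.auth.password_validation.MinimumLengthValidator",
        "OPTIONS": {"min_length": 12},  # assumed minimum
    },
    {"NAME": "django.contrib.auth.password_validation.CommonPasswordValidator"},
    {"NAME": "django.contrib.auth.password_validation.NumericPasswordValidator"},
    {"NAME": "django.contrib.auth.password_validation.UserAttributeSimilarityValidator"},
]
```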

The Difficulties

Where'd we struggle?

The Difficulties

  • IAM is powerful and complex
  • Cross-account policies and roles
  • KMS policies
  • AWS changing things out from under us
  • Learning Terraform and Terraform bugs
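
Cross-account KMS access is a good example of why these bullets hurt: permission has to be granted on both sides. The sketch below builds only the key-policy half, assuming a hypothetical key-owning account and a consumer role in another account.

```python
import json


def kms_key_policy(key_account_id, consumer_role_arn):
    """Key policy letting a role in another account use, not manage, a key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets IAM policies in the key's own account govern the key.
                "Sid": "KeyAccountAdmin",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::%s:root" % key_account_id},
                "Action": "kms:*",
                "Resource": "*",
            },
            {
                # Usage-only permissions for the cross-account role.
                "Sid": "CrossAccountUse",
                "Effect": "Allow",
                "Principal": {"AWS": consumer_role_arn},
                "Action": ["kms:Decrypt", "kms:GenerateDataKey*", "kms:DescribeKey"],
                "Resource": "*",
            },
        ],
    }


# Placeholder account ID and role name.
print(json.dumps(
    kms_key_policy("111111111111", "arn:aws:iam::222222222222:role/etl-worker"),
    indent=2,
))
```

The consumer role also needs an IAM policy in its own account that allows the same actions on the key's ARN. Forgetting either half produces the same access-denied error, which is much of what makes this hard to debug.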

Questions?

Email: tim.oguin@ohollowfarms.com

Twitter: @timoguin

Instagram: ohollowtech

Snapchat: timoguin

Slides: http://slides.com/timoguin/devopsdaysbna2016

DevOpsDays Nashville 2016: HIPAA on AWS

By Tim O'Guin
