Merge pull request #19 from franco-caylent/feature/completeExample
CA-52 Complete example
franco-caylent authored Sep 9, 2021
2 parents 9b66427 + 3cecf35 commit f35ceea
Showing 29 changed files with 715 additions and 99 deletions.
5 changes: 5 additions & 0 deletions examples/complete/.gitignore
@@ -0,0 +1,5 @@
terraform.tfstate*

# Ignore any module-generated files
/rendered-config.yml
/*.yml
54 changes: 54 additions & 0 deletions examples/complete/README.md
@@ -0,0 +1,54 @@
This example demonstrates a Terraform-generated Tamr config for a full AWS scale-out environment set up for static Spark clusters. The environment consists of:
- static EMR deployment running both HBase and Spark
- data bucket and logs bucket shared by both HBase and Spark
- newly-generated EC2 key pair (used by both Tamr VM and EMR EC2 instances)
- Elasticsearch domain
- RDS Postgres instance
- Tamr VM deployment
- VPC with 4 subnets according to reference network architecture
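
A minimal sketch of a `terraform.tfvars` file for this example, assuming the variable names listed under Inputs. All values are hypothetical placeholders (the availability zones and the CIDR range are illustrative, not taken from this repository), so substitute your own before applying:

```hcl
# terraform.tfvars (hypothetical values; replace with your own)

# The only required input
license_key = "REPLACE_WITH_TAMR_LICENSE_KEY"

# Prefix added to the names of all created resources (this is the documented default)
name_prefix = "tamr-config-test"

# Must contain exactly 2 availability zones; these zone names are assumptions
availability_zones = ["us-east-1a", "us-east-1b"]

# CIDR blocks allowed to reach the Elasticsearch domain, Tamr VM, and Postgres
# instance, e.g. a VPN range. 192.0.2.0/24 is a documentation-only range.
ingress_cidr_blocks = ["192.0.2.0/24"]
```

Every other input has a default and can be left unset for a first run.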

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

No requirements.

## Providers

| Name | Version |
|------|---------|
| aws | n/a |
| random | n/a |
| tls | n/a |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| license\_key | Tamr license key | `string` | n/a | yes |
| ami\_id | AMI to use for Tamr EC2 instance | `string` | `""` | no |
| application\_subnet\_cidr\_block | CIDR Block for the application subnet | `string` | `"10.0.0.0/24"` | no |
| availability\_zones | The list of availability zones where we should deploy resources. Must be exactly 2 | `list(string)` | `[]` | no |
| compute\_subnet\_cidr\_block | CIDR Block for the compute subnet | `string` | `"10.0.1.0/24"` | no |
| data\_subnet\_cidr\_blocks | List of CIDR blocks for the data subnets | `list(string)` | <pre>[<br> "10.0.2.0/24",<br> "10.0.3.0/24"<br>]</pre> | no |
| egress\_cidr\_blocks | List of CIDR blocks to which egress is allowed | `list(string)` | <pre>[<br> "0.0.0.0/0"<br>]</pre> | no |
| emr\_abac\_valid\_tags | Valid tags for maintaining resources when using ABAC IAM Policies with Tag Conditions. Make sure `emr_tags` contain the values specified here and that your Subnet is tagged as well | `map(list(string))` | `{}` | no |
| emr\_tags | Map of tags to add to EMR resources. They must contain abac\_valid\_tags at minimum | `map(string)` | `{}` | no |
| ingress\_cidr\_blocks | List of CIDR blocks from which ingress to the Elasticsearch domain, Tamr VM, and Tamr Postgres instance is allowed (e.g. a VPN CIDR) | `list(string)` | `[]` | no |
| name\_prefix | A prefix to add to the names of all created resources. | `string` | `"tamr-config-test"` | no |
| tags | Map of tags to add to resources. | `map(string)` | `{}` | no |
| vpc\_cidr\_block | CIDR Block for the VPC | `string` | `"10.0.0.0/16"` | no |

## Outputs

| Name | Description |
|------|-------------|
| ec2-key | n/a |
| elasticsearch | n/a |
| emr | n/a |
| private-key | n/a |
| rds-postgres | n/a |
| rds-pw | n/a |
| tamr-config | n/a |
| tamr-vm | n/a |

<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
11 changes: 11 additions & 0 deletions examples/complete/ec2-key-pair.tf
@@ -0,0 +1,11 @@
# Create new EC2 key pair
resource "tls_private_key" "emr_private_key" {
  algorithm = "RSA"
}

module "emr_key_pair" {
  source     = "terraform-aws-modules/key-pair/aws"
  version    = "1.0.0"
  key_name   = "${var.name_prefix}-key"
  public_key = tls_private_key.emr_private_key.public_key_openssh
}
59 changes: 59 additions & 0 deletions examples/complete/elasticsearch.tf
@@ -0,0 +1,59 @@
module "tamr-es-cluster" {
  source = "git::git@github.com:Datatamer/terraform-aws-es?ref=2.1.0"

  # Names
  domain_name = "${var.name_prefix}-es"
  sg_name     = "${var.name_prefix}-es-security-group"

  # Only needed once per account, set to true if first time running in account
  create_new_service_role = false

  # In-transit encryption options
  node_to_node_encryption_enabled = true
  enforce_https                   = true

  # Networking
  vpc_id             = module.vpc.vpc_id
  subnet_ids         = [data.aws_subnet.data_subnet_es.id]
  security_group_ids = module.aws-sg-es.security_group_ids
  # CIDR blocks to allow ingress from (i.e. VPN)
  ingress_cidr_blocks = var.ingress_cidr_blocks
  aws_region          = data.aws_region.current.name
}

data "aws_region" "current" {}

# Security Groups
module "sg-ports-es" {
  source = "git::git@github.com:Datatamer/terraform-aws-es.git//modules/es-ports?ref=2.1.0"
}

module "aws-sg-es" {
  source                  = "git::git@github.com:Datatamer/terraform-aws-security-groups.git?ref=1.0.0"
  vpc_id                  = module.vpc.vpc_id
  ingress_cidr_blocks     = var.ingress_cidr_blocks
  ingress_security_groups = concat(module.aws-sg-vm.security_group_ids, [module.emr.emr_managed_sg_id])
  egress_cidr_blocks = [
    "0.0.0.0/0"
  ]
  ingress_ports    = module.sg-ports-es.ingress_ports
  sg_name_prefix   = format("%s-%s", var.name_prefix, "es")
  tags             = var.tags
  ingress_protocol = "tcp"
  egress_protocol  = "all"
}

data "aws_subnet" "application_subnet" {
  id = module.vpc.application_subnet_id
}

data "aws_subnet" "data_subnet_es" {
  filter {
    name   = "availability-zone"
    values = [data.aws_subnet.application_subnet.availability_zone]
  }
  filter {
    name   = "subnet-id"
    values = toset(module.vpc.data_subnet_ids)
  }
}
44 changes: 44 additions & 0 deletions examples/complete/emr-buckets.tf
@@ -0,0 +1,44 @@
# Set up logs bucket with read/write permissions
module "s3-logs" {
  source      = "git::git@github.com:Datatamer/terraform-aws-s3.git?ref=1.0.0"
  bucket_name = "${var.name_prefix}-emr-logs"
  read_write_actions = [
    "s3:PutObject",
    "s3:GetObject",
    "s3:DeleteObject",
    "s3:AbortMultipartUpload",
    "s3:ListBucket",
    "s3:ListObjects",
    "s3:CreateJob",
    "s3:HeadBucket"
  ]
  read_write_paths = [""] # r/w policy permitting specified rw actions on entire bucket
}

# Set up root directory bucket
module "s3-data" {
  source      = "git::git@github.com:Datatamer/terraform-aws-s3.git?ref=1.0.0"
  bucket_name = "${var.name_prefix}-emr-data"
  read_write_actions = [
    "s3:GetBucketLocation",
    "s3:GetBucketCORS",
    "s3:GetObjectVersionForReplication",
    "s3:GetObject",
    "s3:GetBucketTagging",
    "s3:GetObjectVersion",
    "s3:GetObjectTagging",
    "s3:ListMultipartUploadParts",
    "s3:ListBucketByTags",
    "s3:ListBucket",
    "s3:ListObjects",
    "s3:ListObjectsV2",
    "s3:ListBucketMultipartUploads",
    "s3:PutObject",
    "s3:PutObjectTagging",
    "s3:HeadBucket",
    "s3:DeleteObject",
    "s3:AbortMultipartUpload",
    "s3:CreateJob"
  ]
  read_write_paths = [""] # r/w policy permitting specified rw actions on entire bucket
}
96 changes: 96 additions & 0 deletions examples/complete/emr-cluster.tf
@@ -0,0 +1,96 @@
locals {
  applications = ["Spark", "Hbase", "Ganglia"]
}
# EMR Static HBase,Spark cluster
module "emr" {
  source = "git@github.com:Datatamer/terraform-aws-emr.git?ref=6.1.0"

  # Configurations
  create_static_cluster = true
  release_label         = "emr-5.29.0" # spark 2.4.4, hbase 1.4.10
  applications          = local.applications
  emr_config_file_path  = "./emr.json"
  bucket_path_to_logs   = "logs/${var.name_prefix}-cluster/"
  tags                  = merge(var.tags, var.emr_tags)
  abac_valid_tags       = var.emr_abac_valid_tags

  # Networking
  subnet_id                 = module.vpc.compute_subnet_id
  vpc_id                    = module.vpc.vpc_id
  emr_managed_master_sg_ids = module.aws-emr-sg-master.security_group_ids
  emr_managed_core_sg_ids   = module.aws-emr-sg-core.security_group_ids
  emr_service_access_sg_ids = module.aws-emr-sg-service-access.security_group_ids

  # External resource references
  bucket_name_for_root_directory = module.s3-data.bucket_name
  bucket_name_for_logs           = module.s3-logs.bucket_name
  s3_policy_arns = [
    module.s3-logs.rw_policy_arn,
    module.s3-data.rw_policy_arn
  ]
  key_pair_name = module.emr_key_pair.key_pair_key_name

  # Names
  cluster_name                  = "${var.name_prefix}-EMR-Cluster"
  emr_service_role_name         = "${var.name_prefix}-service-role"
  emr_ec2_role_name             = "${var.name_prefix}-ec2-role"
  emr_ec2_instance_profile_name = "${var.name_prefix}-emr-instance-profile"
  emr_service_iam_policy_name   = "${var.name_prefix}-service-policy"
  emr_ec2_iam_policy_name       = "${var.name_prefix}-ec2-policy"
  master_instance_fleet_name    = "${var.name_prefix}-MasterInstanceFleet"
  core_instance_fleet_name      = "${var.name_prefix}-CoreInstanceFleet"
  emr_managed_sg_name           = "${var.name_prefix}-EMR-Managed"
  emr_service_access_sg_name    = "${var.name_prefix}-EMR-Service-Access"

  # Scale
  master_instance_on_demand_count = 1
  core_instance_on_demand_count   = 4
  master_instance_type            = "m4.2xlarge"
  core_instance_type              = "r5.2xlarge"
  master_ebs_size                 = 50
  core_ebs_size                   = 200
}

module "sg-ports-emr" {
  source = "git::git@github.com:Datatamer/terraform-aws-emr.git//modules/aws-emr-ports?ref=6.1.0"

  applications = local.applications
}

module "aws-emr-sg-master" {
  source                  = "git::git@github.com:Datatamer/terraform-aws-security-groups.git?ref=1.0.0"
  vpc_id                  = module.vpc.vpc_id
  ingress_cidr_blocks     = var.ingress_cidr_blocks
  ingress_security_groups = module.aws-sg-vm.security_group_ids
  egress_cidr_blocks      = var.egress_cidr_blocks
  ingress_ports           = module.sg-ports-emr.ingress_master_ports
  sg_name_prefix          = format("%s-%s", var.name_prefix, "emr-master")
  egress_protocol         = "all"
  ingress_protocol        = "tcp"
  tags                    = merge(var.tags, var.emr_tags)
}

module "aws-emr-sg-core" {
  source                  = "git::git@github.com:Datatamer/terraform-aws-security-groups.git?ref=1.0.0"
  vpc_id                  = module.vpc.vpc_id
  ingress_cidr_blocks     = var.ingress_cidr_blocks
  ingress_security_groups = module.aws-sg-vm.security_group_ids
  egress_cidr_blocks      = var.egress_cidr_blocks
  ingress_ports           = module.sg-ports-emr.ingress_core_ports
  sg_name_prefix          = format("%s-%s", var.name_prefix, "emr-core")
  egress_protocol         = "all"
  ingress_protocol        = "tcp"
  tags                    = merge(var.tags, var.emr_tags)
}

module "aws-emr-sg-service-access" {
  source              = "git::git@github.com:Datatamer/terraform-aws-security-groups.git?ref=1.0.0"
  vpc_id              = module.vpc.vpc_id
  ingress_cidr_blocks = var.ingress_cidr_blocks
  ingress_ports       = module.sg-ports-emr.ingress_service_access_ports
  egress_cidr_blocks  = var.egress_cidr_blocks
  sg_name_prefix      = format("%s-%s", var.name_prefix, "emr-service-access")
  egress_protocol     = "all"
  ingress_protocol    = "tcp"
  tags                = merge(var.tags, var.emr_tags)
}
29 changes: 29 additions & 0 deletions examples/complete/emr.json
@@ -0,0 +1,29 @@
[
  {
    "Classification": "emrfs-site",
    "Properties": {
      "fs.s3.consistent": "false",
      "fs.s3.enableServerSideEncryption": "true",
      "fs.s3a.enableServerSideEncryption": "true"
    }
  },
  {
    "Classification": "hbase-site",
    "Properties": {
      "hbase.rootdir": "s3://${emr_hbase_s3_bucket_root_dir}/hbase-data/",
      "hbase.client.scanner.timeout.period": "600000",
      "hbase.hstore.blockingStoreFiles": "200",
      "hbase.hregion.memstore.block.multiplier": "8",
      "hbase.hregion.memstore.flush.size": "536870912",
      "hbase.rpc.timeout": "600000",
      "hbase.zookeeper.property.tickTime": "3000",
      "zookeeper.session.timeout": "60000"
    }
  },
  {
    "Classification": "hbase",
    "Properties": {
      "hbase.emr.storageMode": "s3"
    }
  }
]
33 changes: 33 additions & 0 deletions examples/complete/outputs.tf
@@ -0,0 +1,33 @@
output "tamr-vm" {
  value = module.tamr-vm
}

output "rds-postgres" {
  value = module.rds-postgres
}

output "rds-pw" {
  value     = random_password.rds-password
  sensitive = true
}

output "elasticsearch" {
  value = module.tamr-es-cluster
}

output "ec2-key" {
  value = module.emr_key_pair
}

output "private-key" {
  value     = tls_private_key.emr_private_key.private_key_pem
  sensitive = true
}

output "emr" {
  value = module.emr
}

output "tamr-config" {
  value = module.tamr-config.rendered
}
41 changes: 41 additions & 0 deletions examples/complete/rds.tf
@@ -0,0 +1,41 @@
# Generate random password for db
resource "random_password" "rds-password" {
  length  = 16
  special = false
}

module "rds-postgres" {
  source = "git::git@github.com:Datatamer/terraform-aws-rds-postgres.git?ref=3.0.0"

  identifier_prefix = "${var.name_prefix}-"
  username          = "tamr"
  password          = random_password.rds-password.result

  subnet_group_name    = "${var.name_prefix}-subnet-group"
  postgres_name        = "tamr0"
  parameter_group_name = "${var.name_prefix}-rds-postgres-pg"

  vpc_id = module.vpc.vpc_id
  # Network requirement: DB subnet group needs a subnet in at least two AZs
  rds_subnet_ids = module.vpc.data_subnet_ids

  security_group_ids = module.rds-postgres-sg.security_group_ids
  tags               = var.tags
}

module "sg-ports-rds" {
  source = "git::git@github.com:Datatamer/terraform-aws-rds-postgres.git//modules/rds-postgres-ports?ref=3.0.0"
}

module "rds-postgres-sg" {
  source                  = "git::git@github.com:Datatamer/terraform-aws-security-groups.git?ref=1.0.0"
  vpc_id                  = module.vpc.vpc_id
  ingress_cidr_blocks     = var.ingress_cidr_blocks
  ingress_security_groups = module.aws-sg-vm.security_group_ids
  egress_cidr_blocks      = var.egress_cidr_blocks
  ingress_ports           = module.sg-ports-rds.ingress_ports
  sg_name_prefix          = var.name_prefix
  egress_protocol         = "all"
  ingress_protocol        = "tcp"
  tags                    = var.tags
}