AWS CloudFormation Deployment

Introduction

Panintelligence is the easy-to-use and quick-to-deploy solution to unlocking powerful data insights. We're the only software solution on the market that brings together business intelligence and machine learning to a new self-service level, empowering you to gain full control over your data.

Access unprecedented styling options to make our software look just like your own, and hook up cloud data warehousing and ETL (Extract, Transform and Load) tools, as well as your core product, to create a truly seamless analytics experience.

Simply connect your databases to the Panintelligence dashboard. The Panintelligence dashboard only has read access so it won’t modify any of your data.

Panintelligence does not collect your data or move it.

Panintelligence does not require root/admin access on the server. Sign in as “pi-user”.

This project supports multi-Availability Zone deployment. For more details, please take a look at System architecture and Multi-Availability Zone deployment. Monitoring and logging are also explained in more detail.

Installation (AWS CloudFormation)

Below is an AWS CloudFormation template: an infrastructure-as-code script that uses AWS services to provide Operational Excellence, Security, Reliability, Performance Efficiency and Cost Optimisation.

Infrastructure as Code (IaC) means managing your IT infrastructure using configuration files. AWS CloudFormation is a service that helps you model and set up your Amazon Web Services resources so that you can spend less time managing those resources and more time focusing on your applications that run in AWS.

You create a template that describes all the AWS resources that are required (for example, Amazon EC2 instances or Amazon RDS DB instances), and AWS CloudFormation takes care of provisioning and configuring those resources.

You do not need to create and configure AWS resources individually in the AWS Management Console; AWS CloudFormation handles the dependencies. For more information, see the AWS CloudFormation documentation.
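As a rough illustration of what "a template that describes resources" means, the Python sketch below builds a minimal CloudFormation template as a dictionary and prints it as JSON. It is not taken from the Panintelligence project: the logical ID ExampleBucket and the description text are placeholders chosen for this example.

```python
import json


def build_minimal_template() -> dict:
    """Return a minimal CloudFormation template describing one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative template (not part of the Panintelligence project)",
        "Resources": {
            # "ExampleBucket" is a hypothetical logical ID used only in this sketch.
            "ExampleBucket": {"Type": "AWS::S3::Bucket"},
        },
    }


if __name__ == "__main__":
    # CloudFormation accepts templates written in JSON or YAML.
    print(json.dumps(build_minimal_template(), indent=2))
```

The project's real templates are far larger and are split into nested stacks, but they follow the same shape: a format version plus a set of named resources.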

For best results, please use our set of AWS (Amazon Web Services) CloudFormation scripts which will deploy Panintelligence in the most optimal way.

Please feel free to contribute to our GitHub project to maintain and continually improve our deployment methodology!

The GitHub project has instructions on how to deploy; this document further explains the architecture.

System Architecture

At Panintelligence we follow the five AWS pillars of success: Operational Excellence, Security, Reliability, Performance Efficiency and Cost Optimisation. For more information, see the AWS documentation on the five pillars.

The architecture diagram below shows an overview of how the components are connected:

  1. The internet gateway allows traffic into the AWS VPC that’s attached to the public route table. Inside the route table, the AWS Application Load Balancer has a route to the internet gateway. The AWS Application Load Balancer listens on ports 80 and 443, and its security group also allows ports 80 and 443

  2. The AWS Application Load Balancer directs traffic to a healthy EC2 target so that the Panintelligence dashboard can be accessed from a web browser

  3. The AWS EC2 instances are part of an auto scaling group that scales on resource demand. Because there are multiple instances, they connect to an external database, AWS RDS MariaDB, for persistent storage and fault resilience. The auto scaling rule scales out when CPU usage reaches 70% and scales back in when CPU usage is below 20%

  4. As long as your IAM permissions give you access to the S3 bucket, you can upload ‘images’, ‘themes’ and ‘excel-data’ files to the bucket. Each upload fires an object creation trigger that invokes the AWS Lambda, which migrates the files onto AWS EFS. The AWS EFS is attached to the auto scaling group so the instances have persistent storage

  5. The folders within the S3 bucket contain your own themes, images and excel data

  6. The RDS MariaDB contains your own dashboard configurations

  7. The NAT Gateway allows outbound network traffic
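The scaling rule in step 3 can be sketched as a small decision function. This is only an illustration of the thresholds, not the project's actual scaling policy, which is defined in CloudFormation. It assumes that "hits 70%" means greater than or equal to 70, that each step adds or removes one instance, and that the group stays within the 1 to 5 instances listed in the service quotas table.

```python
SCALE_OUT_CPU_PERCENT = 70.0  # scale out when CPU usage reaches this value
SCALE_IN_CPU_PERCENT = 20.0   # scale in when CPU usage falls below this value


def scaling_action(cpu_percent: float) -> str:
    """Return 'scale_out', 'scale_in' or 'hold' for an average CPU reading."""
    if cpu_percent >= SCALE_OUT_CPU_PERCENT:
        return "scale_out"
    if cpu_percent < SCALE_IN_CPU_PERCENT:
        return "scale_in"
    return "hold"


def next_capacity(current: int, cpu_percent: float,
                  minimum: int = 1, maximum: int = 5) -> int:
    """Apply one scaling step and clamp the result to the group's bounds.

    One-instance steps are an assumption made for this sketch.
    """
    action = scaling_action(cpu_percent)
    if action == "scale_out":
        return min(current + 1, maximum)
    if action == "scale_in":
        return max(current - 1, minimum)
    return current
```

For example, next_capacity(2, 85.0) gives 3, while a group that is already at 5 instances stays at 5 however high the CPU reading is.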

 

 

Skills required

Minimum skills to set it up:

  • AWS Knowledge - Recommended for people who have passed the AWS Cloud Practitioner exam or higher, so that they understand each AWS service

  • Basic Linux AWS CloudShell skills - You will need to navigate AWS CloudShell, which is a Linux environment, to be able to pull the project from Git

  • Git - Some basic understanding of how to use Git and how to pull a project

  • Networking/Security - You will need to understand how the infrastructure is built and how the services communicate with each other. To set it up, you only need to understand which parameter values to enter for the project

  • AWS CloudFormation - Infrastructure as code allows you to build AWS services through a configuration text file so you don’t have to create services manually. You will need to understand how it works, but the instructions on how to deploy it are in the GitHub project

Advanced skills to configure other aspects of the infrastructure:

  • Networking/Security - If you wish to modify the AWS CloudFormation script and add more services, you will need to understand security groups, network access control lists and route tables

  • Python - We have a Lambda function written in Python. The Python script receives an S3 object event trigger and pushes the object to AWS EFS. If you wish to modify it, you will need Python skills

  • MariaDB/SQL commands - MariaDB knowledge of how to access AWS RDS MariaDB and view the Panintelligence dashboard database

  • Docker and Docker Compose - We install the application using configured docker-compose scripts

Resources and prerequisites

To follow this documentation, we assume you have a Route 53 hosted zone, a validated ACM (AWS Certificate Manager) certificate that covers the domain you will be using for the load balancer, an S3 bucket for backups and a key pair that gives you SSH (Secure Shell) access to the server.

| Resource | Description | How it is used |
| --- | --- | --- |
| Route 53 | Amazon Route 53 is a highly available and scalable cloud Domain Name System (DNS) web service. For more information, please see the Setting up Amazon Route 53 documentation. | You can attach your domain name to the AWS Application Load Balancer to point to the Panintelligence dashboard. |
| AWS ACM | AWS Certificate Manager is a service that lets you easily provision, manage, and deploy public and private Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for use with AWS services and your internal connected resources. For more information, please see the Setting up - AWS Certificate Manager documentation. | In order to use port 443/HTTPS in the AWS Application Load Balancer, you will need an SSL certificate. |
| EC2 Key Pair | A key pair, consisting of a public key and a private key, is a set of security credentials that you use to prove your identity when connecting to an Amazon EC2 instance. Amazon EC2 stores the public key on your instance, and you store the private key. For more information, please see the Amazon EC2 key pairs and Linux instances - Amazon Elastic Compute Cloud documentation. | If you wish to SSH into the EC2 instance, you will need the key pair on your local machine. |
| AWS S3 Bucket | Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. For more information, please see the AWS S3 Bucket documentation. | The architecture requires the user to upload a Lambda ZIP provided in the Git repository; another S3 bucket is created to store images, themes and excel-data. |
| AWS CloudFormation | AWS CloudFormation gives you an easy way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their life cycles, by treating infrastructure as code. For more information, please see the AWS CloudFormation documentation. | AWS CloudFormation allows you to build the infrastructure instead of manually configuring each component. |
| AWS SSM | SSM Agent makes it possible for Systems Manager to update, manage, and configure these resources. For more information, please see the AWS SSM documentation. | We use AWS SSM Agent to connect to the EC2 instance from the AWS Management Console instead of from a local machine. |
| AWS CloudShell | AWS CloudShell is a browser-based, pre-authenticated shell that you can launch directly from the AWS Management Console. You can run AWS CLI commands against AWS services using your preferred shell (Bash, PowerShell, or Z shell), without needing to download or install command line tools. For more information, please see the What is AWS CloudShell documentation. | Instead of using a local machine, you can run the shell commands in AWS CloudShell. |
| AWS Internet gateway | An internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between your VPC and the internet. For more information, please see the Internet Gateways documentation. | The Panintelligence dashboard requires web browser access. |
| AWS IAM | AWS Identity and Access Management (IAM) enables you to manage access to AWS services and resources securely. Using IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources. For more information, please see the AWS IAM documentation. | IAM permissions allow you to have fine-grained control over who and what has access to resources. |
| AWS Security groups | A security group acts as a virtual firewall for your instance to control inbound and outbound traffic. For more information, please see the AWS Security Groups documentation. | Increases protection of your infrastructure. |
| AWS Application Load Balancer | Elastic Load Balancing automatically distributes your incoming traffic across multiple targets, such as EC2 instances, containers, and IP addresses, in one or more Availability Zones. It monitors the health of its registered targets, and routes traffic only to the healthy targets. For more information, please see the AWS Application Load Balancer documentation. | The ALB directs traffic to the healthy EC2 targets. |
| AWS VPC | Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources into a virtual network that you've defined. This virtual network closely resembles a traditional network that you'd operate in your own data centre, with the benefits of using the scalable infrastructure of AWS. For more information, please see the AWS VPC documentation. | We use the AWS VPC to launch resources in the virtual network. |
| Subnets | You need to assign logical addresses to specific resources. For more information, please see the Subnets documentation. | Configure resources in specific subnet CIDR blocks. |
| NACL | A network access control list (ACL) is an optional layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets. For more information, please see the NACL documentation. | Configure additional security. |
| AWS Lambda | AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes. For more information, please see the AWS Lambda documentation. | The infrastructure uses AWS Lambda to side-load S3 objects to AWS EFS. |
| AWS RDS MariaDB | Amazon Relational Database Service (Amazon RDS) is a web service that makes it easier to set up, operate, and scale a relational database in the AWS Cloud. For more information, please see the AWS RDS MariaDB documentation. | The Panintelligence dashboard uses AWS RDS MariaDB as an external DB. |
| AWS EFS | Amazon Elastic File System (Amazon EFS) provides a simple, serverless, set-and-forget elastic file system for use with AWS Cloud services and on-premises resources. For more information, please see the AWS EFS documentation. | AWS EFS is used to keep persistent data for themes, images and excel data. |
| AWS Auto scaling | AWS Auto Scaling monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. For more information, please see the AWS Auto scaling documentation. | Auto scaling is used to increase or decrease the number of EC2 instances depending on traffic. |
| AWS EC2 | Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. For more information, please see the AWS EC2 documentation. | AWS EC2 is used to stand up the Panintelligence AMI. |
| AWS AMI | An Amazon Machine Image (AMI) provides the information required to launch an instance. For more information, please see the AWS AMI documentation. | Panintelligence has four AMIs on the marketplace. |
| AWS NAT gateway | A NAT gateway is a Network Address Translation (NAT) service. You can use a NAT gateway so that instances in a private subnet can connect to services outside your VPC but external services cannot initiate a connection with those instances. Please see the AWS NAT gateway documentation. | Allows you to use the Panintelligence Automated Licence Manager. |

Service quotas

For the AWS services that you will be using, you need to be aware of your service quotas so that you don’t hit the limits on your account. You can always request a service quota increase from AWS. For more information, see the AWS Service Quotas documentation.

The table below lists the specific resources that are used in the architecture. In the URL, please enter the name of the region in which you are building Panintelligence.

| AWS Resource | Service quotas that the infrastructure uses | Notes |
| --- | --- | --- |
| AWS EC2 | 1-5 EC2 instances | |
| AWS Route 53 (Optional) | 1 | |
| Launch configuration per region | 1 | |
| Step adjustments per step scaling policy | 2 | One policy to decrease EC2 instances and one to increase EC2 instances. |
| Target groups per Auto Scaling group | 1 | |
| AWS Auto scaling per region | 1 | |
| AWS VPC | 1 | |
| AWS Internet gateway | 1 | |
| General Purpose SSD (gp2) volume storage | 2-10 | |
| AWS Application load balancer | 1 | |
| AWS Lambda Elastic network interfaces per VPC | 1 | |
| AWS Lambda Function time out | 15 | |
| AWS Lambda temporary storage | Not available | The AWS default quota value is 512 MB; a Panintelligence theme or image would not hit the limit. |
| AWS CloudFormation stacks | 10 | |
| EFS per VPC | 1 | |
| EFS Mount targets | 2 | |
| EFS attached security group | 1 | |
| Interface VPC endpoints per VPC | 5 | |
| Route tables per VPC | 2 | Private and public route table |
| Routes per route table | 2 | |
| Subnets per VPC | 6 | |
| Security groups | 6 | |
| Network ACLs per VPC | 2 | |
| S3 Buckets | 2 | |
| Secrets per account | 6 | |
| DB instance | 1 | |
| DB subnet group | 1 | |
| DB Parameter group | 1 | |
| DB security group | 1 | |
| AWS NAT Gateway | 1 | |

How to obtain a Panintelligence licence key?

You will need to contact your Panintelligence customer success manager or support@panintelligence.com for pricing information and to obtain a licence for use with our software for the BYOL (Bring your own licence) version of Panintelligence.

Developer and trial both have limited use case licences embedded in the AMI image (Amazon Machine Image). Our Metered offering charges based on units (users) and dimensions (analytics, scheduler, reports).

Automated Licence Manager

For more information on how to use the automated licence feature, see Automated Licence Manager.

How to obtain the Panintelligence Marketplace AMI ID?

  • Go to your AWS (Amazon Web Services) console and search the services for ‘AWS marketplace subscriptions’

  • In AWS marketplace subscriptions, click ‘Discover products’ on the left and search for ‘Panintelligence’

  • Select the Panintelligence product that you wish to use. For this example we will use ‘Panintelligence BYOL (Bring your own licence)’

  • Click ‘Subscribe’ or ‘Continue to Subscribe’ and wait for the subscription to be confirmed

  • Go back to the AWS marketplace subscriptions console, where you should see your Panintelligence subscription. Click your Panintelligence subscription

  • Click to launch an instance on the right

  • Select the region you wish to deploy in; the AMI ID changes with the region. Copy that AMI ID and keep it safe
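Because the copied AMI ID is later entered as a deployment parameter, a quick format check can catch copy-and-paste mistakes early. The helper below is a hypothetical convenience, not part of the Panintelligence project. It only checks the general shape of an AMI ID ("ami-" followed by 8 or 17 hexadecimal characters) and does not confirm that the image exists in your region.

```python
import re

# AMI IDs are "ami-" followed by 8 (older format) or 17 hexadecimal characters.
_AMI_ID_PATTERN = re.compile(r"ami-(?:[0-9a-f]{8}|[0-9a-f]{17})")


def looks_like_ami_id(value: str) -> bool:
    """Return True if the value has the general shape of an AMI ID."""
    # Strip surrounding whitespace that often comes along when copying from the console.
    return _AMI_ID_PATTERN.fullmatch(value.strip()) is not None
```

The sample IDs used to exercise this helper are made up and do not refer to real Panintelligence images.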

Technical Datasheet requirements

The technical datasheet offers guidance on how many resources you will require, depending on your infrastructure and users.

For details, please see the Technical Datasheet.

Operating System

The Panintelligence AMI uses Ubuntu Linux 20.04 as its operating system.

Docker overview

For more information on the Docker configuration within the AMI products, see AWS AMI products.

Size requirements and recommendations

In the recommended architecture, the dependencies on the EC2 instance are separated out to AWS EFS and AWS RDS MariaDB.

EC2 instance type:

Instance types comprise varying combinations of CPU, memory, storage, and networking capacity and give you the flexibility to choose the appropriate mix of resources for your applications. Below are the recommended instance types to use.

| Instance type | vCPU* | CPU Credits/hour | Mem (GiB) | Storage | Network Performance (Gbps) |
| --- | --- | --- | --- | --- | --- |
| t3a.medium | 2 | 24 | 4 | EBS-Only | Up to 5 |
| t3.medium | 2 | 24 | 4 | EBS-Only | Up to 5 |

EBS volume type:

It is recommended to have 15GiB or higher for your EBS volume attached to the EC2. Depending on your data connections and users, it can be higher. Please take a look at the Technical Datasheet/ System Requirements documentation for more information.

| Volume type | Durability | Volume size | Max IOPS per volume (16 KiB I/O) | Max throughput per volume | Amazon EBS Multi-attach | Boot volume |
| --- | --- | --- | --- | --- | --- | --- |
| gp2 | 99.8% - 99.9% durability (0.1% - 0.2% annual failure rate) | The recommended size is 15 GiB attached to the EC2. It can go up to 16 TiB if you wish to increase the size. | 16,000 | 250 MiB/s * | Not supported | Supported |
| gp3 | 99.8% - 99.9% durability (0.1% - 0.2% annual failure rate) | The recommended size is 15 GiB attached to the EC2. It can go up to 16 TiB if you wish to increase the size. | 16,000 | 1,000 MiB/s | Not supported | Supported |

Amazon RDS MariaDB Instance size:

| Model | Core Count | vCPU* | CPU Credits/hour | Memory (GiB) | Network Performance (Gbps) |
| --- | --- | --- | --- | --- | --- |
| db.t3.micro | 1 | 2 | 12 | 1 | Up to 5 |
| db.t3.small | 1 | 2 | 24 | 2 | Up to 5 |

Security

It is important to highlight the security configurations.

The application does not require root access. To sign in to the instance, sign in as ‘pi-user’.

When building the AWS CloudFormation deployment, you will need to assign an IAM user that is able to create and deploy AWS CloudFormation stacks.

Public access

The Application Load Balancer listens on ports 80 and 443, and port 80 redirects to port 443. This allows the public to access the Panintelligence dashboard.
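The port 80 to port 443 redirect can be pictured as a listener default action. The Python sketch below builds such an action in the shape that the Elastic Load Balancing API expects for a redirect. The project itself defines its listeners in CloudFormation, so this function is illustrative only, and the permanent (HTTP_301) status code is an assumption, since the source does not say which redirect status is used.

```python
def http_to_https_redirect_action(status_code: str = "HTTP_301") -> dict:
    """Build an ALB listener default action that redirects HTTP to HTTPS on port 443."""
    if status_code not in ("HTTP_301", "HTTP_302"):
        raise ValueError("status_code must be 'HTTP_301' or 'HTTP_302'")
    return {
        "Type": "redirect",
        "RedirectConfig": {
            "Protocol": "HTTPS",
            "Port": "443",
            # Host, path and query are carried over from the original request.
            "Host": "#{host}",
            "Path": "/#{path}",
            "Query": "#{query}",
            "StatusCode": status_code,
        },
    }
```

A dictionary of this shape is what a port 80 listener would carry as its default action, while the port 443 listener forwards to the EC2 target group.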

Private access

AWS Lambda only has access to the S3 buckets: one holds the uploaded AWS Lambda ZIP, and an object creation event on the other triggers the AWS Lambda to migrate the S3 objects (images, themes, excel-data) to AWS EFS for persistent data.
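The core of the migration step can be sketched in Python, the language of the project's Lambda. This is not the project's actual function: the mount point /mnt/efs is a hypothetical value, and restricting uploads to the three folder names mentioned in this document is an assumption made for the sketch. It shows how an S3 event notification is read and how each object key maps to a destination path on EFS.

```python
from pathlib import PurePosixPath
from typing import List, Optional, Tuple
from urllib.parse import unquote_plus

# Hypothetical EFS mount point inside the Lambda; the real path is set by the project.
EFS_MOUNT_ROOT = "/mnt/efs"
# Folder names taken from this document.
ALLOWED_FOLDERS = ("images", "themes", "excel-data")


def s3_objects_from_event(event: dict) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def efs_destination(key: str, mount_root: str = EFS_MOUNT_ROOT) -> Optional[str]:
    """Map an S3 key to a path under the EFS mount, or None if it should be skipped."""
    parts = PurePosixPath(key).parts
    # Skip folder placeholders, unknown top-level folders and path traversal attempts.
    if len(parts) < 2 or parts[0] not in ALLOWED_FOLDERS or ".." in parts:
        return None
    return str(PurePosixPath(mount_root, *parts))
```

A real handler would loop over these pairs and download each object to its destination path, for example with the S3 client's download_file call in boto3.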

Least privilege

AWS RDS MariaDB uses AWS Secrets Manager to generate the database password. In addition, the infrastructure rotates the secret every 30 days. You can find the sensitive information in AWS Secrets Manager.

Every 30 days, when the password rotates, you will need to reboot the EC2 instances. The future roadmap for the infrastructure includes a trigger to update the EC2 instances automatically when secrets rotate.

AWS EC2 user data needs sensitive information because Panintelligence uses the connection details of the external AWS RDS MariaDB. Because we cannot store secrets in plain text, the infrastructure has a Secrets Manager endpoint attached, and only the EC2 instances are allowed to retrieve the secrets through it.

For the same reason, the AWS CloudFormation deployment uses the AWS Secrets Manager API call to retrieve the secrets. Because of the auto scaling policy, we set the environment variables in the user data so that each EC2 instance has the correct information when it boots. For more information, take a look at the EC2 stacks in the AWS CloudFormation project: https://github.com/Panintelligence/aws-deployment/tree/main/nested-stacks

 

echo "export PI_DB_PASSWORD=$(aws secretsmanager get-secret-value --secret-id ${SecretArn} --query SecretString --output text --region 'eu-west-1' | jq -r .password)" >> /opt/pi/Dashboard/startup.sh
echo "export PI_DB_USERNAME=$(aws secretsmanager get-secret-value --secret-id ${SecretArn} --query SecretString --output text --region 'eu-west-1' | jq -r .username)" >> /opt/pi/Dashboard/startup.sh
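The commands above read the username and password fields out of the secret's JSON string. For readers more comfortable with Python, the parsing step can be sketched as follows. This is an illustration only: the project uses the shell commands shown above, the secret string is assumed to have already been fetched, and the sample values used to exercise it are made up. Unlike the shell version, this sketch shell-quotes the values.

```python
import json
import shlex
from typing import List


def db_exports_from_secret(secret_string: str) -> List[str]:
    """Turn a Secrets Manager SecretString (JSON) into shell export lines."""
    secret = json.loads(secret_string)
    return [
        # shlex.quote protects values that contain spaces or shell metacharacters.
        f"export PI_DB_USERNAME={shlex.quote(secret['username'])}",
        f"export PI_DB_PASSWORD={shlex.quote(secret['password'])}",
    ]
```

Quoting matters here because generated passwords can contain characters that the shell would otherwise interpret.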

How to obtain the secrets?

AWS Management console: