
TalkMap EKS Container Migration

Timothy Wong | May 3, 2021



Company Name: Discourse.ai
Case Study Title: TalkMap EKS Container Migration
Vertical: Artificial Intelligence

After migrating Discourse.ai’s workloads to the AWS Cloud, the company gained accelerated speed, agility, robust security, and a disaster recovery solution, reducing operational risk and recovery time objectives.

Problem / Statement Definition

Discourse.ai needed to migrate off their legacy infrastructure quickly. To modernize their deployments, they needed greater scalability and multiple mirrored environments to support workload promotion through the stages of their DevOps software development lifecycle (SDLC).

Discourse.ai provides a near-real-time business intelligence service that scales with demand. They are also at the forefront of AI and ML, which requires the flexibility to innovate quickly.

Proposed Solution and Architecture

Triumph Tech used containers on Amazon Elastic Kubernetes Service (EKS) to create a step change in Discourse.ai’s existing DevOps and SDLC process. We provided the development team with automated testing and validation so they can test, validate, and deploy new features into their environment quickly and with a high level of quality.

Triumph selected AWS to provide the resources required to solve the problem. We used eksctl along with CloudFormation to deploy all of our resources across Dev, Test, QA, and Production environments, consisting of:

  • EKS cluster per environment
  • Dedicated per-environment VPC with public and private subnets
    • Enabled workload deployment across multiple Availability Zones
    • Leveraged NAT Gateways with fixed Elastic IPs for use with allow lists
  • Applied the principle of least privilege (PoLP) to Security Groups, separating ingress, control, and workload traffic
  • Limited access to specific identities and access control roles, following AWS best practices
  • Gained cost and performance benefits from EKS node groups backed by EC2 Auto Scaling
    • Maximized ML processing using the NVIDIA V100 Tensor Core GPUs in the P3 instance type
  • Integrated Classic Load Balancers seamlessly into the Kubernetes deployment workflow
  • Provided persistent volumes to containers running inside the EKS cluster with EFS
  • Encrypted all data at rest with KMS
  • Protected critical data stored within EFS with AWS Backup
  • Integrated a self-managed, EC2-hosted GitLab server to manage container builds and deployments throughout the SDLC

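
As a rough illustration of the environment described above, an eksctl ClusterConfig along these lines could define one such environment. This is a hedged sketch: the cluster name, node group names, sizes, and the non-GPU instance type are assumptions, not the actual configuration.

```yaml
# Hypothetical sketch of one per-environment eksctl config (e.g. cluster-dev.yaml);
# names and sizes are illustrative assumptions.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: talkmap-dev          # hypothetical cluster name, one per environment
  region: us-east-2
availabilityZones: ["us-east-2a", "us-east-2b", "us-east-2c"]
vpc:
  nat:
    gateway: Single          # NAT Gateway with a fixed Elastic IP, usable in allow lists
managedNodeGroups:
  - name: general            # general-purpose workloads
    instanceType: m5.large   # assumed instance type
    minSize: 2
    maxSize: 6
    privateNetworking: true  # nodes in private subnets only
  - name: ml-gpu             # ML workloads on NVIDIA V100 Tensor Core GPUs
    instanceType: p3.2xlarge
    minSize: 0               # scale to zero when no ML jobs run
    maxSize: 4
    privateNetworking: true
```

Running `eksctl create cluster -f` against a file like this generates the underlying CloudFormation stacks, which is what makes the deployment repeatable per environment.
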
Outcomes of Project & Success Metrics

We dove into the legacy environment to understand the existing application stack and SDLC. To gain the technical advantages needed for rapid innovation, Discourse.ai’s architecture had to be migrated to modernized infrastructure.

The project was successful: the migrated environment enables both rapid innovation and automated testing in isolated environments.

Deployments and security scans were automated, and, most importantly, developers can now release innovations quickly and easily, providing realistic recovery time objectives (RTOs) and room for growth.

TCO Analysis Performed

TCO analysis was performed by collecting data from the AWS Billing Console.

Lessons Learned

Automated solutions significantly reduce time to market.

Combined with CloudFormation, eksctl provides repeatable deployments across multiple customer environments.

EC2 Auto Scaling maintains the flexibility to keep business insights near real time while controlling costs by aligning consumption with demand.

Summary of Customer Environment

The environment is cloud-native. The entire stack runs in the us-east-2 Region of Amazon Web Services.


Deployment Testing and Validation

Deployments are tested and validated through a promotion strategy. The development branch automatically deploys to the isolated staging environment without approval. The team then QAs and validates application functionality and approves promotion to production: a pull request is submitted to source control and merged into the master branch, and the workloads are then deployed to the production environment.
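
The promotion flow above could be sketched as a `.gitlab-ci.yml` fragment. This is an assumed illustration, not the actual pipeline: the stage names, branch names, deployment name, and the `kubectl` deploy step are all hypothetical.

```yaml
# Hedged sketch of the branch-based promotion strategy; all names are assumptions.
stages: [build, deploy-staging, deploy-production]

build:
  stage: build
  script:
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"

deploy-staging:
  stage: deploy-staging
  script:
    # deploys automatically, with no approval gate
    - kubectl --context staging set image deployment/talkmap app="$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
  rules:
    - if: '$CI_COMMIT_BRANCH == "development"'

deploy-production:
  stage: deploy-production
  script:
    # runs only after the pull request is merged into master
    - kubectl --context production set image deployment/talkmap app="$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
  rules:
    - if: '$CI_COMMIT_BRANCH == "master"'
```
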

In both the staging and production environments, security scanning is automated via Trivy, a vulnerability scanner. If a critical issue is found, Trivy produces a non-zero exit code and the build fails; no container image artifact is published to EKS. To publish the image to EKS, the client must patch the vulnerability until the scan passes.
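
A minimal sketch of such a scan gate, assuming a GitLab CI job (the job name, stage, and image variable are hypothetical; the `trivy image` flags are real Trivy CLI options):

```yaml
# Hypothetical CI job: Trivy exits non-zero on critical findings, failing the
# build so the image is never published.
scan:
  stage: test
  script:
    - trivy image --exit-code 1 --severity CRITICAL "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```
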

Version Control

All application code assets are version controlled within GitHub.

All CloudFormation assets are stored within AWS CodeCommit.

Application Workload and Telemetry

CloudWatch application logging is integrated by default into all of our container and serverless workloads; we include it as an in-scope item for all modernization and migration projects. CloudWatch provides a centralized system where error logs are captured, aiding operational troubleshooting.
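
EKS control-plane logs can complement this application-level logging. As a hedged sketch, an eksctl ClusterConfig fragment enabling their delivery to CloudWatch might look like the following (the cluster name is an assumption):

```yaml
# Hypothetical fragment: ship selected EKS control-plane log types to CloudWatch.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: talkmap-prod        # assumed cluster name
  region: us-east-2
cloudWatch:
  clusterLogging:
    enableTypes: ["api", "audit", "authenticator"]
```
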

Want to Migrate Your Business to AWS? Meet our Migration Experts: