Friday, May 27, 2022

Going Cloud Native with AWS Elastic Container Service

  Summary: A very good article for understanding the benefits and some pitfalls of adopting cloud infrastructure or shifting to a different cloud infrastructure. The article compares services hosted in the team's own data center with their AWS counterparts, but the points are quite generic and useful.

Zoosk Java microservices are hosted on Amazon Elastic Container Service. In Amazon’s words, “Amazon EC2 Container Service (ECS) is a highly scalable, high performance container management service that supports Docker containers and allows you to easily run applications on a managed cluster of Amazon EC2 instances. Amazon ECS eliminates the need for you to install, operate, and scale your own cluster management infrastructure.”

It sounds very appealing to have them manage one’s container applications with minimal effort. Since we had to migrate our services without a dedicated Ops resource, we decided this service would be the best place to host them. Migrating the services to Amazon Elastic Container Service involved changing how we developed, built, and deployed services. All services that were stateful had to be refactored to be stateless to leverage autoscaling, where instances come and go. The migration process involved updating our tech stack to use the latest open source frameworks, deciding which AWS services fit our use cases, development, coming up with a rollout strategy to prevent user disruptions, and cutting spend in the cloud. As part of the development, we migrated our services from RabbitMQ to SQS, MySQL to Aurora, Memcached to ElastiCache, and Solr to Elasticsearch. Going from nothing to running production services supporting millions of users has shown us the pros and cons of ECS.

Development process for a service to be deployed to ECS


ECS Topology

The Good

Services only consume what they need — Before, we had 12 servers (4 cores, Intel Xeon E3–1200, 32 GB RAM) in our datacenter hosting the Java microservices. Some of the services consumed only 5% of the CPU and memory on the server, so the servers were severely underutilized and not cost effective. Migrating to ECS allowed full usage of CPU and memory by placing a service in a cluster of EC2 instances. The ECS scheduler places the service on an EC2 instance with enough CPU and memory to allocate. We reduced the instances needed to three m4.larges because multiple containers can be placed on an EC2 instance until it runs out of resources.

Scales faster than EC2 instances — Generally, containers are faster to spin up than EC2 instances launched from an AMI. If traffic rises past a defined threshold, ECS creates a new container and adds it to the load balancer. This horizontal scaling benefit holds only if the cluster always has enough spare resources; otherwise there is an extra delay while an EC2 instance spins up to provide them.

Orchestration of containers — ECS features handle many use cases necessary for deploying and maintaining container services in a distributed system. There is minimal setup for a private Docker registry (ECR), load balancing, scheduling, and creating an orchestration server. This reduces the amount of Ops work needed to get the service up and running.

Autoscaling — During peak traffic, services scale out to handle the load and prevent downtime. During off-peak hours, services scale in to save money. With AWS ECS there are two levels of autoscaling: one at the cluster (EC2 instances that provide CPU and memory resources for the cluster) and one at the service (Docker container instances that handle traffic).

Burstable CPU — An ECS service is allocated CPU and memory from an EC2 instance. A service allocated 1024 cpu units (1024 units = 1 core) on an instance that has 4096 available (quad core) is guaranteed to have one core available for the service at all times. If no other service is using the other three cores on the EC2 instance, the service is able to use all four cores if needed.

No extra cost associated with AWS ECS — Users only pay for the used AWS resources i.e. EC2, Elastic Load balancer, etc.

Zero Downtime deployments — When deploying a new version of a service, ECS will deploy the new version onto the cluster for staging. The load balancer executes a health check against the new containers' canary endpoint. If the containers pass the health check with HTTP 200, the load balancer sends traffic to the new containers and drains the old containers for deletion.

The Bad

Might accidentally bring down a service — Autoscaling saves money by turning off instances when they are not in use. However, a scale-in action at the cluster level can terminate instances that are running tasks. AWS auto-scaling default policies delete instances with the oldest launch configuration or instances closest to the next billing hour. This can cause a service disruption if the terminated instances contained a service and its backups. Scaling in a cluster requires adding scale-in protection to the instances running tasks to prevent service disruptions. At Zoosk, we created a Python script to protect or unprotect instances depending on whether they have a running task. The script runs before any scale-in action: protect, then scale in. When we initially migrated our services, we over-provisioned them to give us a buffer. After running for about a week, we cut costs by scaling in our cluster. Since I was an ECS newbie, I just thought that ECS was smart enough not to take down instances that had a service running on them. Wrong. I brought down a production service for a minute while ECS brought it back up. Not a good look for my first year at Zoosk. So learn from my mistake!
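The protect-then-scale-in idea can be sketched roughly as follows. This is not the Zoosk script, only a minimal illustration of the selection step. It assumes each input dict mirrors an entry from the ECS DescribeContainerInstances response (the `ec2InstanceId`, `runningTasksCount`, and `pendingTasksCount` fields); the function name `split_by_activity` is hypothetical.

```python
from typing import Dict, List


def split_by_activity(container_instances: List[Dict]) -> Dict[str, List[str]]:
    """Split EC2 instance IDs into those to protect and those safe to scale in.

    Each dict is assumed to look like an entry from the ECS
    DescribeContainerInstances response. An instance counts as busy if it has
    any running or pending tasks.
    """
    protect: List[str] = []
    unprotect: List[str] = []
    for ci in container_instances:
        busy = ci.get("runningTasksCount", 0) + ci.get("pendingTasksCount", 0) > 0
        (protect if busy else unprotect).append(ci["ec2InstanceId"])
    return {"protect": protect, "unprotect": unprotect}
```

In a real script, the two lists would then be passed to the Auto Scaling SetInstanceProtection API (in boto3, `set_instance_protection` with `ProtectedFromScaleIn` set to True or False) before the scale-in action is triggered.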

Deploys could be better — During deployments there is no phase that sends a percentage of traffic to the new containers and then commits if things look good or rolls back otherwise. ECS commits a deploy if the containers pass the health check from the load balancer. It will drain the existing containers before one can check whether the new version is stable under production traffic. To roll back in ECS, one must deploy the old task definition (containers). This increases the downtime of the service compared to flipping the traffic back to the old containers. Hence creating a canary endpoint for services is crucial for ECS, because it is the gatekeeper for committing a deploy. ECS has no way of stopping a deploy unless you set a deploy action of the same task definition.

Idle EC2 instances — EC2 instances in the cluster might sit there with no tasks running, but they are needed to provide resources for scaling out when the time comes. Zero downtime deploys will not execute unless the cluster has at least twice as much CPU and memory available as the deployed service needs.
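To get a feel for the headroom requirement, here is a rough sketch that checks whether a full second set of tasks would fit into the free capacity of a cluster. It uses a simple first-fit placement, which is an assumption for illustration and not how the ECS scheduler is documented to work; the function name and the (CPU units, memory MiB) tuple layout are hypothetical.

```python
from typing import List, Tuple


def can_stage_deploy(
    free: List[Tuple[int, int]], task_cpu: int, task_mem: int, task_count: int
) -> bool:
    """Return True if task_count new tasks fit into the free capacity.

    free holds (cpu_units, memory_mib) still unallocated on each instance.
    Tasks are placed first-fit, one at a time, which is a simplification.
    """
    remaining = [[cpu, mem] for cpu, mem in free]
    for _ in range(task_count):
        for slot in remaining:
            if slot[0] >= task_cpu and slot[1] >= task_mem:
                slot[0] -= task_cpu
                slot[1] -= task_mem
                break
        else:
            # No instance had room for this task, so the deploy cannot stage.
            return False
    return True
```

For example, with 1024 units / 2048 MiB free on one instance and 512 units / 1024 MiB free on another, three tasks of 512 units / 1024 MiB fit, but a fourth does not.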

Outdated ECS docs — The CloudFormation templates AWS provides for ECS create resources that are not configured properly, such as CloudWatch alarms for ECS. I was a newbie at CloudFormation, and the template contained unnamed resources. Unnamed resources created in a CloudFormation stack have a hash appended to the name. Scripting against resources can be difficult if the name changes between CloudFormation deploys. If a cluster needs to be renamed to remove the hash, the entire cluster has to be deleted and remade. So remember to name the resources.

Stateless applications only — Services developed for ECS have to be stateless because containers can be scaled in and out at any time. During deploys, ECS brings up a new set of containers and gets rid of the old ones, so the state of your previous version is gone.

The Learnings / Gotcha’s

Warm the load balancer — The AWS Elastic Load Balancer needs the request rate to increase gradually so that it has time to scale; otherwise requests will be dropped. If you are sending all the traffic immediately, you will need to contact AWS to “pre-warm” the load balancer or run a load test to warm it up. We faced this issue when we tried to send 400k requests per minute to a fresh load balancer: the load balancer dropped requests, causing downtime for our service. The workaround was to run a load test to warm up the load balancer before sending the traffic.
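A warm-up load test usually steps the request rate up rather than jumping straight to the target. The sketch below only computes such a stepped schedule; the linear shape and the number of steps are assumptions for illustration, not AWS guidance, and `ramp_schedule` is a hypothetical helper.

```python
from typing import List


def ramp_schedule(target_rpm: int, steps: int, start_rpm: int = 0) -> List[int]:
    """Return request rates (per minute) rising linearly to target_rpm."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    span = target_rpm - start_rpm
    # Integer arithmetic keeps the final step exactly at target_rpm.
    return [start_rpm + span * i // steps for i in range(1, steps + 1)]
```

Calling `ramp_schedule(400_000, 4)` gives four stages ending at the 400k requests per minute mentioned above; each stage would be held for some interval in the load testing tool before moving to the next.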

Autoscaling — Implementing autoscaling correctly involves smoothing out the peaks and falls of CPU usage by reducing and increasing CPU resources for a service. For a majority of our services we aimed for a CPU usage range of 50–70%. Thus we set the scale-in trigger at 50% and the scale-out trigger at 70%. We set a CloudWatch alert to be triggered if the CPU reached 85% for longer than 5 minutes.
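The thresholds above boil down to a small decision rule. Here is a minimal sketch of it, using the 50%, 70%, and 85% figures from the text; the function names are hypothetical, and the alert check assumes one CPU sample per minute, which the article does not specify.

```python
from typing import Sequence


def scaling_action(
    avg_cpu_percent: float, scale_in_below: float = 50.0, scale_out_above: float = 70.0
) -> str:
    """Map average service CPU usage to a scaling decision."""
    if avg_cpu_percent > scale_out_above:
        return "scale_out"
    if avg_cpu_percent < scale_in_below:
        return "scale_in"
    return "hold"


def should_alert(
    cpu_samples: Sequence[float], threshold: float = 85.0, window: int = 5
) -> bool:
    """Alert when the last `window` samples (assumed one per minute) all reach the threshold."""
    if len(cpu_samples) < window:
        return False
    return all(sample >= threshold for sample in cpu_samples[-window:])
```

In practice these rules live in CloudWatch alarms and scaling policies rather than application code; the sketch only makes the bands explicit.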

Metrics discrepancies — Two ways of monitoring the CPU and memory of the services are running Docker stats on the EC2 instance and using CloudWatch for an aggregate of usage across all containers of the service. While testing one instance of a service, I noticed discrepancies in how much CPU and memory was being utilized. Docker reported higher numbers, and I assumed that since it was the container platform it was correct. However, when running the htop command on the EC2 instance, the CPU usage coincided with what CloudWatch reported. Trust CloudWatch.

Memory allocation matters — If a service consumes more than the allocated memory, the container will die. I wasted a lot of time wondering why the Docker container kept dying and found out it went over the memory allocated in ECS. In Java, if you allocate lots of memory to the JVM, garbage collection is triggered less often and the service consumes more memory. We found that our services were consuming a lot of memory when they did not have to. This wasted resources on ECS, which meant wasted money. Always profile (VisualVM is great) and load test (JMeter) to get a clear idea of how much memory the service needs.

Saving logs brought down a service — Send logs and metrics to a central location for analyzing, alerting, and monitoring. CloudWatch is a great option for that. Set up access logs for ELB for auditing and storage in S3. There was an issue at Zoosk where services would stop working because there wasn’t any more disk space in the Docker container. Services wrote their application and access logs locally, and Docker by default allocates only 10 GB of disk space. Shipping logs and metrics to external services removes the risk of a service going down because of a full disk.

Docker SHA — Deploying with the Docker SHA is mandatory in production because a tag does not guarantee that the same version will be deployed (if a developer pushes their changes with that tag, it overwrites the previous version). If you need to roll back, there is no reliable way to do that with a tag. The Docker SHA is a unique identifier of a version of a container image. Using the SHA allows auditability, because tags can be overwritten in a Docker registry but SHAs can’t.
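One way to enforce this in a deploy script is to reject any image reference that is not pinned by digest. A minimal sketch, assuming references in the usual `name@sha256:<64 hex characters>` form; the helper name and the example registry host are hypothetical.

```python
import re

# A reference counts as pinned when it ends in "@sha256:" plus 64 lowercase hex chars.
_DIGEST_RE = re.compile(r"^[^@\s]+@sha256:[0-9a-f]{64}$")


def is_pinned_by_digest(image_ref: str) -> bool:
    """Return True if the image reference names an immutable digest, not just a tag."""
    return bool(_DIGEST_RE.match(image_ref))
```

For example, `registry.example.com/app:latest` would be rejected, while the same repository followed by `@sha256:` and a full digest would pass.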

For Java developers — Because AWS resources use DNS name entries that occasionally change, we recommend that you configure your JVM with a TTL value of no more than 60 seconds. This ensures that when a resource’s IP address changes, your application will be able to receive and use the resource’s new IP address by requerying the DNS.

Conclusion

AWS ECS is an excellent option for hosting container services in the cloud. A developer can easily deploy and maintain services on ECS with minimal Ops work needed. ECS reduces the troubles of having to manage your own container orchestration platform at zero cost. Of course we wish that the deployment process could be improved and there are many features I would like to see in the product. But, overall, we are happy about how ECS has been able to serve our millions of users the features that they love.


Monday, May 9, 2022

AWS - Single-page application

This one is a pretty simple starting point for a deployment model for a single-page application on AWS. Nothing complex, just basic AWS building blocks.




Ref: Single-page application - AWS Serverless Multi-Tier Architectures with Amazon API Gateway and AWS Lambda

Tier / Components
Presentation

Static website content hosted in Amazon S3, distributed by CloudFront.

AWS Certificate Manager allows a custom SSL/TLS certificate to be used.

Logic

API Gateway with AWS Lambda.

This architecture shows three exposed services (/tickets, /shows, and /info). API Gateway endpoints are secured by a Lambda authorizer. In this method, users sign in through a third-party identity provider and obtain access and ID tokens. These tokens are included in API Gateway calls, and the Lambda authorizer validates these tokens and generates an IAM policy containing API initiation permissions.

Each Lambda function is assigned its own IAM role to provide access to the appropriate data source.

Data

Amazon DynamoDB is used for the /tickets and /shows services.

Amazon ElastiCache is used by the /shows service to improve database performance. Cache misses are sent to DynamoDB.

Monday, May 2, 2022

CI/CD with API Management

 


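Very good working code for automating API Management deployments. The resource kit referenced below addresses these questions: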

Ref: GitHub - Azure/azure-api-management-devops-resource-kit: Azure API Management DevOps Resource Kit

  • How to automate deployment of APIs into API Management?
  • How to migrate configurations from one environment to another?
  • How to avoid interference between different development teams who share the same API Management instance?

Tuesday, April 19, 2022

Implementing continuous blue/green deployments on Azure Container Apps by using GitHub Actions

 

A very interesting model that uses GitHub Actions to achieve blue/green deployments on Kubernetes and Azure, where test URLs can be served so that new versions can be tested on green before moving to final production, or rolled back if needed.

After we have validated the new version, we can slowly increase the production traffic on our new green revision. Zero downtime deployments with a great DevOps architecture.

Ref:

Implementing continuous blue/green deployments on Azure Container Apps by using GitHub Actions | by Dennis Zielke | Medium

Sunday, April 10, 2022

How To Build CI/CD For Static Vue.js App Using Azure DevOps

 


A step by step guide With an Example Project

Azure DevOps

There are a lot of deployment strategies when you deploy your Vue.js applications to production, and your deployment strategy depends entirely on your application architecture. For example, if you are using Java or Node.js with Vue, you need to deploy your application on the respective environments. If you are serving the Vue static assets with NGINX, you can dockerize the app and put it on Azure AKS.

One way of deploying Vue.js applications is to use Azure blob storage for static web hosting. In this post, we will see how we can deploy a Vue static website using Azure DevOps.

  • Introduction
  • Example Project
  • Prerequisites
  • Build Pipeline
  • Release Pipeline
  • Demo
  • Summary
  • Conclusion

Introduction

As I said earlier, one way of deploying Vue.js applications is to put the Vue build into Azure blob storage and distribute it with Azure CDN. As shown in the following figure, all the Vue.js static assets are uploaded into blob storage and Azure CDN is configured to distribute the content across the world. Here is the full article on how to do that with an example project.

Azure CDN serving Vue app

We have to build two pipelines to deploy this application using Azure DevOps.

Build Pipeline

This pipeline takes the code from the Azure Repos or any git source and goes through a series of actions such as install, test, build and finally, generate the artifacts ready for the deployments.

Build Pipeline

Release Pipeline

This pipeline takes the artifact and uploads all the files into Azure blob service. You can have pre-deployment conditions such as approvals, manual only, etc for the release.

Release Pipeline

Example Project

Here is an example project which we can put in the Azure blob storage for static website hosting. This is a simple profile page with a header and some sections.

# clone the project
git clone https://github.com/bbachi/my-profile-vuejs.git
cd my-profile-vuejs
# install dependencies and start the project
npm install
npm run serve

You can clone the project and run it on your machine. Here is the demonstration when you run it on localhost, port 8080.

Example Project

Prerequisites

There are some prerequisites for this article: you need to be familiar with Vue.js applications and how they are built, and have some familiarity with an Azure account.

One way to deploy your Vue static website on Azure is to log in to your portal and upload all the files manually. Check out this article on how we can do it manually.

Build Pipeline

We need to create the build pipeline first, as part of continuous integration. The way this pipeline works is that the moment you check code into GitHub or Azure Repos, the pipeline builds the project, tests it, and makes the built artifact ready for deployment. Let’s follow all the steps to build this pipeline.

  1. Create a Project in Azure DevOps
  2. Create a Repo and Put your code in Azure Repos
  3. Create a pipeline that takes it from the source repository.
  4. Install all dependencies
  5. Run the tests
  6. Build the code
  7. Copy files from source for staging
  8. Archive all the copied files
  9. Finally, publish the artifact.
  10. Enable Continuous Integration with triggers

Let’s create a project in your Azure DevOps account. I named it StaticWebsite. Choose whichever visibility you prefer: public or private.

Creating a project

Once you have created this project, you can see the dashboard, which is empty right now.

Project Dashboard

Put your code in Azure Repos

It’s time to create a repo and push all the code from the example project above into it. I created a repo called static-vuejs and pushed all the code into this repo.

Azure Repos

You can generate the Git credentials when you click on the clone button on the top right corner. You need these credentials if you want to push the code into this repo later.

Git Credentials

Create a Build Pipeline

Let’s start a pipeline by selecting a source and repository branch and choosing the classic editor.

Select a Source

On the next page select an Empty job

select an empty job

You need to define all the tasks under this job such as Install dependencies, Run Tests, Build the project, Copy Files, Archive files, Finally publish artifacts.

Build Pipeline

First, you need to click on the + icon on the right side of the Agent job to add tasks. We need to select the Command line task for the first three tasks: install, run tests, and build.

Command-line task

Install dependencies

When you run this pipeline, it clones the Vue project from Azure Repos. This repo doesn’t have any dependencies installed, so we need to run npm install as the first task.

Install Dependencies

Build the Project

Once you install the dependencies, we need to build the project for the artifact.

Build the project

Copy the files

We have the build ready, so we need to copy files from the source folder to the staging folder. For this, we have to select the Copy Files task. All the code resides in the directory Build.SourcesDirectory, and we need to move it to the staging directory Build.ArtifactStagingDirectory.

Copy Files

Archive Files

We have copied all the files to the target folder. It’s time to create an Archive file with the task Archive Files. Select this task and give all the details such as the Root folder, archive type, and an archive file.

Archive Files
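Outside the pipeline UI, the copy and archive steps amount to roughly the following. This is a local sketch using only the Python standard library, not what the Azure DevOps tasks run internally; the function name, the `site` subfolder, and the directory arguments are hypothetical stand-ins for the source and staging directories.

```python
import shutil
from pathlib import Path


def stage_and_archive(build_dir: str, staging_dir: str, archive_base: str) -> str:
    """Copy the built files into a staging folder and zip them.

    Returns the path of the created .zip file. archive_base is the archive
    path without the ".zip" extension.
    """
    staged = Path(staging_dir) / "site"
    # Copy step: mirror the build output into the staging folder (Python 3.8+).
    shutil.copytree(build_dir, staged, dirs_exist_ok=True)
    # Archive step: zip the staged files with paths relative to the staged root.
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(staged))
```

Archiving relative to the staged root keeps the zip free of the full directory path, which matters later when the files are extracted and uploaded.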

Publish Artifact

Use the task called Publish build artifacts to publish the artifact.

publish artifact

Finally, Define the trigger

Let’s define the trigger for this build. Edit the pipeline by clicking on the triggers and enable continuous integration. Any commit to the static-vuejs repo master branch triggers this pipeline.

Trigger

This is how the final artifact looks.

Artifacts

Here is the complete YAML for this build pipeline.

pipeline.yml

You can get the above YAML from the edit pipeline screen.

Getting YAML

Release Pipeline

We are done with the continuous integration part, so let’s build a release pipeline for continuous delivery. Click on Releases, then New release pipeline, and follow these steps.

  1. Define the artifacts such as project, source, default version, etc
  2. Define a Stage for the deployment

Define the Artifact

The first step is to define the artifact so that it takes that particular artifact after the build pipeline is completed. Make sure you have a trigger placed.

First, we need to add an artifact by selecting the source pipeline

Adding artifact

Make sure you have a Continuous deployment trigger enabled by clicking on the lightning icon on the top right corner. This triggers the release pipeline whenever a new artifact is available.

Continuous deployment trigger

Define a Stage

Let’s define the stage for the release. You can add as many stages as you want, and you need to define the tasks for each stage. For simplicity, we are using only one stage here.

Production

If you click on the 1 job 0 tasks link that will take you to the task details page. We have two tasks for this stage.

  1. Extract Files
  2. Azure CLI

Extract Files

The Extract Files task extracts the zip archive from the artifact and puts it in the destination folder. Make sure the Archive File Pattern is set to **/$(Build.BuildId).zip.

Extract files Task

Azure CLI

We have to select the Azure CLI task for uploading files since we are using Azure CLI commands to upload all these files from the Artifact archive.

Here is the inline script that we are using to upload all the files. Make sure you have the right subscription selected.

Upload Files

Make sure you put the correct path in the Working Directory so that you upload only the files you need into Blob storage instead of the entire directory path.

Working Directory

You might need to authorize with your subscription.

Authorize
az storage blob upload-batch -s ./ -d '$web' \
--account-name staticvueweb \
--account-key 'T0EhNSZ7PIqiH0vJ5NOxocH1tE65wyvC9gYvl4A2FtyonI6JmZ2THa8Eao2sSia1C8MSnl4oRAIfF/oa+QHs4Q=='

We are uploading to the storage account staticvueweb with the destination $web container from the source ./. We need an account key to upload the files to Azure blob storage. You can get the account key from the following location in the Azure portal under your storage account.

Access Keys
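If you would rather not paste the account key into the inline script, one option is to pass it through the environment, since the Azure CLI storage commands read the key from the AZURE_STORAGE_KEY environment variable. The sketch below only builds the command and environment; the function name and the `STORAGE_ACCOUNT_KEY` variable that holds the secret are assumptions for illustration.

```python
import os
from typing import Dict, List, Tuple


def build_upload(
    account: str,
    source: str = "./",
    container: str = "$web",
    key_env: str = "STORAGE_ACCOUNT_KEY",
) -> Tuple[List[str], Dict[str, str]]:
    """Build the az upload command and an environment that carries the key.

    The key is read from the (hypothetical) key_env variable and handed to the
    CLI via AZURE_STORAGE_KEY, so it never appears in the argument list.
    """
    key = os.environ.get(key_env)
    if not key:
        raise RuntimeError(f"environment variable {key_env} is not set")
    argv = [
        "az", "storage", "blob", "upload-batch",
        "-s", source,
        "-d", container,
        "--account-name", account,
    ]
    env = dict(os.environ, AZURE_STORAGE_KEY=key)
    return argv, env
```

The result could then be run with `subprocess.run(argv, env=env, check=True)` on a machine where the Azure CLI is installed. Passing the arguments as a list also avoids shell quoting problems with the `$web` container name.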

Demo

It’s time for the demo. As soon as you check the code in to master, it triggers the build pipeline. Usually, you don’t check code directly into master; instead, you create a pull request. I just used master here for simplicity.

Build Pipeline Demo

Build Pipeline Demo

Release Pipeline Demo

As soon as the build is completed, the new artifact triggers the release pipeline.

Release Pipeline Demo

Once it has succeeded, you can see that some of the files are updated in the Azure blob storage.

Azure Blob Storage

Here is the change deployed to the site

Deployment Successful

Summary

  • One way of deploying Vue applications is to use Azure blob storage for static web hosting.
  • All the Vue static assets are uploaded into blob storage and Azure CDN is configured to distribute the content across the world. Here is the full article on how to do that with an example project.
  • We have to build two pipelines to deploy this application using Azure DevOps: Build pipeline and Release pipeline.
  • The build pipeline generates the artifact as soon as there is a commit in the source repository.
  • The Release pipeline takes the artifact and releases it to the appropriate environment.
  • Authentication should be made before you can use pipelines to access your Repos and upload files to the Azure subscription account.
  • You can use task groups to put the common tasks at one place across the environments.
  • You can use Azure key vaults to store the access keys for your storage account.
  • You can make use of Pipeline Variables while creating tasks. It’s a convenient way to get key bits of data into various parts of the pipeline.

Conclusion

This is a basic way of deploying a Vue static website using Azure DevOps. I didn’t use variables, Azure key vault, or task groups in these pipelines because we are deploying into one environment. Those are out of scope for this article. I will write a full article on using all of these while deploying into different environments.