AWS networking basics with Terraform and a tiny bit of microservices

I have a moniker for myself: the accidental software engineer. Five years ago I never imagined I would be in this industry, nor did I imagine that I would soon have to learn DevOps and AWS architecture, but I guess that is par for the course at a startup. I found myself in an awkward situation late last year when our DevOps engineer gave notice that he was leaving. I had to pick up the skill; otherwise, I would not have been able to deploy the code I had written. This exposure to our AWS infrastructure piqued my interest to dive deeper into the subject, and after months of hacking around, here we are: the accidental DevOps.

In this article, I will walk through, and explain where I can, the basic networking architecture in AWS. I will implement a very simple application that sends an email once you interact with the server. The image below shows what I will achieve.

In the above image, we have a virtual private cloud inside an AWS region, with public subnets where our flask web servers are located. A load balancer distributes the traffic going to these two servers. We also have a private subnet where we run a python service that communicates with an SQS queue, extracts the payload and sends an email. The code for the servers and the service is implemented in python, whilst the AWS resources are created with Terraform, an infrastructure-as-code tool (yes, you can create all these resources without clicking a button for the most part). You can replicate this project by opening an AWS account, pulling the code from this repository here and making slight changes. Remember to tear down everything you create in AWS afterwards, else you might incur some charges, especially with anything named elastic (aren't most of the resources there named this way).

When you create an AWS account, you are assigned a default region and AWS also creates a virtual private cloud (VPC) for you. A VPC is basically your own slice of a data centre where you can host your servers. Each region also has availability zones: physically separate locations within the region, designed to act as a buffer against floods, bad weather and other hazards that could cause data loss. Inside every VPC are subnets. These can be private or public depending on the route table configuration and security group they are associated with. Since your network is private, you need some way to communicate with the internet. You basically have three options: an internet gateway for public subnets, a VPC endpoint, or a NAT Gateway. The last option is very expensive and, in my opinion, should be avoided, especially if you are doing this on a private account. Now let us see some code.
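Below is a minimal sketch of what this first Terraform section might look like. The resource names and CIDR ranges are my own illustrative choices, not necessarily the exact values used in the repository.

```hcl
provider "aws" {
  region = "eu-north-1" # Stockholm
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true # needed later for the VPC endpoints
  enable_dns_hostnames = true
}

resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
}

# One public and one private subnet in each availability zone.
resource "aws_subnet" "public_a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "eu-north-1a"
}

resource "aws_subnet" "public_b" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.2.0/24"
  availability_zone = "eu-north-1b"
}

resource "aws_subnet" "private_a" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.3.0/24"
  availability_zone = "eu-north-1a"
}

resource "aws_subnet" "private_b" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.4.0/24"
  availability_zone = "eu-north-1b"
}

# Public route table: anything that is not local goes to the internet gateway.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id
  }
}

# Private route table: only the implicit local route, so no internet access.
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
}

resource "aws_route_table_association" "public_a" {
  subnet_id      = aws_subnet.public_a.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "public_b" {
  subnet_id      = aws_subnet.public_b.id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private_a" {
  subnet_id      = aws_subnet.private_a.id
  route_table_id = aws_route_table.private.id
}

resource "aws_route_table_association" "private_b" {
  subnet_id      = aws_subnet.private_b.id
  route_table_id = aws_route_table.private.id
}
```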

In the first part of this code, I have used the eu-north-1 region, Stockholm, which is a favorite city of mine. We create the VPC, an internet gateway and four subnets, two in each of the availability zones 1a and 1b. CIDR blocks are out of the scope of this article, but the subnets usually take a subset of the VPC's CIDR block. To instruct the VPC where to direct incoming traffic, we need route tables. By default, AWS creates a local route so the router can handle all traffic originating from within the VPC. If you want a route to the outside world, you need to add a route that matches all IP addresses, "0.0.0.0/0", and point it at the id of your internet gateway. Finally, we associate the route tables with their respective subnets. The configuration of these route tables is what classifies a subnet as public or private.

In this second section, we create VPC endpoints, security groups and load balancers. Remember I said that for private subnets to communicate with AWS services, you need either a NAT Gateway, which is very expensive as my savings sadly discovered, or a VPC endpoint. The VPC endpoint creates a private link from your VPC to AWS resources such as S3, SQS and DynamoDB. For this to work, you need to enable DNS support and DNS hostnames on your VPC and enable private DNS when configuring your endpoints. There is something worth noting here, and I spent a lot of time trying to get this part to work: once you enable private DNS for the VPC endpoints, both your public and private subnets will reach these AWS services through the endpoints, according to this article. This also means you have to attach the security groups and the respective private and public subnets to the VPC endpoint configurations.
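Here is a sketch of the SQS endpoint. One caveat: an interface endpoint accepts at most one subnet per availability zone, so in this sketch I attach the two private subnets and give the endpoint its own security group that admits HTTPS from anywhere inside the VPC; with private DNS enabled, the public subnets resolve SQS through the same endpoint. The names are my assumptions.

```hcl
# Security group for the endpoint itself: allow HTTPS from inside the VPC.
resource "aws_security_group" "vpce" {
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.main.cidr_block]
  }
}

resource "aws_vpc_endpoint" "sqs" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.eu-north-1.sqs"
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true # the regular SQS hostname now resolves privately
  subnet_ids          = [aws_subnet.private_a.id, aws_subnet.private_b.id]
  security_group_ids  = [aws_security_group.vpce.id]
}
```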

We then configure the security groups. For the public security group, we open port 80 for incoming http traffic. This is also the port we expose in the docker container hosting our flask server. For outgoing traffic we open all ports. Another gotcha here is that you have to reference the load balancer's security group in the security group of the web servers, so that they accept traffic only from the load balancer. This setup actually means that, if we wanted, our web servers could have been hosted in a private subnet, since the load balancer is the public-facing part. Finally, we configure our load balancer, attaching a listener that listens for http traffic on port 80 and forwards it to a target group created inside our VPC. In this project, the load balancer's health check only probes the endpoint "/api". The default is "/", but that would always fail because the server does not serve this route.
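A sketch of those pieces, assuming IP targets as used by Fargate tasks; names are illustrative:

```hcl
# Load balancer security group: HTTP in from anywhere, everything out.
resource "aws_security_group" "lb" {
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Web server security group: only accepts HTTP from the load balancer.
resource "aws_security_group" "public" {
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.lb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Private service security group: no ingress, open egress towards the endpoints.
resource "aws_security_group" "private" {
  vpc_id = aws_vpc.main.id

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_lb" "web" {
  load_balancer_type = "application"
  security_groups    = [aws_security_group.lb.id]
  subnets            = [aws_subnet.public_a.id, aws_subnet.public_b.id]
}

# The target group is created inside our VPC; health checks hit /api.
resource "aws_lb_target_group" "web" {
  port        = 80
  protocol    = "HTTP"
  target_type = "ip" # Fargate tasks register by IP address
  vpc_id      = aws_vpc.main.id

  health_check {
    path = "/api"
  }
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.web.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}
```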

In the third section, we create the resources and the IAM role that our containers need to operate. We only need one extra AWS resource: SQS (Simple Queue Service). When a user hits the api endpoint of our server, like www.whatever.com/api?name=George&email=xxx@xxx.com, the flask server reads the query strings and puts them on the queue. The python microservice then extracts this information from the queue and sends an email to the address given. To run our containers on AWS, we need a service role that can access S3 to download the files needed to set up the serverless machine, SQS to read and write to the queue, Secrets Manager to download the secrets the repository needs, CloudWatch to store our logs, ECR to pull the images pushed to the amazon private registry, and SES to send emails to the clients. I have given this role full access to all these resources, but usually you grant only what is needed; for example, the server only needs to push to SQS and the python service only needs to read from it. I didn't want to complicate the project with these details.
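A sketch of the queue and the role, attaching the AWS managed full-access policies as described above (the S3 and Secrets Manager attachments would follow the same pattern); the names are placeholders.

```hcl
resource "aws_sqs_queue" "emails" {
  name = "email-queue"
}

# Trust policy so that ECS tasks can assume the role.
data "aws_iam_policy_document" "assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "task" {
  name               = "email-task-role"
  assume_role_policy = data.aws_iam_policy_document.assume.json
}

# Full access is overly broad, as noted above; scope these down in real life.
resource "aws_iam_role_policy_attachment" "sqs" {
  role       = aws_iam_role.task.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSQSFullAccess"
}

resource "aws_iam_role_policy_attachment" "ses" {
  role       = aws_iam_role.task.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSESFullAccess"
}

resource "aws_iam_role_policy_attachment" "logs" {
  role       = aws_iam_role.task.name
  policy_arn = "arn:aws:iam::aws:policy/CloudWatchLogsFullAccess"
}

resource "aws_iam_role_policy_attachment" "ecr" {
  role       = aws_iam_role.task.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}
```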

In the last Terraform section, we create an ECR (Elastic Container Registry) repository. This is where we host our docker images. Afterwards, we create our cluster and two task definitions. The tasks describe the serverless instances that host our server and service. Make sure you activate logging to be able to see what is happening inside your instances. We also need to create two services. These services start up the tasks and make sure they keep running; whenever a task goes down, it is the responsibility of the service to start it back up. It is also in the service that we assign the load balancer. For the web service, we attach the load balancer, the public subnets and the public security group. For the python service, we attach the private subnets and the private security group.
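A sketch of the web half of this section; the python service is analogous, minus the load_balancer block and with the private subnets and security group instead. Names and CPU/memory sizes are my assumptions.

```hcl
resource "aws_ecr_repository" "web" {
  name = "flask-web" # a second repository for the python service is created the same way
}

resource "aws_ecs_cluster" "main" {
  name = "email-cluster"
}

resource "aws_cloudwatch_log_group" "web" {
  name = "/ecs/web"
}

resource "aws_ecs_task_definition" "web" {
  family                   = "web"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = 256
  memory                   = 512
  execution_role_arn       = aws_iam_role.task.arn
  task_role_arn            = aws_iam_role.task.arn

  container_definitions = jsonencode([{
    name         = "web"
    image        = "${aws_ecr_repository.web.repository_url}:latest"
    portMappings = [{ containerPort = 80 }]
    logConfiguration = {
      logDriver = "awslogs"
      options = {
        awslogs-group         = aws_cloudwatch_log_group.web.name
        awslogs-region        = "eu-north-1"
        awslogs-stream-prefix = "web"
      }
    }
  }])
}

resource "aws_ecs_service" "web" {
  name            = "web"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.web.arn
  desired_count   = 2 # the two web servers from the diagram
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = [aws_subnet.public_a.id, aws_subnet.public_b.id]
    security_groups  = [aws_security_group.public.id]
    assign_public_ip = true
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.web.arn
    container_name   = "web"
    container_port   = 80
  }
}
```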

This is our server. It is a python flask server listening on port 80. It retrieves the query strings from the "/api" endpoint and posts the name and email to SQS.
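A minimal sketch of what that server might look like, assuming the queue URL is passed in as an environment variable:

```python
import json
import os

import boto3
from flask import Flask, request

app = Flask(__name__)
sqs = boto3.client("sqs", region_name="eu-north-1")
QUEUE_URL = os.environ["QUEUE_URL"]  # assumed to be injected into the task


@app.route("/api")
def api():
    # Read the query strings, e.g. /api?name=George&email=xxx@xxx.com
    name = request.args.get("name")
    email = request.args.get("email")
    if name and email:
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps({"name": name, "email": email}),
        )
    # Returning 200 here also keeps the load balancer health check happy.
    return "ok"


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=80)
```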

This is the python service that retrieves the message and sends the email. It is worth noting that you can't use boto3 SES here, because the VPC endpoint only supports the SMTP interface, so you have to go into AWS and create your own SMTP user. Unfortunately, I didn't add this part to the Terraform script; you have to do it in the AWS console.
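A sketch of the consumer loop, assuming the SES SMTP credentials and sender address come in as environment variables; the SMTP host shown is illustrative and depends on your region.

```python
import json
import os
import smtplib
from email.message import EmailMessage

import boto3

sqs = boto3.client("sqs", region_name="eu-north-1")
QUEUE_URL = os.environ["QUEUE_URL"]
SMTP_HOST = os.environ.get("SMTP_HOST", "email-smtp.eu-north-1.amazonaws.com")
SMTP_USER = os.environ["SMTP_USER"]  # the manually created SMTP user
SMTP_PASSWORD = os.environ["SMTP_PASSWORD"]
SENDER = os.environ["SENDER_EMAIL"]

while True:
    # Long-poll the queue for up to 20 seconds.
    response = sqs.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=20)
    for message in response.get("Messages", []):
        payload = json.loads(message["Body"])

        msg = EmailMessage()
        msg["Subject"] = "Your email"
        msg["From"] = SENDER
        msg["To"] = payload["email"]
        msg.set_content(f"Hello {payload['name']} here is your email")

        # SES only speaks SMTP through the VPC endpoint, hence smtplib.
        with smtplib.SMTP(SMTP_HOST, 587) as server:
            server.starttls()
            server.login(SMTP_USER, SMTP_PASSWORD)
            server.send_message(msg)

        # Delete the message so it is not processed again.
        sqs.delete_message(
            QueueUrl=QUEUE_URL,
            ReceiptHandle=message["ReceiptHandle"],
        )
```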

To build the images, you need a Dockerfile like the one below for both the server and the python service, plus a requirements.txt file. Don't worry, the code is on GitHub in case you have doubts on how to do it.
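Something along these lines, assuming the flask app lives in app.py:

```dockerfile
FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds.
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

EXPOSE 80

CMD ["python", "app.py"]
```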

Finally, to build and push your images, you run something like the commands below, shown here for the python service image and the ECR repository created in Terraform.
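The account id and repository name below are placeholders for your own.

```bash
# Authenticate docker against the private registry, then build, tag and push.
aws ecr get-login-password --region eu-north-1 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.eu-north-1.amazonaws.com

docker build -t email-service .
docker tag email-service:latest 123456789012.dkr.ecr.eu-north-1.amazonaws.com/email-service:latest
docker push 123456789012.dkr.ecr.eu-north-1.amazonaws.com/email-service:latest
```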

At the end of the day, this is a lot of work just to create a server and a service that sends an email saying "Hello George here is your email", but I hope you enjoyed this article. Leave a comment below if you have a question or if something didn't work, or just shoot me an email. Maybe you too could become an accidental DevOps.
