At work, I’ve written and contributed to several Python scripts that need to run on a regular basis. These scripts run once a day, several times a day, or in some cases every few minutes. Most of them look at data from one source and then gather data from another source to update the original. For example, a script might look at objects in Salesforce and then check our admin API for any updates that Salesforce needs. Recently, as I’ve taken a bigger role in contributing to these scripts, I’ve had the opportunity to learn more about cloud services and infrastructure. I thought I would share some of my experiences and what I’ve learned so far. I also have some thoughts on design and architecture that I am exploring as I look to improve some of the scripts that are currently deployed. In my case, I will be talking about Amazon Web Services. However, other cloud providers like Microsoft Azure and Google Cloud Platform offer equivalent services, so these ideas apply broadly.
First, how do you deploy a script or application to the cloud? The first place you might start is with Virtual Private Servers or Virtual Machines. These are “computers” carved out of larger physical servers to give you your own little computer in the cloud. On AWS these are called EC2 instances, and you can pick from a variety of configurations. To set up your application on the server, you would go about it mostly the same way you would on your local machine, except that you work through SSH in your terminal.
A side note here: one extension I strongly recommend for remote development like this, and one I use all the time, is Remote Explorer for Visual Studio Code. Basically, you can hook up your IDE to your remote machine through SSH and develop as if you were on your local machine. You can find more info here: https://code.visualstudio.com/docs/remote/ssh
Usually these virtual machines run some version of Linux, and you’ll have to do some setup to get all the tools you need to run your application. This typically means using the package manager to update any existing packages and to install the language, your project’s dependencies, and Git to download your source code. When you’re ready to start running your script on whatever cadence you need, you’ll then have to set up cron jobs. A cron job tells your machine to run a script on a schedule you define, such as every day at 12 or every 10 minutes. The one thing that can be confusing is the syntax for defining the schedule, but there’s a great tool that helps you figure out the cron expression you need: https://crontab.guru/.
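To make the cron syntax a little less mysterious, here is a small Python sketch that labels the five fields of a standard cron expression. The helper name `describe_cron` is my own for illustration, and I'm assuming "every day at 12" means noon. It only splits the expression into named fields; it doesn't validate or schedule anything.

```python
# The five fields of a standard crontab schedule, in the order they appear.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Map each part of a five-field cron expression to its field name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


# Every day at 12:00 (assuming "at 12" means noon).
daily_at_noon = describe_cron("0 12 * * *")

# Every 10 minutes: "*/10" in the minute field means "every 10th minute".
every_ten_minutes = describe_cron("*/10 * * * *")

if __name__ == "__main__":
    print(daily_at_noon)
    print(every_ten_minutes)
```

In an actual crontab, the command to run follows those five fields on the same line.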
There’s definitely a bit of setup involved with using virtual machines like this, and it can be annoying to build an environment from scratch just to run your project. This is where another service comes in: serverless functions. The whole idea is that you don’t have to set up a virtual machine or manage that part of your infrastructure. In an ideal world you can just worry about writing your code and then easily have it run. On AWS, serverless functions are called Lambda functions. You deploy your code and it just runs; AWS handles all the setup and environment magic on the backend to make your job easier. These functions are great for running a backend API or short-running scripts that need to be triggered by events or a schedule. I am working on moving a lot of our existing scripts at work to this serverless architecture. One example that I think is a perfect use case for serverless is a script that needs to run once a week and make a few calls to our admin API to create sandbox accounts. I can pretty much code this, deploy it, and forget about it. I don’t have to worry about setting up the server, and I can depend on my code running when it needs to, without worrying about the server going down during the long gaps between runs. Additionally, I don’t have to pay for the server’s computing power when I’m not using it. I would only be paying for the few minutes once a week that I use.
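As a rough idea of what the weekly sandbox script could look like as a Lambda function, here is a minimal Python sketch. The `(event, context)` handler signature is what Lambda's Python runtime calls. Everything else is an assumption for illustration: `SANDBOX_NAMES` and `create_sandbox_account` are hypothetical, and the placeholder function stands in for the real admin API call, which I'm not showing here.

```python
import json

# Hypothetical sandbox names; the real list would come from wherever you track them.
SANDBOX_NAMES = ["sandbox-a", "sandbox-b"]


def create_sandbox_account(name):
    """Placeholder for the real admin API call (hypothetical, makes no network request)."""
    # A real version would send an HTTP request to the admin API here.
    return {"name": name, "status": "created"}


def lambda_handler(event, context):
    """Entry point that Lambda invokes each time the function is triggered."""
    results = [create_sandbox_account(name) for name in SANDBOX_NAMES]
    return {"statusCode": 200, "body": json.dumps(results)}


if __name__ == "__main__":
    # Local smoke test with an empty event and no context object.
    print(lambda_handler({}, None))
```

The handler itself knows nothing about the schedule. On AWS, the weekly trigger would be configured separately, for example with a scheduled rule in Amazon EventBridge that invokes the function.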
This is just starting to scratch the surface of cloud computing, but virtual machines and serverless functions are two important and widely used concepts to know. There are a lot more services that can integrate with these compute services, and I will continue to touch on them in a future post.