Overcoming Azure DevOps Pipeline Challenges with Terraform for AKS Deployments

Day two of developing KubeConductor is officially done.

Today, I focused on configuring an Azure DevOps CI/CD pipeline to deploy an Azure Kubernetes Service (AKS) cluster using Terraform. I’ll touch on the hurdles encountered while provisioning the infrastructure and managing existing resources, as well as the steps taken to resolve them.

The Code

My public repo can be found [here].

aks-cluster.tf: This is the Terraform configuration file that provisions an Azure Kubernetes Service (AKS) cluster using the AzureRM provider.

# aks-cluster.tf
provider "azurerm" {
  features {}

  client_id       = var.client_id
  client_secret   = var.client_secret
  tenant_id       = var.tenant_id
  subscription_id = var.subscription_id
}


resource "azurerm_resource_group" "default" {
  name     = "terraform-aks-rg"
  location = "West US"
}

resource "azurerm_kubernetes_cluster" "default" {
  name                = "terraform-aks-cluster"
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name
  dns_prefix          = "terraform-aks"

  default_node_pool {
    name            = "default"
    node_count      = 2
    vm_size         = "Standard_DS2_v2"
    os_disk_size_gb = 30
  }

  identity {
    type = "SystemAssigned"
  }

  role_based_access_control_enabled = true

  tags = {
    environment = "Development"
  }
}

variables.tf: This Terraform configuration file defines the variables used in aks-cluster.tf to configure the Azure resources dynamically.

# variables.tf
variable "resource_group_name" {
  description = "Name of the Resource Group"
  default     = "terraform-aks-rg"
}

variable "client_id" {
  description = "The Client ID of the Service Principal"
}

variable "client_secret" {
  description = "The Client Secret of the Service Principal"
  sensitive   = true # redact the secret in plan/apply output (supported since Terraform 0.14)
}

variable "tenant_id" {
  description = "The Tenant ID of the Azure Active Directory"
}

variable "subscription_id" {
  description = "The Subscription ID where the resources will be created"
}

versions.tf: This Terraform configuration file specifies the required Terraform and provider versions. Pinning these helps ensure compatibility and stability.

# versions.tf
terraform {
  required_version = ">= 0.14"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 2.56"
    }
  }
}

outputs.tf: This Terraform configuration file defines output values for the infrastructure. In my case, these outputs are meant to display values after the resources have been created; that part isn’t hooked up yet, however.

# outputs.tf
output "kubernetes_cluster_name" {
  value = azurerm_kubernetes_cluster.default.name
}

output "resource_group_name" {
  value = azurerm_resource_group.default.name
}

Initial Pipeline Configuration

My goal for today was to automate the provisioning of an AKS cluster using Terraform in an Azure DevOps pipeline. The basic setup included a self-hosted Ubuntu agent and a multi-step pipeline with the following stages:

Terraform Init: Initializes the working directory containing the Terraform configuration.

Terraform Plan: Generates an execution plan to show what changes Terraform will make to your infrastructure.

Terraform Apply: Applies the changes required to reach the desired state of the configuration.

Below is a screenshot of the steps passing the CI/CD pipeline, which runs on a self-hosted Ubuntu machine.

Debugging Pipeline Environment Variables

From the beginning, I knew I wanted an application that mirrored real-world cloud infrastructure as closely as possible. That meant being careful with how I handle secret values such as IDs, passwords, and the like. I opted to use an Azure Pipelines variable group to store the sensitive information relating to my cloud infrastructure. My CI/CD pipeline fetches these variables at runtime, eliminating the need to hard-code them.

Terraform relies on a consistent mapping of environment variables for Azure authentication. I encountered errors with variables like client_id and subscription_id not being recognized correctly. To fix this, I exported each credential twice: once with the TF_VAR_ prefix (TF_VAR_client_id, TF_VAR_client_secret, TF_VAR_tenant_id, and TF_VAR_subscription_id) so Terraform can populate its input variables, and once with the ARM_ prefix (ARM_CLIENT_ID, ARM_CLIENT_SECRET, etc.) that the AzureRM provider reads for authentication.

Here is the updated YAML with that mapping:

# Updated azure-pipelines.yml with complete TF_VAR_ variable mapping

trigger:
  branches:
    include:
      - main

pool:
  name: "SelfHostedUbuntu"

variables:
  - group: Terraform-SP-Credentials

jobs:
  - job: "Deploy_AKS"
    displayName: "Provision AKS Cluster Using Terraform"
    steps:
      # Step 1: Checkout Code
      - checkout: self

      # Step 2: Verify Terraform Version
      - script: |
          terraform --version
        displayName: "Verify Installed Terraform Version"

      # Step 3: Terraform Init (Set Working Directory and Pass All Variables with TF_VAR_ Prefix)
      - script: |
          terraform init
        displayName: "Terraform Init"
        workingDirectory: $(System.DefaultWorkingDirectory)/terraform
        env:
          ARM_CLIENT_ID: $(appId)
          ARM_CLIENT_SECRET: $(password)
          ARM_TENANT_ID: $(tenant)
          ARM_SUBSCRIPTION_ID: $(AZURE_SUBSCRIPTION_ID)
          TF_VAR_client_id: $(appId) # Client ID
          TF_VAR_client_secret: $(password) # Client Secret
          TF_VAR_tenant_id: $(tenant) # Tenant ID
          TF_VAR_subscription_id: $(AZURE_SUBSCRIPTION_ID) # Subscription ID

      # Step 4: Terraform Plan (Pass All Variables with TF_VAR_ Prefix)
      - script: |
          terraform plan -out=tfplan
        displayName: "Terraform Plan"
        workingDirectory: $(System.DefaultWorkingDirectory)/terraform
        env:
          ARM_CLIENT_ID: $(appId)
          ARM_CLIENT_SECRET: $(password)
          ARM_TENANT_ID: $(tenant)
          ARM_SUBSCRIPTION_ID: $(AZURE_SUBSCRIPTION_ID)
          TF_VAR_client_id: $(appId)
          TF_VAR_client_secret: $(password)
          TF_VAR_tenant_id: $(tenant)
          TF_VAR_subscription_id: $(AZURE_SUBSCRIPTION_ID)
      # Step 5: Terraform Apply (Set Working Directory and Pass All Variables)
      - script: |
          terraform apply -auto-approve tfplan
        displayName: "Terraform Apply"
        workingDirectory: $(System.DefaultWorkingDirectory)/terraform
        env:
          ARM_CLIENT_ID: $(appId)
          ARM_CLIENT_SECRET: $(password)
          ARM_TENANT_ID: $(tenant)
          ARM_SUBSCRIPTION_ID: $(AZURE_SUBSCRIPTION_ID)
          TF_VAR_client_id: $(appId)
          TF_VAR_client_secret: $(password)
          TF_VAR_tenant_id: $(tenant)
          TF_VAR_subscription_id: $(AZURE_SUBSCRIPTION_ID)

Azure CLI Summary:

Here’s a list of the Azure CLI commands I used today, whether for debugging, sanity checks, or simply because Terraform needed them.

1. Login to Azure Using Service Principal:

Authenticates to Azure using Service Principal credentials (appId, password, and tenant).

az login --service-principal -u <appId> -p <password> --tenant <tenant>

2. Retrieve AKS Cluster Credentials:

Fetches the AKS cluster credentials and configures kubectl to use this cluster.

az aks get-credentials --resource-group <resource_group_name> --name <kubernetes_cluster_name>

3. Browse Kubernetes Dashboard:

Opens the Kubernetes dashboard for the specified AKS cluster in a local browser by creating a proxy to the cluster.

az aks browse --resource-group <resource_group_name> --name <kubernetes_cluster_name>

4. Create Service Principal:

Creates an Azure Active Directory Service Principal for authentication in Terraform.

az ad sp create-for-rbac --skip-assignment

Takeaways

State Management Needs Improvement: The pipeline deploys the AKS cluster and pods successfully if the resource group does not already exist. However, reruns fail because state management is not set up yet, and fixing that requires configuring a remote state backend.

Streamlined Pipeline Steps: The final pipeline configuration effectively provisions the AKS cluster but needs fine-tuning for state management and validation.

Next Steps:

  • Implement a remote state backend to centralize state management (sketched below).
  • Reintroduce verification and post-deployment steps (kubectl commands) in the pipeline.
  • Optimize the CI/CD flow to handle both new and existing infrastructure.
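
For the remote state backend, the rough shape I have in mind is an azurerm backend block pointing at an Azure Storage container. This is only a sketch, not something in the repo yet; the resource group, storage account, and container names below are placeholders that would need to be created beforehand:

# backend.tf (planned)
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"           # placeholder
    storage_account_name = "kubeconductortfstate" # placeholder, must be globally unique
    container_name       = "tfstate"
    key                  = "kubeconductor.terraform.tfstate"
  }
}

For the verification step, something like the following at the end of the job should do, assuming the Azure CLI and kubectl are installed on the agent (untested sketch):

      # (Planned) Step 6: Verify the cluster after apply
      - script: |
          az aks get-credentials --resource-group terraform-aks-rg --name terraform-aks-cluster --overwrite-existing
          kubectl get nodes
        displayName: "Verify AKS Cluster"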

Finally, I would like to thank my incredibly hard-working agent — SelfHostedUbuntu! Look at him just chugging along and handling errors like a champ :,-)

Bye for now!

KubeConductor

The first day of development went well. I spun up various services in Azure, set up a self-hosted agent for my CI/CD, and installed Terraform onto my system. I gained deeper knowledge of containers, container orchestration, and the tools that facilitate defining and provisioning cloud infrastructure.

This is how it went:

Self-hosted Ubuntu server: While neither ideal nor easy, I had to go with a self-hosted machine since Azure Pipelines imposes a waiting period of one to three days before granting parallel jobs. Thankfully, Azure provides students with $100 of free credit. I was able to spin up an entire resource group consisting of a network security group (NSG) with inbound and outbound rules, a virtual network (VNet), and a network interface, completely for free!

Unfortunately, since the server was running Ubuntu, RDP-ing into the machine would be a little complicated. My alternative was to simply SSH into the machine, but I worried I lacked the technical knowledge to do the entire machine configuration through the command line. After some research, I found a Microsoft document that details how to install and configure a desktop environment (xfce) and a remote desktop server (xrdp) on Linux VMs running Ubuntu. This ultimately allowed me to RDP into the Linux VM from my own Linux machine (using Remmina). It was my first time configuring RDP from one Linux machine to another.
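
For reference, the setup from that document boils down to roughly the following commands, run on the VM over SSH, plus opening port 3389 on the NSG (the resource names here are placeholders):

# Install a lightweight desktop environment and xrdp on the Ubuntu VM
sudo apt-get update
sudo apt-get install -y xfce4 xrdp
sudo systemctl enable xrdp
echo xfce4-session >~/.xsession   # tell xrdp to start an xfce session
sudo service xrdp restart

# Open the RDP port on the VM (run from my workstation)
az vm open-port --resource-group <resource_group_name> --name <vm_name> --port 3389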

Success!


CI/CD: Now it was time to connect my Azure Pipelines CI/CD to my Ubuntu agent! Again, there was some extra work involved since I was opting for a self-hosted agent: namely, installing the 3.x agent software on the remote VM so that it can listen for Azure Pipelines jobs. Microsoft once again came through with some great documentation detailing how to install the 3.x agent on an Ubuntu machine. Eventually, I had the server listening for jobs from my Azure Pipelines CI/CD!
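
Paraphrasing that documentation, the agent setup on the VM looks roughly like this (the exact tarball version will differ, and the configuration step prompts for the organization URL, a PAT, and the agent pool):

# Download the Linux x64 agent from the Agent pools page, then on the VM:
mkdir myagent && cd myagent
tar zxvf ~/Downloads/vsts-agent-linux-x64-3.*.tar.gz

./config.sh   # interactive: organization URL, PAT, agent pool, agent name

# Either run the agent interactively...
./run.sh
# ...or install it as a systemd service so it survives reboots
sudo ./svc.sh install
sudo ./svc.sh start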

The final step was to actually run my CI/CD against the target Ubuntu machine. I made a simple YAML file targeting my agent pool that echoes “Hello, World!”:

trigger:
- main

pool:
  name: SelfHostedUbuntu  # Self-hosted agent

steps:
- script: echo Hello, World!
  displayName: 'Run a one-line script'

- script: |
    echo Add other tasks to build, test, and deploy your project.
    echo See https://aka.ms/yaml
  displayName: 'Run a multi-line script'

I decided to host my CI/CD code on GitHub (here) since that’s what I’m most comfortable with. The README details my timeline for features.

Finally, the moment of truth. I ran the Azure pipeline and… success!

As the screenshots above show, both runs passed the pipeline and my agent is registering as online!

Terraform: Not much to report here for now. I installed Terraform onto my Linux machine running Pop!_OS 22.04, an Ubuntu-based operating system. Installation was smooth; a few commands were enough to get Terraform running out of the box:

sudo apt-get update && sudo apt-get install -y gnupg software-properties-common

wget -O- https://apt.releases.hashicorp.com/gpg | \
gpg --dearmor | \
sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg > /dev/null

echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] \
https://apt.releases.hashicorp.com $(lsb_release -cs) main" | \
sudo tee /etc/apt/sources.list.d/hashicorp.list

sudo apt update
sudo apt-get install terraform

Conclusion: I spent the rest of the day learning about AKS, EKS, Kubernetes, and containers. I learned how Kubernetes is an orchestration service for containers and how Terraform facilitates defining and provisioning cloud infrastructure using a declarative language. My next step is setting up a single-cluster Kubernetes environment on Azure with basic resources such as virtual networks, subnets, roles, and network security groups. I’m still debating whether to use a simple, dumb microservice or create some useful microservice-driven application. I plan on using Go for the microservice for its simplicity, quick startup time, static typing, and concurrency support.

Bye for now!

Looking back…

There were many challenges at the beginning of my capstone project. Some students left the team, and new ones joined. We settled on a programming language, but that changed once requirements became clear: we started with Node.js, but ended up using a mix of Go and Node.js. Many things had to change, mainly due to my team’s (and my own) lack of technical knowledge when it comes to setting up a development project from scratch.

However, I never viewed changing requirements as a downside. Our initial design and requirements document served as a good starting point and reflected the limited knowledge my team had in software development. Every time something in the original plan had to change, it was because someone on the team had learned enough about a new technology, and how it related to our project, to understand that we had to pivot to something better.

Reading, practicing, and debugging are the best ways to learn a new technology. Reading or watching tutorials is not enough; to grow as a developer you need to build a new project and struggle through the process of debugging it. Our initial design and requirements documents laid the foundation that allowed us to start building basic functionality for our application. This process allowed us to identify areas in our original plan that had to change.

I chose this project because I had internship experience as a DevOps engineer that I really enjoyed. I saw that this project would let me build CI/CD pipelines and configure a custom Git workflow for my team. I also had experience with Azure DevOps, and I wanted to see if those skills transferred over to AWS (they do).

Overall, I wish my team had made more progress on the application, but a lack of technical knowledge really held us back initially. I’m hopeful that the rest of the year will be productive.

Clean Code!

Programming isn’t all about knowing syntax and writing code that compiles successfully; it’s also about writing code in a structured way that makes it easier for developers to reason about complex logic.

When I first started programming, my only concern was writing code that would run. When I started my first internship and got my first look at professional code, I realized that writing code was much, much more complex than I originally thought. Enterprise code bases are overwhelmingly complex and take weeks, or even months, to fully grasp. Understanding a complex code base was far easier whenever I came across code that was structured in a meaningful way. Luckily, I was a part of a team that prioritized writing clean code: from the file structure and naming conventions for files and folders, to the patterns used in our object-oriented code, it was all important to ensure that the code base remained robust and easy to understand.

For my capstone project, I’m applying much of what I learned from my internships to ensure that I write good code that will be easy for my team to understand. I plan to maintain a logical file structure with good naming conventions for all the files, classes, and methods in my code. I’m currently setting up CRUD operations for user accounts using TypeScript and object-oriented programming, while making sure I use encapsulation, inheritance, composition, and polymorphism effectively.

Another trick I learned in my internship was using decorators to extend the behavior of a class. For example, I plan to use decorators for authentication and authorization when users access protected parts of the application.
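
As a rough sketch of the idea (not actual project code; the names here are made up, and it relies on TypeScript’s legacy experimentalDecorators-style method decorators):

// auth-decorator.ts: hypothetical role check wrapped around a method.
// Requires "experimentalDecorators": true in tsconfig.json.
type User = { id: string; roles: string[] };

function requireRole(role: string) {
  return function (_target: object, _key: string, descriptor: PropertyDescriptor) {
    const original = descriptor.value;
    // Replace the method with a wrapper that checks the caller's roles first
    descriptor.value = function (user: User, ...args: unknown[]) {
      if (!user.roles.includes(role)) {
        throw new Error(`Forbidden: "${role}" role required`);
      }
      return original.apply(this, [user, ...args]);
    };
    return descriptor;
  };
}

class AccountService {
  @requireRole("admin")
  deleteAccount(user: User, accountId: string): string {
    return `account ${accountId} deleted by ${user.id}`;
  }
}

The decorated method behaves normally for an admin user and throws for anyone else, which keeps the authorization check out of the business logic itself.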

Writing clean code is just as important as writing code that runs, especially when working with a team or working on a long term project. Clean code improves code readability, which in turn improves productivity when a new team member joins the team or when refactoring is needed.

Wednesday, December 6, 2023

Before starting my capstone, all of my projects were done from scratch with minimal use of Software as a Service (SaaS). Previously, when I wanted to add login/authentication to an application, I researched JSON Web Tokens (JWT) and implemented something from scratch. It was very time-consuming and felt like it distracted me from the bigger picture of the project. With SaaS offerings from AWS, however, all of that is managed for you.

Another example is setting up a database. There is a lot of complexity involved with setting up a database for a large project — server configuration, scaling, monitoring, etc. Luckily, there are SaaS options from AWS that handle everything for you.

I realized the importance of using SaaS to save time and money, especially for large projects organized around agile practices, where quick product delivery is key.

My teammates decided that we will use AWS for most of our project functionality since the application is expected to perform various complex functions.

Experience with AWS, and SaaS in general, will help me land a job as a software developer. I recently accepted an internship for a DevOps position where tools like Terraform and Azure are used regularly. For this project, however, we are using AWS, but I expect there to be a lot of overlap between AWS and Azure.

Overall, I’m happy with my team and how things are progressing. The project mentors are always eager to help despite their busy schedules. My teammates all seem excited to get the project off the ground despite the hiccups we had at the beginning of the term. The class is structured nicely to ensure that we are consistently making progress, while making sure that everyone on the team does their part. One difficulty is working with the in-person team since our schedules and assignments don’t align. My team is adapting well, however, and I expect it to get easier as time goes on.

I’m glad this difficult term is coming to a close and winter break is right around the corner. I plan on getting some AWS and Azure certifications during the break and maybe starting a small personal project of my own. I’m excited for winter term since it will involve much more development.

Tuesday, November 14, 2023

My previous internships really prepared me to tackle complex projects with vague requirements. I’ve used the skills I acquired from those internships in tackling my capstone project.

Two members of my team changed projects, and a new person was added to the team pretty late into the term. Instead of being stressed and angry about it, I just rolled with it. The two team members left the group due to vague project requirements from the project mentors. Initially, I was also considering changing projects, but I realized, given the vague requirements, that this was a fantastic opportunity to be creative with how I tackle the project. The OSU CS program and my internship experience have taught me the importance of being flexible and tackling problems with a positive mindset.

So far, I’ve designed a backend for the project that uses existing services from Amazon. These services include a database, real-time data ingestion, authentication/authorization, and payment processing. The scope of the project is large, so I opted to use those existing services until we encounter a roadblock that requires us to implement something from scratch. These services will all be triggered by a Lambda function that is called from an API gateway. I also plan on taking a “DevOps” role on the team so that I can manage our CI/CD pipelines and troubleshoot issues with AWS.

We are using Flutter for the frontend, which I enjoy a lot since it’s a relevant and new technology. It also lends itself to the microservice architecture, which I expect will become more relevant as our project increases in complexity. But for now, there is a single API gateway that the frontend will call to trigger an AWS Lambda function.

Overall, the initial parts of the project were difficult since the requirements were vague and my teammates seemed uninterested in the project. However, in my experience, no real-world project has explicit and clear requirements. As developers we should tackle vague requirements head-on and learn as we go. I’m glad I decided to stay with this project since it has enabled me to be creative in how I tackle a complex project.

Monday, October 2, 2023

Introduction

My name is Osbaldo Arellano and I’m a senior at Oregon State University studying Computer Science. I live in a small farm-working community called Gervais, located in Oregon. I enjoy playing challenging video games in my free time. I’m currently playing Lies of P and Baldur’s Gate 3, both of which are fantastic, but difficult, games. I also enjoy fishing for salmon and trout in Oregon’s scenic rivers.

My interest in computers and software began when I was a kid. My obsession with having a cool MySpace profile in middle school led me to learn how to edit HTML and CSS. It also taught me how to view the source code of various web pages. I recall being amazed that characters on a screen could produce visually appealing websites. It was basically magic to me. Through middle school and high school, I was obsessed with software. I would download all the cool iOS apps and desktop software. Although I didn’t know how any of it worked, I was amazed at what computers can do. After high school, however, I did not pursue a future in software.

I joined a trade as an HVAC technician apprentice for three years before finally deciding to chase my dream of studying computer science. Quitting my job to go back to school was a huge leap for me; I was one year away from becoming a licensed journeyman. But I was young, so I took the leap and enrolled in Chemeketa Community College. After finishing my first computer science course, I realized that I truly did have a passion for programming and computer science.

It’s been three years and now I’m a senior at OSU. I have two internships under my belt, one in back-end software development and the other in DevOps engineering. I’ve taken courses in everything from parallel programming to algorithms, and I have completed a ton of projects. I don’t have a favorite technology or framework, but I really enjoy working on .NET/C# projects because of the emphasis on object-oriented programming.

The capstone project that interests me the most is Betchya, a mobile app that brings mobile betting to tribal casinos. There has been a huge increase in mobile betting recently, so I think that getting experience in the virtual betting world could be beneficial to me when I start looking for jobs. It’s also an opportunity to work with Native American people and help them have a piece of the pie in the rise of online betting.