Day two of developing KubeConductor is officially done.
Today, I focused on configuring an Azure DevOps CI/CD pipeline to deploy an Azure Kubernetes Service (AKS) cluster using Terraform. I’ll touch on the hurdles encountered while provisioning the infrastructure and managing existing resources, as well as the steps taken to resolve them.
The Code
My public repo can be found [here].
aks-cluster.tf: This is the Terraform configuration file that provisions an Azure Kubernetes Service (AKS) cluster using the AzureRM provider.
# aks-cluster.tf
provider "azurerm" {
  features {}
  client_id       = var.client_id
  client_secret   = var.client_secret
  tenant_id       = var.tenant_id
  subscription_id = var.subscription_id
}

resource "azurerm_resource_group" "default" {
  name     = "terraform-aks-rg"
  location = "West US"
}

resource "azurerm_kubernetes_cluster" "default" {
  name                = "terraform-aks-cluster"
  location            = azurerm_resource_group.default.location
  resource_group_name = azurerm_resource_group.default.name
  dns_prefix          = "terraform-aks"

  default_node_pool {
    name            = "default"
    node_count      = 2
    vm_size         = "Standard_DS2_v2"
    os_disk_size_gb = 30
  }

  identity {
    type = "SystemAssigned"
  }

  role_based_access_control_enabled = true

  tags = {
    environment = "Development"
  }
}
variables.tf: This Terraform configuration file defines the variables used in the aks-cluster.tf file to dynamically configure the Azure resources.
# variables.tf
variable "resource_group_name" {
  description = "Name of the Resource Group"
  default     = "terraform-aks-rg"
}

variable "client_id" {
  description = "The Client ID of the Service Principal"
}

variable "client_secret" {
  description = "The Client Secret of the Service Principal"
}

variable "tenant_id" {
  description = "The Tenant ID of the Azure Active Directory"
}

variable "subscription_id" {
  description = "The Subscription ID where the resources will be created"
}
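Two small things stand out in these files: resource_group_name is declared here, but aks-cluster.tf hard-codes the same string, and client_secret is not marked as sensitive. A possible tightening is sketched below, not what the repo currently does. It assumes Terraform 0.14 or newer (which versions.tf already requires), and the two blocks would replace the existing ones rather than sit alongside them.

```hcl
# variables.tf (sketch): same variable as before, with a type and redaction added
variable "client_secret" {
  description = "The Client Secret of the Service Principal"
  type        = string
  sensitive   = true # Terraform redacts the value in plan/apply output
}

# aks-cluster.tf (sketch): read the name from the variable instead of repeating it
resource "azurerm_resource_group" "default" {
  name     = var.resource_group_name
  location = "West US"
}
```

With the default left at "terraform-aks-rg", this should produce the same resource group as before while giving the name a single source of truth.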
versions.tf: This Terraform configuration file specifies the required Terraform and provider versions needed. It helps ensure compatibility and stability by enforcing versioning.
# versions.tf
terraform {
  required_version = ">= 0.14"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 2.56"
    }
  }
}
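One caveat: a ">=" constraint only sets a floor, so a fresh terraform init can pull in a newer major release of the provider. If stability matters more than staying current, a pessimistic constraint is one option. In the sketch below, the 3.x line is purely an illustrative assumption; the project does not specify which major version it was tested against.

```hcl
# versions.tf (sketch): pin the provider to a single major version
terraform {
  required_version = ">= 0.14"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0" # assumption: allows any 3.x release, but not 4.0 or later
    }
  }
}
```

Committing the generated .terraform.lock.hcl file alongside this would additionally fix the exact provider version between pipeline runs.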
outputs.tf: This Terraform configuration file defines output variables for the infrastructure. In my case, these outputs are used to display values after the resources have been created. This functionality is not implemented yet, however.
# outputs.tf
output "kubernetes_cluster_name" {
  value = azurerm_kubernetes_cluster.default.name
}

output "resource_group_name" {
  value = azurerm_resource_group.default.name
}
Initial Pipeline Configuration
My goal for today was to automate the provisioning of an AKS cluster using Terraform in an Azure DevOps pipeline. The basic setup included a self-hosted Ubuntu agent and a multi-step pipeline with the following stages:
Terraform Init: Initializes a new or existing Terraform configuration.
Terraform Plan: Generates an execution plan to show what changes Terraform will make to your infrastructure.
Terraform Apply: Applies the changes required to reach the desired state of the configuration.
Below is a screenshot of the steps passing the CI/CD pipeline, which runs on a self-hosted Ubuntu machine.

Debugging Pipeline Environment Variables
From the beginning, I knew I wanted an application that mirrored real-world cloud infrastructure as closely as possible. This meant that I needed to be careful with how I handle secret values such as IDs and passwords. I opted to store the sensitive information relating to my cloud infrastructure in an Azure Pipelines variable group. My CI/CD pipeline fetches these variables at runtime, eliminating the need to hard-code them.
Terraform relies on a consistent mapping of environment variables for Azure authentication. I encountered errors with variables like client_id and subscription_id not being recognized correctly. To fix this, I standardized the mapping in every step: the TF_VAR_ prefix supplies the Terraform input variables declared in variables.tf (TF_VAR_client_id, TF_VAR_client_secret, TF_VAR_tenant_id, and TF_VAR_subscription_id), and the same credentials are also exported under the names the AzureRM provider reads from the environment (ARM_CLIENT_ID, ARM_CLIENT_SECRET, etc.).
Here is the updated yml to match those requirements:
# Updated azure-pipelines.yml with complete TF_VAR_ variable mapping
trigger:
  branches:
    include:
      - main

pool:
  name: "SelfHostedUbuntu"

variables:
  - group: Terraform-SP-Credentials

jobs:
  - job: "Deploy_AKS"
    displayName: "Provision AKS Cluster Using Terraform"
    steps:
      # Step 1: Checkout Code
      - checkout: self

      # Step 2: Verify Terraform Version
      - script: |
          terraform --version
        displayName: "Verify Installed Terraform Version"

      # Step 3: Terraform Init (Set Working Directory and Pass All Variables with TF_VAR_ Prefix)
      - script: |
          terraform init
        displayName: "Terraform Init"
        workingDirectory: $(System.DefaultWorkingDirectory)/terraform
        env:
          ARM_CLIENT_ID: $(appId)
          ARM_CLIENT_SECRET: $(password)
          ARM_TENANT_ID: $(tenant)
          ARM_SUBSCRIPTION_ID: $(AZURE_SUBSCRIPTION_ID)
          TF_VAR_client_id: $(appId) # Client ID
          TF_VAR_client_secret: $(password) # Client Secret
          TF_VAR_tenant_id: $(tenant) # Tenant ID
          TF_VAR_subscription_id: $(AZURE_SUBSCRIPTION_ID) # Subscription ID

      # Step 4: Terraform Plan (Pass All Variables with TF_VAR_ Prefix)
      - script: |
          terraform plan -out=tfplan
        displayName: "Terraform Plan"
        workingDirectory: $(System.DefaultWorkingDirectory)/terraform
        env:
          ARM_CLIENT_ID: $(appId)
          ARM_CLIENT_SECRET: $(password)
          ARM_TENANT_ID: $(tenant)
          ARM_SUBSCRIPTION_ID: $(AZURE_SUBSCRIPTION_ID)
          TF_VAR_client_id: $(appId)
          TF_VAR_client_secret: $(password)
          TF_VAR_tenant_id: $(tenant)
          TF_VAR_subscription_id: $(AZURE_SUBSCRIPTION_ID)

      # Step 5: Terraform Apply (Set Working Directory and Pass All Variables)
      - script: |
          terraform apply -auto-approve tfplan
        displayName: "Terraform Apply"
        workingDirectory: $(System.DefaultWorkingDirectory)/terraform
        env:
          ARM_CLIENT_ID: $(appId)
          ARM_CLIENT_SECRET: $(password)
          ARM_TENANT_ID: $(tenant)
          ARM_SUBSCRIPTION_ID: $(AZURE_SUBSCRIPTION_ID)
          TF_VAR_client_id: $(appId)
          TF_VAR_client_secret: $(password)
          TF_VAR_tenant_id: $(tenant)
          TF_VAR_subscription_id: $(AZURE_SUBSCRIPTION_ID)
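The pipeline passes each credential twice, once as ARM_* and once as TF_VAR_*. The AzureRM provider reads the ARM_* variables from the environment on its own, so a slimmer provider block is possible. The sketch below is an alternative, not the setup this project uses, and it assumes the four ARM_* variables are set on every Terraform step, as they are in the pipeline.

```hcl
# aks-cluster.tf (sketch): let the provider pick up credentials from the environment
provider "azurerm" {
  features {}

  # No client_id / client_secret / tenant_id / subscription_id arguments here.
  # The provider falls back to ARM_CLIENT_ID, ARM_CLIENT_SECRET,
  # ARM_TENANT_ID and ARM_SUBSCRIPTION_ID when they are not set explicitly.
}
```

With this variant, the four credential variables in variables.tf and the TF_VAR_ entries in the pipeline would no longer be needed. The trade-off is that the credentials become implicit rather than visible in the configuration.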
Azure CLI Summary:
Here’s a list of Azure CLI commands I used today. Sometimes these were used for debugging, sanity checking, or they were just necessary for Terraform to work.
1. Login to Azure Using Service Principal:
Authenticates to Azure using Service Principal credentials (appId, password, and tenant).
az login --service-principal -u <appId> -p <password> --tenant <tenant>
2. Retrieve AKS Cluster Credentials:
Fetches the AKS cluster credentials and configures kubectl to use this cluster.
az aks get-credentials --resource-group <resource_group_name> --name <kubernetes_cluster_name>
3. Browse Kubernetes Dashboard:
Opens the Kubernetes dashboard for the specified AKS cluster in the Azure Portal.
az aks browse --resource-group <resource_group_name> --name <kubernetes_cluster_name>
4. Create Service Principal:
Creates an Azure Active Directory Service Principal for authentication in Terraform.
az ad sp create-for-rbac --skip-assignment
Takeaways
State Management Needs Improvement: The pipeline deploys the AKS cluster and pods successfully if the resource group does not exist. However, it fails when the resource group already exists, because state management is not set up yet and Terraform has no record of the resources it created in earlier runs. Fixing this requires configuring a remote state backend.
Streamlined Pipeline Steps: The final pipeline configuration effectively provisions the AKS cluster but needs fine-tuning for state management and validation.
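For reference, a remote state backend on Azure Storage could look roughly like the sketch below. Every name in it is a hypothetical placeholder; none of these resources exist in the project yet. The storage account and container would have to be created before terraform init runs.

```hcl
# backend.tf (sketch): keep Terraform state in an Azure Storage container
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"            # hypothetical: resource group holding the state storage
    storage_account_name = "kubeconductortfstate"  # hypothetical: must be globally unique
    container_name       = "tfstate"               # hypothetical blob container
    key                  = "aks.terraform.tfstate" # name of the state blob
  }
}
```

The azurerm backend can authenticate with the same ARM_* environment variables the pipeline already sets, so the existing Terraform Init step should not need extra credentials.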
Next Steps:
- Implement a remote state backend to centralize state management.
- Reintroduce verification and post-deployment steps (kubectl commands) in the pipeline.
- Optimize the CI/CD flow to handle both new and existing infrastructure.
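For the existing-infrastructure case, one option is an import block, which tells Terraform to adopt a resource that already exists instead of trying to create it. This is only a sketch. It assumes Terraform 1.5 or newer (import blocks are not available in the 0.14 floor set in versions.tf), and the subscription ID is a placeholder to replace with the real value.

```hcl
# imports.tf (sketch): adopt the existing resource group into state
import {
  to = azurerm_resource_group.default

  # Placeholder subscription ID; the resource group name matches aks-cluster.tf
  id = "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/terraform-aks-rg"
}
```

Once state is stored remotely, the import only needs to happen once; later runs find the resource group in state and leave it alone.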
Finally, I would like to thank my incredibly hard-working agent — SelfHostedUbuntu! Look at him just chugging along and handling errors like a champ :,-)

Bye for now!