My experiment with AI in DevOps
You have heard how the latest advances in AI can help with software development. A natural question is whether AI could be integrated with DevOps to revolutionize the way infrastructure is managed and deployed.
So I did a simple experiment.
Can AI help automate some of my daily tasks?
I use Terraform as my main IaC (Infrastructure as Code) tool. At the very least, if I feed a model detailed instructions, can it write my IaC code flawlessly?
The LLM models I used
I did some research and found 3 cutting-edge language models suitable for Terraform coding. I was on a budget, so I picked the free tier of each:
- Kagi FastGPT: a nimble research tool that gives superb results, as it performs RAG (retrieval-augmented generation) against Kagi's search index.
- Phind: also does RAG, and its underlying model was fine-tuned on code, so it is marketed as a coding assistant.
- OpenAI ChatGPT: you know what it is and what it can do.
The task to do
I used the 3 models to research, and later write, Terraform and Python scripts to set up an AWS Lambda function that opens and parses any new csv file in an existing S3 bucket and appends the rows to a Redshift table.
The Lambda function is to be triggered by an EventBridge event which is emitted when a new file is added to the S3 bucket.
This is a simple but realistic task, and to limit the scope, I did not instruct the models to generate the deployment script. I took care of the deployment myself, manually packaging the Python code and uploading it to another S3 bucket to be picked up by the Lambda function.
But I expected the models to generate proper IAM roles and policies, so that the Lambda function would have the access it needs to the various AWS services. A DevOps engineer would find this part the most time-consuming.
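For reference, the IAM wiring I had in mind looks roughly like this. This is my own minimal sketch, not model output; the role and resource names are placeholders, though the managed policy ARNs are real:

resource "aws_iam_role" "csv_parser_lambda_role" {
  name = "csv-parser-lambda-role" # placeholder name

  # Let the Lambda service assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

# Managed policies: CloudWatch Logs, read-only S3, and Redshift Data API access
resource "aws_iam_role_policy_attachment" "lambda_policies" {
  for_each = toset([
    "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole",
    "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
    "arn:aws:iam::aws:policy/AmazonRedshiftDataFullAccess",
  ])
  role       = aws_iam_role.csv_parser_lambda_role.name
  policy_arn = each.value
}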
The research phase
I tried to imagine myself as a junior or intermediate DevOps engineer using AI (it was harder than I thought). I asked the basic questions first, then the detailed ones. For example:
- what are the common ways where AWS lambda functions are used in data engineering?
- can AWS EventBridge do what AWS Cloudwatch Events do?
- how to deploy a lambda function that requires external python libraries as project dependencies?
- what are the managed policies available for lambda function to write to Amazon Cloudwatch Logs?
To my pleasant surprise, the models got most of the answers and facts correct. When one missed something, the others usually filled in the gap.
The RAG models, Kagi FastGPT and Phind, did slightly better than OpenAI ChatGPT 3.5. I expected this: it stands to reason that answers improve when the model is fed the relevant technical documentation.
One downside was that the models did not have a strong opinion on which approach was optimal or best practice. Some answers gave command-line or console instructions, but to be fair, I was doing research and had not prompted the models for instructions conforming to IaC.
The coding phase
After studying the answers from the research phase, I formulated an approach and created the necessary prompts. I then started asking the models to generate code.
write a Terraform script to achieve the following:
- set up an EventBridge event and rule to track changes in an existing S3 bucket: adding of new csv file
- when a new csv file is added in the bucket, trigger a lambda function to run
- the lambda function should parse the file and append the rows in the csv file to a Redshift table, using Python runtime.
- the lambda function will be uploaded to another existing S3 bucket to be picked up by AWS
- create a proper role with proper managed policies so that the lambda function has access to read from S3, and write to Redshift.
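For context, the EventBridge wiring this prompt asks for would look roughly like the following. Again, this is my own sketch, not model output; the bucket name and resource names are placeholders:

# The existing bucket must first be configured to publish events to EventBridge
resource "aws_s3_bucket_notification" "eventbridge" {
  bucket      = "my-existing-bucket" # placeholder
  eventbridge = true
}

# Match "Object Created" events for csv files in that bucket
resource "aws_cloudwatch_event_rule" "csv_uploaded" {
  name = "csv-uploaded-rule"
  event_pattern = jsonencode({
    source        = ["aws.s3"]
    "detail-type" = ["Object Created"]
    detail = {
      bucket = { name = ["my-existing-bucket"] }
      object = { key = [{ suffix = ".csv" }] }
    }
  })
}

# Route matching events to the Lambda function
resource "aws_cloudwatch_event_target" "invoke_lambda" {
  rule = aws_cloudwatch_event_rule.csv_uploaded.name
  arn  = aws_lambda_function.csv_parser_lambda.arn
}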
write some python code to achieve the following:
- make a connection to an Amazon Redshift Serverless database. assuming the script will be run with appropriate IAM role to make the connection.
- given a csv DictReader object, read all dictionary objects from it.
- write all dictionary objects into a database table, appending to it.
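For the connection part, one common approach from Lambda is the Redshift Data API, which avoids managing a database connection altogether; permissions come from the function's IAM role. A minimal sketch, with the workgroup, database, table, and column names as placeholders:

import boto3

# The Redshift Data API runs SQL over HTTPS; no persistent connection needed
client = boto3.client("redshift-data")

def append_rows(rows):
    """Append dicts from a csv.DictReader to a Redshift table."""
    for row in rows:
        client.execute_statement(
            WorkgroupName="my-serverless-workgroup",  # placeholder
            Database="dev",                           # placeholder
            Sql="INSERT INTO my_table (col1, col2) VALUES (:col1, :col2)",
            Parameters=[
                {"name": "col1", "value": row["col1"]},
                {"name": "col2", "value": row["col2"]},
            ],
        )

Note that this mirrors the prompt literally with per-row INSERTs; as I discuss below, a bulk COPY from S3 would be more efficient.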
I asked the models to generate a single script instead of a whole project (I did not think that was possible anyway); I already had a project boilerplate I could use.
The challenges of AI
While AI demonstrated speed and versatility in providing code snippets, it struggled to deliver optimal solutions. Errors and inaccuracies often crept into the generated code.
In some cases, lines of code were complete fabrications, or "hallucinations": instances where the AI produces made-up results and presents them as if they were true. You need considerable DevOps knowledge to spot such fabrications.
A model gave an incorrect aws_lambda_permission resource block in its generated code:
resource "aws_lambda_permission" "s3_event_permission" {
statement_id = "AllowExecutionFromS3Event"
action = "lambda:InvokeFunction"
function_name = aws_lambda_function.csv_parser_lambda.function_name
principal = "s3.amazonaws.com"
source_arn = aws_s3_bucket.existing_bucket.arn
}
The principal should be events.amazonaws.com.
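A corrected version would look like this, with the source_arn also pointing at the EventBridge rule rather than the bucket, since EventBridge, not S3, invokes the function (the rule name follows my earlier sketch):

resource "aws_lambda_permission" "eventbridge_permission" {
  statement_id  = "AllowExecutionFromEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.csv_parser_lambda.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.csv_uploaded.arn
}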
Another glaring weakness of AI is its failure to leverage reusable third-party libraries and modules, which often makes the generated code very verbose.
In my experiment, the generated Terraform script consisted of many individual resource blocks; the models failed to make use of any official AWS Terraform modules. Additional prompts and follow-up questions could not push them toward such modules; it seemed the models had not been trained on data that included them.
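For comparison, the widely used community module terraform-aws-modules/lambda/aws can collapse the function, role, and packaging boilerplate into a single block. A sketch, assuming a pre-built zip in a placeholder artifact bucket:

module "csv_parser_lambda" {
  source = "terraform-aws-modules/lambda/aws"

  function_name = "csv-parser"
  handler       = "main.handler"
  runtime       = "python3.11"

  # Deploy a pre-built package from S3 instead of building locally
  create_package = false
  s3_existing_package = {
    bucket = "my-artifact-bucket" # placeholder
    key    = "lambda/csv_parser.zip"
  }
}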
In the generated Python script, the AI opted for the INSERT SQL statement instead of the more efficient COPY:
with open(download_path, 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    next(csv_reader)  # Skip header
    for row in csv_reader:
        # Assuming your Redshift table has columns col1, col2, col3
        cursor.execute(
            "INSERT INTO your_redshift_table (col1, col2, col3) VALUES (%s, %s, %s)",
            (row[0], row[1], row[2])
        )
redshift_conn.commit()
cursor.close()
redshift_conn.close()
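A COPY-based version loads the file from S3 in one bulk operation instead of one round trip per row. A sketch, with the table name, S3 path, and IAM role ARN as placeholders:

cursor.execute(
    """
    COPY your_redshift_table
    FROM 's3://my-existing-bucket/incoming/new_file.csv'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
    CSV
    IGNOREHEADER 1
    """
)
redshift_conn.commit()

With COPY, Redshift reads the csv directly from S3, so the Lambda does not even need to download and parse the file.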
DevOps principles emphasize modularity and efficiency; unfortunately, the AI failed to adequately grasp these aspects.
Conclusion
As the experiment unfolded, it became evident that while AI holds promise in augmenting DevOps workflows, it is not yet mature enough to replace human expertise entirely. A DevOps expert brings not just technical proficiency but also contextual understanding and oversight, qualities that AI struggles to replicate.
Despite its limitations, AI proved its ability to generate ideas swiftly and in abundance. This presents an opportunity for human experts to leverage AI as a tool for ideation and exploration, allowing them to sift through generated concepts and decide on their feasibility.