June 13, 2025

Troubleshooting Kubernetes with AI: Using the K8s MCP server

Anton Weiss

In my previous post I showed how to build Kaia (the Kubernetes AI Agent) using AutoGen 0.4 and Ollama. That took the naive and simple approach of accessing kubectl via Python's subprocess module and AutoGen's FunctionTool. But nowadays the consensus seems to be that agent interactions should be standardized, and the leading standard for that right now is MCP (Model Context Protocol). (For more details on Model Context Protocol, check out our recent webinar with Patrick Debois.)

MCP in a nutshell

Model Context Protocol is a standard developed by Anthropic for connecting AI models with data sources. In a typical MCP architecture the AI assistant or agent is the client consuming the services of an MCP server. The MCP server provides any or all of the following features:

  • Resources: Context and data, for the user or the AI model to use
  • Prompts: Templated messages and workflows for users
  • Tools: Functions for the AI model to execute
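
To make these three primitives concrete, here is a minimal sketch of an MCP server built with the official MCP Python SDK's FastMCP helper. The server name and the example functions are mine, purely for illustration:

# Minimal MCP server sketch using the official Python SDK (pip install mcp).
# The server name and the example functions are illustrative placeholders.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.resource("config://app")
def get_config() -> str:
    """A Resource: static context the client can read."""
    return "app config goes here"

@mcp.prompt()
def review_logs(pod: str) -> str:
    """A Prompt: a reusable message template for users."""
    return f"Please analyze the recent logs of pod {pod}."

@mcp.tool()
def add(a: int, b: int) -> int:
    """A Tool: a function the model can call."""
    return a + b

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default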

The following diagram shows how an MCP server provides AI agents with capabilities in a typical agentic architecture:

sequenceDiagram
    participant User
    participant Client as Client App
    participant MCP as MCP Server
    participant Tool as External Tool/API
    User->>Client: Ask question about data
    Client->>MCP: Request available resources
    MCP-->>Client: List resources & capabilities
    Client->>MCP: Query specific resource
    MCP->>Tool: Fetch data
    Tool-->>MCP: Return data
    MCP-->>Client: Formatted response
    Client-->>User: Present answer
    Note over Client,MCP: All communication uses<br/>MCP Protocol (JSON-RPC)

Besides just relaying the LLM-initiated calls to the external tool, the protocol also allows an MCP server to preserve some context and enrich the original prompts.
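
Under the hood this is plain JSON-RPC 2.0. A tool invocation round-trip looks roughly like this (the command and the response payload here are illustrative):

{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "execute_kubectl",
        "arguments": {
            "command": "kubectl get pods -n default"
        }
    }
}

And the server answers with a result the client hands back to the model:

{
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {
                "type": "text",
                "text": "NAME                    READY   STATUS    ..."
            }
        ]
    }
}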

MCP for Kubernetes

Since its release, the MCP protocol has seen significant uptake from the community. All major cloud service providers have already published MCP server implementations for their public APIs, and open-source enthusiasts were quick to implement MCP for their favorite tools. And of course Kubernetes was no exception. One of the most complete MCP server implementations for K8s comes from Alexey Ledenev, a senior cloud architect at DoiT International. Besides supporting kubectl, it also wraps such important modern kube-related utilities as helm, argocd and istioctl.

The official repo shows a couple of examples of integrating the MCP server with Claude Desktop for troubleshooting and application deployment. But what's the fun in using AI if it's not agentic, right? So I decided to see how this integrates with one of the popular agentic frameworks. Initially I thought of going with AutoGen 0.4 (as I did in my previous experiment), but I quickly found out that MCP support in AutoGen looks bolted on as an afterthought via autogen_ext.tools.mcp and feels clunky at best (for some reason, the MCP server container started by AutoGen would exit before the agent had a chance to talk to it).

Therefore I decided to take another agentic framework for a test drive: PydanticAI, which comes from the same team that built Pydantic, the data validation library that FastAPI is based on. According to the official website: "PydanticAI is a Python agent framework designed to make it less painful to build production grade applications with Generative AI." I instantly liked the "less painful" part. And in fact, integrating the MCP server with my agent's code was as easy as:

  1. Starting the MCP server in a container (note how I'm mounting my .kube/config into the container so it gets access to whatever cluster I have in my current context):
import os

from pydantic_ai.mcp import MCPServerStdio

server = MCPServerStdio(
    'docker',
    args=[
        'run',
        '--network=host',
        '-i',
        '-v',
        f'{os.path.expanduser("~")}/.kube:/home/appuser/.kube:ro',  # mount .kube/config read-only
        'ghcr.io/alexei-led/k8s-mcp-server:latest'
    ]
)
  2. Providing it to the agent in the mcp_servers list:
from pydantic_ai import Agent

agent = Agent(
    model,
    mcp_servers=[server],
    system_prompt=system_prompt,
    retries=5
)
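
With that in place, running the agent is just a matter of keeping the MCP server alive for the duration of the conversation. A minimal sketch, assuming a recent PydanticAI release (the question string is just an example; the real loop lives in the repo):

# Minimal sketch of driving the agent - assumes the `server`, `model` and
# `system_prompt` definitions from above; the question is just an example.
import asyncio

async def main():
    # run_mcp_servers() starts the docker container and stops it on exit
    async with agent.run_mcp_servers():
        result = await agent.run('Are there any failing pods in the default namespace?')
        print(result.output)

asyncio.run(main())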

An attentive reader will notice that agent initialization requires two more parameters: the model and the system prompt.

Multiple model providers

In my previous experiment my goal was to run the whole setup self-hosted and open-source. But the results from the models I tried with Ollama were quite discouraging - it looks like the smaller local models aren't very good at using MCP. So this time I decided to support multiple model providers, with Ollama still being the default.

I added support for Gemini and GitHub Models, as both provide some free requests. In my tests, openai/gpt-4o from GitHub Models delivered the best results, and it also has the added benefit of zero setup - all one needs is a GitHub PAT (Personal Access Token) to get started.

import os

from pydantic_ai.models.gemini import GeminiModel
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

def create_model(model_provider: str):
    """Create model based on provider choice."""
    if model_provider == 'ollama':
        # Ollama exposes an OpenAI-compatible API on port 11434
        return OpenAIModel(
            model_name=os.environ.get('OLLAMA_MODEL_NAME', 'llama2'),
            provider=OpenAIProvider(base_url='http://localhost:11434/v1')
        )
    elif model_provider == 'gemini':
        # Google Gemini has its own dedicated model interface
        return GeminiModel(
            model_name=os.environ.get('GEMINI_MODEL_NAME', 'gemini-2.0-flash')
        )
    elif model_provider == 'github':
        # GitHub Models also speak the OpenAI-compatible API
        github_token = os.environ.get('GITHUB_TOKEN')
        if github_token:
            os.environ['OPENAI_API_KEY'] = github_token
        return OpenAIModel(
            model_name=os.environ.get('GITHUB_MODEL_NAME', 'openai/gpt-4o'),
            provider=OpenAIProvider(base_url='https://models.github.ai/inference')
        )
    else:
        raise ValueError(f"Unsupported provider: {model_provider}. Choose from: ollama, gemini, github")

Note that while there's a separate GeminiModel interface for Google Gemini, access to Ollama and GitHub Models is still provided via the OpenAIModel interface, since both expose OpenAI-compatible APIs. I'm guessing this will change in follow-up PydanticAI releases if somebody cares enough to contribute custom interfaces for them.
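
For completeness, here's how the pieces could be wired together. The MODEL_PROVIDER variable name is my own placeholder; the actual repo may use a different selection mechanism:

# Hypothetical wiring: MODEL_PROVIDER is a placeholder env var name,
# not necessarily what the actual repo uses.
import os

model = create_model(os.environ.get('MODEL_PROVIDER', 'ollama'))
# ...then pass `model` to Agent(...) as shown earlier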

The System Prompt

I started out with a prompt I crafted myself. It wasn’t working very well, so I asked Claude to improve it for me. 

My original version was: 

You are a Kubernetes operations assistant. You have access to a Kubernetes cluster through an MCP server.

To interact with the cluster, you must use the tools/call method with the following format:

For kubectl commands:

{
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
        "name": "execute_kubectl",
        "arguments": {
            "command": "<your-kubectl-command>"
        }
    }
}

Always use the MCP server commands to execute tasks. Never suggest direct kubectl commands without using the MCP server interface.

Before executing commands, make sure to validate them and consider their impact.

Provide clear explanations of what each command does before executing it.

Never chain commands together in a single request.

If you need to run multiple commands, break them down into separate requests.

If any command fails - retry taking into account the error message.

The prompt Claude came up with is so long that I decided to omit it here. You can see it in the code. Does it work better? I think so, though I haven't really benchmarked the results for accuracy or speed.

Seeing the new Kaia in action

I still can't decide which is more fun - building agents or running them.

Anyway - to see the new Kaia in action, watch this video. Then clone the repo, try running it yourself, and let me know how it works for you.

Happy hacking!
