LLM-powered incident analysis dashboard on Civo
Learn how to build an LLM-powered incident analysis dashboard on Civo using relaxAI, Kubernetes, and GPU acceleration for real-time inference.
Written by Mostafa Ibrahim
Software Engineer @ GoCardless
Incident response teams face escalating volumes of unstructured logs and tickets that demand immediate attention yet resist efficient analysis. Traditional approaches struggle to keep pace with modern infrastructure complexity, creating delays that extend mean time to resolution and impact system reliability.
AI-powered analysis through platforms such as relaxAI automatically summarizes and categorizes incidents, extracting actionable insights from raw log data. By leveraging Civo's GPU-powered Kubernetes infrastructure, which is hosted entirely within UK data centers, you can achieve faster root-cause identification while maintaining data sovereignty and keeping sensitive operational data within your controlled infrastructure.
This tutorial demonstrates how to build a functional incident analysis dashboard prototype, fully hosted on Civo Kubernetes with GPU acceleration for real-time inference, that extracts insights from logs without hand-written parsing rules or predefined error patterns.
Prerequisites
Before beginning implementation, ensure the following tools and access are available:
- Active Civo account with Kubernetes and GPU access
- Python 3.9 or later installed locally
- FastAPI framework knowledge (basic understanding sufficient)
- Docker installed for container building and deployment
- kubectl configured for Kubernetes cluster management
- relaxAI API key generated from the relaxAI dashboard
Project structure
```
incident-analysis-dashboard/
├── backend/
│   ├── main.py
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   └── index.html
└── kubernetes/
    ├── backend-deployment.yaml
    └── frontend-deployment.yaml
```
Step-by-step tutorial
Step 1: Set up your Civo GPU cluster
Create a Kubernetes cluster with GPU acceleration through the Civo dashboard:
- Log in to the Civo Dashboard
- Create a new Kubernetes cluster
- Add a GPU node pool (one node is sufficient for this demo)
- Download your kubeconfig
- Connect to the cluster by running `set KUBECONFIG=path-to-kubeconfig-file` in your terminal (use `export` instead of `set` on Linux or macOS)
Verify connectivity:
```shell
kubectl get nodes
kubectl get pods -A
```

Step 2: Build the backend (FastAPI + relaxAI)
Create backend/main.py implementing the incident analysis API:
```python
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel, Field
from typing import List, Dict, Any
import os
import json
import logging
import re
import httpx

# ---------------------------------------------------------
# Logging setup
# ---------------------------------------------------------
logger = logging.getLogger("incident-analysis-api")
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s - %(message)s",
)

# ---------------------------------------------------------
# RelaxAI client initialization
# ---------------------------------------------------------
RELAX_API_KEY = os.getenv("RELAX_API_KEY")
RELAX_MODEL = "Llama-4-Maverick-17B-128E"
RELAX_URL = "https://api.relax.ai/v1/chat/completions"

# ---------------------------------------------------------
# FastAPI app + CORS
# ---------------------------------------------------------
app = FastAPI(title="Incident Analysis API")

# CORS middleware so the browser frontend can talk to this API
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],   # for demo only; lock this down in real life
    allow_credentials=True,
    allow_methods=["*"],   # includes GET, POST, OPTIONS, etc.
    allow_headers=["*"],   # includes Content-Type
)

# ---------------------------------------------------------
# Pydantic models
# ---------------------------------------------------------
class LogEntry(BaseModel):
    timestamp: str = Field(..., description="ISO8601 timestamp of the log entry")
    severity: str = Field(..., description="Log level, e.g. ERROR, WARNING, CRITICAL")
    service: str = Field(..., description="Service or component name")
    message: str = Field(..., description="Raw log message")
    metadata: Dict[str, Any] = Field(default_factory=dict)


class AnalysisResult(BaseModel):
    root_cause: str
    category: str
    summary: str
    severity: str
    recommendations: List[str]


# ---------------------------------------------------------
# Helper: safely extract JSON from model output
# ---------------------------------------------------------
def parse_relax_json(raw: str) -> Dict[str, Any]:
    """Best-effort JSON extraction from model output.

    Handles:
    - Pure JSON
    - JSON wrapped in prose
    - JSON inside fenced code blocks

    Raises ValueError if nothing JSON-like can be parsed.
    """
    if not raw:
        raise ValueError("Empty response from model")

    raw = raw.strip()

    # Attempt direct JSON
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass

    # Handle code-fenced JSON
    if raw.startswith("```"):
        raw_stripped = re.sub(r"^```[a-zA-Z]*", "", raw)
        raw_stripped = raw_stripped.strip("` \n")
        try:
            return json.loads(raw_stripped)
        except json.JSONDecodeError:
            raw = raw_stripped  # fall back to generic extraction

    # Extract first JSON object found in mixed text
    start = raw.find("{")
    end = raw.rfind("}")
    if start != -1 and end != -1 and end > start:
        candidate = raw[start : end + 1]
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            logger.warning("Failed to parse JSON candidate from model output")

    # Nothing worked
    raise ValueError(f"Could not parse JSON from model output: {raw[:200]}...")


# ---------------------------------------------------------
# Manual RelaxAI call
# ---------------------------------------------------------
async def call_relax_ai(prompt: str) -> Dict[str, Any]:
    if not RELAX_API_KEY:
        raise HTTPException(status_code=500, detail="RELAX_API_KEY not configured.")

    headers = {
        "Authorization": f"Bearer {RELAX_API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": RELAX_MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.3,
        "max_tokens": 500,
    }

    logger.debug("Sending request to RelaxAI...")
    try:
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.post(RELAX_URL, json=payload, headers=headers)
            response.raise_for_status()
            data = response.json()
            logger.debug("Received RelaxAI response: %s", data)
            return data
    except Exception as e:
        logger.error("RelaxAI request failed: %s", e)
        raise HTTPException(status_code=500, detail="RelaxAI call failed.")


# ---------------------------------------------------------
# Main endpoint
# ---------------------------------------------------------
@app.post("/analyze/logs", response_model=List[AnalysisResult])
async def analyze_logs(logs: List[LogEntry]):
    if not logs:
        raise HTTPException(status_code=400, detail="No logs provided")

    logger.info("Received %d log entries for analysis", len(logs))
    results: List[AnalysisResult] = []

    for idx, log in enumerate(logs):
        logger.info("Analyzing log %d: %s", idx, log.service)

        prompt = f"""Analyze this system log entry:

Service: {log.service}
Severity: {log.severity}
Message: {log.message}
Metadata: {log.metadata}

Respond ONLY with valid JSON and follow this exact structure:
{{"root_cause": "...", "category": "...", "summary": "...", "recommendations": ["...", "...", "..."]}}

Do NOT include any text before or after the JSON object."""

        try:
            resp = await call_relax_ai(prompt)
            content = resp["choices"][0]["message"]["content"]
            parsed = parse_relax_json(content)
        except Exception as e:
            logger.error("Error analyzing: %s", e)
            parsed = {
                "root_cause": "Analysis failed",
                "category": "error",
                "summary": log.message[:200],
                "recommendations": ["Check logs manually"],
            }

        # Use .get() with defaults so a model response missing a key
        # does not crash the endpoint with a KeyError
        results.append(
            AnalysisResult(
                root_cause=parsed.get("root_cause", "Unknown"),
                category=parsed.get("category", "uncategorized"),
                summary=parsed.get("summary", log.message[:200]),
                severity=log.severity,
                recommendations=parsed.get("recommendations", []),
            )
        )

    logger.info("Returning %d results", len(results))
    return results


# ---------------------------------------------------------
# Health check
# ---------------------------------------------------------
@app.get("/health")
async def health_check():
    """Simple health endpoint for Kubernetes liveness / readiness probes."""
    return {"status": "healthy", "service": "incident-analysis-api"}
```
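The JSON-extraction helper is worth exercising in isolation, since it is what keeps the endpoint robust to the different shapes a model reply can take. Here is a trimmed, self-contained version of the same logic (no FastAPI needed), tried against the three response shapes the backend expects:

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Trimmed standalone version of parse_relax_json from main.py."""
    raw = raw.strip()
    try:
        return json.loads(raw)  # shape 1: pure JSON
    except json.JSONDecodeError:
        pass
    if raw.startswith("```"):   # shape 2: code-fenced JSON
        raw = re.sub(r"^```[a-zA-Z]*", "", raw).strip("` \n")
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            pass
    start, end = raw.find("{"), raw.rfind("}")  # shape 3: JSON buried in prose
    if start != -1 and end > start:
        return json.loads(raw[start : end + 1])
    raise ValueError("no JSON found in model output")

# One example of each shape:
assert extract_json('{"category": "network"}') == {"category": "network"}
assert extract_json('```json\n{"category": "network"}\n```') == {"category": "network"}
assert extract_json('Here is my analysis: {"category": "network"} Hope it helps!') == {"category": "network"}
print("all three response shapes parsed")
```

If all three assertions pass, the fallback path in `/analyze/logs` should only trigger for genuinely malformed replies.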
Create backend/requirements.txt:
```
fastapi==0.104.1
uvicorn[standard]==0.24.0
pydantic==2.5.0
relaxai==0.1.0
httpx==0.27.0
```
Step 3: Build a minimal dashboard
Create frontend/index.html with a clean, functional interface:
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Incident Analysis Dashboard</title>
<style>
  body { font-family: Arial, sans-serif; background: #f7f7f7; padding: 20px; }
  .container { max-width: 900px; margin: 0 auto; background: white; padding: 25px; border-radius: 10px; box-shadow: 0 2px 8px rgba(0,0,0,0.1); }
  h1 { margin-bottom: 10px; font-size: 32px; }
  button { padding: 8px 15px; border: 1px solid #333; background: #fff; cursor: pointer; border-radius: 4px; }
  #status { margin-left: 15px; font-size: 14px; }
  .stats { margin-top: 20px; display: flex; gap: 20px; }
  .stat-card { padding: 15px; background: #eef2ff; border-radius: 8px; width: 180px; }
  .stat-label { font-size: 14px; color: #555; }
  .stat-value { font-size: 24px; font-weight: bold; margin-top: 5px; }
  .incidents { margin-top: 30px; }
  .incident { background: #f9fafb; border-left: 4px solid #ccc; padding: 15px; margin-bottom: 20px; border-radius: 6px; }
  .incident-header { display: flex; justify-content: space-between; font-size: 16px; margin-bottom: 8px; }
  .severity-badge { padding: 3px 6px; border-radius: 4px; font-size: 12px; font-weight: bold; color: white; }
  .severity-error { background: #dc2626; }
  .severity-critical { background: #7f1d1d; }
  .severity-warning { background: #d97706; }
  .rec-item { margin-left: 10px; }
</style>
</head>
<body>
<div class="container">
  <h1>🔍 Incident Analysis Dashboard</h1>
  <div class="controls">
    <div style="margin-top:20px;">
      <label><strong>Paste Logs (JSON Array):</strong></label><br>
      <textarea id="logInput" style="width:100%; height:150px; margin-top:5px; font-family:monospace;">[
  {
    "timestamp": "2025-01-15T14:23:45Z",
    "severity": "ERROR",
    "service": "api-gateway",
    "message": "Connection timeout to database service after 30000ms",
    "metadata": { "request_id": "req-789012" }
  }
]</textarea>
    </div>
    <button onclick="analyzeIncidents()">Analyze New Incidents</button>
    <span id="status"></span>
  </div>
  <div class="stats" id="stats"></div>
  <div class="incidents" id="incidents"></div>
</div>
<script>
  // IMPORTANT: Change this before building the Docker image
  const API_URL = "http://<BACKEND-EXTERNAL-IP>:8000";

  async function analyzeIncidents() {
    const statusEl = document.getElementById("status");
    statusEl.textContent = "Analyzing...";

    let logs;
    try {
      logs = JSON.parse(document.getElementById("logInput").value);
    } catch (e) {
      statusEl.textContent = "Invalid JSON: " + e.message;
      return;
    }

    try {
      const response = await fetch(`${API_URL}/analyze/logs`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(logs)
      });
      const results = await response.json();
      displayResults(results);
      statusEl.textContent = `Analyzed ${results.length} incidents`;
    } catch (error) {
      statusEl.textContent = "Analysis failed: " + error.message;
      console.error(error);
    }
  }

  function displayResults(results) {
    const categories = {};
    results.forEach(r => categories[r.category] = (categories[r.category] || 0) + 1);
    // Sort by count so "Top Category" is actually the most frequent one
    const topCategory = Object.entries(categories)
      .sort((a, b) => b[1] - a[1])
      .map(([name]) => name)[0] || "None";

    // Stats section
    document.getElementById("stats").innerHTML = `
      <div class="stat-card"><div class="stat-label">Total Incidents</div><div class="stat-value">${results.length}</div></div>
      <div class="stat-card"><div class="stat-label">Critical</div><div class="stat-value" style="color:#dc2626">${results.filter(r => r.severity === "CRITICAL").length}</div></div>
      <div class="stat-card"><div class="stat-label">Top Category</div><div class="stat-value" style="font-size:20px; color:#3b82f6">${topCategory}</div></div>`;

    // Incident cards
    document.getElementById("incidents").innerHTML = results.map(incident => `
      <div class="incident">
        <div class="incident-header">
          <strong>${incident.category.toUpperCase()}</strong>
          <span class="severity-badge severity-${incident.severity.toLowerCase()}">${incident.severity}</span>
        </div>
        <div><strong>Root Cause:</strong> ${incident.root_cause}</div>
        <div style="margin:10px 0; color:#6b7280;">${incident.summary}</div>
        <div class="recommendations">
          <strong>Recommendations:</strong>
          ${incident.recommendations.map(r => `<div class="rec-item">• ${r}</div>`).join("")}
        </div>
      </div>`).join("");
  }

  window.onload = () => analyzeIncidents();
</script>
</body>
</html>
```
Here’s how the dashboard should look:

Source: Image by author
Important note: the value `const API_URL = "http://<BACKEND-EXTERNAL-IP>:8000";` is baked into the Docker image at build time. Deploy the backend first, wait for its LoadBalancer IP to become available, and then build the frontend image using that IP.
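Once the backend Service has an external IP, you can substitute it into index.html before building the frontend image. A sketch of that step (the IP below is a placeholder for illustration; in practice take it from the kubectl command shown in the comment):

```shell
# In practice, read the IP from the backend Service created in Step 4:
#   BACKEND_IP=$(kubectl get svc incident-backend-service \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
BACKEND_IP="203.0.113.10"  # placeholder value for illustration

# Then replace the placeholder in index.html (GNU sed; on macOS use `sed -i ''`):
#   sed -i "s|<BACKEND-EXTERNAL-IP>|${BACKEND_IP}|" index.html
# The substitution itself works like this:
echo 'const API_URL = "http://<BACKEND-EXTERNAL-IP>:8000";' \
  | sed "s|<BACKEND-EXTERNAL-IP>|${BACKEND_IP}|"
```

Using `|` as the sed delimiter avoids having to escape the slashes in the URL.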
Step 4: Deploy on Civo Kubernetes
Containerize the application with Docker:
Create backend/Dockerfile:
```dockerfile
FROM python:3.11-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY main.py .

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Build and push images:
```shell
# Build backend
cd backend
docker build -t your-registry/incident-backend:latest .
docker push your-registry/incident-backend:latest

# For frontend, use nginx
cd ../frontend
docker build -t your-registry/incident-frontend:latest -f - . <<EOF
FROM nginx:alpine
COPY index.html /usr/share/nginx/html/
EOF
docker push your-registry/incident-frontend:latest
```
Create a Kubernetes secret with your credentials:
```shell
kubectl create secret generic relaxai-secret \
  --from-literal=RELAX_API_KEY='your_relaxai_api_key_here'
```
Create the backend Kubernetes Deployment and Service in kubernetes/backend-deployment.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: incident-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: incident-backend
  template:
    metadata:
      labels:
        app: incident-backend
    spec:
      containers:
        - name: backend
          image: your-registry/incident-backend:latest
          ports:
            - containerPort: 8000
          envFrom:
            - secretRef:
                name: relaxai-secret
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: incident-backend-service
spec:
  selector:
    app: incident-backend
  ports:
    - port: 8000
      targetPort: 8000
  type: LoadBalancer
```
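Since the backend exposes a `/health` endpoint, you can optionally wire it into liveness and readiness probes so Kubernetes restarts or stops routing to an unhealthy pod. A sketch of the extra fields to add under the backend container spec (the probe timings here are illustrative, not tuned values):

```yaml
# Add under the backend container in backend-deployment.yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
```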
Create the frontend Kubernetes Deployment and Service in kubernetes/frontend-deployment.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: incident-frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: incident-frontend
  template:
    metadata:
      labels:
        app: incident-frontend
    spec:
      containers:
        - name: frontend
          image: your-registry/incident-frontend:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: incident-frontend-service
spec:
  selector:
    app: incident-frontend
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
```
Deploy to the cluster:
```shell
kubectl apply -f backend-deployment.yaml
kubectl apply -f frontend-deployment.yaml

# Get service URLs
kubectl get services
```
Step 5: Test and visualize
Access the dashboard by opening http://<FRONTEND-EXTERNAL-IP> in your browser.
Send sample logs:
```json
[
  {
    "timestamp": "2025-01-15T14:23:45Z",
    "severity": "ERROR",
    "service": "api-gateway",
    "message": "Connection timeout to database service after 30000ms. Retry attempts exhausted. Last error: Connection refused on port 5432.",
    "metadata": {
      "request_id": "req-789012",
      "user_id": "user-45678"
    }
  },
  {
    "timestamp": "2025-01-15T14:24:12Z",
    "severity": "CRITICAL",
    "service": "payment-processor",
    "message": "Transaction processing failed with exception: java.lang.OutOfMemoryError: Java heap space. Current heap usage: 95%. GC overhead limit exceeded.",
    "metadata": {
      "transaction_id": "txn-234567",
      "amount": 1250.00
    }
  },
  {
    "timestamp": "2025-01-15T14:25:33Z",
    "severity": "WARNING",
    "service": "auth-service",
    "message": "High latency detected for authentication requests. Average response time: 2.5s (threshold: 500ms). Possible causes: database connection pool exhaustion.",
    "metadata": {
      "endpoint": "/api/v1/auth/login",
      "requests_per_minute": 450
    }
  }
]
```
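If you prefer to drive the API from a script instead of the dashboard, the same payload can be checked and POSTed with a few lines of Python. This sketch validates the fields the `LogEntry` model requires using only the standard library; the actual `httpx.post` call is shown in a comment since it needs a running backend and a real IP:

```python
import json

# A one-entry sample payload matching the LogEntry model fields
sample_logs = json.loads("""[
  {"timestamp": "2025-01-15T14:23:45Z", "severity": "ERROR",
   "service": "api-gateway",
   "message": "Connection timeout to database service after 30000ms",
   "metadata": {"request_id": "req-789012"}}
]""")

# Quick structural check before sending to POST /analyze/logs
required = {"timestamp", "severity", "service", "message"}
for entry in sample_logs:
    missing = required - entry.keys()
    assert not missing, f"entry missing fields: {missing}"

# To send for analysis (requires a running backend; placeholder address):
#   import httpx
#   r = httpx.post("http://<BACKEND-EXTERNAL-IP>:8000/analyze/logs", json=sample_logs)
#   print(r.json())
print(f"{len(sample_logs)} log entries validated")
```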
Here’s what you should see when analyzing the sample logs:

Source: Image by author
The dashboard displays:
- Total incident count and severity distribution
- Top error categories extracted by relaxAI
- Individual incident cards with root causes
- AI-generated recommendations for each issue
- Real-time updates when analyzing new logs
Summary
This project shows how to achieve practical AI-powered observability using sovereign infrastructure, keeping operational data fully under your control while still benefiting from advanced language-model intelligence. With relaxAI running on Civo GPUs, you can summarize and classify system logs locally, avoiding external services while using the Llama 4 Maverick 17B-128E model to extract meaningful insights from unstructured messages.
By running the entire stack (GPU compute, Kubernetes, and relaxAI inference) within a single provider, you gain real-time observability without external dependencies and ensure data sovereignty. The approach also opens paths for extension, including connecting to production monitoring pipelines, integrating with alerting tools like PagerDuty or Slack, adding historical trend analysis, and automating remediation based on AI-driven recommendations.
If you want to learn more about some of the topics we explored in this tutorial, check out some of these resources:

Mostafa Ibrahim is a software engineer and technical writer specializing in developer-focused content for SaaS and AI platforms. He currently works as a Software Engineer at GoCardless, contributing to production systems and scalable payment infrastructure.
Alongside his engineering work, Mostafa has written more than 200 technical articles reaching over 500,000 readers. His content covers topics including Kubernetes deployments, AI infrastructure, authentication systems, and retrieval-augmented generation (RAG) architectures.