LLM-powered incident analysis dashboard on Civo
Learn how to build an LLM-powered incident analysis dashboard on Civo using relaxAI, Kubernetes, and GPU acceleration for real-time inference.
Written by Mostafa Ibrahim
Software Engineer @ GoCardless
Incident response teams face escalating volumes of unstructured logs and tickets that demand immediate attention yet resist efficient analysis. Traditional approaches struggle to keep pace with modern infrastructure complexity, creating delays that extend mean time to resolution and impact system reliability.
AI-powered analysis through platforms such as relaxAI automatically summarizes and categorizes incidents, extracting actionable insights from raw log data. By leveraging Civo's GPU-powered Kubernetes infrastructure, which is hosted entirely within UK data centers, you can achieve faster root-cause identification while maintaining data sovereignty and keeping sensitive operational data within your controlled infrastructure.
This tutorial demonstrates how to build a functional incident analysis dashboard prototype, fully hosted on Civo Kubernetes with GPU acceleration for real-time inference, that extracts insights from logs without hand-written parsing rules or predefined error patterns.
Prerequisites
Before beginning implementation, ensure the following tools and access are available:
- Active Civo account with Kubernetes and GPU access
- Python 3.9 or later installed locally
- FastAPI framework knowledge (basic understanding sufficient)
- Docker installed for container building and deployment
- kubectl configured for Kubernetes cluster management
- relaxAI API key generated from the relaxAI dashboard
Project structure
```
incident-analysis-dashboard/
├── backend/
│   ├── main.py
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   └── index.html
└── kubernetes/
    ├── backend-deployment.yaml
    └── frontend-deployment.yaml
```
Step-by-step tutorial
Step 1: Set up your Civo GPU cluster
Create a Kubernetes cluster with GPU acceleration through the Civo dashboard:
- Log in to the Civo Dashboard
- Create a new Kubernetes cluster
- Add a GPU node pool (one node is sufficient for this demo)
- Download your kubeconfig
- Connect to the cluster by running `set KUBECONFIG=path-to-kubeconfig-file` in your terminal (use `export` instead of `set` on Linux or macOS)
Verify connectivity:
```shell
kubectl get nodes
kubectl get pods -A
```

Step 2: Build the backend (FastAPI + relaxAI)
Create backend/main.py implementing the incident analysis API:
```python
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel, Field
from typing import List, Dict, Any
import os
import json
import logging
import re
import httpx

# ---------------------------------------------------------
# Logging setup
# ---------------------------------------------------------
logger = logging.getLogger("incident-analysis-api")
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s - %(message)s",
)

# ---------------------------------------------------------
# RelaxAI client initialization
# ---------------------------------------------------------
RELAX_API_KEY = os.getenv("RELAX_API_KEY")
RELAX_MODEL = "Llama-4-Maverick-17B-128E"
RELAX_URL = "https://api.relax.ai/v1/chat/completions"

# ---------------------------------------------------------
# FastAPI app + CORS
# ---------------------------------------------------------
app = FastAPI(title="Incident Analysis API")

# CORS middleware so the browser frontend can talk to this API
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],   # for demo only; lock this down in real life
    allow_credentials=True,
    allow_methods=["*"],   # includes GET, POST, OPTIONS, etc.
    allow_headers=["*"],   # includes Content-Type
)

# ---------------------------------------------------------
# Pydantic models
# ---------------------------------------------------------
class LogEntry(BaseModel):
    timestamp: str = Field(..., description="ISO8601 timestamp of the log entry")
    severity: str = Field(..., description="Log level, e.g. ERROR, WARNING, CRITICAL")
    service: str = Field(..., description="Service or component name")
    message: str = Field(..., description="Raw log message")
    metadata: Dict[str, Any] = Field(default_factory=dict)


class AnalysisResult(BaseModel):
    root_cause: str
    category: str
    summary: str
    severity: str
    recommendations: List[str]


# ---------------------------------------------------------
# Helper: safely extract JSON from model output
# ---------------------------------------------------------
def parse_relax_json(raw: str) -> Dict[str, Any]:
    """Best-effort JSON extraction from model output.

    Handles:
    - Pure JSON
    - JSON wrapped in prose
    - JSON inside fenced code blocks

    Raises ValueError if nothing JSON-like can be parsed.
    """
    if not raw:
        raise ValueError("Empty response from model")

    raw = raw.strip()

    # Attempt direct JSON
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass

    # Handle code-fenced JSON
    if raw.startswith("```"):
        raw_stripped = re.sub(r"^```[a-zA-Z]*", "", raw)
        raw_stripped = raw_stripped.strip("` \n")
        try:
            return json.loads(raw_stripped)
        except json.JSONDecodeError:
            raw = raw_stripped  # fall back to generic extraction

    # Extract first JSON object found in mixed text
    start = raw.find("{")
    end = raw.rfind("}")
    if start != -1 and end != -1 and end > start:
        candidate = raw[start : end + 1]
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            logger.warning("Failed to parse JSON candidate from model output")

    # Nothing worked
    raise ValueError(f"Could not parse JSON from model output: {raw[:200]}...")


# ---------------------------------------------------------
# Manual RelaxAI call
# ---------------------------------------------------------
async def call_relax_ai(prompt: str) -> Dict[str, Any]:
    if not RELAX_API_KEY:
        raise HTTPException(status_code=500, detail="RELAX_API_KEY not configured.")

    headers = {
        "Authorization": f"Bearer {RELAX_API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": RELAX_MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.3,
        "max_tokens": 500,
    }

    logger.debug("Sending request to RelaxAI...")
    try:
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.post(RELAX_URL, json=payload, headers=headers)
            response.raise_for_status()
            data = response.json()
            logger.debug("Received RelaxAI response: %s", data)
            return data
    except Exception as e:
        logger.error("RelaxAI request failed: %s", e)
        raise HTTPException(status_code=500, detail="RelaxAI call failed.")


# ---------------------------------------------------------
# Main endpoint
# ---------------------------------------------------------
@app.post("/analyze/logs", response_model=List[AnalysisResult])
async def analyze_logs(logs: List[LogEntry]):
    if not logs:
        raise HTTPException(status_code=400, detail="No logs provided")

    logger.info("Received %d log entries for analysis", len(logs))
    results: List[AnalysisResult] = []

    for idx, log in enumerate(logs):
        logger.info("Analyzing log %d: %s", idx, log.service)

        prompt = f"""Analyze this system log entry:

Service: {log.service}
Severity: {log.severity}
Message: {log.message}
Metadata: {log.metadata}

Respond ONLY with valid JSON and follow this exact structure:
{{"root_cause": "...", "category": "...", "summary": "...", "recommendations": ["...", "...", "..."]}}

Do NOT include any text before or after the JSON object."""

        try:
            resp = await call_relax_ai(prompt)
            content = resp["choices"][0]["message"]["content"]
            parsed = parse_relax_json(content)
        except Exception as e:
            logger.error("Error analyzing: %s", e)
            parsed = {
                "root_cause": "Analysis failed",
                "category": "error",
                "summary": log.message[:200],
                "recommendations": ["Check logs manually"],
            }

        # Use .get() with defaults so a model response missing a key
        # does not crash the endpoint with a KeyError
        results.append(
            AnalysisResult(
                root_cause=parsed.get("root_cause", "Unknown"),
                category=parsed.get("category", "uncategorized"),
                summary=parsed.get("summary", log.message[:200]),
                severity=log.severity,
                recommendations=parsed.get("recommendations", []),
            )
        )

    logger.info("Returning %d results", len(results))
    return results


# ---------------------------------------------------------
# Health check
# ---------------------------------------------------------
@app.get("/health")
async def health_check():
    """Simple health endpoint for Kubernetes liveness / readiness probes."""
    return {"status": "healthy", "service": "incident-analysis-api"}
```
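The JSON-extraction helper is worth exercising in isolation, since it is what keeps the endpoint robust to the different shapes a model reply can take. Here is a trimmed, self-contained version of the same logic (no FastAPI needed), tried against the three response shapes the backend expects:

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Trimmed standalone version of parse_relax_json from main.py."""
    raw = raw.strip()
    try:
        return json.loads(raw)  # shape 1: pure JSON
    except json.JSONDecodeError:
        pass
    if raw.startswith("```"):   # shape 2: code-fenced JSON
        raw = re.sub(r"^```[a-zA-Z]*", "", raw).strip("` \n")
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            pass
    start, end = raw.find("{"), raw.rfind("}")  # shape 3: JSON buried in prose
    if start != -1 and end > start:
        return json.loads(raw[start : end + 1])
    raise ValueError("no JSON found in model output")

# One example of each shape:
assert extract_json('{"category": "network"}') == {"category": "network"}
assert extract_json('```json\n{"category": "network"}\n```') == {"category": "network"}
assert extract_json('Here is my analysis: {"category": "network"} Hope it helps!') == {"category": "network"}
print("all three response shapes parsed")
```

If all three assertions pass, the fallback path in `/analyze/logs` should only trigger for genuinely malformed replies.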
Create backend/requirements.txt:
```
fastapi==0.104.1
uvicorn[standard]==0.24.0
pydantic==2.5.0
relaxai==0.1.0
httpx==0.27.0
```
Step 3: Build a minimal dashboard
Create frontend/index.html with a clean, functional interface:
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Incident Analysis Dashboard</title>
<style>
  body { font-family: Arial, sans-serif; background: #f7f7f7; padding: 20px; }
  .container { max-width: 900px; margin: 0 auto; background: white; padding: 25px; border-radius: 10px; box-shadow: 0 2px 8px rgba(0,0,0,0.1); }
  h1 { margin-bottom: 10px; font-size: 32px; }
  button { padding: 8px 15px; border: 1px solid #333; background: #fff; cursor: pointer; border-radius: 4px; }
  #status { margin-left: 15px; font-size: 14px; }
  .stats { margin-top: 20px; display: flex; gap: 20px; }
  .stat-card { padding: 15px; background: #eef2ff; border-radius: 8px; width: 180px; }
  .stat-label { font-size: 14px; color: #555; }
  .stat-value { font-size: 24px; font-weight: bold; margin-top: 5px; }
  .incidents { margin-top: 30px; }
  .incident { background: #f9fafb; border-left: 4px solid #ccc; padding: 15px; margin-bottom: 20px; border-radius: 6px; }
  .incident-header { display: flex; justify-content: space-between; font-size: 16px; margin-bottom: 8px; }
  .severity-badge { padding: 3px 6px; border-radius: 4px; font-size: 12px; font-weight: bold; color: white; }
  .severity-error { background: #dc2626; }
  .severity-critical { background: #7f1d1d; }
  .severity-warning { background: #d97706; }
  .rec-item { margin-left: 10px; }
</style>
</head>
<body>
<div class="container">
  <h1>🔍 Incident Analysis Dashboard</h1>
  <div class="controls">
    <div style="margin-top:20px;">
      <label><strong>Paste Logs (JSON Array):</strong></label><br>
      <textarea id="logInput" style="width:100%; height:150px; margin-top:5px; font-family:monospace;">[
  {
    "timestamp": "2025-01-15T14:23:45Z",
    "severity": "ERROR",
    "service": "api-gateway",
    "message": "Connection timeout to database service after 30000ms",
    "metadata": { "request_id": "req-789012" }
  }
]</textarea>
    </div>
    <button onclick="analyzeIncidents()">Analyze New Incidents</button>
    <span id="status"></span>
  </div>
  <div class="stats" id="stats"></div>
  <div class="incidents" id="incidents"></div>
</div>
<script>
  // IMPORTANT: Change this before building the Docker image
  const API_URL = "http://<BACKEND-EXTERNAL-IP>:8000";

  async function analyzeIncidents() {
    const statusEl = document.getElementById("status");
    statusEl.textContent = "Analyzing...";

    let logs;
    try {
      logs = JSON.parse(document.getElementById("logInput").value);
    } catch (e) {
      statusEl.textContent = "Invalid JSON: " + e.message;
      return;
    }

    try {
      const response = await fetch(`${API_URL}/analyze/logs`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(logs)
      });
      const results = await response.json();
      displayResults(results);
      statusEl.textContent = `Analyzed ${results.length} incidents`;
    } catch (error) {
      statusEl.textContent = "Analysis failed: " + error.message;
      console.error(error);
    }
  }

  function displayResults(results) {
    const categories = {};
    results.forEach(r => categories[r.category] = (categories[r.category] || 0) + 1);
    // Sort by count so "Top Category" is actually the most frequent one
    const topCategory = Object.entries(categories)
      .sort((a, b) => b[1] - a[1])
      .map(([name]) => name)[0] || "None";

    // Stats section
    document.getElementById("stats").innerHTML = `
      <div class="stat-card"><div class="stat-label">Total Incidents</div><div class="stat-value">${results.length}</div></div>
      <div class="stat-card"><div class="stat-label">Critical</div><div class="stat-value" style="color:#dc2626">${results.filter(r => r.severity === "CRITICAL").length}</div></div>
      <div class="stat-card"><div class="stat-label">Top Category</div><div class="stat-value" style="font-size:20px; color:#3b82f6">${topCategory}</div></div>`;

    // Incident cards
    document.getElementById("incidents").innerHTML = results.map(incident => `
      <div class="incident">
        <div class="incident-header">
          <strong>${incident.category.toUpperCase()}</strong>
          <span class="severity-badge severity-${incident.severity.toLowerCase()}">${incident.severity}</span>
        </div>
        <div><strong>Root Cause:</strong> ${incident.root_cause}</div>
        <div style="margin:10px 0; color:#6b7280;">${incident.summary}</div>
        <div class="recommendations">
          <strong>Recommendations:</strong>
          ${incident.recommendations.map(r => `<div class="rec-item">• ${r}</div>`).join("")}
        </div>
      </div>`).join("");
  }

  window.onload = () => analyzeIncidents();
</script>
</body>
</html>
```
Here’s how the dashboard should look:

Source: Image by author
Important note: the value `const API_URL = "http://<BACKEND-EXTERNAL-IP>:8000";` is baked into the Docker image at build time. Deploy the backend first, wait for its LoadBalancer IP to become available, and then build the frontend image using that IP.
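Once the backend Service has an external IP, you can substitute it into index.html before building the frontend image. A sketch of that step (the IP below is a placeholder for illustration; in practice take it from the kubectl command shown in the comment):

```shell
# In practice, read the IP from the backend Service created in Step 4:
#   BACKEND_IP=$(kubectl get svc incident-backend-service \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
BACKEND_IP="203.0.113.10"  # placeholder value for illustration

# Then replace the placeholder in index.html (GNU sed; on macOS use `sed -i ''`):
#   sed -i "s|<BACKEND-EXTERNAL-IP>|${BACKEND_IP}|" index.html
# The substitution itself works like this:
echo 'const API_URL = "http://<BACKEND-EXTERNAL-IP>:8000";' \
  | sed "s|<BACKEND-EXTERNAL-IP>|${BACKEND_IP}|"
```

Using `|` as the sed delimiter avoids having to escape the slashes in the URL.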
Step 4: Deploy on Civo Kubernetes
Containerize the application with Docker:
Create backend/Dockerfile:
```dockerfile
FROM python:3.11-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY main.py .

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Build and push images:
```shell
# Build backend
cd backend
docker build -t your-registry/incident-backend:latest .
docker push your-registry/incident-backend:latest

# For frontend, use nginx
cd ../frontend
docker build -t your-registry/incident-frontend:latest -f - . <<EOF
FROM nginx:alpine
COPY index.html /usr/share/nginx/html/
EOF
docker push your-registry/incident-frontend:latest
```
Create a Kubernetes secret with your credentials:
```shell
kubectl create secret generic relaxai-secret \
  --from-literal=RELAX_API_KEY='your_relaxai_api_key_here'
```
Create the backend Kubernetes Deployment and Service in kubernetes/backend-deployment.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: incident-backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: incident-backend
  template:
    metadata:
      labels:
        app: incident-backend
    spec:
      containers:
        - name: backend
          image: your-registry/incident-backend:latest
          ports:
            - containerPort: 8000
          envFrom:
            - secretRef:
                name: relaxai-secret
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: incident-backend-service
spec:
  selector:
    app: incident-backend
  ports:
    - port: 8000
      targetPort: 8000
  type: LoadBalancer
```
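Since the backend exposes a `/health` endpoint, you can optionally wire it into liveness and readiness probes so Kubernetes restarts or stops routing to an unhealthy pod. A sketch of the extra fields to add under the backend container spec (the probe timings here are illustrative, not tuned values):

```yaml
# Add under the backend container in backend-deployment.yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /health
    port: 8000
  initialDelaySeconds: 5
  periodSeconds: 10
```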
Create the frontend Kubernetes Deployment and Service in kubernetes/frontend-deployment.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: incident-frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: incident-frontend
  template:
    metadata:
      labels:
        app: incident-frontend
    spec:
      containers:
        - name: frontend
          image: your-registry/incident-frontend:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: incident-frontend-service
spec:
  selector:
    app: incident-frontend
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
```
Deploy to the cluster:
```shell
kubectl apply -f backend-deployment.yaml
kubectl apply -f frontend-deployment.yaml

# Get service URLs
kubectl get services
```
Step 5: Test and visualize
Access the dashboard by opening http://<FRONTEND-EXTERNAL-IP> in your browser.
Send sample logs:
```json
[
  {
    "timestamp": "2025-01-15T14:23:45Z",
    "severity": "ERROR",
    "service": "api-gateway",
    "message": "Connection timeout to database service after 30000ms. Retry attempts exhausted. Last error: Connection refused on port 5432.",
    "metadata": {
      "request_id": "req-789012",
      "user_id": "user-45678"
    }
  },
  {
    "timestamp": "2025-01-15T14:24:12Z",
    "severity": "CRITICAL",
    "service": "payment-processor",
    "message": "Transaction processing failed with exception: java.lang.OutOfMemoryError: Java heap space. Current heap usage: 95%. GC overhead limit exceeded.",
    "metadata": {
      "transaction_id": "txn-234567",
      "amount": 1250.00
    }
  },
  {
    "timestamp": "2025-01-15T14:25:33Z",
    "severity": "WARNING",
    "service": "auth-service",
    "message": "High latency detected for authentication requests. Average response time: 2.5s (threshold: 500ms). Possible causes: database connection pool exhaustion.",
    "metadata": {
      "endpoint": "/api/v1/auth/login",
      "requests_per_minute": 450
    }
  }
]
```
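If you prefer to drive the API from a script instead of the dashboard, the same payload can be checked and POSTed with a few lines of Python. This sketch validates the fields the `LogEntry` model requires using only the standard library; the actual `httpx.post` call is shown in a comment since it needs a running backend and a real IP:

```python
import json

# A one-entry sample payload matching the LogEntry model fields
sample_logs = json.loads("""[
  {"timestamp": "2025-01-15T14:23:45Z", "severity": "ERROR",
   "service": "api-gateway",
   "message": "Connection timeout to database service after 30000ms",
   "metadata": {"request_id": "req-789012"}}
]""")

# Quick structural check before sending to POST /analyze/logs
required = {"timestamp", "severity", "service", "message"}
for entry in sample_logs:
    missing = required - entry.keys()
    assert not missing, f"entry missing fields: {missing}"

# To send for analysis (requires a running backend; placeholder address):
#   import httpx
#   r = httpx.post("http://<BACKEND-EXTERNAL-IP>:8000/analyze/logs", json=sample_logs)
#   print(r.json())
print(f"{len(sample_logs)} log entries validated")
```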
Here’s what you should see when analyzing the sample logs:

Source: Image by author
The dashboard displays:
- Total incident count and severity distribution
- Top error categories extracted by relaxAI
- Individual incident cards with root causes
- AI-generated recommendations for each issue
- Real-time updates when analyzing new logs
Summary
This project shows how to achieve practical AI-powered observability using sovereign infrastructure, keeping operational data fully under your control while still benefiting from advanced language-model intelligence. With relaxAI running on Civo GPUs, you can summarize and classify system logs locally, avoiding external services while using the Llama 4 Maverick 17B-128E model to extract meaningful insights from unstructured messages.
By running the entire stack (GPU compute, Kubernetes, and relaxAI inference) within a single provider, you gain real-time observability without external dependencies and ensure data sovereignty. The approach also opens paths for extension, including connecting to production monitoring pipelines, integrating with alerting tools like PagerDuty or Slack, adding historical trend analysis, and automating remediation based on AI-driven recommendations.
If you want to learn more about some of the topics we explored in this tutorial, check out some of these resources:

Mostafa Ibrahim is a software engineer and technical writer specializing in developer-focused content for SaaS and AI platforms. He currently works as a Software Engineer at GoCardless, contributing to production systems and scalable payment infrastructure.
Alongside his engineering work, Mostafa has written more than 200 technical articles reaching over 500,000 readers. His content covers topics including Kubernetes deployments, AI infrastructure, authentication systems, and retrieval-augmented generation (RAG) architectures.