ELK Stack Log Aggregation for ChatGPT Apps

Managing logs across distributed ChatGPT applications becomes exponentially complex as your deployment scales. When you're running multiple MCP servers, handling thousands of tool calls per minute, and debugging real-time conversation flows, traditional log files scattered across containers quickly become unmanageable. You need centralized log aggregation that provides real-time search, pattern recognition, and visual analytics.

The ELK Stack (Elasticsearch, Logstash, Kibana) has become the industry-standard solution for log aggregation and analysis at scale. This powerful combination enables you to collect logs from all your ChatGPT app components—MCP servers, widget runtime, authentication services, and backend APIs—into a centralized, searchable index with real-time dashboards.

In this comprehensive guide, you'll learn how to deploy a production-ready ELK Stack for ChatGPT application log aggregation. We'll cover the complete architecture, Docker Compose setup, Logstash pipeline configuration, Kibana dashboard creation, and production deployment strategies with security best practices.

For the complete ChatGPT development workflow, see our Complete Guide to Building ChatGPT Applications. If you want to skip the infrastructure complexity and focus on building your app, MakeAIHQ provides managed logging and monitoring out of the box.

ELK Stack Architecture for ChatGPT Apps

The ELK Stack (Elasticsearch, Logstash, Kibana), extended here with Filebeat and often called the Elastic Stack, consists of four core components working together to create a complete log aggregation pipeline. Understanding each component's role is essential for designing a reliable logging infrastructure.

Elasticsearch: The Search Engine

Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. It stores your log data in indices (similar to databases) and provides near-real-time search capabilities across billions of log entries.

For ChatGPT applications, Elasticsearch indexes contain structured log documents with fields like:

  • timestamp: When the event occurred
  • log_level: DEBUG, INFO, WARN, ERROR, CRITICAL
  • service_name: Which MCP server or component generated the log
  • tool_name: Which ChatGPT tool was invoked
  • user_id: Which user triggered the event (when authenticated)
  • message: The actual log message
  • stack_trace: Error stack traces for debugging
  • response_time: Performance metrics for tool calls

Elasticsearch automatically creates inverted indices for full-text search, allowing you to find logs like "all ERROR logs from the restaurant-booking MCP server in the last 24 hours where response_time > 5000ms" in milliseconds.
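
That query maps directly onto the Elasticsearch query DSL. A minimal sketch, assuming the field names above, the chatgpt-logs-* daily index naming used later in this guide, and the elastic superuser credentials from the Docker setup below:

# Find ERROR logs from restaurant-booking in the last 24h with response_time > 5000ms
curl -s -u elastic:${ELASTIC_PASSWORD} \
  -X GET "http://localhost:9200/chatgpt-logs-*/_search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": {
      "bool": {
        "filter": [
          {"term":  {"log_level": "ERROR"}},
          {"term":  {"service_name": "restaurant-booking"}},
          {"range": {"@timestamp": {"gte": "now-24h"}}},
          {"range": {"response_time": {"gt": 5000}}}
        ]
      }
    },
    "sort": [{"@timestamp": "desc"}],
    "size": 50
  }'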

Logstash: The Data Pipeline

Logstash is a server-side data processing pipeline that ingests logs from multiple sources, transforms and enriches them, and sends them to Elasticsearch. It operates in three stages:

  1. Input plugins: Collect logs from files, HTTP endpoints, message queues, databases
  2. Filter plugins: Parse, transform, enrich log data (grok patterns, JSON parsing, GeoIP lookup)
  3. Output plugins: Send processed logs to Elasticsearch, S3, monitoring systems

For ChatGPT apps, Logstash pipelines typically:

  • Parse JSON-formatted logs from containerized MCP servers
  • Extract structured fields from unstructured log messages using grok patterns
  • Add metadata like environment (production, staging), region, deployment version
  • Calculate derived metrics (request duration, token usage, error rates)
  • Route logs to different Elasticsearch indices based on log level or service

Kibana: The Visualization Layer

Kibana is the web-based UI for visualizing Elasticsearch data. It provides:

  • Discover: Full-text search interface for exploring logs
  • Visualizations: Charts, graphs, maps, tables for log analytics
  • Dashboards: Pre-built collections of visualizations for monitoring
  • Canvas: Pixel-perfect infographic-style reports

For ChatGPT applications, Kibana dashboards typically show:

  • Real-time request volume by tool name
  • Error rate trends over time
  • P50/P95/P99 latency percentiles
  • Top 10 slowest tools
  • Geographic distribution of users (from IP addresses)
  • Alert thresholds (e.g., error rate > 5%)

Filebeat: The Lightweight Shipper

Filebeat is a lightweight agent that ships log files from your application servers to Logstash or Elasticsearch. Unlike Logstash (which is resource-intensive), Filebeat is designed to run on every server with minimal overhead.

For Docker-based ChatGPT deployments, Filebeat:

  • Mounts the Docker socket to collect container logs
  • Tails log files in real-time
  • Adds metadata like container name, labels, environment variables
  • Handles backpressure when Logstash is overloaded
  • Guarantees at-least-once delivery with persistent state

The typical data flow is: ChatGPT App → Filebeat → Logstash → Elasticsearch → Kibana.

Learn more about logging best practices in our guide: MCP Server Logging Best Practices for ChatGPT.

Production Docker Compose Setup

Deploying the ELK Stack in Docker provides consistency across development, staging, and production environments. This Docker Compose configuration creates a production-ready cluster with proper networking, volumes, and security.

# docker-compose.elk.yml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.3
    container_name: chatgpt-elasticsearch
    environment:
      # Cluster configuration
      - cluster.name=chatgpt-logs-cluster
      - node.name=chatgpt-es-node-01
      - discovery.type=single-node

      # Memory configuration (CRITICAL for production)
      - ES_JAVA_OPTS=-Xms4g -Xmx4g
      - bootstrap.memory_lock=true

      # Security configuration
      - xpack.security.enabled=true
      - xpack.security.http.ssl.enabled=false
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}

      # Performance tuning
      - indices.memory.index_buffer_size=30%
      - thread_pool.write.queue_size=1000
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
      - ./elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:ro
    ports:
      - "9200:9200"
      - "9300:9300"
    networks:
      - elk
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:9200/_cluster/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
    restart: unless-stopped

  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.3
    container_name: chatgpt-logstash
    environment:
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - XPACK_MONITORING_ENABLED=true
      - XPACK_MONITORING_ELASTICSEARCH_HOSTS=http://elasticsearch:9200
      - XPACK_MONITORING_ELASTICSEARCH_USERNAME=elastic
      - XPACK_MONITORING_ELASTICSEARCH_PASSWORD=${ELASTIC_PASSWORD}
      - LS_JAVA_OPTS=-Xmx2g -Xms2g
    volumes:
      - ./logstash/config/logstash.yml:/usr/share/logstash/config/logstash.yml:ro
      - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
      - ./logstash/patterns:/usr/share/logstash/patterns:ro
    ports:
      - "5044:5044"  # Beats input
      - "9600:9600"  # Logstash monitoring API
    networks:
      - elk
    depends_on:
      elasticsearch:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:9600/_node/stats || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
    restart: unless-stopped

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.3
    container_name: chatgpt-kibana
    environment:
      - SERVERNAME=chatgpt-kibana
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}
      - XPACK_SECURITY_ENABLED=true
      - XPACK_ENCRYPTEDSAVEDOBJECTS_ENCRYPTIONKEY=${KIBANA_ENCRYPTION_KEY}
    volumes:
      - ./kibana/config/kibana.yml:/usr/share/kibana/config/kibana.yml:ro
      - kibana-data:/usr/share/kibana/data
    ports:
      - "5601:5601"
    networks:
      - elk
    depends_on:
      elasticsearch:
        condition: service_healthy
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:5601/api/status || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
    restart: unless-stopped

  filebeat:
    image: docker.elastic.co/beats/filebeat:8.11.3
    container_name: chatgpt-filebeat
    user: root
    environment:
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
    volumes:
      - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - filebeat-data:/usr/share/filebeat/data
    networks:
      - elk
    depends_on:
      logstash:
        condition: service_healthy
    command: filebeat -e -strict.perms=false
    restart: unless-stopped

volumes:
  elasticsearch-data:
    driver: local
  kibana-data:
    driver: local
  filebeat-data:
    driver: local

networks:
  elk:
    driver: bridge

Critical production considerations:

  1. Memory allocation: Elasticsearch requires -Xms and -Xmx to be equal (prevents heap resizing). Allocate 50% of available RAM (max 32GB due to compressed pointers).

  2. Volume persistence: Named volumes ensure data survives container restarts. For production, use block storage (AWS EBS, GCP Persistent Disk).

  3. Health checks: Ensure services start in the correct order (Elasticsearch → Logstash → Kibana → Filebeat).

  4. Security: Use environment variables for passwords. Generate strong keys with openssl rand -hex 32.
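
A minimal .env file for the Compose file above might look like the sketch below. The variable names match the configuration; the values are placeholders you generate yourself, and ENVIRONMENT, AWS_REGION, and DEPLOYMENT_VERSION are consumed by the Logstash and Filebeat configurations later in this guide (they need to be passed into those containers' environments).

# .env (docker compose loads this file from the project directory)
ELASTIC_PASSWORD=<output of: openssl rand -hex 32>
KIBANA_PASSWORD=<output of: openssl rand -hex 32>
KIBANA_ENCRYPTION_KEY=<output of: openssl rand -hex 32>
ENVIRONMENT=production
AWS_REGION=us-east-1
DEPLOYMENT_VERSION=1.0.0

Note that KIBANA_PASSWORD must match the password of the built-in kibana_system user in Elasticsearch; one way to set it is docker exec -it chatgpt-elasticsearch bin/elasticsearch-reset-password -u kibana_system -i once the cluster is healthy.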

Logstash Pipeline Configuration

Logstash pipelines define how logs flow from inputs through filters to outputs. This production-ready pipeline handles ChatGPT application logs with JSON parsing, field extraction, and enrichment.

# logstash/pipeline/chatgpt-app.conf

input {
  # Beats input (receives logs from Filebeat)
  beats {
    port => 5044
    codec => json
  }

  # HTTP input (for direct log shipping from apps)
  http {
    port => 8080
    codec => json
    additional_codecs => {
      "application/json" => "json"
    }
  }
}

filter {
  # Parse JSON logs from MCP servers
  if [message] =~ /^\{.*\}$/ {
    json {
      source => "message"
      target => "parsed"
    }

    # Promote parsed fields to top level
    if [parsed] {
      mutate {
        rename => {
          "[parsed][level]" => "log_level"
          "[parsed][timestamp]" => "log_timestamp"
          "[parsed][service]" => "service_name"
          "[parsed][tool]" => "tool_name"
          "[parsed][user_id]" => "user_id"
          "[parsed][duration_ms]" => "response_time"
          "[parsed][error]" => "error_message"
          "[parsed][stack]" => "stack_trace"
        }
      }
    }
  }

  # Parse unstructured logs with grok patterns
  if ![log_level] {
    grok {
      match => {
        "message" => "%{TIMESTAMP_ISO8601:log_timestamp} %{LOGLEVEL:log_level} \[%{DATA:service_name}\] %{GREEDYDATA:log_message}"
      }
      patterns_dir => ["/usr/share/logstash/patterns"]
    }
  }

  # Convert timestamps to @timestamp field
  if [log_timestamp] {
    date {
      match => ["log_timestamp", "ISO8601", "yyyy-MM-dd'T'HH:mm:ss.SSSZ"]
      target => "@timestamp"
      remove_field => ["log_timestamp"]
    }
  }

  # Normalize log levels
  mutate {
    uppercase => ["log_level"]
  }

  # Add environment metadata
  mutate {
    add_field => {
      "environment" => "${ENVIRONMENT:production}"
      "region" => "${AWS_REGION:us-east-1}"
      "deployment_version" => "${DEPLOYMENT_VERSION:unknown}"
    }
  }

  # Parse user agent strings
  if [http_user_agent] {
    useragent {
      source => "http_user_agent"
      target => "user_agent"
    }
  }

  # GeoIP lookup for client IPs
  if [client_ip] {
    geoip {
      source => "client_ip"
      target => "geoip"
      fields => ["city_name", "country_name", "location"]
    }
  }

  # Calculate derived metrics
  if [response_time] {
    ruby {
      code => "
        response_time = event.get('response_time').to_f
        event.set('response_time_category',
          case response_time
          when 0..100 then 'fast'
          when 101..500 then 'normal'
          when 501..2000 then 'slow'
          else 'very_slow'
          end
        )
      "
    }
  }

  # Tag errors for alerting
  if [log_level] == "ERROR" or [log_level] == "CRITICAL" {
    mutate {
      add_tag => ["error_log"]
    }
  }

  # Fingerprint for deduplication (used as the document_id in the output below)
  fingerprint {
    source => ["message", "@timestamp", "service_name"]
    target => "[@metadata][fingerprint]"
    method => "SHA256"
    concatenate_sources => true
  }

  # Remove unnecessary fields
  mutate {
    remove_field => ["host", "agent", "ecs", "input", "parsed"]
  }
}

output {
  # Primary output: Elasticsearch
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    user => "elastic"
    password => "${ELASTIC_PASSWORD}"

    # Dynamic index routing by date
    index => "chatgpt-logs-%{+YYYY.MM.dd}"

    # Document ID (prevents duplicates)
    document_id => "%{[@metadata][fingerprint]}"

    # ILM policy (Index Lifecycle Management)
    ilm_enabled => true
    ilm_rollover_alias => "chatgpt-logs"
    ilm_pattern => "{now/d}-000001"
    ilm_policy => "chatgpt-logs-policy"
  }

  # Error output: Separate index for ERROR/CRITICAL logs
  if "error_log" in [tags] {
    elasticsearch {
      hosts => ["http://elasticsearch:9200"]
      user => "elastic"
      password => "${ELASTIC_PASSWORD}"
      index => "chatgpt-errors-%{+YYYY.MM.dd}"
    }
  }

  # Debugging output (only in non-production)
  if "${ENVIRONMENT:production}" != "production" {
    stdout {
      codec => rubydebug
    }
  }
}

Pipeline highlights:

  • Dual input: Accepts logs from Filebeat (port 5044) and direct HTTP (port 8080); see the example after this list
  • JSON parsing: Extracts structured fields from JSON logs
  • Grok patterns: Parses unstructured logs when JSON isn't available
  • Enrichment: Adds GeoIP, user agent parsing, environment metadata
  • Dynamic indexing: Creates daily indices (chatgpt-logs-2026.12.25); with ILM enabled, writes go through the chatgpt-logs rollover alias
  • Error routing: Sends ERROR/CRITICAL logs to a separate index for faster alerting

For advanced log analysis techniques, see our guide: Log Analysis with Kibana for ChatGPT.

Custom Grok Patterns for ChatGPT Logs

Grok patterns enable you to parse unstructured log messages into structured fields. These custom patterns handle common ChatGPT application log formats.

# logstash/patterns/chatgpt-patterns.txt

# MCP Server log pattern
# Example: 2026-12-25T14:32:18.456Z INFO [restaurant-booking] Tool call: create_reservation user=user_123 duration=234ms
MCP_LOG %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:service}\] Tool call: %{DATA:tool} user=%{DATA:user_id} duration=%{NUMBER:duration_ms}ms

# Widget runtime pattern
# Example: [2026-12-25 14:32:18] WARN Widget timeout: MapWidget component=InteractiveMap timeout=5000ms
WIDGET_LOG \[%{TIMESTAMP_ISO8601:timestamp}\] %{LOGLEVEL:level} Widget %{DATA:event_type}: %{DATA:widget_name} component=%{DATA:component_name} timeout=%{NUMBER:timeout_ms}ms

# Authentication log pattern
# Example: 2026-12-25T14:32:18Z INFO [auth-service] OAuth token verified: user_id=user_123 scope=read_profile,write_apps ip=203.0.113.42
AUTH_LOG %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[auth-service\] %{DATA:auth_event}: user_id=%{DATA:user_id} scope=%{DATA:scopes} ip=%{IP:client_ip}

# Error with stack trace pattern
# Example: 2026-12-25T14:32:18Z ERROR [mcp-server] UnhandledPromiseRejection: Connection timeout
ERROR_LOG %{TIMESTAMP_ISO8601:timestamp} ERROR \[%{DATA:service}\] %{DATA:error_type}: %{GREEDYDATA:error_message}

# Performance metric pattern
# Example: METRIC tool_call_duration_ms=234 service=restaurant-booking tool=create_reservation percentile=p95
METRIC_LOG METRIC %{DATA:metric_name}=%{NUMBER:metric_value} service=%{DATA:service} tool=%{DATA:tool} percentile=%{DATA:percentile}

Usage in pipeline:

filter {
  grok {
    match => {
      "message" => [
        "%{MCP_LOG}",
        "%{WIDGET_LOG}",
        "%{AUTH_LOG}",
        "%{ERROR_LOG}",
        "%{METRIC_LOG}"
      ]
    }
    patterns_dir => ["/usr/share/logstash/patterns"]
  }
}
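
Applied to the MCP server sample line above, the MCP_LOG pattern yields structured fields roughly like the following. Grok captures are strings by default; append a type such as %{NUMBER:duration_ms:int} if you want the duration indexed as a number.

{
  "timestamp": "2026-12-25T14:32:18.456Z",
  "level": "INFO",
  "service": "restaurant-booking",
  "tool": "create_reservation",
  "user_id": "user_123",
  "duration_ms": "234"
}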

Grok debugger tool: Use Kibana's Dev Tools → Grok Debugger to test patterns against real log samples.

Filebeat Configuration for Docker Containers

Filebeat ships logs from Docker containers to Logstash with minimal resource overhead. This configuration collects logs from all ChatGPT app containers with metadata enrichment.

# filebeat/filebeat.yml

filebeat.inputs:
  # Docker container log input
  - type: container
    enabled: true
    paths:
      - '/var/lib/docker/containers/*/*.log'

    # Decode JSON logs from containers
    json.keys_under_root: true
    json.overwrite_keys: true
    json.add_error_key: true

    # Add Docker metadata
    processors:
      - add_docker_metadata:
          host: "unix:///var/run/docker.sock"
          match_fields: ["container.id"]
          labels.dedot: true

      # Add container labels as fields
      - decode_json_fields:
          fields: ["message"]
          process_array: false
          max_depth: 3
          target: ""
          overwrite_keys: true

      # Add custom fields
      - add_fields:
          target: ''
          fields:
            environment: ${ENVIRONMENT:production}
            region: ${AWS_REGION:us-east-1}

    # Filter containers by label: drop events from containers without logging=enabled
    # (the container input has no standalone "condition" option, so use a processor)
      - drop_event:
          when:
            not:
              equals:
                container.labels.logging: "enabled"

  # File input (for non-containerized logs)
  - type: log
    enabled: true
    paths:
      - /var/log/chatgpt-apps/*.log
    fields:
      log_source: file_system
    multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
    multiline.negate: true
    multiline.match: after

# Filebeat modules (optional)
filebeat.modules:
  - module: system
    syslog:
      enabled: true
    auth:
      enabled: true

# Output to Logstash
output.logstash:
  hosts: ["logstash:5044"]

  # Load balancing across multiple Logstash instances
  loadbalance: true

  # Enable compression
  compression_level: 3

  # Bulk settings
  bulk_max_size: 2048
  worker: 2

  # Backpressure handling
  slow_start: true

# Logging configuration
logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat.log
  keepfiles: 7
  permissions: 0644

# Performance tuning
queue.mem:
  events: 4096
  flush.min_events: 512
  flush.timeout: 1s

# Monitoring
monitoring.enabled: true
monitoring.elasticsearch:
  hosts: ["http://elasticsearch:9200"]
  username: "elastic"
  password: "${ELASTIC_PASSWORD}"

Key features:

  • Container log collection: Automatically tails and ships logs from every Docker container on the host
  • Metadata enrichment: Adds container name, labels, image, environment variables
  • JSON decoding: Parses JSON logs before sending to Logstash
  • Label filtering: Only ships logs from containers with logging=enabled label
  • Multiline handling: Combines stack traces into single log events
  • Backpressure: Slows down when Logstash is overloaded

Add logging label to ChatGPT app containers:

# In your ChatGPT app docker-compose.yml
services:
  mcp-server:
    labels:
      - "logging=enabled"

Kibana Dashboard Configuration

Kibana dashboards visualize log data for real-time monitoring and debugging. This dashboard configuration tracks ChatGPT application health, performance, and error rates.

{
  "title": "ChatGPT Application Monitoring Dashboard",
  "description": "Real-time monitoring for ChatGPT apps: request volume, latency, errors, tool usage",
  "panels": [
    {
      "id": "request_volume_timeline",
      "type": "line",
      "title": "Request Volume (Requests/min)",
      "gridData": {"x": 0, "y": 0, "w": 12, "h": 4},
      "visState": {
        "type": "line",
        "params": {
          "type": "line",
          "grid": {"categoryLines": false},
          "categoryAxes": [{"id": "CategoryAxis-1", "type": "category", "position": "bottom", "show": true}],
          "valueAxes": [{"id": "ValueAxis-1", "name": "Requests", "type": "value", "position": "left", "show": true}],
          "seriesParams": [{"show": true, "type": "line", "mode": "normal", "data": {"label": "Requests", "id": "1"}}]
        },
        "aggs": [
          {"id": "1", "enabled": true, "type": "count", "schema": "metric"},
          {"id": "2", "enabled": true, "type": "date_histogram", "schema": "segment", "params": {"field": "@timestamp", "interval": "1m", "min_doc_count": 0}}
        ]
      }
    },
    {
      "id": "error_rate_gauge",
      "type": "gauge",
      "title": "Error Rate (%)",
      "gridData": {"x": 12, "y": 0, "w": 6, "h": 4},
      "visState": {
        "type": "gauge",
        "params": {
          "gauge": {
            "gaugeType": "Arc",
            "percentageMode": true,
            "colorSchema": "Green to Red",
            "gaugeStyle": "Full",
            "backStyle": "Full",
            "orientation": "vertical",
            "verticalSplit": false,
            "labels": {"show": true, "color": "black"},
            "scale": {"show": true, "labels": false, "color": "#333"},
            "type": "meter",
            "style": {"bgFill": "#000", "fontSize": 60}
          }
        },
        "aggs": [
          {"id": "1", "enabled": true, "type": "count", "schema": "metric", "params": {"customLabel": "Error Rate"}},
          {"id": "2", "enabled": true, "type": "filters", "schema": "group", "params": {"filters": [{"input": {"query": "log_level:ERROR OR log_level:CRITICAL"}, "label": "Errors"}]}}
        ]
      }
    },
    {
      "id": "response_time_percentiles",
      "type": "area",
      "title": "Response Time Percentiles (ms)",
      "gridData": {"x": 18, "y": 0, "w": 6, "h": 4},
      "visState": {
        "type": "area",
        "aggs": [
          {"id": "1", "enabled": true, "type": "percentiles", "schema": "metric", "params": {"field": "response_time", "percents": [50, 95, 99]}},
          {"id": "2", "enabled": true, "type": "date_histogram", "schema": "segment", "params": {"field": "@timestamp", "interval": "1m"}}
        ]
      }
    },
    {
      "id": "top_tools_table",
      "type": "table",
      "title": "Top 10 Tools by Request Count",
      "gridData": {"x": 0, "y": 4, "w": 12, "h": 4},
      "visState": {
        "type": "table",
        "params": {
          "perPage": 10,
          "showPartialRows": false,
          "showMetricsAtAllLevels": false,
          "sort": {"columnIndex": null, "direction": null},
          "showTotal": true,
          "totalFunc": "sum"
        },
        "aggs": [
          {"id": "1", "enabled": true, "type": "count", "schema": "metric"},
          {"id": "2", "enabled": true, "type": "terms", "schema": "bucket", "params": {"field": "tool_name.keyword", "size": 10, "order": "desc", "orderBy": "1"}}
        ]
      }
    },
    {
      "id": "geographic_distribution_map",
      "type": "map",
      "title": "User Geographic Distribution",
      "gridData": {"x": 12, "y": 4, "w": 12, "h": 4},
      "visState": {
        "type": "map",
        "params": {
          "mapType": "Coordinate Map",
          "isDesaturated": false,
          "mapZoom": 2,
          "mapCenter": [0, 0]
        },
        "aggs": [
          {"id": "1", "enabled": true, "type": "count", "schema": "metric"},
          {"id": "2", "enabled": true, "type": "geohash_grid", "schema": "segment", "params": {"field": "geoip.location", "autoPrecision": true, "precision": 3}}
        ]
      }
    },
    {
      "id": "error_logs_table",
      "type": "table",
      "title": "Recent Error Logs",
      "gridData": {"x": 0, "y": 8, "w": 24, "h": 4},
      "visState": {
        "type": "table",
        "params": {
          "perPage": 20,
          "showPartialRows": false,
          "showMetricsAtAllLevels": false
        },
        "aggs": [
          {"id": "1", "enabled": true, "type": "top_hits", "schema": "metric", "params": {"field": "_source", "size": 20, "sortField": "@timestamp", "sortOrder": "desc"}},
          {"id": "2", "enabled": true, "type": "filters", "schema": "bucket", "params": {"filters": [{"input": {"query": "log_level:ERROR OR log_level:CRITICAL"}, "label": ""}]}}
        ]
      }
    }
  ],
  "timeRestore": true,
  "timeFrom": "now-1h",
  "timeTo": "now",
  "refreshInterval": {
    "pause": false,
    "value": 30000
  }
}

Dashboard features:

  • Request volume timeline: Tracks requests per minute with 1-minute granularity
  • Error rate gauge: Real-time error percentage with color-coded thresholds (green < 1%, yellow 1-5%, red > 5%)
  • Response time percentiles: P50, P95, P99 latency visualization
  • Top tools table: Shows which tools are most frequently called
  • Geographic map: User distribution based on GeoIP lookup
  • Error logs table: Live feed of ERROR/CRITICAL logs with full details

Import dashboard:

# Kibana 8.x imports dashboards through the saved objects API, which expects an
# NDJSON export. Export or convert the dashboard to chatgpt-dashboard.ndjson, then:
curl -X POST "http://localhost:5601/api/saved_objects/_import?overwrite=true" \
  -H "kbn-xsrf: true" \
  -u elastic:${ELASTIC_PASSWORD} \
  --form file=@chatgpt-dashboard.ndjson

Elasticsearch Index Template and ILM Policy

Index templates define field mappings and settings for new indices. ILM (Index Lifecycle Management) policies automate index lifecycle: rollover, retention, deletion.

{
  "index_patterns": ["chatgpt-logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.codec": "best_compression",
      "refresh_interval": "5s",
      "index.lifecycle.name": "chatgpt-logs-policy",
      "index.lifecycle.rollover_alias": "chatgpt-logs"
    },
    "mappings": {
      "properties": {
        "@timestamp": {"type": "date"},
        "log_level": {"type": "keyword"},
        "service_name": {"type": "keyword"},
        "tool_name": {"type": "keyword"},
        "user_id": {"type": "keyword"},
        "response_time": {"type": "long"},
        "response_time_category": {"type": "keyword"},
        "error_message": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
        "stack_trace": {"type": "text"},
        "message": {"type": "text"},
        "environment": {"type": "keyword"},
        "region": {"type": "keyword"},
        "deployment_version": {"type": "keyword"},
        "client_ip": {"type": "ip"},
        "geoip": {
          "properties": {
            "city_name": {"type": "keyword"},
            "country_name": {"type": "keyword"},
            "location": {"type": "geo_point"}
          }
        }
      }
    }
  }
}

ILM Policy:

{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_primary_shard_size": "50GB",
            "max_age": "1d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "3d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          },
          "forcemerge": {
            "max_num_segments": 1
          },
          "set_priority": {
            "priority": 50
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}

Lifecycle phases:

  1. Hot phase: Active indices receiving writes. Rollover after 1 day or 50GB per shard.
  2. Warm phase: Older indices (3+ days). Shrink to 1 shard, force merge segments for compression.
  3. Delete phase: Indices older than 30 days are automatically deleted.

Apply template and policy:

# Create index template
curl -s -u elastic:${ELASTIC_PASSWORD} -X PUT "http://localhost:9200/_index_template/chatgpt-logs-template" \
  -H "Content-Type: application/json" \
  -d @index-template.json

# Create ILM policy
curl -s -u elastic:${ELASTIC_PASSWORD} -X PUT "http://localhost:9200/_ilm/policy/chatgpt-logs-policy" \
  -H "Content-Type: application/json" \
  -d @ilm-policy.json
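
Once logs are flowing, you can confirm that new indices picked up the template and policy:

# Check which lifecycle phase each log index is in
curl -s -u elastic:${ELASTIC_PASSWORD} "http://localhost:9200/chatgpt-logs-*/_ilm/explain?pretty"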

For index optimization strategies, see: Elasticsearch Optimization for ChatGPT.

Production Deployment and Scaling

Deploying the ELK Stack to production requires careful planning for high availability, security, backup, and monitoring.

High Availability Architecture

For production ChatGPT applications handling millions of logs per day:

  • Elasticsearch cluster: Minimum 3 master-eligible nodes (quorum = 2). Separate data nodes for horizontal scaling; discovery settings are sketched after this list.
  • Logstash: Deploy 2+ instances behind a load balancer (Filebeat automatically load balances).
  • Kibana: Run 2+ instances behind a load balancer with session affinity.
  • Filebeat: Deploy as a DaemonSet (Kubernetes) or on every Docker host.
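
To move from the single-node Compose setup to a three-node cluster, replace discovery.type=single-node with seed hosts and initial master nodes. A sketch of the per-node environment (hostnames and node names are illustrative):

# Per-node environment for a 3-node cluster (shown for node 01)
- node.name=chatgpt-es-node-01
- discovery.seed_hosts=es-host-01,es-host-02,es-host-03
- cluster.initial_master_nodes=chatgpt-es-node-01,chatgpt-es-node-02,chatgpt-es-node-03

cluster.initial_master_nodes is only used for the very first cluster bootstrap and should be removed once the cluster has formed.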

Example AWS deployment:

  • Elasticsearch: 3× c5.2xlarge instances (8 vCPU, 16GB RAM) across 3 availability zones
  • Logstash: 2× c5.xlarge instances (4 vCPU, 8GB RAM) behind Application Load Balancer
  • Kibana: 2× t3.medium instances (2 vCPU, 4GB RAM) behind ALB with sticky sessions

Security Hardening

  1. Enable X-Pack Security: Authentication, role-based access control (RBAC), field-level security
  2. TLS/SSL encryption: Encrypt all communication (Elasticsearch cluster, Logstash → ES, Kibana → ES)
  3. API key authentication: Use API keys instead of passwords for application log shipping (creation example below)
  4. Network segmentation: Place Elasticsearch/Logstash in private subnets, expose only Kibana via ALB
  5. Audit logging: Enable audit logs for all authentication, authorization, and data access events

# elasticsearch.yml security configuration
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.audit.enabled: true
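
For item 3, a key scoped to writing log indices can be created through the security API. A sketch; the key name, role name, and privileges shown are illustrative choices:

# Create an API key that can only write to the log indices
curl -s -u elastic:${ELASTIC_PASSWORD} \
  -X POST "http://localhost:9200/_security/api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "chatgpt-log-shipper",
    "role_descriptors": {
      "log_writer": {
        "cluster": ["monitor"],
        "indices": [
          {"names": ["chatgpt-logs-*", "chatgpt-errors-*"], "privileges": ["create_doc", "create_index", "auto_configure"]}
        ]
      }
    }
  }'

The response includes an id and api_key; Beats and Logstash Elasticsearch outputs accept them as api_key: "id:api_key" in place of username and password.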

Backup and Disaster Recovery

Snapshot repository (S3 example):

# Register S3 snapshot repository
curl -X PUT "http://localhost:9200/_snapshot/chatgpt-logs-backup" -H "Content-Type: application/json" -d '{
  "type": "s3",
  "settings": {
    "bucket": "chatgpt-elasticsearch-backups",
    "region": "us-east-1",
    "base_path": "snapshots",
    "compress": true
  }
}'

# Create snapshot (automated via cron or Elasticsearch snapshot policy)
curl -X PUT "http://localhost:9200/_snapshot/chatgpt-logs-backup/snapshot-$(date +%Y%m%d-%H%M%S)?wait_for_completion=false"

Snapshot policy (automated daily backups, 30-day retention):

{
  "policy": {
    "schedule": "0 2 * * *",
    "name": "<chatgpt-logs-{now/d}>",
    "repository": "chatgpt-logs-backup",
    "config": {
      "indices": ["chatgpt-logs-*"],
      "ignore_unavailable": false,
      "include_global_state": false
    },
    "retention": {
      "expire_after": "30d",
      "min_count": 5,
      "max_count": 50
    }
  }
}
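
The policy is registered through the snapshot lifecycle management (SLM) API; the policy ID and file name below are illustrative:

# Register the SLM policy (the snapshot repository above must exist first)
curl -s -u elastic:${ELASTIC_PASSWORD} -X PUT "http://localhost:9200/_slm/policy/chatgpt-logs-daily" \
  -H "Content-Type: application/json" \
  -d @slm-policy.json

# Trigger an immediate run to verify the policy and repository work
curl -s -u elastic:${ELASTIC_PASSWORD} -X POST "http://localhost:9200/_slm/policy/chatgpt-logs-daily/_execute"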

Monitoring and Alerting

Use Elasticsearch built-in monitoring (X-Pack):

# Enable monitoring in elasticsearch.yml
xpack.monitoring.collection.enabled: true

# In Kibana: Stack Monitoring shows cluster health, node stats, index stats

Watcher alerts for critical errors:

{
  "trigger": {
    "schedule": {"interval": "5m"}
  },
  "input": {
    "search": {
      "request": {
        "indices": ["chatgpt-logs-*"],
        "body": {
          "query": {
            "bool": {
              "must": [
                {"range": {"@timestamp": {"gte": "now-5m"}}},
                {"terms": {"log_level": ["ERROR", "CRITICAL"]}}
              ]
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {"ctx.payload.hits.total": {"gte": 100}}
  },
  "actions": {
    "send_email": {
      "email": {
        "to": "ops@example.com",
        "subject": "ChatGPT App: High Error Rate Alert",
        "body": "Detected {{ctx.payload.hits.total}} errors in the last 5 minutes"
      }
    }
  }
}
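
Register the watch through the Watcher API (Watcher requires a non-Basic license; on the free Basic tier, Kibana alerting rules are the usual alternative). The watch ID and file name below are illustrative, and the email action assumes an SMTP account is configured under xpack.notification.email in elasticsearch.yml:

# Register the error-rate watch (save the JSON above as error-watch.json)
curl -s -u elastic:${ELASTIC_PASSWORD} -X PUT "http://localhost:9200/_watcher/watch/chatgpt-error-spike" \
  -H "Content-Type: application/json" \
  -d @error-watch.json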

Conclusion

The ELK Stack provides a production-ready log aggregation platform for ChatGPT applications, enabling centralized search, real-time analytics, and proactive monitoring across distributed MCP servers and widgets. With the Docker Compose setup, Logstash pipelines, and Kibana dashboards provided in this guide, you now have a complete logging infrastructure that scales from prototype to production.

Key takeaways:

  • Elasticsearch provides fast, scalable log storage with near-real-time search
  • Logstash transforms and enriches logs with filters, grok patterns, and metadata
  • Kibana visualizes log data with customizable dashboards and alerts
  • Filebeat ships logs from Docker containers with minimal overhead
  • Production deployment requires high availability, security hardening, backup strategies, and monitoring

For complete ChatGPT application development workflows including logging integration, see our Complete Guide to Building ChatGPT Applications.

Skip the Infrastructure Complexity

Building and maintaining the ELK Stack requires significant DevOps expertise and ongoing management. If you'd rather focus on building your ChatGPT application instead of managing logging infrastructure, MakeAIHQ provides:

  • Managed log aggregation with centralized search and real-time dashboards
  • Pre-built monitoring for MCP servers, tool calls, and error tracking
  • Automatic alerting for performance degradation and error spikes
  • No infrastructure to manage – we handle Elasticsearch, Logstash, Kibana scaling and upgrades

Start your free trial and deploy production ChatGPT apps with enterprise logging in minutes, not weeks.

