MCP Server Resource Management: Memory, CPU & Connection Pooling

Building a Model Context Protocol (MCP) server that handles production traffic requires sophisticated resource management. A poorly optimized MCP server can consume excessive memory, overwhelm databases with connections, or bottleneck CPU resources under load. This guide covers production-tested patterns for connection pooling, memory optimization, CPU throttling, and graceful degradation—essential techniques for MCP servers serving ChatGPT applications at scale.

Resource management becomes critical when your MCP server powers ChatGPT apps with thousands of concurrent users. Unlike traditional REST APIs with request-response cycles, MCP servers maintain persistent connections and handle streaming responses, making resource control non-negotiable. Poor resource management leads to cascading failures: memory leaks crash servers, connection exhaustion blocks new requests, and CPU saturation increases latency.

This article provides battle-tested implementations for managing scarce resources in MCP servers. You'll learn how to implement connection pooling for databases and external APIs, optimize memory usage with streaming buffers, throttle CPU-intensive operations, and build graceful degradation patterns that keep your server responsive under stress. Each pattern includes production-ready TypeScript and Node.js code you can adapt and deploy.

Whether you're building a ChatGPT app with MCP server integration or scaling an existing deployment, these resource management techniques ensure your server remains stable, performant, and cost-effective under production loads.

Connection Pooling: Managing Database and HTTP Connections

Connection pooling prevents resource exhaustion by reusing expensive connections instead of creating new ones for each request. MCP servers handling multiple simultaneous tool calls benefit dramatically from pooled connections to databases, Redis caches, and external APIs.

The connection pool manager below implements configurable pooling with health checks, automatic reconnection, and connection lifecycle management:

// connection-pool-manager.ts
import { Pool, PoolClient } from 'pg';
import { createClient, RedisClientType } from 'redis';
import axios, { AxiosInstance } from 'axios';
import { Agent as HttpAgent } from 'http';
import { Agent as HttpsAgent } from 'https';
import { EventEmitter } from 'events';

interface PoolConfig {
  min: number;
  max: number;
  idleTimeoutMillis: number;
  connectionTimeoutMillis: number;
  healthCheckIntervalMs?: number;
}

interface PoolStats {
  total: number;
  idle: number;
  active: number;
  waiting: number;
}

export class ConnectionPoolManager extends EventEmitter {
  private pgPool: Pool | null = null;
  private redisPool: RedisClientType[] = [];
  private httpClients: Map<string, AxiosInstance> = new Map();
  private healthCheckInterval: NodeJS.Timeout | null = null;

  constructor(private config: PoolConfig) {
    super();
  }

  // Initialize PostgreSQL connection pool
  async initPostgresPool(connectionString: string): Promise<void> {
    this.pgPool = new Pool({
      connectionString,
      min: this.config.min,
      max: this.config.max,
      idleTimeoutMillis: this.config.idleTimeoutMillis,
      connectionTimeoutMillis: this.config.connectionTimeoutMillis,
    });

    // Handle pool errors
    this.pgPool.on('error', (err, client) => {
      console.error('Unexpected pool error:', err);
      this.emit('pool-error', { type: 'postgres', error: err });
    });

    // Verify connection
    const client = await this.pgPool.connect();
    await client.query('SELECT 1');
    client.release();

    console.log('PostgreSQL pool initialized:', this.getPostgresStats());
  }

  // Get client from PostgreSQL pool with an explicit timeout
  async getPostgresClient(): Promise<PoolClient> {
    if (!this.pgPool) {
      throw new Error('PostgreSQL pool not initialized');
    }

    let timer: NodeJS.Timeout | undefined;
    const timeoutPromise = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error('Connection timeout')),
        this.config.connectionTimeoutMillis);
    });

    const connectPromise = this.pgPool.connect();

    try {
      return await Promise.race([connectPromise, timeoutPromise]);
    } catch (error) {
      // If the timeout won, release any late-arriving client back to the pool
      connectPromise.then(client => client.release()).catch(() => {});
      throw error;
    } finally {
      if (timer) clearTimeout(timer);
    }
  }

  // Initialize Redis connection pool
  async initRedisPool(url: string): Promise<void> {
    for (let i = 0; i < this.config.max; i++) {
      const client = createClient({ url });

      client.on('error', (err) => {
        console.error(`Redis client ${i} error:`, err);
        this.emit('pool-error', { type: 'redis', clientId: i, error: err });
      });

      await client.connect();
      this.redisPool.push(client);
    }

    console.log(`Redis pool initialized: ${this.redisPool.length} clients`);
  }

  // Get available Redis client (round-robin)
  getRedisClient(): RedisClientType {
    if (this.redisPool.length === 0) {
      throw new Error('Redis pool not initialized');
    }

    // Simple round-robin selection
    const client = this.redisPool.shift()!;
    this.redisPool.push(client);
    return client;
  }

  // Create HTTP client with keep-alive pooling
  getHttpClient(baseURL: string): AxiosInstance {
    if (!this.httpClients.has(baseURL)) {
      const client = axios.create({
        baseURL,
        timeout: this.config.connectionTimeoutMillis,
        maxRedirects: 5,
        httpAgent: new HttpAgent({
          keepAlive: true,
          maxSockets: this.config.max,
          maxFreeSockets: this.config.min,
        }),
        httpsAgent: new HttpsAgent({
          keepAlive: true,
          maxSockets: this.config.max,
          maxFreeSockets: this.config.min,
        }),
      });

      this.httpClients.set(baseURL, client);
    }

    return this.httpClients.get(baseURL)!;
  }

  // Get PostgreSQL pool statistics
  getPostgresStats(): PoolStats {
    if (!this.pgPool) {
      return { total: 0, idle: 0, active: 0, waiting: 0 };
    }

    return {
      total: this.pgPool.totalCount,
      idle: this.pgPool.idleCount,
      active: this.pgPool.totalCount - this.pgPool.idleCount,
      waiting: this.pgPool.waitingCount,
    };
  }

  // Health check for all pools
  async healthCheck(): Promise<boolean> {
    try {
      // Check PostgreSQL
      if (this.pgPool) {
        const client = await this.getPostgresClient();
        await client.query('SELECT 1');
        client.release();
      }

      // Check Redis
      if (this.redisPool.length > 0) {
        const client = this.getRedisClient();
        await client.ping();
      }

      return true;
    } catch (error) {
      console.error('Health check failed:', error);
      this.emit('health-check-failed', error);
      return false;
    }
  }

  // Start periodic health checks
  startHealthChecks(): void {
    if (this.healthCheckInterval) {
      return;
    }

    const intervalMs = this.config.healthCheckIntervalMs || 30000;
    this.healthCheckInterval = setInterval(async () => {
      await this.healthCheck();
    }, intervalMs);
  }

  // Graceful shutdown
  async shutdown(): Promise<void> {
    console.log('Shutting down connection pools...');

    if (this.healthCheckInterval) {
      clearInterval(this.healthCheckInterval);
    }

    // Close PostgreSQL pool
    if (this.pgPool) {
      await this.pgPool.end();
      console.log('PostgreSQL pool closed');
    }

    // Close Redis clients
    await Promise.all(this.redisPool.map(client => client.quit()));
    console.log('Redis pool closed');

    console.log('All connection pools shut down');
  }
}

This connection pool manager prevents resource exhaustion by capping maximum connections, reusing idle connections, and implementing health checks. The manager emits events for monitoring and integrates with MCP server monitoring patterns.
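A minimal wiring sketch is shown below; the connection limits and the DATABASE_URL and REDIS_URL environment variable names are illustrative assumptions, not fixed conventions:

// pool-setup.ts (illustrative)
import { ConnectionPoolManager } from './connection-pool-manager';

export async function createPools(): Promise<ConnectionPoolManager> {
  const pools = new ConnectionPoolManager({
    min: 2,
    max: 20,
    idleTimeoutMillis: 30000,
    connectionTimeoutMillis: 5000,
    healthCheckIntervalMs: 30000,
  });

  // Surface pool problems to your logging/alerting pipeline
  pools.on('pool-error', (event) => console.error('Pool error:', event));
  pools.on('health-check-failed', (err) => console.error('Health check failed:', err));

  await pools.initPostgresPool(process.env.DATABASE_URL!);
  await pools.initRedisPool(process.env.REDIS_URL!);
  pools.startHealthChecks();

  return pools;
}

Pool sizing is workload-dependent: start small, watch the waiting count from getPostgresStats(), and raise max only when requests actually queue.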

Memory Management: Optimizing Buffers and Preventing Leaks

MCP servers streaming large datasets to ChatGPT must carefully manage memory to avoid heap exhaustion. Streaming responses, buffering tool results, and caching frequently accessed data all compete for memory.

This memory-efficient buffer handler implements backpressure, chunk streaming, and automatic cleanup:

// memory-efficient-buffer.js
const { Transform } = require('stream');
const { EventEmitter } = require('events');

class MemoryEfficientBuffer extends EventEmitter {
  constructor(options = {}) {
    super();
    this.maxBufferSize = options.maxBufferSize || 10 * 1024 * 1024; // 10MB default
    this.chunkSize = options.chunkSize || 64 * 1024; // 64KB chunks
    this.currentSize = 0;
    this.buffers = [];
    this.backpressure = false;
  }

  // Add data with backpressure handling
  write(data) {
    const buffer = Buffer.from(data);
    const bufferSize = buffer.length;

    // Check if adding this buffer exceeds limit
    if (this.currentSize + bufferSize > this.maxBufferSize) {
      this.backpressure = true;
      this.emit('backpressure', {
        current: this.currentSize,
        incoming: bufferSize,
        max: this.maxBufferSize,
      });
      return false;
    }

    this.buffers.push(buffer);
    this.currentSize += bufferSize;
    this.emit('data-added', { size: bufferSize, total: this.currentSize });

    return true;
  }

  // Read and remove data (drain buffer)
  read(size = this.chunkSize) {
    if (this.buffers.length === 0) {
      return null;
    }

    const chunks = [];
    let totalSize = 0;

    while (this.buffers.length > 0 && totalSize < size) {
      const buffer = this.buffers.shift();
      const remaining = size - totalSize;

      if (buffer.length <= remaining) {
        chunks.push(buffer);
        totalSize += buffer.length;
        this.currentSize -= buffer.length;
      } else {
        // Split buffer
        chunks.push(buffer.slice(0, remaining));
        this.buffers.unshift(buffer.slice(remaining));
        totalSize += remaining;
        this.currentSize -= remaining;
      }
    }

    if (this.backpressure && this.currentSize < this.maxBufferSize * 0.5) {
      this.backpressure = false;
      this.emit('backpressure-released');
    }

    return Buffer.concat(chunks);
  }

  // Stream data to destination with memory control
  async streamTo(destination) {
    return new Promise((resolve, reject) => {
      const transform = new Transform({
        highWaterMark: this.chunkSize,
        transform: (chunk, encoding, callback) => {
          // Process chunk and pass to destination
          callback(null, chunk);
        },
      });

      let paused = false;
      const interval = setInterval(() => {
        if (paused) return;

        const chunk = this.read();
        if (chunk) {
          if (!transform.write(chunk)) {
            // Backpressure from destination: pause reads until the stream drains
            paused = true;
            transform.once('drain', () => {
              paused = false;
              this.emit('drain-resumed');
            });
          }
        } else {
          // Buffer is empty: stop polling and end the stream
          clearInterval(interval);
          transform.end();
        }
      }, 10);

      transform.pipe(destination);

      transform.on('finish', resolve);
      transform.on('error', reject);
    });
  }

  // Clear all buffers (force cleanup)
  clear() {
    const freedMemory = this.currentSize;
    this.buffers = [];
    this.currentSize = 0;
    this.backpressure = false;
    this.emit('cleared', { freedMemory });

    // Force garbage collection if available
    if (global.gc) {
      global.gc();
    }
  }

  // Get memory usage statistics
  getStats() {
    return {
      currentSize: this.currentSize,
      maxSize: this.maxBufferSize,
      utilization: (this.currentSize / this.maxBufferSize) * 100,
      bufferCount: this.buffers.length,
      backpressure: this.backpressure,
    };
  }
}

// Memory leak detector
class MemoryLeakDetector {
  constructor(options = {}) {
    this.checkIntervalMs = options.checkIntervalMs || 60000; // 1 minute
    this.thresholdMB = options.thresholdMB || 500;
    this.samples = [];
    this.maxSamples = 10;
    this.interval = null;
  }

  start() {
    this.interval = setInterval(() => {
      const usage = process.memoryUsage();
      const heapUsedMB = usage.heapUsed / 1024 / 1024;

      this.samples.push({
        timestamp: Date.now(),
        heapUsedMB,
        rss: usage.rss / 1024 / 1024,
        external: usage.external / 1024 / 1024,
      });

      if (this.samples.length > this.maxSamples) {
        this.samples.shift();
      }

      // Check for sustained growth
      if (this.samples.length >= this.maxSamples) {
        const firstSample = this.samples[0];
        const lastSample = this.samples[this.samples.length - 1];
        const growth = lastSample.heapUsedMB - firstSample.heapUsedMB;

        if (growth > this.thresholdMB) {
          console.warn(`Potential memory leak detected: ${growth.toFixed(2)}MB growth over ${this.maxSamples} samples`);
        }
      }
    }, this.checkIntervalMs);
  }

  stop() {
    if (this.interval) {
      clearInterval(this.interval);
    }
  }

  getReport() {
    return {
      samples: this.samples,
      currentUsage: process.memoryUsage(),
    };
  }
}

module.exports = { MemoryEfficientBuffer, MemoryLeakDetector };

This buffer implementation keeps memory bounded through backpressure handling, capped buffer sizes, and explicit cleanup, while the leak detector flags sustained heap growth before it becomes an outage. Pair this with MCP server performance optimization for production deployments.
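To see how the pieces fit together, here is a minimal sketch of buffering a large tool result and draining it to disk; the file path, size limits, and the CommonJS require are illustrative assumptions:

// buffer-usage.ts (illustrative)
import { createWriteStream } from 'fs';
const { MemoryEfficientBuffer, MemoryLeakDetector } = require('./memory-efficient-buffer');

export async function bufferLargeToolResult(chunks: string[]): Promise<void> {
  const buffer = new MemoryEfficientBuffer({ maxBufferSize: 5 * 1024 * 1024 });

  buffer.on('backpressure', (info: { current: number; max: number }) => {
    console.warn(`Buffer near capacity: ${info.current}/${info.max} bytes`);
  });

  for (const chunk of chunks) {
    // write() returns false when the cap would be exceeded: flush to disk, then retry
    if (!buffer.write(chunk)) {
      await buffer.streamTo(createWriteStream('/tmp/tool-result.part', { flags: 'a' }));
      buffer.write(chunk);
    }
  }

  // Drain whatever remains with bounded memory
  await buffer.streamTo(createWriteStream('/tmp/tool-result.part', { flags: 'a' }));
}

// Process-wide leak detection: warn on sustained heap growth across samples
const leakDetector = new MemoryLeakDetector({ checkIntervalMs: 60000, thresholdMB: 200 });
leakDetector.start();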

CPU Throttling: Rate Limiting and Queue Management

CPU-intensive operations like data transformation, encryption, or AI inference can saturate MCP server resources. Implementing CPU throttling prevents individual requests from monopolizing compute resources.

This Express middleware applies admission control and queues CPU-intensive tasks:

// cpu-throttle-middleware.ts
import { Request, Response, NextFunction } from 'express';
import PQueue from 'p-queue';
import os from 'os';

interface ThrottleConfig {
  maxConcurrentCPUTasks: number;
  maxQueueSize: number;
  cpuThresholdPercent: number;
  checkIntervalMs: number;
}

interface TaskMetrics {
  queued: number;
  running: number;
  completed: number;
  failed: number;
  rejected: number;
}

export class CPUThrottleManager {
  private queue: PQueue;
  private metrics: TaskMetrics;
  private cpuCheckInterval: NodeJS.Timeout | null = null;
  private currentCPUUsage: number = 0;

  constructor(private config: ThrottleConfig) {
    // p-queue limits concurrency but has no built-in size cap; maxQueueSize is
    // enforced manually in the middleware below
    this.queue = new PQueue({
      concurrency: config.maxConcurrentCPUTasks,
    });

    this.metrics = {
      queued: 0,
      running: 0,
      completed: 0,
      failed: 0,
      rejected: 0,
    };

    this.startCPUMonitoring();
  }

  // Start monitoring CPU usage
  private startCPUMonitoring(): void {
    let lastCPUUsage = process.cpuUsage();
    let lastTimestamp = Date.now();

    this.cpuCheckInterval = setInterval(() => {
      const currentCPUUsage = process.cpuUsage(lastCPUUsage);
      const currentTimestamp = Date.now();
      const elapsedMs = currentTimestamp - lastTimestamp;

      // Calculate CPU usage as a percentage of total capacity across all cores
      const usedMicroseconds = currentCPUUsage.user + currentCPUUsage.system;
      const availableMicroseconds = elapsedMs * 1000 * os.cpus().length;
      this.currentCPUUsage = (usedMicroseconds / availableMicroseconds) * 100;

      lastCPUUsage = process.cpuUsage();
      lastTimestamp = currentTimestamp;
    }, this.config.checkIntervalMs);
  }

  // Middleware for CPU-intensive endpoints
  middleware() {
    return async (req: Request, res: Response, next: NextFunction) => {
      // Check if CPU is already overloaded
      if (this.currentCPUUsage > this.config.cpuThresholdPercent) {
        this.metrics.rejected++;
        return res.status(503).json({
          error: 'Server overloaded',
          cpuUsage: this.currentCPUUsage.toFixed(2) + '%',
          retryAfter: 5,
        });
      }

      // Check queue capacity
      if (this.queue.size >= this.config.maxQueueSize) {
        this.metrics.rejected++;
        return res.status(429).json({
          error: 'Too many requests',
          queueSize: this.queue.size,
          retryAfter: 10,
        });
      }

      this.metrics.queued++;
      res.locals.taskStartTime = Date.now();

      try {
        await this.queue.add(async () => {
          this.metrics.queued--;
          this.metrics.running++;

          // Balance the running counter once the response has been sent
          res.once('finish', () => {
            this.metrics.running--;
          });

          next();
        });
      } catch (error) {
        this.metrics.failed++;
        return res.status(500).json({ error: 'Task execution failed' });
      }
    };
  }

  // Execute CPU-intensive task with throttling
  async executeTask<T>(task: () => Promise<T>): Promise<T> {
    return this.queue.add(async () => {
      this.metrics.running++;
      const startTime = Date.now();

      try {
        const result = await task();
        this.metrics.completed++;
        this.metrics.running--;
        return result;
      } catch (error) {
        this.metrics.failed++;
        this.metrics.running--;
        throw error;
      } finally {
        const duration = Date.now() - startTime;
        console.log(`Task completed in ${duration}ms`);
      }
    });
  }

  // Get current metrics
  getMetrics(): TaskMetrics & { cpuUsage: number; queueSize: number } {
    return {
      ...this.metrics,
      cpuUsage: this.currentCPUUsage,
      queueSize: this.queue.size,
    };
  }

  // Graceful shutdown
  async shutdown(): Promise<void> {
    if (this.cpuCheckInterval) {
      clearInterval(this.cpuCheckInterval);
    }

    console.log('Waiting for queued tasks to complete...');
    await this.queue.onIdle();
    console.log('All tasks completed');
  }
}

This throttle manager prevents CPU saturation by limiting concurrent tasks and rejecting requests when resources are exhausted. Integrate with MCP server deployment patterns for production resilience.
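Wiring the throttle into an Express-based MCP server might look like the sketch below; the route path, limits, and the heavyTransform placeholder are illustrative assumptions:

// throttle-setup.ts (illustrative)
import express from 'express';
import { CPUThrottleManager } from './cpu-throttle-middleware';

const app = express();
app.use(express.json());

const throttle = new CPUThrottleManager({
  maxConcurrentCPUTasks: 4,
  maxQueueSize: 100,
  cpuThresholdPercent: 85,
  checkIntervalMs: 1000,
});

// heavyTransform stands in for your actual CPU-bound work
async function heavyTransform(payload: unknown): Promise<unknown> {
  return payload;
}

// The middleware handles admission control; executeTask bounds the work itself
app.post('/tools/transform-dataset', throttle.middleware(), async (req, res) => {
  const result = await throttle.executeTask(() => heavyTransform(req.body));
  res.json(result);
});

// Expose throttle metrics for dashboards and autoscaling decisions
app.get('/internal/throttle-metrics', (_req, res) => res.json(throttle.getMetrics()));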

Graceful Degradation: Circuit Breakers and Fallbacks

Circuit breakers prevent cascading failures by detecting unhealthy dependencies and failing fast. When an external API or database becomes unreliable, circuit breakers open to protect your MCP server.

This circuit breaker implementation tracks failures and provides fallback responses:

// circuit-breaker.ts
import { EventEmitter } from 'events';

enum CircuitState {
  CLOSED = 'CLOSED',     // Normal operation
  OPEN = 'OPEN',         // Failing, reject requests
  HALF_OPEN = 'HALF_OPEN' // Testing if recovered
}

interface CircuitBreakerConfig {
  failureThreshold: number;       // Failures before opening
  successThreshold: number;       // Successes to close from half-open
  timeout: number;                // Time before attempting recovery (ms)
  monitoringPeriodMs: number;     // Rolling window for failures
}

interface CircuitMetrics {
  failures: number;
  successes: number;
  rejections: number;
  lastFailureTime: number | null;
  state: CircuitState;
}

export class CircuitBreaker<T> extends EventEmitter {
  private state: CircuitState = CircuitState.CLOSED;
  private metrics: CircuitMetrics;
  private failureTimestamps: number[] = [];
  private successCount: number = 0;
  private openTimestamp: number | null = null;

  constructor(
    private action: () => Promise<T>,
    private fallback: () => T,
    private config: CircuitBreakerConfig
  ) {
    super();
    this.metrics = {
      failures: 0,
      successes: 0,
      rejections: 0,
      lastFailureTime: null,
      state: this.state,
    };
  }

  // Execute action with circuit breaker protection
  async execute(): Promise<T> {
    // Clean old failure timestamps outside monitoring window
    const now = Date.now();
    this.failureTimestamps = this.failureTimestamps.filter(
      ts => now - ts < this.config.monitoringPeriodMs
    );

    if (this.state === CircuitState.OPEN) {
      // Check if timeout elapsed
      if (this.openTimestamp && now - this.openTimestamp >= this.config.timeout) {
        this.state = CircuitState.HALF_OPEN;
        this.emit('state-change', { from: CircuitState.OPEN, to: CircuitState.HALF_OPEN });
        console.log('Circuit breaker transitioning to HALF_OPEN');
      } else {
        this.metrics.rejections++;
        this.metrics.state = this.state;
        console.log('Circuit breaker OPEN, using fallback');
        return this.fallback();
      }
    }

    try {
      const result = await this.action();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      console.error('Circuit breaker caught error:', error);
      return this.fallback();
    }
  }

  // Handle successful execution
  private onSuccess(): void {
    this.metrics.successes++;
    this.successCount++;

    if (this.state === CircuitState.HALF_OPEN) {
      if (this.successCount >= this.config.successThreshold) {
        this.state = CircuitState.CLOSED;
        this.successCount = 0;
        this.failureTimestamps = [];
        this.emit('state-change', { from: CircuitState.HALF_OPEN, to: CircuitState.CLOSED });
        console.log('Circuit breaker CLOSED (recovered)');
      }
    }

    this.metrics.state = this.state;
  }

  // Handle failed execution
  private onFailure(): void {
    const now = Date.now();
    this.metrics.failures++;
    this.metrics.lastFailureTime = now;
    this.failureTimestamps.push(now);
    this.successCount = 0;

    if (this.state === CircuitState.HALF_OPEN) {
      // Single failure in HALF_OPEN reopens circuit
      this.state = CircuitState.OPEN;
      this.openTimestamp = now;
      this.emit('state-change', { from: CircuitState.HALF_OPEN, to: CircuitState.OPEN });
      console.log('Circuit breaker reopened from HALF_OPEN');
    } else if (this.failureTimestamps.length >= this.config.failureThreshold) {
      // Too many failures in monitoring window
      this.state = CircuitState.OPEN;
      this.openTimestamp = now;
      this.emit('state-change', { from: CircuitState.CLOSED, to: CircuitState.OPEN });
      console.log(`Circuit breaker OPEN (${this.failureTimestamps.length} failures)`);
    }

    this.metrics.state = this.state;
  }

  // Get current state and metrics
  getMetrics(): CircuitMetrics {
    return { ...this.metrics };
  }

  // Force state change (for testing/admin)
  forceState(state: CircuitState): void {
    const previousState = this.state;
    this.state = state;
    this.metrics.state = state;

    // Record when the circuit was forced open so the recovery timeout still applies
    if (state === CircuitState.OPEN) {
      this.openTimestamp = Date.now();
    }

    this.emit('state-change', { from: previousState, to: state, forced: true });
  }
}

Circuit breakers protect your MCP server from wasting resources on failing dependencies. Combine with MCP server error handling patterns for robust production systems.
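As a concrete example, the sketch below wraps an outbound HTTP dependency in the breaker; the endpoint URL, thresholds, and fallback payload are illustrative assumptions:

// breaker-usage.ts (illustrative)
import axios from 'axios';
import { CircuitBreaker } from './circuit-breaker';

interface WeatherResult {
  summary: string;
  stale: boolean;
}

const weatherBreaker = new CircuitBreaker<WeatherResult>(
  // Action: the protected dependency call
  async () => {
    const { data } = await axios.get('https://api.example.com/weather', { timeout: 2000 });
    return { summary: data.summary, stale: false };
  },
  // Fallback: cheap, synchronous degraded response
  () => ({ summary: 'Weather data temporarily unavailable', stale: true }),
  {
    failureThreshold: 5,
    successThreshold: 2,
    timeout: 30000,
    monitoringPeriodMs: 60000,
  }
);

weatherBreaker.on('state-change', (change) => console.warn('Weather breaker:', change));

// Inside an MCP tool handler
export async function getWeatherTool(): Promise<WeatherResult> {
  return weatherBreaker.execute();
}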

Monitoring and Alerting: Resource Metrics Collection

Effective resource management requires real-time metrics and alerting. This Prometheus-based monitor tracks CPU, memory, connections, and custom application metrics.

// resource-monitor.js
const client = require('prom-client');
const os = require('os');

class ResourceMonitor {
  constructor() {
    // Enable default metrics (CPU, memory, event loop, etc.)
    client.collectDefaultMetrics({
      prefix: 'mcp_server_',
    });

    // Custom metrics
    this.connectionPoolGauge = new client.Gauge({
      name: 'mcp_server_connection_pool_size',
      help: 'Connection pool sizes by type',
      labelNames: ['pool_type', 'state'],
    });

    this.bufferSizeGauge = new client.Gauge({
      name: 'mcp_server_buffer_size_bytes',
      help: 'Current buffer size in bytes',
    });

    this.circuitBreakerGauge = new client.Gauge({
      name: 'mcp_server_circuit_breaker_state',
      help: 'Circuit breaker state (0=CLOSED, 1=HALF_OPEN, 2=OPEN)',
      labelNames: ['service'],
    });

    this.taskQueueGauge = new client.Gauge({
      name: 'mcp_server_task_queue_size',
      help: 'Number of queued CPU tasks',
    });

    this.requestDuration = new client.Histogram({
      name: 'mcp_server_request_duration_seconds',
      help: 'Request duration in seconds',
      labelNames: ['method', 'route', 'status'],
      buckets: [0.1, 0.5, 1, 2, 5, 10],
    });
  }

  // Update connection pool metrics
  updateConnectionPool(poolType, stats) {
    this.connectionPoolGauge.set(
      { pool_type: poolType, state: 'active' },
      stats.active
    );
    this.connectionPoolGauge.set(
      { pool_type: poolType, state: 'idle' },
      stats.idle
    );
    this.connectionPoolGauge.set(
      { pool_type: poolType, state: 'waiting' },
      stats.waiting
    );
  }

  // Update buffer metrics
  updateBuffer(sizeBytes) {
    this.bufferSizeGauge.set(sizeBytes);
  }

  // Update circuit breaker state
  updateCircuitBreaker(service, state) {
    const stateValue = { CLOSED: 0, HALF_OPEN: 1, OPEN: 2 }[state] || 0;
    this.circuitBreakerGauge.set({ service }, stateValue);
  }

  // Update task queue size
  updateTaskQueue(size) {
    this.taskQueueGauge.set(size);
  }

  // Record request duration
  recordRequest(method, route, status, durationSeconds) {
    this.requestDuration.observe(
      { method, route, status: status.toString() },
      durationSeconds
    );
  }

  // Get metrics in Prometheus format
  getMetrics() {
    return client.register.metrics();
  }

  // Health check with resource thresholds
  async healthCheck() {
    const memUsage = process.memoryUsage();
    const cpuUsage = process.cpuUsage();
    const loadAvg = os.loadavg();

    const heapUsedPercent = (memUsage.heapUsed / memUsage.heapTotal) * 100;
    const cpuCount = os.cpus().length;

    const health = {
      status: 'healthy',
      timestamp: new Date().toISOString(),
      memory: {
        heapUsedMB: (memUsage.heapUsed / 1024 / 1024).toFixed(2),
        heapTotalMB: (memUsage.heapTotal / 1024 / 1024).toFixed(2),
        heapUsedPercent: heapUsedPercent.toFixed(2),
        rssMB: (memUsage.rss / 1024 / 1024).toFixed(2),
      },
      cpu: {
        userMs: (cpuUsage.user / 1000).toFixed(2),
        systemMs: (cpuUsage.system / 1000).toFixed(2),
        loadAverage: {
          '1min': loadAvg[0].toFixed(2),
          '5min': loadAvg[1].toFixed(2),
          '15min': loadAvg[2].toFixed(2),
        },
      },
      uptime: process.uptime(),
    };

    // Check thresholds
    if (heapUsedPercent > 90) {
      health.status = 'degraded';
      health.warnings = ['High memory usage'];
    }

    if (loadAvg[0] > cpuCount * 2) {
      health.status = 'degraded';
      health.warnings = health.warnings || [];
      health.warnings.push('High CPU load');
    }

    return health;
  }
}

module.exports = ResourceMonitor;

This monitoring system provides the observability needed to detect resource issues before they impact users. Deploy alongside MCP server monitoring and observability for complete visibility.
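A minimal sketch of exposing these metrics from an Express-based MCP server follows; the routes, the 15-second refresh interval, and the wiring to the pool manager are illustrative assumptions:

// monitor-endpoints.ts (illustrative)
import express from 'express';
import { ConnectionPoolManager } from './connection-pool-manager';
const ResourceMonitor = require('./resource-monitor');

export function mountMonitoring(app: express.Express, pools: ConnectionPoolManager): void {
  const monitor = new ResourceMonitor();

  // Push pool stats into Prometheus gauges on a fixed interval
  setInterval(() => {
    monitor.updateConnectionPool('postgres', pools.getPostgresStats());
  }, 15000);

  // Prometheus scrape endpoint (metrics() returns a Promise in recent prom-client versions)
  app.get('/metrics', async (_req, res) => {
    res.set('Content-Type', 'text/plain');
    res.send(await monitor.getMetrics());
  });

  // Health endpoint backed by the threshold checks in healthCheck()
  app.get('/health', async (_req, res) => {
    const health = await monitor.healthCheck();
    res.status(health.status === 'healthy' ? 200 : 503).json(health);
  });
}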

Graceful Shutdown: Cleanup and Resource Release

Graceful shutdown ensures your MCP server releases resources cleanly during deployments or scaling events. This handler coordinates cleanup across all resource managers.

// graceful-shutdown.ts
import { Server } from 'http';
import { ConnectionPoolManager } from './connection-pool-manager';
import { CPUThrottleManager } from './cpu-throttle-middleware';
import { MemoryEfficientBuffer } from './memory-efficient-buffer';

interface ShutdownConfig {
  gracePeriodMs: number;      // Time to wait for connections to close
  forceShutdownMs: number;    // Hard timeout before forcing exit
}

export class GracefulShutdownHandler {
  private isShuttingDown: boolean = false;
  private shutdownCallbacks: Array<() => Promise<void>> = [];

  constructor(
    private server: Server,
    private config: ShutdownConfig
  ) {
    this.registerSignalHandlers();
  }

  // Register process signal handlers
  private registerSignalHandlers(): void {
    const signals: NodeJS.Signals[] = ['SIGTERM', 'SIGINT'];

    signals.forEach(signal => {
      process.on(signal, async () => {
        console.log(`Received ${signal}, starting graceful shutdown...`);
        await this.shutdown();
      });
    });

    // Handle uncaught errors
    process.on('uncaughtException', async (error) => {
      console.error('Uncaught exception:', error);
      await this.shutdown(1);
    });

    process.on('unhandledRejection', async (reason) => {
      console.error('Unhandled rejection:', reason);
      await this.shutdown(1);
    });
  }

  // Register cleanup callback
  onShutdown(callback: () => Promise<void>): void {
    this.shutdownCallbacks.push(callback);
  }

  // Execute graceful shutdown
  async shutdown(exitCode: number = 0): Promise<void> {
    if (this.isShuttingDown) {
      console.log('Shutdown already in progress...');
      return;
    }

    this.isShuttingDown = true;
    const startTime = Date.now();

    // Stop accepting new connections
    this.server.close(() => {
      console.log('HTTP server closed');
    });

    // Set hard timeout
    const forceTimeout = setTimeout(() => {
      console.error('Force shutdown timeout reached');
      process.exit(1);
    }, this.config.forceShutdownMs);

    try {
      // Wait for grace period
      await new Promise(resolve => setTimeout(resolve, this.config.gracePeriodMs));

      // Execute all registered cleanup callbacks
      console.log(`Executing ${this.shutdownCallbacks.length} cleanup callbacks...`);
      await Promise.all(this.shutdownCallbacks.map(cb => cb()));

      clearTimeout(forceTimeout);
      const duration = Date.now() - startTime;
      console.log(`Graceful shutdown completed in ${duration}ms`);
      process.exit(exitCode);
    } catch (error) {
      console.error('Error during shutdown:', error);
      clearTimeout(forceTimeout);
      process.exit(1);
    }
  }
}

// Usage example
export function setupGracefulShutdown(
  server: Server,
  poolManager: ConnectionPoolManager,
  cpuManager: CPUThrottleManager,
  buffers: MemoryEfficientBuffer[]
): GracefulShutdownHandler {
  const handler = new GracefulShutdownHandler(server, {
    gracePeriodMs: 5000,
    forceShutdownMs: 30000,
  });

  // Register cleanup for connection pools
  handler.onShutdown(async () => {
    console.log('Closing connection pools...');
    await poolManager.shutdown();
  });

  // Register cleanup for CPU task queue
  handler.onShutdown(async () => {
    console.log('Waiting for CPU tasks to complete...');
    await cpuManager.shutdown();
  });

  // Register cleanup for memory buffers
  handler.onShutdown(async () => {
    console.log('Clearing memory buffers...');
    buffers.forEach(buffer => buffer.clear());
  });

  return handler;
}

This graceful shutdown handler ensures zero-downtime deployments by coordinating resource cleanup. Integrate with MCP server testing strategies to validate shutdown behavior.
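Tying it together, a server entry point might look like the sketch below; it reuses the illustrative pool-setup helper from earlier, and the port and limits are assumptions:

// server.ts (illustrative)
import express from 'express';
import { createPools } from './pool-setup';
import { CPUThrottleManager } from './cpu-throttle-middleware';
import { setupGracefulShutdown } from './graceful-shutdown';

async function main(): Promise<void> {
  const app = express();
  const pools = await createPools();
  const throttle = new CPUThrottleManager({
    maxConcurrentCPUTasks: 4,
    maxQueueSize: 100,
    cpuThresholdPercent: 85,
    checkIntervalMs: 1000,
  });

  const server = app.listen(3000, () => console.log('MCP server listening on :3000'));

  // SIGTERM/SIGINT now drain the pools, the CPU queue, and any registered buffers
  setupGracefulShutdown(server, pools, throttle, []);
}

main().catch((err) => {
  console.error('Fatal startup error:', err);
  process.exit(1);
});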

Conclusion: Building Resilient MCP Servers

Resource management transforms MCP servers from fragile prototypes into production-ready systems. By implementing connection pooling, memory optimization, CPU throttling, circuit breakers, and monitoring, you ensure your ChatGPT apps remain fast and reliable under load.

The patterns in this guide prevent the most common MCP server failures: connection exhaustion, memory leaks, CPU saturation, and cascading failures from unhealthy dependencies. Each production-ready code example can be deployed immediately and scaled to handle thousands of concurrent users.

Start with connection pooling to prevent resource exhaustion, add memory-efficient buffers to avoid heap overflow, implement CPU throttling for compute-intensive operations, and deploy circuit breakers to isolate failures. Monitor everything with Prometheus metrics and implement graceful shutdown to ensure zero-downtime deployments.

Ready to build ChatGPT apps with enterprise-grade resource management? Start your free trial at MakeAIHQ and deploy production-ready MCP servers in 48 hours—no coding required.


Related Resources

Internal Links

  • Complete Guide to Building MCP Servers for ChatGPT Apps - Pillar guide covering architecture, tools, and deployment
  • MCP Server Performance Optimization Strategies - Advanced performance tuning techniques
  • MCP Server Deployment Patterns for Production - Container orchestration, CI/CD, and infrastructure
  • MCP Server Monitoring and Observability - Logging, metrics, and distributed tracing
  • MCP Server Error Handling Best Practices - Error handling, retries, and recovery
  • MCP Server Testing Strategies for Reliability - Unit, integration, and load testing
  • MCP Server Security Hardening Guide - Authentication, encryption, and security controls
  • Integrating MCP Servers with ChatGPT Apps - ChatGPT integration patterns

About MakeAIHQ: We're the no-code platform for building production-ready ChatGPT apps with MCP servers. Deploy your first app to the ChatGPT Store in 48 hours—no coding required. Start building today.