MCP Server Resource Management: Memory, CPU & Connection Pooling
Building a Model Context Protocol (MCP) server that handles production traffic requires disciplined resource management. A poorly tuned MCP server can consume excessive memory, overwhelm databases with connections, or saturate its CPU under load. This guide covers production-tested patterns for connection pooling, memory optimization, CPU throttling, and graceful degradation: essential techniques for MCP servers serving ChatGPT applications at scale.
Resource management becomes critical when your MCP server powers ChatGPT apps with thousands of concurrent users. Unlike traditional REST APIs with request-response cycles, MCP servers maintain persistent connections and handle streaming responses, making resource control non-negotiable. Poor resource management leads to cascading failures: memory leaks crash servers, connection exhaustion blocks new requests, and CPU saturation increases latency.
This article provides battle-tested implementations for managing scarce resources in MCP servers. You'll learn how to implement connection pooling for databases and external APIs, optimize memory usage with streaming buffers, throttle CPU-intensive operations, and build graceful degradation patterns that keep your server responsive under stress. Each pattern includes production-oriented TypeScript and JavaScript code you can adapt to your own deployment.
Whether you're building a ChatGPT app with MCP server integration or scaling an existing deployment, these resource management techniques ensure your server remains stable, performant, and cost-effective under production loads.
Connection Pooling: Managing Database and HTTP Connections
Connection pooling prevents resource exhaustion by reusing expensive connections instead of creating new ones for each request. MCP servers handling multiple simultaneous tool calls benefit dramatically from pooled connections to databases, Redis caches, and external APIs.
The connection pool manager below implements configurable pooling with health checks, automatic reconnection, and connection lifecycle management:
// connection-pool-manager.ts
import { Pool, PoolClient } from 'pg';
import { createClient, RedisClientType } from 'redis';
import axios, { AxiosInstance } from 'axios';
import http from 'http';
import https from 'https';
import { EventEmitter } from 'events';
interface PoolConfig {
min: number;
max: number;
idleTimeoutMillis: number;
connectionTimeoutMillis: number;
healthCheckIntervalMs?: number;
}
interface PoolStats {
total: number;
idle: number;
active: number;
waiting: number;
}
export class ConnectionPoolManager extends EventEmitter {
private pgPool: Pool | null = null;
private redisPool: RedisClientType[] = [];
private httpClients: Map<string, AxiosInstance> = new Map();
private healthCheckInterval: NodeJS.Timeout | null = null;
constructor(private config: PoolConfig) {
super();
}
// Initialize PostgreSQL connection pool
async initPostgresPool(connectionString: string): Promise<void> {
this.pgPool = new Pool({
connectionString,
min: this.config.min,
max: this.config.max,
idleTimeoutMillis: this.config.idleTimeoutMillis,
connectionTimeoutMillis: this.config.connectionTimeoutMillis,
});
// Handle pool errors
this.pgPool.on('error', (err, client) => {
console.error('Unexpected pool error:', err);
this.emit('pool-error', { type: 'postgres', error: err });
});
// Verify connection
const client = await this.pgPool.connect();
await client.query('SELECT 1');
client.release();
console.log('PostgreSQL pool initialized:', this.getPostgresStats());
}
// Get client from PostgreSQL pool with an explicit timeout
async getPostgresClient(): Promise<PoolClient> {
if (!this.pgPool) {
throw new Error('PostgreSQL pool not initialized');
}
// Race the checkout against a timer, and clear the timer so it never leaks
let timer: NodeJS.Timeout | undefined;
const timeoutPromise = new Promise<never>((_, reject) => {
timer = setTimeout(() => reject(new Error('Connection timeout')),
this.config.connectionTimeoutMillis);
});
try {
return await Promise.race([
this.pgPool.connect(),
timeoutPromise,
]);
} finally {
if (timer) clearTimeout(timer);
}
}
// Initialize Redis connection pool
async initRedisPool(url: string): Promise<void> {
for (let i = 0; i < this.config.max; i++) {
const client = createClient({ url });
client.on('error', (err) => {
console.error(`Redis client ${i} error:`, err);
this.emit('pool-error', { type: 'redis', clientId: i, error: err });
});
await client.connect();
this.redisPool.push(client);
}
console.log(`Redis pool initialized: ${this.redisPool.length} clients`);
}
// Get available Redis client (round-robin)
getRedisClient(): RedisClientType {
if (this.redisPool.length === 0) {
throw new Error('Redis pool not initialized');
}
// Simple round-robin selection
const client = this.redisPool.shift()!;
this.redisPool.push(client);
return client;
}
// Create HTTP client with keep-alive pooling
getHttpClient(baseURL: string): AxiosInstance {
if (!this.httpClients.has(baseURL)) {
const client = axios.create({
baseURL,
timeout: this.config.connectionTimeoutMillis,
maxRedirects: 5,
httpAgent: new http.Agent({
keepAlive: true,
maxSockets: this.config.max,
maxFreeSockets: this.config.min,
}),
httpsAgent: new https.Agent({
keepAlive: true,
maxSockets: this.config.max,
maxFreeSockets: this.config.min,
}),
});
this.httpClients.set(baseURL, client);
}
return this.httpClients.get(baseURL)!;
}
// Get PostgreSQL pool statistics
getPostgresStats(): PoolStats {
if (!this.pgPool) {
return { total: 0, idle: 0, active: 0, waiting: 0 };
}
return {
total: this.pgPool.totalCount,
idle: this.pgPool.idleCount,
active: this.pgPool.totalCount - this.pgPool.idleCount,
waiting: this.pgPool.waitingCount,
};
}
// Health check for all pools
async healthCheck(): Promise<boolean> {
try {
// Check PostgreSQL
if (this.pgPool) {
const client = await this.getPostgresClient();
await client.query('SELECT 1');
client.release();
}
// Check Redis
if (this.redisPool.length > 0) {
const client = this.getRedisClient();
await client.ping();
}
return true;
} catch (error) {
console.error('Health check failed:', error);
this.emit('health-check-failed', error);
return false;
}
}
// Start periodic health checks
startHealthChecks(): void {
if (this.healthCheckInterval) {
return;
}
const intervalMs = this.config.healthCheckIntervalMs || 30000;
this.healthCheckInterval = setInterval(async () => {
await this.healthCheck();
}, intervalMs);
}
// Graceful shutdown
async shutdown(): Promise<void> {
console.log('Shutting down connection pools...');
if (this.healthCheckInterval) {
clearInterval(this.healthCheckInterval);
}
// Close PostgreSQL pool
if (this.pgPool) {
await this.pgPool.end();
console.log('PostgreSQL pool closed');
}
// Close Redis clients
await Promise.all(this.redisPool.map(client => client.quit()));
console.log('Redis pool closed');
console.log('All connection pools shut down');
}
}
This connection pool manager prevents resource exhaustion by capping maximum connections, reusing idle connections, and implementing health checks. The manager emits events for monitoring and integrates with MCP server monitoring patterns.
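To put the manager in context, here is a minimal usage sketch wiring it into server startup and a tool handler. The connection strings, environment variable names, and the trivial query are illustrative placeholders; substitute your own configuration.
// pool-usage-example.ts (illustrative sketch)
import { ConnectionPoolManager } from './connection-pool-manager';

async function main(): Promise<void> {
  const pools = new ConnectionPoolManager({
    min: 2,
    max: 20,
    idleTimeoutMillis: 30_000,
    connectionTimeoutMillis: 5_000,
    healthCheckIntervalMs: 30_000,
  });

  // Hypothetical connection strings; replace with your own configuration
  await pools.initPostgresPool(process.env.DATABASE_URL ?? 'postgres://localhost:5432/mcp');
  await pools.initRedisPool(process.env.REDIS_URL ?? 'redis://localhost:6379');
  pools.startHealthChecks();

  pools.on('pool-error', (event) => {
    console.error('Pool error:', event);
  });

  // Example tool handler: always release clients back to the pool
  const client = await pools.getPostgresClient();
  try {
    const result = await client.query('SELECT 1 AS ok');
    console.log('Query result:', result.rows);
  } finally {
    client.release();
  }

  await pools.shutdown();
}

main().catch(console.error);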
Memory Management: Optimizing Buffers and Preventing Leaks
MCP servers streaming large datasets to ChatGPT must carefully manage memory to avoid heap exhaustion. Streaming responses, buffering tool results, and caching frequently accessed data all compete for memory.
This memory-efficient buffer handler implements backpressure, chunk streaming, and automatic cleanup:
// memory-efficient-buffer.js
const { Transform } = require('stream');
const { EventEmitter } = require('events');
class MemoryEfficientBuffer extends EventEmitter {
constructor(options = {}) {
super();
this.maxBufferSize = options.maxBufferSize || 10 * 1024 * 1024; // 10MB default
this.chunkSize = options.chunkSize || 64 * 1024; // 64KB chunks
this.currentSize = 0;
this.buffers = [];
this.backpressure = false;
}
// Add data with backpressure handling
write(data) {
const buffer = Buffer.from(data);
const bufferSize = buffer.length;
// Check if adding this buffer exceeds limit
if (this.currentSize + bufferSize > this.maxBufferSize) {
this.backpressure = true;
this.emit('backpressure', {
current: this.currentSize,
incoming: bufferSize,
max: this.maxBufferSize,
});
return false;
}
this.buffers.push(buffer);
this.currentSize += bufferSize;
this.emit('data-added', { size: bufferSize, total: this.currentSize });
return true;
}
// Read and remove data (drain buffer)
read(size = this.chunkSize) {
if (this.buffers.length === 0) {
return null;
}
const chunks = [];
let totalSize = 0;
while (this.buffers.length > 0 && totalSize < size) {
const buffer = this.buffers.shift();
const remaining = size - totalSize;
if (buffer.length <= remaining) {
chunks.push(buffer);
totalSize += buffer.length;
this.currentSize -= buffer.length;
} else {
// Split buffer
chunks.push(buffer.slice(0, remaining));
this.buffers.unshift(buffer.slice(remaining));
totalSize += remaining;
this.currentSize -= remaining;
}
}
if (this.backpressure && this.currentSize < this.maxBufferSize * 0.5) {
this.backpressure = false;
this.emit('backpressure-released');
}
return Buffer.concat(chunks);
}
// Stream data to destination with memory control
async streamTo(destination) {
return new Promise((resolve, reject) => {
const transform = new Transform({
highWaterMark: this.chunkSize,
transform: (chunk, encoding, callback) => {
// Pass chunks through unchanged; hook per-chunk processing here if needed
callback(null, chunk);
},
});
let paused = false;
const interval = setInterval(() => {
if (paused) {
return;
}
const chunk = this.read();
if (chunk) {
if (!transform.write(chunk)) {
// Backpressure from destination: stop reading until 'drain'
paused = true;
transform.once('drain', () => {
paused = false;
this.emit('drain-resumed');
});
}
} else if (this.buffers.length === 0) {
clearInterval(interval);
transform.end();
}
}, 10);
transform.pipe(destination);
transform.on('finish', resolve);
transform.on('error', reject);
});
}
// Clear all buffers (force cleanup)
clear() {
const freedMemory = this.currentSize;
this.buffers = [];
this.currentSize = 0;
this.backpressure = false;
this.emit('cleared', { freedMemory });
// Force garbage collection if available
if (global.gc) {
global.gc();
}
}
// Get memory usage statistics
getStats() {
return {
currentSize: this.currentSize,
maxSize: this.maxBufferSize,
utilization: (this.currentSize / this.maxBufferSize) * 100,
bufferCount: this.buffers.length,
backpressure: this.backpressure,
};
}
}
// Memory leak detector
class MemoryLeakDetector {
constructor(options = {}) {
this.checkIntervalMs = options.checkIntervalMs || 60000; // 1 minute
this.thresholdMB = options.thresholdMB || 500;
this.samples = [];
this.maxSamples = 10;
this.interval = null;
}
start() {
this.interval = setInterval(() => {
const usage = process.memoryUsage();
const heapUsedMB = usage.heapUsed / 1024 / 1024;
this.samples.push({
timestamp: Date.now(),
heapUsedMB,
rss: usage.rss / 1024 / 1024,
external: usage.external / 1024 / 1024,
});
if (this.samples.length > this.maxSamples) {
this.samples.shift();
}
// Check for sustained growth
if (this.samples.length >= this.maxSamples) {
const firstSample = this.samples[0];
const lastSample = this.samples[this.samples.length - 1];
const growth = lastSample.heapUsedMB - firstSample.heapUsedMB;
if (growth > this.thresholdMB) {
console.warn(`Potential memory leak detected: ${growth.toFixed(2)}MB growth over ${this.maxSamples} samples`);
}
}
}, this.checkIntervalMs);
}
stop() {
if (this.interval) {
clearInterval(this.interval);
}
}
getReport() {
return {
samples: this.samples,
currentUsage: process.memoryUsage(),
};
}
}
module.exports = { MemoryEfficientBuffer, MemoryLeakDetector };
This buffer implementation prevents memory leaks through automatic cleanup, backpressure handling, and bounded buffer sizes. Pair this with MCP server performance optimization for production deployments.
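Here's a short usage sketch showing backpressure-aware writes and the leak detector running in the background. It is written in TypeScript and assumes the JavaScript module above is available to the build (for example via allowJs); the output path and chunk counts are illustrative.
// buffer-usage-example.ts (illustrative sketch)
import { createWriteStream } from 'fs';
import { MemoryEfficientBuffer, MemoryLeakDetector } from './memory-efficient-buffer';

const buffer = new MemoryEfficientBuffer({ maxBufferSize: 5 * 1024 * 1024, chunkSize: 64 * 1024 });
const detector = new MemoryLeakDetector({ checkIntervalMs: 60_000, thresholdMB: 300 });
detector.start();

buffer.on('backpressure', (info) => {
  // The producer should pause here until 'backpressure-released' fires
  console.warn('Buffer backpressure:', info);
});

// Producer: stop writing when write() returns false
for (let i = 0; i < 1000; i++) {
  const accepted = buffer.write(`chunk-${i}\n`);
  if (!accepted) break;
}

// Consumer: stream buffered data to any writable destination (a file, as a placeholder)
buffer
  .streamTo(createWriteStream('/tmp/mcp-output.log'))
  .then(() => {
    console.log('Stream complete', buffer.getStats());
    detector.stop();
  })
  .catch(console.error);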
CPU Throttling: Rate Limiting and Queue Management
CPU-intensive operations like data transformation, encryption, or AI inference can saturate MCP server resources. Implementing CPU throttling prevents individual requests from monopolizing compute resources.
This Express middleware enforces rate limiting and queues CPU-intensive tasks:
// cpu-throttle-middleware.ts
import { Request, Response, NextFunction } from 'express';
import PQueue from 'p-queue';
import os from 'os';
interface ThrottleConfig {
maxConcurrentCPUTasks: number;
maxQueueSize: number;
cpuThresholdPercent: number;
checkIntervalMs: number;
}
interface TaskMetrics {
queued: number;
running: number;
completed: number;
failed: number;
rejected: number;
}
export class CPUThrottleManager {
private queue: PQueue;
private metrics: TaskMetrics;
private cpuCheckInterval: NodeJS.Timeout | null = null;
private currentCPUUsage: number = 0;
constructor(private config: ThrottleConfig) {
// p-queue enforces concurrency; maxQueueSize is enforced manually in the middleware below
this.queue = new PQueue({
concurrency: config.maxConcurrentCPUTasks,
});
this.metrics = {
queued: 0,
running: 0,
completed: 0,
failed: 0,
rejected: 0,
};
this.startCPUMonitoring();
}
// Start monitoring CPU usage
private startCPUMonitoring(): void {
let lastCPUUsage = process.cpuUsage();
let lastTimestamp = Date.now();
this.cpuCheckInterval = setInterval(() => {
const currentCPUUsage = process.cpuUsage(lastCPUUsage);
const currentTimestamp = Date.now();
const elapsedMs = currentTimestamp - lastTimestamp;
// Calculate CPU usage as a percentage of total capacity across all cores
const usedMicros = currentCPUUsage.user + currentCPUUsage.system;
const availableMicros = elapsedMs * 1000 * os.cpus().length;
this.currentCPUUsage = (usedMicros / availableMicros) * 100;
lastCPUUsage = process.cpuUsage();
lastTimestamp = currentTimestamp;
}, this.config.checkIntervalMs);
}
// Middleware for CPU-intensive endpoints
middleware() {
return async (req: Request, res: Response, next: NextFunction) => {
// Check if CPU is already overloaded
if (this.currentCPUUsage > this.config.cpuThresholdPercent) {
this.metrics.rejected++;
return res.status(503).json({
error: 'Server overloaded',
cpuUsage: this.currentCPUUsage.toFixed(2) + '%',
retryAfter: 5,
});
}
// Check queue capacity
if (this.queue.size >= this.config.maxQueueSize) {
this.metrics.rejected++;
return res.status(429).json({
error: 'Too many requests',
queueSize: this.queue.size,
retryAfter: 10,
});
}
this.metrics.queued++;
req.app.locals.taskStartTime = Date.now();
try {
// Hold the queue slot until the response completes so the concurrency limit is enforced
await this.queue.add(() => new Promise<void>((resolve) => {
this.metrics.queued--;
this.metrics.running++;
res.once('close', () => {
this.metrics.running--;
this.metrics.completed++;
resolve();
});
next();
}));
} catch (error) {
this.metrics.failed++;
return res.status(500).json({ error: 'Task execution failed' });
}
};
}
// Execute CPU-intensive task with throttling
async executeTask<T>(task: () => Promise<T>): Promise<T> {
return this.queue.add(async () => {
this.metrics.running++;
const startTime = Date.now();
try {
const result = await task();
this.metrics.completed++;
this.metrics.running--;
return result;
} catch (error) {
this.metrics.failed++;
this.metrics.running--;
throw error;
} finally {
const duration = Date.now() - startTime;
console.log(`Task completed in ${duration}ms`);
}
});
}
// Get current metrics
getMetrics(): TaskMetrics & { cpuUsage: number; queueSize: number } {
return {
...this.metrics,
cpuUsage: this.currentCPUUsage,
queueSize: this.queue.size,
};
}
// Graceful shutdown
async shutdown(): Promise<void> {
if (this.cpuCheckInterval) {
clearInterval(this.cpuCheckInterval);
}
console.log('Waiting for queued tasks to complete...');
await this.queue.onIdle();
console.log('All tasks completed');
}
}
This throttle manager prevents CPU saturation by limiting concurrent tasks and rejecting requests when resources are exhausted. Integrate with MCP server deployment patterns for production resilience.
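As a usage sketch, the snippet below applies the middleware to a hypothetical CPU-heavy Express route and shows executeTask wrapping a background job separately (avoid nesting executeTask inside a throttled route, since both would compete for the same queue slots). The route names, port, and placeholder work are assumptions.
// throttle-usage-example.ts (illustrative sketch)
import express from 'express';
import { CPUThrottleManager } from './cpu-throttle-middleware';

const app = express();
const throttle = new CPUThrottleManager({
  maxConcurrentCPUTasks: 4,   // cap parallel CPU-heavy handlers
  maxQueueSize: 100,          // reject once this many requests are waiting
  cpuThresholdPercent: 85,    // shed load above 85% process CPU
  checkIntervalMs: 1000,
});

// Guard only the CPU-heavy routes; '/transform' is a hypothetical endpoint
app.post('/transform', throttle.middleware(), (req, res) => {
  // Placeholder for CPU-intensive work performed inside the throttled request
  res.json({ processed: true });
});

// For background jobs, wrap the work directly instead of going through the middleware
export async function reindexCatalog(): Promise<number> {
  return throttle.executeTask(async () => {
    // Placeholder for an expensive batch computation
    return 42;
  });
}

// Expose throttle metrics for dashboards and alerting
app.get('/throttle-metrics', (_req, res) => {
  res.json(throttle.getMetrics());
});

app.listen(3000, () => console.log('MCP server listening on port 3000'));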
Graceful Degradation: Circuit Breakers and Fallbacks
Circuit breakers prevent cascading failures by detecting unhealthy dependencies and failing fast. When an external API or database becomes unreliable, circuit breakers open to protect your MCP server.
This circuit breaker implementation tracks failures and provides fallback responses:
// circuit-breaker.ts
import { EventEmitter } from 'events';
enum CircuitState {
CLOSED = 'CLOSED', // Normal operation
OPEN = 'OPEN', // Failing, reject requests
HALF_OPEN = 'HALF_OPEN' // Testing if recovered
}
interface CircuitBreakerConfig {
failureThreshold: number; // Failures before opening
successThreshold: number; // Successes to close from half-open
timeout: number; // Time before attempting recovery (ms)
monitoringPeriodMs: number; // Rolling window for failures
}
interface CircuitMetrics {
failures: number;
successes: number;
rejections: number;
lastFailureTime: number | null;
state: CircuitState;
}
export class CircuitBreaker<T> extends EventEmitter {
private state: CircuitState = CircuitState.CLOSED;
private metrics: CircuitMetrics;
private failureTimestamps: number[] = [];
private successCount: number = 0;
private openTimestamp: number | null = null;
constructor(
private action: () => Promise<T>,
private fallback: () => T,
private config: CircuitBreakerConfig
) {
super();
this.metrics = {
failures: 0,
successes: 0,
rejections: 0,
lastFailureTime: null,
state: this.state,
};
}
// Execute action with circuit breaker protection
async execute(): Promise<T> {
// Clean old failure timestamps outside monitoring window
const now = Date.now();
this.failureTimestamps = this.failureTimestamps.filter(
ts => now - ts < this.config.monitoringPeriodMs
);
if (this.state === CircuitState.OPEN) {
// Check if timeout elapsed
if (this.openTimestamp && now - this.openTimestamp >= this.config.timeout) {
this.state = CircuitState.HALF_OPEN;
this.emit('state-change', { from: CircuitState.OPEN, to: CircuitState.HALF_OPEN });
console.log('Circuit breaker transitioning to HALF_OPEN');
} else {
this.metrics.rejections++;
this.metrics.state = this.state;
console.log('Circuit breaker OPEN, using fallback');
return this.fallback();
}
}
try {
const result = await this.action();
this.onSuccess();
return result;
} catch (error) {
this.onFailure();
console.error('Circuit breaker caught error:', error);
return this.fallback();
}
}
// Handle successful execution
private onSuccess(): void {
this.metrics.successes++;
this.successCount++;
if (this.state === CircuitState.HALF_OPEN) {
if (this.successCount >= this.config.successThreshold) {
this.state = CircuitState.CLOSED;
this.successCount = 0;
this.failureTimestamps = [];
this.emit('state-change', { from: CircuitState.HALF_OPEN, to: CircuitState.CLOSED });
console.log('Circuit breaker CLOSED (recovered)');
}
}
this.metrics.state = this.state;
}
// Handle failed execution
private onFailure(): void {
const now = Date.now();
this.metrics.failures++;
this.metrics.lastFailureTime = now;
this.failureTimestamps.push(now);
this.successCount = 0;
if (this.state === CircuitState.HALF_OPEN) {
// Single failure in HALF_OPEN reopens circuit
this.state = CircuitState.OPEN;
this.openTimestamp = now;
this.emit('state-change', { from: CircuitState.HALF_OPEN, to: CircuitState.OPEN });
console.log('Circuit breaker reopened from HALF_OPEN');
} else if (this.failureTimestamps.length >= this.config.failureThreshold) {
// Too many failures in monitoring window
this.state = CircuitState.OPEN;
this.openTimestamp = now;
this.emit('state-change', { from: CircuitState.CLOSED, to: CircuitState.OPEN });
console.log(`Circuit breaker OPEN (${this.failureTimestamps.length} failures)`);
}
this.metrics.state = this.state;
}
// Get current state and metrics
getMetrics(): CircuitMetrics {
return { ...this.metrics };
}
// Force state change (for testing/admin)
forceState(state: CircuitState): void {
const previousState = this.state;
this.state = state;
this.metrics.state = state;
this.emit('state-change', { from: previousState, to: state, forced: true });
}
}
Circuit breakers protect your MCP server from wasting resources on failing dependencies. Combine with MCP server error handling patterns for robust production systems.
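The sketch below wraps a hypothetical upstream API call with the circuit breaker and serves a degraded fallback whenever the circuit is open. The endpoint URL, response shape, and threshold values are illustrative assumptions.
// circuit-breaker-usage.ts (illustrative sketch)
import axios from 'axios';
import { CircuitBreaker } from './circuit-breaker';

interface WeatherResult {
  summary: string;
  cached: boolean;
}

// Hypothetical upstream dependency; replace with your own external call
const fetchWeather = async (): Promise<WeatherResult> => {
  const response = await axios.get('https://api.example.com/weather', { timeout: 2_000 });
  return { summary: response.data.summary, cached: false };
};

// Fallback returns a degraded-but-useful response instead of an error
const weatherFallback = (): WeatherResult => ({
  summary: 'Weather data temporarily unavailable',
  cached: true,
});

const weatherBreaker = new CircuitBreaker<WeatherResult>(fetchWeather, weatherFallback, {
  failureThreshold: 5,      // open after 5 failures in the monitoring window
  successThreshold: 2,      // close after 2 consecutive successes in HALF_OPEN
  timeout: 30_000,          // wait 30s before probing the dependency again
  monitoringPeriodMs: 60_000,
});

weatherBreaker.on('state-change', (change) => {
  console.log('Circuit state change:', change);
});

// Inside an MCP tool handler, callers never see the upstream failure directly
export async function getWeatherTool(): Promise<WeatherResult> {
  return weatherBreaker.execute();
}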
Monitoring and Alerting: Resource Metrics Collection
Effective resource management requires real-time metrics and alerting. This Prometheus-based monitor tracks CPU, memory, connections, and custom application metrics.
// resource-monitor.js
const client = require('prom-client');
const os = require('os');
class ResourceMonitor {
constructor() {
// Enable default metrics (CPU, memory, event loop lag, GC, etc.)
// Note: recent prom-client versions collect on scrape; the old `timeout` option is no longer supported
client.collectDefaultMetrics({
prefix: 'mcp_server_',
});
// Custom metrics
this.connectionPoolGauge = new client.Gauge({
name: 'mcp_server_connection_pool_size',
help: 'Connection pool sizes by type',
labelNames: ['pool_type', 'state'],
});
this.bufferSizeGauge = new client.Gauge({
name: 'mcp_server_buffer_size_bytes',
help: 'Current buffer size in bytes',
});
this.circuitBreakerGauge = new client.Gauge({
name: 'mcp_server_circuit_breaker_state',
help: 'Circuit breaker state (0=CLOSED, 1=HALF_OPEN, 2=OPEN)',
labelNames: ['service'],
});
this.taskQueueGauge = new client.Gauge({
name: 'mcp_server_task_queue_size',
help: 'Number of queued CPU tasks',
});
this.requestDuration = new client.Histogram({
name: 'mcp_server_request_duration_seconds',
help: 'Request duration in seconds',
labelNames: ['method', 'route', 'status'],
buckets: [0.1, 0.5, 1, 2, 5, 10],
});
}
// Update connection pool metrics
updateConnectionPool(poolType, stats) {
this.connectionPoolGauge.set(
{ pool_type: poolType, state: 'active' },
stats.active
);
this.connectionPoolGauge.set(
{ pool_type: poolType, state: 'idle' },
stats.idle
);
this.connectionPoolGauge.set(
{ pool_type: poolType, state: 'waiting' },
stats.waiting
);
}
// Update buffer metrics
updateBuffer(sizeBytes) {
this.bufferSizeGauge.set(sizeBytes);
}
// Update circuit breaker state
updateCircuitBreaker(service, state) {
const stateValue = { CLOSED: 0, HALF_OPEN: 1, OPEN: 2 }[state] || 0;
this.circuitBreakerGauge.set({ service }, stateValue);
}
// Update task queue size
updateTaskQueue(size) {
this.taskQueueGauge.set(size);
}
// Record request duration
recordRequest(method, route, status, durationSeconds) {
this.requestDuration.observe(
{ method, route, status: status.toString() },
durationSeconds
);
}
// Get metrics in Prometheus exposition format (returns a Promise in recent prom-client versions)
getMetrics() {
return client.register.metrics();
}
// Health check with resource thresholds
async healthCheck() {
const memUsage = process.memoryUsage();
const cpuUsage = process.cpuUsage();
const loadAvg = os.loadavg();
const heapUsedPercent = (memUsage.heapUsed / memUsage.heapTotal) * 100;
const cpuCount = os.cpus().length;
const health = {
status: 'healthy',
timestamp: new Date().toISOString(),
memory: {
heapUsedMB: (memUsage.heapUsed / 1024 / 1024).toFixed(2),
heapTotalMB: (memUsage.heapTotal / 1024 / 1024).toFixed(2),
heapUsedPercent: heapUsedPercent.toFixed(2),
rssMB: (memUsage.rss / 1024 / 1024).toFixed(2),
},
cpu: {
userMs: (cpuUsage.user / 1000).toFixed(2),
systemMs: (cpuUsage.system / 1000).toFixed(2),
loadAverage: {
'1min': loadAvg[0].toFixed(2),
'5min': loadAvg[1].toFixed(2),
'15min': loadAvg[2].toFixed(2),
},
},
uptime: process.uptime(),
};
// Check thresholds
if (heapUsedPercent > 90) {
health.status = 'degraded';
health.warnings = ['High memory usage'];
}
if (loadAvg[0] > cpuCount * 2) {
health.status = 'degraded';
health.warnings = health.warnings || [];
health.warnings.push('High CPU load');
}
return health;
}
}
module.exports = ResourceMonitor;
This monitoring system provides the observability needed to detect resource issues before they impact users. Deploy alongside MCP server monitoring and observability for complete visibility.
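A minimal wiring sketch: exposing the monitor through Prometheus-scrapable /metrics and /health endpoints in Express while recording per-request durations. It assumes the CommonJS module above is importable from TypeScript (esModuleInterop/allowJs); the port is a placeholder.
// monitoring-usage-example.ts (illustrative sketch)
import express from 'express';
import ResourceMonitor from './resource-monitor';

const app = express();
const monitor = new ResourceMonitor();

// Record request durations for every route
app.use((req, res, next) => {
  const start = process.hrtime.bigint();
  res.on('finish', () => {
    const seconds = Number(process.hrtime.bigint() - start) / 1e9;
    // In production, prefer a normalized route label to limit metric cardinality
    monitor.recordRequest(req.method, req.path, res.statusCode, seconds);
  });
  next();
});

// Prometheus scrape endpoint
app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', 'text/plain; version=0.0.4');
  res.send(await monitor.getMetrics());
});

// Health endpoint with resource thresholds
app.get('/health', async (_req, res) => {
  const health = await monitor.healthCheck();
  res.status(health.status === 'healthy' ? 200 : 503).json(health);
});

app.listen(3000, () => console.log('Metrics exposed on :3000/metrics'));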
Graceful Shutdown: Cleanup and Resource Release
Graceful shutdown ensures your MCP server releases resources cleanly during deployments or scaling events. This handler coordinates cleanup across all resource managers.
// graceful-shutdown.ts
import { Server } from 'http';
import { ConnectionPoolManager } from './connection-pool-manager';
import { CPUThrottleManager } from './cpu-throttle-middleware';
import { MemoryEfficientBuffer } from './memory-efficient-buffer';
interface ShutdownConfig {
gracePeriodMs: number; // Time to wait for connections to close
forceShutdownMs: number; // Hard timeout before forcing exit
}
export class GracefulShutdownHandler {
private isShuttingDown: boolean = false;
private shutdownCallbacks: Array<() => Promise<void>> = [];
constructor(
private server: Server,
private config: ShutdownConfig
) {
this.registerSignalHandlers();
}
// Register process signal handlers
private registerSignalHandlers(): void {
const signals: NodeJS.Signals[] = ['SIGTERM', 'SIGINT'];
signals.forEach(signal => {
process.on(signal, async () => {
console.log(`Received ${signal}, starting graceful shutdown...`);
await this.shutdown();
});
});
// Handle uncaught errors
process.on('uncaughtException', async (error) => {
console.error('Uncaught exception:', error);
await this.shutdown(1);
});
process.on('unhandledRejection', async (reason) => {
console.error('Unhandled rejection:', reason);
await this.shutdown(1);
});
}
// Register cleanup callback
onShutdown(callback: () => Promise<void>): void {
this.shutdownCallbacks.push(callback);
}
// Execute graceful shutdown
async shutdown(exitCode: number = 0): Promise<void> {
if (this.isShuttingDown) {
console.log('Shutdown already in progress...');
return;
}
this.isShuttingDown = true;
const startTime = Date.now();
// Stop accepting new connections
this.server.close(() => {
console.log('HTTP server closed');
});
// Set hard timeout
const forceTimeout = setTimeout(() => {
console.error('Force shutdown timeout reached');
process.exit(1);
}, this.config.forceShutdownMs);
try {
// Wait for grace period
await new Promise(resolve => setTimeout(resolve, this.config.gracePeriodMs));
// Execute all registered cleanup callbacks
console.log(`Executing ${this.shutdownCallbacks.length} cleanup callbacks...`);
await Promise.all(this.shutdownCallbacks.map(cb => cb()));
clearTimeout(forceTimeout);
const duration = Date.now() - startTime;
console.log(`Graceful shutdown completed in ${duration}ms`);
process.exit(exitCode);
} catch (error) {
console.error('Error during shutdown:', error);
clearTimeout(forceTimeout);
process.exit(1);
}
}
}
// Usage example
export function setupGracefulShutdown(
server: Server,
poolManager: ConnectionPoolManager,
cpuManager: CPUThrottleManager,
buffers: MemoryEfficientBuffer[]
): GracefulShutdownHandler {
const handler = new GracefulShutdownHandler(server, {
gracePeriodMs: 5000,
forceShutdownMs: 30000,
});
// Register cleanup for connection pools
handler.onShutdown(async () => {
console.log('Closing connection pools...');
await poolManager.shutdown();
});
// Register cleanup for CPU task queue
handler.onShutdown(async () => {
console.log('Waiting for CPU tasks to complete...');
await cpuManager.shutdown();
});
// Register cleanup for memory buffers
handler.onShutdown(async () => {
console.log('Clearing memory buffers...');
buffers.forEach(buffer => buffer.clear());
});
return handler;
}
This graceful shutdown handler ensures zero-downtime deployments by coordinating resource cleanup. Integrate with MCP server testing strategies to validate shutdown behavior.
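For completeness, here is a bootstrap sketch showing how the pieces from this guide plug into the single shutdown path above at server startup. The configuration values, port, and environment variables are illustrative.
// server-bootstrap.ts (illustrative sketch)
import express from 'express';
import { ConnectionPoolManager } from './connection-pool-manager';
import { CPUThrottleManager } from './cpu-throttle-middleware';
import { MemoryEfficientBuffer } from './memory-efficient-buffer';
import { setupGracefulShutdown } from './graceful-shutdown';

async function bootstrap(): Promise<void> {
  const app = express();
  const pools = new ConnectionPoolManager({
    min: 2,
    max: 20,
    idleTimeoutMillis: 30_000,
    connectionTimeoutMillis: 5_000,
  });
  const throttle = new CPUThrottleManager({
    maxConcurrentCPUTasks: 4,
    maxQueueSize: 100,
    cpuThresholdPercent: 85,
    checkIntervalMs: 1000,
  });
  const streamBuffer = new MemoryEfficientBuffer({ maxBufferSize: 10 * 1024 * 1024 });

  await pools.initPostgresPool(process.env.DATABASE_URL ?? 'postgres://localhost:5432/mcp');
  pools.startHealthChecks();

  const server = app.listen(3000, () => console.log('MCP server listening on port 3000'));

  // Wire every resource manager into a single shutdown path
  setupGracefulShutdown(server, pools, throttle, [streamBuffer]);
}

bootstrap().catch((error) => {
  console.error('Bootstrap failed:', error);
  process.exit(1);
});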
Conclusion: Building Resilient MCP Servers
Resource management transforms MCP servers from fragile prototypes into production-ready systems. By implementing connection pooling, memory optimization, CPU throttling, circuit breakers, and monitoring, you ensure your ChatGPT apps remain fast and reliable under load.
The patterns in this guide address the most common MCP server failures: connection exhaustion, memory leaks, CPU saturation, and cascading failures from unhealthy dependencies. Each code example is designed to be adapted to your stack and scaled to handle thousands of concurrent users.
Start with connection pooling to prevent resource exhaustion, add memory-efficient buffers to avoid heap overflow, implement CPU throttling for compute-intensive operations, and deploy circuit breakers to isolate failures. Monitor everything with Prometheus metrics and implement graceful shutdown to ensure zero-downtime deployments.
Ready to build ChatGPT apps with enterprise-grade resource management? Start your free trial at MakeAIHQ and deploy production-ready MCP servers in 48 hours—no coding required.
Related Resources
Internal Links
- Complete Guide to Building MCP Servers for ChatGPT Apps - Pillar guide covering architecture, tools, and deployment
- MCP Server Performance Optimization Strategies - Advanced performance tuning techniques
- MCP Server Deployment Patterns for Production - Container orchestration, CI/CD, and infrastructure
- MCP Server Monitoring and Observability - Logging, metrics, and distributed tracing
- MCP Server Error Handling Best Practices - Error handling, retries, and recovery
- MCP Server Testing Strategies for Reliability - Unit, integration, and load testing
- MCP Server Security Hardening Guide - Authentication, encryption, and security controls
- Integrating MCP Servers with ChatGPT Apps - ChatGPT integration patterns
External Links
- Node.js Memory Management Documentation - Official Node.js memory profiling guide
- Prometheus Monitoring Best Practices - Metric naming and monitoring patterns
- Connection Pooling Strategies - PostgreSQL connection pooling configuration
About MakeAIHQ: We're the no-code platform for building production-ready ChatGPT apps with MCP servers. Deploy your first app to the ChatGPT Store in 48 hours—no coding required. Start building today.