feat: add HTTP polling fallback for /events SSE endpoint (corporate proxy compatibility) #204

@felipeconti


Problem

The AgentAPI chat UI is completely broken in corporate environments that use SSL inspection proxies (e.g., Netskope, Zscaler, Palo Alto). These proxies terminate long-lived SSE connections over HTTP/2, causing ERR_HTTP2_PROTOCOL_ERROR in the browser.

This affects both subdomain and path-based routing — the proxy kills the /events SSE stream regardless of the URL path. Regular short-lived requests (/status, /messages) work fine.

Evidence

Performance entries from the browser show the issue clearly:

| Endpoint | Protocol | Duration | Result |
|---|---|---|---|
| /status | h2 | ~50ms | ✅ Works |
| /messages | h2 | ~50ms | ✅ Works |
| /events | "" (failed) | 16+ seconds | ERR_HTTP2_PROTOCOL_ERROR |

The EventSource never receives any events — readyState goes from 0 (CONNECTING) straight to 2 (CLOSED), and the onerror handler fires immediately. Since chat-provider.tsx relies entirely on SSE for both message updates (message_update events) and status changes (status_change events), the chat input stays permanently disabled and no messages are ever rendered.
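The failure pattern described above (the stream never opens and ends up CLOSED) can be told apart from an ordinary dropped connection. The following is only a sketch: `SseObservation` and `looksLikeBlockedStream` are hypothetical names, not part of chat-provider.tsx, and the numeric constants mirror the standard EventSource readyState values.

```typescript
// Standard EventSource readyState values (EventSource.CONNECTING, OPEN, CLOSED).
const CONNECTING = 0;
const OPEN = 1;
const CLOSED = 2;

// Hypothetical record of what the client saw before onerror fired.
interface SseObservation {
  everOpened: boolean;       // did onopen fire at least once?
  readyStateOnError: number; // eventSource.readyState inside onerror
}

// True when the stream never reached OPEN and the browser gave up (CLOSED),
// which matches the 0 -> 2 transition reported in this issue.
function looksLikeBlockedStream(obs: SseObservation): boolean {
  return !obs.everOpened && obs.readyStateOnError === CLOSED;
}
```

A stream that did open and later dropped would usually sit in CONNECTING while the browser retries, so it would not match this check.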

Environment

  • Coder v2.30.4
  • AgentAPI via coder-labs/codex module (registry v1.2.0)
  • Corporate SSL inspection proxy: Netskope (ca.<tenant>.goskope.com)
  • Tested with both subdomain = true and subdomain = false — same result

Proposed Solution

Add a polling fallback in chat/src/components/chat-provider.tsx that uses the existing REST endpoints (GET /messages and GET /status) when the SSE connection fails.

Approach

The simplest implementation would be to detect SSE failure and automatically fall back to polling:

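// In the useEffect that sets up the EventSource:

// 1. Try SSE first (current behavior)
const eventSource = new EventSource(`${agentAPIUrl}/events`);

// 2. If SSE fails, fall back to polling
eventSource.onerror = () => {
  // Stop the browser from retrying the SSE connection.
  eventSource.close();
  console.warn("SSE connection failed, falling back to polling");

  const poll = async () => {
    try {
      const [messagesRes, statusRes] = await Promise.all([
        fetch(`${agentAPIUrl}/messages`),
        fetch(`${agentAPIUrl}/status`),
      ]);
      // fetch() resolves on HTTP error statuses too, so check them explicitly.
      if (!messagesRes.ok || !statusRes.ok) {
        throw new Error("polling request failed");
      }
      const messagesData = await messagesRes.json();
      const statusData = await statusRes.json();

      setMessages(messagesData.messages.map((m) => ({
        id: m.id, role: m.role, content: m.content,
      })));
      setServerStatus(statusData.status);
      setAgentType(statusData.agent_type || "unknown");
    } catch {
      setServerStatus("offline");
    }
  };

  // Poll once right away, then once per second.
  poll();
  pollIntervalRef.current = setInterval(poll, 1000);
};

// 3. The effect's cleanup should stop both transports
return () => {
  eventSource.close();
  if (pollIntervalRef.current) {
    clearInterval(pollIntervalRef.current);
    pollIntervalRef.current = null;
  }
};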
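One detail the snippet above leaves open is what happens when a poll takes longer than the interval. A possible variation, sketched here under the assumption that the polling logic is passed in as a function, schedules the next poll only after the previous one has finished and returns a stop function for the effect cleanup. `startPolling` and `PollOnce` are hypothetical names.

```typescript
// Hypothetical helper: runs pollOnce immediately, then again intervalMs after
// each run completes, so slow responses never overlap.
type PollOnce = () => Promise<void>;

function startPolling(pollOnce: PollOnce, intervalMs: number): () => void {
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const tick = async (): Promise<void> => {
    if (stopped) return;
    try {
      await pollOnce();
    } catch {
      // pollOnce is expected to handle its own errors; ignore anything that leaks.
    }
    // Only schedule the next run if stop() was not called in the meantime.
    if (!stopped) {
      timer = setTimeout(tick, intervalMs);
    }
  };

  void tick();

  // The returned function cancels any pending run.
  return () => {
    stopped = true;
    if (timer !== undefined) clearTimeout(timer);
  };
}
```

Inside the onerror handler this could replace the `setInterval` call, for example `stopRef.current = startPolling(poll, 1000)`, with the effect cleanup calling `stopRef.current?.()`. `stopRef` is likewise an assumed name.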

Trade-offs

  • Lost: agent_error events have no REST equivalent — these would not be available in polling mode
  • Reduced responsiveness: SSE pushes updates in ~25ms; polling at 1s intervals adds latency
  • Slightly higher server load: periodic HTTP requests vs. a single long-lived connection
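The latency and load trade-offs pull in opposite directions, and one way to soften both is to vary the interval: poll quickly while the conversation is changing and slow down while it is idle. This is only a sketch of that idea, not something the proposal requires. `nextPollDelay` and `BackoffOptions` are hypothetical names and the bounds are illustrative.

```typescript
// Hypothetical bounds for the polling interval, in milliseconds.
interface BackoffOptions {
  minMs: number;
  maxMs: number;
}

// Returns the delay before the next poll:
// - if the last poll saw new data, drop back to the fastest interval;
// - otherwise double the current delay, capped at maxMs.
function nextPollDelay(
  currentMs: number,
  changed: boolean,
  opts: BackoffOptions,
): number {
  if (changed) return opts.minMs;
  return Math.min(currentMs * 2, opts.maxMs);
}
```

With `{ minMs: 1000, maxMs: 8000 }`, an idle chat would settle at one request pair every 8 seconds and return to 1-second polling as soon as a poll observes a change.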

Alternative: configurable option

An environment variable like NEXT_PUBLIC_TRANSPORT=polling|sse|auto could also work, where auto (default) tries SSE first and falls back to polling. This would give administrators explicit control in environments where SSE is known to be broken.
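Reading such a setting could look roughly like the sketch below. NEXT_PUBLIC_TRANSPORT is the variable name proposed above, not an existing AgentAPI option, and `parseTransport` and `transportPlan` are hypothetical helpers.

```typescript
type Transport = "sse" | "polling" | "auto";

// Normalizes the raw environment value. Unset or unrecognized values
// fall back to "auto", the proposed default.
function parseTransport(raw: string | undefined): Transport {
  const value = (raw ?? "").trim().toLowerCase();
  if (value === "sse" || value === "polling" || value === "auto") {
    return value;
  }
  return "auto";
}

// Translates the mode into two decisions for the provider:
// whether to open an EventSource at all, and whether polling may be used.
function transportPlan(t: Transport): { trySse: boolean; allowPolling: boolean } {
  return { trySse: t !== "polling", allowPolling: t !== "sse" };
}
```

In the chat app this would presumably be called as `parseTransport(process.env.NEXT_PUBLIC_TRANSPORT)`. With `polling`, the provider would skip the EventSource entirely; with `sse`, it would keep the current behavior and never start polling.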

Impact

This issue makes AgentAPI chat completely unusable in corporate environments with SSL inspection proxies — a common setup in enterprise deployments. The chat input is permanently disabled, and no conversation history is shown, even though the AgentAPI backend is fully functional.
