
Token Optimization: Fitting API Traffic into Your AI Agent's Context Window

Raw HTTP traffic is verbose. A single request-response pair can consume thousands of tokens. APXY's output formats compress traffic by 60–90% while keeping the information your agent actually needs to diagnose issues.

APXY Team · 7 min read

When you paste raw HTTP traffic into a chat with an AI coding agent, you are spending context tokens on noise: verbose headers, binary payloads, internal proxy metadata, repeated boilerplate. A single captured request can easily consume 2,000–5,000 tokens before the agent sees anything useful.

At that rate, a 128K context window fills up fast. You get truncated responses, missed context, and an agent that cannot see the full picture of what is going wrong.

APXY includes three output formats specifically designed to solve this. They let you give your agent more useful information in fewer tokens — and they are available on every traffic output command.

The three formats

| Format | Token reduction | Best for |
|---|---|---|
| JSON (trimmed) | ~60% | Structured parsing, programmatic agents |
| Markdown | ~75% | Readable output in chat interfaces |
| TOON | ~90% | Large result sets, heavily constrained contexts |

JSON (trimmed)

The default JSON format applies a set of quiet optimizations before output:

  • Removes internal proxy headers that agents do not need
  • Masks sensitive values (Authorization, Cookie) to prevent leaking secrets into context
  • Handles binary bodies gracefully by replacing them with a size annotation
  • Trims large bodies to a configurable size limit
  • Omits null and empty fields
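
The masking step can be approximated with standard tools. A minimal sketch with `sed` that mirrors the behavior described above (illustrative only, not APXY's actual implementation):

```shell
# Sample captured headers, standing in for real traffic
headers='Authorization: Bearer sk-live-abc123
Content-Type: application/json
Cookie: session=xyz789'

# Redact credential header values before they reach an AI chat
printf '%s\n' "$headers" \
  | sed -E 's/^(Authorization|Cookie): .*/\1: ***masked***/'
# Authorization: ***masked***
# Content-Type: application/json
# Cookie: ***masked***
```

APXY applies this kind of redaction automatically, so the command output is safe to paste as-is.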

This is the format to use when an agent will parse the output programmatically — for example, when a LangChain agent calls a tool that returns traffic records.

apxy logs list --format json --limit 10
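
Because the output is plain JSON, agent-side tooling can post-process it with standard utilities like `jq`. A hedged sketch using made-up record fields (`method`, `url`, `status`), since the actual schema may differ:

```shell
# Sample records, standing in for: apxy logs list --format json --limit 10
records='[{"method":"GET","url":"/api/users","status":200},
          {"method":"POST","url":"/api/auth/login","status":401}]'

# Keep only the failures before handing records to the agent
printf '%s' "$records" | jq -c '.[] | select(.status >= 400)'
# {"method":"POST","url":"/api/auth/login","status":401}
```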

Markdown

Markdown format outputs traffic as structured tables and fenced code blocks. This is the most readable option for pasting into a chat interface like Cursor, Claude, or ChatGPT:

apxy logs list --format markdown --limit 10

Sample output:

| # | Method | URL                        | Status | Duration |
|---|--------|----------------------------|--------|----------|
| 1 | GET    | /api/users                 | 200    | 45ms     |
| 2 | POST   | /api/auth/login            | 401    | 120ms    |
| 3 | GET    | /api/products?category=top | 200    | 89ms     |

The agent can scan the table, spot the 401, and ask for detail on that request, without spending tokens on the full detail of the other two.

TOON

TOON (Terse One-line Output Notation) is the most aggressive compression format. Each record becomes a single pipe-delimited line:

apxy logs list --format toon --limit 20

Sample output:

1|GET /api/users|200|45ms
2|POST /api/auth/login|401|120ms
3|GET /api/products?category=top|200|89ms
4|DELETE /api/sessions/abc123|204|12ms
5|POST /api/orders|422|230ms

At this density you can fit 50–100 traffic records into the space a single raw request would occupy. TOON is most useful when you want an agent to survey a broad traffic window — "which requests failed in the last hour?" — before drilling into a specific one.
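
Because each TOON record is one pipe-delimited line, standard line tools can pre-filter the survey before it reaches the agent at all. A sketch using the sample above, with the field layout inferred from that example (index | method and path | status | duration):

```shell
# Sample TOON output (layout taken from the example above)
toon='1|GET /api/users|200|45ms
2|POST /api/auth/login|401|120ms
3|GET /api/products?category=top|200|89ms
4|DELETE /api/sessions/abc123|204|12ms
5|POST /api/orders|422|230ms'

# Field 3 is the status code; keep only 4xx/5xx records
printf '%s\n' "$toon" | awk -F'|' '$3 >= 400'
# 2|POST /api/auth/login|401|120ms
# 5|POST /api/orders|422|230ms
```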

A practical workflow

The best results come from combining formats: use TOON for the overview, Markdown for the detail.

Step 1: Get the overview in TOON

apxy logs list --format toon --limit 50

Paste the output into your agent with a prompt like: "These are the last 50 API calls my app made. Which ones look problematic?"

The agent can scan 50 records in a few hundred tokens, identify the failures, and ask for more detail.

Step 2: Drill into a specific request in Markdown

apxy logs show --id <id> --format markdown

Paste the detail into the follow-up turn. The agent now has the full request and response in a readable format without the raw HTTP noise.

Step 3: Export for reproduction

Once the agent has identified the issue, export the failing request as cURL to reproduce it:

apxy logs export-curl --id <id>

Paste the cURL command into your terminal to confirm the fix works outside your application.

Filtering before you format

Feeding an agent all traffic is wasteful. Use filters to narrow down to the requests that matter before applying format optimization:

# Only failed requests
apxy logs list --status 4xx,5xx --format toon
 
# Only calls to a specific service
apxy logs search --query "api.stripe.com" --format markdown
 
# Last 5 minutes
apxy logs list --since 5m --format json
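
Filters also compose with line tools on the way to the agent. For instance, a failure summary by status code, sketched here against sample data standing in for the first command above (TOON field layout assumed from the earlier example):

```shell
# Sample failures, standing in for: apxy logs list --status 4xx,5xx --format toon
failures='2|POST /api/auth/login|401|120ms
5|POST /api/orders|422|230ms
7|POST /api/orders|422|180ms'

# Count failures per status code so the agent gets a summary, not raw lines
printf '%s\n' "$failures" \
  | awk -F'|' '{count[$3]++} END {for (s in count) print s, count[s]}' \
  | sort
# 401 1
# 422 2
```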

On masking sensitive values

By default, APXY masks Authorization, Cookie, and other credential headers before including them in output. This means you can safely paste traffic output into an AI chat interface without exposing API keys or session tokens.

If you are working in a secure, controlled environment and need the agent to see the actual values, you can disable masking with --no-mask. Only do this in contexts where the chat history is not stored externally.

Token count estimates by format

Based on a typical REST API response (~500 byte JSON body, 10 standard headers):

| Format | Approximate tokens per record |
|---|---|
| Raw HTTP | 400–600 |
| JSON trimmed | 160–240 |
| Markdown | 100–150 |
| TOON | 20–40 |

For a context window of 128K tokens and a typical 8K system prompt, TOON lets you fit roughly 3,000 traffic records in a single context. Markdown fits around 800. Raw HTTP fits around 200.
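
The arithmetic behind those counts is easy to check with shell integer math, using the worst-case per-record costs from the table:

```shell
context=128000   # context window
system=8000      # typical system prompt
budget=$((context - system))                 # 120000 tokens left for traffic

echo "TOON:     $((budget / 40)) records"    # 3000
echo "Markdown: $((budget / 150)) records"   # 800
echo "Raw HTTP: $((budget / 600)) records"   # 200
```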

The format you choose determines how much history your agent can reason over in one turn.

For more on how AI agents can use captured traffic, see Why Your AI Coding Agent Needs Network Visibility and How to Capture HTTPS Traffic from Cursor and AI Agents.

Tags: token-optimization, ai-agents, guide, context-window, developer-tools
