# Per-inference billing for AI APIs

> Price model inference on actual token usage with the x402 v2 `upto` scheme. Polygon's mainnet and Amoy facilitators run v2 today.

A model host or inference proxy serves AI agents at variable cost: short responses cost cents, long ones cost dollars, and the cost is only known after the work is done. A flat per-call price either overcharges short calls or underprices long ones. With the x402 v2 `upto` scheme, the buyer authorizes a maximum spend per call; the server settles for the actual amount, capped at that maximum. Each settlement reflects the real tokens generated, compute time, or output size.

Because the final charge tracks measured usage, `upto` fits inference billing well, and Polygon's mainnet and Amoy facilitators both support it today.

**Who this is for:**

* Hosted-model providers and inference proxies opening an agent-buyer channel
* AI platform teams replacing token-bucket subscriptions with usage-based pricing
* Commercial leaders at AI companies who need per-call billing without overcharging

***

## How it works

<div style={{border:"1px solid #C8CFE1",borderRadius:"12px",overflow:"hidden",marginBottom:"24px"}}>
  <div style={{background:"linear-gradient(180deg,#EAE4F5 0%,#F6F3FB 100%)",borderBottom:"1px solid #D5C4F2",padding:"10px 16px",fontSize:"11px",fontWeight:"700",color:"#670DE5",letterSpacing:"0.06em",textTransform:"uppercase"}}>upto scheme flow</div>

  <div style={{borderBottom:"1px solid #EEF0F9",padding:"9px 16px",display:"flex",alignItems:"center",gap:"10px"}}>
    <span style={{color:"#929EBA",fontSize:"11px",fontWeight:"700",minWidth:"16px",textAlign:"right"}}>1</span>
    <span style={{background:"#EEF0F9",color:"#48526F",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"600",whiteSpace:"nowrap"}}>Agent</span>
    <span style={{color:"#670DE5",fontWeight:"700"}}>→</span>
    <span style={{background:"#EEF0F9",color:"#48526F",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"600",whiteSpace:"nowrap"}}>API</span>
    <span style={{fontSize:"13px",color:"#141635"}}>POST /v1/chat/completions</span>
  </div>

  <div style={{borderBottom:"1px solid #EEF0F9",padding:"9px 16px",display:"flex",alignItems:"center",gap:"10px"}}>
    <span style={{color:"#929EBA",fontSize:"11px",fontWeight:"700",minWidth:"16px",textAlign:"right"}}>2</span>
    <span style={{background:"#EEF0F9",color:"#48526F",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"600",whiteSpace:"nowrap"}}>API</span>
    <span style={{color:"#670DE5",fontWeight:"700"}}>→</span>
    <span style={{background:"#EEF0F9",color:"#48526F",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"600",whiteSpace:"nowrap"}}>Agent</span>
    <span style={{fontSize:"13px",color:"#141635"}}>402 with scheme: upto, maxAmountRequired (per-call ceiling)</span>
  </div>

  <div style={{borderBottom:"1px solid #EEF0F9",padding:"9px 16px",display:"flex",alignItems:"center",gap:"10px"}}>
    <span style={{color:"#929EBA",fontSize:"11px",fontWeight:"700",minWidth:"16px",textAlign:"right"}}>3</span>
    <span style={{background:"#EEF0F9",color:"#48526F",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"600",whiteSpace:"nowrap"}}>Agent</span>
    <span style={{color:"#670DE5",fontWeight:"700"}}>→</span>
    <span style={{background:"#EAE4F5",color:"#670DE5",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"700",whiteSpace:"nowrap"}}>Facilitator</span>
    <span style={{fontSize:"13px",color:"#141635"}}>Authorizes max via Permit2 (no funds move yet)</span>
  </div>

  <div style={{borderBottom:"1px solid #EEF0F9",padding:"9px 16px",display:"flex",alignItems:"center",gap:"10px"}}>
    <span style={{color:"#929EBA",fontSize:"11px",fontWeight:"700",minWidth:"16px",textAlign:"right"}}>4</span>
    <span style={{background:"#EAE4F5",color:"#670DE5",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"700",whiteSpace:"nowrap"}}>API</span>
    <span style={{fontSize:"13px",color:"#141635",marginLeft:"4px"}}>Runs inference, measures actual usage (tokens, time, bytes)</span>
  </div>

  <div style={{borderBottom:"1px solid #EEF0F9",padding:"9px 16px",display:"flex",alignItems:"center",gap:"10px"}}>
    <span style={{color:"#929EBA",fontSize:"11px",fontWeight:"700",minWidth:"16px",textAlign:"right"}}>5</span>
    <span style={{background:"#EEF0F9",color:"#48526F",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"600",whiteSpace:"nowrap"}}>API</span>
    <span style={{color:"#670DE5",fontWeight:"700"}}>→</span>
    <span style={{background:"#EAE4F5",color:"#670DE5",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"700",whiteSpace:"nowrap"}}>Facilitator</span>
    <span style={{fontSize:"13px",color:"#141635"}}>Settles actual amount, ≤ authorized max</span>
  </div>

  <div style={{padding:"9px 16px",display:"flex",alignItems:"center",gap:"10px"}}>
    <span style={{color:"#929EBA",fontSize:"11px",fontWeight:"700",minWidth:"16px",textAlign:"right"}}>6</span>
    <span style={{background:"#EEF0F9",color:"#48526F",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"600",whiteSpace:"nowrap"}}>API</span>
    <span style={{color:"#670DE5",fontWeight:"700"}}>→</span>
    <span style={{background:"#EEF0F9",color:"#48526F",padding:"2px 8px",borderRadius:"4px",fontSize:"11px",fontWeight:"600",whiteSpace:"nowrap"}}>Agent</span>
    <span style={{fontSize:"13px",color:"#141635"}}>200 OK with completion + PAYMENT-RESPONSE receipt</span>
  </div>
</div>

The `upto` scheme has two phases: at **verification** time, `maxAmountRequired` is the ceiling the buyer authorizes; at **settlement** time, the server passes the actual amount it computed from real usage. Replay protection comes from Permit2 nonces. Authorizations carry `validAfter` and `deadline` bounds so unsettled authorizations expire safely.
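
The two bounds described above (the time window and the spending ceiling) can be sketched as a local pre-check a server might run before calling the facilitator. This is a minimal illustration, not the x402 API: the `UptoAuthorization` shape, the `canSettle` helper, and the inclusive boundary handling are all assumptions made for this example. The facilitator and the Permit2 contract remain the source of truth.

```typescript
// Hypothetical shape for illustration only; the real x402 payload may differ.
interface UptoAuthorization {
  maxAmount: number;  // authorized ceiling, in atomic token units
  validAfter: number; // Unix seconds; unusable before this time
  deadline: number;   // Unix seconds; unusable after this time
}

type SettleCheck = { ok: true } | { ok: false; reason: string };

// Local sanity check before asking the facilitator to settle.
// Boundaries are treated as inclusive here, which is an assumption.
function canSettle(
  auth: UptoAuthorization,
  actualAmount: number,
  nowSeconds: number
): SettleCheck {
  if (nowSeconds < auth.validAfter) {
    return { ok: false, reason: "authorization not yet valid" };
  }
  if (nowSeconds > auth.deadline) {
    return { ok: false, reason: "authorization expired" };
  }
  if (actualAmount < 0) {
    return { ok: false, reason: "amount must be non-negative" };
  }
  if (actualAmount > auth.maxAmount) {
    return { ok: false, reason: "amount exceeds authorized maximum" };
  }
  return { ok: true };
}
```

A check like this only catches mistakes early; it does not replace verification by the facilitator.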

Polygon's mainnet (`x402.polygon.technology`) and Amoy (`x402-amoy.polygon.technology`) facilitators run x402 v2, so `upto` works today.

***

## Get started

Add x402 v2 middleware with the `upto` scheme in front of your inference route. The buyer authorizes a maximum spend, and your server settles the actual amount once it has measured real usage.

### Install

<Tabs>
  <Tab title="Express">
    ```bash theme={null}
    bun install @x402/express @x402/core @x402/evm express
    ```
  </Tab>

  <Tab title="Next.js">
    ```bash theme={null}
    bun install @x402/next @x402/core @x402/evm
    ```
  </Tab>

  <Tab title="Hono">
    ```bash theme={null}
    bun install @x402/hono @x402/core @x402/evm hono @hono/node-server
    ```
  </Tab>
</Tabs>

Full middleware setup, including configuration and the facilitator client, is in the [x402 Quickstart for Sellers](/payment-services/agentic-payments/x402/guides/quickstart-sellers).

### Declare a maximum per call

Return an `accepts` block with `scheme: "upto"` and a ceiling. `maxAmountRequired` is the maximum the buyer authorizes, expressed in the token's smallest unit. USDC has six decimals, so `50000` is \$0.05.

```json theme={null}
{
  "x402Version": 2,
  "accepts": [
    {
      "scheme": "upto",
      "network": "eip155:137",
      "maxAmountRequired": "50000",
      "description": "Inference on model-7b-instruct, up to 4096 output tokens"
    }
  ]
}
```
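
To avoid floating-point surprises when producing a value like `"50000"`, the dollar-to-atomic-unit conversion can be done on strings. The sketch below assumes a six-decimal token such as USDC. `toAtomicUsdc` is a hypothetical helper written for this page, not part of any x402 package.

```typescript
const USDC_DECIMALS = 6;

// Convert a decimal dollar string such as "0.05" into atomic USDC units,
// returned as a string suitable for maxAmountRequired.
function toAtomicUsdc(dollars: string): string {
  const match = /^(\d+)(?:\.(\d{1,6}))?$/.exec(dollars);
  if (!match) {
    throw new Error("expected a non-negative decimal with at most 6 fractional digits");
  }
  const whole = match[1];
  let fraction = match[2] || "";
  // Right-pad the fractional part to exactly six digits.
  while (fraction.length < USDC_DECIMALS) {
    fraction += "0";
  }
  // Drop leading zeros but keep a single "0" for a zero amount.
  return (whole + fraction).replace(/^0+(?=\d)/, "");
}

const ceiling = toAtomicUsdc("0.05"); // "50000"
```

Working on the digits directly means inputs like `"0.1"` never pass through binary floating point.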

After running inference, compute the real cost from your measured units and submit the actual amount to the facilitator's `/settle` endpoint. The actual amount must be less than or equal to `maxAmountRequired`.
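
As a rough sketch of that computation, the function below prices output tokens and caps the result at the authorized ceiling. The price constant and the `settlementAmount` helper are hypothetical values chosen for illustration. The request body for `/settle` is not shown because its shape is defined by the facilitator.

```typescript
// Hypothetical price: atomic USDC units per 1,000 output tokens ($0.01).
const PRICE_PER_1K_OUTPUT_TOKENS = 10000;

// Compute the amount to settle from measured output tokens,
// never exceeding the ceiling the buyer authorized.
function settlementAmount(outputTokens: number, maxAmountRequired: string): string {
  const ceiling = parseInt(maxAmountRequired, 10);
  if (isNaN(ceiling) || ceiling < 0) {
    throw new Error("maxAmountRequired must be a non-negative integer string");
  }
  if (outputTokens < 0 || Math.floor(outputTokens) !== outputTokens) {
    throw new Error("outputTokens must be a non-negative integer");
  }
  // Round up to a whole atomic unit, then cap at the authorized maximum.
  const cost = Math.ceil((outputTokens * PRICE_PER_1K_OUTPUT_TOKENS) / 1000);
  return String(Math.min(cost, ceiling));
}

const shortCall = settlementAmount(1200, "50000"); // "12000", i.e. $0.012
const longCall = settlementAmount(4096, "50000");  // "40960", under the ceiling
```

Capping silently means the server absorbs any overage. Under these assumed prices, limiting generation to 4096 output tokens keeps the worst-case cost below the `50000` ceiling, so the cap never triggers.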

### Test as a buyer

Use the Polygon Agent CLI's `x402-pay` command to call the paid route the way an agent buyer would:

```bash theme={null}
polygon-agent x402-pay --url https://api.example.com/v1/chat/completions \
  --method POST \
  --body '{"model":"model-7b-instruct","messages":[{"role":"user","content":"hi"}]}'
```

***

## Implementation

<CardGroup cols={2}>
  <Card title="x402 Quickstart for Sellers" icon="code" href="/payment-services/agentic-payments/x402/guides/quickstart-sellers">
    Add x402 v2 middleware to Express, Next.js, or Hono.
  </Card>

  <Card title="x402 How It Works" icon="book-open" href="/payment-services/agentic-payments/x402/guides/how-it-works">
    Verification vs. settlement phases, accepts shape, receipts.
  </Card>

  <Card title="Using the Polygon facilitator" icon="server" href="/payment-services/agentic-payments/x402/guides/using-polygon-facilitator">
    Point middleware at the v2 facilitator on mainnet or Amoy.
  </Card>

  <Card title="Polygon Agent CLI" icon="terminal" href="/payment-services/agentic-payments/cli/index">
    Test as a buyer with `x402-pay`.
  </Card>
</CardGroup>
