The numbers
- GPT-4o input: $2.50/M tokens → $1.25/M tokens
- GPT-4o output: $10/M tokens → $5/M tokens
- Realtime API: 40% reduction
- Rate limits increased across all tiers
Why now
Anthropic's Claude 3.5 Sonnet was price-competitive and often faster. Google's Gemini 1.5 Pro offered 1M context at lower cost. OpenAI needed to defend its developer mindshare before enterprise contract renewals.
Who benefits
Startups and indie developers building on GPT-4o: this is a meaningful cost reduction. Enterprises on committed spend tiers: check your contract — discounts may not stack.
The trend
Inference costs are dropping ~50% every 12 months. Budget accordingly.