I would like to propose an optional usage model to complement existing monthly/yearly limits.
The Proposal:
Allow users to opt into hourly usage caps, similar to the Claude Code subscription model.
Suggested Functionality:
- Granular Windows: Instead of a single monthly bucket, allow users to set a hard cap on usage per hour.
- Smart Budgeting: Provide an option to automatically divide the remaining plan balance by the time left in the cycle to establish a dynamic hourly limit ($X/hour).
- Token-Based Enforcement: Continue using objective token costs to track and enforce these limits in real-time.
Benefit:
This would allow for better financial stewardship of API/LLM resources, preventing accidental depletion of a monthly plan in a short burst and ensuring consistent availability.