One price covers all models.
Driven by our in-house scheduling engine: intent recognition plus L1-L5 difficulty grading automatically routes each request to the model with the best cost-performance ratio. Users don't have to worry about choosing a model.
Tier selection, routing within the pool - predictable prices, self-set monthly caps.
Fast
Lowest price. Entry-level tier, enough to cover most enterprise needs.
model: nexevo/fast

Balanced (Recommended)
Advanced tier, for users with strict quality requirements.
model: nexevo/balanced

Monthly consumption limit (you have the final say)
Pay only for what you use: the service auto-pauses when your balance is depleted (top up to resume). Even if our platform takes a loss on a request, we will not cut your service off - **the price we promise is the price you get**.
- User control: you set the upper limit yourself, and usage stops when it is reached
- Platform guarantee: any loss is our problem, and we will never force you to upgrade
- Smart fallback: when a request would run at a loss, we automatically select a cheaper model (with slightly lower quality) rather than interrupt your service
Usage
Fully compatible with the OpenAI SDK. Just put the tier name in the `model` field:
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.nexevo.ai/v1",
    api_key="sk-...",
)
response = client.chat.completions.create(
    model="nexevo/balanced",  # or "nexevo/fast"
    messages=[{"role": "user", "content": "..."}],
)
```

5 major algorithms for intelligent routing
Intent identification, difficulty grading, capability routing, cost trade-off, and circuit breaking - five core algorithms automatically select the most suitable model within your tier, saving an average of 50%+ on cost.
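To make the routing idea concrete, here is a toy sketch: grade a request L1-L5, then pick the cheapest model whose capability covers that grade, falling back to the strongest model as a circuit breaker. The grading heuristics, thresholds, and the `nexevo/pro` tier are illustrative assumptions, not Nexevo's actual algorithms:

```python
# Toy tiered router: difficulty grading + cheapest-capable selection.
# Model names, capability levels, and costs are hypothetical.

MODELS = [  # (name, max difficulty handled, relative cost)
    ("nexevo/fast", 2, 1.0),
    ("nexevo/balanced", 4, 3.0),
    ("nexevo/pro", 5, 10.0),  # hypothetical top tier
]

def grade(prompt: str) -> int:
    """Toy L1-L5 grader: longer or code-heavy prompts score higher."""
    score = 1
    if len(prompt) > 500:
        score += 1
    if "```" in prompt or "def " in prompt:
        score += 1
    if any(k in prompt.lower() for k in ("prove", "optimize", "refactor")):
        score += 2
    return min(score, 5)

def route(prompt: str) -> str:
    """Cheapest model whose capability covers the graded difficulty."""
    level = grade(prompt)
    for name, max_level, _cost in sorted(MODELS, key=lambda m: m[2]):
        if max_level >= level:
            return name
    return MODELS[-1][0]  # circuit-break to the strongest model

print(route("What's the capital of France?"))  # nexevo/fast
```

A production router would grade with a classifier model rather than keyword heuristics, but the cost-ordered "cheapest capable model" selection is the shape of the savings claim above.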