Nexevo exclusive capabilities
Use max_price to cap the worst-case cost
Set a ceiling on the per-token prices, and optionally on the total cost of a single request, so that runaway loops or untrusted inputs cannot burn through your quota.
Python
# Cap the worst-case cost of a single request before any upstream call.
# Useful when you have an unbounded loop or untrusted user input.
response = client.chat.completions.create(
    extra_body={
        "max_price": {
            "prompt": 1.0,      # $1 / M input tokens (cheap models only)
            "completion": 5.0,  # $5 / M output tokens
            "total": 0.10,      # worst-case single request <= $0.10
        },
    },
    messages=[{"role": "user", "content": "Summarize this PDF..."}],
    max_tokens=2000,  # required for `total` cap to be enforceable
)
# If no model fits under the ceiling, you get HTTP 400 instead of a runaway bill.
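A little arithmetic shows why `max_tokens` matters for the `total` cap. The sketch below is not Nexevo's actual implementation. It assumes the worst-case cost of a request is bounded by the prompt tokens times the prompt price plus `max_tokens` times the completion price, both quoted per million tokens. The helper names `worst_case_cost_usd` and `fits_total_cap` are hypothetical and exist only for illustration.

```python
def worst_case_cost_usd(
    prompt_tokens: int,
    max_tokens: int,
    prompt_price_per_m: float,
    completion_price_per_m: float,
) -> float:
    """Upper bound on one request's cost in USD (assumed formula, see lead-in)."""
    prompt_cost = prompt_tokens * prompt_price_per_m / 1_000_000
    # Without max_tokens the completion length is unbounded, so no finite
    # upper bound on the completion cost could be computed here.
    completion_cost = max_tokens * completion_price_per_m / 1_000_000
    return prompt_cost + completion_cost


def fits_total_cap(
    prompt_tokens: int,
    max_tokens: int,
    prompt_price_per_m: float,
    completion_price_per_m: float,
    total_cap_usd: float,
) -> bool:
    """Return True if the worst-case cost stays at or under the total cap."""
    cost = worst_case_cost_usd(
        prompt_tokens, max_tokens, prompt_price_per_m, completion_price_per_m
    )
    return cost <= total_cap_usd


# Prices and cap taken from the request above: $1 / M input, $5 / M output, $0.10 total.
print(fits_total_cap(50_000, 2_000, 1.0, 5.0, 0.10))   # 50k-token prompt
print(fits_total_cap(100_000, 2_000, 1.0, 5.0, 0.10))  # 100k-token prompt
```

Under these assumptions, 2,000 output tokens at $5 / M cost at most about $0.01, which leaves roughly $0.09 of the $0.10 cap for the prompt. A 50,000-token prompt fits; a 100,000-token prompt does not.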