Change log

Release history and noteworthy changes

Roadmap →

v0.13.32026-04-30Function
langchain-nexevo v0.1 — drop-in LangChain integration
pip install langchain-nexevo. ChatNexevo replaces ChatOpenAI in 1 line, gets smart routing + ELO + cascade automatically. Includes NexevoEmbeddings (RAG) and NexevoCheckpointSaver (LangGraph state persisted to /v1/conversations — multi-pod safe + admin auditable, no Postgres needed). 13/13 tests pass; PyPI publish pending.
v0.13.22026-04-30Function
OAuth sign-in (Google + GitHub) + region-aware cookie consent
Continue with Google / GitHub on /login — auto-register, auto-grant $2 email-verified milestone, skips 2FA gate (provider already step-up). Cookie consent bar shows only in GDPR / CCPA / PIPL / LGPD / nFADP / PoPIA regions; localStorage persists choice + dispatches event for 3rd-party SDKs to init.
v0.13.12026-04-30Function
Layer 3+ routing — ELO feeds catalog + history sparkline + feedback-sourced golden prompts
ELO ratings (≥30 duels) feed effective_catalog reasoning+creative dims as 'internal_elo' priority source. Per-model weekly snapshot history with inline SVG sparkline in admin /routing. New POST /admin/feedback/seed-golden-prompts samples high-quality user adoptions into golden prompt store — duel topics evolve with real traffic.
v0.13.02026-04-30safe
Full audit-fix sprint: P0 multi-process billing lock + 11 P1 hardening
Pg advisory lock for billing critical sections (workers > 1 safe). Stripe webhook persistent dedup (cross-worker). BYOK calculator now factors cache tokens. python_exec DoS caps. Per-email login lockout. DAG checkpoint status field. 4 more P2 fixes (InsufficientBalance error code, upload magic bytes, XFF reverse traversal, approve_task auto-resume). 4505 tests pass.
v0.12.52026-04-30Function
Admin BYOK + Agents monitoring; user geo-policy + invoice PDF + reconcile UI
New /admin/dashboard/byok and /agents pages. Per-key geo policy (allowed/blocked ISO countries) on user keys page. Monthly invoice PDF download in billing page. Per-tenant reconcile UI in security page.
v0.12.02026-04-28Function
Generation gateway: 8 providers all live
Image / video / 3D 生成统一接口 — OpenAI Sora 2 + Images, Google Veo 3 + Imagen 4, Tencent Hunyuan 3D direct (TC3-HMAC, -30% vs Replicate), Runway Gen-4 Turbo, Alibaba Wan 2.6, Replicate aggregator. Zero stubs left.
v0.12.12026-04-28Function
22-model generation catalog with tier expansion
Each provider gets fast/balanced/pro tiers. New: imagen-4-ultra, sora-2-pro (4K), veo-3-fast, wan-2.6-pro, runway-gen4-standard, hunyuan-3d-2-pro, flux-1.1-ultra, nexevo/{image-pro, video-pro, 3d-fast/balanced}.
v0.12.22026-04-28infrastructure
Aliyun OSS: reference image upload + GC + per-tenant quota
POST /v1/generation/upload (multipart, 10 MB cap), uploads history + delete, 200 MB / 100 files default quota per tenant, OSS GC CLI + /admin/health/oss-gc endpoint to clean orphans. Sora 2 / Veo 3 / Runway videos auto-mirrored to OSS for 24h signed URLs.
v0.12.32026-04-28Function
TS SDK v0.3 + Python SDK v0.2 — generation resources
nexevo.images.generate / videos.generate(_and_wait) / models3d.generate / generation.{models, jobs.{list,get,cancel,retry,wait_for_completion}, uploads.{upload,list,delete}}. Python sync + async parity. 64+33 SDK tests pass.
v0.12.42026-04-28Function
Playground 4-tab: Chat / Image / Video / 3D
/dashboard/playground now has tabs to test image/video/3D generation alongside chat. Lightweight inline generator, 'advanced ↗' link to full /dashboard/generate. Admin /admin/integrations now manages 8 provider creds + 7 health pings (incl. TC3-HMAC sign test).
2026-04-28Function
Catalog scaled to 92 models
Added 18 new 2026-Q2 flagships: DeepSeek-V4 Preview, Qwen3.6-Plus, Qwen3-VL-Plus, Kimi K2.6, GLM-5.1, MiniMax M2.5/M2.8, Ant Ling-2.6-Flash, Hunyuan 3.0, GPT-5.5, GPT-4.1 nano, GPT-5.2 Codex, Gemini 3.1 Pro, Claude Opus 4.6, Mistral Large 3, Phi-4.
2026-04-28Function
JSON-LD structured SEO
Organization / WebSite SearchAction / SoftwareApplication / FAQPage / TechArticle / BreadcrumbList — wired into homepage / pricing / faq / docs / cookbook. Google rich snippets enabled.
2026-04-28Function
Cascade audit trail
scoring v2 breakdown propagates into cascade.recent_decisions (maxlen 64). One-stop admin debug for routing decisions.
2026-04-28Function
New providers: Ant Group + Microsoft
Ant Ling-2.6-Flash ($0.10/M ultra-low) + Microsoft Phi-4 (14B edge-friendly). Provider total 25 -> 27.
2026-04-27Function
Smart Routing v2
80 specialty tags / 10 domain taxonomy + auto-tagging (~14 tags/model avg) + 8-signal weighted scoring + 4 hard filters + manual override + 5-tab admin UI + draft/publish weights gating. 33 new tests pass.
2026-04-27Function
Stripe top-up loop + Aliyun email
Stripe Customer/SetupIntent/off-session auto top-up + AliyunDirectMailProvider (Singapore) + one-stop /admin/integrations + health checks. 214 cases pass.
2026-04-27Function
BYOK end-to-end
Independent byok/ module, scheduler priority BYOK > settings, fixed 5% service fee. 113 cases pass.
2026-04-27Function
GET /v1/generation enumeration-safe
trace_id changed to UUID4 to prevent sequential enumeration.
2026-04-27Function
Provider data_policy 4 states
unknown/public/anonymous/private states + legal-fillable ToS URLs.
2026-04-27Function
5-tab Admin Routing UI
/admin/dashboard/routing refactored into 5 tabs: Specialty / Difficulty / Price / Algorithm / Weights & Preview. Existing capability/bandit/elo/cascade preserved under Algorithm tab.
2026-04-27Function
nexevo-auto fee unified to 5%
Original +10% smart-routing surcharge dropped, now matches other Passthrough upstream pricing.
2026-04-27Function
Catalog cleanup
Removed 11 deprecated models (deepseek-chat/reasoner, gpt-4-turbo, o1-mini, claude-3-5-haiku, claude-3-opus, gemini-1.5, grok-2, etc.) and added 10 new flagships (gpt-4.1, gpt-5, o3/o3-pro, gemini-2.5, grok-4, etc.).
2026-04-27Function
Pricing page + docs copy unified
Removed monthly-plan narrative (pure pay-as-you-go), removed +10% smart-routing surcharge, unified price examples to DeepSeek-V4 Pro / Qwen3-Max / Claude Opus 4.7 / GPT-5.
2026-04-27Function
Docs 3-tab + width alignment
Top Docs / SDK / Cookbook tabs (OpenRouter style); global max-w-7xl alignment with navbar; sidebar font enlarged & bolder; chat model desc multi-line + jump link.
2026-04-27Function
Distill v1 self-host roadmap finalized
Qwen-2.5-32B + AWQ + LoRA + RunPod A100 ($1800-2500/mo). Catalog placeholder + train_lora.py ready, awaiting data. Full SOP in docs/SELF_HOSTED_MODEL_ROADMAP.md.
2026-04-26Function
Cascade routing (P0 cost optimization)
Try cheap level first, return on confidence >= 0.7 to save tokens; heuristic signals (length/refusal/format/logprob) geo-mean; +30-50% extra token savings.
2026-04-26Function
Layer 3 ELO duel ranking
GoldenPromptStore 20 seed + EloStore K=24 + duel engine (swiss-pair + position-bias swap voting) + admin UI + weekly cron.
2026-04-26Function
Layer 2 Bandit self-learning
Thompson Sampling + cost penalty + RoutingDecisionLog + feedback hook + admin UI BanditSection; kill switch + persistence.
2026-04-26Function
Layer 1 data-driven catalog
benchmark fetcher + ChatbotArena/HF Open LLM adapter + CapabilityOverrideStore + effective_catalog merge + admin UI.
2026-04-26Function
Provider price auto-fetch + review queue
OpenRouter mirror pulls ~300 models + 6 stubs + never-auto-apply PriceProposalStore + admin UI diff table.
2026-04-26Function
Production hardening A+B sprint
Provider keys encrypted / TOTP encrypted / DistillationCollector PII redacted + PgUserStore + Alembic 0003.
2026-04-26Function
Production deployment architecture
HK ECS backend + co-located nginx+Next frontend + Aliyun RDS + Shenzhen ECS proxy_cn + Aliyun ESA. Code on GitHub bejason/nexevo-ai.
v0.10.222026-04-25Function
Team management is online
/dashboard/organization supports creating teams/inviting email members/assigning roles/transfer owner; backend 31 tests.
v0.10.212026-04-25Function
Public service status page
The /status page displays the availability of each provider in real time, including 24h availability, P50/P95 delay, and circuit breaker status.
v0.10.202026-04-25Function
Provider data policy transparency
GET /v1/providers exposes each provider's data retention policy (unknown/public/anonymous/private) and upstream ToS link.
v0.10.192026-04-25Function
Generation details endpoint
GET /v1/generation?id=<gen_id> can check detailed metadata (provider/latency/usage/cost estimate) of any request within 30 days.
v0.10.182026-04-25infrastructure
Agent deployment changed to Shenzhen, Mainland China
The agent master deployment is moved to the Alibaba Cloud Shenzhen Region (VPC intranet), and only the mainland IP adjustable model (Moonshot / Doubao / part of Zhipu) is unlocked. HK is retained as a compliance backup.
v0.10.102026-04-25Function
@nexevo/sdk TypeScript SDK v0.1.0
Official TS SDK, the complete type covers all extended fields (models[], max_price, provider, X-Nexevo-* metadata) and streaming async iterator.
v0.10.92026-04-25Function
seed/logprobs/n normalizes behavior across providers
Semantic fields that are not supported by the upstream will carry the X-Nexevo-Params-Warnings header, and customers can perceive that they are stripped or clamped.
v0.10.82026-04-25Function
Parameter compatibility matrix
Maintain a blacklist by provider, automatically stripping parameters that will cause an upstream 400 (logit_bias / logprobs/seed of some providers).
v0.10.72026-04-25Function
Limit current by end-user
OpenAI standard `user` field is transparently transmitted to the upstream + per-user current limit isolated by tenant, and the response carries X-Nexevo-RateLimit-* header.
v0.10.62026-04-24Function
max_price Single request cost limit
Request level `max_price: {prompt, completion, total}` filters overprice provider + estimates the worst cost, if the limit is exceeded, it will be 400 directly.
v0.10.52026-04-24Function
Provider routing preferences
OpenRouter style `provider: {order, allow_fallbacks, data_collection}` controls routing order and compliance options.
v0.10.42026-04-24Function
:fast / :cheap / :quality model suffix
Give routing hints through the model name suffix (such as `deepseek-chat:fast`) without reorganizing the request body.
v0.10.32026-04-24Function
Generation ID + delay header
Each response comes with X-Nexevo-Generation-Id for traceback + X-Nexevo-(Total|Upstream)-Latency-Ms for performance debugging.
v0.10.22026-04-24Repair
Streaming automatically turns on include_usage
The agent automatically injects stream_options.include_usage=true, and the client can get the token count of the last frame even if it forgets to transmit.
v0.10.12026-04-24Function
models[] client fallback chain
Pass `models: ['deepseek-chat', 'qwen-plus', 'glm-4-air']`, the agent will try one by one from left to right, and the first successful one will be returned.

Change log

langchain-nexevo v0.1 — drop-in LangChain integration

OAuth sign-in (Google + GitHub) + region-aware cookie consent

Layer 3+ routing — ELO feeds catalog + history sparkline + feedback-sourced golden prompts

Full audit-fix sprint: P0 multi-process billing lock + 11 P1 hardening

Admin BYOK + Agents monitoring; user geo-policy + invoice PDF + reconcile UI

Generation gateway: 8 providers all live

22-model generation catalog with tier expansion

Aliyun OSS: reference image upload + GC + per-tenant quota

TS SDK v0.3 + Python SDK v0.2 — generation resources

Playground 4-tab: Chat / Image / Video / 3D

Catalog scaled to 92 models

JSON-LD structured SEO

Cascade audit trail

New providers: Ant Group + Microsoft

Smart Routing v2

Stripe top-up loop + Aliyun email

BYOK end-to-end

GET /v1/generation enumeration-safe

Provider data_policy 4 states

5-tab Admin Routing UI

nexevo-auto fee unified to 5%

Catalog cleanup

Pricing page + docs copy unified

Docs 3-tab + width alignment

Distill v1 self-host roadmap finalized

Cascade routing (P0 cost optimization)

Layer 3 ELO duel ranking

Layer 2 Bandit self-learning

Layer 1 data-driven catalog

Provider price auto-fetch + review queue

Production hardening A+B sprint

Production deployment architecture

Team management is online

Public service status page

Provider data policy transparency

Generation details endpoint

Agent deployment changed to Shenzhen, Mainland China

@nexevo/sdk TypeScript SDK v0.1.0

seed/logprobs/n normalizes behavior across providers

Parameter compatibility matrix

Limit current by end-user

max_price Single request cost limit

Provider routing preferences

:fast / :cheap / :quality model suffix

Generation ID + delay header

Streaming automatically turns on include_usage

models[] client fallback chain