Features:
- Newly supported models:
  - gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat
  - Beta support for various models from the Llama, Mistral, Gemma, Qwen, Teuken, and InternVL families
- Users with access to multiple user groups can now switch between groups in the frontend
- Added notifications for chats scheduled for automatic deletion, along with a new modal for manually deleting them or extending their retention
- [ADMIN] Configurable retention period for automatic chat deletion per tenant
- [ADMIN] New report: “Hourly Activity Heatmap”
- [ADMIN] Tenants can now publish info banners visible to all users within the tenant
- [ADMIN] Automatic email notifications when a user group reaches 80%, 90%, and 100% of its hard consumption limit
- [ADMIN] For each deployment, an optional data processing region (Germany, EU, Worldwide) can be selected; this information is also displayed in the frontend
Bugfixes and Improvements:
- Content pages are now displayed correctly across all tenants
- Optimized print layout for chat history
- [ADMIN] Help menu entries and deployments are now sortable via drag & drop
- [ADMIN] In the deployment modal, model-type-specific parameters are only displayed after a model has been selected
- [ADMIN] “Reasoning Effort Level” is now a mandatory field, and the maximum token limit can no longer be set to 0
- [ADMIN] Model selection in the deployment modal is now sorted alphabetically
- [ADMIN] For supported models, a price can now be defined for cached input tokens
- [ADMIN] Deployments can now be duplicated
- [ADMIN] Deployments are automatically set to inactive if their endpoint is deleted
- [ADMIN] Improved axis labeling in reports