Release Notes
v1.4.0
Features and Enhancements
Improved retrieval quality
The bundled knowledge base has been re-embedded with a stronger model, substantially improving answer relevance — including cross-language retrieval (for example, asking a question in Chinese against English documentation). Existing v1.3.x deployments are migrated automatically during the upgrade — see the Upgrade guide.
Hybrid keyword and semantic retrieval
Answers are now retrieved using a combination of semantic similarity and keyword matching, then fused into a single ranking. This recovers exact-keyword matches that pure semantic search misses — such as CRD names, command flags, and error strings — without giving up the ability to answer natural-language questions.
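The notes don't say which fusion method is used; reciprocal rank fusion (RRF) is one common way to merge a keyword ranking with a semantic ranking into a single list. The sketch below is illustrative only — the document IDs and the damping constant `k` are assumptions, not Hyperflux internals:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in; k=60
    damps the influence of any single list's top positions.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results: keyword search finds the exact CRD name,
# semantic search finds conceptually related pages.
keyword_hits = ["crd-reference", "install-guide"]
semantic_hits = ["install-guide", "concepts", "crd-reference"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

A document that ranks well in both lists (here, install-guide) rises to the top, while exact-keyword-only hits still survive in the fused ranking.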
Document-level retrieval
Each document is now indexed at the document level in addition to its individual sections. Questions about a document's overall topic now find the right document even when the user's wording doesn't overlap with any specific paragraph.
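As a rough illustration of the idea (the actual indexing scheme isn't documented here), a document-level entry can be stored alongside the per-section entries — below, hypothetically, built from the title plus the opening sentence of each section:

```python
def build_index_entries(doc_id, title, sections):
    """Index a document at two granularities: per section, plus one
    whole-document overview entry. Purely a sketch of the concept."""
    # One entry per section, as before.
    entries = [{"doc": doc_id, "level": "section", "text": s} for s in sections]
    # One document-level entry: title plus each section's first sentence
    # (an assumed recipe for a topic-level summary).
    overview = title + ". " + " ".join(s.split(". ")[0] for s in sections)
    entries.append({"doc": doc_id, "level": "document", "text": overview})
    return entries

entries = build_index_entries(
    "upgrade-guide", "Upgrading Hyperflux",
    ["Back up the database. Then stop the service.", "Run the installer."])
```

A topic-level question ("what does the upgrade guide cover?") can then match the document-level entry even when it matches no single paragraph.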
Built-in knowledge base preset selection
The installation page now offers a choice between two presets for the bundled knowledge base. The default works well for most queries; the larger preset can give slightly better recall on questions that span long-form documents.
Bundled knowledge base refresh during upgrade
Upgrading from v1.3.x to v1.4.x replaces the bundled product knowledge base with the new GTE-embedded dump and migrates per-document metadata automatically. No manual database export, import, or restart step is required — see the Upgrade guide. Custom and BYO knowledge bases are not touched by this step (see Breaking Changes and Known Limitations below).
Conversation history compression
Long conversations are automatically summarised so they fit within the LLM's context window without losing earlier turns. This reduces token cost on extended agent-mode sessions and lets users continue an investigation across many messages without quality degradation.
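A minimal sketch of the approach, assuming a summariser callable (in the product this would be an LLM call; the function, roles, and cutoff below are illustrative):

```python
def compress_history(turns, summarize, keep_recent=4):
    """Replace all but the last `keep_recent` turns with one summary turn.

    turns: list of (role, text) pairs.  summarize: callable mapping the
    older turns to a short text -- stubbed here, an LLM call in practice.
    """
    if len(turns) <= keep_recent:
        return list(turns)
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize(older)
    return [("system", f"Summary of earlier conversation: {summary}")] + recent

turns = [("user", f"question {i}") for i in range(6)]
compressed = compress_history(
    turns, lambda older: f"{len(older)} earlier questions")
```

The recent turns stay verbatim so the model keeps exact wording where it matters most, while the summary preserves earlier context at a fraction of the token cost.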
Agent Mode and MCP tool loading can now be configured independently
Agent Mode (multi-step reasoning) and ACP MCP tool loading are now controlled by separate install switches. You can turn on Agent Mode without loading the ACP MCP tools — for example, when the LLM should plan over the bundled knowledge base alone — and turn the tools back on when live cluster operations are required.
Build a Custom Knowledge Base
A new guide walks administrators through ingesting internal Git repositories into the Hyperflux knowledge base, so Hyperflux can answer questions grounded in private runbooks, design documents, or other internal references. See Build a Custom Knowledge Base.
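As a rough illustration of the ingestion step (the guide's actual tooling and chunking rules may differ), the sketch below walks a checked-out repository and splits its Markdown files into paragraph-based chunks ready for embedding; the function name and chunk size are assumptions:

```python
from pathlib import Path

def collect_markdown_chunks(repo_dir, max_chars=1200):
    """Walk a checked-out Git repository and split each Markdown file
    into paragraph-aligned chunks, each tagged with its source path."""
    chunks = []
    for path in sorted(Path(repo_dir).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        buf = ""
        for para in text.split("\n\n"):
            # Flush the buffer before it grows past the chunk limit.
            if buf and len(buf) + len(para) > max_chars:
                chunks.append({"source": str(path), "text": buf.strip()})
                buf = ""
            buf += para + "\n\n"
        if buf.strip():
            chunks.append({"source": str(path), "text": buf.strip()})
    return chunks
```

Keeping the source path on every chunk lets answers cite the originating runbook or design document.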
Improvements
- Reduced token cost on agent-mode answers when tools return long output (such as logs or large resource definitions).
- Improved storage performance on the bundled knowledge-base database.
Breaking Changes
- Custom knowledge bases built under v1.3.x must be rebuilt before upgrading. The new embedding model is incompatible with the previous one, so existing custom corpus data will not be usable until rebuilt. See the Build a Custom Knowledge Base guide for the new procedure.
Known Limitations
- Documents you uploaded via the BYO Knowledge tool are preserved unchanged across the v1.4.0 upgrade — only the bundled product knowledge base is refreshed.
v1.3.1
Features and Enhancements
BYO Knowledge Tool
The BYO Knowledge tool allows enterprises to import private knowledge and use it as a dedicated, searchable knowledge source during question answering. This helps teams provide responses based on internal documents, operational knowledge, and organization-specific context.
Multi-Cluster Support
Multi-cluster support enables users to access information from multiple clusters by cluster name, expanding question-answering capabilities across cluster boundaries. This makes it easier to query and compare resources in different cluster environments.
Token Quota Limits
Token quota limits allow request frequency and token usage to be restricted by user. This helps administrators control costs, manage quotas, and prevent excessive consumption of model resources.
History
History support enables users to review previous conversations and question-answering results. This makes it easier to trace context, continue earlier investigations, and troubleshoot issues based on past interactions.
Improvements
- Optimized the RAG (LangChain) retrieval and reranking pipeline to significantly improve answer accuracy and relevance.
- Upgraded the core AI framework to LangChain 1.0 to stay compatible with the latest features and optimizations.
- Added routine system check prompts and performed comprehensive code polishing and unit test linting.
- Separated databases for system knowledge base, user knowledge base, and chat history to improve data isolation and performance.
- Redesigned the Smart Doc interaction page for a more intuitive and efficient user experience.
- Upgraded the MCP server, adding support for OAuth authentication and writable tool configurations.
- Enhanced file upload integration for a smoother knowledge ingestion process.
- Added support for IDs in custom elements and resolved related data redundancy issues.
- Implemented a Redis-based rate limiter to enhance system stability and manage API traffic.
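The notes don't detail the limiter's algorithm; a fixed-window counter is one common shape. In the sketch below an in-memory dict stands in for Redis so the example is self-contained — in production the counter would be a Redis INCR with an EXPIRE on each window key:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter sketch (parameters are illustrative).

    Each (user, window index) pair gets a counter; a request is allowed
    while the counter stays within the limit for the current window.
    """

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable for testing
        self.counts = {}    # (user, window index) -> request count

    def allow(self, user):
        key = (user, int(self.clock() // self.window))
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit
```

Because the window key changes every `window_seconds`, stale counters simply stop being read (Redis would expire them); each user is throttled independently.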
Bug Fixes
- Fixed an issue where model downloading could fail and improved environment variable configuration for embedding models.
- Resolved data processing errors occurring during the merging and unpacking of update values.
- Fixed a bug that caused redundant data prefixes in custom elements.
- Resolved occasional service call failures to improve overall system reliability.
v1.2.1
NOTE: Agent mode is an experimental feature; use it with caution.
Bug Fixes
- Fixed an issue where setting the knowledge database name might not take effect. The fix adds an option to set the database dump file name during installation and automatically uses the specified dump file to initialize the knowledge base.
- Fixed an issue where MCP tools could create or delete Kubernetes resources without human confirmation in Agent mode.
- Fixed an issue where the server could get stuck when asked for disk space information in Agent mode.
- Fixed an issue where default node taints were not handled when deploying on ACP 4.2 or later.
- Fixed a deployment error: kubeVersion: >=1.20.0 which is incompatible with Kubernetes v1.33.7-1.
- Fixed an issue where the API keys for the LLM service and rerank service appeared in plain text during deployment.
Improvements
- Improved the system prompt so Hyperflux identifies itself correctly.
- Removed unused configuration items from the installation page.
v1.2.0
Features and Enhancements
- Use a RAG chain by default to answer user questions, improving answer accuracy.
- Support importing a database dump to initialize the knowledge base, simplifying the setup process.
- Experimental: Support enabling Agent mode to leverage MCP tools to retrieve real-time cluster information.
- Support connecting to PGVector database deployed outside the Alauda Hyperflux installation.
- Support Cohere Reranker model to improve answer relevance.
- Support setting RAG chain parameters such as total_search_k.
Known Issues
- When the LLM returns an error, answer generation may fail. Returning to view the chat history re-sends the question to the LLM, causing duplicated conversations.