
Enhanced query observability for greater insight and control

Vectara's new query observability feature provides detailed insights into query history, configurations, and system behavior, empowering users to optimize search performance and build trust in AI systems.


When using any software product, two key questions often arise:

  1. Is it delivering the expected value?
  2. Are users getting a seamless experience?

To answer these questions, we’ve introduced new functionality in our web console and API called Query Observability. It provides new insights so that you can better control the search experience in Vectara. You can now review a history of all past queries within the corpora you manage, including the query results and diagnostic details that inform and improve future query configurations.

For each historical query, you can inspect its configuration settings (like the prompt template value or reranker), how the system interpreted and acted on these settings, and the resulting outputs, including both responses and any errors.

Query configuration options

Vectara provides a broad range of configuration settings to tailor search queries and generative AI features to any use case. These options include:

  1. Lambda – Adjusts the balance between neural-based and keyword-based search behavior
  2. Reranking – Reorders results based on relevance using one or more rerankers
  3. Custom Dimensions – Applies additional vectors of your choice to help control relevancy scores
  4. Result Context – Configures how much contextual text surrounds each matched result and is passed to the LLM, giving it the context of the retrieved information
  5. LLM Selection – Allows the choice of LLM for generating responses
  6. Prompt Template – Provides granular control over how the generative LLM responds, so you can customize Vectara for your specific use case
  7. Response Language – Sets the response language for generated answers
  8. Summarization – Defines the number of results sent to the LLM to summarize
  9. Factual Consistency – Evaluates the consistency of the summarized information relative to the retrieved facts and provides a score as a safeguard against hallucinations

These customization options empower you and your users to shape the Vectara search experience to each use case.
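
To make these options concrete, here is a minimal sketch of a single query request that sets several of them at once. The field names follow our reading of Vectara’s v2 REST query endpoint (/v2/corpora/{corpus_key}/query), and the API key and corpus key are placeholders; check the current API reference for the exact names before relying on this.

```python
import requests

API_KEY = "zqt_..."          # placeholder API key
CORPUS_KEY = "my-corpus"     # placeholder corpus key

body = {
    "query": "How do I rotate my API keys?",
    "search": {
        "limit": 10,
        # Lambda: balance between neural and keyword (lexical) matching
        "lexical_interpolation": 0.025,
        # Result context: sentences returned around each match and passed to the LLM
        "context_configuration": {"sentences_before": 2, "sentences_after": 2},
        # Reranking: reorder results, here with the MMR (diversity) reranker
        "reranker": {"type": "mmr", "diversity_bias": 0.3},
    },
    "generation": {
        # Summarization: how many results the LLM sees
        "max_used_search_results": 5,
        # Response language for the generated answer
        "response_language": "eng",
        # Factual consistency score as a safeguard against hallucinations
        "enable_factual_consistency_score": True,
    },
    # Persist this query so it appears in Query history (off by default for the API)
    "save_history": True,
}

resp = requests.post(
    f"https://api.vectara.io/v2/corpora/{CORPUS_KEY}/query",
    headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
    json=body,
    timeout=30,
)
resp.raise_for_status()
print(resp.json().get("summary"))  # generated answer, plus search results in the full response
```

Every value in this request is exactly the kind of configuration detail that Query Observability records alongside the results, so you can later see which combination of settings produced a given answer.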

Observability: A new level of insight

The AI industry is evolving at a rapid pace, which makes it difficult to keep track of new innovations or to understand how many of these AI systems work. Vectara is investing in Query Observability because we understand that, even with the intentionality behind creating a responsible RAG platform, AI is still very new and organizations still have concerns about the trustworthiness of these systems. We believe that only with transparency, and a mindful approach to showing you each step in an AI transaction, can we start to build stronger trust in these systems.

Query Observability provides unprecedented insight into how our RAG platform works. It logs every query submitted to a corpus and also adds broader visibility into the generative AI capabilities, such as summarization. This detailed configuration and performance data, along with diagnostic insights, empowers you to quickly spot potential issues, identify why specific queries may have underperformed, and, most importantly, see every step in the transaction to form a better understanding of the system.

Query history

You can view a corpus’ query history using the new API endpoint or within the web console on the Corpus > Query history page.

Within the console, this feature is enabled by default and controlled by a toggle on the Query tab. When enabled, Vectara stores each query entered in the console so that you can view it on the Query history tab. To stop storing your query history, disable the same toggle. In contrast, this feature is disabled by default when using the Vectara API: you must explicitly set save_history to true to start saving the history of queries made through the API.
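
Once save_history is set to true on your API queries (as in the earlier sketch), you can read the stored history back over the API. The listing endpoint path (/v2/queries), its query parameters, and the response field names below are assumptions for illustration; verify them against the API reference.

```python
import requests

API_KEY = "zqt_..."       # placeholder API key
CORPUS_KEY = "my-corpus"  # placeholder corpus key

# Fetch recent query history for one corpus (endpoint path and parameters assumed).
resp = requests.get(
    "https://api.vectara.io/v2/queries",
    headers={"x-api-key": API_KEY},
    params={"corpus_key": CORPUS_KEY, "limit": 10},
    timeout=30,
)
resp.raise_for_status()

for entry in resp.json().get("queries", []):  # response field names assumed
    # Each entry should carry the original query text plus the configuration
    # and diagnostic details described above.
    print(entry.get("query"), entry.get("started_at"))
```

Each returned entry should correspond to a row on the Corpus > Query history page in the console, so you can work with the same records in whichever interface fits your workflow.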

Query Observability isn’t just about optimizing individual queries; it also provides oversight of the corpora you manage and brings transparency to how AI systems work. With this visibility, you can refine your query configurations and ensure that end users get the best possible search experience.

Conclusion

Query Observability adds visibility into your query history and insight into how the system works, giving you confidence in your understanding and helping to build trust in AI systems. With it, you can assess past configurations, troubleshoot issues, and continuously optimize queries to deliver a valuable search experience. By helping you see what works and why, this feature enhances both the effectiveness of your corpus management and the overall user experience.

As always, we’d love to hear your feedback! Connect with us on our forums, on our Discord, or in our community. If you’d like to see what Vectara can offer you for retrieval-augmented generation on your application or website, sign up for an account!
