Announcing Vectara’s Python SDK

Introduction

At Vectara, we’re committed to giving developers the best tools for building GenAI applications, such as AI assistants and agents, on top of our end-to-end RAG platform.

In June 2024, we launched Vectara’s APIv2, offering a simpler, more RESTful experience and significantly reducing boilerplate code. Since then, we’ve continued to evolve our platform, adding features like UDF reranking, chain reranking, query observability, and the all-new table data understanding.

But we know that ease of use is key to broader adoption. That’s why we’re thrilled to share our new Python SDK (in beta), which provides a native, Pythonic interface to Vectara’s APIs. Developed in partnership with Fern (YC W23), this SDK is always up-to-date, as it’s automatically generated from our OpenAPI spec. It lets you interface with all our endpoints more intuitively, reducing complexity and speeding up development.

Python SDK overview

The Vectara Python SDK mirrors every operation in our API. This means that any capability you have via the REST API, you can now perform using Python function calls. By using the SDK, you remove the need to manually handle authentication tokens, construct HTTP requests, and parse responses. It’s all handled for you, letting you focus on building features instead of boilerplate.

Key Advantages:

Simplicity & Readability: No more dealing directly with raw HTTP requests.
Feature Parity: The SDK covers all API endpoints, so you won’t be missing any functionality.
Familiar Python Patterns: The client is structured with classes and methods that map directly to API domains (e.g., upload, search, chat, documents).

Installing the SDK

You can install the SDK from PyPI:
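As a minimal sketch, assuming the SDK is published on PyPI under the package name `vectara` (so the install command would be `pip install vectara`), the snippet below checks whether the package is importable in the current environment. The helper name `sdk_installed` is hypothetical and not part of the SDK.

```python
import importlib.util


def sdk_installed(package: str = "vectara") -> bool:
    """Return True if `package` can be imported in this environment."""
    return importlib.util.find_spec(package) is not None


if not sdk_installed():
    # Assumed install command; run it in your shell or virtual environment.
    print("SDK not found. Install it with: pip install vectara")
```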

Authentication

Once installed, you can authenticate using either an API key or OAuth credentials (client ID and client secret). Both methods are supported and easy to set up.

1. Authenticate with API Key

When using an API key, all requests made through the client are automatically authenticated with that key. Note that you can use a query-only API key, a query+indexing API key, or your personal API key. As you might expect, some SDK operations will not work with keys that have weaker permissions; for example, you cannot upload a document or index data with a query-only API key.
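Here is a sketch of API-key authentication, assuming the client class is exposed as `vectara.Vectara` and accepts an `api_key` keyword argument. The environment variable name `VECTARA_API_KEY` and both helper functions are illustrative choices, not SDK conventions.

```python
import os


def api_key_from_env(var_name: str = "VECTARA_API_KEY") -> str:
    """Read the API key from the environment instead of hard-coding it."""
    api_key = os.environ.get(var_name, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {var_name} before creating the client")
    return api_key


def make_client():
    """Create a client whose requests are authenticated with the API key."""
    # Imported lazily so this module loads even without the SDK installed.
    from vectara import Vectara  # assumed top-level client class

    return Vectara(api_key=api_key_from_env())  # assumed keyword argument
```

Keeping the key in an environment variable keeps it out of source control, which matters most for personal API keys with broad permissions.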

2. Authenticate with OAuth

Under the hood, the SDK handles obtaining and refreshing OAuth tokens, so you don’t have to worry about token management.
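The OAuth variant might look like the sketch below, assuming the same `Vectara` class accepts `client_id` and `client_secret` keyword arguments. The environment variable names and the helper functions are hypothetical.

```python
import os

REQUIRED_VARS = ("VECTARA_CLIENT_ID", "VECTARA_CLIENT_SECRET")  # illustrative names


def oauth_kwargs(env=None) -> dict:
    """Collect OAuth client credentials as keyword arguments for the client."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing OAuth settings: " + ", ".join(missing))
    return {
        "client_id": env["VECTARA_CLIENT_ID"],
        "client_secret": env["VECTARA_CLIENT_SECRET"],
    }


def make_oauth_client():
    """Create a client that obtains and refreshes OAuth tokens on its own."""
    from vectara import Vectara  # assumed top-level client class

    return Vectara(**oauth_kwargs())  # assumed keyword arguments
```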

Using the SDK

Now that you have a client object, let’s see how to run common operations with it.

1. Uploading Files

A common use case is to upload documents for Vectara to process and index. The Vectara platform handles various file types (PDF, DOCX, PPTX, Markdown, and more), automatically extracting the text and chunking it into semantically meaningful segments. These segments are then encoded as vectors using Vectara’s Boomerang embedding model.

Uploading a file is straightforward:
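A minimal sketch of a file upload, assuming the client exposes `client.upload.file(corpus_key, file=..., filename=...)` and that `file` accepts raw bytes. The wrapper name `upload_file` and the corpus key you pass in are your own.

```python
from pathlib import Path


def upload_file(client, corpus_key: str, path: str):
    """Upload one file to a corpus and return the SDK's response object."""
    file_path = Path(path)
    content = file_path.read_bytes()
    # Assumed call shape; check the SDK reference for the exact signature.
    return client.upload.file(corpus_key, file=content, filename=file_path.name)
```

With a real client this would be called as `upload_file(client, "my-corpus", "report.pdf")`, where both values are placeholders.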

You can also include metadata that will be associated with your document in the index:
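Here is a sketch of an upload with metadata, assuming `client.upload.file` also takes a `metadata` keyword holding a plain dictionary. The JSON check is an extra safeguard added here for illustration, and the metadata keys in the comment are made up.

```python
import json
from pathlib import Path


def upload_with_metadata(client, corpus_key: str, path: str, metadata: dict):
    """Upload a file together with document-level metadata."""
    # Fail early if the metadata cannot be represented as JSON.
    json.dumps(metadata)
    file_path = Path(path)
    return client.upload.file(
        corpus_key,
        file=file_path.read_bytes(),
        filename=file_path.name,
        metadata=metadata,  # e.g. {"department": "legal", "year": 2024}
    )
```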

Once uploaded, the file is processed and its content becomes available for querying shortly thereafter.

2. Indexing text

Not all data comes in file form. Sometimes your content might be generated dynamically, sourced from a database, or scraped from a website. The SDK’s indexing functionality allows you to submit text directly to Vectara.

To do this, you’ll create a StructuredDocument containing the document sections you want to index. Each section can represent a chunk of text, optionally with associated metadata.

This will index the text directly. Each section is processed and made available for semantic search.
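The flow above could be sketched as follows, assuming the SDK exports `StructuredDocument` and a section class named `StructuredDocumentSection`, and that documents are created with `client.documents.create(corpus_key, request=...)`. The field names (`id`, `type`, `sections`, `title`, `text`, `metadata`) are assumptions, and `build_structured_payload` is a hypothetical helper.

```python
def build_structured_payload(doc_id, sections, metadata=None):
    """Turn plain dicts into the fields a structured document is assumed to carry."""
    built = []
    for number, section in enumerate(sections, start=1):
        item = {"id": number, "text": section["text"]}
        if section.get("title"):
            item["title"] = section["title"]
        if section.get("metadata"):
            item["metadata"] = section["metadata"]
        built.append(item)
    return {
        "id": doc_id,
        "type": "structured",
        "metadata": metadata or {},
        "sections": built,
    }


def index_text(client, corpus_key, doc_id, sections, metadata=None):
    """Index text sections directly, without going through a file upload."""
    # Assumed class names; imported lazily so the helper above works without the SDK.
    from vectara import StructuredDocument, StructuredDocumentSection

    payload = build_structured_payload(doc_id, sections, metadata)
    document = StructuredDocument(
        id=payload["id"],
        type=payload["type"],
        metadata=payload["metadata"],
        sections=[StructuredDocumentSection(**s) for s in payload["sections"]],
    )
    return client.documents.create(corpus_key, request=document)  # assumed call
```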

Note that the SDK also supports core indexing, which provides fine-grained control over the chunking process, using the CoreDocument object instead of StructuredDocument.

3. Executing queries

Once you’ve indexed documents (either through file uploads or direct text indexing), you can run queries to activate the full RAG pipeline.
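A query might look like the sketch below. It assumes the client has a `query(query=..., search=..., generation=...)` method, that the response carries `summary` and `search_results` (each result with `score` and `text`), and that the SDK exports `SearchCorporaParameters`, `KeyedSearchCorpus`, and `GenerationParameters`. The function names `ask` and `default_params` are hypothetical.

```python
def ask(client, question, search, generation=None):
    """Run one RAG query; return (summary, [(score, text), ...])."""
    response = client.query(query=question, search=search, generation=generation)
    hits = [(r.score, r.text) for r in (response.search_results or [])]
    return response.summary, hits


def default_params(corpus_key):
    """Build search and generation settings for a single corpus."""
    # Assumed class and field names; check the SDK reference before relying on them.
    from vectara import GenerationParameters, KeyedSearchCorpus, SearchCorporaParameters

    search = SearchCorporaParameters(corpora=[KeyedSearchCorpus(corpus_key=corpus_key)])
    generation = GenerationParameters(enable_factual_consistency_score=True)
    return search, generation
```

Splitting parameter construction from the call keeps the search and generation settings reusable across queries and chat sessions.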

4. Multi-turn chats

Vectara’s chat capability provides an integrated memory for chat history that makes it super easy to implement AI assistants and agents with multi-turn conversation sessions.

With the SDK, this takes the following form. First, you create a chat session:

Then you can use the chat function to ask a question:

And then follow up with another question:
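Putting the chat steps together, here is a sketch that assumes the client offers `create_chat_session(search=..., generation=..., chat_config=...)`, that the returned session has a `chat(query=...)` method, and that each response exposes an `answer` attribute. The wrapper `run_conversation` is hypothetical, and `search` and `generation` are the same kind of parameter objects used for queries.

```python
def run_conversation(client, search, generation, questions, chat_config=None):
    """Open one chat session and ask each question in order; return the answers."""
    # Assumed method name and keywords; the session keeps the chat history.
    session = client.create_chat_session(
        search=search, generation=generation, chat_config=chat_config
    )
    answers = []
    for question in questions:
        response = session.chat(query=question)  # follow-ups reuse the same session
        answers.append(response.answer)
    return answers
```

Because every turn goes through the same session object, a follow-up question can rely on context from earlier turns without resending the history.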

What else is in the SDK?

We have shown specific code examples of using the SDK that cover the basic use of Vectara: indexing documents, running a query, and using chat.

The SDK covers a lot more, and in fact mirrors the full functionality of our API, including:

List existing corpora, create a corpus, delete a corpus, and perform many other corpus-related administrative tasks.
List documents in a corpus and retrieve indexed documents.
Understand the options in your account: list available LLMs, generative presets, encoders, rerankers, and jobs.
Manage users in your account: add a user, delete a user, or list all users.
Manage API keys and OAuth clients.
Understand query history in your account.
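As one example of these administrative calls, the sketch below lists corpus keys. It assumes `client.corpora.list(limit=...)` returns an iterable of corpus objects that each have a `key` attribute; the helper name and the `page_size` parameter are illustrative.

```python
def list_corpus_keys(client, page_size: int = 10):
    """Return the keys of the corpora visible to this client."""
    # Assumed call; `limit` is taken to be the number of corpora fetched per page.
    return [corpus.key for corpus in client.corpora.list(limit=page_size)]
```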

Importantly, all query and chat functionality supports streaming. Simply replace query() with query_stream(), for example:
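A streaming sketch, assuming `query_stream()` takes the same arguments as `query()` and yields event objects with a `type` field, where events of type `"generation_chunk"` carry the text in a `generation_chunk` attribute. The generator name `stream_answer` is hypothetical.

```python
def stream_answer(client, question, search, generation=None):
    """Yield pieces of the generated answer as they arrive."""
    events = client.query_stream(query=question, search=search, generation=generation)
    for event in events:
        # Other event types (for example search results) are skipped here.
        if getattr(event, "type", None) == "generation_chunk":
            yield event.generation_chunk
```

A caller could print the answer incrementally with `for piece in stream_answer(client, question, search): print(piece, end="", flush=True)`.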

Conclusion

The Vectara Python SDK provides a clean, intuitive interface to all of Vectara’s powerful features. From uploading files and indexing custom text to running sophisticated queries, starting multi-turn chat sessions, and managing your documents, the SDK streamlines the entire process.

This blog post offered a quick tour of the core functionality, and the full code is available in this IPython notebook. Of course, there is far more you can do with the SDK. We encourage you to explore the complete SDK reference for details on advanced parameters, observability features like query tracing, and reranking capabilities.

We hope this SDK helps you build richer, more interactive GenAI applications faster and with less code.

How to get started? Sign up for Vectara’s 30-day free trial and start coding. If you are already using Vectara with the API, please give the SDK a try and let us know what improvements you would like to see (as an issue on the SDK repository or on our Discord server).

Happy coding!