Index and search files
Creating an indexed file
To index a file for semantic search, pass the index: true
parameter when creating the file.
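As a rough illustration, the request body for such a call could be assembled as below. This is only a sketch: the `key` and `contentType` field names and the `build_create_file_request` helper are hypothetical, and only the `index` flag comes from this page. Check the API reference for the actual request shape.

```python
import json


def build_create_file_request(key: str, content_type: str, index: bool = True) -> str:
    """Return a JSON body for a hypothetical create-file request."""
    body = {
        "key": key,                   # hypothetical field name for the file identifier
        "contentType": content_type,  # hypothetical field name for the MIME type
        "index": index,               # from this page: index: true enables indexing
    }
    return json.dumps(body)


# Example: request indexing for a Markdown file.
print(build_create_file_request("docs/faq.md", "text/markdown"))
```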
Supported file formats
The following file formats are supported for indexing:
Format | File Extension | MIME Type |
---|---|---|
PDF | .pdf | application/pdf |
HTML | .html | text/html |
Text | .txt | text/plain |
Markdown | .md | text/markdown |
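Before uploading, it can help to check client-side whether a file is of an indexable type. The following is a minimal sketch that mirrors the table above; the `indexable_mime_type` helper is hypothetical and not part of the API.

```python
from pathlib import Path
from typing import Optional

# Extension-to-MIME mapping for the formats listed as indexable.
SUPPORTED_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".html": "text/html",
    ".txt": "text/plain",
    ".md": "text/markdown",
}


def indexable_mime_type(filename: str) -> Optional[str]:
    """Return the MIME type if the file extension is indexable, else None."""
    return SUPPORTED_MIME_TYPES.get(Path(filename).suffix.lower())


for name in ("guide.md", "report.PDF", "slides.docx"):
    print(name, indexable_mime_type(name))
```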
Notes on indexing
- The file initially has a status of “indexing_pending” and is indexed asynchronously. How long indexing takes depends on the file size and the current load on the system.
- You can check the status of the file by calling the Get File endpoint and checking that the file.status property has changed to “indexing_completed”.
- If indexing fails, the status is set to “indexing_failed” and the reason for the failure is available in the failedStatusReason property of the file.
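Because indexing is asynchronous, a client typically polls until the file reaches a final status. The sketch below assumes a caller-supplied `get_file` function that calls the Get File endpoint and returns the file as a dictionary; that function, the polling interval, and the timeout are illustrative assumptions, while the status values and `failedStatusReason` come from the notes above.

```python
import time
from typing import Callable, Dict


def wait_until_indexed(
    get_file: Callable[[], Dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll get_file() until indexing completes, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        file = get_file()
        status = file.get("status")
        if status == "indexing_completed":
            return file
        if status == "indexing_failed":
            raise RuntimeError(f"Indexing failed: {file.get('failedStatusReason')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"File still has status {status!r} after {timeout_s}s")
        sleep(interval_s)


# Usage with a stub in place of a real Get File call.
responses = iter([{"status": "indexing_pending"}, {"status": "indexing_completed"}])
file = wait_until_indexed(lambda: next(responses), sleep=lambda s: None)
print(file["status"])
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.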
Searching files
To run a semantic search on your bot’s files, use the Search Files API endpoint. This is particularly useful for Retrieval Augmented Generation (RAG) implementations.
You can check the API reference for more details on the properties available for each passage object returned in the response.
Using the search results for RAG
You can use the search results to provide relevant information to a chatbot for generating a response to a user’s question.
A simple RAG implementation passes the search results provided by our API, together with the user’s question, to a large language model such as ChatGPT (through the OpenAI API), which then generates the answer.
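A minimal sketch of that flow in Python follows. It makes several assumptions: `search_files` stands in for a call to the Search Files endpoint, `complete` stands in for a call to a language model, and each passage is assumed to expose its text under a `content` key. None of these names are confirmed by this page, so adapt them to the actual API reference.

```python
from typing import Callable, Dict, List


def build_rag_prompt(question: str, passages: List[Dict]) -> str:
    """Combine retrieved passages and the user's question into one prompt."""
    # "content" is an assumed passage field name.
    context = "\n\n".join(
        f"[{i}] {p['content']}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def answer_question(
    question: str,
    search_files: Callable[[str], List[Dict]],
    complete: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve passages for the question, then ask the model to answer."""
    passages = search_files(question)[:top_k]
    return complete(build_rag_prompt(question, passages))


# Usage with stubs in place of the real search and model calls.
fake_search = lambda q: [{"content": "Refunds are processed within 5 days."}]
fake_model = lambda prompt: f"(model reply to a {len(prompt)}-character prompt)"
print(answer_question("How long do refunds take?", fake_search, fake_model))
```

Keeping the search and model calls as injected functions lets you swap in real clients later without changing the prompt-building logic.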