In the context of Large Language Models, embeddings are numerical vectors that represent text in a high-dimensional space. They capture semantic relationships, so that words and passages with similar meanings lie close together, and they serve as a foundation for understanding and generating text. Currently, Texti supports only "text-embedding-ada-002", but many domain-specific open-source models will soon be available.

Similarity Measure:

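The similarity measure defines how the model gauges the closeness between the embeddings of different pieces of text. Currently, Texti supports only "Cosine" as a similarity measure. Cosine similarity compares the direction of two embedding vectors and ignores their length: a score near 1 means the texts are semantically close, while a score near 0 means they are unrelated.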
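As an illustration of what a cosine measure computes, here is a minimal sketch in plain Python. The function name `cosine_similarity` and the toy vectors are assumptions made for this example; this is not Texti's internal implementation, and real embeddings have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors (hypothetical values, for illustration only).
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # approximately 1.0
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])                # 0.0
```

Because the score depends only on direction, two texts of very different lengths can still score as highly similar if their embeddings point the same way.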

Chunk Size (Tokens):

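This refers to the number of tokens (words or subwords) processed together as a single block or 'chunk' when documents are converted into embeddings. Each chunk is embedded as one vector, so smaller chunks allow more precise matching, while larger chunks keep more surrounding context together.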
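The idea of a fixed chunk size can be sketched as follows. This is a simplified illustration, not Texti's actual chunking logic: the helper `chunk_tokens` is hypothetical, and whitespace splitting stands in for a real subword tokenizer, which would usually produce more tokens for the same text.

```python
def chunk_tokens(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


# Assumption: whitespace splitting approximates tokenization for this example.
text = "Embeddings map text to vectors in a high-dimensional space"
tokens = text.split()             # 9 whitespace-separated tokens
chunks = chunk_tokens(tokens, 4)  # 3 chunks of 4, 4, and 1 tokens

for chunk in chunks:
    print(" ".join(chunk))
```

Each resulting chunk would then be sent to the embedding model separately; the last chunk may be shorter than the configured size.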
