|
139 | 139 | "source": [ |
140 | 140 | "### Step 3: Create an Index for your Documents\n", |
141 | 141 | "\n", |
142 | | - "First connect to your OCI search cluster." |
| 142 | + "First, connect to your OCI search cluster. We can use the ``opensearchpy`` library to open the connection." |
143 | 143 | ] |
144 | 144 | }, |
145 | 145 | { |
|
164 | 164 | "cell_type": "markdown", |
165 | 165 | "metadata": {}, |
166 | 166 | "source": [ |
167 | | - "First, you must create a k-NN index and set the index.knn parameter to true. This settings tells the plugin to generate native library indexes specifically tailored for k-NN searches. \n", |
| 167 | + "First, you must create a k-NN index and set the ``index.knn`` parameter to true. This setting tells the plugin to generate native library indexes specifically tailored for k-NN searches. \n", |
168 | 168 | "\n", |
169 | | - "Next, you must add one or more fields of the knn_vector data type. This example creates an index with one knn_vector and one text. The knn_vector uses lucene fields.\n", |
| 169 | + "Next, you must add one or more fields of the ``knn_vector`` data type. This example creates an index with one ``knn_vector`` field, ``embedding_vector``, and one ``text`` field, ``text``. \n", |
170 | 170 | "\n", |
171 | | - "See [documentation](https://opensearch.org/docs/2.7/search-plugins/knn/knn-index#method-definitions) for more details on parameters' definitions.\n", |
| 171 | + "The ``knn_vector`` field uses the Lucene engine, and its method definition specifies the configuration of the k-NN search algorithm. It employs the Hierarchical Navigable Small Worlds ([HNSW](https://www.pinecone.io/learn/series/faiss/hnsw/)) algorithm for fast approximate nearest-neighbor search with high recall, and cosine similarity to measure distance. \n", |
| 172 | + "\n", |
| 173 | + "- ``efSearch`` controls how many entry points are explored between layers during the search. A higher value typically results in a more thorough and potentially higher-quality search, but at increased computational cost. \n", |
| 174 | + "\n", |
| 175 | + "- ``efConstruction`` controls how many entry points are explored when building the index. A higher value typically results in a higher-quality graph structure, but may also increase the computational cost of building the index.\n", |
| 176 | + "\n", |
| 177 | + "The ``dimension`` field defines the size of the embedding vector. In our case, we are using embedding vectors returned by the genAI embedding model, which have a length of 1024. \n", |
| 178 | + "\n", |
| 179 | + "See the [documentation](https://opensearch.org/docs/2.8/search-plugins/knn/knn-index#method-definitions) for more details on the parameter definitions.\n", |
172 | 180 | "\n", |
173 | 181 | "**Note**: The Lucene engine supports vector dimensions up to 1,024." |
174 | 182 | ] |
|
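The index described in the cell above can be sketched as a plain Python dictionary. This is a minimal illustration, not the notebook's actual code cell: the helper name `build_knn_index_body` and the values chosen for `ef_construction` and `m` are assumptions for the example. Only the field names `embedding_vector` and `text`, the Lucene engine, HNSW, cosine similarity, and the dimension of 1024 come from the text. A search-time `ef_search` value is left out, since where it is configured depends on the engine and OpenSearch version.

```python
import json


def build_knn_index_body(dimension=1024, ef_construction=128, m=16):
    """Return an index body with one knn_vector field and one text field.

    The defaults for ef_construction and m are illustrative assumptions,
    not values taken from the notebook.
    """
    return {
        # index.knn = true enables k-NN indexing for this index
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding_vector": {
                    "type": "knn_vector",
                    # Must match the length of the embedding vectors
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "cosinesimil",
                        "parameters": {
                            "ef_construction": ef_construction,
                            "m": m,
                        },
                    },
                },
                "text": {"type": "text"},
            }
        },
    }


index_body = build_knn_index_body()
print(json.dumps(index_body, indent=2))
```

Assuming an `opensearchpy.OpenSearch` client named `client` is already connected, a body like this could be passed to `client.indices.create(index="my-index", body=index_body)`, where `my-index` is a hypothetical index name.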