rvo committed · Commit 9c7beed · verified · 1 Parent(s): f934061

Upload README.md

Files changed (1):
  1. README.md (+9 −12)
README.md CHANGED
````diff
@@ -119,7 +119,10 @@ for i, query in enumerate(queries):
 See full example notebook [here](https://huggingface.co/MongoDB/mdbr-leaf-ir/blob/main/transformers_example.ipynb).
 
 ## Asymmetric Retrieval Setup
-
+
+> [!Note]
+> **Note**: a version of this asymmetric setup, conveniently packaged into a single model, is [available here](https://huggingface.co/MongoDB/mdbr-leaf-ir-asym).
+
 `mdbr-leaf-ir` is *aligned* to [`snowflake-arctic-embed-m-v1.5`](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5), the model it has been distilled from. This enables flexible architectures in which, for example, documents are encoded using the larger model, while queries can be encoded faster and more efficiently with the compact `leaf` model:
 ```python
 # Use mdbr-leaf-ir for query encoding (real-time, low latency)
````
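For readers skimming the diff: the paragraph in this hunk describes an asymmetric deployment. A minimal sketch of that setup, assuming the standard `sentence-transformers` API used elsewhere in this README; the sample texts and variable names here are illustrative, not part of the commit:

```python
from sentence_transformers import SentenceTransformer

# Compact model for real-time query encoding; larger, aligned model for documents.
query_model = SentenceTransformer("MongoDB/mdbr-leaf-ir")
doc_model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")

queries = ["What is matryoshka representation learning?"]
documents = ["MRL trains embeddings whose leading dimensions form usable sub-embeddings."]

# Queries go through the small model, documents through the large one; because
# the two models are aligned, their embedding spaces are directly comparable.
query_embeds = query_model.encode(queries, prompt_name="query")
doc_embeds = doc_model.encode(documents)

similarities = query_model.similarity(query_embeds, doc_embeds)
print(similarities)
```

In practice the document side runs offline at indexing time, so the larger model's cost is paid once per document while queries keep the small model's low latency.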
````diff
@@ -139,25 +142,19 @@ Retrieval results in asymmetric mode are often superior to the [standard mode ab
 
 Embeddings have been trained via [MRL](https://arxiv.org/abs/2205.13147) and can be truncated for more efficient storage:
 ```python
-from torch.nn import functional as F
-
-query_embeds = model.encode(queries, prompt_name="query", convert_to_tensor=True)
-doc_embeds = model.encode(documents, convert_to_tensor=True)
-
-# Truncate and normalize according to MRL
-query_embeds = F.normalize(query_embeds[:, :256], dim=-1)
-doc_embeds = F.normalize(doc_embeds[:, :256], dim=-1)
+query_embeds = model.encode(queries, prompt_name="query", truncate_dim=256)
+doc_embeds = model.encode(documents, truncate_dim=256)
 
 similarities = model.similarity(query_embeds, doc_embeds)
 
 print('After MRL:')
 print(f"* Embeddings dimension: {query_embeds.shape[1]}")
-print(f"* Similarities:\n\t{similarities}")
+print(f"* Similarities: \n\t{similarities}")
 
 # After MRL:
 # * Embeddings dimension: 256
 # * Similarities:
-# tensor([[0.7136, 0.4989],
+# tensor([[0.7136, 0.4989],
 #         [0.4567, 0.6022]])
 ```
 
````
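The code change in this hunk swaps the manual MRL recipe (slice to 256 dimensions, then L2-normalize) for `encode(..., truncate_dim=256)`. A quick equivalence check, sketched under the assumption that `model` and `queries` are defined as earlier in the README; both sides are normalized explicitly, since whether `encode` re-normalizes after truncation depends on the model's configuration:

```python
import torch
from torch.nn import functional as F

# Manual MRL path, as in the removed lines: encode at full width, slice, re-normalize.
full = model.encode(queries, prompt_name="query", convert_to_tensor=True)
manual = F.normalize(full[:, :256], dim=-1)

# New path from this commit: let encode() truncate to 256 dimensions.
short = model.encode(queries, prompt_name="query", truncate_dim=256, convert_to_tensor=True)
short = F.normalize(short, dim=-1)  # normalize explicitly for a fair comparison

print(torch.allclose(manual, short, atol=1e-5))  # expected: True
```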
````diff
@@ -185,7 +182,7 @@ similarities = query_embeds.astype(int) @ doc_embeds.astype(int).T
 
 print('After quantization:')
 print(f"* Embeddings type: {query_embeds.dtype}")
-print(f"* Similarities:\n{similarities}")
+print(f"* Similarities: \n{similarities}")
 
 # After quantization:
 # * Embeddings type: int8
````
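The context line in this hunk's header computes similarities on int8 embeddings with a widening cast. For orientation, a generic scalar-quantization sketch showing how such int8 embeddings can be produced; this is an illustration, not necessarily the exact scheme the README uses, and it assumes float `query_embeds` / `doc_embeds` NumPy arrays from an earlier encoding step:

```python
import numpy as np

def quantize_int8(embeds: np.ndarray, lo: np.ndarray, hi: np.ndarray) -> np.ndarray:
    # Symmetric per-dimension scaling onto [-127, 127].
    scale = 127.0 / (np.maximum(np.abs(lo), np.abs(hi)) + 1e-12)
    return np.clip(np.round(embeds * scale), -127, 127).astype(np.int8)

# Calibration ranges would normally come from a held-out sample of embeddings.
lo, hi = doc_embeds.min(axis=0), doc_embeds.max(axis=0)
q_query = quantize_int8(query_embeds, lo, hi)
q_doc = quantize_int8(doc_embeds, lo, hi)

# Widen before the dot product to avoid int8 overflow, mirroring the
# README's `.astype(int)` cast in the hunk header above.
similarities = q_query.astype(np.int32) @ q_doc.astype(np.int32).T
print(similarities)
```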
 