Using MeMemo for vector searching in the browser

April 09, 2025 | 9 minutes
TL;DR: Unlock more accurate and contextually relevant search with MeMemo's in-browser vector search, which returns results based on linguistic similarity to improve UX and support AI workflows.

While you can use basic keyword searching to implement a search function on a website, vector searches allow for more accurate searching based on linguistic similarities between words. The MeMemo library brings this functionality to the browser so that you can add vector search to a website to support general search functionality or AI workflows.

Vector search makes it easier to search for similar text across a body of content by searching numeric representations of the content. The searchable content and search query are converted from semantic text to numbers called vectors. These vectors can contain multiple dimensions that represent the linguistic meaning of the text. Then, basic math operations can be used to compare them.

Similar vectors often correspond to related words, such as synonyms. While we could group synonyms by hand using the semantic text itself, those groupings would be based more on "vibes"; deriving mathematical relationships between vectors is more accurate.
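As a toy illustration of the "basic math" involved (plain JavaScript, not part of MeMemo), cosine similarity compares the directions of two vectors. The three-dimensional vectors and the words attached to them below are made up for the example; real embeddings have hundreds of dimensions, but the arithmetic is the same.

```js
// Cosine similarity: dot product divided by the product of magnitudes.
// Returns a value near 1 for vectors pointing in similar directions.
function cosineSimilarity(a, b) {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Hypothetical vectors: "cat" and "kitten" point in similar
// directions, "car" does not.
const cat = [0.9, 0.8, 0.1];
const kitten = [0.85, 0.75, 0.2];
const car = [0.1, 0.2, 0.9];

console.log(cosineSimilarity(cat, kitten)); // close to 1
console.log(cosineSimilarity(cat, car)); // much lower
```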

The MeMemo library allows us to perform vector searches on the server or in the browser using the HNSW approximate nearest neighbor search technique. Along with the search results, it returns the distance between the vector for the search query and the vectors representing each result. Here's an example:

```js
{
  keys: [
    'Manage JavaScript Environment Variables with dotenv',
    'Using the Spread Operator with Error Objects',
    'Interact with Bluetooth Devices using the Web Bluetooth API',
    'Automatically Start Node.js Applications When Your Raspberry Pi Boots Up',
    'Using MeMemo for vector searching in the browser',
    "Creating a React Hook for Chrome's window.ai model",
    'Adding Shortcuts to Progressive Web Apps',
    'Styling the HTML Color Input',
    'package.json Fields for Publishing Libraries',
    'The :is() and :where() CSS Pseudo-classes'
  ],
  distances: [
    0.68803, 0.695923,
    0.745732, 0.752662,
    0.756145, 0.783056,
    0.805127, 0.807467,
    0.848459, 0.881811
  ]
}
```

The distances represent how similar each result is to the search query. In these examples, the distances fall between 0 and 1; the closer the value is to 0, the closer the match between the search query and the result.

How does vector search support AI workflows?

Vector search is often used to support retrieval-augmented generation (RAG) in AI workflows. You could use MeMemo, for example, to store and search vector embeddings for your content and then pass its results to an LLM as context in a prompt so that the LLM can use it in its response. Note that with this approach, you need to pass the actual matching text from MeMemo to the LLM; you can't directly pass the matching embeddings to save space within the context window. LLMs expect text input and will not be able to decode the embeddings.
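A minimal sketch of that RAG step, in plain JavaScript: the `results` object mirrors the shape MeMemo returns, while `articles` (a key-to-full-text lookup) and the 0.8 distance cutoff are assumptions for the example. Note how the prompt is built from the retrieved text, never from the embeddings themselves.

```js
// Build an LLM prompt from vector search results by swapping each
// matching key for its full text -- the LLM needs text, not vectors.
function buildPrompt(question, results, articles, maxDistance = 0.8) {
  const context = results.keys
    .filter((_, i) => results.distances[i] <= maxDistance)
    .map((key) => articles[key])
    .join("\n---\n");
  return `Answer the question using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

// Hypothetical data for the example.
const results = {
  keys: ["Post A", "Post B"],
  distances: [0.3, 0.9],
};
const articles = {
  "Post A": "Full text of Post A...",
  "Post B": "Full text of Post B...",
};

// Only "Post A" makes it into the context; "Post B" is too distant.
console.log(buildPrompt("What is vector search?", results, articles));
```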

Implementing vector search in a Next.js website using MeMemo

MeMemo doesn't handle creating vector embeddings, but it needs them to perform its search functionality. So, the first step is to convert your content to vector embeddings. I used the @xenova/transformers library to convert text content to vectors.

Creating a Transformers.js pipeline for feature extraction

To start, the PipelineSingleton class below lazily creates a Transformers.js pipeline for feature extraction. Feature extraction is the process of identifying and extracting relevant features from text data. We use this process to reduce the complexity of the data while retaining as much relevant information as possible, making our searches more efficient. Feature extraction is how we create vectors from our content.

The class uses the Xenova/all-MiniLM-L6-v2 model to perform the extraction.

By wrapping the PipelineSingleton class in a function, we can attach an instance of the class to the global object to reuse the same pipeline between hot reloads in development mode.

```js
import { pipeline, env } from "@xenova/transformers";

// Skip local model check
env.allowLocalModels = false;

// Use the Singleton pattern to enable lazy construction of the pipeline.
// NOTE: We wrap the class in a function to prevent code duplication (see below).
const P = () =>
  class PipelineSingleton {
    static task = "feature-extraction";
    static model = "Xenova/all-MiniLM-L6-v2";
    static instance = null;

    static async getInstance(progress_callback = null) {
      if (this.instance === null) {
        this.instance = pipeline(this.task, this.model, { progress_callback });
      }
      return this.instance;
    }
  };

let PipelineSingleton;
if (process.env.NODE_ENV !== "production") {
  // When running in development mode, attach the pipeline to the
  // global object so that it's preserved between hot reloads.
  // For more information, see https://vercel.com/guides/nextjs-prisma-postgres
  if (!global.PipelineSingleton) {
    global.PipelineSingleton = P();
  }
  PipelineSingleton = global.PipelineSingleton;
} else {
  PipelineSingleton = P();
}
export default PipelineSingleton;
```

Creating embeddings for the searchable content

With the PipelineSingleton class in place, we can create the embeddings. First, get an instance of the class.

```js
// Get the classification pipeline. When called for the first time,
// this will load the pipeline and cache it for future use.
const featureExtractor = await PipelineSingleton.getInstance();
```

The featureExtractor variable will be an asynchronous function that takes two parameters:

  • An array of strings containing your content
  • An object with the properties:
    • pooling - represents the mode for generating fixed-sized sentence embeddings from variable-sized sentences
    • normalize - represents whether to normalize the embeddings in the last dimension

By passing { pooling: "mean", normalize: true }, the vectors returned by the featureExtractor function will be a compact representation of our content.

```js
const embeddings = await featureExtractor(
  inputs.map((input) => `${input.title} ${input.description}`),
  {
    pooling: "mean",
    normalize: true,
  },
);
```

After running this function, embeddings should be an object that looks like the following:

```js
{
  dims: [25, 384], // dimensions of the vectors (25 inputs, 384 values each)
  type: 'float32',
  data: Float32Array(9600) [-0.0031234234, ...], // a large array of numbers representing the vectors
  size: 9600
}
```

Indexing the vector embeddings with MeMemo

The next step in implementing our vector search is to index the embeddings with MeMemo. First, create a new instance of the HNSW search class provided by MeMemo.

The HNSW class provides an asynchronous bulkInsert method to index multiple embeddings at once. It accepts two parameters: keys, a 1D array, and values, a 2D array. The length of the keys array must match the length of the outer values array. To get the embeddings in the correct format for the bulkInsert method, call the tolist() method on the embeddings object.

```js
import { HNSW } from "mememo";

// Creating a new index for MeMemo search
const index = new HNSW({ distanceFunction: "cosine" });

// Create an array of keys with one entry per vector
const keys = data.map((d) => d.title);

// Calling the `tolist()` method gives you the 2D array that contains the
// vectors for the inputs
const values = embeddings.tolist();

// Insert the keys and their vectors into the index
// The vectors must be a 2D array
await index.bulkInsert(keys, values);
```

Creating an embedding for the search query

Since our vector search will be searching vectors and not text, we also need to convert the search query to a vector embedding. For this, we can use the same process we used to create the vector embeddings for the content.

```js
const searchQuery = "javascript";

const queryEmbedding = await featureExtractor([searchQuery], {
  pooling: "mean",
  normalize: true,
});
```

In addition to the bulkInsert method, the HNSW search class from MeMemo provides an asynchronous query function that we'll use to perform a vector search on the embeddings we previously inserted into our MeMemo instance.

The query function takes two parameters:

  • A 1D array for the vector embedding that represents the search query (using tolist() returns a 2D array which can be flattened to a 1D array with the flat() method)
  • A number representing the number of results (i.e., nearest neighbors) to return

```js
// Number of nearest neighbors to return
const k = 10;

const results = await index.query(queryEmbedding.tolist().flat(), k);
```

The results variable should look similar to the example below. For the best matches, you can further filter the list of results using the distances array to include only the results close in similarity to the search query. The closer the distance is to 0, the more similar the results are to the search query.

```js
{
  keys: [
    'Mock window.matchMedia in Vitest',
    'Using MeMemo for vector searching in the browser',
    "Using Turborepo's --affected flag in CI",
    'Programming the Adafruit HalloWing M4',
    'Compare React App Performance Snapshots with Reactime',
    'Watering Plants with a Raspberry Pi',
    'The :is() and :where() CSS Pseudo-classes',
    'Automatically Start Node.js Applications When Your Raspberry Pi Boots Up',
    'Manage JavaScript Environment Variables with dotenv',
    'Using the Spread Operator with Error Objects'
  ],
  distances: [
    0.476673, 0.768659,
    0.826878, 0.865016,
    0.930229, 0.941275,
    0.949908, 0.956553,
    0.966251, 0.976808
  ]
}
```
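That filtering step can be sketched in plain JavaScript. The 0.8 threshold below is an arbitrary assumption; tune it against the distances you actually see for your content.

```js
// Keep only the result keys whose distance to the query falls at or
// below a chosen threshold.
function filterByDistance(results, maxDistance = 0.8) {
  return results.keys.filter((_, i) => results.distances[i] <= maxDistance);
}

// Hypothetical results in the shape MeMemo returns.
const results = {
  keys: ["Mock window.matchMedia in Vitest", "Using MeMemo for vector searching in the browser", "Programming the Adafruit HalloWing M4"],
  distances: [0.476673, 0.768659, 0.865016],
};

// The third result's distance exceeds 0.8, so only two keys survive.
console.log(filterByDistance(results));
```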

Troubleshooting: NaN in the distances array

If you have NaN values in the distances array, check the following to make sure that the vectors are valid and you're calling MeMemo correctly:

Verify that all of your content vectors are valid

Before calling the index.bulkInsert() method, log the following to the console and verify the results.

```js
// `embeddings` is a Tensor, so convert it to a plain 2D array first
const vectors = embeddings.tolist();
console.log(vectors.length); // should match the number of inputs used to create the embeddings
console.log(vectors[0].length); // should be 384
console.log(vectors.every((v) => v.every((n) => typeof n === "number" && !Number.isNaN(n)))); // should be true
```

Verify that the query vector is valid

After creating the query, log the following to the console and verify the results.

```js
// `queryEmbedding` is a Tensor, so convert it to a flat 1D array first
const queryVector = queryEmbedding.tolist().flat();
console.log(queryVector.length); // should be 384
console.log(queryVector.every((n) => typeof n === "number" && !Number.isNaN(n))); // should be true
```

Make sure you're inserting the values and querying MeMemo correctly

After making any adjustments in the steps above, rerun the code to see if you now get numerical values in the distances array. If you're still getting NaN values, make sure that you're passing arrays to MeMemo with the correct level of nesting. For example, MeMemo will accept any array in the index.query() method, but will silently fail if that array isn't 1D.
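Since MeMemo won't throw on the wrong nesting level, a small guard run before inserting or querying can catch these mistakes early. This is a sketch with hypothetical helper names, assuming your values should be a 2D array of numbers and your query vector a 1D array.

```js
// True for a flat array of finite numbers (a valid query vector).
function is1DNumberArray(arr) {
  return Array.isArray(arr) && arr.every((n) => typeof n === "number" && !Number.isNaN(n));
}

// True for an array of valid 1D number arrays (valid bulkInsert values).
function is2DNumberArray(arr) {
  return Array.isArray(arr) && arr.every(is1DNumberArray);
}

console.log(is1DNumberArray([0.1, 0.2])); // true
console.log(is1DNumberArray([[0.1, 0.2]])); // false -- one level too deep for query()
console.log(is1DNumberArray([0.1, NaN])); // false -- NaN values will poison the distances
console.log(is2DNumberArray([[0.1], [0.2]])); // true
```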

Complete Code Example

Here's the complete code for creating the vector embeddings, indexing the vectors with MeMemo, and using MeMemo to complete a vector search.

```js
import { HNSW } from "mememo";
// PipelineSingleton is the default export of the module created earlier

// Get the classification pipeline. When called for the first time,
// this will load the pipeline and cache it for future use.
const featureExtractor = await PipelineSingleton.getInstance();

const embeddings = await featureExtractor(
  inputs.map((input) => `${input.title} ${input.description}`),
  {
    pooling: "mean",
    normalize: true,
  },
);

// Creating a new index for MeMemo search
const index = new HNSW({ distanceFunction: "cosine" });

// Create an array of keys with one entry per vector
const keys = data.map((d) => d.title);

// Calling the `tolist()` method gives you the 2D array that contains the
// vectors for the inputs
const values = embeddings.tolist();

// Insert the keys and their vectors into the index
// The vectors must be a 2D array
await index.bulkInsert(keys, values);

const searchQuery = "javascript";

const queryEmbedding = await featureExtractor([searchQuery], {
  pooling: "mean",
  normalize: true,
});

// Number of nearest neighbors to return
const k = 10;

const results = await index.query(queryEmbedding.tolist().flat(), k);
```