Creating a React Hook for Chrome's window.ai model

August 22, 2024 | 8 minutes
TL;DR: You can use Chrome Dev or Chrome Canary to try the new built-in window.ai model before it's widely released. Expect breaking changes, though.

Using AI models on the web is still a relatively young space with new options popping up seemingly daily. The two main approaches seem to be accessing a model through an API or requiring the user to configure the model(s) used with a website on their own through tools like window.ai. While these approaches work, they often introduce costs (which can be unpredictable) and require users to have a certain level of tech savviness and knowledge about how AI models work.

Google is attempting to solve these problems by offering the Gemini model for free to everyone using the Chrome browser. Developers will be able to call a new API on the window object in the browser to interact with Gemini Nano in websites without requiring users to set anything up. As an added bonus, it also works offline since the model is running locally through the browser. Right now, this feature is referred to as window.ai, not to be confused with the open source project window.ai.

While this feature isn't widely available yet, you can try it in Chrome's Dev/Canary releases. Let's look at how the Chrome window.ai API currently works and create a React hook to interact with Gemini Nano running on-device through the browser.

First, Set Up Chrome Dev or Chrome Canary

If you already have either Chrome Dev or Chrome Canary installed on your computer, you can skip this section.

Otherwise, choose a version of Chrome to install: Chrome Dev or Chrome Canary. Both versions of the browser include cutting-edge features before they're officially released, but Chrome Canary is updated nightly whereas Chrome Dev is updated weekly. Both versions support the window.ai API.

Next, follow the steps below to enable the settings needed to interact with the window.ai API.

  1. Go to chrome://flags and set the following:
  • #optimization-guide-on-device-model should be set to "Enabled BypassPerfRequirement"
  • #prompt-api-for-gemini-nano should be set to "Enabled"
  2. Go to chrome://components and search for the "Optimization Guide On Device Model" component.
  • In my experience, this didn't show up right away. After making the changes in step 1, you may have to relaunch Chrome multiple times before the component appears.
  • After the component appears, click the "Check for updates" button to download the latest version of the component. The download might take a few minutes to complete.

That's it! Now, the Gemini Nano model is available through the window.ai API.

Start a Session with the Model

The model is accessed through window.ai and currently exposes two methods: create and capabilities. This post focuses on the create method because we're going to create a session to interact with the model; the capabilities method returns details about what the model can do. As with any feature still in early development, expect breaking changes.

To start a session with the Gemini Nano model, call await window.ai.assistant.create(). Note that this is an asynchronous operation, so it needs to be awaited. This will return a session that can be used to chat with the model.

To send the model a message, call await session.prompt([INSERT CHAT TEXT]) on the session returned by create(). This sends the model the input text and returns a response. It's also an asynchronous request, so it needs to be awaited. In my experience, the model generally responds quickly, but it can be slow depending on how long its response is. Response speed is also likely affected by your computer's specs, since the model is running locally.
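Put together, the flow is just two awaited calls. Here's a minimal sketch; askModel is a hypothetical helper (not part of the API) that takes the ai object as a parameter so it can also be exercised against a stub outside Chrome:

```js
// Hypothetical helper wrapping the current two-step flow:
// create a session, then prompt it with text.
// `ai` is expected to be window.ai in Chrome Dev/Canary; passing it in
// as a parameter keeps the helper testable outside the browser.
async function askModel(ai, text) {
  const session = await ai.assistant.create();
  const response = await session.prompt(text);
  return response;
}

// In the browser:
// const answer = await askModel(window.ai, "What is Gemini Nano?");
```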

While the API is pretty simple right now, it could change in the future, so it would be nice to have a convenient, reusable React hook that provides a single place to initialize the model and reference it from different parts of our application. Let's create a React hook that we can use to initialize the model.

Create a React Hook to Initialize the Model

As a starting point, let's name our hook useWindowAIModel and create an empty function that looks like this:

```js
export const useWindowAIModel = () => {};
```

In our hook, we'll want a stable way to reference the model, and a ref seems like the best approach. Based on the current API, the model won't change during the lifecycle of our application's other components, so we don't need React state for it. We're also going to assume this hook will be used in a chat app where the model doesn't need to be available immediately: it will take the user a few seconds to type a question, which gives the hook time to initialize and return the model. So we'll initialize the model once when the hook mounts and skip re-rendering the calling component for it. If your app has different requirements, though, modifying this to store the model in React state should work fine.

```js
export const useWindowAIModel = () => {
  const model = useRef(window.ai);
};
```

We will, however, want to use React state to store the session with the model since creating the session will happen asynchronously.

```js
export const useWindowAIModel = () => {
  const model = useRef(window.ai);
  const [modelSession, setModelSession] = useState();
};
```

Next, we need a useEffect hook that runs when the hook mounts. We only want to create a new session with the model once. Creating multiple sessions will cause the model to lose the context from any messages that the user previously entered. Within the useEffect hook, we'll make an asynchronous request to the model to create a session. Once the session is ready, we'll update React state, which will re-render the component that called the hook, making the model available to the UI.

It's possible that the hook may be used in a browser that doesn't support the new window.ai API. For now, the best indicator that I could find for this is that the model isn't available, but more robust feature detection will likely be added in the future. If the model isn't available, the hook can throw an error that the website can display to users so that they understand why part of the UI is unavailable.
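That check can be sketched as a small helper. isWindowAIAvailable is a hypothetical function based on the API's current shape, not an official detection method:

```js
// Hypothetical availability check based on the API's current shape.
// More robust feature detection will likely exist once the API stabilizes.
function isWindowAIAvailable(win) {
  return typeof win?.ai?.assistant?.create === "function";
}

// In the browser: isWindowAIAvailable(window)
```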

When the hook unmounts, we should also destroy the model session so that we don't end up with orphaned sessions if the hook later re-mounts.

At the end of the hook, we return the model session so that the UI can use it to allow users to interact with the model.

Here's the full code:

```js
import { useEffect, useRef, useState } from "react";

export const useWindowAIModel = () => {
  const model = useRef(window.ai);
  const [modelSession, setModelSession] = useState();

  useEffect(() => {
    if (!model.current) {
      throw new Error(
        "The window.ai model is not available. Your browser may not support Chrome's window.ai API."
      );
    }

    // Keep a local reference so the cleanup below doesn't close over
    // stale state from the initial render
    let aiSession;

    const initializeSession = async () => {
      aiSession = await model.current.assistant.create();
      setModelSession(aiSession);
    };

    initializeSession();

    // Destroy the session (if it exists) when the hook unmounts
    return () => {
      aiSession?.destroy();
    };
  }, []);

  return modelSession;
};
```

Using the Hook

The hook returns a session with the model that we can now pass text to using the prompt method on the session. Here's a scaled-down component that interacts with the model using the hook:

```jsx
export const ChatInput = () => {
  // Get the model session from the hook
  const modelSession = useWindowAIModel();

  const handleInput = async (event) => {
    event.preventDefault();

    // Prompt the model session with the user's input and get its response
    const modelResponse = await modelSession.prompt(
      // The user input from the textarea
      event.currentTarget.chatbox.value
    );

    // Do something here with the model's response
  };

  return (
    <form onSubmit={handleInput}>
      <label htmlFor="chatbox">Chatbox:</label>
      <textarea id="chatbox" name="chatbox" />
      <button type="submit">Submit</button>
    </form>
  );
};
```

window.ai and TypeScript

Given that this API is so new and hasn't been officially released yet, it's not covered by the latest TypeScript version and there aren't any type definition files available for it (that I know of). If you're using Create React App or a bundler like Vite to build your site, you can add the following to your react-app-env.d.ts or vite-env.d.ts file. Alternatively, you can create a file named Window.d.ts and add the interface to it.

```ts
interface Window {
  ai: {
    assistant: {
      capabilities: () => Promise<any>;
      create: () => Promise<any>;
    };
  };
}
```

Example Chat App using window.ai

AI chat app powered by Chrome's window.ai

If you're looking for a starting point for creating your own chat app in advance of the window.ai API being released in Chrome, check out my chrome-window-ai-chat-app GitHub repo. It's a fully interactive chat that uses the hook above to chat with the model through a simple UI. Keep in mind that this functionality will only work for users accessing the app through Chrome, even once the new API is officially released. It's a little up in the air how to provide an on-device generative AI web experience that works across browsers right now. It's definitely a space that I'll be watching, though.

Next Up

Since this feature will only be available in Chrome, it seems like a good opportunity to create a Chrome extension that leverages it. I'll be exploring that in an upcoming blog post.

Thumbnail image credit: Google DeepMind