chat.completions API.
n1-preview-2025-11, we use the observation role to pass in screenshots.
n1-preview-2025-11. Future models will be robust to other resolutions.
We recommend using the WebP format for screenshots, as it offers significantly better compression than formats like PNG—especially when sharing multi-step trajectories with many images.
observation blocks as urls or base64 strings.
Passing a URL:
observation messages is recommended—but not required—for better attribution of the information source in stop messages.
For example:
n1-preview-2025-11 currently supports.
Note that n1-preview-2025-11 outputs relative coordinates in 1000×1000, which should be converted to absolute coordinates when executing actions in a browser environment.
| Action | Description | Arguments | Example |
|---|---|---|---|
| click | Left mouse click at a specific point on the page. | center_coordinates: [x, y] | { "action_type": "click", "center_coordinates": [500, 300] } |
| scroll | Scrolls the page in a given direction by a specified amount, centered around a given position. We recommend treating each scroll amount as 10-15% of the screen. | direction: string center_coordinates: [x, y] amount: int | { "action_type": "scroll", "direction": "down", "center_coordinates": [632, 500], "amount": 3} |
| type | Types text into the currently focused input element, optionally clearing it first and/or pressing Enter afterward. | text: string press_enter_after: bool clear_before_typing: bool | { "action_type": "type", "text": "example", "press_enter_after": false, "clear_before_typing": true} |
| key_press | Sends keyboard input (e.g. Escape). | key_comb: string (compatible with Playwright Keyboard press) | { "action_type": "key_press", "key_comb": "Escape" } |
| hover | Moves the mouse pointer to a specific location without clicking. | center_coordinates: [x, y] | { "action_type": "hover", "center_coordinates": [540, 210] } |
| drag | Click-and-hold on a starting coordinate, move the cursor to a destination coordinate. Note: center_coordinates is the destination. | start_coordinates: [x, y] center_coordinates: [x, y] | {"action_type": "drag", "start_coordinates": [63, 458], "center_coordinates": [273, 458] } |
| wait | Pauses without performing any UI action, usually to allow the page/UI to update. | (none) | { "action_type": "wait" } |
| refresh | Reloads the current page (browser refresh). | (none) | { "action_type": "refresh" } |
| go_back | Navigates back to the previous page in browser history. | (none) | { "action_type": "go_back" } |
| goto_url | Navigates directly to a specified URL. | url: string | { "action_type": "goto_url", "url": "https://example.com" } |
| read_texts_and_links | Reads visible on-screen text and saves relevant URLs for citation. No interaction with the page. This is implemented as an external VLM call using the current screenshot, the user’s task, and a simplified DOM (for links) as inputs. | (none) | { "action_type": "read_texts_and_links" } |
| stop | Ends the current trajectory immediately and returns the final answer or summary. | answer: string | { "action_type": "stop", "answer": "example" } |
Use Authorization: Bearer <api_key>
"n1-preview-2025-11"This field will be supported in future releases.
This field will be supported in future releases.
Successful Response