Navigator n1.5

Navigator n1.5 is the latest generation of the Navigator model family. Use the API model id n1.5-latest (or a dated version) in the model field of your chat.completions requests.

Model Versions

API model id	Description
`n1.5-latest`	Points to the latest stable Navigator n1.5 model. Currently points to `n1.5-20260428`.
`n1.5-20260428`	Stable release (2026-04-28).

Supported Actions

Core Tools

The default tool set (browser_tools_core-20260403) provides 18 coordinate-based browser tools.

Action	Description	Required Args	Optional Args
`left_click`	Left mouse click	`coordinates`	`ref`, `modifier`
`double_click`	Double left click	`coordinates`	`ref`, `modifier`
`triple_click`	Triple left click	`coordinates`	`ref`, `modifier`
`middle_click`	Middle mouse click	`coordinates`	`ref`, `modifier`
`right_click`	Right mouse click	`coordinates`	`ref`, `modifier`
`scroll`	Scroll in a direction	`coordinates`, `direction`, `amount`	`ref`, `modifier`
`type`	Type text into focused input	`text`
`key_press`	Press a key or combination	`key`
`drag`	Drag from start to end	`start_coordinates`, `coordinates`
`mouse_move`	Move mouse to a point	`coordinates`	`ref`
`mouse_down`	Press and hold left mouse button	`coordinates`	`ref`
`mouse_up`	Release left mouse button	`coordinates`	`ref`
`go_back`	Browser back
`go_forward`	Browser forward
`wait`	Pause execution		`duration`
`goto_url`	Navigate to URL	`url`
`refresh`	Reload page
`hold_key`	Hold a key down	`key`	`duration`

Parameter notes:

coordinates is always [x, y] in the normalized 1000x1000 space.
ref is an optional DOM element reference, used as an alternative to coordinates in browser contexts.
modifier is a modifier key held during the action: ctrl, shift, alt, meta, command, or super.
direction for scroll is one of: down, up, left, right.
amount for scroll is an integer where 1 unit is approximately 10% of the screen height.
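
The parameter notes above imply two small conversions on the client side: scaling normalized coordinates to the real viewport, and turning a scroll amount into pixels. The following is a minimal sketch, not the SDK's implementation. The function names `to_viewport_pixels` and `scroll_delta` are hypothetical, the clamping at the viewport edge is an assumption, and applying the same "10% of screen height" step to horizontal scrolling is also an assumption.

```python
from typing import Sequence, Tuple

NORMALIZED_MAX = 1000  # coordinates arrive in a normalized 1000x1000 space


def to_viewport_pixels(coordinates: Sequence[float], width: int, height: int) -> Tuple[int, int]:
    """Scale normalized [x, y] to pixel coordinates for a viewport of the given size."""
    x, y = coordinates
    px = round(x * width / NORMALIZED_MAX)
    py = round(y * height / NORMALIZED_MAX)
    # Assumption: clamp so a coordinate of exactly 1000 still lands inside the viewport.
    return min(max(px, 0), width - 1), min(max(py, 0), height - 1)


def scroll_delta(direction: str, amount: int, height: int) -> Tuple[int, int]:
    """Convert a scroll action into (delta_x, delta_y) pixels.

    Uses the documented approximation of 1 unit ~= 10% of the screen height.
    Assumption: horizontal scrolling uses the same step size.
    """
    step = round(amount * height / 10)
    deltas = {
        "down": (0, step),
        "up": (0, -step),
        "right": (step, 0),
        "left": (-step, 0),
    }
    if direction not in deltas:
        raise ValueError(f"unknown scroll direction: {direction!r}")
    return deltas[direction]
```

With a 1280x800 viewport, `to_viewport_pixels([500, 250], 1280, 800)` gives `(640, 200)`, and `scroll_delta("down", 3, 800)` gives `(0, 240)`.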

Expanded Browser Tools

The expanded tool set (browser_tools_expanded-20260403) includes all core tools plus the following DOM/ref-based extras:

Action	Description	Required Args	Optional Args
`extract_elements`	Extract a structured ARIA-snapshot-style representation of the page’s interactive and semantic elements. Returns a `pageContent` string of `- role "name" [ref=ref_N]` lines (with optional `id`, `href` attributes) that the model can read. Each `ref` is a stable handle that later actions (`left_click`, `set_element_value`, …) can target.		`filter` (`visible`, `interactive`, `all`)
`find`	Search the page for elements whose ARIA-snapshot line matches a substring. Returns a `matches` list of `- role "name" [ref=ref_N]` lines (and a `totalMatches` count) so the model can target them.	`text`
`set_element_value`	Set the value of an `<input>`, `<textarea>`, or `<select>` element directly by ref, dispatching the right `input`/`change` events so the page sees the update.	`ref`, `value`
`execute_js`	Run an arbitrary JavaScript expression or statement block against the page and return the (JSON-serialized) result back to the model. Lets the model interact with the page directly when that’s faster or more reliable than the equivalent click/type/scroll sequence — reading hidden state, calling page-internal APIs, or scripting multi-step flows in one shot.	`text`

Like every Navigator tool, these are predicted by the model and executed by your client — the API never touches your browser. What’s specific to the expanded tools is how you execute them: instead of mapping to a Playwright primitive (click, type, scroll), each one needs custom JavaScript evaluated against the page (e.g., via page.evaluate()). The result comes back as a tool message in the next request.
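
If your client also needs to inspect the `pageContent` string itself (for logging, or to check that a `ref` exists before acting on it), the `- role "name" [ref=ref_N]` line format from the table above can be parsed with a couple of regular expressions. This is only a sketch: `parse_page_content` is a hypothetical helper, and it assumes names contain no embedded double quotes and that any extra attributes such as `id` or `href` can be ignored.

```python
import re
from typing import Dict, List

# Matches the leading "- role" and an optional quoted accessible name.
HEAD_RE = re.compile(r'^\s*-\s+(?P<role>[\w-]+)(?:\s+"(?P<name>[^"]*)")?')
# Matches the ref handle anywhere on the line.
REF_RE = re.compile(r"\[ref=(?P<ref>ref_\d+)\]")


def parse_page_content(page_content: str) -> List[Dict[str, str]]:
    """Return one {role, name, ref} record per element line; other lines are skipped."""
    elements = []
    for line in page_content.splitlines():
        head = HEAD_RE.match(line)
        ref = REF_RE.search(line)
        if not head or not ref:
            continue  # not an element line, or it carries no ref handle
        elements.append(
            {
                "role": head.group("role"),
                "name": head.group("name") or "",
                "ref": ref.group("ref"),
            }
        )
    return elements
```

Indented lines are accepted as well, so a nested snapshot would still yield a flat list of records.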

Reference Implementation

The Yutori Python SDK bundles a reference implementation for each tool, along with an async helper that evaluates them against a Playwright page. Import the script constants and evaluate_tool_script from yutori.navigator.tools:

Tool	SDK constant	Reference script	Returns
`extract_elements`	`EXTRACT_ELEMENTS_SCRIPT`	extract_elements.js	`{success, pageContent}` — also stores live refs on `window.__yutoriElementRefs` for later ref-based actions
`find`	`FIND_SCRIPT`	find.js	`{success, matches, totalMatches}` — substring filter over the same DOM walk
`set_element_value`	`SET_ELEMENT_VALUE_SCRIPT`	set_element_value.js	`{success, message}` — sets the value by `ref` and dispatches the right `input`/`change` events
`execute_js`	`EXECUTE_JS_SCRIPT`	execute_js.js	`{success, hasResult, result}` — wraps the model’s snippet in an `AsyncFunction` so both expressions and statement blocks work
`left_click` / `scroll` via `ref`	`GET_ELEMENT_BY_REF_SCRIPT`	get_element_by_ref.js	`{success, coordinates}` — resolves a `ref` to viewport pixel coordinates and scrolls it into view

The helper evaluate_tool_script(page, SCRIPT, *args) JSON-serializes its arguments, evaluates the script against the page, and returns a Python dict. Three common patterns:

from yutori.navigator.tools import (
    EXECUTE_JS_SCRIPT,
    EXTRACT_ELEMENTS_SCRIPT,
    GET_ELEMENT_BY_REF_SCRIPT,
    evaluate_tool_script,
)

# 1. Read structured DOM for the model — feed `pageContent` back as the tool result.
result = await evaluate_tool_script(page, EXTRACT_ELEMENTS_SCRIPT, "visible")
tool_result_text = result["pageContent"]

# 2. Resolve a `ref` (from extract_elements/find) into viewport pixels before a click.
result = await evaluate_tool_script(page, GET_ELEMENT_BY_REF_SCRIPT, ref)
if result["success"]:
    px_x, px_y = result["coordinates"]

# 3. Run the model's `execute_js` snippet. Pass the raw `text` argument; the script
#    already wraps it in an AsyncFunction. Surface `result` when hasResult is True.
result = await evaluate_tool_script(page, EXECUTE_JS_SCRIPT, args["text"])
tool_result_text = str(result["result"]) if result.get("hasResult") else "undefined"

For the full agent loop — including how each tool’s response envelope feeds back into the next assistant turn — see examples/navigator_n1_5.py in the SDK.
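
As a rough illustration of the step between running a script and sending the tool message, the sketch below flattens each result envelope from the table above into a single string. It is not taken from the SDK: `tool_result_text` is a hypothetical helper, the failure text is an invented format, `matches` is assumed to be a list of strings, and serializing `result` with `json.dumps` is one possible choice among several.

```python
import json
from typing import Any, Dict


def tool_result_text(tool: str, result: Dict[str, Any]) -> str:
    """Flatten a reference script's result envelope into text for the tool message."""
    if not result.get("success"):
        # Assumption: failures may carry a `message`; this error format is made up.
        return f"{tool} failed: {result.get('message', 'unknown error')}"
    if tool == "extract_elements":
        return result["pageContent"]
    if tool == "find":
        # Assumption: `matches` is a list of already-formatted element lines.
        lines = "\n".join(result["matches"])
        return f"{result['totalMatches']} match(es)\n{lines}".rstrip()
    if tool == "set_element_value":
        return result["message"]
    if tool == "execute_js":
        return json.dumps(result["result"]) if result.get("hasResult") else "undefined"
    raise ValueError(f"no result formatter for tool: {tool!r}")
```

Keeping the formatting in one function makes it easier to keep the text the model sees consistent across tools.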

Key Space

Navigator n1.5 uses lowercase key names. Combinations are joined with +, and sequential presses are separated by spaces.

Category	Key Names
Modifiers	`ctrl`, `alt`, `shift`, `meta`, `command`, `super`
Common	`enter`, `backspace`, `delete`, `tab`, `esc`, `space`
Arrow keys	`left`, `right`, `up`, `down`
Page navigation	`pageup`, `pagedown`, `home`, `end`
Function keys	`f1` through `f12`

Examples: ctrl+c, ctrl+shift+t, alt+left, down down down enter
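
Since the model emits keys in this lowercase key space, a client driving Playwright has to translate them before pressing. The sketch below splits a `key` value into sequential presses and maps each part to a Playwright key name. The mapping table is an assumption (in particular, sending `command` and `super` to `Meta`), `to_playwright_presses` is a hypothetical helper, and a literal `+` key is not handled.

```python
from typing import List

# Assumed mapping from the lowercase key space to Playwright key names.
PLAYWRIGHT_KEYS = {
    "ctrl": "Control", "alt": "Alt", "shift": "Shift",
    "meta": "Meta", "command": "Meta", "super": "Meta",
    "enter": "Enter", "backspace": "Backspace", "delete": "Delete",
    "tab": "Tab", "esc": "Escape", "space": "Space",
    "left": "ArrowLeft", "right": "ArrowRight", "up": "ArrowUp", "down": "ArrowDown",
    "pageup": "PageUp", "pagedown": "PageDown", "home": "Home", "end": "End",
}
PLAYWRIGHT_KEYS.update({f"f{n}": f"F{n}" for n in range(1, 13)})


def to_playwright_presses(key: str) -> List[str]:
    """Split on spaces for sequential presses, then translate each '+'-joined combination."""
    presses = []
    for combo in key.split():
        # Unknown parts (such as single characters) pass through unchanged.
        parts = [PLAYWRIGHT_KEYS.get(part, part) for part in combo.split("+")]
        presses.append("+".join(parts))
    return presses
```

Each returned string could then be passed, in order, to Playwright's `page.keyboard.press`. For example, `to_playwright_presses("down down down enter")` yields `["ArrowDown", "ArrowDown", "ArrowDown", "Enter"]`.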

Features

Tool Sets

Use the tool_set parameter to select which set of browser tools is available to the model:

response = client.chat.completions.create(
    model="n1.5-latest",
    messages=[...],
    extra_body={
        "tool_set": "browser_tools_expanded-20260403",
    }
)

Available tool sets:

browser_tools_core-20260403 (default) — coordinate-based visual browser tools
browser_tools_expanded-20260403 — core + DOM-based tools (extract_elements, find, set_element_value, execute_js)

Disabling Specific Tools

Remove specific tools from the active tool set:

response = client.chat.completions.create(
    model="n1.5-latest",
    messages=[...],
    extra_body={
        "disable_tools": ["hold_key", "drag"],
    }
)
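
A client may want to catch typos in tool names before sending a request. The sketch below builds the `extra_body` dictionary and checks `disable_tools` against the tool lists documented on this page. It is illustrative only: `build_extra_body` is a hypothetical helper, combining `tool_set` and `disable_tools` in one request is an assumption, and how the API itself treats unknown names is not covered here.

```python
from typing import Any, Dict, Iterable, Optional

CORE_TOOLS = frozenset({
    "left_click", "double_click", "triple_click", "middle_click", "right_click",
    "scroll", "type", "key_press", "drag", "mouse_move", "mouse_down", "mouse_up",
    "go_back", "go_forward", "wait", "goto_url", "refresh", "hold_key",
})
EXPANDED_ONLY_TOOLS = frozenset({"extract_elements", "find", "set_element_value", "execute_js"})
TOOL_SETS = {
    "browser_tools_core-20260403": CORE_TOOLS,
    "browser_tools_expanded-20260403": CORE_TOOLS | EXPANDED_ONLY_TOOLS,
}
DEFAULT_TOOL_SET = "browser_tools_core-20260403"


def build_extra_body(tool_set: Optional[str] = None, disable_tools: Iterable[str] = ()) -> Dict[str, Any]:
    """Return an extra_body dict, rejecting tool names that are not in the selected set."""
    available = TOOL_SETS.get(tool_set or DEFAULT_TOOL_SET)
    if available is None:
        raise ValueError(f"unknown tool set: {tool_set!r}")
    disabled = list(disable_tools)
    unknown = [name for name in disabled if name not in available]
    if unknown:
        raise ValueError(f"tools not in the selected tool set: {unknown}")
    body: Dict[str, Any] = {}
    if tool_set:
        body["tool_set"] = tool_set
    if disabled:
        body["disable_tools"] = disabled
    return body
```

The result can be passed as `extra_body=build_extra_body(...)` in the `client.chat.completions.create` call.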

JSON Structured Output

Provide a json_schema to get structured data extracted from the model’s response. The schema is appended to your task message, and the model returns JSON inside ```json code fences. The API parses this and returns it as a parsed_json field.

response = client.chat.completions.create(
    model="n1.5-latest",
    messages=[...],
    extra_body={
        "json_schema": {
            "type": "object",
            "properties": {
                "product_name": {"type": "string"},
                "price": {"type": "number"}
            },
            "required": ["product_name", "price"]
        }
    }
)

# Access the parsed result; the field is absent when the model did not return valid JSON
parsed = getattr(response, "parsed_json", None)  # e.g. {"product_name": "Widget Pro", "price": 29.99}

When json_schema is provided, the API also adds a structural tag for guided decoding of the JSON output, constraining it to match your schema. If the model doesn’t return valid JSON (e.g., it’s still navigating), the parsed_json field will not be present in the response.
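
Because `parsed_json` can be absent, client code should read it defensively. The sketch below does that and, as an optional fallback, looks for a json code fence in the message text. Both helper names are hypothetical, and the fallback assumes the fenced JSON appears in `choices[0].message.content`, which this page does not confirm.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # three backticks, built programmatically to keep this sample easy to embed
JSON_FENCE_RE = re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL)


def extract_fenced_json(content: Optional[str]) -> Optional[Any]:
    """Return the last parseable JSON value found inside a json code fence, else None."""
    for block in reversed(JSON_FENCE_RE.findall(content or "")):
        try:
            return json.loads(block)
        except json.JSONDecodeError:
            continue
    return None


def read_structured_output(response: Any) -> Optional[Any]:
    """Prefer the API's parsed_json field; fall back to scanning the message text."""
    parsed = getattr(response, "parsed_json", None)
    if parsed is not None:
        return parsed
    # Assumption: the raw model text is available at choices[0].message.content.
    choices = getattr(response, "choices", None) or []
    if not choices:
        return None
    return extract_fenced_json(choices[0].message.content)
```

A return value of `None` usually means the model is still navigating, so the agent loop should continue rather than treat it as an error.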

Differences from Navigator n1

Feature	Navigator n1	Navigator n1.5
JSON structured output	Not supported	`json_schema` param with `parsed_json` response
Tool sets	Fixed	Selectable (`browser_tools_core-`, `browser_tools_expanded-`)
`disable_tools`	Not supported	Supported
Additional tools	—	`hold_key`, `middle_click`, `mouse_down`, `mouse_up`, `go_forward`
Mouse move	`hover`	`mouse_move`
Key press param	`key_comb` (Playwright names)	`key` (lowercase key space)
Click modifiers	Not supported	`ref`, `modifier` params
`type` extras	`press_enter_after`, `clear_before_typing`	Not included
