Navigator n1.5 is the latest generation of the Navigator model family. Use the API model id n1.5-latest (or a dated version) in the model field of your chat.completions requests.

Model Versions

| API model id | Description |
| --- | --- |
| n1.5-latest | Points to the latest stable Navigator n1.5 model. Currently points to n1.5-20260428. |
| n1.5-20260428 | Stable release (2026-04-28). |

Supported Actions

Core Tools

The default tool set (browser_tools_core-20260403) contains 18 coordinate-based browser tools.
| Action | Description | Required Args | Optional Args |
| --- | --- | --- | --- |
| left_click | Left mouse click | coordinates | ref, modifier |
| double_click | Double left click | coordinates | ref, modifier |
| triple_click | Triple left click | coordinates | ref, modifier |
| middle_click | Middle mouse click | coordinates | ref, modifier |
| right_click | Right mouse click | coordinates | ref, modifier |
| scroll | Scroll in a direction | coordinates, direction, amount | ref, modifier |
| type | Type text into focused input | text | |
| key_press | Press a key or combination | key | |
| drag | Drag from start to end | start_coordinates, coordinates | |
| mouse_move | Move mouse to a point | coordinates | ref |
| mouse_down | Press and hold left mouse button | coordinates | ref |
| mouse_up | Release left mouse button | coordinates | ref |
| go_back | Browser back | | |
| go_forward | Browser forward | | |
| wait | Pause execution | | duration |
| goto_url | Navigate to URL | url | |
| refresh | Reload page | | |
| hold_key | Hold a key down | key | duration |
Parameter notes:
  • coordinates is always [x, y] in the normalized 1000x1000 space.
  • ref is an optional DOM element reference, used as an alternative to coordinates in browser contexts.
  • modifier is a modifier key held during the action: ctrl, shift, alt, meta, command, or super.
  • direction for scroll is one of: down, up, left, right.
  • amount for scroll is an integer where 1 unit is approximately 10% of the screen height.
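
The parameter notes above imply a small conversion step on the client before an action reaches the browser. The following is a minimal sketch, not part of the SDK: the helper names `to_viewport_pixels` and `scroll_delta` are hypothetical, the clamping to the viewport is an added safety choice, and applying the same height-based step to horizontal scrolling is an assumption (the notes only define the unit relative to screen height).

```python
from typing import Sequence, Tuple

NORMALIZED_SIZE = 1000  # coordinates arrive in a normalized 1000x1000 space


def to_viewport_pixels(
    coordinates: Sequence[float], viewport_width: int, viewport_height: int
) -> Tuple[int, int]:
    """Map normalized [x, y] onto a viewport of the given pixel size."""
    x, y = coordinates
    px_x = round(x / NORMALIZED_SIZE * viewport_width)
    px_y = round(y / NORMALIZED_SIZE * viewport_height)
    # Clamp so that x=1000 or y=1000 still lands inside the viewport.
    px_x = min(max(px_x, 0), viewport_width - 1)
    px_y = min(max(px_y, 0), viewport_height - 1)
    return px_x, px_y


def scroll_delta(direction: str, amount: int, viewport_height: int) -> Tuple[int, int]:
    """Return (delta_x, delta_y) in pixels for a scroll action.

    One unit is treated as 10% of the viewport height. Using the same step
    for left/right scrolling is an assumption of this sketch.
    """
    step = round(amount * viewport_height / 10)
    deltas = {
        "down": (0, step),
        "up": (0, -step),
        "right": (step, 0),
        "left": (-step, 0),
    }
    if direction not in deltas:
        raise ValueError(f"unknown scroll direction: {direction!r}")
    return deltas[direction]
```

With Playwright, the resulting pixel values could be passed to calls such as `page.mouse.click(px_x, px_y)` or `page.mouse.wheel(delta_x, delta_y)`.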

Expanded Browser Tools

Includes all core tools plus DOM/ref-based extras (browser_tools_expanded-20260403):
| Action | Description | Required Args | Optional Args |
| --- | --- | --- | --- |
| extract_elements | Extract a structured ARIA-snapshot-style representation of the page’s interactive and semantic elements. Returns a pageContent string of - role "name" [ref=ref_N] lines (with optional id, href attributes) that the model can read. Each ref is a stable handle that later actions (left_click, set_element_value, …) can target. | | filter (visible, interactive, all) |
| find | Search the page for elements whose ARIA-snapshot line matches a substring. Returns a matches list of - role "name" [ref=ref_N] lines (and a totalMatches count) so the model can target them. | text | |
| set_element_value | Set the value of an <input>, <textarea>, or <select> element directly by ref, dispatching the right input/change events so the page sees the update. | ref, value | |
| execute_js | Run an arbitrary JavaScript expression or statement block against the page and return the (JSON-serialized) result back to the model. Lets the model interact with the page directly when that’s faster or more reliable than the equivalent click/type/scroll sequence: reading hidden state, calling page-internal APIs, or scripting multi-step flows in one shot. | text | |
Like every Navigator tool, these are predicted by the model and executed by your client — the API never touches your browser. What’s specific to the expanded tools is how you execute them: instead of mapping to a Playwright primitive (click, type, scroll), each one needs custom JavaScript evaluated against the page (e.g., via page.evaluate()). The result comes back as a tool message in the next request.
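
The split described above (browser primitives versus page scripts, with the result returned as a tool message) can be sketched as follows. This is an illustration under stated assumptions rather than the SDK's own code: `execution_path` and `build_tool_message` are hypothetical helper names, and the message shape (`role`, `tool_call_id`, `content`) assumes the OpenAI-style chat.completions tool message format.

```python
import json
from typing import Any, Dict

# Tools that, per the table above, are only in the expanded tool set.
EXPANDED_TOOLS = frozenset(
    {"extract_elements", "find", "set_element_value", "execute_js"}
)


def execution_path(action_name: str) -> str:
    """Say how the client would execute a predicted action."""
    if action_name in EXPANDED_TOOLS:
        return "page_script"  # custom JavaScript evaluated against the page
    return "browser_primitive"  # click, type, scroll, navigation, ...


def build_tool_message(tool_call_id: str, result: Any) -> Dict[str, str]:
    """Package a client-side result for the next request.

    Assumption: the API accepts OpenAI-style tool messages. Non-string
    results are JSON-serialized so the model receives plain text.
    """
    content = result if isinstance(result, str) else json.dumps(result, ensure_ascii=False)
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}
```

In a loop, the returned dict would be appended to `messages` before the next `chat.completions` call.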

Reference Implementation

The Yutori Python SDK bundles a reference implementation for each tool, along with an async helper that evaluates them against a Playwright page. Import the script constants and evaluate_tool_script from yutori.navigator.tools:
| Tool | SDK constant | Reference script | Returns |
| --- | --- | --- | --- |
| extract_elements | EXTRACT_ELEMENTS_SCRIPT | extract_elements.js | {success, pageContent}; also stores live refs on window.__yutoriElementRefs for later ref-based actions |
| find | FIND_SCRIPT | find.js | {success, matches, totalMatches}; substring filter over the same DOM walk |
| set_element_value | SET_ELEMENT_VALUE_SCRIPT | set_element_value.js | {success, message}; sets the value by ref and dispatches the right input/change events |
| execute_js | EXECUTE_JS_SCRIPT | execute_js.js | {success, hasResult, result}; wraps the model’s snippet in an AsyncFunction so both expressions and statement blocks work |
| left_click / scroll via ref | GET_ELEMENT_BY_REF_SCRIPT | get_element_by_ref.js | {success, coordinates}; resolves a ref to viewport pixel coordinates and scrolls it into view |
The helper evaluate_tool_script(page, SCRIPT, *args) JSON-serializes its arguments, evaluates the script against the page, and returns a Python dict. Three common patterns:
from yutori.navigator.tools import (
    EXECUTE_JS_SCRIPT,
    EXTRACT_ELEMENTS_SCRIPT,
    GET_ELEMENT_BY_REF_SCRIPT,
    evaluate_tool_script,
)

# 1. Read structured DOM for the model — feed `pageContent` back as the tool result.
result = await evaluate_tool_script(page, EXTRACT_ELEMENTS_SCRIPT, "visible")
tool_result_text = result["pageContent"]

# 2. Resolve a `ref` (from extract_elements/find) into viewport pixels before a click.
result = await evaluate_tool_script(page, GET_ELEMENT_BY_REF_SCRIPT, ref)
if result["success"]:
    px_x, px_y = result["coordinates"]

# 3. Run the model's `execute_js` snippet. Pass the raw `text` argument; the script
#    already wraps it in an AsyncFunction. Surface `result` when hasResult is True.
result = await evaluate_tool_script(page, EXECUTE_JS_SCRIPT, args["text"])
tool_result_text = str(result["result"]) if result.get("hasResult") else "undefined"
For the full agent loop — including how each tool’s response envelope feeds back into the next assistant turn — see examples/navigator_n1_5.py in the SDK.

Key Space

Navigator n1.5 uses lowercase key names. Combinations are joined with +, and sequential presses are separated by spaces.
| Category | Key Names |
| --- | --- |
| Modifiers | ctrl, alt, shift, meta, command, super |
| Common | enter, backspace, delete, tab, esc, space |
| Arrow keys | left, right, up, down |
| Page navigation | pageup, pagedown, home, end |
| Function keys | f1 through f12 |
Examples: ctrl+c, ctrl+shift+t, alt+left, down down down enter
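
A client has to turn these key strings into something its browser driver understands. The sketch below splits a key string into sequential chords and translates each name to a Playwright-style key name. The helper names are hypothetical, and the name mapping (including treating `command` and `super` as `Meta`) is an assumption of this example, not something the API specifies. A literal `+` key is not handled.

```python
from typing import Dict, List

# Assumed translation from Navigator key names to Playwright key names.
PLAYWRIGHT_NAMES: Dict[str, str] = {
    "ctrl": "Control", "alt": "Alt", "shift": "Shift",
    "meta": "Meta", "command": "Meta", "super": "Meta",
    "enter": "Enter", "backspace": "Backspace", "delete": "Delete",
    "tab": "Tab", "esc": "Escape", "space": "Space",
    "left": "ArrowLeft", "right": "ArrowRight",
    "up": "ArrowUp", "down": "ArrowDown",
    "pageup": "PageUp", "pagedown": "PageDown",
    "home": "Home", "end": "End",
}
# Function keys f1 through f12 map to F1 through F12.
PLAYWRIGHT_NAMES.update({f"f{i}": f"F{i}" for i in range(1, 13)})


def parse_key_string(key: str) -> List[List[str]]:
    """Split on spaces into sequential presses, then on '+' into chords."""
    return [chord.split("+") for chord in key.split()]


def to_playwright_chords(key: str) -> List[str]:
    """Translate each chord; names without a mapping pass through unchanged."""
    return [
        "+".join(PLAYWRIGHT_NAMES.get(name, name) for name in chord)
        for chord in parse_key_string(key)
    ]
```

Each returned chord could then be sent in order, for example with `await page.keyboard.press(chord)`.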

Features

Tool Sets

Use the tool_set parameter to select which set of browser tools is available to the model:
response = client.chat.completions.create(
    model="n1.5-latest",
    messages=[...],
    extra_body={
        "tool_set": "browser_tools_expanded-20260403",
    }
)
Available tool sets:
  • browser_tools_core-20260403 (default) — coordinate-based visual browser tools
  • browser_tools_expanded-20260403 — core + DOM-based tools (extract_elements, find, set_element_value, execute_js)

Disabling Specific Tools

Remove specific tools from the active tool set:
response = client.chat.completions.create(
    model="n1.5-latest",
    messages=[...],
    extra_body={
        "disable_tools": ["hold_key", "drag"],
    }
)
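
Because disable_tools takes free-form strings, a typo in a tool name is easy to miss. The following is a sketch of a client-side check against the tool names listed in the tables above. The helper `build_disable_tools` is hypothetical, and how the API itself treats unknown names is not stated here, so this check is purely a local convenience.

```python
from typing import Dict, Iterable, List

# Tool names copied from the Core Tools and Expanded Browser Tools tables.
CORE_TOOLS = frozenset({
    "left_click", "double_click", "triple_click", "middle_click", "right_click",
    "scroll", "type", "key_press", "drag", "mouse_move", "mouse_down", "mouse_up",
    "go_back", "go_forward", "wait", "goto_url", "refresh", "hold_key",
})
EXPANDED_ONLY_TOOLS = frozenset(
    {"extract_elements", "find", "set_element_value", "execute_js"}
)


def build_disable_tools(names: Iterable[str], expanded: bool = False) -> Dict[str, List[str]]:
    """Return an extra_body fragment after checking every tool name locally."""
    names = list(names)
    available = CORE_TOOLS | EXPANDED_ONLY_TOOLS if expanded else CORE_TOOLS
    unknown = sorted(set(names) - available)
    if unknown:
        raise ValueError(f"not in the selected tool set: {unknown}")
    # dict.fromkeys removes duplicates while keeping the caller's order.
    return {"disable_tools": list(dict.fromkeys(names))}
```

The returned dict could be passed as `extra_body=build_disable_tools(["hold_key", "drag"])`.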

JSON Structured Output

Provide a json_schema to get structured data extracted from the model’s response. The schema is appended to your task message, and the model returns JSON inside ```json code fences. The API parses this and returns it as a parsed_json field.
response = client.chat.completions.create(
    model="n1.5-latest",
    messages=[...],
    extra_body={
        "json_schema": {
            "type": "object",
            "properties": {
                "product_name": {"type": "string"},
                "price": {"type": "number"}
            },
            "required": ["product_name", "price"]
        }
    }
)

# Access the parsed result; the field is absent when the model did not return valid JSON
parsed = getattr(response, "parsed_json", None)  # e.g. {"product_name": "Widget Pro", "price": 29.99}
When json_schema is provided, the API also adds a structural tag for guided decoding of the JSON output, constraining it to match your schema. If the model doesn’t return valid JSON (e.g., it’s still navigating), the parsed_json field will not be present in the response.
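
To make the fenced-JSON convention concrete, here is a sketch of how a json code fence could be located and parsed in a message's text. It only illustrates the idea: the API already performs this step and exposes the result as parsed_json, and its actual implementation is not shown here. The helper name `extract_fenced_json` is hypothetical.

```python
import json
import re
from typing import Any, Optional

# Built programmatically so the example does not contain a literal fence.
FENCE = "`" * 3
_JSON_BLOCK = re.compile(
    re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_fenced_json(content: str) -> Optional[Any]:
    """Return the parsed body of the first json fence, or None.

    None covers both cases mentioned above: no fence at all (for example,
    the model is still navigating) and a fence whose body is not valid JSON.
    """
    match = _JSON_BLOCK.search(content)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
```

Returning None instead of raising mirrors the documented behavior that parsed_json is simply absent when no valid JSON was produced.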

Differences from Navigator n1

| Feature | Navigator n1 | Navigator n1.5 |
| --- | --- | --- |
| JSON structured output | Not supported | json_schema param with parsed_json response |
| Tool sets | Fixed | Selectable (browser_tools_core-*, browser_tools_expanded-*) |
| disable_tools | Not supported | Supported |
| Additional tools | | hold_key, middle_click, mouse_down, mouse_up, go_forward |
| Mouse move | hover | mouse_move |
| Key press param | key_comb (Playwright names) | key (lowercase key space) |
| Click modifiers | Not supported | ref, modifier params |
| type extras | press_enter_after, clear_before_typing | Not included |