Mirar by RunAnywhere

Real-time vision agents for developers. Point Mirar at a webcam, a screen, or a video stream; declare what to watch for; get verified answers, signed webhooks, or a voice in your ear — while the model only looks when something happens.

Request early access See how it works

mirar · live session

watching

How it works

One pipeline from photons to webhooks.

Every source normalizes to a frame stream. Cheap local processors watch every frame; declarative triggers decide the moment that's worth reasoning about; the answer routes to whatever output your system speaks. You write the config — Mirar runs the loop.

Stream

Webcam, screen share, RTSP and RTMP feeds, video files, or raw frames pushed from your own code — everything becomes the same frame stream.

webcamscreenrtsp://video.mp4

Watch

Lightweight processors run on every sampled frame for fractions of a cent: motion gating in about a millisecond, object detection and tracking, OCR, pose.

motion ~1msRF-DETRocrpose

Decide

Declarative triggers are visual turn detection: the agent speaks when a rule fires, not on a timer. Two-stage verify means false positives don’t page anyone.

object.appearedzones2-stage verify

Reason

The trigger’s prompt goes to the model you choose, with rolling visual memory — keyframes, digests, and a compressed transcript under a hard token budget.

Gemini LiveGPT-4oClaudeBYO base_url

Act

Answers leave the loop in whatever shape your system speaks: HMAC-signed webhooks with retries, live text streams, tool calls, overlays, or voice.

webhook · signedSSE/WStool callvoice

Live demo

Work in progress

A real camera. Right now.

This is a live public broadcast of the Shibuya Scramble Crossing — the kind of feed Mirar is built to watch. Soon you'll type a prompt right here and Mirar will run it against this stream in real time: perception on our infrastructure, reasoning routed to a cloud VLM, answers on the page. Edge reasoning comes later.

shibuya-crossing · live public feed

Live

Ask Mirar about this stream — “how many people are crossing right now?”WIP

“tell me when the crowd thins out”“alert me if someone stops mid-crossing”“what’s the weather like on the ground?”

public broadcast: ANN · Shibuya Scramble Crossing, Tokyo · via YouTube Live · muted

The cost moat

Continuous vision you can afford.

Pointing a frontier model at a camera at a fixed 2 fps burns money looking at an empty room. Mirar's adaptive sampler is a state machine: idle streams drop to 0.2 fps, motion snaps the agent to 2–5 fps, and a per-hour budget cap clamps it back down — frames are dropped, never queued, so cost can't run away while you sleep.

cheaper than fixed 2 fps · identical triggers fired

idle 0.2 fpsactive 2–5 fpsbudget_usd_per_hourdrop, never queue

cost · fixed vs adaptive

fixed · 2.0 fps

$0.00/hr

adaptive · 0.2–5 fps

CALIBRATING$0.00/hr

≈68% cheaper

identical triggers fired

Put a camera on it.

We're onboarding design partners while Mirar hardens on staging — free usage and white-glove trigger tuning in exchange for a weekly call. Tell us what you want watched.

Request early access