jmerelnyc/Photo-agents

Python · AI/ML

Autonomous self-evolving agents. Vision-grounded layered memory and self-written skills for LLM agents that operate your computer.

From the README

Photo Agents

Autonomous self-evolving Photo Agents. A perceive / reason / act framework for photo-aware agents that operate your computer the way you do.

"100% autonomous, self-evolving agents." photo-agents.com

About

Photo Agents is building the next generation of LLM-driven agents that ground themselves in what they actually see on screen. Instead of dumping longer chat transcripts into a model and hoping for the best, we treat memory the way biology does. Vision in. Bound observations stored in layers. Skills written by the agent itself from real success.

The package in this repo is the runtime that ships that idea. It runs locally, so you keep ownership of your screen, your data, and your keys.

  • Website: photo-agents.com
  • X / Twitter: @photoagents

Follow @photoagents on X for build notes, demos, and the occasional rant about why text-only agents will never see your UI.

What it is

Photo Agents is a single Python package that bundles:

  • A streaming agent loop (photoagents.core.loop.run_agent_session) that drives any tool-calling LLM through a perceive → reason → act cycle.
  • A multi-provider LLM router (photoagents.llm.router) with first-class support for Anthropic Claude (native), OpenAI GPT (native), and a mixin failover session.
  • A physical-execution toolset: file I/O, sandboxed code execution (Python / PowerShell / bash), browser automation via a Chrome DevTools Protocol bridge, and a layered memory system (working / global / SOP / session archive).
  • Pluggable clients: a polished Streamlit web app, a PyQt desktop app, a desktop companion and ready-to-run bots for Telegram, QQ, Feishu, WeCom and DingTalk.
  • Optional observability via Langfuse and a cron-style scheduler.
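
The perceive → reason → act cycle named in the first bullet can be sketched in a few lines. This is a minimal illustration, not the code behind photoagents.core.loop.run_agent_session: the names Step, run_loop, and scripted_reason are hypothetical, and the "model" is a scripted stand-in so the example runs without any LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Step:
    """One model decision: either call a tool or finish with an answer."""
    tool: Optional[str] = None
    args: dict = field(default_factory=dict)
    answer: Optional[str] = None


def run_loop(
    perceive: Callable[[], str],
    reason: Callable[[list], Step],
    tools: dict,
    max_turns: int = 8,
) -> str:
    """Drive a perceive -> reason -> act cycle until the model answers."""
    history = []
    for _ in range(max_turns):
        history.append(f"observation: {perceive()}")  # perceive
        step = reason(history)                         # reason
        if step.answer is not None:
            return step.answer
        result = tools[step.tool](**step.args)         # act
        history.append(f"{step.tool} -> {result}")
    raise RuntimeError("no answer within max_turns")


def scripted_reason(history: list) -> Step:
    """Stand-in for an LLM call: list a directory once, then report it."""
    tool_lines = [line for line in history if line.startswith("list_dir")]
    if not tool_lines:
        return Step(tool="list_dir", args={"path": "."})
    return Step(answer=tool_lines[-1])


# Hypothetical tool table; a real toolset would touch the file system.
demo_tools = {"list_dir": lambda path: "a.txt, b.txt"}
print(run_loop(lambda: "desktop screenshot", scripted_reason, demo_tools))
```

The turn limit is a design choice in this sketch: an agent that never produces an answer fails loudly instead of looping forever.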

The whole thing is gated by a remote-validated Photo Agents API key so usage stays accountable.
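
The failover behaviour mentioned for the LLM router above can be pictured as an ordered chain of providers. The sketch below is an assumption about the general technique, not the API of photoagents.llm.router: complete_with_failover, AllProvidersFailed, and the two fake providers are hypothetical names.

```python
from typing import Callable, Sequence, Tuple


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has errored."""


def complete_with_failover(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try providers in order; return (name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def flaky_primary(prompt: str) -> str:
    """Hypothetical provider that always times out."""
    raise TimeoutError("primary timed out")


def echo_backup(prompt: str) -> str:
    """Hypothetical provider that always succeeds."""
    return f"echo: {prompt}"


chain = [("primary", flaky_primary), ("backup", echo_backup)]
print(complete_with_failover("hi", chain))
```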

Install

pip install photoagents
# or, with every optional client and integration
pip install "photoagents[all]"

Photo Agents needs Python 3.10+.

Get an API key

Photo Agents requires a license key, which is validated remotely. Sign in to your Photo Agents account and create one.

Then make it available to the runtime in any of these ways (checked in order):

  1. Environment variable: PHOTOAGENTS_API_KEY=pk_live_...
  2. Saved config: ~/.photoagents/config.json field api_key
  3. Interactive prompt on first run (with an offer to save the key automatically)
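
The lookup order above can be sketched as follows. Only the variable name PHOTOAGENTS_API_KEY, the path ~/.photoagents/config.json, and the api_key field come from this README; resolve_api_key and its parameters are hypothetical, and the step that offers to save a prompted key is left out.

```python
import json
import os
from pathlib import Path
from typing import Callable, Mapping, Optional

CONFIG_PATH = Path.home() / ".photoagents" / "config.json"


def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    config_path: Path = CONFIG_PATH,
    prompt: Optional[Callable[[], str]] = None,
) -> Optional[str]:
    """Return the first key found: environment, saved config, then prompt."""
    key = env.get("PHOTOAGENTS_API_KEY")
    if key:
        return key
    try:
        saved = json.loads(config_path.read_text(encoding="utf-8"))
        key = saved.get("api_key")
    except (OSError, ValueError, AttributeError):
        key = None  # missing or malformed config falls through to the prompt
    if key:
        return key
    return prompt() if prompt else None
```

Passing env and config_path as parameters keeps the function easy to exercise without touching the real environment or home directory.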

A successful validation is cached for 24 hours so the gate stays fast.
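
One way to picture the 24-hour cache is a timestamp per validated key. This is a sketch under assumptions: the README does not say where the cache lives or how it is keyed, so the in-memory ValidationCache class and its injected validate and clock callables are hypothetical.

```python
import time
from typing import Callable

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours, as stated above


class ValidationCache:
    """Remember successful validations so the remote check is skipped."""

    def __init__(
        self,
        validate: Callable[[str], bool],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._validate = validate  # stand-in for the remote validation call
        self._clock = clock
        self._validated_at = {}    # key -> time of last successful check

    def is_valid(self, key: str) -> bool:
        stamp = self._validated_at.get(key)
        if stamp is not None and self._clock() - stamp < CACHE_TTL_SECONDS:
            return True  # fresh cache hit, no remote call
        if self._validate(key):
            self._validated_at[key] = self._clock()
            return True
        self._validated_at.pop(key, None)  # failures are never cached
        return False
```

Only successes are stored, which matches the wording above: a rejected key is checked again on the next run rather than being locked out for a day.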

LLM credentials

Copy the credentials template and fill in your provider key:

# from the repo root
cp photoagents/config/keys_template.py credentials.py
# then edit credentials.py and uncomment one of the provider configs

The runtime also accepts a JSON form (credentials.json) with the same shape.

Run

# Interactive REPL on your terminal
python -m photoagents

# One-shot file-IO mode
python -m photoagents --task my_task --input "List the largest fil