# AbstractVision — llms-full

> Single-file, agent-oriented context bundle generated from this repository’s docs and metadata.

How to use:
- If you only need a map of where to look, use `llms.txt`.
- If you need an all-in-one context bundle (for offline models or constrained retrieval), use this file.

Notes:
- This is a **convention** (many projects publish an `llms-full.txt`). It is not part of the core `llms.txt` spec.
- Relative links inside each file section are authored for that file’s original location. Use the `FILE: …` marker to interpret link paths.
- If you change docs or packaging metadata, regenerate this file by running:
  - `python scripts/generate_llms_full.py`

Included files:
- `llms.txt`
- `README.md`
- `docs/README.md`
- `docs/getting-started.md`
- `docs/api.md`
- `docs/architecture.md`
- `docs/faq.md`
- `docs/reference/backends.md`
- `docs/reference/configuration.md`
- `docs/reference/capabilities-registry.md`
- `docs/reference/artifacts.md`
- `docs/reference/abstractcore-integration.md`
- `playground/README.md`
- `CONTRIBUTING.md`
- `SECURITY.md`
- `ACKNOWLEDGMENTS.md`
- `CHANGELOG.md`
- `pyproject.toml`

--- 8< --- FILE: llms.txt --- 8< ---
# AbstractVision

> Model-agnostic generative vision API (images, optional video) with a capability registry, artifact-ref outputs, and backends for OpenAI-compatible HTTP, Diffusers, and stable-diffusion.cpp.

This repository’s current source of truth is the code under `src/abstractvision/` (docs in `docs/`).

Format note: this file follows the `llms.txt` Markdown spec (H1 + optional summary/details + H2 “file list” sections; the `## Optional` section can be skipped when you need a shorter context). Spec: https://llmstxt.org/#format

Maintenance tips:
- Keep link descriptions concise and unambiguous; avoid unexplained jargon.
- Regenerate `llms-full.txt` after doc/packaging changes: `python scripts/generate_llms_full.py`.

Agent quickstart (choose the path that matches your goal):
- **Use the library (Python / CLI)**: start with `README.md` → `docs/getting-started.md` → `docs/api.md` → `docs/reference/backends.md`.
- **Integrate with AbstractCore/Runtime**: read `docs/reference/abstractcore-integration.md` and `docs/reference/artifacts.md`.
- **Inspect provider model catalogs**: use `abstractvision provider-models`, `VisionManager.list_provider_models(...)`, or `llm.vision.list_provider_models(...)`; catalog listing is explicit and does not select the active model.
- **Need a single file**: open `llms-full.txt` (generated bundle of the core docs).
- **Need local Diffusers**: install `abstractvision[diffusers]`, set `ABSTRACTVISION_BACKEND=diffusers`, and pre-download `runwayml/stable-diffusion-v1-5` (or set `ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1` to allow runtime downloads).

Reality checks (current shipped behavior, anchored in code):
- Built-in backends implement `text_to_image` and `image_to_image`.
- `text_to_video` and `image_to_video` are supported only via the OpenAI-compatible backend when video endpoints are configured.
- `multi_view_image` exists in the API but no built-in backend implements it yet.
- Provider `/models` catalogs can be listed explicitly, including through the AbstractCore plugin boundary; they are not used for automatic model selection.

## Documentation

- [llms-full.txt](llms-full.txt): single-file bundle of the core docs (for agent ingestion)
- [README.md](README.md): overview, install, quickstart
- [docs/README.md](docs/README.md): docs index (map)
- [docs/getting-started.md](docs/getting-started.md): first image (OpenAI-compatible HTTP / Diffusers / sdcpp) + Playground
- [docs/api.md](docs/api.md): public Python API surface
- [docs/architecture.md](docs/architecture.md): how components fit together (with diagrams)
- [docs/faq.md](docs/faq.md): common questions + troubleshooting
- [docs/reference/backends.md](docs/reference/backends.md): backend support matrix + config notes
- [docs/reference/configuration.md](docs/reference/configuration.md): CLI/REPL commands + `ABSTRACTVISION_*` env vars
- [docs/reference/capabilities-registry.md](docs/reference/capabilities-registry.md): capability registry format + usage
- [docs/reference/artifacts.md](docs/reference/artifacts.md): artifact refs + stores
- [docs/reference/abstractcore-integration.md](docs/reference/abstractcore-integration.md): AbstractCore plugin + tool helpers
- [playground/README.md](playground/README.md): self-contained local web UI/API served by `abstractvision playground`
- [CONTRIBUTING.md](CONTRIBUTING.md): dev setup + tests + contribution guidelines
- [SECURITY.md](SECURITY.md): responsible vulnerability reporting
- [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md): upstream libraries/projects

## AbstractFramework ecosystem

- [AbstractFramework](https://github.com/lpalbou/AbstractFramework): ecosystem hub (how components fit together)
- [AbstractCore](https://github.com/lpalbou/abstractcore): orchestration + tool calling (AbstractVision integrates via plugin/tools)
- [AbstractRuntime](https://github.com/lpalbou/abstractruntime): runtime services (artifact store integration via adapter)

## Code entry points

- [src/abstractvision/vision_manager.py](src/abstractvision/vision_manager.py): `VisionManager` orchestrator API
- [src/abstractvision/types.py](src/abstractvision/types.py): request/response dataclasses (`ImageGenerationRequest`, `GeneratedAsset`, …)
- [src/abstractvision/errors.py](src/abstractvision/errors.py): error types (`CapabilityNotSupportedError`, …)
- [src/abstractvision/backends/base_backend.py](src/abstractvision/backends/base_backend.py): `VisionBackend` contract
- [src/abstractvision/backends/__init__.py](src/abstractvision/backends/__init__.py): lazy imports (keeps `import abstractvision` import-light)
- [src/abstractvision/backends/openai_compatible.py](src/abstractvision/backends/openai_compatible.py): OpenAI-compatible HTTP backend (+ optional video)
- [src/abstractvision/backends/huggingface_diffusers.py](src/abstractvision/backends/huggingface_diffusers.py): local Diffusers backend (T2I/I2I)
- [src/abstractvision/backends/stable_diffusion_cpp.py](src/abstractvision/backends/stable_diffusion_cpp.py): stable-diffusion.cpp backend (GGUF via `sd-cli` or python bindings)
- [src/abstractvision/model_capabilities.py](src/abstractvision/model_capabilities.py): capability registry loader + validator
- [src/abstractvision/artifacts.py](src/abstractvision/artifacts.py): artifact refs + stores (`LocalAssetStore`, `RuntimeArtifactStoreAdapter`)
- [src/abstractvision/cli.py](src/abstractvision/cli.py): CLI/REPL (`abstractvision`)
- [src/abstractvision/playground_server.py](src/abstractvision/playground_server.py): self-contained playground server and `/v1/vision/*` jobs
- [src/abstractvision/playground/vision_playground.html](src/abstractvision/playground/vision_playground.html): packaged playground UI asset
- [src/abstractvision/integrations/abstractcore_plugin.py](src/abstractvision/integrations/abstractcore_plugin.py): AbstractCore capability plugin entry point and provider catalog shim
- [src/abstractvision/integrations/abstractcore.py](src/abstractvision/integrations/abstractcore.py): AbstractCore tool helpers (`make_vision_tools`)

## Testing

- [Test suite](tests/): run `python -m unittest discover -s tests -p "test_*.py" -q`
- [Changelog](CHANGELOG.md): release notes
- [pyproject.toml](pyproject.toml): dependencies/extras + entry points
- [scripts/generate_llms_full.py](scripts/generate_llms_full.py): regenerate `llms-full.txt`

## Optional

- [Engineering backlog](docs/backlog/README.md): internal design notes + completion reports
--- >8 --- END FILE: llms.txt --- >8 ---

--- 8< --- FILE: README.md --- 8< ---
# AbstractVision

[![PyPI version](https://img.shields.io/pypi/v/abstractvision.svg)](https://pypi.org/project/abstractvision/)
[![CI](https://github.com/lpalbou/AbstractVision/actions/workflows/ci.yml/badge.svg)](https://github.com/lpalbou/AbstractVision/actions/workflows/ci.yml)
[![Tested Python](https://img.shields.io/badge/dynamic/yaml?url=https%3A%2F%2Fraw.githubusercontent.com%2Flpalbou%2FAbstractVision%2Fmain%2F.github%2Fworkflows%2Fci.yml&query=%24.jobs.test.strategy.matrix%5B%22python-version%22%5D&label=tested%20python&color=blue)](https://github.com/lpalbou/AbstractVision/actions/workflows/ci.yml)
[![license](https://img.shields.io/github/license/lpalbou/AbstractVision)](https://github.com/lpalbou/AbstractVision/blob/main/LICENSE)
[![GitHub stars](https://img.shields.io/github/stars/lpalbou/AbstractVision?style=social)](https://github.com/lpalbou/AbstractVision/stargazers)

Model-agnostic generative vision API (images, optional video) for Python and the Abstract* ecosystem.

## What you get

- A small orchestration API: [`VisionManager`](src/abstractvision/vision_manager.py)
- A packaged capability registry (“what models can do”): [`VisionModelCapabilitiesRegistry`](src/abstractvision/model_capabilities.py) backed by [`vision_model_capabilities.json`](src/abstractvision/assets/vision_model_capabilities.json)
- Optional artifact-ref outputs (small JSON refs): [`LocalAssetStore`](src/abstractvision/artifacts.py) and [`RuntimeArtifactStoreAdapter`](src/abstractvision/artifacts.py)
- Built-in backends (execution engines): [`src/abstractvision/backends/`](src/abstractvision/backends/)
  - OpenAI-compatible HTTP: [`openai_compatible.py`](src/abstractvision/backends/openai_compatible.py)
  - Local Diffusers: [`huggingface_diffusers.py`](src/abstractvision/backends/huggingface_diffusers.py)
  - Local stable-diffusion.cpp / GGUF: [`stable_diffusion_cpp.py`](src/abstractvision/backends/stable_diffusion_cpp.py)
- CLI/REPL for manual testing: [`abstractvision`](src/abstractvision/cli.py)
- Self-contained local Playground UI/API: [`playground/vision_playground.html`](playground/vision_playground.html) (docs: [`playground/README.md`](playground/README.md))

## How it fits together (diagram)

```mermaid
flowchart LR
  Caller[Python / CLI / AbstractCore] --> VM[VisionManager]
  VM --> BE[VisionBackend]
  BE --> VM
  VM -->|optional| Store[MediaStore]
  Store --> Ref[Artifact ref dict]
  VM -->|no store| Asset["GeneratedAsset (bytes + mime)"]
```

## Status (current backend support)

- Development status: **Alpha** (0.x). The public API is stable-by-design, but breaking changes may still happen and will be called out in `CHANGELOG.md`.
- Built-in backends implement: `text_to_image` and `image_to_image`.
- Video (`text_to_video`, `image_to_video`) is supported only via the OpenAI-compatible backend **when** endpoints are configured.
- `multi_view_image` is part of the public API (`VisionManager.generate_angles`) but no built-in backend implements it yet.

Details: [`docs/reference/backends.md`](docs/reference/backends.md).

## Installation

```bash
pip install abstractvision
```

The base install is lightweight. It includes the shared API, capability
registry, artifact helpers, CLI, AbstractCore plugin entry point, and the
stdlib OpenAI-compatible HTTP backend. Local inference runtimes are explicit
extras.

Optional extras:

| Extra | Use |
|---|---|
| `abstractvision[openai]` | Official OpenAI provider intent marker; no SDK dependency today. |
| `abstractvision[openai-compatible]` | Generic local/remote OpenAI-shaped endpoint intent marker; stdlib-only today. |
| `abstractvision[diffusers]` | Install Torch/Diffusers and related packages for local Diffusers generation. |
| `abstractvision[huggingface]` | Compatibility alias for callers that still request the historical Diffusers extra. |
| `abstractvision[sdcpp]` | Install `stable-diffusion-cpp-python` for the pip binding fallback. |
| `abstractvision[local]` | Convenience for both local backend dependency sets, including `diffusers` and `sdcpp`. |
| `abstractvision[all]` | All runtime backend dependencies, without contributor tooling. |
| `abstractvision[abstractcore]` | Compatibility marker only; AbstractCore is still supplied by the host application. |

Contributor-only extras:

| Extra | Use |
|---|---|
| `abstractvision[diffusers-dev]` / `abstractvision[huggingface-dev]` | Looser dependency pins for newer/unreleased Diffusers pipelines; install Diffusers `main` separately if needed. |
| `abstractvision[test]` | Local test dependencies. |
| `abstractvision[docs]` | Documentation build tooling. |
| `abstractvision[dev]` | Full contributor workflow: tests, docs, build, lint, formatting, and pre-commit. Do not use this as an application runtime profile. |

Note (CUDA): on Windows/Linux, `pip install "abstractvision[diffusers]"` may install a CPU-only PyTorch build. If you want to use an NVIDIA GPU, install a CUDA-enabled PyTorch build first (see <https://pytorch.org/get-started/locally/>) and verify `torch.cuda.is_available()` is `True`.
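
A quick sanity check (standard PyTorch call) that the installed build can actually see your GPU:

```bash
python -c "import torch; print(torch.cuda.is_available())"
```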

AbstractCore is not installed by AbstractVision. When an AbstractCore application
has AbstractVision installed in the same environment, AbstractCore can discover
the plugin entry point and use the integration modules lazily.

If you hit “missing pipeline class” errors for newer model families, see [`docs/getting-started.md`](docs/getting-started.md). In that case you may need Diffusers from source (`main`):

```bash
pip install -U "abstractvision[diffusers-dev]"
pip install -U "git+https://github.com/huggingface/diffusers@main"
```

For local development from a repo checkout:

```bash
pip install -e ".[dev]"
```

## Usage

Start here:
- Getting started: [`docs/getting-started.md`](docs/getting-started.md)
- FAQ: [`docs/faq.md`](docs/faq.md)
- API reference: [`docs/api.md`](docs/api.md)
- Architecture: [`docs/architecture.md`](docs/architecture.md)
- Docs index: [`docs/README.md`](docs/README.md)

### First local model (Diffusers / cross-platform)

Install the local runtime extra, pre-download the model outside the REPL, then
select the Diffusers backend explicitly:

```bash
pip install "abstractvision[diffusers]"
huggingface-cli download runwayml/stable-diffusion-v1-5
export ABSTRACTVISION_BACKEND=diffusers
export ABSTRACTVISION_MODEL_ID=runwayml/stable-diffusion-v1-5
export ABSTRACTVISION_DIFFUSERS_DEVICE=auto
abstractvision repl
```

For a fresh cache, you can also permit the REPL to download missing files:

```bash
ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1 abstractvision repl
```

More recommendations by VRAM: [`docs/getting-started.md`](docs/getting-started.md).

### Capability-driven model selection

```python
from abstractvision import VisionModelCapabilitiesRegistry

reg = VisionModelCapabilitiesRegistry()
assert reg.supports("runwayml/stable-diffusion-v1-5", "text_to_image")

print(reg.list_tasks())
print(reg.models_for_task("text_to_image"))
```

### Backend wiring + generation (artifact outputs)

The base install is import-light and does not install Torch/Diffusers. Heavy
local backend modules are imported lazily (see [`src/abstractvision/backends/__init__.py`](src/abstractvision/backends/__init__.py)).
Install `abstractvision[diffusers]` for local Diffusers, or
`abstractvision[sdcpp]` for the optional stable-diffusion.cpp python binding
fallback.

```python
from abstractvision import LocalAssetStore, VisionManager, VisionModelCapabilitiesRegistry, is_artifact_ref
from abstractvision.backends import OpenAICompatibleBackendConfig, OpenAICompatibleVisionBackend

reg = VisionModelCapabilitiesRegistry()

backend = OpenAICompatibleVisionBackend(
    config=OpenAICompatibleBackendConfig(
        base_url="http://localhost:1234/v1",
        api_key="YOUR_KEY",      # optional for local servers
        model_id="REMOTE_MODEL", # optional (server-dependent)
    )
)

vm = VisionManager(
    backend=backend,
    store=LocalAssetStore(),         # enables artifact-ref outputs
    model_id="zai-org/GLM-Image",    # optional: capability gating
    registry=reg,                   # optional: reuse loaded registry
)

out = vm.generate_image("a cinematic photo of a red fox in snow")
assert is_artifact_ref(out)
print(out)  # {"$artifact": "...", "content_type": "...", ...}

png_bytes = vm.store.load_bytes(out["$artifact"])  # type: ignore[union-attr]
```

When installed next to AbstractCore, AbstractVision is also discovered as an
`llm.vision` capability plugin:

- The plugin defaults to the official OpenAI image endpoint (`https://api.openai.com/v1`) and reads `OPENAI_API_KEY` (or `ABSTRACTVISION_API_KEY`).
- Set `OPENAI_BASE_URL` only when you need to override that OpenAI-compatible base for the official OpenAI profile.
- Set `ABSTRACTVISION_BACKEND=openai-compatible` plus `ABSTRACTVISION_BASE_URL` for a local or remote compatible `/v1` server.
- Set `ABSTRACTVISION_MODEL_ID`, `OPENAI_IMAGE_MODEL_ID`, or `OPENAI_IMAGE_MODEL` when you need an explicit image model (static default OpenAI model: `gpt-image-1`).
- AbstractVision does not query provider `/models` catalogs to discover or select image models automatically; inspect them explicitly with `abstractvision provider-models`, `VisionManager.list_provider_models(...)`, or the AbstractCore plugin method `llm.vision.list_provider_models(...)`, then set the model env var explicitly when a newer provider model is available to your account.
- Set `ABSTRACTVISION_BACKEND=diffusers` or `ABSTRACTVISION_BACKEND=sdcpp` when you want AbstractCore to launch local AbstractVision generation directly.
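
For example, a minimal plugin-environment setup pointing at a local OpenAI-compatible server (the base URL and model id below are placeholders for your own server):

```bash
export ABSTRACTVISION_BACKEND=openai-compatible
export ABSTRACTVISION_BASE_URL=http://localhost:1234/v1   # your local /v1 server
export ABSTRACTVISION_MODEL_ID=REMOTE_MODEL               # server-dependent model id
```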

### Interactive testing (CLI / REPL)

```bash
abstractvision models
abstractvision provider-models --openai --task text_to_image
abstractvision provider-models --base-url http://localhost:1234/v1 --task text_to_image
abstractvision tasks
abstractvision show-model runwayml/stable-diffusion-v1-5

abstractvision repl
```

Inside the REPL:

```text
/t2i "a watercolor painting of a lighthouse" --width 512 --height 512 --steps 10 --open
```

For a newer but still relatively small local model, try `black-forest-labs/FLUX.2-klein-4B` after installing Diffusers
from source (see [`docs/getting-started.md`](docs/getting-started.md)):

```text
/backend diffusers black-forest-labs/FLUX.2-klein-4B mps float16
/t2i "a product photo of a matte black espresso machine" --steps 4 --guidance-scale 1.0 --open
```

OpenAI-compatible server example:

```text
/backend openai http://localhost:1234/v1
/t2i "a watercolor painting of a lighthouse" --width 512 --height 512 --steps 10 --open
```

The CLI/REPL can also be configured via `ABSTRACTVISION_*` env vars; see [`docs/reference/configuration.md`](docs/reference/configuration.md).

### Local web playground

The playground is owned by AbstractVision and runs without AbstractCore. It is
a local/dev testing surface; use AbstractCore/Gateway for production routing,
authentication, and browser-origin policy.

```bash
abstractvision playground --port 8091
```

Open `http://127.0.0.1:8091/vision_playground.html`, select a cached model, then load it. The page and the API are served by the same process.

One-shot commands (OpenAI-compatible HTTP backend only):

```bash
abstractvision t2i --base-url http://localhost:1234/v1 "a studio photo of an espresso machine"
abstractvision i2i --base-url http://localhost:1234/v1 --image ./input.png "make it watercolor"
```

#### Local GGUF via stable-diffusion.cpp

If you want to run GGUF diffusion models locally, use the stable-diffusion.cpp backend (`sdcpp`). Start with a
single-file Stable Diffusion model when possible; Qwen Image and FLUX GGUF component sets are heavier.

Recommended:
- **macOS (Apple Silicon / Metal)**: install `sd-cli` (stable-diffusion.cpp executable) from releases and use CLI mode for Metal acceleration.
- Otherwise (pip-only convenience): `pip install "abstractvision[sdcpp]"` installs the stable-diffusion.cpp python bindings (`stable-diffusion-cpp-python`), but this may run CPU-only depending on the wheel build.

Alternative (external executable):

- Install `sd-cli`: <https://github.com/leejet/stable-diffusion.cpp/releases>

In the REPL:

```text
/backend sdcpp /path/to/sd-v1-5.gguf /path/to/sd-cli
/t2i "a watercolor painting of a lighthouse" --width 512 --height 512 --steps 10 --open
```

FLUX.2-klein-4B GGUF component example:

```text
/backend sdcpp /path/to/flux-2-klein-4b-Q8_0.gguf /path/to/flux2_ae.safetensors /path/to/Qwen3-4B-Q4_K_M.gguf /path/to/sd-cli
/t2i "a product photo of a matte black espresso machine" --steps 4 --guidance-scale 1.0 --sampling-method euler --diffusion-fa --offload-to-cpu --open
```

Extra flags are forwarded via `request.extra`. In CLI mode they are forwarded to `sd-cli`; in python bindings mode, keys are mapped to python binding kwargs when supported and unsupported keys are ignored.

### AbstractCore tool integration (artifact refs)

If you’re using AbstractCore tool calling, AbstractVision can expose vision tasks as tools:

```python
from abstractvision.integrations.abstractcore import make_vision_tools

tools = make_vision_tools(vision_manager=vm, model_id="zai-org/GLM-Image")
```

Install `abstractcore` in the host application environment when you use these helpers; it is not pulled in by AbstractVision.

## AbstractFramework ecosystem

AbstractVision is part of the **AbstractFramework** ecosystem and is designed to compose with:

- **AbstractFramework** (project hub): <https://github.com/lpalbou/AbstractFramework>
- **AbstractCore** (orchestration + tool calling): <https://github.com/lpalbou/abstractcore>
- **AbstractRuntime** (runtime services, including artifact storage): <https://github.com/lpalbou/abstractruntime>

In practice:
- AbstractVision standardizes *generative vision outputs* (image/video) behind `VisionManager`.
- AbstractCore can discover and use AbstractVision via the capability plugin (`src/abstractvision/integrations/abstractcore_plugin.py`) or you can expose vision tasks as tools (`src/abstractvision/integrations/abstractcore.py`).
- Artifact refs returned by AbstractVision are designed to travel across processes; `RuntimeArtifactStoreAdapter` bridges to an AbstractRuntime-style artifact store (`src/abstractvision/artifacts.py`).

## Project

- Release notes: [`CHANGELOG.md`](CHANGELOG.md)
- Contributing: [`CONTRIBUTING.md`](CONTRIBUTING.md)
- Security: [`SECURITY.md`](SECURITY.md)
- Acknowledgments: [`ACKNOWLEDGMENTS.md`](ACKNOWLEDGMENTS.md)
- Agent docs: [`llms.txt`](llms.txt) and [`llms-full.txt`](llms-full.txt)

## Requirements

- Python >= 3.9

## License

MIT License - see LICENSE file for details.

## Author

Laurent-Philippe Albou

## Contact

contact@abstractcore.ai
--- >8 --- END FILE: README.md --- >8 ---

--- 8< --- FILE: docs/README.md --- 8< ---
# AbstractVision documentation

This folder contains the user-facing documentation for `abstractvision`.

## Start here (new users)

1) [Project overview + quickstart](../README.md)  
2) [Getting started](getting-started.md) (first image with Stable Diffusion 1.5; then klein-4B, GGUF, OpenAI-compatible HTTP, Playground)
3) [Architecture](architecture.md) (how the pieces fit together)

## Quick reference

- [FAQ](faq.md)
- [API reference](api.md)
- [Backends](reference/backends.md)
- [Configuration (CLI/REPL env vars + flags)](reference/configuration.md)
- [Capability registry (`vision_model_capabilities.json`)](reference/capabilities-registry.md)
- [Artifacts (artifact refs + stores)](reference/artifacts.md)
- [AbstractCore integration (capability plugin + tools)](reference/abstractcore-integration.md)
- Agent-oriented docs: [`../llms.txt`](../llms.txt) and [`../llms-full.txt`](../llms-full.txt)

## AbstractFramework ecosystem

AbstractVision is part of the **AbstractFramework** ecosystem and is designed to compose with:

- **AbstractFramework** (project hub): <https://github.com/lpalbou/AbstractFramework>
- **AbstractCore** (orchestration + tool calling): <https://github.com/lpalbou/abstractcore>
- **AbstractRuntime** (runtime services, including artifact storage): <https://github.com/lpalbou/abstractruntime>

## Current implementation status (as shipped)

Public API surface: [`VisionManager`](../src/abstractvision/vision_manager.py) exposes:
- `generate_image` (`text_to_image`), `edit_image` (`image_to_image`)
- `generate_video` (`text_to_video`), `image_to_video` (`image_to_video`) (backend-dependent)
- `generate_angles` (`multi_view_image`) (API exists; no built-in backend implements it yet)

Built-in backends implement:
- **Images**: Diffusers, stable-diffusion.cpp, OpenAI-compatible HTTP ([`../src/abstractvision/backends/`](../src/abstractvision/backends/))
- **Video**: OpenAI-compatible HTTP only, and only when endpoints are configured ([`openai_compatible.py`](../src/abstractvision/backends/openai_compatible.py))

If you’re looking for “what can model X do?”, the single source of truth is the packaged registry:
[`../src/abstractvision/assets/vision_model_capabilities.json`](../src/abstractvision/assets/vision_model_capabilities.json) (loaded by `VisionModelCapabilitiesRegistry` in [`../src/abstractvision/model_capabilities.py`](../src/abstractvision/model_capabilities.py)).

## Internal engineering notes

[`docs/backlog/`](backlog/) is an internal log (planned work + completion reports). It is not the normative user documentation surface.

## Project

- Release notes: [`CHANGELOG.md`](../CHANGELOG.md)
- Contributing: [`CONTRIBUTING.md`](../CONTRIBUTING.md)
- Security: [`SECURITY.md`](../SECURITY.md)
- License: [`LICENSE`](../LICENSE)
- Acknowledgments: [`ACKNOWLEDGMENTS.md`](../ACKNOWLEDGMENTS.md)
--- >8 --- END FILE: docs/README.md --- >8 ---

--- 8< --- FILE: docs/getting-started.md --- 8< ---
# Getting Started

This guide helps you generate your first image using AbstractVision with the built-in backends:

- **OpenAI-compatible HTTP**: call a local/remote server that exposes OpenAI-shaped image endpoints
- **Diffusers (local Python)**: Stable Diffusion / Qwen Image / FLUX 2 / GLM-Image (and other Diffusers pipelines)
- **stable-diffusion.cpp (local GGUF)**: GGUF diffusion models via `sd-cli` (recommended for GPU backends like **Metal**/**CUDA**) or via pip-installable python bindings (often **CPU-only** fallback)
- **Playground (web, optional)**: self-contained AbstractVision UI/API for local model loading and jobs (`/v1/vision/*`)

See also:
- Docs index: [docs/README.md](README.md)
- FAQ: [docs/faq.md](faq.md)
- API reference: [docs/api.md](api.md)
- Architecture: [docs/architecture.md](architecture.md)
- Backends: [docs/reference/backends.md](reference/backends.md)
- Configuration (CLI/REPL env vars): [docs/reference/configuration.md](reference/configuration.md)
- Capability registry: [docs/reference/capabilities-registry.md](reference/capabilities-registry.md)
- Artifacts: [docs/reference/artifacts.md](reference/artifacts.md)
- AbstractCore integration: [docs/reference/abstractcore-integration.md](reference/abstractcore-integration.md)

---

## 0) Install

From PyPI:

```bash
pip install abstractvision
```

AbstractVision’s base install is lightweight. It includes the shared API, capability registry, artifact helpers, CLI, AbstractCore plugin entry point, and stdlib OpenAI-compatible HTTP backend. Local inference runtimes are explicit extras: install `abstractvision[diffusers]` for Torch/Diffusers, `abstractvision[sdcpp]` for the stable-diffusion.cpp python binding fallback, or `abstractvision[local]` for both.
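
For example, to add a local runtime on top of the base install:

```bash
pip install "abstractvision[diffusers]"   # local Torch/Diffusers backend
pip install "abstractvision[sdcpp]"       # stable-diffusion.cpp python binding fallback
pip install "abstractvision[local]"       # both local backend dependency sets
```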

If you see “missing pipeline class” errors for newer model families, install the `diffusers-dev` extra (or compatibility alias `huggingface-dev`) to get compatible dependencies, then install Diffusers from source (`main`).

For that newer-pipeline workflow from a **repo checkout**, install the `diffusers-dev` extra (compatible deps; does not include Diffusers `main`):

```bash
pip install -e ".[diffusers-dev]"
```

If you're installing **AbstractVision from PyPI**, you can install the extra directly:

```bash
pip install -U "abstractvision[diffusers-dev]"
```

Or install Diffusers from source directly:

```bash
pip install -U "git+https://github.com/huggingface/diffusers@main"
```

Sanity check:

```bash
python -c "import diffusers; print(diffusers.__version__)"
python -c "import diffusers; print('GlmImagePipeline', hasattr(diffusers, 'GlmImagePipeline')); print('Flux2KleinPipeline', hasattr(diffusers, 'Flux2KleinPipeline'))"
```

Offline alternative (if you already have a local Diffusers checkout):

```bash
pip install -U -e /path/to/diffusers
```

Or, from a repo checkout (run in the repo root):

```bash
pip install -e .
```

For contributor tooling from a repo checkout, use:

```bash
pip install -e ".[dev]"
```

For local Diffusers generation, install `abstractvision[diffusers]` before selecting the `diffusers` backend. Use `diffusers-dev` only when you need newer Diffusers-compatible dependency pins, and use `sdcpp` only when you want the optional stable-diffusion.cpp python binding fallback.

Optional extras:

| Extra | Use |
|---|---|
| `openai` | Empty official OpenAI provider intent marker; the HTTP backend is stdlib-only today. |
| `openai-compatible` | Empty local/remote OpenAI-shaped endpoint intent marker; the HTTP backend is stdlib-only today. |
| `diffusers` | Installs Torch/Diffusers and related packages for local Diffusers generation. |
| `sdcpp` | Installs `stable-diffusion-cpp-python` for the stable-diffusion.cpp pip binding fallback. |
| `huggingface` | Compatibility alias for the historical Diffusers backend dependency set. |
| `local` | Convenience extra for both local backend dependency sets, including `sdcpp`. |
| `all` | All runtime backend dependencies, without contributor tooling. |
| `abstractcore` | Empty compatibility marker; install AbstractCore in the host application environment. |

Contributor-only extras:

| Extra | Use |
|---|---|
| `diffusers-dev` / `huggingface-dev` | Looser dependency pins for newer/unreleased Diffusers pipelines. Install Diffusers `main` separately when a pipeline is not in the latest release. |
| `test` | Local test dependencies. |
| `docs` | Documentation build tooling. |
| `dev` | Full contributor workflow: tests, docs, packaging, formatting, release checks, and pre-commit. Do not use this as an application runtime profile. |

Optional (recommended): pre-download heavyweight model sets (so first-run doesn’t do surprise multi‑GB downloads):

```bash
python scripts/download_model_sets.py --list
python scripts/download_model_sets.py --plan --set sd15_diffusers
python scripts/download_model_sets.py --plan --set flux2_klein_4b_gguf
python scripts/download_model_sets.py --set sd15_diffusers
```

### 0.1 Hardware quickstart (macOS Metal vs NVIDIA CUDA vs CPU)

AbstractVision can run “locally” via two main routes:

- **Diffusers backend**: uses Torch device selection (`cuda` / `mps` / `cpu`).
- **stable-diffusion.cpp backend (`sdcpp`)**: runs GGUF diffusion models using:
  - `sd-cli` (**recommended** when you want GPU backends like **Metal** or **CUDA**)
  - or `stable-diffusion-cpp-python` (convenient, but often **CPU-only**, especially on macOS)

#### macOS (Apple Silicon, Metal)

- **Diffusers**: start with Stable Diffusion 1.5, then move up:
  - `/backend diffusers runwayml/stable-diffusion-v1-5 mps float16`
  - `/backend diffusers black-forest-labs/FLUX.2-klein-4B mps float16` (requires Diffusers `main` today)
- **GGUF (`sdcpp`)**: install `sd-cli` from stable-diffusion.cpp releases and use **CLI mode** for Metal speed:
  - Download: <https://github.com/leejet/stable-diffusion.cpp/releases>
  - Pick the Darwin arm64 zip (example asset name: `sd-…-bin-Darwin-macOS-…-arm64.zip`)
  - If macOS blocks execution, clear quarantine: `xattr -dr com.apple.quarantine /path/to/sd-cli`
  - In the REPL, pass the full path as the last arg to `/backend sdcpp …` (see section **6)**).

If you see `Using CPU backend` in logs, you’re on CPU (it will work, but can be extremely slow for large models).

#### NVIDIA (CUDA)

- Install a CUDA-enabled PyTorch wheel first (see <https://pytorch.org/get-started/locally/>).
- Use Diffusers with `cuda` + `float16`:
  - `/backend diffusers runwayml/stable-diffusion-v1-5 cuda float16`
- For GGUF (`sdcpp`) on NVIDIA, use an `sd-cli` build compiled with CUDA (stable-diffusion.cpp releases provide multiple assets depending on tag).

#### CPU-only

- Expect slow inference. Prefer smaller models and lower resolutions/steps.
- `sdcpp` via python bindings is the simplest “no external binary” option, but it will use whatever backend the wheel was compiled with (often CPU).

---

## Recommended default models (VRAM guide)

If you run **locally** (Diffusers backend) and want a reliable starting point, here are practical model picks from the packaged capability registry (`src/abstractvision/assets/vision_model_capabilities.json`).

Notes:
- VRAM needs vary with resolution, dtype, and pipeline implementation. Treat this as a starting point.
- Some models are **gated** on Hugging Face and require accepting terms + setting `HF_TOKEN`.
- If you want a non-gated modern image model, try `black-forest-labs/FLUX.2-klein-4B` (but it currently requires installing Diffusers from source; see the FLUX section below).

| GPU VRAM | Recommended model id | Why | Install / quickstart |
|---:|---|---|---|
| ≤ 16 GB | `runwayml/stable-diffusion-v1-5` | Small, stable, and widely compatible (Windows/Linux CUDA, macOS MPS) | `pip install "abstractvision[diffusers]"` then run the REPL using the snippet below |
| 24-32 GB | `black-forest-labs/FLUX.2-klein-4B` | Newer non-gated model, much smaller than FLUX.2-dev | Install Diffusers `main`, then use the FLUX.2 klein section below |
| 32 GB | `stabilityai/stable-diffusion-3.5-large-turbo` | High-quality still images with low step counts (gated) | Accept model terms on HF, set `HF_TOKEN`, then use the SD3.5 section below |
| 64 GB | `Qwen/Qwen-Image-2512` | Strong prompt following and text rendering (large model) | Same as Diffusers setup; if pipeline import fails, use Diffusers `main` (see install section above) |
| 128 GB | `black-forest-labs/FLUX.2-dev` | Very high quality (very large; non-commercial license; gated) | Accept model terms on HF, set `HF_TOKEN`, then use the FLUX section below |

macOS Metal (Apple Silicon) quick picks:

- If you want **local quantized FLUX.2** on Metal: prefer **stable-diffusion.cpp** (GGUF) via the `sdcpp` backend (see section **6)**).
- If you want a fast local FLUX.2 for iteration: `black-forest-labs/FLUX.2-klein-4B` (or GGUF equivalents) is usually the most practical starting point.
- If you want strong prompt following + text rendering: `Qwen/Qwen-Image-2512` (Diffusers on `mps`, start with `float16`).

Recommended default (local, cross-platform) — Stable Diffusion 1.5:

```bash
pip install "abstractvision[diffusers]"
huggingface-cli download runwayml/stable-diffusion-v1-5
export ABSTRACTVISION_BACKEND=diffusers
export ABSTRACTVISION_MODEL_ID=runwayml/stable-diffusion-v1-5
export ABSTRACTVISION_DIFFUSERS_DEVICE=auto
abstractvision repl
```

Then type a prompt (plain text runs `/t2i`), or use `/t2i "..." --open`.

Jump to detailed recipes:
- Stable Diffusion 1.5: section **1) First local image (Diffusers)**
- FLUX.2-klein-4B: section **2) Next small model (FLUX.2-klein-4B)**
- OpenAI-compatible HTTP: section **2.1) OpenAI-compatible HTTP**
- Qwen Image: section **3) Qwen Image (Diffusers)**
- FLUX 2 details: section **4) FLUX 2 (Diffusers)**
- SD3.5: section **5) Stable Diffusion 3.5 (Diffusers, gated)**

---

## 1) First local image (Diffusers)

The REPL is cache-only by default, so it will not download model weights. Download the model separately first:

```bash
huggingface-cli download runwayml/stable-diffusion-v1-5
```

```bash
# Required for this local Diffusers recipe.
export ABSTRACTVISION_BACKEND=diffusers
export ABSTRACTVISION_MODEL_ID=runwayml/stable-diffusion-v1-5
export ABSTRACTVISION_DIFFUSERS_DEVICE=auto
# auto prefers cuda, then mps, then cpu. You can also set cuda/mps/cpu explicitly.
# Optional: override dtype (auto defaults to float16 on MPS for broad compatibility).
# - `float16` is usually the best speed/compatibility tradeoff on Apple Silicon
# - `bfloat16` can work for some models, but can trigger dtype-mismatch errors in some pipelines
# - `float32` is the most stable, but can require much more memory
# export ABSTRACTVISION_DIFFUSERS_TORCH_DTYPE=bfloat16
# export ABSTRACTVISION_DIFFUSERS_TORCH_DTYPE=float16
# export ABSTRACTVISION_DIFFUSERS_TORCH_DTYPE=float32
```

Quick sanity check (device):

```bash
python -c "import torch; print('mps', torch.backends.mps.is_available(), 'cuda', torch.cuda.is_available())"
```

If you have an NVIDIA GPU but `cuda` is `False`, you likely installed a CPU-only PyTorch build. Follow the PyTorch install guide to install a CUDA-enabled wheel, then re-run the sanity check: <https://pytorch.org/get-started/locally/>.

Start the REPL:

```bash
abstractvision repl
```

With `ABSTRACTVISION_BACKEND=diffusers` and `ABSTRACTVISION_MODEL_ID` set above, the REPL uses `runwayml/stable-diffusion-v1-5`:

```text
/set guidance_scale 7
/set seed 42
/t2i "a cinematic photo of a red fox in snow" --width 512 --height 512 --steps 10 --open
```

Adjust defaults with `/set …`, or pass flags per request:

```text
/t2i "a watercolor painting of a lighthouse" --width 512 --height 512 --steps 20 --seed 123 --guidance-scale 6.5 --open
```

---

## 2) Next small model (FLUX.2-klein-4B)

After Stable Diffusion 1.5 works, `black-forest-labs/FLUX.2-klein-4B` is the next recommended local test. It is
non-gated and much smaller than FLUX.2-dev, but it currently needs Diffusers from source because released Diffusers
may not include `Flux2KleinPipeline`.

```bash
pip install -U "abstractvision[diffusers-dev]"
pip install -U "git+https://github.com/huggingface/diffusers@main"
```

Quick REPL test:

```text
/backend diffusers black-forest-labs/FLUX.2-klein-4B mps float16
/t2i "a product photo of a matte black espresso machine" --width 1024 --height 1024 --steps 4 --guidance-scale 1.0 --open
```

Use `cuda float16` on NVIDIA, or `auto` if you want AbstractVision/Torch to pick the device.

---

## 2.1) OpenAI-compatible HTTP

Use this path if you already have a server that exposes OpenAI-shaped image endpoints (e.g. a local model server).

For unknown or local OpenAI-compatible servers, AbstractVision forwards local extension fields such as `steps`, `seed`, `guidance_scale`, `width`, and `height`. For the real OpenAI API and known GPT image models, it suppresses unsupported local-only fields and sends the narrower OpenAI request shape.

List provider-advertised models explicitly:

```bash
abstractvision provider-models --openai --task text_to_image
abstractvision provider-models --base-url http://localhost:1234/v1 --task text_to_image
```

One-shot (stores output via `LocalAssetStore` and prints an artifact ref + file path):

```bash
abstractvision t2i --base-url http://localhost:1234/v1 "a watercolor painting of a lighthouse" --width 512 --height 512 --steps 10 --open
```

Interactive REPL:

```bash
abstractvision repl
```

```text
/backend openai http://localhost:1234/v1
/t2i "a watercolor painting of a lighthouse" --width 512 --height 512 --steps 10 --open
```

If your server also supports video endpoints, configure them via `ABSTRACTVISION_TEXT_TO_VIDEO_PATH` / `ABSTRACTVISION_IMAGE_TO_VIDEO_PATH` (see [docs/reference/configuration.md](reference/configuration.md)).
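
For example (the variable names are the documented ones; the path values are placeholders, since the actual routes depend on your server):

```bash
export ABSTRACTVISION_TEXT_TO_VIDEO_PATH=/v1/videos/generations   # placeholder path
export ABSTRACTVISION_IMAGE_TO_VIDEO_PATH=/v1/videos/edits        # placeholder path
```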

---

## 3) Qwen Image (Diffusers)

Qwen Image models in the registry:

- `Qwen/Qwen-Image` (older)
- `Qwen/Qwen-Image-2512` (newer)

Use the same Diffusers flow:

```text
/backend diffusers Qwen/Qwen-Image-2512 mps float16
/t2i "a poster with the word 'ABSTRACT' rendered perfectly in bold typography" --width 512 --height 512 --steps 10 --guidance-scale 2.5 --open
```

Notes:
- Qwen Image models are **large**.
- For best results, prefer the model card’s recommended sizes (e.g. 1328x1328 for 1:1). For quick tests, 512x512 is fine.
- On Apple Silicon (MPS), start with fp16 (default; best compatibility):
  - `ABSTRACTVISION_DIFFUSERS_TORCH_DTYPE=float16` (or in the REPL: `/backend diffusers Qwen/Qwen-Image-2512 mps float16`)
- If you get NaNs/black images, try fp32 (this can require **very** large peak memory during load):
  - `ABSTRACTVISION_DIFFUSERS_TORCH_DTYPE=float32` (or in the REPL: `/backend diffusers Qwen/Qwen-Image-2512 mps float32`)
- On Apple Silicon (MPS), AbstractVision upcasts the VAE to fp32 when using fp16 to avoid common “black image” issues.
- Automatic fp32 retry on all-black output is enabled by default on MPS (can increase peak memory):
  - disable with `ABSTRACTVISION_DIFFUSERS_AUTO_RETRY_FP32=0`
- In AbstractVision, `--guidance-scale` is mapped to Qwen’s `true_cfg_scale` when using Qwen pipelines (CFG). If you set `--guidance-scale` but don’t provide a `negative_prompt`, AbstractVision passes a placeholder negative prompt (`" "`) so CFG is actually enabled.

Tip: keep `guidance_scale` relatively low for some modern DiT models.

---

## 3.1) LoRA + Rapid-AIO (Diffusers)

AbstractVision can apply LoRA adapters (Diffusers adapter system) and optionally swap in a distilled “Rapid-AIO”
transformer for faster Qwen Image Edit inference.

These features follow the Diffusers download setting. The REPL is cache-only by default, so pre-download adapters or
Rapid-AIO weights separately before using repo ids here. If you intentionally want runtime downloads, set:

```bash
export ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1
```

LoRA example (REPL; note: `loras_json` is forwarded via `request.extra`):

```text
/backend diffusers Qwen/Qwen-Image-Edit-2511 mps float16
/t2i "a cinematic photo of a red fox in snow" --steps 8 --guidance-scale 1 --loras_json '[{"source":"lightx2v/Qwen-Image-Edit-2511-Lightning","scale":1.0}]' --open
```

Rapid-AIO example (distilled transformer override; Qwen Image Edit):

```text
/backend diffusers Qwen/Qwen-Image-Edit-2511 mps float16
/t2i "a cinematic photo of a red fox in snow" --steps 4 --guidance-scale 1 --rapid_aio_repo linoyts/Qwen-Image-Edit-Rapid-AIO --open
```
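
The same parameters can be passed from the Python API through `extra` (a sketch reusing the backend config and `extra` keys documented in [docs/api.md](api.md); handling of unknown keys is backend-dependent):

```python
from abstractvision import VisionManager
from abstractvision.backends import (
    HuggingFaceDiffusersBackendConfig,
    HuggingFaceDiffusersVisionBackend,
)

backend = HuggingFaceDiffusersVisionBackend(
    config=HuggingFaceDiffusersBackendConfig(
        model_id="Qwen/Qwen-Image-Edit-2511",
        device="mps",          # or "cuda" / "auto"
        allow_download=False,  # cache-only: pre-download weights and adapters first
    )
)
vm = VisionManager(backend=backend)

out = vm.generate_image(
    "a cinematic photo of a red fox in snow",
    steps=8,
    guidance_scale=1.0,
    extra={
        "loras_json": [{"source": "lightx2v/Qwen-Image-Edit-2511-Lightning", "scale": 1.0}],
        # Optional distilled-transformer override:
        # "rapid_aio_repo": "linoyts/Qwen-Image-Edit-Rapid-AIO",
    },
)
```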

---

## 4) FLUX 2 (Diffusers)

FLUX 2 models in the registry:

- `black-forest-labs/FLUX.2-klein-4B` (Apache-2.0, not gated)
- `black-forest-labs/FLUX.2-klein-9B` (non-commercial license, gated on Hugging Face)
- `black-forest-labs/FLUX.2-dev` (non-commercial license, gated on Hugging Face)

Sanity check:

```bash
python -c "import diffusers; print(diffusers.__version__)"
```

Notes:
- `FLUX.2-dev` uses Diffusers `Flux2Pipeline` and works on released Diffusers (0.36+).
- `FLUX.2-klein-4B` and `FLUX.2-klein-9B` use `Flux2KleinPipeline`, which is not available in the released Diffusers (0.36.0). It currently
  requires installing Diffusers from source (with the `diffusers-dev` extra for compatible dependency pins):
  - `pip install -U "abstractvision[diffusers-dev]"`
  - `pip install -U "git+https://github.com/huggingface/diffusers@main"`

Recommended first FLUX example (`FLUX.2-klein-4B`, not gated):

```text
/backend diffusers black-forest-labs/FLUX.2-klein-4B mps float16
/t2i "a product photo of a matte black espresso machine" --width 1024 --height 1024 --steps 4 --guidance-scale 1.0 --seed 0 --open
```

Example (`FLUX.2-klein-9B`, gated; requires Diffusers `main` and HF access):

```text
/backend diffusers black-forest-labs/FLUX.2-klein-9B mps float16
/t2i "a minimalist product photo of a matte black espresso machine, studio lighting" --width 1024 --height 1024 --steps 4 --guidance-scale 1.0 --seed 0 --open
```

Example (`FLUX.2-dev`, gated; you must pre-download it into your HF cache first):

```text
/backend diffusers black-forest-labs/FLUX.2-dev mps
/t2i "a minimalist product photo of a matte black espresso machine, studio lighting" --width 1024 --height 1024 --steps 4 --guidance-scale 1.0 --seed 0 --open
```

If you use gated models (like `FLUX.2-dev`), you typically must accept the model’s terms on Hugging Face and set `HF_TOKEN` in your environment.

---

## 5) Stable Diffusion 3.5 (Diffusers, gated)

SD3.5 models (all gated on Hugging Face):

- `stabilityai/stable-diffusion-3.5-large-turbo`
- `stabilityai/stable-diffusion-3.5-large`
- `stabilityai/stable-diffusion-3.5-medium`

1) Accept the model terms on Hugging Face (in your browser).  
2) Export a token:

```bash
export HF_TOKEN=...   # your Hugging Face access token
```

Then in the REPL:

```text
/backend diffusers stabilityai/stable-diffusion-3.5-large-turbo mps
/t2i "a modern product photo of a watch, studio lighting" --width 1024 --height 1024 --steps 6 --guidance-scale 4 --seed 42 --open
```

Turbo models are usually best with low step counts (e.g. ~4–8).

---

## 6) GGUF diffusion models (stable-diffusion.cpp)

If you downloaded a GGUF diffusion model (like Qwen Image GGUF or FLUX.2 GGUF), Diffusers cannot load it. Use the stable-diffusion.cpp backend instead (either via pip-installed python bindings or `sd-cli`).

### 6.1 Install stable-diffusion.cpp runtime

The base `pip install abstractvision` path does not install local inference runtimes. Use one of these explicit stable-diffusion.cpp runtime choices:

```bash
pip install "abstractvision[sdcpp]"
```

This pip binding path is convenient, but it may require a native build or run **CPU-only** depending on how the wheel was built.

Alternative (external executable):

- Download `sd-cli` from: <https://github.com/leejet/stable-diffusion.cpp/releases>
- Ensure `sd-cli` is in your `PATH` (or pass a full path as the last arg to `/backend sdcpp …`).

On macOS (Apple Silicon), **`sd-cli` is the recommended path** to get **Metal** acceleration. If you see `Using CPU backend`,
install `sd-cli` and re-run in CLI mode.

### 6.2 Single-file Stable Diffusion model

This is the lowest-friction `sdcpp` shape: one model file plus an optional `sd-cli` path. Use it for Stable Diffusion
1.x/2.x/SDXL checkpoints or GGUF conversions that stable-diffusion.cpp can load as `--model`.

```bash
abstractvision repl
```

```text
/backend sdcpp /path/to/sd-v1-5.gguf /path/to/sd-cli
/t2i "a watercolor painting of a lighthouse" --width 512 --height 512 --steps 10 --open
```

If `sd-cli` is already in your `PATH`, you can omit the final `/path/to/sd-cli` argument. If `sd-cli` is not available at all,
AbstractVision falls back to the `stable-diffusion-cpp-python` bindings when that package is installed (for example via `pip install "abstractvision[sdcpp]"`).

### 6.3 Download the Qwen Image VAE (required for section 6.4)

```bash
curl -L -o ./qwen_image_vae.safetensors \
  https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors
```

### 6.4 Run Qwen Image with `sdcpp` component mode

```bash
abstractvision repl
```

Then:

```text
/backend sdcpp /path/to/qwen-image-2512-Q4_K_M.gguf ./qwen_image_vae.safetensors /path/to/Qwen2.5-VL-7B-Instruct-*.gguf /path/to/sd-cli
/set width 1024
/set height 1024
/t2i "a cinematic photo of a red fox in snow" --sampling-method euler --offload-to-cpu --diffusion-fa --flow-shift 3 --open
```

Any extra `--flag` you pass (like `--sampling-method euler`) is forwarded to the backend as `extra`.
- CLI mode: forwarded to `sd-cli`
- Python bindings mode: keys are mapped to python binding kwargs when supported; unsupported keys are ignored (see [`../src/abstractvision/backends/stable_diffusion_cpp.py`](../src/abstractvision/backends/stable_diffusion_cpp.py))
- Diffusers backend: only forwards kwargs that the pipeline `__call__` accepts; unknown keys are ignored (see [`../src/abstractvision/backends/huggingface_diffusers.py`](../src/abstractvision/backends/huggingface_diffusers.py))

### 6.5 FLUX.2-klein-4B (GGUF) example

Stable-diffusion.cpp supports FLUX.2-klein-4B GGUF when you provide:

- a GGUF diffusion model (e.g. `flux-2-klein-4b-Q8_0.gguf`)
- the FLUX.2 VAE (safetensors)
- an LLM text encoder (GGUF), e.g. `Qwen3-4B-Q4_K_M.gguf`

You can download the matching set with:

```bash
python scripts/download_model_sets.py --set flux2_klein_4b_gguf
```

Example (REPL):

```text
/backend sdcpp /path/to/flux-2-klein-4b-Q8_0.gguf /path/to/flux2_ae.safetensors /path/to/Qwen3-4B-Q4_K_M.gguf /path/to/sd-cli
/t2i "a product photo of a matte black espresso machine" --steps 4 --guidance-scale 1.0 --sampling-method euler --diffusion-fa --offload-to-cpu --open
```

FLUX.2-dev and Qwen Image GGUF are still documented here as heavier follow-ups, but try the single-file Stable
Diffusion path or klein-4B first when you are testing a fresh machine.

---

## 7) Web UI testing (optional): Playground

This repo includes a self-contained web UI and local API server. It is owned by
AbstractVision and does not require AbstractCore. Treat it as a local/dev
testing surface; use AbstractCore/Gateway for production routing,
authentication, and browser-origin policy.

### 7.1 Start the playground

```bash
abstractvision playground --port 8091
```

Open:

- `http://127.0.0.1:8091/vision_playground.html`

In the UI:
- The API Base URL field defaults to the same server process that serves the page
- Select a cached model and load it
- Generate (T2I) or upload an input image (I2I) and run edits

For the endpoint list, see `playground/README.md`.
--- >8 --- END FILE: docs/getting-started.md --- >8 ---

--- 8< --- FILE: docs/api.md --- 8< ---
# API reference

This document describes the **public** Python API surface of `abstractvision` (0.x / Alpha) and points to the implementation.

See also:
- Getting started (end-to-end examples): [docs/getting-started.md](getting-started.md)
- Architecture (how the pieces fit): [docs/architecture.md](architecture.md)
- Backends reference (support matrix): [docs/reference/backends.md](reference/backends.md)
- FAQ (common questions): [docs/faq.md](faq.md)

## Public exports

The package exports the following symbols from `abstractvision` (see [`../src/abstractvision/__init__.py`](../src/abstractvision/__init__.py)):

- `VisionManager`
- `ProviderModelInfo`
- `VisionModelCapabilitiesRegistry`
- `LocalAssetStore`
- `RuntimeArtifactStoreAdapter`
- `is_artifact_ref`
- `__version__`

## Core concepts

### Tasks

`VisionManager` exposes one method per task (implementation: [`../src/abstractvision/vision_manager.py`](../src/abstractvision/vision_manager.py)):

- `generate_image(...)` → `text_to_image`
- `edit_image(...)` → `image_to_image`
- `generate_video(...)` → `text_to_video` (backend-dependent)
- `image_to_video(...)` → `image_to_video` (backend-dependent)
- `generate_angles(...)` → `multi_view_image` (API exists; no built-in backend implements it yet)

Task names are also used by the capability registry ([`../src/abstractvision/assets/vision_model_capabilities.json`](../src/abstractvision/assets/vision_model_capabilities.json)).

### Backends

Backends are execution engines that implement the `VisionBackend` interface ([`../src/abstractvision/backends/base_backend.py`](../src/abstractvision/backends/base_backend.py)).

Built-in backends live in [`../src/abstractvision/backends/`](../src/abstractvision/backends/):
- `OpenAICompatibleVisionBackend` (HTTP)
- `HuggingFaceDiffusersVisionBackend` (local Diffusers)
- `StableDiffusionCppVisionBackend` (local stable-diffusion.cpp / GGUF)

Backend config classes are re-exported from `abstractvision.backends` via lazy imports (see [`../src/abstractvision/backends/__init__.py`](../src/abstractvision/backends/__init__.py)).

Provider catalog listing is exposed as a backend contract:

```python
from abstractvision.backends import OpenAICompatibleBackendConfig, OpenAICompatibleVisionBackend

backend = OpenAICompatibleVisionBackend(
    config=OpenAICompatibleBackendConfig(base_url="http://localhost:1234/v1")
)
for model in backend.list_provider_models(task="text_to_image"):
    print(model.id)
```

For official OpenAI, use `base_url="https://api.openai.com/v1"` and an API key. Catalog listing is explicit and does not change the configured generation model.
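
For instance, a sketch of explicit catalog listing against the official endpoint (API key read from `OPENAI_API_KEY`; `gpt-image-1` is the plugin's static default OpenAI image model):

```python
import os

from abstractvision.backends import OpenAICompatibleBackendConfig, OpenAICompatibleVisionBackend

backend = OpenAICompatibleVisionBackend(
    config=OpenAICompatibleBackendConfig(
        base_url="https://api.openai.com/v1",
        api_key=os.environ["OPENAI_API_KEY"],
        model_id="gpt-image-1",
    )
)
for model in backend.list_provider_models(task="text_to_image"):
    print(model.id)
```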

When AbstractVision is loaded as an AbstractCore capability plugin, the plugin shim exposes the
same explicit catalog surface as `llm.vision.list_provider_models(task="text_to_image")`. It
returns JSON-safe dictionaries so Core/Gateway route code can avoid private backend reach-throughs.

### Outputs: bytes vs artifact refs

`VisionManager` returns:

- `GeneratedAsset` (bytes) when no store is configured ([`../src/abstractvision/types.py`](../src/abstractvision/types.py))
- an artifact ref `dict` when `VisionManager.store` is configured (via `MediaStore.store_bytes(...)`)

Artifact helpers and stores are defined in [`../src/abstractvision/artifacts.py`](../src/abstractvision/artifacts.py).
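
A minimal sketch of handling both output shapes (assumes a configured `vm`, as in the examples below):

```python
from abstractvision import is_artifact_ref

out = vm.generate_image("a cinematic photo of a red fox in snow")
if is_artifact_ref(out):
    # A store is configured: resolve the small JSON ref back to bytes.
    data = vm.store.load_bytes(out["$artifact"])
else:
    # No store configured: `out` is a GeneratedAsset carrying raw bytes.
    asset = out
```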

## VisionManager (orchestrator)

`VisionManager` is intentionally thin: it performs best-effort validation and capability gating, then delegates to the configured backend.

Signature (see [`../src/abstractvision/vision_manager.py`](../src/abstractvision/vision_manager.py)):
- `backend`: a `VisionBackend` implementation (required to run anything)
- `store`: optional `MediaStore` to enable artifact-ref outputs
- `model_id`: optional capability-gating model id (must exist in the registry)
- `registry`: optional `VisionModelCapabilitiesRegistry` instance (reused when gating is enabled)

### Minimal example (OpenAI-compatible backend + artifact refs)

```python
from abstractvision import LocalAssetStore, VisionManager, is_artifact_ref
from abstractvision.backends import OpenAICompatibleBackendConfig, OpenAICompatibleVisionBackend

backend = OpenAICompatibleVisionBackend(
    config=OpenAICompatibleBackendConfig(base_url="http://localhost:1234/v1")
)
store = LocalAssetStore()
vm = VisionManager(backend=backend, store=store)

ref = vm.generate_image("a studio photo of an espresso machine", width=768, height=768, steps=20)
assert is_artifact_ref(ref)
png_bytes = store.load_bytes(ref["$artifact"])
```

### Local example (Diffusers backend)

Install `abstractvision[diffusers]` before using this backend.

```python
from abstractvision import VisionManager
from abstractvision.backends import HuggingFaceDiffusersBackendConfig, HuggingFaceDiffusersVisionBackend

backend = HuggingFaceDiffusersVisionBackend(
    config=HuggingFaceDiffusersBackendConfig(
        model_id="runwayml/stable-diffusion-v1-5",
        device="auto",
        allow_download=False,
    )
)
vm = VisionManager(backend=backend)
asset = vm.generate_image("a watercolor painting of a lighthouse", width=512, height=512, steps=10)
```

Note: `allow_download=False` is the default. Pre-download model weights separately, or set `allow_download=True` only when you want runtime downloads.

## Passing advanced backend parameters (`extra`)

Request dataclasses include an `extra: dict` field ([`../src/abstractvision/types.py`](../src/abstractvision/types.py)). Use it to pass backend-specific parameters in a controlled way:

```python
asset_or_ref = vm.generate_image(
    "a product photo of a matte black espresso machine",
    steps=8,
    guidance_scale=1.0,
    extra={
        # Example keys used by some Diffusers flows:
        "loras_json": [{"source": "lightx2v/Qwen-Image-Edit-2511-Lightning", "scale": 1.0}],
        "rapid_aio_repo": "linoyts/Qwen-Image-Edit-Rapid-AIO",
    },
)
```

Backends may ignore unknown keys; consult the backend implementation and [docs/reference/backends.md](reference/backends.md).

## Capability registry (what models can do)

The packaged registry is loaded by `VisionModelCapabilitiesRegistry` ([`../src/abstractvision/model_capabilities.py`](../src/abstractvision/model_capabilities.py)).

```python
from abstractvision import VisionModelCapabilitiesRegistry

reg = VisionModelCapabilitiesRegistry()
print(reg.list_tasks())
print(reg.models_for_task("text_to_image"))

reg.require_support("runwayml/stable-diffusion-v1-5", "text_to_image")
```

Optional gating:
- If you construct `VisionManager(model_id=..., registry=...)`, the manager will fail fast on unsupported tasks before calling a backend ([`../src/abstractvision/vision_manager.py`](../src/abstractvision/vision_manager.py)).

Important: the registry is *not* a guarantee that your configured backend can execute a task at runtime.
Use [docs/reference/backends.md](reference/backends.md) for backend support.

## Artifacts and stores

Artifact helpers and store implementations live in [`../src/abstractvision/artifacts.py`](../src/abstractvision/artifacts.py):

- `LocalAssetStore` (standalone local files, default `~/.abstractvision/assets`)
- `RuntimeArtifactStoreAdapter` (duck-typed adapter for an external artifact store)
- `is_artifact_ref(...)` / `make_media_ref(...)`

See: [docs/reference/artifacts.md](reference/artifacts.md).

## Errors you may want to handle

Common exceptions (defined in [`../src/abstractvision/errors.py`](../src/abstractvision/errors.py)):

- `BackendNotConfiguredError` (calling `VisionManager` without a backend)
- `CapabilityNotSupportedError` (task isn’t supported by the model registry or backend)
- `UnknownModelError` (model id isn’t present in the registry)
- `OptionalDependencyMissingError` (backend dependency is missing, e.g. Diffusers/Torch)
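
A hedged handling sketch (assumes a `vm` configured as in the examples above; see `vision_manager.py` for exact signatures):

```python
from abstractvision.errors import CapabilityNotSupportedError, OptionalDependencyMissingError

try:
    out = vm.generate_video("a slow pan across a foggy harbor at dawn")
except CapabilityNotSupportedError:
    # The gated model or the configured backend does not support text_to_video.
    out = None
except OptionalDependencyMissingError as exc:
    # A backend dependency (e.g. Torch/Diffusers) is not installed.
    raise SystemExit(f"Missing optional dependency: {exc}")
```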
--- >8 --- END FILE: docs/api.md --- >8 ---

--- 8< --- FILE: docs/architecture.md --- 8< ---
# AbstractVision architecture

AbstractVision is a model-agnostic Python layer that standardizes **generative vision outputs** behind a small API:
text→image, image→image (and optionally video when a backend supports it).

This document describes the *current code in this repo* and links to the supporting reference docs.

See also:
- Docs index: [docs/README.md](README.md)
- Getting started: [docs/getting-started.md](getting-started.md)
- API reference: [docs/api.md](api.md)
- FAQ: [docs/faq.md](faq.md)
- Backends: [docs/reference/backends.md](reference/backends.md)
- Capability registry: [docs/reference/capabilities-registry.md](reference/capabilities-registry.md)
- Artifacts: [docs/reference/artifacts.md](reference/artifacts.md)
- AbstractCore integration: [docs/reference/abstractcore-integration.md](reference/abstractcore-integration.md)

## AbstractFramework ecosystem (positioning)

AbstractVision is one component in the **AbstractFramework** ecosystem:

- **AbstractFramework** (project hub): <https://github.com/lpalbou/AbstractFramework>
- **AbstractCore** (orchestration + tool calling): <https://github.com/lpalbou/abstractcore>
- **AbstractRuntime** (runtime services, including artifact storage): <https://github.com/lpalbou/abstractruntime>

Where AbstractVision fits:
- AbstractVision focuses on *producing* images/videos (generators).
- AbstractCore focuses on orchestration, tool calling, and higher-level workflows (it can discover AbstractVision via the plugin entry point in `pyproject.toml` and `src/abstractvision/integrations/abstractcore_plugin.py`).
- AbstractRuntime provides runtime services and an artifact store interface; `RuntimeArtifactStoreAdapter` bridges AbstractVision to an AbstractRuntime-style artifact store (`src/abstractvision/artifacts.py`).

## Scope (and non-goals)

AbstractVision focuses on **producing** images/videos.

It is not the owner of “LLM image/video input attachments” (multimodal inputs to LLMs); those concerns live in higher-level layers (e.g., AbstractCore).

## Key components (with evidence pointers)

- **Orchestrator**: [`VisionManager`](../src/abstractvision/vision_manager.py)
  - Delegates execution to a backend.
  - Optionally gates requests using the capability registry when `model_id` is set.
  - Optionally stores outputs and returns artifact refs when `store` is set.
- **Backend contract**: [`VisionBackend`](../src/abstractvision/backends/base_backend.py)
  - Implementations live in [`../src/abstractvision/backends/`](../src/abstractvision/backends/).
- **Capability registry**: [`VisionModelCapabilitiesRegistry`](../src/abstractvision/model_capabilities.py)
  - Loads packaged data: [`vision_model_capabilities.json`](../src/abstractvision/assets/vision_model_capabilities.json).
- **Artifact outputs**: [`MediaStore`](../src/abstractvision/artifacts.py), [`LocalAssetStore`](../src/abstractvision/artifacts.py), [`RuntimeArtifactStoreAdapter`](../src/abstractvision/artifacts.py)
  - Artifact ref helper: `is_artifact_ref()` (see [`../src/abstractvision/artifacts.py`](../src/abstractvision/artifacts.py)).
- **CLI/REPL**: `abstractvision` entrypoint ([`../src/abstractvision/cli.py`](../src/abstractvision/cli.py))
  - Lets you inspect the registry and manually test generation backends.
- **AbstractCore integration**:
  - Capability plugin: [`../src/abstractvision/integrations/abstractcore_plugin.py`](../src/abstractvision/integrations/abstractcore_plugin.py) (registered in `pyproject.toml`)
  - Tool helpers: [`../src/abstractvision/integrations/abstractcore.py`](../src/abstractvision/integrations/abstractcore.py)

## High-level flow (library mode)

```mermaid
flowchart LR
  Caller["Caller<br/>(Python / CLI)"] --> VM[VisionManager]
  VM -->|request dataclass| BE[VisionBackend]
  BE -->|GeneratedAsset| VM
  VM -->|store set| Store["MediaStore<br/>(LocalAssetStore / Runtime adapter)"]
  Store --> Ref[Artifact ref dict]
  VM -->|store not set| Asset["GeneratedAsset<br/>(bytes + mime)"]
```

Notes (anchored in code):
- `VisionManager` creates request dataclasses like `ImageGenerationRequest` / `ImageEditRequest` ([`../src/abstractvision/types.py`](../src/abstractvision/types.py)).
- When `store` is set, `VisionManager._maybe_store()` calls `store.store_bytes(...)` and returns an artifact ref dict ([`../src/abstractvision/vision_manager.py`](../src/abstractvision/vision_manager.py), [`../src/abstractvision/artifacts.py`](../src/abstractvision/artifacts.py)).

## Capability gating (model-level) vs runtime gating (backend-level)

AbstractVision separates two kinds of “can I do this?” checks:

1) **Model-level gating** (optional): “Does model X support task Y?”
   - Implemented by `VisionModelCapabilitiesRegistry.require_support(...)` ([`../src/abstractvision/model_capabilities.py`](../src/abstractvision/model_capabilities.py))
   - Used by `VisionManager._require_model_support(...)` when `VisionManager.model_id` is set ([`../src/abstractvision/vision_manager.py`](../src/abstractvision/vision_manager.py))

2) **Backend-level gating** (best-effort): “Does this configured backend support task Y / mask edits?”
   - Backends may implement `get_capabilities()` returning `VisionBackendCapabilities` ([`../src/abstractvision/types.py`](../src/abstractvision/types.py))
   - Enforced by `VisionManager._require_backend_support(...)` and mask checks in `VisionManager.edit_image(...)` ([`../src/abstractvision/vision_manager.py`](../src/abstractvision/vision_manager.py))
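
A minimal sketch combining both checks (assuming an already-configured `backend`; model-level gating runs via the registry, and the configured backend can still reject the task at call time):

```python
from abstractvision import VisionManager, VisionModelCapabilitiesRegistry
from abstractvision.errors import CapabilityNotSupportedError

vm = VisionManager(
    backend=backend,  # any configured VisionBackend
    model_id="runwayml/stable-diffusion-v1-5",
    registry=VisionModelCapabilitiesRegistry(),
)

try:
    asset = vm.generate_image("a red square", width=512, height=512)
except CapabilityNotSupportedError:
    ...  # rejected either by the model registry or by the backend itself
```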

## Backend reality (what runs today)

The public API includes `text_to_video`, `image_to_video`, and `multi_view_image`, but backend support is currently limited:

- Built-in backends implement **images** (`text_to_image`, `image_to_image`):
  - OpenAI-compatible HTTP backend ([`../src/abstractvision/backends/openai_compatible.py`](../src/abstractvision/backends/openai_compatible.py))
  - Diffusers backend ([`../src/abstractvision/backends/huggingface_diffusers.py`](../src/abstractvision/backends/huggingface_diffusers.py))
  - stable-diffusion.cpp backend ([`../src/abstractvision/backends/stable_diffusion_cpp.py`](../src/abstractvision/backends/stable_diffusion_cpp.py))
- Video is supported **only** by the OpenAI-compatible backend, and only when `text_to_video_path` / `image_to_video_path` are configured ([`../src/abstractvision/backends/openai_compatible.py`](../src/abstractvision/backends/openai_compatible.py)).
- No built-in backend implements `multi_view_image` yet (they raise `CapabilityNotSupportedError` in `generate_angles(...)`).

For a detailed support matrix and configuration options, see [docs/reference/backends.md](reference/backends.md).

## AbstractCore plugin flow (framework integration)

AbstractVision can be discovered by AbstractCore via an entry point:
`[project.entry-points."abstractcore.capabilities_plugins"]` in [`../pyproject.toml`](../pyproject.toml).

```mermaid
flowchart LR
  AC[AbstractCore] -->|loads entry point| Plugin["AbstractVision plugin<br/>register(...)"]
  Plugin --> Cap["VisionCapability<br/>(t2i/i2i/t2v/i2v)"]
  Cap --> VM[VisionManager]
  VM --> BE{Configured backend}
  BE --> HTTP[OpenAI-compatible HTTP<br/>OpenAI or local /v1 server]
  BE --> HF[Local Diffusers]
  BE --> SDCPP[Local stable-diffusion.cpp]
```

Current plugin behavior (evidence in [`../src/abstractvision/integrations/abstractcore_plugin.py`](../src/abstractvision/integrations/abstractcore_plugin.py)):
- Default: OpenAI HTTP with backend id `abstractvision:openai`; the legacy backend id `abstractvision:openai-compatible` remains registered and preserves compatible-endpoint defaults when selected directly.
- Compatible endpoints should set `ABSTRACTVISION_BACKEND=openai-compatible` plus `ABSTRACTVISION_BASE_URL`; legacy base-url-only configs still resolve as compatible endpoints.
- Local Diffusers and stable-diffusion.cpp are supported when `vision_backend` / `ABSTRACTVISION_BACKEND` selects `diffusers` or `sdcpp`.
- Configuration is read from `owner.config` keys like `vision_base_url`, `vision_model_id`, `vision_backend`, and backend-specific keys, then falls back to `ABSTRACTVISION_*` and standard OpenAI env vars where relevant.

## Extending AbstractVision (practical steps)

- Add a new backend:
  1) Implement `VisionBackend` ([`../src/abstractvision/backends/base_backend.py`](../src/abstractvision/backends/base_backend.py))
  2) Add capability reporting via `get_capabilities()` when you can (optional)
  3) Add tests under [`../tests/`](../tests/)
- Update the registry:
  1) Edit [`../src/abstractvision/assets/vision_model_capabilities.json`](../src/abstractvision/assets/vision_model_capabilities.json)
  2) Validate by running the test suite (validator is wired into the registry loader)
  3) Use `abstractvision show-model <id>` to sanity-check task/param printing ([`../src/abstractvision/cli.py`](../src/abstractvision/cli.py))
--- >8 --- END FILE: docs/architecture.md --- >8 ---

--- 8< --- FILE: docs/faq.md --- 8< ---
# FAQ

See also:
- Getting started: [docs/getting-started.md](getting-started.md)
- API reference: [docs/api.md](api.md)
- Architecture: [docs/architecture.md](architecture.md)
- Backends: [docs/reference/backends.md](reference/backends.md)
- Configuration: [docs/reference/configuration.md](reference/configuration.md)

## What is AbstractVision?

AbstractVision is a small, model-agnostic API for **generative vision outputs** (images, optional video) with:
- a small orchestrator ([`VisionManager`](../src/abstractvision/vision_manager.py))
- pluggable execution engines (“backends”) in [`../src/abstractvision/backends/`](../src/abstractvision/backends/)
- a packaged capability registry ([`vision_model_capabilities.json`](../src/abstractvision/assets/vision_model_capabilities.json))
- optional artifact-ref outputs via stores ([`../src/abstractvision/artifacts.py`](../src/abstractvision/artifacts.py))

## How does AbstractVision fit into AbstractFramework?

AbstractVision is part of the **AbstractFramework** ecosystem:

- **AbstractFramework** (project hub): <https://github.com/lpalbou/AbstractFramework>
- **AbstractCore** (orchestration + tool calling): <https://github.com/lpalbou/abstractcore>
- **AbstractRuntime** (runtime services, including artifact storage): <https://github.com/lpalbou/abstractruntime>

Where AbstractVision fits:
- It standardizes *generative vision outputs* behind `VisionManager` (library mode).
- AbstractCore can discover and use AbstractVision via the capability plugin (see [`../src/abstractvision/integrations/abstractcore_plugin.py`](../src/abstractvision/integrations/abstractcore_plugin.py) and the entry point in [`../pyproject.toml`](../pyproject.toml)).
- Artifact refs are designed to cross process boundaries; `RuntimeArtifactStoreAdapter` bridges to an AbstractRuntime-style artifact store (see [`../src/abstractvision/artifacts.py`](../src/abstractvision/artifacts.py)).

## What does AbstractVision support today?

- Built-in backends implement **images**: `text_to_image` and `image_to_image`.
- Video (`text_to_video`, `image_to_video`) works only via the OpenAI-compatible backend **when** video endpoints are configured.
- `multi_view_image` exists in the public API (`VisionManager.generate_angles`) but no built-in backend implements it yet (they raise `CapabilityNotSupportedError`).

Details: [docs/reference/backends.md](reference/backends.md).

## Which backend should I use?

- **OpenAI-compatible HTTP** ([`../src/abstractvision/backends/openai_compatible.py`](../src/abstractvision/backends/openai_compatible.py)): call a server that exposes OpenAI-shaped image endpoints (and optional video endpoints).
- **Diffusers (local)** ([`../src/abstractvision/backends/huggingface_diffusers.py`](../src/abstractvision/backends/huggingface_diffusers.py)): run Diffusers pipelines locally (heavy deps).
- **stable-diffusion.cpp (local GGUF)** ([`../src/abstractvision/backends/stable_diffusion_cpp.py`](../src/abstractvision/backends/stable_diffusion_cpp.py)): run GGUF diffusion models via `sd-cli` or `stable-diffusion-cpp-python`.

## What model should I start with (local)?

If you’re running locally via the Diffusers backend and want a reliable starting point, we recommend:

- **Default / ≤16GB VRAM (cross-platform)**: `runwayml/stable-diffusion-v1-5`

Quickstart:

```bash
huggingface-cli download runwayml/stable-diffusion-v1-5
export ABSTRACTVISION_BACKEND=diffusers
export ABSTRACTVISION_MODEL_ID=runwayml/stable-diffusion-v1-5
export ABSTRACTVISION_DIFFUSERS_DEVICE=auto
abstractvision repl
```

More model recommendations (by VRAM tier) are in [docs/getting-started.md](getting-started.md).

Once that works, `black-forest-labs/FLUX.2-klein-4B` is the recommended next model to test locally if you want a newer, non-gated option (it currently requires Diffusers installed from source).

## Does `abstractvision t2i` run locally?

`abstractvision t2i` / `abstractvision i2i` are one-shot helpers for the **OpenAI-compatible HTTP backend** ([`../src/abstractvision/cli.py`](../src/abstractvision/cli.py)).

For local generation, use `abstractvision repl` with `/backend diffusers ...` or `/backend sdcpp ...`.

## Where do generated outputs go?

It depends on whether you configured a store:

- **CLI/REPL**: stores outputs in a local store by default (`LocalAssetStore`), under `~/.abstractvision/assets` unless `ABSTRACTVISION_STORE_DIR` is set ([`../src/abstractvision/artifacts.py`](../src/abstractvision/artifacts.py), [`../src/abstractvision/cli.py`](../src/abstractvision/cli.py)).
- **Python**:
  - if `VisionManager.store` is set, methods return an artifact ref dict (stored via `store.store_bytes(...)`)
  - otherwise they return a `GeneratedAsset` containing bytes ([`../src/abstractvision/types.py`](../src/abstractvision/types.py))

## What is an “artifact ref”?

An artifact ref is a small JSON dict that points to a stored blob. Minimal shape:

```json
{"$artifact":"<id>"}
```

Helpers: `is_artifact_ref()` / `make_media_ref()` in [`../src/abstractvision/artifacts.py`](../src/abstractvision/artifacts.py).
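
A minimal sketch of the helper in use (the id and extra field are illustrative; only the `"$artifact"` key is required):

```python
from abstractvision.artifacts import is_artifact_ref

ref = {"$artifact": "abc123", "content_type": "image/png"}
assert is_artifact_ref(ref)              # has the minimal "$artifact" key
assert not is_artifact_ref({"id": "x"})  # plain dicts are not artifact refs
```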

## How do I allow or block Diffusers downloads?

- REPL: cache-only is the default. Pre-download models separately, or set `ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1` when you intentionally want runtime downloads ([`../src/abstractvision/cli.py`](../src/abstractvision/cli.py)).
- Python: `HuggingFaceDiffusersBackendConfig` defaults to `allow_download=False`; set `allow_download=True` only when you want runtime downloads ([`../src/abstractvision/backends/huggingface_diffusers.py`](../src/abstractvision/backends/huggingface_diffusers.py)).

## Why do I get “missing pipeline class” errors (e.g. `GlmImagePipeline`)?

Some newer pipelines may only exist on Diffusers GitHub `main`. Install:

- `pip install -U "abstractvision[diffusers-dev]"` (compatible dependency versions)
- `pip install -U "git+https://github.com/huggingface/diffusers@main"` (Diffusers `main`)

See: [docs/getting-started.md](getting-started.md).

## macOS (MPS): why do I get black images / dtype errors?

The Diffusers backend includes MPS-specific mitigations (e.g. VAE upcast and optional fp32 retry) in [`../src/abstractvision/backends/huggingface_diffusers.py`](../src/abstractvision/backends/huggingface_diffusers.py).

Common fixes:
- set `ABSTRACTVISION_DIFFUSERS_TORCH_DTYPE=float32` (more stable, higher memory)
- disable retry if memory is tight: `ABSTRACTVISION_DIFFUSERS_AUTO_RETRY_FP32=0`
- consider using the stable-diffusion.cpp backend for GGUF diffusion models ([docs/getting-started.md](getting-started.md))

## Windows/Linux (CUDA): why is `torch.cuda.is_available()` false?

On Windows/Linux, `pip install torch` (and packages that depend on `torch`) may install a CPU-only PyTorch build by default.

If you have an NVIDIA GPU and want CUDA acceleration:

1) Install a CUDA-enabled PyTorch wheel using the official selector: <https://pytorch.org/get-started/locally/>  
2) Verify:

```bash
python -c "import torch; print('cuda', torch.cuda.is_available())"
```

## How do I pass advanced flags / parameters?

AbstractVision exposes an `extra` dict on requests ([`../src/abstractvision/types.py`](../src/abstractvision/types.py)), and the REPL forwards unknown `--flags` into `request.extra` ([`../src/abstractvision/cli.py`](../src/abstractvision/cli.py)).

Examples:
- Diffusers backend: accepts extra keys like `loras_json` and `rapid_aio_repo` (used by Qwen Image Edit flows; see [docs/getting-started.md](getting-started.md) and [`../src/abstractvision/backends/huggingface_diffusers.py`](../src/abstractvision/backends/huggingface_diffusers.py)).
- stable-diffusion.cpp backend:
  - CLI mode forwards flags to `sd-cli`
  - python-binding mode maps supported keys to binding kwargs and ignores unsupported keys ([`../src/abstractvision/backends/stable_diffusion_cpp.py`](../src/abstractvision/backends/stable_diffusion_cpp.py))

## What does the capability registry mean (and what does it not mean)?

The registry answers “what a model *claims* to support” (task keys/params) and can be used for **optional gating**:

- `VisionModelCapabilitiesRegistry.supports(...)` / `.require_support(...)` ([`../src/abstractvision/model_capabilities.py`](../src/abstractvision/model_capabilities.py))
- `VisionManager(model_id=...)` uses it to fail fast before calling a backend ([`../src/abstractvision/vision_manager.py`](../src/abstractvision/vision_manager.py))

It does **not** guarantee your configured backend can execute the task; backend support is a separate constraint ([docs/reference/backends.md](reference/backends.md)).

## I only need the HTTP backend. Do I have to install Torch/Diffusers?

No. The base install is lightweight and includes the stdlib OpenAI-compatible HTTP backend without Torch/Diffusers (see [`../pyproject.toml`](../pyproject.toml)). Heavy local backend modules are still imported lazily ([`../src/abstractvision/backends/__init__.py`](../src/abstractvision/backends/__init__.py)).

Install `abstractvision[diffusers]` only when you want local Diffusers generation. Use `abstractvision[sdcpp]` or an external `sd-cli` only when you need stable-diffusion.cpp.

## How do I integrate with AbstractCore?

Two options (details in [docs/reference/abstractcore-integration.md](reference/abstractcore-integration.md)):

- **Capability plugin**: [`../src/abstractvision/integrations/abstractcore_plugin.py`](../src/abstractvision/integrations/abstractcore_plugin.py) supports Diffusers, stable-diffusion.cpp, and OpenAI-compatible backends through env/config.
- **Tool helpers**: `make_vision_tools(...)` in [`../src/abstractvision/integrations/abstractcore.py`](../src/abstractvision/integrations/abstractcore.py) requires `VisionManager.store` for artifact-ref outputs.

AbstractCore is the host package; AbstractVision does not install it as a dependency.

## How do I run tests?

From the repo root:

```bash
python -m unittest discover -s tests -p "test_*.py" -q
```
--- >8 --- END FILE: docs/faq.md --- >8 ---

--- 8< --- FILE: docs/reference/backends.md --- 8< ---
# Backends (execution engines)

AbstractVision executes tasks via a `VisionBackend` adapter ([`../../src/abstractvision/backends/base_backend.py`](../../src/abstractvision/backends/base_backend.py)).
`VisionManager` is intentionally thin and delegates to the configured backend ([`../../src/abstractvision/vision_manager.py`](../../src/abstractvision/vision_manager.py)).

See also:
- Getting started (REPL examples): [docs/getting-started.md](../getting-started.md)
- Configuration (env vars / CLI flags): [docs/reference/configuration.md](configuration.md)

## Support matrix (built-in backends)

| Backend | Implementation | Tasks implemented | Notes |
|---|---|---|---|
| OpenAI-compatible HTTP | [`openai_compatible.py`](../../src/abstractvision/backends/openai_compatible.py) | `text_to_image`, `image_to_image` (+ optional `text_to_video`, `image_to_video`) | Stdlib-only (`urllib`). Video is **opt-in** via configured paths. |
| Diffusers (local) | [`huggingface_diffusers.py`](../../src/abstractvision/backends/huggingface_diffusers.py) | `text_to_image`, `image_to_image` | Requires `abstractvision[diffusers]`. Supports cache-only/offline mode. |
| stable-diffusion.cpp (local GGUF/checkpoints) | [`stable_diffusion_cpp.py`](../../src/abstractvision/backends/stable_diffusion_cpp.py) | `text_to_image`, `image_to_image` | Uses external `sd-cli` if present, else `abstractvision[sdcpp]` python bindings. Start with single-file Stable Diffusion models; Qwen/FLUX GGUF may need VAE + LLM components. |

Notes:
- `multi_view_image` (`VisionManager.generate_angles`) is part of the public API, but **no built-in backend implements it yet** (all raise `CapabilityNotSupportedError` today).
- Backends may also expose best-effort `get_capabilities()`, `preload()`, `unload()`, `generate_image_with_progress(...)`, and `edit_image_with_progress(...)` hooks via the shared `VisionBackend` contract.

## OpenAI-compatible HTTP backend

**When to use**
- You already run a service that exposes OpenAI-shaped endpoints (local or remote).
- You want to keep inference out-of-process.

**Core config**
- `base_url` (required): points to a `/v1`-style root, e.g. `http://localhost:1234/v1`
- `api_key` (optional): sent as `Authorization: Bearer ...`
- `model_id` (optional): forwarded as `model` in requests
- `models_path` (default `/models`): provider catalog path for explicit model listing

Request shape:
- Unknown/local endpoints receive local extension fields when provided, including `steps`, `seed`, `guidance_scale`, `negative_prompt`, `width`, and `height`.
- Endpoints that look like the official OpenAI API, and known OpenAI image models, use the narrower OpenAI request shape; GPT image models never receive local-only fields that OpenAI does not support, such as `steps`, `seed`, or `guidance_scale`.

Provider model catalogs:
- `OpenAICompatibleVisionBackend.list_provider_models(...)` queries `GET /models` by default.
- `VisionManager.list_provider_models(...)` delegates to the configured backend.
- The AbstractCore plugin exposes the same catalog through `llm.vision.list_provider_models(...)`.
- CLI examples: `abstractvision provider-models --openai --task text_to_image` and `abstractvision provider-models --base-url http://localhost:1234/v1 --task text_to_image`.
- Listing is explicit; AbstractVision does not use provider catalogs to silently select a model.

Code pointers:
- Config: `OpenAICompatibleBackendConfig` ([`../../src/abstractvision/backends/openai_compatible.py`](../../src/abstractvision/backends/openai_compatible.py))
- Backend: `OpenAICompatibleVisionBackend` ([`../../src/abstractvision/backends/openai_compatible.py`](../../src/abstractvision/backends/openai_compatible.py))

**Video endpoints (optional)**
`OpenAICompatibleVisionBackend` only enables:
- `text_to_video` if `text_to_video_path` is set
- `image_to_video` if `image_to_video_path` is set
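
A minimal sketch enabling video against a compatible server (the endpoint paths are illustrative and depend on what your server exposes):

```python
from abstractvision import VisionManager
from abstractvision.backends import OpenAICompatibleBackendConfig, OpenAICompatibleVisionBackend

backend = OpenAICompatibleVisionBackend(
    config=OpenAICompatibleBackendConfig(
        base_url="http://localhost:1234/v1",
        model_id="server/default",                  # optional; forwarded as `model`
        text_to_video_path="/videos/generations",   # illustrative; enables text_to_video
        image_to_video_path="/videos/edits",        # illustrative; enables image_to_video
    )
)
vm = VisionManager(backend=backend)
```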

## Diffusers backend (local)

**When to use**
- You want local inference for Diffusers pipelines.
- Start with `runwayml/stable-diffusion-v1-5` for the lowest-risk local test.
- Move to `black-forest-labs/FLUX.2-klein-4B` after that if you want a newer non-gated model and can install Diffusers `main`.

Install:
- `pip install "abstractvision[diffusers]"`
- For newer/unreleased pipeline classes: `pip install "abstractvision[diffusers-dev]"` plus Diffusers from source.

Code pointers:
- Config: `HuggingFaceDiffusersBackendConfig` ([`../../src/abstractvision/backends/huggingface_diffusers.py`](../../src/abstractvision/backends/huggingface_diffusers.py))
- Backend: `HuggingFaceDiffusersVisionBackend` ([`../../src/abstractvision/backends/huggingface_diffusers.py`](../../src/abstractvision/backends/huggingface_diffusers.py))

**Offline / cache-only mode**
The Python backend and REPL are cache-only by default (`allow_download=False`). Pre-download model weights separately,
or set `allow_download=True` / `ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1` when runtime downloads are desired (see
config/env in [docs/reference/configuration.md](configuration.md)).

Config fields:
- `model_id`, `device`, `torch_dtype`
- `allow_download`, `auto_retry_fp32`
- `cache_dir`, `revision`, `variant`
- `use_safetensors`, `low_cpu_mem_usage`
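
A minimal cache-only sketch exercising a few of these fields (values are illustrative; `torch_dtype` is shown in its string form, as in the env vars):

```python
from abstractvision.backends import HuggingFaceDiffusersBackendConfig, HuggingFaceDiffusersVisionBackend

backend = HuggingFaceDiffusersVisionBackend(
    config=HuggingFaceDiffusersBackendConfig(
        model_id="runwayml/stable-diffusion-v1-5",
        device="auto",
        torch_dtype="float16",   # consider "float32" on MPS if you hit dtype issues
        allow_download=False,    # cache-only: weights must already be downloaded
        cache_dir=None,          # optional Hugging Face cache override
    )
)
```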

## stable-diffusion.cpp backend (local GGUF/checkpoints)

**When to use**
- You want to run single-file Stable Diffusion checkpoints/GGUF or component-based GGUF diffusion models locally.

Runtime modes (auto-selected):
- **CLI mode** via `sd-cli` (stable-diffusion.cpp executable) when available in `PATH`
- **Python mode** via `stable-diffusion-cpp-python` when `sd-cli` is not available

Notes:
- If you care about **GPU acceleration** (macOS **Metal**, NVIDIA **CUDA**, etc.), prefer **CLI mode** via `sd-cli`.
- Python bindings run whatever backend the installed wheel was built with. On macOS, that often means **CPU-only**, so FLUX/Qwen-class models can be extremely slow.
- REPL selection supports both `/backend sdcpp <model.gguf|model.safetensors> [sd_cli_path]` and
  `/backend sdcpp <diffusion_model.gguf> <vae.safetensors> <llm.gguf> [sd_cli_path]`.
- Python code and AbstractCore plugin configuration can also pass component paths such as `clip_l`, `clip_g`, `t5xxl`, `llm_vision`, plus `extra_args`, `timeout_s`, and `cwd`.

Code pointers:
- Config: `StableDiffusionCppBackendConfig` ([`../../src/abstractvision/backends/stable_diffusion_cpp.py`](../../src/abstractvision/backends/stable_diffusion_cpp.py))
- Backend: `StableDiffusionCppVisionBackend` ([`../../src/abstractvision/backends/stable_diffusion_cpp.py`](../../src/abstractvision/backends/stable_diffusion_cpp.py))
--- >8 --- END FILE: docs/reference/backends.md --- >8 ---

--- 8< --- FILE: docs/reference/configuration.md --- 8< ---
# Configuration (CLI / REPL)

AbstractVision configuration is intentionally simple:

- In Python, you configure backends by instantiating backend config objects (see [docs/reference/backends.md](backends.md)).
- The CLI/REPL/playground reads `ABSTRACTVISION_*` environment variables to set defaults ([`../../src/abstractvision/cli.py`](../../src/abstractvision/cli.py), [`../../src/abstractvision/playground_server.py`](../../src/abstractvision/playground_server.py)).
- The AbstractCore capability plugin also reads `owner.config` plus a small set of standard OpenAI environment aliases. Plugin-only aliases are called out below.

See also:
- Getting started (examples): [docs/getting-started.md](../getting-started.md)
- Backends: [docs/reference/backends.md](backends.md)

## CLI commands (overview)

Implemented in [`../../src/abstractvision/cli.py`](../../src/abstractvision/cli.py):

- `abstractvision models` — list known registry model ids
- `abstractvision tasks` — list known tasks
- `abstractvision show-model <id>` — print a model’s tasks + params
- `abstractvision provider-models --openai --task text_to_image` — explicitly query the official OpenAI `/models` catalog
- `abstractvision provider-models --base-url http://localhost:1234/v1 --task text_to_image` — explicitly query an OpenAI-compatible provider catalog
- `abstractvision repl` — interactive testing (supports `openai`, `diffusers`, `sdcpp`)
- `abstractvision playground [--host 127.0.0.1] [--port 8091]` — self-contained local web UI and `/v1/vision/*` API
- `abstractvision serve [--host 127.0.0.1] [--port 8091]` — alias for `abstractvision playground`
- `abstractvision t2i ...` / `abstractvision i2i ...` — one-shot commands using the **OpenAI-compatible HTTP backend**

Note:
- `abstractvision t2i` / `abstractvision i2i` always use the OpenAI-compatible backend (they do not switch based on `ABSTRACTVISION_BACKEND`).
- Use `abstractvision repl` or `abstractvision playground` for local backends (`diffusers`, `sdcpp`).
- Local Diffusers requires `abstractvision[diffusers]`. stable-diffusion.cpp python binding fallback requires `abstractvision[sdcpp]`; external `sd-cli` can be used without the binding.

## REPL backend selection

Inside `abstractvision repl`:

- `/backend openai <base_url> [api_key] [model_id]`
- `/provider-models [--task text_to_image] [--json]` — query the configured OpenAI-compatible provider catalog
- `/backend diffusers <model_id_or_path> [device] [torch_dtype]`
- `/backend sdcpp <model.gguf|model.safetensors> [sd_cli_path]`
- `/backend sdcpp <diffusion_model.gguf> <vae.safetensors> <llm.gguf> [sd_cli_path]`

Run `/help` in the REPL to see the full command list (generated by `_repl_help()` in [`../../src/abstractvision/cli.py`](../../src/abstractvision/cli.py)).

## Environment variables

The CLI/REPL state object (`_ReplState` in [`../../src/abstractvision/cli.py`](../../src/abstractvision/cli.py)) reads the common and backend environment variables below. The playground server reads the same backend variables, plus a few playground-only Diffusers cache/version variables, through `PlaygroundServerConfig` in [`../../src/abstractvision/playground_server.py`](../../src/abstractvision/playground_server.py).

### Common

- `ABSTRACTVISION_BACKEND` — backend selector: `openai`, `openai-compatible`, `diffusers`, or `sdcpp`
  - AbstractCore plugin default: `openai` using `https://api.openai.com/v1` plus `OPENAI_API_KEY` or `ABSTRACTVISION_API_KEY`
  - AbstractCore compatibility: selecting `abstractvision:openai-compatible` directly, or setting only `ABSTRACTVISION_BASE_URL`, keeps compatible-endpoint semantics
  - if unset and `ABSTRACTVISION_BASE_URL` is set, REPL/playground default to `openai`
  - if unset and no base URL is configured, no backend is selected until you use `/backend ...` or load a model explicitly
- `ABSTRACTVISION_STORE_DIR` — local artifact output directory (default: `~/.abstractvision/assets`)
- `ABSTRACTVISION_TIMEOUT_S` — HTTP timeout for OpenAI-compatible backend (default: `300`)
- `ABSTRACTVISION_MODEL_ID` — model id for the current backend in the REPL:
  - `openai`: sent as `model` in HTTP requests (optional; server-dependent)
  - `diffusers`: Diffusers model id or local path
- `ABSTRACTVISION_CAPABILITIES_MODEL_ID` — optional capability-gating model id (must exist in the registry)

### OpenAI / OpenAI-Compatible HTTP Backend

- `ABSTRACTVISION_BASE_URL` — compatible `/v1` endpoint; optional for the AbstractCore `openai` default because it uses `https://api.openai.com/v1`
- `OPENAI_BASE_URL` — override for the official OpenAI profile when `ABSTRACTVISION_BASE_URL` / `vision_base_url` are unset; also used by `abstractvision provider-models --openai`
- `ABSTRACTVISION_API_KEY` — bearer token for OpenAI or compatible providers that require auth
- `OPENAI_API_KEY` — fallback when `ABSTRACTVISION_API_KEY` / `vision_api_key` are unset; also used by `abstractvision provider-models --openai`
- `ABSTRACTVISION_MODEL_ID` — optional remote model id/name (see also “Common”)
- `OPENAI_IMAGE_MODEL_ID` / `OPENAI_IMAGE_MODEL` — plugin-only OpenAI model aliases when `ABSTRACTVISION_MODEL_ID` / `vision_model_id` are unset
- `ABSTRACTVISION_MODELS_PATH` / `vision_models_path` — provider catalog path for explicit listing (default: `/models`, so a `/v1` base URL queries `/v1/models`)
- Many OpenAI-compatible providers expose `GET /models`. AbstractVision exposes that catalog via `abstractvision provider-models`, `VisionManager.list_provider_models(...)`, and the AbstractCore plugin method `llm.vision.list_provider_models(...)`; it does not call the catalog automatically or use it to select a model. The plugin uses its static default (`gpt-image-1`) unless a model id is configured.
- `ABSTRACTVISION_IMAGES_GENERATIONS_PATH` — default: `/images/generations`
- `ABSTRACTVISION_IMAGES_EDITS_PATH` — default: `/images/edits`
- `ABSTRACTVISION_TEXT_TO_VIDEO_PATH` — optional (enables `text_to_video`)
- `ABSTRACTVISION_IMAGE_TO_VIDEO_PATH` — optional (enables `image_to_video`)
- `ABSTRACTVISION_IMAGE_TO_VIDEO_MODE` — `multipart` (default) or `json_b64`

### Diffusers backend

- `ABSTRACTVISION_DIFFUSERS_DEVICE` — `auto` (default), `cpu`, `cuda`, `mps`, …
- `ABSTRACTVISION_DIFFUSERS_TORCH_DTYPE` — optional (`float16`, `bfloat16`, `float32`)
- `ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD` — `0` (default/cache-only/offline) or `1` to permit runtime downloads
- `ABSTRACTVISION_DIFFUSERS_AUTO_RETRY_FP32` — `1` (default) or `0` (MPS-only fallback behavior)

Playground-only Diffusers vars:
- `ABSTRACTVISION_ALLOW_DOWNLOAD` — legacy fallback used only when `ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD` is unset
- `ABSTRACTVISION_DIFFUSERS_CACHE_DIR` — optional Hugging Face cache directory override
- `ABSTRACTVISION_DIFFUSERS_REVISION` — optional model revision
- `ABSTRACTVISION_DIFFUSERS_VARIANT` — optional model variant

### stable-diffusion.cpp backend

- `ABSTRACTVISION_SDCPP_BIN` — `sd-cli` path/name (default: `sd-cli`)
- `ABSTRACTVISION_SDCPP_MODEL` — optional full-model path for single-file Stable Diffusion checkpoints/GGUF (alternative to component mode)
- `ABSTRACTVISION_SDCPP_DIFFUSION_MODEL` — GGUF diffusion model path
- `ABSTRACTVISION_SDCPP_VAE` — VAE safetensors path (required for component-mode models like Qwen Image GGUF and FLUX.2 GGUF)
- `ABSTRACTVISION_SDCPP_LLM` — text encoder path (often GGUF; required for component-mode models like Qwen Image GGUF and FLUX.2 GGUF)
- `ABSTRACTVISION_SDCPP_LLM_VISION` — optional vision encoder GGUF path
- `ABSTRACTVISION_SDCPP_EXTRA_ARGS` — extra `sd-cli` flags (string, split like a shell)

Tip:
- If you want **Metal** acceleration on macOS (Apple Silicon), install a Metal-capable `sd-cli` binary from
  stable-diffusion.cpp releases and point `ABSTRACTVISION_SDCPP_BIN` at it (or pass the path as the last arg to
  `/backend sdcpp ...` in the REPL). If you don’t, the backend may fall back to python bindings that run CPU-only.
--- >8 --- END FILE: docs/reference/configuration.md --- >8 ---

--- 8< --- FILE: docs/reference/capabilities-registry.md --- 8< ---
# Capability registry (`vision_model_capabilities.json`)

AbstractVision keeps a single packaged “source of truth” for what models can do:

- Asset: [`../../src/abstractvision/assets/vision_model_capabilities.json`](../../src/abstractvision/assets/vision_model_capabilities.json)
- Loader + validator: `VisionModelCapabilitiesRegistry` / `validate_capabilities_json()` in [`../../src/abstractvision/model_capabilities.py`](../../src/abstractvision/model_capabilities.py)

See also:
- CLI/REPL inspection commands: [docs/reference/configuration.md](configuration.md)
- Backends (execution reality): [docs/reference/backends.md](backends.md)

## What the registry is used for

- **Discovery**: list known task keys and model ids.
- **Optional safety gating**:
  - `VisionManager(model_id=..., registry=...)` will fail fast if the model doesn’t support a task ([`../../src/abstractvision/vision_manager.py`](../../src/abstractvision/vision_manager.py)).
  - The CLI/REPL can enforce gating via `--capabilities-model-id` (CLI) or `/cap-model` (REPL).

Important:
- The registry describes **model capability intent**.
- Your configured backend still needs to implement the task at runtime (see backend support matrix in [docs/reference/backends.md](backends.md)).

## Minimal Python usage

```python
from abstractvision import VisionModelCapabilitiesRegistry

reg = VisionModelCapabilitiesRegistry()
print(reg.schema_version())
print(reg.list_tasks())

assert reg.supports("runwayml/stable-diffusion-v1-5", "text_to_image")
print(reg.models_for_task("text_to_image"))
```

## JSON shape (high level)

The validator enforces a “soft schema”:

- Top-level keys:
  - `schema_version`
  - `tasks` (keyed by task name; includes human descriptions)
  - `models` (keyed by model id)
- Each model entry includes:
  - `provider` (string)
  - `license` (string; informational)
  - `tasks` (map of task name → task spec)
- Each task spec includes:
  - `inputs`, `outputs` (lists of strings)
  - `params` (object where each param has `required: bool`, plus additive fields)
  - optional `requires` for dependencies like `base_model_id`
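
A minimal sketch of a document following this soft schema, checked with the packaged validator (entry values are illustrative; treating `validate_capabilities_json` as accepting a parsed dict is an assumption):

```python
from abstractvision.model_capabilities import validate_capabilities_json

doc = {
    "schema_version": 1,  # illustrative value
    "tasks": {
        "text_to_image": {"description": "Generate an image from a text prompt."},
    },
    "models": {
        "example/model-id": {
            "provider": "example",
            "license": "apache-2.0",
            "tasks": {
                "text_to_image": {
                    "inputs": ["prompt"],
                    "outputs": ["image"],
                    "params": {"steps": {"required": False}},
                },
            },
        },
    },
}

validate_capabilities_json(doc)  # assumed to raise on schema violations
```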
--- >8 --- END FILE: docs/reference/capabilities-registry.md --- >8 ---

--- 8< --- FILE: docs/reference/artifacts.md --- 8< ---
# Artifacts (artifact refs + stores)

AbstractVision supports “artifact-first” outputs: return a small JSON dict that points to a stored blob instead of inlining bytes.

Code pointers:
- Store interface + helpers: [`../../src/abstractvision/artifacts.py`](../../src/abstractvision/artifacts.py)
- Orchestration logic: `VisionManager._maybe_store()` in [`../../src/abstractvision/vision_manager.py`](../../src/abstractvision/vision_manager.py)

See also:
- Getting started (REPL stores outputs by default): [docs/getting-started.md](../getting-started.md)

## Output shapes

`VisionManager` returns:

- **Without a store**: `GeneratedAsset` ([`../../src/abstractvision/types.py`](../../src/abstractvision/types.py))
  - contains bytes (`data`), `mime_type`, and best-effort metadata
- **With a store**: an artifact ref dict (via `MediaStore.store_bytes(...)`)
  - minimum shape: `{"$artifact": "<id>"}` (`is_artifact_ref()` checks this)
  - common fields: `content_type`, `sha256`, `filename`, `size_bytes`, `metadata`
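
A minimal sketch for callers that handle both shapes (assuming a configured `vm` that may or may not have a store):

```python
from abstractvision.artifacts import is_artifact_ref

result = vm.generate_image("a watercolor painting of a lighthouse")

if is_artifact_ref(result):
    artifact_id = result["$artifact"]   # resolve through the store that produced it
else:
    png_bytes = result.data             # GeneratedAsset: raw bytes
    mime = result.mime_type
```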

## LocalAssetStore (standalone mode)

`LocalAssetStore` stores files under `~/.abstractvision/assets` by default ([`../../src/abstractvision/artifacts.py`](../../src/abstractvision/artifacts.py)):

- Blob: `~/.abstractvision/assets/<artifact_id>.<ext>`
- Metadata: `~/.abstractvision/assets/<artifact_id>.meta.json`

Minimal usage:

```python
from abstractvision import LocalAssetStore, VisionManager
from abstractvision.backends import OpenAICompatibleBackendConfig, OpenAICompatibleVisionBackend

store = LocalAssetStore()
backend = OpenAICompatibleVisionBackend(config=OpenAICompatibleBackendConfig(base_url="http://localhost:1234/v1"))
vm = VisionManager(backend=backend, store=store)

ref = vm.generate_image("a watercolor painting of a lighthouse")
blob = store.load_bytes(ref["$artifact"])  # type: ignore[index]
```

## RuntimeArtifactStoreAdapter (framework mode)

`RuntimeArtifactStoreAdapter` is a duck-typed adapter for an external artifact store (designed for AbstractRuntime),
so AbstractVision can depend on an artifact store **without** a hard dependency ([`../../src/abstractvision/artifacts.py`](../../src/abstractvision/artifacts.py)).

Related:
- AbstractRuntime: <https://github.com/lpalbou/abstractruntime>
--- >8 --- END FILE: docs/reference/artifacts.md --- >8 ---

--- 8< --- FILE: docs/reference/abstractcore-integration.md --- 8< ---
# AbstractCore integration

AbstractVision offers two integration surfaces for AbstractCore:

1) **Capability plugin** (so `abstractcore` can discover a vision backend)
2) **Tool helpers** (so you can expose vision tasks as tools with artifact-ref outputs)

Code pointers:
- Plugin: [`../../src/abstractvision/integrations/abstractcore_plugin.py`](../../src/abstractvision/integrations/abstractcore_plugin.py)
- Tools: [`../../src/abstractvision/integrations/abstractcore.py`](../../src/abstractvision/integrations/abstractcore.py)
- Entry point registration: [`../../pyproject.toml`](../../pyproject.toml) (`[project.entry-points."abstractcore.capabilities_plugins"]`)

See also:
- Artifacts: [docs/reference/artifacts.md](artifacts.md)
- Backends: [docs/reference/backends.md](backends.md)

## 1) Capability plugin (AbstractCore → VisionCapability)

The plugin registers these backend ids:

- `abstractvision:openai` (default backend id) and `abstractvision:openai-compatible` (legacy compatibility backend id) are registered by [`../../src/abstractvision/integrations/abstractcore_plugin.py`](../../src/abstractvision/integrations/abstractcore_plugin.py). The implementation also supports local backends.

Current behavior:
- Default `abstractvision:openai`: OpenAI HTTP (`https://api.openai.com/v1`). Set `OPENAI_API_KEY` or `ABSTRACTVISION_API_KEY`.
- OpenAI model ids are configured, not discovered dynamically. Providers may expose an OpenAI-compatible `GET /models` catalog; AbstractVision exposes it through `abstractvision provider-models`, `VisionManager.list_provider_models(...)`, and the plugin method `llm.vision.list_provider_models(...)`, but the plugin does not call it automatically or use it to select a model. The static plugin default is `gpt-image-1`; set `OPENAI_IMAGE_MODEL_ID`, `OPENAI_IMAGE_MODEL`, `ABSTRACTVISION_MODEL_ID`, or `vision_model_id` for newer provider models.
- Compatible HTTP: set `ABSTRACTVISION_BACKEND=openai-compatible` and `ABSTRACTVISION_BASE_URL` to a local/remote compatible `/v1` server. Legacy `ABSTRACTVISION_BASE_URL`-only deployments still use compatible semantics, but new configs should set the backend explicitly.
- Legacy `abstractvision:openai-compatible`: keeps compatible-endpoint defaults when that backend id is selected directly.
- Local Diffusers: install `abstractvision[diffusers]`, then set `ABSTRACTVISION_BACKEND=diffusers` with `runwayml/stable-diffusion-v1-5` or another Diffusers model. It is cache-only/offline unless `ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1` is set.
- stable-diffusion.cpp: set `ABSTRACTVISION_BACKEND=sdcpp` and configure a model path. Use an external `sd-cli`, or install `abstractvision[sdcpp]` for the python binding fallback.
- The plugin reads AbstractCore owner config keys when present, then falls back to `ABSTRACTVISION_*` env vars.
- Gateway/Core should pass process-level config or `owner.config` and report readiness; they should not mutate AbstractVision environment variables per request.

Key config keys (owner.config):
- `vision_backend_instance` / `vision_backend_factory` (advanced injection hooks; bypass env-driven backend creation)
- `vision_backend` (`openai`, `openai-compatible`, `diffusers`, or `sdcpp`; default `openai`)
- `vision_model_id` (Diffusers/OpenAI-compatible model id; default `gpt-image-1` only for the official OpenAI profile and `runwayml/stable-diffusion-v1-5` for Diffusers)
- `vision_device` / `vision_torch_dtype` / `vision_allow_download` / `vision_auto_retry_fp32` (Diffusers)
- `vision_base_url` / `vision_api_key` (OpenAI or compatible HTTP)
- `vision_sdcpp_model` / `vision_sdcpp_diffusion_model` / `vision_sdcpp_bin` (stable-diffusion.cpp)
- `vision_sdcpp_vae` / `vision_sdcpp_llm` / `vision_sdcpp_llm_vision` / `vision_sdcpp_clip_l` / `vision_sdcpp_clip_g` / `vision_sdcpp_t5xxl` / `vision_sdcpp_extra_args` (stable-diffusion.cpp component mode)
- `vision_timeout_s` (optional)
- `vision_models_path` (optional provider catalog path; default `/models`)
- Optional video endpoint keys:
  - `vision_text_to_video_path`
  - `vision_image_to_video_path`
  - `vision_image_to_video_mode`

Env-only aliases:
- `ABSTRACTVISION_DIFFUSERS_MODEL_ID` is accepted for the Diffusers plugin backend before falling back to `ABSTRACTVISION_MODEL_ID`.
- `OPENAI_BASE_URL` is accepted by the official OpenAI profile when `vision_base_url` / `ABSTRACTVISION_BASE_URL` are unset.
- `OPENAI_API_KEY` is accepted after `ABSTRACTVISION_API_KEY`.
- `OPENAI_IMAGE_MODEL_ID` and `OPENAI_IMAGE_MODEL` are accepted when `vision_model_id` / `ABSTRACTVISION_MODEL_ID` are unset.
- `ABSTRACTVISION_SDCPP_CLIP_L`, `ABSTRACTVISION_SDCPP_CLIP_G`, and `ABSTRACTVISION_SDCPP_T5XXL` are accepted for stable-diffusion.cpp component mode.

Examples:

```bash
# Local Diffusers. Pre-download weights first, or explicitly allow runtime downloads.
export ABSTRACTVISION_BACKEND=diffusers
export ABSTRACTVISION_MODEL_ID=runwayml/stable-diffusion-v1-5
export ABSTRACTVISION_DIFFUSERS_DEVICE=auto
```

```python
from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")
png_bytes = llm.vision.t2i("a red square", width=512, height=512, steps=20)
```

```bash
# OpenAI API.
export OPENAI_API_KEY=...
export OPENAI_IMAGE_MODEL=gpt-image-1
```

```bash
# Local OpenAI-compatible HTTP server, for example AbstractCore Server.
export ABSTRACTVISION_BACKEND=openai-compatible
export ABSTRACTVISION_BASE_URL=http://localhost:8000/v1
export ABSTRACTVISION_MODEL_ID=server/default
```

### Provider Catalog Discovery

Core/Gateway hosts can inspect provider-advertised model catalogs through the same capability
object used for generation:

```python
models = llm.vision.list_provider_models(task="text_to_image")
for model in models:
    print(model["id"])
```

The return value is a JSON-safe list of dictionaries serialized from `ProviderModelInfo`. Raw
provider metadata is retained in a bounded `raw` field for diagnostics. This method is explicit
inspection only: it does not mutate the configured backend or select a generation model.

Backends that do not implement provider catalog listing raise a clear AbstractVision error instead
of returning a misleading empty catalog. Local Diffusers and stable-diffusion.cpp model discovery
remain separate local-backend concerns.

## 2) Tool helpers (`make_vision_tools`)

`make_vision_tools(...)` builds AbstractCore `@tool` callables for:
- text→image
- image→image
- multi-view image
- text→video
- image→video

Important:
- Tool outputs are designed to be **artifact refs**, so `VisionManager.store` must be set ([`../../src/abstractvision/integrations/abstractcore.py`](../../src/abstractvision/integrations/abstractcore.py)).
- This module requires AbstractCore to be installed by the host application. AbstractVision does not install AbstractCore as a dependency.
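
A minimal sketch (assuming an already-configured `backend`; the exact `make_vision_tools` signature shown here, taking the manager, is an assumption):

```python
from abstractvision import LocalAssetStore, VisionManager
from abstractvision.integrations.abstractcore import make_vision_tools

vm = VisionManager(backend=backend, store=LocalAssetStore())  # a store is required for artifact-ref outputs
tools = make_vision_tools(vm)  # assumption: returns AbstractCore @tool callables for the tasks above
```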

Tip (framework mode):
- If your runtime provides an artifact store (e.g. AbstractRuntime), use `RuntimeArtifactStoreAdapter` so tool outputs can be stored and referenced across processes (see [docs/reference/artifacts.md](artifacts.md)).
--- >8 --- END FILE: docs/reference/abstractcore-integration.md --- >8 ---

--- 8< --- FILE: playground/README.md --- 8< ---
# AbstractVision Playground (Web)

This is a tiny web UI for testing AbstractVision locally. It is powered by the
self-contained `abstractvision playground` command; it does **not** require an
AbstractCore server.

The playground is a local/dev surface. Do not expose it as an authenticated
production serving boundary; use AbstractCore/Gateway for production routing,
authentication, and browser-origin policy.

## Required API endpoints

The page calls:

- `GET /v1/models` (ping)
- `GET /v1/vision/models` (list cached models + active model)
- `POST /v1/vision/model/load` (load a model into memory)
- `POST /v1/vision/model/unload` (unload the active model)
- `POST /v1/vision/jobs/images/generations` (start a text→image job)
- `POST /v1/vision/jobs/images/edits` (start an image→image job)
- `GET /v1/vision/jobs/{job_id}` (poll job status)
  - on success, the page calls `GET /v1/vision/jobs/{job_id}?consume=1` to fetch-and-consume the result

## 1) Start the local playground server

From an AbstractVision checkout:

```bash
PYTHONPATH=src python -m abstractvision playground --port 8091
```

Or, when installed:

```bash
abstractvision playground --port 8091
```

Quick sanity checks (should return JSON):

```bash
curl -s http://127.0.0.1:8091/v1/models | head
curl -s http://127.0.0.1:8091/v1/vision/models | head
```

## 2) Open the page

Open:

- `http://127.0.0.1:8091/vision_playground.html`

Usage notes:
- You must **select a cached model** and load it before running inference.
- Raw Hugging Face model ids such as `runwayml/stable-diffusion-v1-5` load directly; no `diffusers/` provider prefix is required.
- For first tests, prefer a small cached model such as Stable Diffusion 1.5 before loading larger Qwen/FLUX models.
- “Extra JSON” is forwarded to the server:
  - T2I: merged into the JSON request body
  - I2I: sent as a string field `extra_json` in the multipart form body

## 3) stable-diffusion.cpp / GGUF notes

If your server is configured to run GGUF diffusion models via stable-diffusion.cpp, you’ll typically need:
- a diffusion model (`.gguf`)
- a VAE (`.safetensors`) for some families (e.g. Qwen Image GGUF)
- a text encoder/LLM (`.gguf`) for some families (e.g. Qwen Image GGUF)

Exact configuration is backend-specific; check AbstractVision’s backend docs.
--- >8 --- END FILE: playground/README.md --- >8 ---

--- 8< --- FILE: CONTRIBUTING.md --- 8< ---
# Contributing to AbstractVision

Thanks for taking the time to contribute. This repository aims to stay small, stable-by-design, and easy to integrate.

AbstractVision is part of the **AbstractFramework** ecosystem:
- AbstractFramework: <https://github.com/lpalbou/AbstractFramework>
- AbstractCore: <https://github.com/lpalbou/abstractcore>
- AbstractRuntime: <https://github.com/lpalbou/abstractruntime>

## Ground rules

- Keep the public API stable (`VisionManager` in [`src/abstractvision/vision_manager.py`](src/abstractvision/vision_manager.py)).
- Prefer additive changes (new fields, new models, new backends) over breaking changes.
- Don’t commit model weights, large binaries, or cache artifacts.
- Make docs and examples match the code (the repo is intended to be “readme-first”).
- Keep imports lazy for heavy stacks (see [`src/abstractvision/backends/__init__.py`](src/abstractvision/backends/__init__.py)).

## Development setup

```bash
python -m venv .venv
. .venv/bin/activate
python -m pip install -U pip
python -m pip install -e ".[dev]"
```

Optional (if you work on AbstractCore integration locally):

```bash
python -m pip install abstractcore
```

The `abstractvision[abstractcore]` extra is only a compatibility marker. AbstractCore is intentionally supplied by the host application, not installed by AbstractVision.

## Run tests

```bash
python -m unittest discover -s tests -p "test_*.py" -q
```

## Common contribution types

### 1) Improve documentation

Core entrypoints:
- [`README.md`](README.md)
- [`docs/getting-started.md`](docs/getting-started.md)
- [`docs/architecture.md`](docs/architecture.md)
- [`docs/api.md`](docs/api.md)
- [`docs/faq.md`](docs/faq.md)

Doc hygiene checklist:
- Commands are copy/pastable.
- Links resolve (relative links are preferred).
- Claims about support status match the current code (see [`docs/reference/backends.md`](docs/reference/backends.md)).
- Major claims are anchored in evidence (link to the relevant `src/` implementation).
- Prefer diagrams in Mermaid when they improve clarity ([`docs/architecture.md`](docs/architecture.md) is the canonical place).

### 2) Add or update models in the capability registry

Source of truth:
- `src/abstractvision/assets/vision_model_capabilities.json`

Validator + loader:
- `src/abstractvision/model_capabilities.py`

Checklist:
- Add/update the model entry in the JSON.
- Run the unit tests (they validate schema + coverage).
- Sanity check CLI output:
  - `abstractvision show-model <model_id>`

### 3) Add a new backend

Backend interface:
- `src/abstractvision/backends/base_backend.py`

Where backends live:
- `src/abstractvision/backends/`

Checklist:
- Implement the `VisionBackend` methods (raise `CapabilityNotSupportedError` for unsupported tasks).
- Keep imports lazy (avoid importing Torch/Diffusers at module import time unless unavoidable).
- Add/extend tests under `tests/`.
- Document the backend in `docs/reference/backends.md` and, if user-facing, add a short section in `docs/getting-started.md`.

## Submitting a change

Please include:
- A short explanation of the change and why it’s needed.
- Test results (`python -m unittest ...`).
- Any doc updates required to keep the repository truthful.

## Questions / discussions

If you’re unsure about scope or design, open an issue with a minimal proposal and a concrete example (inputs/outputs).
--- >8 --- END FILE: CONTRIBUTING.md --- >8 ---

--- 8< --- FILE: SECURITY.md --- 8< ---
# Security policy

We take security issues seriously and appreciate responsible disclosure.

## Reporting a vulnerability

Please **do not** open a public GitHub issue for security reports.

Instead, report privately by email:

- `contact@abstractcore.ai`

Include as much of the following as you can:

- A clear description of the issue and its impact
- Reproduction steps (or a minimal PoC)
- Affected versions / commit hash (if known)
- Any relevant logs, stack traces, or configuration
- Suggested mitigation (if you have one)

If you believe the issue is in an upstream dependency (e.g. Torch/Diffusers), it can still be helpful to notify us so we can assess impact and coordinate messaging for AbstractVision users.

## What to expect

We aim to:

- Acknowledge receipt within **3 business days**
- Provide a status update within **7 business days**

If a coordinated disclosure timeline is needed, please include your preferred timeline in the report.

## Scope

This policy applies to vulnerabilities in this repository’s code and packaging.

For non-security bugs and feature requests, please use the normal issue tracker.
--- >8 --- END FILE: SECURITY.md --- >8 ---

--- 8< --- FILE: ACKNOWLEDGMENTS.md --- 8< ---
# Acknowledgments

AbstractVision stands on the shoulders of excellent open-source projects and communities.

## Optional runtime dependencies (declared as extras)

- **Hugging Face Diffusers** (local pipeline runtime; used by the Diffusers backend): [`src/abstractvision/backends/huggingface_diffusers.py`](src/abstractvision/backends/huggingface_diffusers.py) (declared in the `diffusers`/`local`/`all` extras)
- **PyTorch** (tensor runtime for local inference; used via Diffusers): [`src/abstractvision/backends/huggingface_diffusers.py`](src/abstractvision/backends/huggingface_diffusers.py) (declared in the `diffusers`/`local`/`all` extras)
- **Hugging Face Transformers** (tokenizers/encoders used by some diffusion pipelines; imported by the Diffusers backend): [`src/abstractvision/backends/huggingface_diffusers.py`](src/abstractvision/backends/huggingface_diffusers.py) (declared in the `diffusers`/`local`/`all` extras)
- **Accelerate** (installed for ecosystem compatibility; used transitively by some pipelines): declared in optional extras in `pyproject.toml`
- **Safetensors** (model weight format support; used by Diffusers/Transformers): declared in optional extras in `pyproject.toml`
- **SentencePiece** (T5/tokenizer support for some model families): declared in optional extras in `pyproject.toml`
- **protobuf** (runtime dependency for some tokenizers/pipelines): declared in optional extras in `pyproject.toml`
- **einops** (tensor ops used by some modern architectures): declared in optional extras in `pyproject.toml`
- **PEFT** (LoRA adapter support used by Diffusers): declared in optional extras in `pyproject.toml`
- **Pillow** (image I/O utilities used by local backends): [`src/abstractvision/backends/huggingface_diffusers.py`](src/abstractvision/backends/huggingface_diffusers.py), [`src/abstractvision/backends/stable_diffusion_cpp.py`](src/abstractvision/backends/stable_diffusion_cpp.py) (declared in optional extras in `pyproject.toml`)
- **stable-diffusion-cpp-python** (python bindings used when `sd-cli` is not available): [`src/abstractvision/backends/stable_diffusion_cpp.py`](src/abstractvision/backends/stable_diffusion_cpp.py) (declared in the `sdcpp`/`local`/`all` extras)

## Runtime dependencies (transitive but central)

- **huggingface_hub** (model and adapter downloads; used by Diffusers/Transformers pipelines)

## Upstream projects

- **stable-diffusion.cpp** (upstream project that provides `sd-cli` and the core GGUF runtime wrapped by the bindings): [`src/abstractvision/backends/stable_diffusion_cpp.py`](src/abstractvision/backends/stable_diffusion_cpp.py)

## Optional integrations

- **AbstractCore** (tool integration helpers + capability plugin): [`src/abstractvision/integrations/`](src/abstractvision/integrations/) (optional dependency in [`pyproject.toml`](pyproject.toml))

## Packaging

- **setuptools** and **wheel** (build system): [`pyproject.toml`](pyproject.toml)

## Community and contributors

Thanks to everyone who reports issues, suggests improvements, and contributes fixes or documentation updates.
--- >8 --- END FILE: ACKNOWLEDGMENTS.md --- >8 ---

--- 8< --- FILE: CHANGELOG.md --- 8< ---
# Changelog

## Unreleased

## 0.3.1 - 2026-05-07

- CI/release: move GitHub Actions checkout/setup-python/artifact/release actions to Node 24-compatible major versions, removing the Node 20 deprecation warnings from release runs.
- Tests: strengthen packaging metadata coverage so Diffusers aliases and `local`/`all` runtime bundles cannot drift from their intended dependency sets.
- Docs: separate contributor-only extras from runtime install profiles and explicitly mark `dev` as unsuitable for application/runtime dependency declarations.

## 0.3.0 - 2026-05-07

- Packaging: make the base install lightweight. `pip install abstractvision` no longer installs Torch, Diffusers, Transformers, Pillow, or local inference runtimes by default.
- Extras: add canonical runtime profiles `abstractvision[openai]`, `abstractvision[openai-compatible]`, `abstractvision[diffusers]`, `abstractvision[sdcpp]`, `abstractvision[local]`, and `abstractvision[all]`; keep `huggingface`/`huggingface-dev` compatibility aliases.
- Runtime defaults: keep AbstractCore plugin and one-shot CLI remote-first, and stop the REPL/playground from silently selecting Diffusers when no backend is configured.
- Errors/tests/CI: improve local-backend missing-extra hints, add stronger packaging/import-light/OpenAI-compatible coverage, and split CI into lightweight base and local Diffusers paths.
- Docs: update install guidance, backend references, and internal backlog policy for explicit local runtime extras.
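
To make "lightweight base" concrete, the probe below only checks whether the optional runtime modules can be imported; with a bare `pip install abstractvision` every check is expected to be False. It is an illustration, not part of the package, and it assumes the stable-diffusion.cpp bindings expose the `stable_diffusion_cpp` module.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the named top-level module is importable in this environment."""
    return importlib.util.find_spec(name) is not None


# Base install: no Torch/Diffusers/Transformers/Pillow, no local inference runtimes.
print("diffusers stack present:", all(has_module(m) for m in ("torch", "diffusers", "transformers")))
print("sdcpp bindings present: ", has_module("stable_diffusion_cpp"))
print("Pillow present:         ", has_module("PIL"))
```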

## 0.2.6 - 2026-05-06

- Docs: refresh install extras, AbstractCore integration, playground ownership, OpenAI-compatible request-shape notes, and backend/config references so the public docs match the current code.
- Agent docs: update `llms.txt`, include playground endpoint docs in the generated `llms-full.txt` bundle, and regenerate the AI-ready documentation from current sources.
- Contributing: clarify that `abstractvision[abstractcore]` is a compatibility marker and that AbstractCore is supplied by the host application.

## 0.2.5 - 2026-05-06

- Packaging: keep the default Diffusers backend installable while moving `stable-diffusion-cpp-python` out of the base dependency set and into the explicit `sdcpp`/`local` extras. This keeps AbstractCore plugin installs from failing on platforms where stable-diffusion.cpp bindings need a local native build.
- AbstractCore plugin: restore the OpenAI-compatible HTTP backend as the default while keeping local `diffusers` and `sdcpp` backends explicit through config/env.
- OpenAI-compatible backend: shape requests correctly for real OpenAI GPT image models while preserving local OpenAI-compatible extensions for unknown model ids (see the request-shape sketch after this entry).
- Playground: capture the active backend at job submission time so background jobs do not accidentally run on a newly selected model.
- Docs/tests: clarify the default install shape, document when to install `abstractvision[sdcpp]`, and add metadata/OpenAI/playground coverage so these paths do not regress.
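
As a rough illustration of the request-shape distinction above: the official OpenAI Images API (`POST /v1/images/generations`) accepts a small, fixed set of fields, while local OpenAI-compatible servers commonly tolerate extra diffusion parameters. The payloads below are a hedged sketch of that difference, not AbstractVision's internal request code; the extension field names on the local side vary by server.

```python
# Canonical OpenAI images request: only the documented fields.
openai_payload = {
    "model": "gpt-image-1",
    "prompt": "a watercolor lighthouse at dusk",
    "n": 1,
    "size": "1024x1024",
}

# Typical local OpenAI-compatible request: same core fields, plus server-specific
# extensions (names here are examples only).
local_payload = {
    "model": "stable-diffusion-v1-5",
    "prompt": "a watercolor lighthouse at dusk",
    "size": "512x512",
    "steps": 25,
    "negative_prompt": "blurry, low quality",
}
```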

## 0.2.4 - 2026-05-06

- Playground: add a self-contained `abstractvision playground` command that serves both the web UI and `/v1/vision/*` API locally, so playground testing no longer depends on an AbstractCore server.
- Playground: package the HTML asset in the wheel, default the UI to the serving origin, and avoid stale persisted API URLs that could keep calling an older AbstractCore endpoint.
- Playground model loading: accept raw Hugging Face ids such as `runwayml/stable-diffusion-v1-5` directly, while still accepting explicit backend prefixes like `diffusers/...`, `sdcpp/...`, and `openai-compatible/...` (see the parsing sketch after this entry).
- Packaging/CI: keep AbstractCore out of AbstractVision dependency metadata and test workflows; AbstractCore remains an optional host integration loaded lazily when present.
- Docs/tests: refresh playground docs around the self-contained local server and add coverage for cached model listing, raw model loading, playground jobs, and tool integration without installing AbstractCore.
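
The prefix convention above amounts to "optional backend name, a slash, then the model id". The helper below is a hypothetical sketch of that rule; the function name, return shape, and backend set are invented for illustration and may not match the playground's actual implementation.

```python
from typing import Optional, Tuple

KNOWN_BACKENDS = {"diffusers", "sdcpp", "openai-compatible"}


def split_model_id(raw: str) -> Tuple[Optional[str], str]:
    """Split an optional backend prefix off a model id.

    'diffusers/runwayml/stable-diffusion-v1-5' -> ('diffusers', 'runwayml/stable-diffusion-v1-5')
    'runwayml/stable-diffusion-v1-5'           -> (None, 'runwayml/stable-diffusion-v1-5')
    """
    prefix, _, rest = raw.partition("/")
    if prefix in KNOWN_BACKENDS and rest:
        return prefix, rest
    return None, raw
```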

## 0.2.3 - 2026-05-06

- AbstractCore plugin: support local Diffusers and stable-diffusion.cpp backends through `llm.vision`, not only OpenAI-compatible HTTP. The default plugin path now matches the REPL default: local Diffusers with `runwayml/stable-diffusion-v1-5`, cache-only unless `ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1` is set.
- AbstractCore plugin: keep OpenAI-compatible usage available with `ABSTRACTVISION_BACKEND=openai` plus `ABSTRACTVISION_BASE_URL`, and preserve artifact-store behavior for generated media outputs (environment sketch below).
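
A minimal environment sketch for the two plugin paths described in this entry, set from Python before the host application loads the plugin (shell `export` works the same way). The variable names are the ones quoted above; the base URL is a placeholder, and you would normally pick only one option.

```python
import os

# Option A: stay on local Diffusers but allow on-demand model downloads
# (the plugin is cache-only unless this is set).
os.environ["ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD"] = "1"

# Option B: route the plugin to an OpenAI-compatible HTTP server instead.
os.environ["ABSTRACTVISION_BACKEND"] = "openai"
os.environ["ABSTRACTVISION_BASE_URL"] = "http://localhost:8000/v1"  # placeholder endpoint
```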

## 0.2.2 - 2026-05-06

- Release automation: add GitHub Actions CI/release workflows, issue templates, pre-commit config, MkDocs config, PyPI trusted publishing, GitHub Releases, and release-time docs deployment to `gh-pages`.
- Packaging: support Python 3.9-3.13, modernize license metadata, include package data explicitly, and add test/docs/dev extras.
- Defaults: make the REPL default to local Diffusers with `runwayml/stable-diffusion-v1-5` and `ABSTRACTVISION_DIFFUSERS_DEVICE=auto`, running cache-only/offline with runtime downloads disabled by default.
- Diffusers backend: keep `allow_download=False` as the default, force Hugging Face offline/cache-only env during loads/calls, disable implicit HF token use offline, and load from cached snapshot paths when present.
- Diffusers backend: ignore unknown REPL `extra` flags that the pipeline `__call__` does not accept, avoiding `unexpected keyword argument` crashes.
- Diffusers backend: add better fp16 variant fallback behavior, MPS dtype/invalid-output retry handling, LoRA/Rapid-AIO offline handling, and clearer missing-local-model errors.
- Capability registry: add `black-forest-labs/FLUX.2-klein-9B` and normalize FLUX license id to `flux-non-commercial-license`.
- Docs: refresh quickstarts around Stable Diffusion 1.5 first, add clearer macOS Metal/NVIDIA CUDA/CPU guidance, expand stable-diffusion.cpp notes, and point users to cache-only local workflows.
- Tooling: add `scripts/download_model_sets.py` for explicit heavyweight model downloads (Stable Diffusion 1.5, FLUX 2 GGUF/Diffusers, and Qwen Image snapshots).
- Cleanup: remove the misspelled duplicate `ACKNOWLEDMENTS.md`.

## 0.2.1

- Documentation refresh for public release:
  - add `docs/api.md` and strengthen cross-linking between README and docs
  - add `CONTRIBUTING.md`, `SECURITY.md`, and `ACKNOWLEDGMENTS.md`
  - add `llms.txt` and generated `llms-full.txt` for agent-oriented context
  - clarify playground/server endpoint expectations (`/v1/vision/*`)

## 0.2.0

- Add stable-diffusion.cpp (`sd-cli`) backend for local GGUF diffusion models.
- REPL: forward unknown `--flags` as backend `extra` parameters.
- Add a tiny web playground (`playground/vision_playground.html`) for testing via AbstractCore Server vision endpoints (`/v1/vision/*`).

## 0.1.0

- Initial MVP: capability registry + schema validation.
- Artifact-first outputs via `LocalAssetStore` and runtime adapter.
- OpenAI-compatible HTTP backend for image generation/editing (optional video endpoints via config).
- Local Diffusers backend for images (opt-in deps).
- AbstractCore tool integration (`make_vision_tools`) with artifact refs.
- CLI/REPL for interactive manual testing.
--- >8 --- END FILE: CHANGELOG.md --- >8 ---

--- 8< --- FILE: pyproject.toml --- 8< ---
```toml
[build-system]
requires = ["setuptools>=77.0.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "abstractvision"
dynamic = ["version"]
description = "Model-agnostic generative vision abstractions (image/video) for the Abstract ecosystem"
readme = "README.md"
license = "MIT"
license-files = ["LICENSE"]
authors = [{name = "Laurent-Philippe Albou", email = "contact@abstractcore.ai"}]
requires-python = ">=3.9"
classifiers = [
  "Development Status :: 3 - Alpha",
  "Intended Audience :: Developers",
  "Operating System :: OS Independent",
  "Programming Language :: Python :: 3",
  "Programming Language :: Python :: 3.9",
  "Programming Language :: Python :: 3.10",
  "Programming Language :: Python :: 3.11",
  "Programming Language :: Python :: 3.12",
  "Programming Language :: Python :: 3.13",
  "Topic :: Multimedia",
  "Topic :: Scientific/Engineering :: Artificial Intelligence",
]
# Base install is intentionally lightweight. Local inference runtimes are
# explicit extras so OpenAI-compatible/AbstractCore plugin hosts do not pull
# Torch, Diffusers, CUDA-adjacent wheels, or local model stacks by default.
dependencies = []

[project.urls]
Homepage = "https://github.com/lpalbou/abstractvision"
Repository = "https://github.com/lpalbou/abstractvision"

[project.scripts]
abstractvision = "abstractvision.cli:main"

[project.entry-points."abstractcore.capabilities_plugins"]
abstractvision = "abstractvision.integrations.abstractcore_plugin:register"

[project.optional-dependencies]
# Official OpenAI image endpoint intent. The backend is stdlib-only today and
# does not require the OpenAI SDK.
openai = []

# Generic OpenAI-shaped HTTP endpoint intent (local or remote /v1 servers).
# The backend is stdlib-only today; keep this extra for readable install commands.
openai-compatible = []

# Local image generation via Hugging Face Diffusers/Torch.
diffusers = [
  "diffusers>=0.36.0",
  "torch>=2.0,<3.0.0",
  "transformers>=4.0,<6.0.0",
  "accelerate>=0.0",
  "safetensors>=0.0",
  # Needed by T5 tokenizers used in SD3/FLUX and some other diffusion pipelines.
  "sentencepiece>=0.1.99",
  # Some HF tokenizers/pipelines require protobuf at runtime.
  "protobuf>=3.20.0",
  # Used by some modern diffusion architectures.
  "einops>=0.7.0",
  # LoRA adapter support in Diffusers.
  "peft>=0.10.0",
  "Pillow>=9.0",
]

# Local generation via stable-diffusion.cpp python bindings (pip-installable).
sdcpp = [
  "stable-diffusion-cpp-python>=0.4.2",
  "Pillow>=9.0",
]

# Compatibility extra for callers that still request the historical Diffusers extra.
huggingface = [
  "diffusers>=0.36.0",
  "torch>=2.0,<3.0.0",
  "transformers>=4.0,<6.0.0",
  "accelerate>=0.0",
  "safetensors>=0.0",
  "sentencepiece>=0.1.99",
  "protobuf>=3.20.0",
  "einops>=0.7.0",
  "peft>=0.10.0",
  "Pillow>=9.0",
]

# Convenience: installs both local backends (Diffusers + stable-diffusion.cpp python bindings).
local = [
  "diffusers>=0.36.0",
  "torch>=2.0,<3.0.0",
  "transformers>=4.0,<6.0.0",
  "accelerate>=0.0",
  "safetensors>=0.0",
  "sentencepiece>=0.1.99",
  "protobuf>=3.20.0",
  "einops>=0.7.0",
  "peft>=0.10.0",
  "stable-diffusion-cpp-python>=0.4.2",
  "Pillow>=9.0",
]

# Convenience: installs every runtime backend, but not contributor tooling.
all = [
  "diffusers>=0.36.0",
  "torch>=2.0,<3.0.0",
  "transformers>=4.0,<6.0.0",
  "accelerate>=0.0",
  "safetensors>=0.0",
  "sentencepiece>=0.1.99",
  "protobuf>=3.20.0",
  "einops>=0.7.0",
  "peft>=0.10.0",
  "stable-diffusion-cpp-python>=0.4.2",
  "Pillow>=9.0",
]

# NOTE: PyPI rejects VCS/direct URL dependencies in package metadata.
# If you need Diffusers "main" for unreleased pipelines, install it explicitly *after*:
#   pip install "abstractvision[diffusers-dev]"
#   pip install "diffusers @ git+https://github.com/huggingface/diffusers@main"
diffusers-dev = [
  "diffusers>=0.36.0",
  "torch>=2.0,<3.0.0",
  "transformers>=5.0",
  "accelerate>=0.0",
  "safetensors>=0.0",
  "sentencepiece>=0.1.99",
  "protobuf>=3.20.0",
  "einops>=0.7.0",
  "peft>=0.10.0",
  "Pillow>=9.0",
]

# Compatibility extra for callers that still request the historical dev extra.
huggingface-dev = [
  "diffusers>=0.36.0",
  "torch>=2.0,<3.0.0",
  "transformers>=5.0",
  "accelerate>=0.0",
  "safetensors>=0.0",
  "sentencepiece>=0.1.99",
  "protobuf>=3.20.0",
  "einops>=0.7.0",
  "peft>=0.10.0",
  "Pillow>=9.0",
]

# Compatibility extra only. AbstractVision is loaded by AbstractCore as a plugin,
# so it deliberately does not install AbstractCore as a dependency.
abstractcore = []

# Test dependencies used by local contributors. CI installs the package with
# `--no-deps` and then adds only these light test-time dependencies.
test = [
  "pytest>=7.0.0",
  "Pillow>=9.0",
  "torch>=2.0,<3.0.0",
]

# Documentation dependencies.
docs = [
  "mkdocs>=1.5.0",
  "mkdocs-material>=9.0.0",
]

# Complete local development environment.
dev = [
  "pytest>=7.0.0",
  "Pillow>=9.0",
  "torch>=2.0,<3.0.0",
  "diffusers>=0.36.0",
  "transformers>=4.0,<6.0.0",
  "accelerate>=0.0",
  "safetensors>=0.0",
  "sentencepiece>=0.1.99",
  "protobuf>=3.20.0",
  "einops>=0.7.0",
  "peft>=0.10.0",
  "mkdocs>=1.5.0",
  "mkdocs-material>=9.0.0",
  "build>=1.0.0",
  "twine>=4.0.0",
  "ruff>=0.5.7",
  "black>=23.0.0",
  "pre-commit>=3.0.0",
]

[tool.setuptools]
packages = [
  "abstractvision",
  "abstractvision.assets",
  "abstractvision.backends",
  "abstractvision.integrations",
  "abstractvision.playground",
]

[tool.setuptools.package-dir]
"" = "src"

[tool.setuptools.dynamic]
version = {attr = "abstractvision.__version__"}

[tool.setuptools.package-data]
abstractvision = ["assets/*.json", "playground/*.html"]

[tool.black]
line-length = 100
target-version = ["py39"]
include = '\.pyi?$'
extend-exclude = '''
/(
  \.eggs
  | \.git
  | \.hg
  | \.mypy_cache
  | \.tox
  | \.venv
  | build
  | dist
)/
'''

[tool.isort]
profile = "black"
multi_line_output = 3
line_length = 100
known_first_party = ["abstractvision"]

[tool.ruff]
target-version = "py39"
line-length = 100

[tool.ruff.lint]
select = [
  "E",
  "W",
  "F",
  "I",
  "B",
  "C4",
  "UP",
]
ignore = [
  "E501",
  "B008",
  "B904",
]

[tool.ruff.lint.per-file-ignores]
"__init__.py" = ["F401"]
"tests/**/*" = ["B011"]

[tool.pytest.ini_options]
minversion = "7.0"
addopts = "-ra --tb=short --strict-markers --strict-config --disable-warnings -p no:cacheprovider"
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
python_classes = ["Test*"]
```
--- >8 --- END FILE: pyproject.toml --- >8 ---
