Metadata-Version: 2.4
Name: acaua
Version: 0.1.0.dev0
Summary: Apache-2.0 PyTorch-native computer vision library. One API for detection, segmentation, and (soon) OBB/OCR/VLM/zero-shot.
Project-URL: Homepage, https://github.com/CondadosAI/acaua
Project-URL: Repository, https://github.com/CondadosAI/acaua
Project-URL: Issues, https://github.com/CondadosAI/acaua/issues
Author-email: Luis Condados <condadoslgpc@gmail.com>
License: Apache-2.0
License-File: LICENSE
License-File: LICENSES.md
Keywords: computer-vision,image-segmentation,object-detection,pytorch,transformers
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Requires-Python: >=3.10
Requires-Dist: huggingface-hub<1.0,>=0.23
Requires-Dist: numpy>=1.24
Requires-Dist: pillow>=10
Requires-Dist: requests>=2.31
Requires-Dist: supervision>=0.20
Requires-Dist: torch>=2.1
Requires-Dist: transformers>=4.40
Provides-Extra: all
Provides-Extra: export
Provides-Extra: ocr
Provides-Extra: vlm
Provides-Extra: zeroshot
Description-Content-Type: text/markdown

# acaua

Apache-2.0 PyTorch-native computer vision. One API (`predict`, `train`, `export`) across detection and segmentation today, with OBB, OCR, VLM, and zero-shot planned.

One class to learn. One to remember.

```python
import acaua

model = acaua.Model.from_pretrained("CondadosAI/rtdetr_r50vd")
results = model.predict("image.jpg")
print(results.boxes, results.labels, results.scores)
```

## Why this exists

The Python computer-vision stack has three ugly choices today:

- **Ultralytics** — great DX, but the weights are AGPL. Ship them in anything commercial and you inherit the whole AGPL umbrella. Lawyers say no.
- **mmdetection / mmsegmentation** — depth, but registries, config DSLs, and `mmengine/mmcv` abstractions that feel foreign to anyone who wrote a `torch.nn.Module` in the last decade.
- **PaddleDetection / PaddleOCR** — state of the art, but you're shipping Paddle custom ops. Different ecosystem. Different GPU kernels. Different problem.

And then there's **Roboflow Supervision** — Apache-2.0, pythonic, but intentionally thin. It handles annotation and post-processing, not training and multi-task model serving.

acaua fills the gap: an Apache-2.0, PyTorch-native, pythonic library that puts detection and segmentation (and, in later releases, OBB, OCR, VLM, and zero-shot) behind a single `predict` / `train` / `export` surface.

## Install

```bash
pip install acaua
```

Torch and Transformers come as direct deps. First install is a few hundred MB; get a coffee.

## The parity demo

Two tasks, one API:

```python
import acaua

# Object detection
det = acaua.Model.from_pretrained("CondadosAI/rtdetr_r50vd")
det_result = det.predict("http://images.cocodataset.org/val2017/000000039769.jpg")
print(det_result.boxes.shape, det_result.labels.tolist(), det_result.scores.tolist())

# Instance segmentation — same .predict(), different task
seg = acaua.Model.from_pretrained("CondadosAI/mask2former_swin_tiny_coco_instance")
seg_result = seg.predict("http://images.cocodataset.org/val2017/000000039769.jpg")
print(seg_result.masks.shape)  # adds masks on top of the common fields

# Plug into Roboflow Supervision for annotation
detections = det_result.to_supervision()
```

The signature equality is not an accident — `tests/test_predict_parity.py` is a contract test that fails CI if any adapter drifts.
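
As a rough illustration of what such a contract test might check, the sketch below compares `predict` signatures with the standard-library `inspect` module. The adapter classes and the `images` parameter name are hypothetical stand-ins, not acaua's real classes; only `batch_size` and `device` come from this README.

```python
import inspect


class DetectionAdapter:
    """Hypothetical stand-in for a detection adapter."""

    def predict(self, images, *, batch_size=8, device=None):
        return []


class SegmentationAdapter:
    """Hypothetical stand-in for a segmentation adapter."""

    def predict(self, images, *, batch_size=8, device=None):
        return []


def predict_signatures_match(adapters):
    """Return True when every adapter exposes an identical predict() signature."""
    signatures = [inspect.signature(adapter.predict) for adapter in adapters]
    # Signature equality covers parameter names, kinds, order, and defaults.
    return all(signature == signatures[0] for signature in signatures)


assert predict_signatures_match([DetectionAdapter, SegmentationAdapter])
```

An adapter that renames a parameter or changes a default would make the comparison fail, which is the kind of drift the contract test is meant to catch.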

## What's in v0.1

| Task | Backend | Weights |
|---|---|---|
| Object detection | transformers `AutoModelForObjectDetection` (RT-DETR) | [`CondadosAI/rtdetr_r50vd`](https://huggingface.co/CondadosAI/rtdetr_r50vd) |
| Instance segmentation | transformers `AutoModelForUniversalSegmentation` (Mask2Former-Swin-Tiny) | [`CondadosAI/mask2former_swin_tiny_coco_instance`](https://huggingface.co/CondadosAI/mask2former_swin_tiny_coco_instance) |

Both adapters:
- Share the same `.predict()` signature (contract-tested).
- Wrap the forward pass in `torch.inference_mode()`.
- Accept the same input Union: `str` (path or http/https URL), `pathlib.Path`, `PIL.Image`, `np.ndarray` (HWC uint8 RGB), `torch.Tensor` (CHW float32), or a sequence of any of the above.
- Batch under the hood with `batch_size=8` by default.
- Resolve the device automatically, preferring `cuda`, then `mps`, then `cpu`, or take an explicit `device=` kwarg.
- Auto-reject BGR-smelling numpy arrays (OpenCV users, you know who you are) with an error that names the exact fix.
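
The device fallback and default batching described above can be sketched in plain Python. This is a minimal illustration, not acaua's internal code: `resolve_device` and `iter_batches` are hypothetical names, and the availability flags are passed in explicitly so the example runs without PyTorch (in practice they would likely come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`).

```python
from itertools import islice


def resolve_device(requested="auto", *, cuda_available=False, mps_available=False):
    """Pick a device string, preferring cuda, then mps, then cpu."""
    if requested != "auto":
        return requested  # an explicit device= always wins
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def iter_batches(items, batch_size=8):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch


print(resolve_device(cuda_available=False, mps_available=True))  # mps
print([len(batch) for batch in iter_batches(range(20))])  # [8, 8, 4]
```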

## What's coming

| | v0.1 | v0.2 | v0.3 |
|---|---|---|---|
| `predict` | detection + instance segmentation | + OBB, OCR | + VLM, zero-shot |
| `train` | per-backend idioms (not parity-gated) | signature parity across adapters | — |
| `export` | (empty extras shipped) | ONNX parity | TensorRT, CoreML |

Tasks arrive as extras to avoid bloating the core install:

```bash
pip install "acaua[ocr]"         # doctr / surya family (v0.2+)
pip install "acaua[vlm]"         # Florence-2, Qwen-VL, Moondream (v0.3+)
pip install "acaua[zeroshot]"    # GroundingDINO, SAM2, OWLv2, YOLO-World (v0.3+)
pip install "acaua[export]"      # ONNX (v0.2+); TensorRT, CoreML (v0.3+)
pip install "acaua[all]"
```

The extras claim their slots today and ship empty — installing `acaua[ocr]` right now is a no-op, but when v0.2 lands it transparently picks up the new adapter.

## License

**Code:** Apache-2.0. See [`LICENSE`](./LICENSE).

**Weights:** every weight in the default load path has an audited Apache-2.0 upstream, mirrored under `CondadosAI/*` at a pinned commit SHA. See [`LICENSES.md`](./LICENSES.md) for the full matrix, attribution chains, and policy notes. Non-Apache weights raise `LicenseError` unless you pass `allow_non_apache=True`.
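
One way such a license gate could work is sketched below. It is an assumption-laden illustration rather than acaua's implementation: `check_weight_license` and the lowercase identifier format are hypothetical, and `LicenseError` is redefined locally so the example is self-contained.

```python
class LicenseError(Exception):
    """Local stand-in for the error named above."""


APACHE_LICENSE_IDS = {"apache-2.0"}  # assumed identifier format


def check_weight_license(license_id, *, allow_non_apache=False):
    """Raise LicenseError for non-Apache weights unless explicitly allowed."""
    normalized = license_id.strip().lower()
    if normalized in APACHE_LICENSE_IDS or allow_non_apache:
        return normalized
    raise LicenseError(
        f"weights are licensed under {license_id!r}; "
        "pass allow_non_apache=True to load them anyway"
    )


check_weight_license("Apache-2.0")  # passes silently
check_weight_license("AGPL-3.0", allow_non_apache=True)  # explicit opt-in
```

Calling `check_weight_license("AGPL-3.0")` without the opt-in raises `LicenseError`, so a non-Apache weight cannot slip into a load path by accident.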

## How acaua differs from the alternatives

| | License | API | PyTorch-native? | Tasks in scope |
|---|---|---|---|---|
| Ultralytics | AGPL weights | class per task | yes | detection, pose, classification, segmentation |
| mmdetection | Apache-2.0 | registries + config DSLs | yes (but mmengine) | detection, segmentation, OBB |
| PaddleDetection | Apache-2.0 | Paddle-native | no (Paddle) | everything, but in a different ecosystem |
| Supervision | Apache-2.0 | annotation + post-processing | yes | not a model library |
| **acaua** | Apache-2.0 code + weights | one class, three verbs | yes | detection, segmentation now; OBB/OCR/VLM/zero-shot next |

If Ultralytics' license weren't AGPL, it would fit this slot. It is, so acaua is here.

## Design non-negotiables

These are what acaua is actively *not* doing, so you can stop us if you see us drift:

- **No config-file DSL.** No registry system. No "model zoo" registry class.
- **No framework fork.** PyTorch idioms only. No mmcv / mmengine. No Paddle custom ops.
- **No re-implementing.** Extend `torchvision.models.detection` and `transformers` where possible.
- **Per-adapter LOC budget.** Every adapter module stays under 300 logical lines. If we blow that budget, the abstraction is leaking — stop and redesign, don't just raise the limit.
- **License-audit every weight.** Code license isn't enough; weight license is the actual ship-blocker.
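
A budget check of this kind can be approximated with the standard-library `tokenize` module. The sketch below is hypothetical and is not the project's actual CI script; it assumes "logical lines" means completed statements, so blank lines, comment-only lines, and line breaks inside brackets are not counted, while docstrings are.

```python
import io
import tokenize

LOC_BUDGET = 300  # the per-adapter budget stated above


def count_logical_lines(source):
    """Count logical lines; blank and comment-only lines are skipped."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # tokenize emits NEWLINE once per completed statement and NL for
    # blank lines, comment-only lines, and line breaks inside brackets.
    return sum(1 for token in tokens if token.type == tokenize.NEWLINE)


def within_budget(source, budget=LOC_BUDGET):
    """Return True when the module stays under the budget."""
    return count_logical_lines(source) < budget


sample = "x = 1\n\n# comment\ny = (\n    2\n)\n"
print(count_logical_lines(sample))  # 2
print(within_budget(sample))  # True
```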

## Development

```bash
git clone https://github.com/CondadosAI/acaua
cd acaua
uv sync --group dev
uv run pytest tests/ -m "not e2e"
```

CI (`.github/workflows/ci.yml`) runs ruff, mypy, the adapter LOC budget script, the unit suite on Python 3.10–3.13 with a 90% coverage floor, plus one E2E job. Nightly runs the full cross-OS matrix.

## Contributing

Open an issue before a PR. For new adapters: the 300-line budget is real and CI-enforced. If your adapter needs more than that, the abstraction is probably leaking and we need to design before we code.

## Acknowledgements

acaua is a thin layer over work done by the PyTorch team, Hugging Face, torchvision, Roboflow Supervision, and the authors of RT-DETR, Mask2Former, and Swin. See [`LICENSES.md`](./LICENSES.md) for the full chain.

---

*The name is **acauã** — a Brazilian bird of prey (Herpetotheres cachinnans) that hunts snakes. Named for where the library was made.*
