Metadata-Version: 2.4
Name: abstractvoice
Version: 0.6.2
Summary: A modular Python library for voice interactions with AI systems
Author-email: Laurent-Philippe Albou <contact@abstractcore.ai>
License-Expression: MIT
Project-URL: Repository, https://github.com/lpalbou/abstractvoice
Project-URL: Documentation, https://github.com/lpalbou/abstractvoice#readme
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24.0
Requires-Dist: requests>=2.31.0
Requires-Dist: appdirs>=1.4.0
Requires-Dist: piper-tts>=1.2.0
Requires-Dist: huggingface_hub>=0.20.0
Requires-Dist: faster-whisper>=0.10.0
Requires-Dist: sounddevice>=0.4.6
Requires-Dist: soundfile>=0.12.1
Requires-Dist: webrtcvad>=2.0.10
Provides-Extra: voice
Requires-Dist: sounddevice>=0.4.6; extra == "voice"
Requires-Dist: webrtcvad>=2.0.10; extra == "voice"
Requires-Dist: soundfile>=0.12.1; extra == "voice"
Provides-Extra: audio-fx
Requires-Dist: librosa>=0.10.0; extra == "audio-fx"
Provides-Extra: cloning
Requires-Dist: f5-tts>=1.1.0; extra == "cloning"
Provides-Extra: chroma
Requires-Dist: torch>=2.0.0; extra == "chroma"
Requires-Dist: torchaudio>=2.0.0; extra == "chroma"
Requires-Dist: torchvision>=0.15.0; extra == "chroma"
Requires-Dist: transformers>=5.0.0rc0; extra == "chroma"
Requires-Dist: accelerate>=1.0.0; extra == "chroma"
Requires-Dist: av>=14.0.0; extra == "chroma"
Requires-Dist: librosa>=0.11.0; extra == "chroma"
Requires-Dist: audioread>=3.0.0; extra == "chroma"
Requires-Dist: pillow>=11.0.0; extra == "chroma"
Requires-Dist: safetensors>=0.5.0; extra == "chroma"
Provides-Extra: aec
Requires-Dist: aec-audio-processing>=1.0.1; extra == "aec"
Provides-Extra: stt
Requires-Dist: openai-whisper>=20230314; extra == "stt"
Requires-Dist: tiktoken>=0.6.0; extra == "stt"
Provides-Extra: web
Requires-Dist: flask>=2.0.0; extra == "web"
Provides-Extra: all
Requires-Dist: piper-tts>=1.2.0; extra == "all"
Requires-Dist: sounddevice>=0.4.6; extra == "all"
Requires-Dist: webrtcvad>=2.0.10; extra == "all"
Requires-Dist: openai-whisper>=20230314; extra == "all"
Requires-Dist: librosa>=0.10.0; extra == "all"
Requires-Dist: soundfile>=0.12.1; extra == "all"
Requires-Dist: flask>=2.0.0; extra == "all"
Requires-Dist: tiktoken>=0.6.0; extra == "all"
Requires-Dist: f5-tts>=1.1.0; extra == "all"
Requires-Dist: aec-audio-processing>=1.0.1; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: black>=22.0.0; extra == "dev"
Requires-Dist: flake8>=5.0.0; extra == "dev"
Provides-Extra: voice-full
Requires-Dist: sounddevice>=0.4.6; extra == "voice-full"
Requires-Dist: webrtcvad>=2.0.10; extra == "voice-full"
Requires-Dist: openai-whisper>=20230314; extra == "voice-full"
Requires-Dist: librosa>=0.10.0; extra == "voice-full"
Requires-Dist: soundfile>=0.12.1; extra == "voice-full"
Requires-Dist: tiktoken>=0.6.0; extra == "voice-full"
Provides-Extra: core-stt
Requires-Dist: openai-whisper>=20230314; extra == "core-stt"
Requires-Dist: tiktoken>=0.6.0; extra == "core-stt"
Provides-Extra: audio-only
Requires-Dist: sounddevice>=0.4.6; extra == "audio-only"
Requires-Dist: webrtcvad>=2.0.10; extra == "audio-only"
Requires-Dist: soundfile>=0.12.1; extra == "audio-only"
Dynamic: license-file

# AbstractVoice

A modular Python library for **voice I/O** in AI applications.

- **TTS (default)**: Piper (cross-platform, no system deps)
- **STT (default)**: faster-whisper
- **Local assistant**: `listen()` + `speak()` with playback/listening control
- **Headless/server**: `speak_to_bytes()` / `speak_to_file()` and `transcribe_*`

Status: **alpha** (`0.6.2`). The supported integrator surface is documented in `docs/api.md`.

Next: `docs/getting-started.md` (recommended setup + first smoke tests).

> AbstractVoice will ultimately be integrated as the voice modality of AbstractFramework.  
> An OpenAI-compatible voice endpoint is an optional demo/integration layer (see backlog).

---

## Install

```bash
pip install abstractvoice
```

Optional extras (feature flags):

```bash
pip install "abstractvoice[all]"
```

Notes:
- `abstractvoice[all]` enables most optional features (incl. cloning + AEC + audio-fx), but **does not** include the GPU-heavy Chroma runtime.
- For the full list of extras (and platform troubleshooting), see `docs/installation.md`.
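Extras can also be installed selectively. The extra names below are the ones declared in the package metadata (`voice`, `stt`, `web`, …); pick only what you need:

```shell
# Mic capture + VAD + audio file I/O only
pip install "abstractvoice[voice]"

# Whisper-based STT (openai-whisper + tiktoken)
pip install "abstractvoice[stt]"

# Several extras can be combined in one install
pip install "abstractvoice[voice,stt,web]"
```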

### Explicit model downloads (recommended; never implicit in the REPL)

Some features rely on large model weights/artifacts. AbstractVoice will **not**
download these implicitly inside the REPL (offline-first).

After installing, prefetch explicitly (cross-platform):

```bash
abstractvoice-prefetch --stt small
abstractvoice-prefetch --piper en
abstractvoice-prefetch --openf5
abstractvoice-prefetch --chroma
```

Or equivalently:

```bash
python -m abstractvoice download --stt small
python -m abstractvoice download --piper en
python -m abstractvoice download --openf5
python -m abstractvoice download --chroma
```

Notes:
- `--piper <lang>` downloads the Piper ONNX voice for that language into `~/.piper/models`.
- `--openf5` is ~5.4GB. `--chroma` is very large (GPU-heavy).

---

## Quick smoke tests

### REPL (fastest end-to-end)

```bash
abstractvoice --verbose
# or (from a source checkout):
python -m abstractvoice cli --verbose
```

Notes:
- Mic voice input is **off by default** for fast startup. Enable with `--voice-mode stop` (or in-session: `/voice stop`).
- The REPL is **offline-first**: no implicit model downloads. Use the explicit download commands above.

See `docs/repl_guide.md`.

### Minimal Python

```python
from abstractvoice import VoiceManager

vm = VoiceManager()
vm.speak("Hello! This is AbstractVoice.")
```

---

## Public API (stable surface)

See `docs/api.md` for the supported integrator contract.

At a glance:
- **TTS**: `speak()`, `stop_speaking()`, `pause_speaking()`, `resume_speaking()`, `speak_to_bytes()`, `speak_to_file()`
- **STT**: `transcribe_file()`, `transcribe_from_bytes()`
- **Mic**: `listen()`, `stop_listening()`, `pause_listening()`, `resume_listening()`
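Taken together, the headless calls support a simple file round-trip (synthesize, then transcribe). The method names come from the list above; the exact parameters (output path, return types) are illustrative assumptions, so check `docs/api.md` for the actual signatures:

```python
from abstractvoice import VoiceManager

vm = VoiceManager()

# Synthesize to a WAV file instead of playing through speakers
# (argument order is assumed here, not documented)
vm.speak_to_file("Hello from a headless server.", "hello.wav")

# Transcribe the file we just produced
text = vm.transcribe_file("hello.wav")
print(text)
```

This pattern is what makes the library usable on servers without audio hardware: nothing above touches a speaker or a microphone.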

---

## Documentation (minimal set)

- **Docs index**: `docs/README.md`
- **Getting started**: `docs/getting-started.md`
- **FAQ**: `docs/faq.md`
- **Orientation**: `docs/overview.md`
- **Acronyms**: `docs/acronyms.md`
- **Public API**: `docs/api.md`
- **REPL guide**: `docs/repl_guide.md`
- **Install troubleshooting**: `docs/installation.md`
- **Multilingual support**: `docs/multilingual.md`
- **Architecture (internal)**: `docs/architecture.md` + `docs/adr/`
- **Model management (Piper-first)**: `docs/model-management.md`
- **Licensing notes**: `docs/voices-and-licenses.md`

---

## Project

- **Changelog**: `CHANGELOG.md`
- **Contributing**: `CONTRIBUTING.md`
- **Security**: `SECURITY.md`
- **Acknowledgments**: `ACKNOWLEDGMENTS.md`

## License

MIT. See `LICENSE`.
