Metadata-Version: 2.4
Name: acatome-store
Version: 0.7.2
Summary: Persistent storage, dedup, metadata queries, and semantic search for acatome bundles
Project-URL: Homepage, https://github.com/retospect/acatome-store
Project-URL: Repository, https://github.com/retospect/acatome-store
Author-email: Reto Stamm <reto@retostamm.com>
License-Expression: GPL-3.0-or-later
License-File: LICENSE
Keywords: citations,postgres,scientific-papers,semantic-search,sqlite,storage
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.11
Requires-Dist: chromadb>=0.5
Requires-Dist: llama-index-core>=0.11
Requires-Dist: llama-index-vector-stores-chroma>=0.2
Requires-Dist: precis-summary>=0.1.0
Requires-Dist: sqlalchemy>=2.0
Requires-Dist: typer>=0.12
Provides-Extra: embeddings
Requires-Dist: sentence-transformers>=3.0; extra == 'embeddings'
Provides-Extra: mysql
Requires-Dist: pymysql>=1.0; extra == 'mysql'
Provides-Extra: neo4j
Requires-Dist: neo4j>=5.0; extra == 'neo4j'
Provides-Extra: postgres
Requires-Dist: pgvector>=0.3; extra == 'postgres'
Requires-Dist: psycopg[binary]>=3.0; extra == 'postgres'
Description-Content-Type: text/markdown

# acatome-store

Persistent storage, deduplication, metadata queries, and semantic search for scientific paper bundles.

## Features

- **SQLAlchemy ORM** — portable across SQLite, Postgres, MySQL
- **Refs + Papers split** — identity table (refs) separate from ingested content (papers)
- **Citation graph** — directed `citing → cited` edges, works for ingested + stub papers
- **Supplements** — ingest supplementary PDFs with scoped block retrieval
- **Retractions** — flag papers as retracted with notes
- **Vector search** — ChromaDB (default) or pgvector (zero text duplication)
- **CLI** — `acatome-store` command for ingest, reingest, query, retract, and stats
- **Schema management** — `reset_schema()` and `reingest --drop` for clean rebuilds
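
The refs/papers split and the citation graph can be pictured with a small standalone sketch. It uses the standard-library `sqlite3` module rather than the package's SQLAlchemy models, and every table name, column name, and the `cited_by` helper are assumptions made for illustration; the real schema may differ.

```python
import sqlite3

# Hypothetical schema for illustration only; not the actual acatome-store tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE refs (
    id  INTEGER PRIMARY KEY,
    doi TEXT UNIQUE NOT NULL
);
CREATE TABLE papers (
    ref_id INTEGER PRIMARY KEY REFERENCES refs(id),
    title  TEXT NOT NULL
);
CREATE TABLE citations (
    citing_id INTEGER NOT NULL REFERENCES refs(id),
    cited_id  INTEGER NOT NULL REFERENCES refs(id),
    PRIMARY KEY (citing_id, cited_id)
);
""")

# One ingested paper (ref row + paper row) and one stub (ref row only).
conn.execute("INSERT INTO refs (id, doi) VALUES (1, '10.1234/example')")
conn.execute("INSERT INTO papers (ref_id, title) VALUES (1, 'An ingested paper')")
conn.execute("INSERT INTO refs (id, doi) VALUES (2, '10.1234/stub')")

# Directed edge: ref 1 cites ref 2, even though ref 2 has no ingested content.
conn.execute("INSERT INTO citations (citing_id, cited_id) VALUES (1, 2)")


def cited_by(conn, ref_id):
    """Return (doi, is_ingested) for every ref that ref_id cites."""
    rows = conn.execute(
        """
        SELECT r.doi, p.ref_id IS NOT NULL
        FROM citations c
        JOIN refs r ON r.id = c.cited_id
        LEFT JOIN papers p ON p.ref_id = r.id
        WHERE c.citing_id = ?
        ORDER BY r.doi
        """,
        (ref_id,),
    ).fetchall()
    return [(doi, bool(flag)) for doi, flag in rows]


print(cited_by(conn, 1))  # [('10.1234/stub', False)]
```

Because edges point at identity rows rather than content rows, a cited paper can take part in the graph before its bundle is ever ingested.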

## Installation

```bash
uv pip install -e .
```

With Postgres support (the `embeddings`, `mysql`, and `neo4j` extras are installed the same way):

```bash
uv pip install -e ".[postgres]"
```

## Usage

```python
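from acatome_store import Store

# Placeholder path; point this at a real .acatome bundle on disk.
bundle_path = "/path/to/bundle.acatome"

store = Store()
ref_id = store.ingest(bundle_path)  # ingest the bundle and keep its ref ID
paper = store.get(ref_id)           # look the paper up again by that ID
results = store.search_text("transformer attention", top_k=5)
# hits include paper info, block summaries, and text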
```

## CLI

```bash
acatome-store ingest /path/to/bundle.acatome    # single bundle
acatome-store ingest /path/to/dir/               # directory of bundles
acatome-store reingest                            # re-ingest all from ~/.acatome/papers/
acatome-store reingest --drop                     # drop schema + re-ingest (confirm prompt)
acatome-store reingest --path /other/dir          # custom bundle directory
acatome-store stats
acatome-store search "CO2 capture"
acatome-store list
acatome-store info doi:10.1234/example
acatome-store retract doi:10.1234/fake --note "Fabricated data"
```

### Schema Reset

If the database schema drifts from the model (e.g. after upgrading acatome-store),
use `reingest --drop` to drop all tables, recreate them from the current SQLAlchemy model,
and re-ingest all `.acatome` bundles. No data is lost, because the bundles, not the database, are the source of truth.

```bash
acatome-store reingest --drop
# prompts for confirmation, then:
# 1. Drops all tables (refs, blocks, papers, links, etc.)
# 2. Recreates schema from current model
# 3. Re-ingests all bundles from ~/.acatome/papers/
```

Programmatically:

```python
from acatome_store import Store

store = Store()
store.reset_schema()  # drop + recreate tables
```

## Testing

```bash
uv run python -m pytest tests/ -v
```

## License

GPL-3.0-or-later, matching the package's `License-Expression` metadata; see [LICENSE](LICENSE).
