# AbstractMemory (full)

> Append-only, temporal, provenance-aware triple assertions + deterministic query API, with optional LanceDB persistence and vector/semantic retrieval.

This file is intended to be a standalone, copy/pasteable context for agentic coding assistants.
For the human entry point, see [README.md](README.md). For getting started, see [docs/getting-started.md](docs/getting-started.md).

## Docs index

User-facing docs:
- [docs/getting-started.md](docs/getting-started.md)
- [docs/faq.md](docs/faq.md)
- [docs/api.md](docs/api.md)
- [docs/stores.md](docs/stores.md)
- [docs/architecture.md](docs/architecture.md)
- [docs/development.md](docs/development.md)
- [docs/README.md](docs/README.md)

Release/docs hygiene:
- [CHANGELOG.md](CHANGELOG.md)
- [CONTRIBUTING.md](CONTRIBUTING.md)
- [SECURITY.md](SECURITY.md)
- [LICENSE](LICENSE)
- [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md)

## Repository layout

- `src/abstractmemory/` — library source (src-layout)
- `docs/` — user-facing documentation
- `tests/` — unit tests (LanceDB tests are optional and skipped if `lancedb` is not installed)

Public API exports are defined in [src/abstractmemory/__init__.py](src/abstractmemory/__init__.py).

## Install

Requires Python 3.10+ (see [pyproject.toml](pyproject.toml)).

From PyPI (when published); the second command adds the optional LanceDB extra:

```bash
python -m pip install AbstractMemory
python -m pip install "AbstractMemory[lancedb]"
```

From source (recommended for this monorepo package):

```bash
python -m pip install -e .
```

Optional persistent backend + vector search:

```bash
python -m pip install -e ".[lancedb]"
```

Dev extras (tests):

```bash
python -m pip install -e ".[dev]"
```

Note: the distribution name is `AbstractMemory` (pip is case-insensitive). The import name is `abstractmemory`.

## Quickstart (in-memory)

```python
from abstractmemory import InMemoryTripleStore, TripleAssertion, TripleQuery

store = InMemoryTripleStore()
store.add(
    [
        TripleAssertion(
            subject="Scrooge",
            predicate="related_to",
            object="Christmas",
            scope="session",
            owner_id="sess-1",
            observed_at="2026-01-01T00:00:00+00:00",
            provenance={"span_id": "span_123"},
        )
    ]
)

hits = store.query(TripleQuery(subject="scrooge", scope="session", owner_id="sess-1", limit=10))
assert hits[0].object == "christmas"  # terms are canonicalized (trim + lowercase)
```

Evidence:
- Canonicalization is tested in [tests/test_term_canonicalization.py](tests/test_term_canonicalization.py).

## Core concepts (v0)

### Data model: `TripleAssertion`

Source: [src/abstractmemory/models.py](src/abstractmemory/models.py)

An append-only semantic assertion with temporal + provenance metadata.

Fields (selected):
- `subject`, `predicate`, `object` (canonicalized: trim + lowercase)
- `scope` (`run|session|global`) + optional `owner_id`
- `observed_at` (timestamp string), `valid_from` / `valid_until` (optional validity window)
- `confidence` (optional float)
- `provenance` dict (e.g. span/artifact pointers)
- `attributes` dict (extractor evidence/context and retrieval metadata)
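
The "trim + lowercase" canonicalization of `subject`, `predicate`, and `object` can be illustrated with a small stand-alone sketch. `canonicalize_term` is a hypothetical name used only for this example; the library's actual implementation lives in `models.py` and may handle more cases (for instance Unicode normalization).

```python
def canonicalize_term(term: str) -> str:
    """Hypothetical helper: trim surrounding whitespace, then lowercase."""
    return term.strip().lower()


# Differently cased or padded inputs collapse to the same canonical term,
# which is why a query for "scrooge" can match an assertion about "Scrooge".
examples = ["Scrooge", "  Christmas ", "RELATED_TO"]
canonical = [canonicalize_term(t) for t in examples]
```

Because stored terms and query terms both pass through the same step, exact-match filters behave case-insensitively from the caller's point of view.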

Helpers:
- `to_dict()` / `from_dict(...)` for serialization.

### Query model: `TripleQuery`

Source: [src/abstractmemory/store.py](src/abstractmemory/store.py)

Structured filters:
- `subject`, `predicate`, `object` (exact match after canonicalization)
- `scope`, `owner_id`
- `since` / `until` filter `observed_at`
- `active_at` filters by validity window:
  - include if `(valid_from is None or valid_from <= active_at)` and `(valid_until is None or valid_until > active_at)`
  - end is **exclusive**
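
The `active_at` rule above can be written as a small predicate. This is a minimal sketch of the stated condition, not the library's code; `is_active` is a hypothetical name, and timestamps are compared as strings, so all values are assumed to share one format and offset.

```python
from typing import Optional


def is_active(valid_from: Optional[str], valid_until: Optional[str], active_at: str) -> bool:
    """Return True if active_at falls inside [valid_from, valid_until)."""
    starts_ok = valid_from is None or valid_from <= active_at
    ends_ok = valid_until is None or valid_until > active_at  # exclusive end
    return starts_ok and ends_ok


start = "2026-01-01T00:00:00+00:00"
end = "2026-02-01T00:00:00+00:00"

inside = is_active(start, end, "2026-01-15T00:00:00+00:00")
at_start = is_active(start, end, start)   # the start is inclusive
at_end = is_active(start, end, end)       # the end is exclusive
open_ended = is_active(None, None, "2030-01-01T00:00:00+00:00")
```

An assertion with neither bound set is treated as always active under this rule.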

Semantic/vector retrieval (optional):
- `query_text` requires a configured embedder; there is no keyword fallback, and stores raise `ValueError` when no embedder is configured.
- `query_vector` bypasses embedding generation.
- `vector_column` controls the vector field name (default `vector`).
- `min_score` is a cosine similarity threshold.
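
To make the `min_score` threshold concrete, here is a dependency-free sketch of cosine similarity with a cutoff. The function names are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption; check the store implementations for the exact comparison.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_min_score(query_vector, candidates, min_score):
    """Score (name, vector) pairs and keep those at or above min_score, best first."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in candidates]
    kept = [(name, score) for name, score in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


candidates = [("a", [1.0, 0.0]), ("b", [1.0, 1.0]), ("c", [0.0, 1.0])]
hits = filter_by_min_score([1.0, 0.0], candidates, min_score=0.5)
```

Here `a` scores 1.0, `b` scores about 0.707, and `c` scores 0.0, so only `a` and `b` survive a threshold of 0.5.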

Result shaping:
- `order`: `"asc" | "desc"` by `observed_at` for non-semantic queries
- `limit <= 0` means “unbounded” (see [tests/test_triple_store_limits.py](tests/test_triple_store_limits.py)).
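
The result-shaping rules can be sketched as follows. `shape_results` and the plain-dict rows are hypothetical stand-ins for illustration; the defaults chosen here are assumptions rather than the library's documented defaults.

```python
def shape_results(rows: list[dict], order: str = "desc", limit: int = 0) -> list[dict]:
    """Sort rows by observed_at (string comparison) and apply the limit."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    ordered = sorted(rows, key=lambda r: r["observed_at"], reverse=(order == "desc"))
    # A non-positive limit means "unbounded": return everything.
    return ordered if limit <= 0 else ordered[:limit]


rows = [
    {"id": "t2", "observed_at": "2026-01-02T00:00:00+00:00"},
    {"id": "t1", "observed_at": "2026-01-01T00:00:00+00:00"},
    {"id": "t3", "observed_at": "2026-01-03T00:00:00+00:00"},
]
newest_two = [r["id"] for r in shape_results(rows, order="desc", limit=2)]
everything = [r["id"] for r in shape_results(rows, order="asc", limit=0)]
```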

Vector query results:
- When using `query_text` or `query_vector`, stores attach retrieval metadata to `attributes["_retrieval"]` (cosine score; LanceDB also includes `_distance`).

Important implementation detail:
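- Timestamps are compared and filtered as strings, so ordering is lexicographic. Use one consistent RFC 3339 format in UTC, such as `2026-01-01T00:00:00+00:00`; timestamps with mixed offsets or formats may not sort chronologically.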
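
The sketch below shows why a consistent UTC format matters when timestamps are compared as strings. `to_utc_rfc3339` is a hypothetical helper, not part of the library; it uses only the standard library to normalize offsets before values are stored or queried.

```python
from datetime import datetime, timezone


def to_utc_rfc3339(ts: str) -> str:
    """Parse an ISO 8601 timestamp that carries an offset and re-emit it in UTC."""
    parsed = datetime.fromisoformat(ts)
    if parsed.tzinfo is None:
        raise ValueError("timestamp must carry a UTC offset")
    return parsed.astimezone(timezone.utc).isoformat()


earlier = "2026-01-01T01:00:00+02:00"  # equals 2025-12-31T23:00:00 in UTC
later = "2026-01-01T00:30:00+00:00"

# Raw string comparison gets the order wrong because the offsets differ.
naive_order = earlier < later
# After normalizing both to UTC, string order matches chronological order.
normalized_order = to_utc_rfc3339(earlier) < to_utc_rfc3339(later)
```

Normalizing at write time, before calling `add(...)`, keeps `since`, `until`, and `active_at` filters consistent without any change to the stores.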

### Stores

All stores implement the `TripleStore` protocol (see [src/abstractmemory/store.py](src/abstractmemory/store.py)):
- `add(assertions) -> list[str]` (returns generated assertion ids)
- `query(q) -> list[TripleAssertion]`
- `close()`

#### InMemoryTripleStore

Source: [src/abstractmemory/in_memory_store.py](src/abstractmemory/in_memory_store.py)

- Dependency-free, stores rows (and optional vectors) in process memory.
- If constructed with an `embedder`, `add(...)` embeds a canonical text representation per assertion.
- `query_text` requires an embedder; otherwise raises `ValueError` (see [tests/test_in_memory_query_text_fallback.py](tests/test_in_memory_query_text_fallback.py)).

#### LanceDBTripleStore (optional)

Source: [src/abstractmemory/lancedb_store.py](src/abstractmemory/lancedb_store.py)

- Persistent LanceDB-backed table stored under a local path (`uri`).
- Creates the table on first insert.
- Stores `provenance` and `attributes` as JSON strings plus a canonical `text` column.
- Vector search uses `metric("cosine")` and attaches retrieval metadata to `attributes["_retrieval"]`.
- Persistence across reopen is tested in [tests/test_lancedb_triple_store.py](tests/test_lancedb_triple_store.py).

## Embeddings boundary (no AbstractCore dependency)

Source: [src/abstractmemory/embeddings.py](src/abstractmemory/embeddings.py)

- `TextEmbedder` protocol: `embed_texts(texts) -> list[list[float]]`
- `AbstractGatewayTextEmbedder`: calls an AbstractGateway embeddings endpoint via HTTP (`POST` JSON `{ "input": [...] }`) and expects an OpenAI-like `data[]` response with `embedding` (and optional `index`).
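
As an illustration of this boundary, the sketch below defines a structurally compatible embedder and a parser for an OpenAI-like `data[]` payload. All names here (`TextEmbedderLike`, `FakeEmbedder`, `parse_embeddings_response`) are hypothetical, the `list[str]` input type is an assumption, and the real `AbstractGatewayTextEmbedder` may order and validate items differently.

```python
from typing import Any, Protocol


class TextEmbedderLike(Protocol):
    """Local stand-in for the protocol shape described above."""

    def embed_texts(self, texts: list[str]) -> list[list[float]]: ...


class FakeEmbedder:
    """Deterministic toy embedder for tests: [length, number of spaces]."""

    def embed_texts(self, texts: list[str]) -> list[list[float]]:
        return [[float(len(t)), float(t.count(" "))] for t in texts]


def parse_embeddings_response(payload: dict[str, Any]) -> list[list[float]]:
    """Extract embeddings from a data[] payload, ordered by index when present."""
    items = payload["data"]
    # Fall back to list position when an item has no explicit index.
    ordered = sorted(enumerate(items), key=lambda pair: pair[1].get("index", pair[0]))
    return [[float(x) for x in item["embedding"]] for _, item in ordered]


embedder: TextEmbedderLike = FakeEmbedder()
vectors = embedder.embed_texts(["a b", "abc"])

payload = {
    "data": [
        {"index": 1, "embedding": [0.3, 0.4]},
        {"index": 0, "embedding": [0.1, 0.2]},
    ]
}
parsed = parse_embeddings_response(payload)
```

Any object with a matching `embed_texts` method satisfies a protocol of this shape, which is what keeps the package free of an AbstractCore dependency.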

## Architecture (diagram)

See the maintained architecture doc with a component diagram:
- [docs/architecture.md](docs/architecture.md)

## Development & tests

- Run tests: `python -m pytest -q`
- LanceDB tests are skipped when `lancedb` is not installed.
- [tests/conftest.py](tests/conftest.py) bootstraps `sys.path` for monorepo layouts.

## Change checklist (when modifying behavior)

- Update/extend tests (especially store/query contracts).
- Update user-facing docs (`README.md`, `docs/getting-started.md`, and relevant `docs/*.md`).
- Add an entry to `CHANGELOG.md`.
