Metadata-Version: 2.4
Name: aablocks
Version: 0.1.1
Summary: A-Alpha Bio SDK for accessing Atlas datasets
Author: A-Alpha Bio
Project-URL: Homepage, https://aalphabio.com
Keywords: alphaseq,datasets,api,client
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: requests>=2.28.0
Requires-Dist: oauthlib>=3.2.0
Requires-Dist: click>=8.0.0
Requires-Dist: tqdm>=4.64.0

# aablocks

A-Alpha Bio SDK for accessing Atlas datasets.

## Installation

```bash
pip install aablocks
```

For DataFrame support:

```bash
pip install aablocks pandas   # For pandas
pip install aablocks polars   # For polars
```

## Quick Start

### Python

```python
import aablocks as aa

# Login (opens browser)
aa.login()

# List datasets
datasets = aa.list_datasets()

# Get data as pandas DataFrame
df = aa.get_dataset("ab1001")
```

### CLI

```bash
# Login (opens browser)
> aablocks login

# List datasets
> aablocks list

# Download a dataset
> aablocks get ab1001 -o data.csv
```

## Python API

### `aa.login()`

Authenticate with the Atlas. Opens browser for OAuth. No-op if already logged in.

```python
import aablocks as aa
aa.login()
```

### `aa.logout()`

Clear cached authentication token.

```python
aa.logout()
```

### `aa.list_datasets(all_versions=False, format=None)`

List all accessible datasets.

**Parameters:**

| Name           | Type   | Description                                       |
| -------------- | ------ | ------------------------------------------------- |
| `all_versions` | `bool` | Return all versions (default: latest only)        |
| `format`       | `str`  | `"csv"`, `"list"`, `"pandas"`, or `"polars"` (default: config) |

**Returns:** `list[Dataset] | str | DataFrame`

```python
# As pandas DataFrame (default)
df = aa.list_datasets()

# As Dataset objects
datasets = aa.list_datasets(format="list")
for d in datasets:
    print(f"{d.id}: {d.name}")

# All versions as polars
df = aa.list_datasets(all_versions=True, format="polars")
```

### `aa.get_details(dataset_id)`

Get metadata for a specific dataset.

**Parameters:**

| Name         | Type  | Description                    |
| ------------ | ----- | ------------------------------ |
| `dataset_id` | `str` | Dataset ID (e.g., `"ab1001"`) |

**Returns:** `Dataset`

```python
dataset = aa.get_details("ab1001")
print(dataset.name)
print(dataset.modes)  # ['default', 'ml']
```
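
Because `modes` lists the data modes a dataset offers, it can be checked before requesting one. A minimal sketch: `choose_mode` is a hypothetical helper, not part of aablocks, and falling back to `"default"` is an assumption about what a caller would want.

```python
def choose_mode(available_modes, preferred="ml", fallback="default"):
    """Return `preferred` if the dataset offers it, otherwise `fallback`.

    Hypothetical helper (not part of aablocks). `available_modes` is the
    list found in `Dataset.modes`, e.g. ['default', 'ml'].
    """
    if preferred in available_modes:
        return preferred
    if fallback in available_modes:
        return fallback
    raise ValueError(
        f"neither {preferred!r} nor {fallback!r} found in {available_modes!r}"
    )


# Intended use (requires aablocks and a prior login):
#   dataset = aa.get_details("ab1001")
#   df = aa.get_dataset("ab1001", mode=choose_mode(dataset.modes))
```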

### `aa.get_readme(dataset_id, version=None)`

Get README content for a dataset.

**Parameters:**

| Name         | Type  | Description              |
| ------------ | ----- | ------------------------ |
| `dataset_id` | `str` | Dataset ID               |
| `version`    | `str` | Version (default: latest) |

**Returns:** `str` (markdown)

```python
readme = aa.get_readme("ab1001")
print(readme)
```

### `aa.get_dataset(dataset_id, version=None, mode=None, max_rows=None, format=None, output_path=None, output_compressed=False, progress=None)`

Download dataset data.

**Parameters:**

| Name                | Type   | Description                      |
| ------------------- | ------ | -------------------------------- |
| `dataset_id`        | `str`  | Dataset ID (e.g., `"ab1001"`)   |
| `version`           | `str`  | Version (default: latest)        |
| `mode`              | `str`  | Data mode: `"default"` or `"ml"` |
| `max_rows`          | `int`  | Max rows to return (max 10,000)  |
| `format`            | `str`  | `"csv"`, `"pandas"`, or `"polars"` |
| `output_path`       | `str`  | Write to file instead of returning |
| `output_compressed` | `bool` | Keep gzip compression            |
| `progress`          | `bool` | Show progress bar                |

**Returns:** `str | DataFrame | None` (`None` when `output_path` is set and the data is written to a file)

```python
# Pandas DataFrame (default)
df = aa.get_dataset("ab1001")

# ML-ready data
df = aa.get_dataset("ab1001", mode="ml")

# Preview first 100 rows
df = aa.get_dataset("ab1001", max_rows=100)

# Download to file
aa.get_dataset("ab1001", output_path="data.csv")

# Download compressed
aa.get_dataset("ab1001", output_path="data.csv.gz", output_compressed=True)
```
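
For repeated runs it may be convenient to skip downloads that already exist on disk. The sketch below wraps a `get_dataset`-like callable; `download_once`, the `atlas_cache` directory, and the `<dataset_id>.csv` naming scheme are illustrative assumptions, not part of aablocks.

```python
from pathlib import Path


def download_once(fetch, dataset_id, cache_dir="atlas_cache"):
    """Download `dataset_id` into `cache_dir` unless the file already exists.

    `fetch` is expected to behave like `aa.get_dataset` called with
    `output_path=...`. Hypothetical helper, not part of aablocks.
    """
    target = Path(cache_dir) / f"{dataset_id}.csv"
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        fetch(dataset_id, output_path=str(target))
    return target


# Intended use (requires aablocks and a prior login):
#   import aablocks as aa
#   path = download_once(aa.get_dataset, "ab1001")
```

Note that this sketch keys the cache on the dataset ID only, so a newer version of the same dataset would not be picked up until the cached file is deleted.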

### `aa.set_config(key, value)`

Set a configuration value.

| Key          | Description                        | Default    |
| ------------ | ---------------------------------- | ---------- |
| `api_format` | Output format (csv, pandas, polars) | `"pandas"` |
| `progress`   | Show download progress             | `True`     |

```python
aa.set_config("api_format", "polars")
aa.set_config("progress", False)
```

### `Dataset`

Dataset metadata container returned by `list_datasets(format="list")` and `get_details()`.

| Attribute      | Type             | Description                              |
| -------------- | ---------------- | ---------------------------------------- |
| `id`           | `str`            | Dataset identifier                       |
| `name`         | `str`            | Human-readable name                      |
| `experiment`   | `str`            | Overview/use case                        |
| `details`      | `str`            | Experimental details                     |
| `groups`       | `list[str]`      | Access groups (e.g., `["tier1"]`)        |
| `modes`        | `list[str]`      | Data modes (e.g., `["default", "ml"]`)   |
| `version`      | `str`            | Version number                           |
| `release_date` | `str \| None`    | Release date (ISO format)                |
| `locked`       | `bool`           | Locked for current user's tier           |
| `url`          | `str \| None`    | Direct URL                               |
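
The attributes above are enough to filter a listing before downloading anything. The sketch below uses a stand-in dataclass so it runs without network access; `DatasetStub` and `downloadable` are hypothetical names, the stub is not the real aablocks class, and the sample values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DatasetStub:
    """Stand-in that mirrors the documented attribute names only."""
    id: str
    name: str
    experiment: str = ""
    details: str = ""
    groups: list = field(default_factory=list)
    modes: list = field(default_factory=list)
    version: str = "1"
    release_date: Optional[str] = None
    locked: bool = False
    url: Optional[str] = None


def downloadable(datasets, mode=None):
    """Return datasets that are not locked and, if `mode` is given, offer it."""
    return [
        d for d in datasets
        if not d.locked and (mode is None or mode in d.modes)
    ]


# Made-up sample data; real objects would come from
# aa.list_datasets(format="list").
sample = [
    DatasetStub(id="ab1001", name="AlphaBlock 1001", modes=["default", "ml"]),
    DatasetStub(id="ab1614", name="AlphaBlock 1614", modes=["default"], locked=True),
]
print([d.id for d in downloadable(sample, mode="ml")])  # ['ab1001']
```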

## CLI

After installation, the `aablocks` command is available in your terminal.

### Global Options

| Option      | Description           |
| ----------- | --------------------- |
| `--version` | Show version and exit |
| `--help`    | Show help and exit    |

### `aablocks login`

Log in to the Atlas. Opens your browser for authentication. Tokens are cached locally and automatically refreshed.

```bash
> aablocks login
Opening browser for authentication...
Logged in successfully.
```

### `aablocks logout`

Log out and clear cached credentials.

```bash
> aablocks logout
Logged out successfully.
```

### `aablocks list [OPTIONS]`

List all datasets accessible to the current user.

| Option           | Description                                     |
| ---------------- | ----------------------------------------------- |
| `--all-versions` | Include all versions of each dataset             |
| `-f, --format`   | Output format: `table`, `csv`, or `json` (default: `table`) |

```bash
# List as table (default)
> aablocks list
ID           Name                           Version  Released     Groups
----------------------------------------------------------------------------------
ab1001       AlphaBlock 1001                1        2026-01-21   tier1
ab1479       AlphaBlock 1479                1        2026-01-21   tier1
ab1614       AlphaBlock 1614                1        2026-01-21   tier2

# List as JSON
> aablocks list -f json

# List as CSV
> aablocks list -f csv

# Include all versions
> aablocks list --all-versions
```

### `aablocks details <dataset_id>`

Show detailed metadata for a specific dataset.

```bash
> aablocks details ab1001
ID:           ab1001
Name:         AlphaBlock 1001
Version:      1
Released:     2026-01-21
Groups:       tier1
Modes:        default, ml
Experiment:   Local affinity landscape on VHH72-SARS-CoV-2 RBD
```

### `aablocks readme <dataset_id> [OPTIONS]`

Show the README documentation for a dataset.

| Option          | Description                          |
| --------------- | ------------------------------------ |
| `-v, --version` | Specific version to retrieve         |
| `--raw`         | Output raw markdown without rendering |

```bash
# Display rendered README
> aablocks readme ab1001

# Get raw markdown
> aablocks readme ab1001 --raw

# Save to file
> aablocks readme ab1001 --raw > README.md
```

### `aablocks get <dataset_id> [OPTIONS]`

Download CSV data for a dataset.

| Option                    | Description                        |
| ------------------------- | ---------------------------------- |
| `-v, --version`           | Specific version to retrieve       |
| `-m, --mode`              | Data mode: `default` or `ml`       |
| `-n, --max-rows`          | Max rows to return (max 10,000)    |
| `-f, --format`            | Output format: `csv`, `table`, or `gz` |
| `-o, --output`            | Output file path                   |
| `--progress/--no-progress` | Show download progress bar         |

```bash
# Print CSV to stdout
> aablocks get ab1001

# Preview first 100 rows
> aablocks get ab1001 -n 100

# Download to file
> aablocks get ab1001 -o data.csv

# Download compressed
> aablocks get ab1001 -o data.csv.gz -f gz

# Get ML-ready variant
> aablocks get ab1001 --mode ml

# Display as table
> aablocks get ab1001 -n 10 -f table

# Pipe to other tools
> aablocks get ab1001 | head -100 > sample.csv
```
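
A file downloaded with `-f gz` stays gzip-compressed, so it has to be decompressed when read. A minimal sketch using only the Python standard library; `preview_gz_csv` is a hypothetical helper and UTF-8 encoding is an assumption about the file contents.

```python
import csv
import gzip


def preview_gz_csv(path, n=5):
    """Return the header and the first `n` data rows of a gzip-compressed CSV."""
    # Mode "rt" decompresses and decodes to text in one step.
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        # zip stops after n rows without reading the rest of the file.
        rows = [row for _, row in zip(range(n), reader)]
    return header, rows


# Intended use, after `aablocks get ab1001 -o data.csv.gz -f gz`:
#   header, rows = preview_gz_csv("data.csv.gz")
```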

### `aablocks config [key] [value]`

Get or set configuration options.

```bash
# Show all settings
> aablocks config

# Get a specific value
> aablocks config cli_format

# Set a value
> aablocks config cli_format table
```

| Key           | Values                    | Default   | Description                   |
| ------------- | ------------------------- | --------- | ----------------------------- |
| `api_format`  | `csv`, `pandas`, `polars` | `pandas`  | Default format for Python API |
| `cli_format`  | `csv`, `table`            | `csv`     | Default format for CLI output |
| `progress`    | `true`, `false`           | `true`    | Show download progress bars   |

## Examples

### Download and Analyze

```bash
> aablocks get ab1001 -o ab1001.csv
```

```python
import pandas as pd
df = pd.read_csv("ab1001.csv")
df.head()
```

### Complete Python Workflow

```python
import aablocks as aa

# Authenticate
aa.login()

# Browse datasets
datasets = aa.list_datasets(format="list")
print(f"Found {len(datasets)} datasets")

for d in datasets[:3]:
    print(f"{d.id}: {d.name} (v{d.version})")

# Get details
details = aa.get_details("ab1001")
print(f"Modes: {details.modes}")

# Download data
df = aa.get_dataset("ab1001")
print(df.head())

# ML version
df_ml = aa.get_dataset("ab1001", mode="ml")

# Read docs
readme = aa.get_readme("ab1001")
print(readme)
```

### Scripting

```bash
# List dataset IDs only
> aablocks list -f csv | tail -n +2 | cut -d, -f1

# Download all accessible datasets
> for id in $(aablocks list -f csv | tail -n +2 | cut -d, -f1); do
    aablocks get "$id" -o "${id}.csv"
done
```
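
The same ID extraction can be driven from Python by calling the CLI through `subprocess`. This is a sketch under the same assumption as the `cut -d, -f1` pipeline above, namely that the first CSV column holds the dataset ID and the first row is a header; `parse_dataset_ids` and `list_dataset_ids` are hypothetical helper names.

```python
import csv
import io
import subprocess


def parse_dataset_ids(csv_text):
    """Extract the first column of each data row from CSV listing text."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    return [row[0] for row in reader if row]


def list_dataset_ids():
    """Run `aablocks list -f csv` and return the dataset IDs it prints."""
    result = subprocess.run(
        ["aablocks", "list", "-f", "csv"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_dataset_ids(result.stdout)


# Intended use (requires the aablocks CLI and a prior login):
#   for dataset_id in list_dataset_ids():
#       subprocess.run(["aablocks", "get", dataset_id, "-o", f"{dataset_id}.csv"], check=True)
```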

## License

Apache 2.0. See [LICENSE](LICENSE) for details.
