Metadata-Version: 2.4
Name: accesspdf
Version: 1.2.0
Summary: AI-powered PDF accessibility remediation, at scale
Project-URL: Homepage, https://github.com/laurenaulet/accesspdf
Project-URL: Repository, https://github.com/laurenaulet/accesspdf
Project-URL: Issues, https://github.com/laurenaulet/accesspdf/issues
Project-URL: Changelog, https://github.com/laurenaulet/accesspdf/CHANGELOG.md
License:                                  Apache License
                                   Version 2.0, January 2004
                                http://www.apache.org/licenses/
        
           Licensed under the Apache License, Version 2.0 (the "License");
           you may not use this file except in compliance with the License.
           You may obtain a copy of the License at
        
               http://www.apache.org/licenses/LICENSE-2.0
        
           Unless required by applicable law or agreed to in writing, software
           distributed under the License is distributed on an "AS IS" BASIS,
           WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
           See the License for the specific language governing permissions and
           limitations under the License.
License-File: LICENSE
Keywords: a11y,accessibility,alt-text,pdf,wcag
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Text Processing :: Markup
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Requires-Dist: fastapi>=0.110
Requires-Dist: httpx>=0.27
Requires-Dist: jinja2>=3.1
Requires-Dist: langdetect>=1.0.9
Requires-Dist: pdfminer-six>=20221105
Requires-Dist: pikepdf<10.0,>=8.0
Requires-Dist: pillow>=10.0
Requires-Dist: pydantic>=2.0
Requires-Dist: python-multipart>=0.0.9
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0
Requires-Dist: textual>=0.50
Requires-Dist: typer>=0.12
Requires-Dist: uvicorn[standard]>=0.29
Provides-Extra: all
Requires-Dist: anthropic>=0.25; extra == 'all'
Requires-Dist: fastapi>=0.110; extra == 'all'
Requires-Dist: httpx>=0.27; extra == 'all'
Requires-Dist: mypy>=1.9; extra == 'all'
Requires-Dist: openai>=1.25; extra == 'all'
Requires-Dist: pytest-asyncio>=0.23; extra == 'all'
Requires-Dist: pytest-cov>=5.0; extra == 'all'
Requires-Dist: pytest-vcr>=1.0; extra == 'all'
Requires-Dist: pytest>=8.0; extra == 'all'
Requires-Dist: python-multipart>=0.0.9; extra == 'all'
Requires-Dist: reportlab>=4.0; extra == 'all'
Requires-Dist: ruff>=0.4; extra == 'all'
Requires-Dist: types-pillow; extra == 'all'
Requires-Dist: types-pyyaml; extra == 'all'
Requires-Dist: uvicorn[standard]>=0.29; extra == 'all'
Requires-Dist: vcrpy>=6.0; extra == 'all'
Provides-Extra: all-providers
Requires-Dist: anthropic>=0.25; extra == 'all-providers'
Requires-Dist: openai>=1.25; extra == 'all-providers'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.25; extra == 'anthropic'
Provides-Extra: dev
Requires-Dist: fastapi>=0.110; extra == 'dev'
Requires-Dist: httpx>=0.27; extra == 'dev'
Requires-Dist: mypy>=1.9; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest-vcr>=1.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: python-multipart>=0.0.9; extra == 'dev'
Requires-Dist: reportlab>=4.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Requires-Dist: types-pillow; extra == 'dev'
Requires-Dist: types-pyyaml; extra == 'dev'
Requires-Dist: uvicorn[standard]>=0.29; extra == 'dev'
Requires-Dist: vcrpy>=6.0; extra == 'dev'
Provides-Extra: openai
Requires-Dist: openai>=1.25; extra == 'openai'
Provides-Extra: web
Description-Content-Type: text/markdown

# AccessPDF

AccessPDF makes PDFs accessible. It fixes structure, reading order, tables, and headings automatically, then helps you add image descriptions with local AI or by hand.

Targets **WCAG 2.1 AA** and **PDF/UA**.

## Quick start

```bash
pip install accesspdf
accesspdf serve
```

This opens a browser UI at `http://localhost:8080`. Upload a PDF, get an accessibility report, download the fixed version. If your PDF has images, you can write alt text right in the browser -- or let AI do a first draft.

For AI-generated alt text, we recommend **[Ollama](https://ollama.com)** -- it's free, runs locally, and needs no API key. Install it, then:

```bash
ollama pull llava
```

That's it. Select "Ollama" in the web UI and click generate.

---

## How it works

AccessPDF does two things:

1. **Fixes structure automatically** -- tags, language, reading order, headings, tables, links, bookmarks
2. **Helps you add image descriptions** -- the one part that needs a human (or AI + human review)

Your original PDF is never modified. Output always goes to a new file.

## CLI workflow

If you prefer the command line over the web UI:

```bash
# 1. See what's wrong (read-only, never touches your file)
accesspdf check my-document.pdf

# 2. Fix structural issues
accesspdf fix my-document.pdf -o my-document_accessible.pdf

# 3. Generate AI alt text drafts (optional)
accesspdf generate-alt-text my-document_accessible.pdf

# 4. Review and approve the drafts
accesspdf review my-document_accessible.pdf

# 5. Re-run fix to inject approved descriptions
accesspdf fix my-document.pdf -o my-document_accessible.pdf --alt-text my-document.alttext.yaml
```
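
If you would rather script these five steps than type them, the sketch below builds the same commands in Python and runs them in order. It is only an illustration: `build_workflow` and `run_workflow` are hypothetical helper names, the sidecar path is passed in explicitly rather than guessed, and the sketch assumes each command exits with status 0 on success, which this README does not document.

```python
import subprocess
from pathlib import Path


def build_workflow(pdf: Path, sidecar: Path) -> list[list[str]]:
    """Return the argv lists for the five CLI steps above, in order."""
    # Output name follows the example above: <name>_accessible.pdf next to the input.
    fixed = pdf.with_name(f"{pdf.stem}_accessible.pdf")
    return [
        ["accesspdf", "check", str(pdf)],
        ["accesspdf", "fix", str(pdf), "-o", str(fixed)],
        ["accesspdf", "generate-alt-text", str(fixed)],
        ["accesspdf", "review", str(fixed)],
        ["accesspdf", "fix", str(pdf), "-o", str(fixed), "--alt-text", str(sidecar)],
    ]


def run_workflow(pdf: Path, sidecar: Path) -> None:
    """Run each step, stopping at the first command that fails."""
    for argv in build_workflow(pdf, sidecar):
        print("$", " ".join(argv))
        # Assumption: a non-zero exit code means the step failed.
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_workflow(Path("my-document.pdf"), Path("my-document.alttext.yaml"))
```

The `review` step opens an interactive terminal UI, so run the script from a real terminal rather than a background job.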

## AI alt text providers

AccessPDF uses AI vision models to draft image descriptions. You always review before anything gets injected.

| Provider | Setup | API key? | Cost |
|----------|-------|----------|------|
| **Ollama** (recommended) | [Install Ollama](https://ollama.com), `ollama pull llava` | No | Free (local) |
| Google Gemini | None | `GOOGLE_API_KEY` | Free tier |
| Anthropic (Claude) | `pip install "accesspdf[anthropic]"` | `ANTHROPIC_API_KEY` | Paid |
| OpenAI (GPT-4) | `pip install "accesspdf[openai]"` | `OPENAI_API_KEY` | Paid |

**Ollama is the easiest** -- no API key, no account, nothing leaves your machine. Just install it and pull a model.

For cloud providers, set your key as an environment variable or pass it directly:

```bash
accesspdf generate-alt-text my-document.pdf --provider gemini --api-key AIza...
```

In the web UI, you can paste your API key in the settings panel -- it's sent per-request and never saved to disk.
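
To see which cloud keys are already set in your shell before picking a provider, a small check like the one below may help. It is a sketch, not part of AccessPDF: the environment variable names come from the table above, `gemini` matches the `--provider` value shown earlier, and the `anthropic` and `openai` labels are assumptions used only for display. `accesspdf providers` remains the authoritative way to list what is available.

```python
import os
from collections.abc import Mapping

# Variable names are taken from the provider table; the dictionary keys are
# display labels for this sketch and are not confirmed CLI provider names.
KEY_VARS = {
    "gemini": "GOOGLE_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def providers_with_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the labels whose API key variable is set and non-empty."""
    return [name for name, var in KEY_VARS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    ready = providers_with_keys()
    print("Cloud providers with a key set:", ", ".join(ready) or "none")
```

Ollama is left out on purpose, since it needs no key.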

## Batch processing

Fix every PDF in a folder:

```bash
accesspdf batch ./papers/ -o ./papers/accessible/
accesspdf batch ./papers/ -o ./papers/accessible/ -r   # include subdirectories
```

## The sidecar file

Image descriptions live in a `.alttext.yaml` file next to your PDF:

```yaml
images:
- id: img_37044c
  page: 1
  ai_draft: 'Bar chart showing quarterly revenue from 2023-2025.'
  alt_text: 'Bar chart showing quarterly revenue. Q1 2025 is highest at $4.2M.'
  status: approved
```

Statuses: **needs_review** (not yet described), **approved** (gets injected), **decorative** (screen readers skip it). You can edit this file by hand.
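
Because the sidecar is plain YAML, it is easy to inspect from a script. The sketch below counts images per status and lists the ids still waiting for review. It assumes only the fields shown in the example above (`images`, `id`, `status`); the function names are hypothetical, and the file is read with PyYAML, which AccessPDF already depends on.

```python
from collections import Counter
from pathlib import Path


def load_sidecar(path: Path) -> dict:
    """Parse a .alttext.yaml file into a plain dict."""
    import yaml  # PyYAML; imported here so the helpers below work without it

    with path.open(encoding="utf-8") as fh:
        return yaml.safe_load(fh) or {}


def status_counts(sidecar: dict) -> Counter:
    """Count images per status; entries without a status count as 'missing'."""
    return Counter(img.get("status", "missing") for img in sidecar.get("images") or [])


def pending_images(sidecar: dict) -> list[str]:
    """Return the ids of images still marked needs_review."""
    return [
        img["id"]
        for img in sidecar.get("images") or []
        if img.get("status") == "needs_review"
    ]


if __name__ == "__main__":
    doc = load_sidecar(Path("my-document.alttext.yaml"))
    print(dict(status_counts(doc)))
    print("Still to review:", pending_images(doc))
```

A check like this could run before the final `accesspdf fix` to confirm nothing is left in `needs_review`.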

## CLI reference

```
accesspdf check <pdf>                    # Analyze accessibility (read-only)
accesspdf fix <pdf> -o <output>          # Fix structure + inject alt text
accesspdf fix <pdf> --alt-text <yaml>    # Fix with sidecar descriptions
accesspdf batch <dir> -o <outdir>        # Fix all PDFs in a directory
accesspdf review <pdf>                   # Terminal UI for alt text
accesspdf serve                          # Web UI at localhost:8080
accesspdf generate-alt-text <pdf>        # AI drafts (Ollama default)
accesspdf providers                      # Show available AI providers
```

## Installation options

```bash
pip install accesspdf                # CLI + web UI (everything you need)
pip install "accesspdf[anthropic]"   # Add Claude provider
pip install "accesspdf[openai]"      # Add GPT-4 provider
pip install "accesspdf[all-providers]"   # Add both Claude and GPT-4 providers
```

## Contributing

```bash
git clone https://github.com/laurenaulet/accesspdf.git
cd accesspdf
pip install -e ".[dev]"
python -m pytest tests/ -v
```

## License

Apache 2.0
