SABO
Stop Allowing Big money to Overwhelm — give everyday voters coordinated economic leverage against corporate lobbying.
LDA Data Pipeline
Queries the Lobbying Disclosure Act (LDA) API weekly for high-spend lobbying filings, matches them against active bills, stores the resulting campaigns for curator review, and exposes approved campaigns to downstream systems (browser extension, dashboard).
Prerequisites
- Python 3.11+
- A free LDA API key from lda.senate.gov (register for an account)
- macOS or Linux
Setup
# 1. Install dependencies
pip install -r requirements.txt
# 2. Configure environment
cp .env.example .env
# Edit .env and set your LDA_API_KEY
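The pipeline needs `LDA_API_KEY` at run time. Below is a minimal sketch of how a script might check for the key before making any requests. The helper name `require_api_key` is hypothetical, and the sketch assumes the values from `.env` have already been exported into the process environment; how the project itself loads `.env` is not shown in this README.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return LDA_API_KEY, or raise if it is unset or blank.

    Hypothetical helper for illustration; not taken from the SABO codebase.
    """
    # Fall back to the real process environment when no mapping is passed in.
    source = os.environ if env is None else env
    key = source.get("LDA_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "LDA_API_KEY is not set; copy .env.example to .env and fill it in"
        )
    return key
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in a test without touching the real environment.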
Configure Monitored Bills
Edit config/bills.yml to add the bills you want to track:
bills:
  - bill_id: "S-4006"
    title: "Healthcare Price Transparency Act"
    keywords:
      - "health care"
      - "price transparency"
      - "hospital costs"
    status: "active"
Only bills with status: "active" are used by the pipeline. Keyword matching is a case-insensitive substring search against each filing's issue descriptions.
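To make the matching rule concrete, here is a small sketch of a case-insensitive substring match restricted to active bills. The function name `matching_bill_ids`, the dictionary shape, and the `X-0001` entry are illustrative assumptions, not the pipeline's actual code or data.

```python
from typing import Iterable


def matching_bill_ids(issue_descriptions: Iterable[str], bills: list[dict]) -> list[str]:
    """Return IDs of active bills with at least one keyword found in any description."""
    # Lowercase once so every comparison below is case-insensitive.
    haystacks = [text.lower() for text in issue_descriptions]
    matched = []
    for bill in bills:
        if bill.get("status") != "active":
            continue  # bills that are not active are skipped
        keywords = [kw.lower() for kw in bill.get("keywords", [])]
        if any(kw in text for kw in keywords for text in haystacks):
            matched.append(bill["bill_id"])
    return matched


bills = [
    {"bill_id": "S-4006", "keywords": ["health care", "price transparency"], "status": "active"},
    {"bill_id": "X-0001", "keywords": ["health care"], "status": "inactive"},  # made-up entry
]
print(matching_bill_ids(["Hospital PRICE TRANSPARENCY rules"], bills))  # ['S-4006']
```

Because this is a plain substring test, the keyword "health care" would not match a description that spells it "healthcare", so it may be worth listing both spellings in config/bills.yml.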
Run the Pipeline
# Dry run (no database writes — good for testing API connectivity)
python -m src.pipeline.runner --dry-run
# Live run (fetches last 8 days of filings)
python -m src.pipeline.runner
# Specify a custom date range
python -m src.pipeline.runner --since 2026-01-01
# Verbose output (logs each filing as it is processed)
python -m src.pipeline.runner --verbose
Exit codes: 0 = success, 1 = API error, 2 = config error, 3 = database error
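A wrapper script could use these exit codes to decide what to report. The sketch below maps each code to its label and shows one way to invoke the dry run from Python. The names `describe_exit` and `run_dry_run` are hypothetical, and `run_dry_run` assumes it is started from the repository root so that `src.pipeline.runner` is importable.

```python
import subprocess
import sys

# Labels copied from the exit-code list above.
EXIT_MEANINGS = {
    0: "success",
    1: "API error",
    2: "config error",
    3: "database error",
}


def describe_exit(code: int) -> str:
    """Translate a runner exit code into a readable label."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def run_dry_run() -> str:
    """Run the pipeline in dry-run mode and describe how it exited."""
    result = subprocess.run(
        [sys.executable, "-m", "src.pipeline.runner", "--dry-run"]
    )
    return describe_exit(result.returncode)
```

Calling `run_dry_run()` only when needed keeps the mapping usable on its own, for example when parsing codes recorded by a scheduler.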
Review Campaigns
# List campaigns pending review
python -m src.review.cli list
# List by status
python -m src.review.cli list --status approved
python -m src.review.cli list --status rejected
# Approve a campaign
python -m src.review.cli approve 1
# Reject a campaign (with optional reason)
python -m src.review.cli reject 2 --reason "Spend data appears incorrect"
# View full history with filters
python -m src.review.cli history
python -m src.review.cli history --status approved
python -m src.review.cli history --company "amazon"
python -m src.review.cli history --since 2026-01-01
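The commands above imply a simple review lifecycle: a campaign starts as pending, a curator approves or rejects it, and the history can be filtered by status, company, or date. The sketch below models that flow in memory. Everything in it is an assumption made for illustration: the `ReviewItem` fields, the rule that a decided campaign cannot be decided again, and the case-insensitive substring match on company names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ReviewItem:
    """Hypothetical stand-in for a campaign awaiting curator review."""
    id: int
    company: str
    created: date
    status: str = "pending"
    reason: Optional[str] = None


def decide(item: ReviewItem, status: str, reason: Optional[str] = None) -> None:
    """Move a pending item to 'approved' or 'rejected' (reason is optional)."""
    if status not in ("approved", "rejected"):
        raise ValueError(f"unsupported status: {status}")
    # Assumption: a campaign can only be decided once.
    if item.status != "pending":
        raise ValueError(f"campaign {item.id} was already {item.status}")
    item.status, item.reason = status, reason


def history(items, status=None, company=None, since=None):
    """Filter items by optional status, company substring, and earliest date."""
    return [
        i for i in items
        if (status is None or i.status == status)
        and (company is None or company.lower() in i.company.lower())
        and (since is None or i.created >= since)
    ]
```

For instance, `history(items, company="amazon", since=date(2026, 1, 1))` would combine two filters in the same way that passing both flags to the CLI might.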
Configure Cron (Weekly Schedule)
# Make the script executable (already done if you cloned the repo)
chmod +x scripts/run_pipeline.sh
# Edit your crontab
crontab -e
# Add this line to run every Sunday at midnight:
0 0 * * 0 /path/to/sabo/scripts/run_pipeline.sh >> /path/to/sabo/data/pipeline.log 2>&1
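In the crontab line above, the five schedule fields are minute, hour, day of month, month, and day of week, so `0 0 * * 0` fires at 00:00 on day-of-week 0, which is Sunday. As a quick sanity check on when the next run would happen, the following sketch computes the next Sunday midnight from a given moment. The function name is hypothetical and the sketch uses naive local time, whereas cron itself follows the system clock and time zone.

```python
from datetime import datetime, timedelta


def next_sunday_midnight(now: datetime) -> datetime:
    """Return the first Sunday 00:00 strictly after `now` (naive local time)."""
    # datetime.weekday() numbers Monday as 0 and Sunday as 6.
    days_ahead = (6 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    # If today is Sunday and midnight has already passed, move to next week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


# 2026-01-01 is a Thursday, so the next run falls on Sunday 2026-01-04.
print(next_sunday_midnight(datetime(2026, 1, 1, 9, 30)))  # 2026-01-04 00:00:00
```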
Validation
See specs/001-lda-data-pipeline/quickstart.md for a step-by-step end-to-end validation guide covering all four scenarios.
Project Structure
src/
  models/models.py - LDAFiling, MonitoredBill, Campaign dataclasses
  db/database.py - SQLite connection + schema
  db/campaigns.py - Campaign CRUD + downstream query interface
  db/bills.py - MonitoredBill read operations
  pipeline/runner.py - Pipeline entry point
  pipeline/lda_client.py - LDA API HTTP client
  review/cli.py - Curator review CLI
  config/loader.py - YAML bills config loader
config/bills.yml - Monitored bills (curator-maintained)
scripts/run_pipeline.sh - Cron wrapper
data/sabo.db - SQLite database (gitignored)
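Since campaigns live in SQLite and only approved ones are exposed downstream, a downstream query could look roughly like the sketch below. The table layout, column names, and the `approved_campaigns` function are assumptions for illustration; the real schema is defined in db/database.py and may differ. The example uses a throwaway in-memory database with made-up rows rather than data/sabo.db.

```python
import sqlite3

# Hypothetical schema; the project's actual schema may differ.
SCHEMA = """
CREATE TABLE campaigns (
    id INTEGER PRIMARY KEY,
    company TEXT NOT NULL,
    bill_id TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
)
"""


def approved_campaigns(conn: sqlite3.Connection) -> list[tuple]:
    """Return (id, company, bill_id) rows for approved campaigns only."""
    cur = conn.execute(
        "SELECT id, company, bill_id FROM campaigns WHERE status = ? ORDER BY id",
        ("approved",),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # throwaway database, not data/sabo.db
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO campaigns (company, bill_id, status) VALUES (?, ?, ?)",
    [("Example Corp", "S-4006", "approved"), ("Sample LLC", "S-4006", "pending")],
)
print(approved_campaigns(conn))  # [(1, 'Example Corp', 'S-4006')]
```

Filtering on status inside the query means pending and rejected campaigns never leave the database layer, which matches the curator-review gate described earlier.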