# TipSharks

Harness racing ratings and predictions powered by a multi-runner Elo model. TipSharks computes ratings for horses, drivers, and trainers from HRNZ race data to help you make better-informed betting decisions.

## Features

- **Multi-runner Elo ratings**: Pairwise logistic model that handles races with 2-20+ starters
- **Multi-entity ratings**: Separate ratings for horses, drivers, and trainers with configurable weights
- **Condition adjustments**: Learn barrier and handicap adjustments from historical outcomes
- **Deterministic recomputation**: The same inputs always produce the same ratings
- **Web UI**: Interactive interface for browsing ratings, viewing predictions, and searching entities
- **REST API**: Query ratings, histories, and race data
- **Evaluation metrics**: Winner accuracy, top-3 hit rate, and probability calibration
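
The pairwise logistic idea behind the multi-runner ratings can be sketched roughly as follows. This is a minimal illustration and not the project's implementation (see `ALGORITHM.md` for that): the function names are hypothetical, and averaging each runner's update over its opponents is an assumption made here so that large fields do not inflate the step size. The scale of 400 and K of 24 mirror the defaults listed under Configuration.

```python
from itertools import combinations


def pair_win_prob(r_a: float, r_b: float, scale: float = 400.0) -> float:
    """Logistic probability that runner A finishes ahead of runner B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / scale))


def update_race(ratings: list[float], finish_order: list[int],
                k: float = 24.0, scale: float = 400.0) -> list[float]:
    """Return new ratings after one race with at least two runners.

    ratings[i] is runner i's pre-race rating; finish_order lists runner
    indices from first to last. Averaging over the n - 1 opponents is an
    assumption of this sketch, not a documented project rule.
    """
    n = len(ratings)
    position = {runner: pos for pos, runner in enumerate(finish_order)}
    delta = [0.0] * n
    for a, b in combinations(range(n), 2):
        expected_a = pair_win_prob(ratings[a], ratings[b], scale)
        actual_a = 1.0 if position[a] < position[b] else 0.0
        delta[a] += actual_a - expected_a
        delta[b] += expected_a - actual_a  # mirror image, keeps updates zero-sum
    return [r + k * d / (n - 1) for r, d in zip(ratings, delta)]
```

With three equally rated runners, `update_race([1500.0, 1500.0, 1500.0], [2, 0, 1])` moves the winner up by 12 points, the last-placed runner down by 12, and leaves the runner-up unchanged.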

## Quick Start

### Prerequisites

- Docker and Docker Compose
- Optional: HRNZ InfoHorse scraping (Playwright runs inside the API container)

### 1. Clone and Configure

```bash
git clone <repository-url>
cd tipsharks
cp .env.example .env
```

Edit `.env` and set an admin token and optional scraper defaults:
```bash
API_ADMIN_TOKEN=change_me_in_production
HRNZ_CLUB_CODES=all
```

### 2. Start Services

```bash
docker compose up -d
```

This starts:
- PostgreSQL database on port 5432
- FastAPI service with integrated Web UI on port 8000
  - Web UI: http://localhost:8000/
  - API docs: http://localhost:8000/docs

### 3. Run Database Migrations

```bash
docker compose run --rm worker alembic upgrade head
```

### 4. Ingest Race Data

Backfill one month of data:
```bash
docker compose run --rm worker python -m apps.backend.worker.cli ingest \
  --from 2024-01-01 \
  --to 2024-01-31
```

### 5. Compute Ratings

```bash
docker compose run --rm worker python -m apps.backend.worker.cli recompute \
  --from 2024-01-01 \
  --to 2024-01-31
```

### 6. Access Web UI or API

**Option A: Web Interface** (Recommended for browsing)

Open http://localhost:8000/ in your browser to access the interactive web UI with:
- Homepage with top-rated horses, drivers, and trainers
- Entity detail pages with rating history charts
- Search functionality across all entities
- Race predictions and results

**Option B: API Endpoints** (For programmatic access)

```bash
# Top horses
curl "http://localhost:8000/ratings/horses?limit=20"

# Specific horse history
curl "http://localhost:8000/ratings/horses/12345"

# Top drivers
curl "http://localhost:8000/ratings/drivers?limit=20"
```

Open API docs at: http://localhost:8000/docs

### 7. Run Evaluation

```bash
docker compose run --rm worker python scripts/evaluate.py \
  --from 2024-01-01 \
  --to 2024-01-31 \
  --out reports/eval_jan2024.json
```
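
To make the three reported metrics concrete, here is a rough sketch of how they could be computed from per-race win probabilities. The definitions are assumptions and `scripts/evaluate.py` may differ: "top-3 hit rate" is read here as the winner being among the three highest-probability runners, and the Brier score stands in for probability calibration.

```python
Race = tuple[list[float], int]  # (win probability per runner, index of the winner)


def winner_accuracy(races: list[Race]) -> float:
    """Share of races where the highest-probability runner won."""
    hits = sum(
        1 for probs, winner in races
        if max(range(len(probs)), key=probs.__getitem__) == winner
    )
    return hits / len(races)


def top3_hit_rate(races: list[Race]) -> float:
    """Share of races where the winner was among the top three predictions."""
    hits = 0
    for probs, winner in races:
        ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
        hits += winner in ranked[:3]
    return hits / len(races)


def brier_score(races: list[Race]) -> float:
    """Mean squared error of win probabilities over all runners (lower is better)."""
    errors = [
        (p - (1.0 if i == winner else 0.0)) ** 2
        for probs, winner in races
        for i, p in enumerate(probs)
    ]
    return sum(errors) / len(errors)
```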

## Development Setup

### Install Dependencies

```bash
python3.12 -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
```

### Run Tests

```bash
pytest
```

### Run Linting

```bash
ruff check .
black --check .
```

### Database Operations

```bash
# Create a new migration
alembic revision --autogenerate -m "description"

# Apply migrations
alembic upgrade head

# Rollback one migration
alembic downgrade -1
```

## Architecture

See [docs/architecture.md](docs/architecture.md) for system overview and data flow.

## Configuration

All configuration is in `.env`. Key settings:

### Rating Parameters

- `ELO_SCALE_C`: Logistic scale factor (default: 400.0)
- `ELO_K_BASE`: Base K-factor for updates (default: 24.0)
- `DRIVER_WEIGHT_ALPHA`: Driver contribution weight (default: 0.35)
- `TRAINER_WEIGHT_BETA`: Trainer contribution weight (default: 0.15)
- `ADJ_LEARNING_RATE`: Condition adjustment learning rate (default: 0.5)

### Feature Flags

- `ENABLE_DRIVER`: Include driver ratings (default: true)
- `ENABLE_TRAINER`: Include trainer ratings (default: true)
- `ENABLE_ADJUSTMENTS`: Learn barrier/handicap adjustments (default: true)
- `ENABLE_RD`: Track rating deviation (default: false)
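
As an illustration of how these settings and their defaults fit together, the sketch below loads them from environment variables. The class name `RatingSettings`, the accepted boolean spellings, and the loading approach are assumptions; the project's own settings code may look different. Only the variable names and default values come from the lists above.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}  # accepted spellings are an assumption


@dataclass(frozen=True)
class RatingSettings:
    # Defaults copied from the Rating Parameters and Feature Flags lists.
    elo_scale_c: float = 400.0
    elo_k_base: float = 24.0
    driver_weight_alpha: float = 0.35
    trainer_weight_beta: float = 0.15
    adj_learning_rate: float = 0.5
    enable_driver: bool = True
    enable_trainer: bool = True
    enable_adjustments: bool = True
    enable_rd: bool = False

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "RatingSettings":
        """Override defaults with any matching upper-case variables in env."""
        overrides = {}
        for field in fields(cls):
            raw = env.get(field.name.upper())
            if raw is None:
                continue  # variable not set, keep the default
            if isinstance(field.default, bool):
                overrides[field.name] = raw.strip().lower() in TRUTHY
            else:
                overrides[field.name] = float(raw)
        return cls(**overrides)
```

Passing a plain dictionary, for example `RatingSettings.from_env({"ELO_K_BASE": "32"})`, is a convenient way to try alternative parameters without editing `.env`.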

### HRNZ Scraper Proxy (Decodo)

To rotate IPs after each scraped page, configure Decodo credentials in `.env`:

```bash
HRNZ_DECODO_PROXY_SERVER=http://<decodo-host>:<port>
HRNZ_DECODO_PROXY_USERNAME=<your_username>
HRNZ_DECODO_PROXY_PASSWORD=<your_password>
HRNZ_DECODO_ROTATE_EACH_REQUEST=true
```

Optional overrides (use when needed per Decodo docs):

```bash
HRNZ_DECODO_PROXY_HOST=<decodo-host>
HRNZ_DECODO_PROXY_PORT=<port>
HRNZ_DECODO_PROXY_SCHEME=http
HRNZ_DECODO_SESSION_PARAM=session
HRNZ_DECODO_USERNAME_TEMPLATE=customer-your_user-zone-residential-session-{session}
```

For the full list of rating knobs and how they affect computation, see
`docs/elo_tuning.md` and `ALGORITHM.md`.

## Common Tasks

### Backfill Historical Data

```bash
# Backfill one year (TAB API)
docker compose run --rm worker python -m apps.backend.worker.cli ingest \
  --from 2023-01-01 \
  --to 2023-12-31
```

```bash
# HRNZ InfoHorse scrape via webhook (Playwright)
curl -X POST http://localhost:8000/webhook/scrape \
  -H "Authorization: Bearer <API_ADMIN_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "club_codes": "all",
    "date_from": "2023-01-01",
    "date_to": "2023-12-31",
    "recompute": true
  }'
```
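
The webhook call can also be scripted. The sketch below builds the same POST request with the Python standard library; the helper name `build_scrape_request` is hypothetical, and the payload fields are copied from the `curl` example rather than from a documented schema.

```python
import json
from urllib.request import Request, urlopen


def build_scrape_request(token: str, date_from: str, date_to: str,
                         club_codes: str = "all", recompute: bool = True,
                         base_url: str = "http://localhost:8000") -> Request:
    """Build the POST /webhook/scrape request without sending it."""
    payload = {
        "club_codes": club_codes,
        "date_from": date_from,
        "date_to": date_to,
        "recompute": recompute,
    }
    return Request(
        f"{base_url}/webhook/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: Request, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes it easy to inspect the payload before triggering a long scrape.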

### Recompute All Ratings

```bash
# Clear existing ratings
docker compose exec db psql -U tipsharks -c "TRUNCATE rating_snapshots;"

# Recompute from scratch
docker compose run --rm worker python -m apps.backend.worker.cli recompute \
  --from 2020-01-01 \
  --to 2024-12-31
```

### Export Ratings

```bash
# Export top 100 horses to CSV
curl "http://localhost:8000/ratings/horses?limit=100&format=csv" > ratings.csv
```

## Troubleshooting

### HRNZ Scraper Issues

- Ensure the API container has Playwright installed (`Dockerfile.api` does this by default)
- Check the webhook response/logs for "Skipping non-meeting page" messages
- Verify URLs are valid: `https://infohorse.hrnz.co.nz/datahrs/results/<MMDD><CC>rs.htm`
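
When checking URLs by hand, it can help to generate the expected address for a given meeting. The sketch below fills the pattern above, assuming `<MMDD>` is the zero-padded month and day and `<CC>` is the club code inserted as-is; the helper name is hypothetical and the exact form of club codes is not specified here.

```python
from datetime import date

RESULTS_BASE = "https://infohorse.hrnz.co.nz/datahrs/results"


def results_url(meeting_date: date, club_code: str) -> str:
    """Fill the <MMDD><CC>rs.htm pattern for one meeting."""
    # The year is not part of the pattern, so only month and day are used.
    return f"{RESULTS_BASE}/{meeting_date:%m%d}{club_code}rs.htm"
```

Here `results_url(date(2023, 1, 5), "xx")` produces a URL ending in `0105xxrs.htm`, where `xx` is a placeholder rather than a real club code.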

### Database Connection Error

```bash
# Check database is running
docker compose ps

# View logs
docker compose logs db

# Restart database
docker compose restart db
```

### Ratings Not Appearing

- Ensure ingestion completed: check `meetings`, `races`, `starters` tables
- Ensure recompute ran: check `rating_snapshots` table
- Check logs: `docker compose logs worker`

### Tests Failing

```bash
# Run with verbose output
pytest -v

# Run specific test
pytest tests/test_ratings.py::test_basic_elo -v

# Run with coverage
pytest --cov=packages --cov-report=html
```

## Documentation

- [Docs Index](docs/README.md) - Overview of all documentation
- [Architecture](docs/architecture.md) - System design and data flow
- [Data Model](docs/data_model.md) - Database schema
- [Rating Math](docs/rating_math.md) - Elo formulas and tuning
- [Rating Algorithm](ALGORITHM.md) - End-to-end Elo calculation walkthrough
- [Operations](docs/ops.md) - Deployment and maintenance
- [HRNZ Scraper Guide](docs/hrnz/HRNZ_SCRAPER_GUIDE.md) - Scraper usage and limits
- [Setup Guide](docs/setup/SETUP_GUIDE.md) - Local setup walkthrough

## API Reference

Full API documentation is available at `/docs` when the API service is running.

OpenAPI spec: `elo-openapi.json`

Key endpoints:
- `GET /health` - Health check
- `GET /ratings/horses` - List horse ratings
- `GET /ratings/horses/{id}` - Horse detail and history
- `GET /ratings/drivers` - List driver ratings
- `GET /ratings/trainers` - List trainer ratings
- `GET /races/{id}` - Race details with starters
- `POST /admin/ingest` - Trigger ingestion (requires token)
- `POST /admin/recompute` - Trigger recompute (requires token)

## Contributing

1. Create a feature branch
2. Make changes with tests
3. Run tests and linting: `pytest && ruff check .`
4. Submit a pull request

## License

MIT