# TipSharks Production Deployment

This directory contains production deployment configuration for the TipSharks platform.

## Contents

| File/Directory | Purpose |
|----------------|---------|
| `docker-compose.prod.yml` (root) | Production service orchestration |
| `.env.prod.example` (root) | Environment variable template |
| `nginx/nginx.conf` | Nginx reverse proxy configuration |
| `nginx/ssl/` | Place SSL certificates here (`fullchain.pem`, `privkey.pem`) |

---

## Prerequisites

- **Docker** >= 24.x and **Docker Compose** >= 2.20
- A domain name with DNS pointing to your server
- **SSL certificates** (Let's Encrypt recommended)
- Server with at least **4 GB RAM** and **2 CPU cores** (8 GB / 4 cores recommended)
- Ports **80** and **443** open (HTTP/HTTPS)

---

## Quick Start

### 1. Clone and prepare

```bash
git clone <repo-url> tipsharks
cd tipsharks
```

### 2. Configure environment

```bash
cp .env.prod.example .env.prod
# Edit .env.prod with strong, unique passwords and your domain URLs
```
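
One way to produce strong values for `.env.prod` is sketched below. It assumes `openssl` is installed (it is already used in step 3); the helper name `gen_secret` is illustrative, and the variable names are the ones listed under "Critical secrets to change".

```bash
# Print a random secret: 32 random bytes from OpenSSL, hex-encoded (64 characters).
# Hex output contains only 0-9 and a-f, so it needs no escaping inside a database URL.
gen_secret() {
  openssl rand -hex 32
}

# Print one suggested line per secret; paste the values into .env.prod by hand.
for var in POSTGRES_INGEST_PASSWORD POSTGRES_ELO_PASSWORD MONGO_ROOT_PASSWORD API_ADMIN_TOKEN; do
  printf '%s=%s\n' "$var" "$(gen_secret)"
done
```

The output is printed rather than written to the file, so existing values are never overwritten by accident.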

### 3. Set up SSL certificates

**Option A: Let's Encrypt (recommended)**

```bash
# Install certbot
sudo apt install certbot

# Obtain certificates (standalone mode binds port 80, so nothing else may be listening on it)
sudo certbot certonly --standalone -d yourdomain.com -d app.yourdomain.com

# Copy to nginx ssl directory
sudo cp /etc/letsencrypt/live/yourdomain.com/fullchain.pem infrastructure/nginx/ssl/
sudo cp /etc/letsencrypt/live/yourdomain.com/privkey.pem infrastructure/nginx/ssl/

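# Renew certificates. A single run schedules nothing: run this periodically
# (cron, or the systemd timer that many certbot packages install).
# Renewed files land in /etc/letsencrypt/live/, so repeat the copy step above after each renewal.
# -T disables TTY allocation, which hooks running without a terminal need.
sudo certbot renew --post-hook "docker compose -f docker-compose.prod.yml exec -T nginx nginx -s reload"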
```
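
Because nginx reads copies of the certificates, a renewal only takes effect once the new files are copied again. A possible way to automate that is a certbot deploy hook, sketched below. The script location, the function name `copy_renewed_certs`, and the checkout path `/opt/tipsharks` are assumptions, not part of this repository. certbot sets `RENEWED_LINEAGE` when it runs a deploy hook.

```bash
#!/bin/bash
# Hypothetical deploy hook, e.g. saved as
# /etc/letsencrypt/renewal-hooks/deploy/tipsharks-nginx.sh and made executable.
# REPO_DIR is an assumption: point it at your checkout of the repository.
REPO_DIR="${REPO_DIR:-/opt/tipsharks}"

# Copy the renewed certificate and key from a certbot lineage directory
# into the directory nginx reads.
copy_renewed_certs() {
  local lineage="$1" ssl_dir="$2"
  cp "$lineage/fullchain.pem" "$ssl_dir/fullchain.pem" &&
    cp "$lineage/privkey.pem" "$ssl_dir/privkey.pem"
}

# RENEWED_LINEAGE looks like /etc/letsencrypt/live/yourdomain.com.
# Do nothing when it is absent (the script was not started by certbot).
# exec -T: no TTY, because hooks run non-interactively.
if [ -n "${RENEWED_LINEAGE:-}" ]; then
  copy_renewed_certs "$RENEWED_LINEAGE" "$REPO_DIR/infrastructure/nginx/ssl" &&
    cd "$REPO_DIR" &&
    docker compose -f docker-compose.prod.yml exec -T nginx nginx -s reload
fi
```

Deploy hooks run only after a successful renewal, so nginx is not reloaded when nothing changed.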

**Option B: Self-signed (for testing only)**

```bash
openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout infrastructure/nginx/ssl/privkey.pem \
  -out infrastructure/nginx/ssl/fullchain.pem \
  -subj "/CN=yourdomain.com"
```
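
With either option, it can help to confirm what ended up in `infrastructure/nginx/ssl/` before enabling HTTPS. The sketch below uses standard `openssl x509` options; the function names are illustrative.

```bash
# Succeed if the certificate in $1 is still valid $2 days from now.
# -checkend N exits non-zero when the certificate expires within N seconds.
cert_valid_for_days() {
  local cert="$1" days="$2"
  openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null
}

# Print the subject and expiry date for a quick visual check.
show_cert() {
  openssl x509 -in "$1" -noout -subject -enddate
}

# Usage:
#   show_cert infrastructure/nginx/ssl/fullchain.pem
#   cert_valid_for_days infrastructure/nginx/ssl/fullchain.pem 30 || echo "renew soon"
```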

### 4. Uncomment HTTPS in nginx config

Edit `infrastructure/nginx/nginx.conf` and uncomment the HTTPS server block once SSL certificates are in place.

### 5. Start production stack

```bash
docker compose -f docker-compose.prod.yml --env-file .env.prod up -d
```

### 6. Verify

```bash
# Check all services are running
docker compose -f docker-compose.prod.yml ps

# View logs
docker compose -f docker-compose.prod.yml logs -f

# Health check endpoint
curl http://localhost/health
```
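
Services can take a while to become ready after `up -d`, so a single `curl` may fail even when the deployment is fine. A small retry loop, sketched below, polls the health endpoint. It assumes `/health` answers with a 2xx status once the stack is up; the function name and defaults are illustrative.

```bash
# Poll a URL until it returns a successful HTTP status or the attempts run out.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f turns HTTP errors (4xx/5xx) into a non-zero exit status
    if curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "no healthy response from $url after $attempts attempts" >&2
  return 1
}

# Usage:
#   wait_for_health http://localhost/health
```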

---

## Updating

### Apply configuration changes

```bash
docker compose -f docker-compose.prod.yml --env-file .env.prod up -d --force-recreate
```

### Rebuild and redeploy a single service

```bash
docker compose -f docker-compose.prod.yml --env-file .env.prod build tab-api-ingest
docker compose -f docker-compose.prod.yml --env-file .env.prod up -d tab-api-ingest
```

---

## Environment Variables

All secrets are managed through environment variables. See `.env.prod.example` for the full list with descriptions.

### Critical secrets to change

| Variable | Why it matters |
|----------|---------------|
| `POSTGRES_INGEST_PASSWORD` | Database access for ingest service |
| `POSTGRES_ELO_PASSWORD` | Database access for Elo API |
| `MONGO_ROOT_PASSWORD` | Database access for client backend |
| `API_ADMIN_TOKEN` | Admin endpoint authentication |

### URL configuration

| Variable | Example | Notes |
|----------|---------|-------|
| `CORS_ORIGINS` | `https://app.yourdomain.com` | Comma-separated list |
| `INGEST_DATABASE_URL` | `postgresql://user:pass@postgres-ingest:5432/racing_db` | Uses Docker service name |
| `ELO_DATABASE_URL` | `postgresql+psycopg://user:pass@postgres-elo:5432/tipsharks` | Uses Docker service name |
| `MONGO_URL` | `mongodb://user:pass@mongo:27017` | Uses Docker service name |

---

## SSL Certificate Setup

For a production deployment with HTTPS:

1. Obtain certificates via Let's Encrypt (see above)
2. Uncomment the HTTP-to-HTTPS redirect server block in `nginx.conf`
3. Replace the `listen 80;` server block with the HTTPS variant:

```nginx
server {
    # On nginx >= 1.25.1, "listen 443 ssl;" plus a separate "http2 on;" line is preferred
    listen 443 ssl http2;
    server_name yourdomain.com;

    ssl_certificate     /etc/nginx/ssl/fullchain.pem;
    ssl_certificate_key /etc/nginx/ssl/privkey.pem;

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers on;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;

    # ... location blocks same as HTTP server ...
}
```

4. Reload nginx: `docker compose -f docker-compose.prod.yml exec nginx nginx -s reload`
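
A syntax error in `nginx.conf` makes the reload fail, so it is worth testing the configuration first. The sketch below chains `nginx -t` and the reload; the function name is illustrative, and `-T` is used so the command also works without a terminal.

```bash
COMPOSE_FILE_PROD="docker-compose.prod.yml"

# Validate the nginx configuration inside the container, then reload.
# The reload is skipped when validation fails.
reload_nginx() {
  # nginx -t parses the configuration and exits non-zero on errors
  docker compose -f "$COMPOSE_FILE_PROD" exec -T nginx nginx -t &&
    docker compose -f "$COMPOSE_FILE_PROD" exec -T nginx nginx -s reload
}

# Usage:
#   reload_nginx || echo "nginx configuration invalid, not reloaded"
```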

---

## Scaling

### Vertical scaling (single server)

Resource limits are defined in `docker-compose.prod.yml` under `deploy.resources.limits.memory`. Adjust per service:

| Service | Current limit | Suggested max |
|---------|--------------|---------------|
| `tipsharks-worker` | 2 GB | 4 GB (for large recomputes) |
| `tipsharks-elo-api` | 1 GB | 2 GB (with caching) |
| `postgres-ingest` | 1 GB | 2 GB (with many meetings) |
| `postgres-elo` | 1 GB | 2 GB |
| `mongo` | 512 MB | 1 GB |
| `tab-api-ingest` | 512 MB | 1 GB |
| `tipsharks-client-backend` | 512 MB | 1 GB |
| `redis` | 256 MB | 512 MB |

### Horizontal scaling (multi-server with Docker Swarm)

The `deploy.replicas` field is set for Swarm mode:

| Service | Replicas | Notes |
|---------|----------|-------|
| `tipsharks-elo-api` | 2 | Stateless, safe to scale |
| `tipsharks-client-backend` | 2 | Stateless, safe to scale |
| `tab-api-ingest` | 1 | Stateful (BullMQ jobs) |
| `tipsharks-worker` | 1 | Avoid duplicate job processing |

To use Swarm:

```bash
docker swarm init
docker stack deploy -c docker-compose.prod.yml tipsharks
```
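
Unlike `docker compose`, `docker stack deploy` has no `--env-file` option; variables referenced in the compose file are substituted from the shell environment. One way to supply them is sketched below. It assumes `.env.prod` contains plain `KEY=value` lines that are also valid shell; the function name is illustrative.

```bash
# Export the variables from an env file, then deploy the stack.
# Runs in a subshell so the secrets do not stay in the calling shell.
deploy_stack() {
  local env_file="${1:-./.env.prod}"
  (
    set -a                    # export every variable assigned from here on
    . "$env_file" || exit 1   # load KEY=value lines as shell assignments
    set +a
    docker stack deploy -c docker-compose.prod.yml tipsharks
  )
}

# Usage:
#   deploy_stack ./.env.prod
```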

> **Warning**: Databases should NOT be replicated in Swarm mode without shared storage or external DB hosting.

### Cloud database recommendation

For production at scale, consider using managed database services:

- **PostgreSQL**: AWS RDS, Azure Database for PostgreSQL, or DigitalOcean Managed Databases
- **MongoDB**: MongoDB Atlas
- **Redis**: Upstash or AWS ElastiCache

---

## Backup Strategy

### Automated daily backups

Create a cron script `/etc/cron.daily/tipsharks-backup` and make it executable (`chmod +x`):

```bash
#!/bin/bash
# Abort on errors, including a failed dump on the left side of a pipe
set -euo pipefail

BACKUP_DIR=/var/backups/tipsharks
DATE=$(date +%Y%m%d-%H%M%S)
mkdir -p "$BACKUP_DIR"

# Backup ingest database
docker exec tipsharks-postgres-ingest pg_dump -U racing racing_db \
  | gzip > "$BACKUP_DIR/ingest-$DATE.sql.gz"

# Backup elo database
docker exec tipsharks-postgres-elo pg_dump -U tipsharks tipsharks \
  | gzip > "$BACKUP_DIR/elo-$DATE.sql.gz"

# Backup MongoDB (archive is streamed to stdout so it is written on the host,
# not inside the container). MONGO_ROOT_PASSWORD must be set in this script's
# environment; cron does not load .env.prod.
docker exec tipsharks-mongo mongodump \
  --username admin --password "$MONGO_ROOT_PASSWORD" \
  --archive > "$BACKUP_DIR/mongo-$DATE.archive"

# Keep only last 30 days
find "$BACKUP_DIR" -name "*.gz" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.archive" -mtime +30 -delete
```

### Restore from backup

```bash
# Ingest PostgreSQL
gunzip -c ingest-20250101-120000.sql.gz | docker exec -i tipsharks-postgres-ingest psql -U racing -d racing_db

# Elo PostgreSQL
gunzip -c elo-20250101-120000.sql.gz | docker exec -i tipsharks-postgres-elo psql -U tipsharks -d tipsharks

# MongoDB (archive is read from stdin on the host)
docker exec -i tipsharks-mongo mongorestore --username admin --password "$MONGO_ROOT_PASSWORD" --archive < mongo-20250101-120000.archive
```
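
Before relying on a dump for a restore, it is worth checking that the compressed files are intact and not empty. The sketch below covers the PostgreSQL dumps only; the function name is illustrative.

```bash
# Check every .sql.gz file in a directory: it must be a valid gzip stream
# and decompress to at least one byte. Returns non-zero if any file fails.
verify_backups() {
  local dir="$1" f status=0
  for f in "$dir"/*.sql.gz; do
    [ -e "$f" ] || { echo "no .sql.gz files in $dir" >&2; return 1; }
    # gzip -t fails on truncated or corrupt files; the second test rejects
    # a valid archive with empty contents (for example from a failed dump)
    if gzip -t "$f" 2>/dev/null &&
       [ "$(gzip -dc "$f" 2>/dev/null | head -c 1 | wc -c)" -eq 1 ]; then
      echo "ok      $f"
    else
      echo "INVALID $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   verify_backups /var/backups/tipsharks
```

A passing check does not prove the dump restores cleanly; restoring into a scratch database from time to time remains the stronger test.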

### What to back up

| Data | Frequency | Retention | Method |
|------|-----------|-----------|--------|
| PostgreSQL (ingest) | Daily | 30 days | pg_dump |
| PostgreSQL (elo) | Daily | 30 days | pg_dump |
| MongoDB | Daily | 30 days | mongodump |
| Environment file | On change | Manual | Secure vault |
| SSL certificates | On renewal | Manual | Archive `/etc/letsencrypt` |
| Nginx config | On change | Git history | Source control |

---

## Monitoring

### Health checks

Each service has a Docker health check defined. The overall stack health can be monitored via:

```bash
docker ps --format "table {{.Names}}\t{{.Status}}"
```
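
For unattended monitoring, the same information can be turned into an exit status. The sketch below relies on the `health` filter of `docker ps`; the function names are illustrative.

```bash
# Print the names of containers whose health check currently reports "unhealthy".
list_unhealthy() {
  docker ps --filter "health=unhealthy" --format '{{.Names}}'
}

# Return 0 when nothing is unhealthy, 1 when something is,
# and 2 when docker itself could not be queried.
check_stack_health() {
  local bad
  bad="$(list_unhealthy)" || return 2
  if [ -n "$bad" ]; then
    echo "unhealthy containers:" >&2
    echo "$bad" >&2
    return 1
  fi
  echo "no unhealthy containers"
}

# Usage (e.g. from cron):
#   check_stack_health || echo "TipSharks stack needs attention"
```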

### Application metrics

- **tab-api-ingest**: Prometheus metrics at port 9090 (`/metrics`)
- **tipsharks-elo-api**: FastAPI health endpoint at `/health`

### Logging

All services log to stdout (Docker). View logs:

```bash
# All services
docker compose -f docker-compose.prod.yml logs -f

# Single service
docker compose -f docker-compose.prod.yml logs -f tipsharks-elo-api

# Last 100 lines with timestamps
docker compose -f docker-compose.prod.yml logs --tail=100 -t
```

### Recommended external monitoring

- **Uptime monitoring**: UptimeRobot, Pingdom, or Checkly
- **Error tracking**: Sentry (integrate with FastAPI and Express)
- **Infrastructure**: Prometheus + Grafana (development configs in each service directory)
- **Log aggregation**: Use Docker logging drivers (e.g., `gelf`, `fluentd`, `awslogs`)

---

## Security Checklist

- [ ] All default passwords changed in `.env.prod`
- [ ] SSL/TLS enabled on port 443
- [ ] HTTP -> HTTPS redirect active
- [ ] Security headers configured in nginx
- [ ] `API_ADMIN_TOKEN` set to a strong random value
- [ ] `CORS_ORIGINS` restricted to your domain(s)
- [ ] PostgreSQL ports not exposed to the internet
- [ ] MongoDB authentication enabled
- [ ] Redis secured (ACL or `--requirepass` in production)
- [ ] Docker daemon not exposed on TCP (no `-H tcp://`)
- [ ] Regular security updates applied to host OS
- [ ] Firewall configured (ports 80, 443, SSH only)

---

## Troubleshooting

### Service won't start

Check logs:
```bash
docker compose -f docker-compose.prod.yml logs <service-name>
```

### Database connection refused

Ensure the database service is healthy:
```bash
docker compose -f docker-compose.prod.yml ps
```

Check that the connection URLs in `.env.prod` (`INGEST_DATABASE_URL`, `ELO_DATABASE_URL`, `MONGO_URL`) use the Docker service name as the host, not `localhost`.
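
A quick way to see which host a connection URL points at is sketched below. It handles simple `scheme://user:pass@host:port/db` URLs like the examples in the URL configuration table, not IPv6 literals; the function names are illustrative.

```bash
# Print the host part of a connection URL.
url_host() {
  local rest="${1#*://}"     # drop the scheme
  rest="${rest##*@}"         # drop user:pass@ (everything up to the last "@")
  rest="${rest%%[/:?]*}"     # drop :port, /database and ?options
  printf '%s\n' "$rest"
}

# Fail if the URL points at the container itself instead of a Docker service.
check_db_host() {
  case "$(url_host "$1")" in
    localhost|127.0.0.1|"") return 1 ;;
    *) return 0 ;;
  esac
}

# Usage:
#   url_host 'postgresql://user:pass@postgres-ingest:5432/racing_db'
#   check_db_host "$INGEST_DATABASE_URL" || echo "URL must use the Docker service name"
```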

### nginx: upstream host not found

Services start asynchronously, so nginx may try to reach an upstream before that service is ready. The `depends_on` condition orders container startup, but nginx itself does not wait for upstream readiness. The error usually clears within seconds. If it persists, restart nginx:

```bash
docker compose -f docker-compose.prod.yml restart nginx
```

### Out of memory

Check resource limits in `docker-compose.prod.yml` and increase as needed. Monitor with:

```bash
docker stats
```
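
To confirm that a container was actually killed for exceeding its memory limit, Docker records an `OOMKilled` flag in the container state. The sketch below reads it and prints a one-shot memory snapshot; the function names are illustrative, and `tipsharks-mongo` is the container name used in the backup script.

```bash
# Succeed if Docker recorded that the given container was killed for
# exceeding its memory limit.
was_oom_killed() {
  [ "$(docker inspect --format '{{.State.OOMKilled}}' "$1")" = "true" ]
}

# Print a single snapshot of memory usage per container instead of a live view.
mem_snapshot() {
  docker stats --no-stream --format 'table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}'
}

# Usage:
#   was_oom_killed tipsharks-mongo && echo "raise the memory limit for mongo"
#   mem_snapshot
```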
