# HRNZ API Alignment Issues - Critical Fixes Required

**Date**: 2025-12-27
**Status**: CRITICAL - Code does not match HRNZ API specification

## Summary

The codebase is **critically misaligned** with the official HRNZ API v1.1 specification. The endpoint paths, the authentication method, and the field mappings are all incorrect and need immediate fixing.

---

## 1. API Endpoints - CRITICAL ❌

### Current Implementation (WRONG)
```python
# In client.py
async def get_meetings(self, date_from: str, date_to: str):
    response = await self._request_with_retry(
        "GET", "meeting",  # WRONG ENDPOINT
        params={"date": date_from, "toDate": date_to},  # WRONG PARAMS
    )

async def get_meeting(self, meeting_id: int):
    response = await self._request_with_retry("GET", f"meeting/{meeting_id}")  # WRONG

async def get_race(self, meeting_id: int, race_number: int):
    response = await self._request_with_retry(
        "GET", f"meeting/{meeting_id}/race/{race_number}"  # WRONG
    )
```

### HRNZ API Spec (CORRECT)
```
GET /racing/meetings?startDate={YYYY-MM-DD}&endDate={YYYY-MM-DD}
GET /racing/meetings/{meetingId}
GET /racing/meetings/{meetingId}/races/{raceId}
```

**Note**: The spec identifies a race by `raceId`, not by `race_number`. We need to look up `raceId` in the meeting's races array before fetching a race.

### Fix Required
- Change `"meeting"` → `"racing/meetings"`
- Change params to `startDate` and `endDate`
- Change `f"meeting/{id}/race/{num}"` → `f"racing/meetings/{meetingId}/races/{raceId}"`
- Resolve `raceId` from the meeting response before fetching a race; `race_number` alone is not enough
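
The path and parameter changes above can be sketched as small pure helpers. This is a minimal sketch, not the client's actual code: the helper names are hypothetical, and `"races"` as the key of the meeting's races array is an assumption that should be checked against a real response.

```python
from __future__ import annotations

from typing import Any, Optional

MEETINGS_PATH = "racing/meetings"


def meetings_query(date_from: str, date_to: str) -> dict[str, str]:
    """Query parameters for GET /racing/meetings (dates as YYYY-MM-DD)."""
    return {"startDate": date_from, "endDate": date_to}


def meeting_path(meeting_id: int) -> str:
    """Relative path for GET /racing/meetings/{meetingId}."""
    return f"{MEETINGS_PATH}/{meeting_id}"


def race_path(meeting_id: int, race_id: int) -> str:
    """Relative path for GET /racing/meetings/{meetingId}/races/{raceId}."""
    return f"{MEETINGS_PATH}/{meeting_id}/races/{race_id}"


def find_race_id(meeting: dict[str, Any], race_number: int) -> Optional[int]:
    """Look up raceId by raceNumber in a meeting payload.

    Assumption: the races array is exposed under the key "races".
    Returns None when no race with that number is found.
    """
    for race in meeting.get("races") or []:
        if race.get("raceNumber") == race_number:
            return race.get("raceId")
    return None
```

With helpers like these, `get_race` could keep accepting a race number, fetch the meeting first, and fail clearly when `find_race_id` returns `None`.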

---

## 2. Authentication - CRITICAL ❌

### Current Implementation (WRONG)
```python
# client.py lines 234-249
async def _request_with_retry(...):
    # Use basic auth for all requests (hrKey not used for authentication)
    auth = (self.hrnz_config.username, self.hrnz_config.password)

    response = await self._client.request(
        method, url, params=params, auth=auth  # WRONG!
    )
```

The code fetches `hrKey` but never sends it with any request.

### HRNZ API Spec (CORRECT)
```yaml
security:
  - ApiKeyAuth: []  # For all endpoints except /security/hrkey

securitySchemes:
  basicAuth:  # Only for /security/hrkey
    type: http
    scheme: basic

  ApiKeyAuth:  # For all other endpoints
    type: apiKey
    name: X-HR-KEY  # Header name
    in: header
```

### Fix Required
- `/security/hrkey` endpoint: Use basic auth (already correct)
- ALL other endpoints: Use `X-HR-KEY` header with the hrKey value
- Remove basic auth from regular requests
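
One way to express the rule above is a helper that picks the credentials per request path. This is a sketch under assumptions: `auth_kwargs` is a hypothetical name, and paths are assumed to be relative, as in the existing `_request_with_retry` calls.

```python
from __future__ import annotations

from typing import Any, Optional

HRKEY_PATH = "security/hrkey"  # the only endpoint that uses basic auth
KEY_HEADER = "X-HR-KEY"  # header name from the spec's ApiKeyAuth scheme


def auth_kwargs(
    path: str, username: str, password: str, hr_key: Optional[str]
) -> dict[str, Any]:
    """Return the auth-related keyword arguments for one request.

    /security/hrkey gets basic auth; every other endpoint gets the
    X-HR-KEY header and no basic auth at all.
    """
    if path.strip("/") == HRKEY_PATH:
        return {"auth": (username, password)}
    if not hr_key:
        # Fail early instead of sending an unauthenticated request.
        raise ValueError("hrKey must be fetched before calling data endpoints")
    return {"headers": {KEY_HEADER: hr_key}}
```

Inside `_request_with_retry`, the result could be spread into the existing call, for example `self._client.request(method, url, params=params, **auth_kwargs(...))`.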

---

## 3. Meeting Field Mappings - HIGH PRIORITY ❌

### Current Implementation (WRONG)
```python
# repositories.py MeetingRepository.upsert()
meeting_date = meeting_data.get("date")  # WRONG FIELD NAME
venue = safe_get(meeting_data, "venue", "name", default="Unknown")  # WRONG STRUCTURE
```

### HRNZ API Spec Response (CORRECT)
```json
{
  "meetingId": 12345,
  "meetingDate": "2024-01-15",  // NOT "date"
  "clubNumber": 42,
  "clubName": "Auckland",  // NOT "venue.name"
  "track": {
    "number": 1,
    "name": "Alexandra Park",
    "direction": "Anticlockwise"
  },
  "numberOfRaces": 8
}
```

### Fix Required
- Change `meeting_data.get("date")` → `meeting_data.get("meetingDate")`
- Change venue mapping to use `clubName` instead of `venue.name`
- Consider storing `track.name` as venue instead of `clubName`
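
The corrected meeting mapping might look like the sketch below. The function name and the output keys are hypothetical, and preferring `track.name` over `clubName` for the venue is only the option raised above, not a settled decision.

```python
from __future__ import annotations

from typing import Any


def extract_meeting_fields(meeting_data: dict[str, Any]) -> dict[str, Any]:
    """Map an HRNZ meeting payload onto flat storage fields."""
    # "track" is a nested object; tolerate it being missing or null.
    track = meeting_data.get("track") or {}
    club_name = meeting_data.get("clubName")
    return {
        "meeting_id": meeting_data.get("meetingId"),
        "meeting_date": meeting_data.get("meetingDate"),  # not "date"
        "club_name": club_name,
        "track_name": track.get("name"),
        # Assumption: venue is the track name, falling back to the club name.
        "venue": track.get("name") or club_name or "Unknown",
    }
```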

---

## 4. Race Field Mappings - HIGH PRIORITY ❌

### Current Implementation (WRONG)
```python
# repositories.py RaceRepository.upsert()
distance_m = race_data.get("distance")  # WRONG - returns object, not int
start_type = race_data.get("startType")  # ✓ CORRECT
gait = race_data.get("gait")  # WRONG FIELD NAME
race_datetime = race_data.get("startTime")  # May need parsing
```

### HRNZ API Spec Response (CORRECT)
```json
{
  "raceId": 789,
  "raceNumber": 1,
  "raceGait": "Pace",  // NOT "gait"
  "startType": "Mobile",
  "distance": {
    "units": "Metres",  // or "Yards"
    "distance": 2000
  },
  "startTime": "14:30"
}
```

### Fix Required
- Extract distance: `race_data.get("distance", {}).get("distance")`
- Change `gait` → `raceGait`
- Parse `startTime` combined with `meetingDate` to create `race_datetime`
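
The three race fixes can be sketched together as follows. The function names are hypothetical. Two assumptions are made and marked in the comments: distances given in yards are converted to metres, and `startTime` is an `HH:MM` local time, so the combined datetime is left timezone-naive.

```python
from __future__ import annotations

from datetime import datetime
from typing import Any, Optional

YARDS_TO_METRES = 0.9144  # exact definition of the international yard


def extract_distance_m(race_data: dict[str, Any]) -> Optional[int]:
    """Return the race distance in whole metres, or None if absent."""
    distance = race_data.get("distance") or {}
    value = distance.get("distance")
    if value is None:
        return None
    # Assumption: yards are converted so the stored value is always metres.
    if distance.get("units") == "Yards":
        return round(value * YARDS_TO_METRES)
    return int(value)


def build_race_datetime(
    meeting_date: Optional[str], start_time: Optional[str]
) -> Optional[datetime]:
    """Combine "YYYY-MM-DD" and "HH:MM" into a naive datetime."""
    if not meeting_date or not start_time:
        return None
    # Assumption: startTime is local time in HH:MM form, as in the example.
    return datetime.strptime(f"{meeting_date} {start_time}", "%Y-%m-%d %H:%M")


def extract_race_fields(race_data: dict[str, Any], meeting_date: str) -> dict[str, Any]:
    """Map an HRNZ race payload onto flat storage fields."""
    return {
        "race_id": race_data.get("raceId"),
        "race_number": race_data.get("raceNumber"),
        "gait": race_data.get("raceGait"),  # not "gait"
        "start_type": race_data.get("startType"),
        "distance_m": extract_distance_m(race_data),
        "race_datetime": build_race_datetime(meeting_date, race_data.get("startTime")),
    }
```

If the stored `race_datetime` must be timezone-aware, the zone would need to be attached explicitly once the API's time convention is confirmed.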

---

## 5. Runner/Starter Field Mappings - HIGH PRIORITY ❌

### Current Implementation (WRONG)
```python
# repositories.py StarterRepository.upsert()
barrier = runner_data.get("barrier")  # WRONG FIELD NAME, WRONG TYPE
handicap_m = runner_data.get("handicap")  # ✓ Correct field (spec example shows an int, in metres)
placing = runner_data.get("placing")  # WRONG - must come from result object
did_not_finish = runner_data.get("didNotFinish", False)  # Field doesn't exist in spec
```

### HRNZ API Spec Response (CORRECT)
```json
{
  "horseId": 123,
  "horseName": "Fast Horse",
  "barrierDraw": "1",  // STRING not int!
  "handicap": 0,  // Metres (int)
  "scratched": false,
  "horse": {...},
  "driver": {
    "driverId": 456,
    "driverName": "John Smith"
  },
  "trainer": {
    "trainerId": 789,
    "trainerName": "Jane Doe"
  },
  "result": {  // Only present after race is run
    "placing": "1",  // STRING or number?
    "isEqualPlace": false,
    "stakesWon": 10000.00,
    "horseTime": 154.2,
    "mileRate": 119.5
  }
}
```

### Fix Required
- Change `barrier` → `barrierDraw` (and handle string → int conversion)
- Get `placing` from `result.placing` if result exists
- Use the `scratched` field rather than `didNotFinish`, which does not exist in the spec
- Handle case where `result` object doesn't exist (future races)
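
A sketch of the runner mapping with these fixes applied is shown below. The names are hypothetical. Because the spec leaves open whether `placing` is a string or a number, the conversion accepts both and returns `None` for anything that is not a whole number.

```python
from __future__ import annotations

from typing import Any, Optional


def to_int(value: Any) -> Optional[int]:
    """Convert "1" or 1 to 1; return None for missing or non-integer values."""
    if value is None:
        return None
    try:
        return int(str(value).strip())
    except ValueError:
        return None


def extract_starter_fields(runner_data: dict[str, Any]) -> dict[str, Any]:
    """Map an HRNZ runner payload onto flat storage fields."""
    # "result" is only present once the race has been run.
    result = runner_data.get("result") or {}
    return {
        "horse_id": runner_data.get("horseId"),
        "barrier": to_int(runner_data.get("barrierDraw")),  # string in the spec
        "handicap_m": runner_data.get("handicap"),
        "scratched": bool(runner_data.get("scratched", False)),
        "placing": to_int(result.get("placing")),
        "has_result": bool(result),
    }
```

Returning `None` rather than raising keeps ingestion running when a value is not numeric; whether such values should be logged or stored separately is a design decision left open here.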

---

## 6. Settings Configuration - MEDIUM PRIORITY ❌

### Current Implementation (WRONG)
```python
# settings.py line 22
class HRNZSettings(BaseSettings):
    key_header: str = Field(default="hrKey", description="Header name for HR key")
```

### Fix Required
- Change default from `"hrKey"` → `"X-HR-KEY"` to match API spec

---

## 7. Response Structure Issues - MEDIUM PRIORITY

### Current Code Assumptions
The code assumes:
```python
meetings = response.get("meetings", [])  # May be wrong wrapping
```

### HRNZ API Spec
The GET `/racing/meetings` endpoint returns:
```json
[
  {
    "meetingId": 123,
    "meetingDate": "2024-01-15",
    ...
  },
  ...
]
```

The spec shows an **array at the root**, not one wrapped in `{"meetings": [...]}`.

This still needs to be verified against a live response, since the actual API may wrap the array.
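
Until the real response shape is confirmed, the client could accept both forms. The following is a defensive sketch with a hypothetical function name, not a statement of what the API returns.

```python
from __future__ import annotations

from typing import Any


def normalize_meetings(payload: Any) -> list[dict[str, Any]]:
    """Return the list of meetings from either response shape.

    Accepts a root-level array (as the spec shows) or an object that
    wraps the array under "meetings" (as the current code assumes).
    """
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict) and isinstance(payload.get("meetings"), list):
        return payload["meetings"]
    raise ValueError(f"Unexpected meetings payload type: {type(payload).__name__}")
```

Raising on any other shape surfaces a spec mismatch immediately instead of silently ingesting zero meetings.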

---

## Impact Assessment

### Critical (Breaks Functionality)
1. ❌ Authentication - requests will fail with 401/403
2. ❌ API endpoints - requests return 404 or wrong data
3. ❌ Field mappings - data extraction fails, leaving NULL values throughout

### High Priority (Data Quality Issues)
1. Meeting/race field mappings cause incorrect data storage
2. Missing or incorrect results data
3. Type mismatches (string vs int for barrier)

### Medium Priority (Future Issues)
1. Settings configuration needs updating
2. Response structure assumptions

---

## Testing Plan

After fixes, we must:
1. Test authentication flow (get hrKey, use in subsequent requests)
2. Test meeting list retrieval
3. Test individual meeting fetch
4. Test race fetch with runners
5. Verify all field mappings with actual API responses
6. Run full ingestion for a single date
7. Verify database contains correct data
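
For step 5, a small helper can report which expected keys are missing from a live payload. This is a sketch: the key sets are taken from the spec excerpts in this document, the helper name is hypothetical, and whether every key is always present in real responses is an assumption to be tested.

```python
from __future__ import annotations

from typing import Any, Iterable

# Keys taken from the spec excerpts in this document.
EXPECTED_MEETING_KEYS = {"meetingId", "meetingDate", "clubName", "track", "numberOfRaces"}
EXPECTED_RACE_KEYS = {"raceId", "raceNumber", "raceGait", "startType", "distance", "startTime"}


def missing_keys(payload: dict[str, Any], expected: Iterable[str]) -> list[str]:
    """Return the expected keys absent from the payload, sorted by name."""
    return sorted(set(expected) - set(payload))
```

Running this against one recorded response per endpoint gives a quick list of mapping mismatches before the full ingestion run.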

---

## Files to Modify

1. `/home/user/tipsharks/harnesselo/packages/hrnz_client/client.py` - Endpoints and auth
2. `/home/user/tipsharks/harnesselo/packages/common/settings.py` - Header name
3. `/home/user/tipsharks/harnesselo/packages/storage/repositories.py` - Field mappings
4. Test files - Update mocks and expectations

---

## Priority Order

1. **FIRST**: Fix authentication (X-HR-KEY header)
2. **SECOND**: Fix API endpoint paths
3. **THIRD**: Fix meeting field mappings
4. **FOURTH**: Fix race field mappings
5. **FIFTH**: Fix runner/starter field mappings
6. **SIXTH**: Test with real API
7. **SEVENTH**: Update tests

---

## Notes

- The HRNZ API uses **camelCase** for all field names
- Many fields are objects, not primitives (distance, track, result)
- The `raceId` is different from `raceNumber` - we need both!
- String types are used for some fields we might expect as ints (barrierDraw, placing)
- Results are only present for past races, not future races
