| 1 | log_odds | Odds | Natural logarithm of the decimal odds. Compresses the wide odds range (1.5 to 50+) so the model treats short and long prices proportionally. |
| 2 | implied_prob | Odds | Break-even probability implied by the odds (1/odds). At $3.00 odds, a bet must win 33.3% of the time to break even. This is the baseline the model tries to beat. |
| 3 | tipster_edge_30d | Edge | Tipster's edge over the last 30 days: actual win rate minus average implied probability. Positive means the tipster has been beating the market recently. Shrunk toward zero when the sample size is under 30 tips. |
| 4 | tipster_edge_all | Edge | Tipster's all-time edge across their full history. More stable than the 30-day edge but slower to react to form changes. Also shrunk toward zero with small samples. |
| 5 | tipster_volume | Edge | Log of the tipster's total bet count. More data makes the tipster's edge measurement more reliable, so this acts as an implicit confidence signal. |
| 6 | source_edge | Edge | Edge of the data source (Betfair, Hub, SEN, Sky Racing, Sportsbet, Racing.com). Measures whether tips from a particular platform systematically outperform or underperform the market. |
| 7 | category_edge | Edge | Edge for the sport/category (horse racing, harness, greyhounds, AFL, etc.). Some sports may be more predictable by experts than others. |
| 8 | odds_bracket_edge | Edge | Edge for the odds range: short (<2.0), mid (2-3.5), midlong (3.5-6), long (6-10), very long (10+). Captures whether experts are better at picking favourites than long shots. |
| 9 | tipster_x_odds_edge | Interaction | Interaction between tipster and odds bracket. A tipster might be excellent at picking short-priced winners but poor at long shots (or vice versa). This feature captures that specialisation. |
| 10 | venue_edge | Racing | Edge for the race venue (Flemington, Randwick, etc.). Some tracks have structural biases that expert tipsters exploit better than others. Zero for non-racing or unknown venues. |
| 11 | jockey_edge | Racing | Jockey profitability edge. Some jockeys consistently win more often than their odds suggest. Measures whether the jockey adds value beyond market expectations. Zero if jockey data is unavailable. |
| 12 | trainer_edge | Racing | Trainer profitability edge. Captures whether a trainer's horses outperform or underperform their market odds. Shrunk toward zero with limited data. |
| 13 | tipster_recent_form | Momentum | Recent 14-day win rate minus all-time win rate. Positive = tipster is on a hot streak. Negative = cold streak. Requires at least 3 recent tips to compute, otherwise zero. |
| 14 | tipster_consensus | Signal | Log of the number of distinct tipsters who picked the same market+selection. More consensus = stronger signal. Built from all historical tips. |
| 15 | odds_movement | Market | Difference between Betfair exchange odds and tip source odds, expressed as a fraction. Positive = price drifting (less backed). Available for ~29% of tips. |
| 16 | is_lay | v7 | 1 if the tip is a LAY bet, 0 for BACK. LAY bets have inverted P/L (small wins, large losses). The model learned that LAY bets at long odds are unprofitable. |
| 17 | consensus_2plus | v7 | Binary: 1 if 2+ tipsters picked the same selection, 0 otherwise. In the data, consensus picks return -3% ROI versus -16.5% for solo picks. |
| 18 | form_score | Racing | Parsed from the horse's recent form string (e.g. "548141"). Weighted average of finishing positions (recent results weighted more). Higher = better form. Zero if no form data. |
| 19 | barrier | Racing | Starting barrier position (stall draw). Inside barriers tend to have an advantage on tight tracks. Zero if not available (~85% of tips). |
| 20 | days_since_run | Racing | Days since the horse last raced. Rest period affects fitness/freshness. The model learned a small negative weight (-0.021), suggesting longer rest is slightly worse. |
| 21 | horse_age | Racing | Horse age in years. Performance varies by age: younger horses may be improving, older horses declining. Zero if not available. |
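
The two odds features (`log_odds` and `implied_prob`) follow directly from the decimal odds. A minimal sketch is shown here; the function name `odds_features` and the input check are illustrative assumptions, not the production code.

```python
import math


def odds_features(decimal_odds: float) -> dict:
    """Compute log_odds and implied_prob from decimal odds (hypothetical helper)."""
    # Assumption: decimal odds are always above 1.0 (stake is included in the price).
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return {
        "log_odds": math.log(decimal_odds),   # natural logarithm compresses the range
        "implied_prob": 1.0 / decimal_odds,   # break-even win probability
    }


features = odds_features(3.0)
print(round(features["implied_prob"], 3))  # 0.333
```

At $3.00 the implied probability is one third, which matches the break-even figure quoted for `implied_prob`.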
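
The edge features (`tipster_edge_30d`, `tipster_edge_all`, `trainer_edge`) are described as win rate minus average implied probability, shrunk toward zero on small samples. The table does not give the shrinkage formula, so the sketch below assumes a simple linear weight of `n / 30` capped at 1. The function name `shrunk_edge` and the `min_tips` parameter are hypothetical.

```python
def shrunk_edge(wins, implied_probs, min_tips=30):
    """Edge = win rate - mean implied probability, shrunk toward zero.

    wins: list of 0/1 outcomes, one per tip.
    implied_probs: list of 1/odds values, one per tip.
    Assumption: shrinkage is linear in sample size below min_tips;
    the actual model may use a different scheme.
    """
    n = len(implied_probs)
    if len(wins) != n:
        raise ValueError("wins and implied_probs must have the same length")
    if n == 0:
        return 0.0
    raw_edge = sum(wins) / n - sum(implied_probs) / n
    weight = min(1.0, n / min_tips)  # 1.0 once the sample reaches min_tips
    return raw_edge * weight


# 10 tips at $3.00 with 4 winners: raw edge of about 0.067, weighted by 10/30.
small_sample = shrunk_edge([1] * 4 + [0] * 6, [1 / 3] * 10)
# 30 tips at $3.00 with 12 winners: same raw edge, no shrinkage.
full_sample = shrunk_edge([1] * 12 + [0] * 18, [1 / 3] * 30)
```

Under this assumption the same raw edge counts for a third as much when it rests on 10 tips instead of 30.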
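
The odds ranges behind `odds_bracket_edge` and `tipster_x_odds_edge` can be assigned with a sorted list of cut points. The sketch below assumes each bracket includes its lower bound and excludes its upper bound, which is consistent with "short (<2.0)" and "very long (10+)" but is otherwise a guess at the boundaries. The names `BRACKET_EDGES`, `BRACKET_NAMES`, and `odds_bracket` are illustrative.

```python
from bisect import bisect_right

# Cut points between brackets, taken from the odds_bracket_edge description.
BRACKET_EDGES = [2.0, 3.5, 6.0, 10.0]
BRACKET_NAMES = ["short", "mid", "midlong", "long", "very_long"]


def odds_bracket(decimal_odds: float) -> str:
    """Map decimal odds to a bracket name (lower bound inclusive, by assumption)."""
    # bisect_right counts how many cut points are <= the odds,
    # which is exactly the index of the bracket.
    return BRACKET_NAMES[bisect_right(BRACKET_EDGES, decimal_odds)]


print(odds_bracket(1.8))   # short
print(odds_bracket(2.0))   # mid
print(odds_bracket(12.0))  # very_long
```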
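
The two consensus features (`tipster_consensus` and `consensus_2plus`) both derive from a count of distinct tipsters per market+selection. A minimal sketch follows, assuming tips arrive as `(tipster, market, selection)` tuples and that the log is a plain natural log, so a solo pick maps to 0. The tuple layout and the function name `consensus_features` are assumptions.

```python
import math
from collections import defaultdict


def consensus_features(tips):
    """Build consensus features per (market, selection) from an iterable of tips."""
    tipsters_by_pick = defaultdict(set)
    for tipster, market, selection in tips:
        # A set ignores repeat tips from the same tipster on the same pick.
        tipsters_by_pick[(market, selection)].add(tipster)
    return {
        pick: {
            "tipster_consensus": math.log(len(names)),  # log of distinct tipsters
            "consensus_2plus": int(len(names) >= 2),    # 1 if two or more agree
        }
        for pick, names in tipsters_by_pick.items()
    }


tips = [
    ("alice", "R1", "Horse X"),
    ("bob", "R1", "Horse X"),
    ("alice", "R1", "Horse X"),  # duplicate tipster, counted once
    ("carol", "R2", "Horse Y"),
]
features = consensus_features(tips)
```

Here `("R1", "Horse X")` has two distinct tipsters, so `consensus_2plus` is 1, while the solo pick on `("R2", "Horse Y")` gets 0 for both features.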