All you need are the stats…
Until very recently, like yesterday, I had no idea there was a website that revolved entirely around pro surfing stats. A real nice man called Balyn McDonald had hit me up, asked if I’d consider running some of their upcoming interviews with pros. Told him, if they’re any good, why not?
But what interested me more was the detail in his website, surf-stats.com. I wondered: is there a way of Moneyballing surf? Crunch enough numbers and swoop on not just Fantasy Surfer’s paltry gift-bags but on the big betting co’s that sell surf?
So, yeah, let’s talk.
BeachGrit: You said you were surprised at the odds betting agencies give. Why?
Balyn: I’m not sure what models they’re using (and I’d love to see their systems as the betting agencies usually have the best stats in the game), but it seems that some of their odds neglect factors that fantasy fans or even knowledgeable surfers may identify as significant.
One example is Jordy Smith for Pipe. Sure he has looked great at times this year and he has a top seeding, but Jordy hasn’t bettered a 13th at Pipe since 2010 and his record in left-hand reef breaks is way lower than, say, right-hand beachies. If the conditions are four-to-six-foot Backdoor, then maybe you would take a punt, but otherwise the odds just don’t stack up.
The odds for surf markets aren’t too bad either. While John John is the universal favourite for Pipe and he’s down to around four or five dollars on most sites, he’s no guarantee (he’s still never won the Pipe Masters event) and the pay-out isn’t huge. But look to the second and third-lowest odds and the pay-off isn’t too stingy. Medina is around $7.50 and Slater is as high as $9, which are fairly good odds for past winners or perennial favourites. If you put $10 on each, the worst you’d get is a $20 profit if one of the three won and the best result would be $60. Of course, you could get something like last year where Adriano defied the odds to win.
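The arithmetic behind that three-way bet can be checked with a short sketch. It assumes the dollar figures are decimal odds (the total returned per $1 staked), takes John John at $5, the top of the quoted range, and uses a hypothetical helper named `profit_if_wins`.

```python
# Prices quoted in the interview, read as decimal odds (an assumption).
# John John is taken at $5, the top of the "four or five dollars" range.
odds = {"John John": 5.00, "Medina": 7.50, "Slater": 9.00}
stake_each = 10.0
total_staked = stake_each * len(odds)  # $30 across the three bets


def profit_if_wins(winner: str) -> float:
    """Net profit if `winner` takes the event and the other two bets lose."""
    return odds[winner] * stake_each - total_staked


profits = {name: profit_if_wins(name) for name in odds}
for name, profit in profits.items():
    print(f"{name} wins: ${profit:.2f} profit")
print(f"Anyone else wins: ${-total_staked:.2f}")
```

With John John at $4 instead, the worst winning outcome would drop to a $10 profit, and if none of the three wins the full $30 is lost.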
BeachGrit: What are the most surprising stats, right now?
There are always random stats that you could spit out:
* Margies provided the highest event average heat score (13.37) across all surfers this year, Tahiti was the worst (10.87).
* Wiggolly Dantas has the worst AHS for Pipe over the past 2 years (only one event at 2.82), while Slater has the best (14.20). Medina is second with 12.88 and John third at 12.85.
* Medina’s AHS for 2016 is nearly identical to the year he won. John’s has risen by nearly 1.5 in that same time, despite people consistently claiming that he’s surfing safe.
* Adriano’s AHS has dropped by 1.5 since last year.
* Slater’s AHS in lefts is nearly two points higher than rights, and his reef break averages are more than four points higher than in beach breaks.
* Jeremy Flores has probably been the most unlucky surfer on tour this year, with an AHS better than top-placed guys like Adriano and Wilko but no results to show for it.
BeachGrit: How can I use your stats to game the betting co’s?
To be honest, I only started betting as a method of testing my stats against the market. Most of the time I prefer to just have clubhouse wagers against my mates. I’m sure there are people out there who could school me in the ways of betting seriously, so take what I say with a grain of salt:
I strongly suggest that players pay close attention to the forecast conditions as these can’t possibly be taken into consideration by the agencies when they open the market. Different surfers can defy their historical averages – and odds – at an event if the conditions are atypical. If we’d known that the event would be held in soft, two-foot lefts we’d have definitely given someone like Keanu Asing more of a chance in France.
Another essential is shopping around. I mentioned Jordy above as an example: he’s $12 on some sites – which I’d never take – but he’s $23 on another, which seems a little more justifiable if the conditions were perfect. Another example is Flores. He ranges from $26 to $34, and he isn’t a bad dark-horse given his stats in these conditions (though he’s had a pretty poor year).
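To see why the spread between sites matters, here is a minimal sketch that converts the quoted prices into the win chance each one implies. It assumes the prices are decimal odds and ignores the bookmaker's margin; `implied_probability` is a hypothetical helper name.

```python
# Lowest and highest prices mentioned in the interview (assumed decimal odds).
quotes = {"Jordy Smith": [12.0, 23.0], "Jeremy Flores": [26.0, 34.0]}


def implied_probability(decimal_odds: float) -> float:
    """Win chance a price implies, ignoring the bookmaker's margin."""
    return 1.0 / decimal_odds


# The longest price on offer is the one worth taking for a given surfer.
best_price = {name: max(prices) for name, prices in quotes.items()}

for name, prices in quotes.items():
    low, high = min(prices), max(prices)
    print(f"{name}: ${low:g} implies {implied_probability(low):.1%}, "
          f"${high:g} implies {implied_probability(high):.1%}")
```

The longer the price, the lower the win chance you need to believe in for the bet to be worth making, which is why the same surfer can be a pass on one site and a reasonable punt on another.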
Also, I hedge my bets slightly to cover losses; I’ve generally played this year with $5-10 on whoever I pick as my favourite, and then between $2 and $5 on one or two back-up surfers. I have found that I’ve broken even at worst across the year. I actually had a progressive bet rolling across a few contests where I re-invested winnings into the next contest (guaranteeing no net loss). It had built up nicely on the back of a few wins, but it ended badly when I put it all on Medina in France. I made some money back with John in Portugal though.
BeachGrit: How did this whole thing start? And how deep do you dig, stats-wise?
We actually have limitations on what we’re officially “allowed” to use. The site was originally set-up and run by a guy called Mike Jordan (no, not him), who was pretty fucking good with stats. He knew the game and he had the skills to play it well. The data that he was originally using went all the way back so that it covered the careers of every tour surfer, so we’re talking 20-plus years for the likes of Slater. I emailed him and asked if I could write some pieces to flesh out the data and put some more personality into the site and we started working together from there.
At the end of last year, Mike was snatched up by the WSL as their official stats guy. He’s currently running a fine-toothed comb through every archived contest in ASP/WSL history and feeding it into their database. He then provides the WSL with a holistic perspective on all history, outcomes and possibilities for contests. If you hear Joe Turpel quoting something crazily specific like, “No goofy-footed surfer has ever won this contest in overcast, onshore conditions since 1996”, there’s a good chance that he’s quoting Mike’s stats. Actually, it’s surprising how much more the statistics are being used by the WSL since Mike came on board. I’m not sure if you guys have noticed, but as someone who deals with surfing stats all the time I’ve been impressed by how much they’ve stepped it up in their commentary this year.
One of the problems for us was that Mike had to cut ties with the site before his contract with WSL started, leaving behind only the information that was available to anyone on the public record via heats-on-demand from the WSL site (which, as of 12 months ago, was the 2014/15 seasons – it’s now even less). It would be great if we still had access to his databases, but there were specific stipulations expressed by the WSL when they poached him. Older results are still available on the public record, but without the video footage it’s hard to be as detailed with our data regarding elements such as wind, swell, weather etc. For the most part though, three years of data is more than enough to keep our numbers relevant.
As far as new input goes, we have a couple of spreadsheets for the 2016 season and we enter new data as each contest progresses, including results, scores, conditions details and more. The rest is about how you use that current data along with the historical information to create models that are meaningful. We publish sortable tables in the lead-up to each event to provide readers with a snapshot for that contest. We also give projections for surfers based on algorithms that we have created. The published tables can be manipulated so that readers can pick and choose which stats they value and which ones they want to ignore. There are claims like “stats don’t lie”, which is absolutely true as they are quantitative facts, but we aren’t arrogant enough to suggest that future results can’t deviate from historically-based projections. Just look at Keanu in France. We missed that one by a mile.
BeachGrit: Tell me about your magic algorithm? Did you create it and how?
Well, I can’t really give that away as anyone can replicate it if they have the time and know how to use Excel at a functional level. I will say that it uses more data than what’s published in our sortable tables. We basically create an Average Heat Score (AHS) projection for each surfer and then we use that projection to calculate the likelihood of surfers making heats.
I created a model at the start of the season based on something Mike had previously created, but slightly modified it because our database had changed so much with his leaving. I tweaked it across the first half of the season as we saw how the numbers were falling, and now it’s actually quite different (in subtle ways). We’ve got some changes that we want to apply for next season too so that the projections more accurately reflect surfers’ form, but that will have to wait for the off-season.
BeachGrit: Do you Moneyball Fantasy Surfer? How accurate are your stats in predicting winners?
We do to an extent, but Fantasy Surfer is harder to calculate for individuals as we can’t factor in all of the variations in surfer prices. We do provide a cost-per-projected-point value for each surfer at their current market value, which gives a good basis for comparison, but players will need to check that data against their own team’s prices. The WSL game is a lot simpler as it breaks the surfers into tiers, so we can recommend surfers for each tier and the information is consistent for everyone.
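A cost-per-projected-point comparison of that kind might look like the sketch below. The surfers, prices and projections are placeholders invented for illustration, not figures from surf-stats.com, and `cost_per_point` is a hypothetical helper; players would substitute the prices from their own team.

```python
from dataclasses import dataclass


@dataclass
class Surfer:
    name: str
    price: float             # hypothetical fantasy price, in millions
    projected_points: float  # hypothetical projection for the event


def cost_per_point(surfer: Surfer) -> float:
    """Price paid for each projected point; lower means better value."""
    return surfer.price / surfer.projected_points


# Placeholder field, purely illustrative.
field = [
    Surfer("Surfer A", price=10.0, projected_points=4000),
    Surfer("Surfer B", price=6.0, projected_points=3000),
    Surfer("Surfer C", price=3.0, projected_points=1000),
]

ranked = sorted(field, key=cost_per_point)
for s in ranked:
    print(f"{s.name}: {cost_per_point(s) * 1000:.2f} per 1,000 projected points")
```

In this made-up field the mid-priced surfer is the best value, even though the most expensive one has the highest projection.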
As far as accuracy goes, we find that our teams based purely on the data (our “Numbers” teams) are basically guaranteed to do OK, but they can’t predict the anomalies such as Wilko at Cooly or Seabass at Margies. The data will ensure that you make an informed decision and will stop you from sucking completely, but the top fantasy players are the ones who know when and where to deviate from history and make a bold decision based on outlier factors or gut instincts. That’s why we always write up a full contest analysis in order to put the stats into context and detail the contest as a whole.
To put a number on it, it’s easier to use the WSL game model for measuring success. A “perfect” FS team isn’t always possible due to salary cap constraints and, due to fluctuations in prices, is hard to compare. Given that you are required to choose eight surfers from a field of 36 in each contest, your base-line, close-your-eyes-and-throw-a-dart-at-the-board chance of having the perfect team of the best two tier A, four tier B and two tier C surfers is 22.22%. For the 2016 season we are currently sitting at a fairly modest 37%, but that’s partly because our algorithms were in need of tweaking at the beginning of the season. If you look at our recommendations post-Margs, we have selected the perfect team with 55.36% accuracy.
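The interview doesn't spell out how the 22.22% baseline is derived, but one reading that reproduces it is simply eight picks out of a 36-surfer field. The sketch below works on that assumption and compares the two quoted hit rates against it.

```python
from fractions import Fraction

picks = 8        # two tier A + four tier B + two tier C
field_size = 36  # surfers in each contest

# Assumed reading of the 22.22% figure: picks divided by field size.
baseline = Fraction(picks, field_size)
print(f"Random baseline: {float(baseline):.2%}")

# Hit rates quoted in the interview.
season_rate = 0.37
post_margies_rate = 0.5536

print(f"Season so far: {season_rate / float(baseline):.2f}x the baseline")
print(f"Post-Margies:  {post_margies_rate / float(baseline):.2f}x the baseline")
```

Read this way, the percentages describe the share of correct picks rather than the odds of landing all eight at once.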
As another measure, our “Numbers” team in the WSL game is ranked 655th in the world, putting it in the top 0.55% of all registered teams (though we recognise that many of the registered teams may not be full-season or fully active teams).
BeachGrit: Did you give the title to John John, and when?
Our data clearly told us that Wilko couldn’t maintain his lead across the season. While consistency was always going to be his Achilles heel, there was also the fact that he was winning with a much lower AHS than many of his peers. He was still surfing beyond his previous career averages, but our data suggested that he would be overtaken by the likes of John and Gabriel if his heat scores stayed comparable to guys like Jack Freestone who were struggling to requalify.
As far as calling it, we actually had the title race to be much tighter between Gabs and Double-John. The 2016 data favoured John, but we didn’t expect it to be wrapped up before Pipe. A big part of that was down to Medina failing to come anywhere near his projected results for Portugal. It’s worth noting also that, as insignificant as it may seem, Medina’s loss to Keanu in the France final was massive. Had he won in France he’d still be in the title race (and he’d have won me a bit of money).
BeachGrit: What are your projections for Pipe?
We’re still working on our final projections for Pipe, but it goes without saying that our suggestions will depend greatly on the forecast. Looking at our data though, and assuming it’s big Pipe (as opposed to Backdoor), Medina and Slater dominate most areas including reefs, lefts and eight-to-10-foot conditions. If Backdoor is on, guys like Joel and Julian start to look good as well. John’s figures are pretty solid, and we’ve seen some improved numbers from him across most areas this season. He’s definitely on the radar. He’s going to be on everyone’s team though so you won’t get much of an advantage by playing him.
We’ll have our full sortable tables up in a few days, so anyone wanting to check the projections or sift through the finer details can do it on our site. We put our full contest analysis up (summarising the data and factoring in forecasts, Triple Crown form, injuries, wildcards etc.) a few days before the event window. We also send a linked email reminder out to our subscribers just before each contest (our only email contact, I swear) so that they don’t miss events or have to go searching for the data.
BeachGrit: How’s the final top five going to look?
I can’t see it changing a great deal. Mathematically, it looks like the top 13 are all eligible to finish top five after Pipe, but many of them are dependent on others failing. John John has been crowned champ already, so that’s locked, and Medina could only slip as low as fifth in the absolute worst-case scenario. Jordy could possibly overtake Medina with a fifth or better. Wilko and Kolohe are almost identically placed and would need a second at Pipe or better to beat Medina’s current total. Kelly could climb as high as fourth with a Pipe win, but would still need other results to go his way just to make the top five. If you want me to name names, I’m predicting John, Gabe, Jordy, Kolohe and Joel will round out the top five after Pipe.