All you need are the stats…
Until very recently, like yesterday, I had no idea there
was a website that revolved entirely around pro surfing
stats. A real nice man called Balyn McDonald had hit me up,
asked if I’d consider running some of their upcoming interviews
with pros. Told him, if they’re any good, why not?
But what interested me more was the detail in his website,
surf-stats.com. I wondered: is there a way of Moneyballing surf? Crunch enough
numbers and swoop on not just Fantasy Surfer’s
paltry gift-bags but on the big betting co’s that sell
surf?
So, yeah, let’s talk.
BeachGrit: You said you were surprised at the odds
betting agencies give. Why?
Balyn: I’m not sure what models they’re using (and I’d love to
see their systems as the betting agencies usually have the best
stats in the game), but it seems that some of their odds neglect
factors that fantasy fans or even knowledgeable surfers may
identify as significant.
One example is Jordy Smith for Pipe. Sure he has looked great at
times this year and he has a top seeding, but Jordy hasn’t bettered
a 13th at Pipe since 2010 and his record in left-hand reef breaks
is way lower than, say, righthand beachies. If the conditions are
four-to-six-foot Backdoor, then maybe you would take a punt, but
otherwise the odds just don’t stack up.
The odds for surf markets aren’t too bad either. While John John
is the universal favourite for Pipe and he’s down to around four or
five dollars on most sites, he’s no guarantee (he’s still
never won the Pipe Masters event) and the pay-out isn’t huge. But
look to the second and third-lowest odds and the pay-off isn’t too
stingy. Medina is around $7.50 and Slater is as high as $9, which
are fairly good odds for past winners or perennial favourites. If
you put $10 on each, the worst you’d get is a $20 profit if
one of the three won and the best result would be $60. Of course,
you could get something like last year where Adriano defied the
odds to win.
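The arithmetic behind that three-way bet can be checked with a short sketch. It assumes the quoted prices are decimal odds (a winning bet returns stake times price, stake included) and takes John John at $5, the top of the range quoted above; both are assumptions, not something Balyn spells out.

```python
# Assumption: prices are decimal odds, so a winning bet returns stake * odds
# (stake included). John John is taken at $5, the top of the quoted range.
STAKE = 10.0
odds = {"John John": 5.0, "Medina": 7.5, "Slater": 9.0}


def profit_if_wins(winner, odds, stake):
    """Net profit when `winner` wins and every listed surfer carries the same stake."""
    total_staked = stake * len(odds)
    return stake * odds[winner] - total_staked


profits = {name: profit_if_wins(name, odds, STAKE) for name in odds}
loss_if_none_win = -STAKE * len(odds)

for name, profit in profits.items():
    print(f"{name} wins: ${profit:.2f} profit")
print(f"Nobody on the ticket wins: ${loss_if_none_win:.2f}")
```

Under those assumptions the profit runs from $20 (John John) to $60 (Slater), matching the figures in the answer, and the whole $30 is lost if someone else takes the event.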
BeachGrit: What are the most surprising stats, right
now?
There are always random stats that you could spit out:
* Margies provided the highest event average heat score (13.37)
across all surfers this year, Tahiti was the worst (10.87).
* Wiggolly Dantas has the worst AHS for Pipe over the past 2 years
(only one event at 2.82), while Slater has the best (14.20). Medina
is second with 12.88 and John third at 12.85.
* Medina’s AHS for 2016 is nearly identical to the year he won.
John’s has risen by nearly 1.5 in that same time, despite people
consistently claiming that he’s surfing safe.
* Adriano’s AHS has dropped by 1.5 since last year
* Slater’s AHS in lefts is nearly two points higher than rights,
and his reef break averages are more than four points higher than
in beach breaks
* Jeremy Flores has probably been the most unlucky surfer on tour
this year, with an AHS better than top-placed guys like Adriano and
Wilko but no results to show for it
BeachGrit: How can I use your stats to game the betting
co’s?
To be honest, I only started betting as a method of testing my
stats against the market. Most of the time I prefer to just have
clubhouse wagers against my mates. I’m sure there are people out
there who could school me in the ways of betting seriously, so take
what I say with a grain of salt:
I strongly suggest that players pay close attention to the
forecast conditions as these can’t possibly be taken into
consideration by the agencies when they open the market. Different
surfers can defy their historical averages – and odds – at an event
if the conditions are atypical. If we’d known that the event would
be held in soft, two-foot lefts we’d have definitely given someone
like Keanu Asing more of a chance in France.
Another essential is shopping around. I mentioned Jordy above as
an example. Well, he’s $12 on some sites – which I’d never take –
but he’s $23 on another, which seems a little more justifiable if
the conditions were perfect. Another example is Flores. He ranges
from $26 to $34, and he isn’t a bad dark-horse given his stats in
these conditions (though he’s had a pretty poor year).
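To see why the price gap matters, here is a minimal sketch that converts each quoted price into the win chance needed to break even. It assumes the prices are decimal odds and ignores bookmaker margin; the function name is illustrative only.

```python
# Prices quoted in the interview, read as decimal odds (an assumption).
quotes = {
    "Jordy Smith": [12.0, 23.0],
    "Jeremy Flores": [26.0, 34.0],
}


def implied_probability(decimal_odds):
    """Win chance at which a bet at these odds breaks even (margin ignored)."""
    return 1.0 / decimal_odds


# Shopping around simply means taking the highest price on offer for each surfer.
best_price = {name: max(prices) for name, prices in quotes.items()}

for name, prices in quotes.items():
    low, high = min(prices), max(prices)
    print(
        f"{name}: ${low:g} needs {implied_probability(low):.1%} to break even, "
        f"${high:g} needs {implied_probability(high):.1%}"
    )
```

At $12 Jordy has to win about one time in 12 for the bet to pay its way; at $23 he only has to win about one time in 23, which is the sense in which the longer price is easier to justify.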
Also, I hedge my bets slightly to cover losses; I’ve generally
played this year with $5-10 on whoever I pick as my favourite, and
then between $2 and $5 on one or two back-up surfers. I have found
that I’ve broken even at worst across the year. I actually had a
progressive bet rolling across a few contests where I re-invested
winnings into the next contest (guaranteeing no net loss). It had
built up nicely on the back of a few wins, but it ended badly
when I put it all on Medina in France. I made some money back with
John in Portugal though.
BeachGrit: How did this whole thing start? And how deep
do you dig, stats-wise?
We actually have limitations on what we’re officially “allowed”
to use. The site was originally set up and run by a guy called
Mike Jordan (no, not him), who was pretty fucking good with stats.
He knew the game and he had the skills to play it well. The data
that he was originally using went all the way back so that it
covered the careers of every tour surfer, so we’re talking 20-plus
years for the likes of Slater. I emailed him and asked if I could
write some pieces to flesh out the data and put some more
personality into the site and we started working together from
there.
At the end of last year, Mike was snatched up by the WSL as
their official stats guy. He’s currently running a fine-toothed
comb through every archived contest in ASP/WSL history and feeding
it into their database. He then provides the WSL with a holistic
perspective on all history, outcomes and possibilities for
contests. If you hear Joe Turpel quoting something crazily specific
like, “No goofy-footed surfer has ever won this contest in
overcast, onshore conditions since 1996”, there’s a good chance
that he’s quoting Mike’s stats. Actually, it’s surprising how much
more the statistics are being used by the WSL since Mike came on
board. I’m not sure if you guys have noticed, but as someone who
deals with surfing stats all the time I’ve been impressed by how
much they’ve stepped it up in their commentary this year.
One of the problems for us was that Mike had to cut ties with
the site before his contract with WSL started, leaving behind only
the information that was available to anyone on the public record
via heats-on-demand from the WSL site (which, as of 12 months ago,
was the 2014/15 seasons – it’s now even less). It would be great if
we still had access to his databases, but there were specific
stipulations expressed by the WSL when they poached him. Older
results are still available on the public record, but without the
video footage it’s hard to be as detailed with our data regarding
elements such as wind, swell, weather etc. For the most part
though, three years of data is more than enough to keep our numbers
relevant.
As far as new input goes, we have a couple of spreadsheets for
the 2016 season and we enter new data as each contest progresses,
including results, scores, conditions details and more. The rest is
about how you use that current data along with the historical
information to create models that are meaningful. We publish
sortable tables in the lead-up to each event to provide readers
with a snapshot for that contest. We also give projections for
surfers based on algorithms that we have created. The published
tables can be manipulated so that readers can pick and choose which
stats they value and which ones they want to ignore. There are
claims like “stats don’t lie”, which is absolutely true as they are
quantitative facts, but we aren’t arrogant enough to suggest that
future results can’t deviate from historically-based projections.
Just look at Keanu in France. We missed that one by a mile.
BeachGrit: Tell me about your magic algorithm? Did you
create it and how?
Well, I can’t really give that away as anyone can replicate it
if they have the time and know how to use Excel at a functional
level. I will say that it uses more data than what’s published in
our sortable tables. We basically create an Average Heat Score
(AHS) projection for each surfer and then we use that projection to
calculate the likelihood of surfers making heats.
I created a model at the start of the season based on something
Mike had previously created, but slightly modified it because our
database had changed so much with his leaving. I tweaked it across
the first half of the season as we saw how the numbers were
falling, and now it’s actually quite different (in subtle ways).
We’ve got some changes that we want to apply for next season too so
that the projections more accurately reflect surfers’ form, but
that will have to wait for the off-season.
BeachGrit: Do you Moneyball Fantasy Surfer? How accurate
are your stats in predicting winners?
We do to an extent, but Fantasy Surfer is harder to calculate
for individuals as we can’t factor in all of the variations in
surfer prices. We do provide a cost-per-projected-point value for
each surfer at their current market value, which gives a good basis
for comparison, but players will need to check that data against
their own team’s prices. The WSL game is a lot simpler as it breaks
the surfers into tiers, so we can recommend surfers for each tier
and the information is consistent for everyone.
As far as accuracy goes, we find that our teams based purely on
the data (our “Numbers” teams) are basically guaranteed to do OK,
but they can’t predict the anomalies such as Wilko at Cooly or
Seabass at Margies. The data will ensure that you make an informed
decision and will stop you from sucking completely, but the top
fantasy players are the ones who know when and where to deviate
from history and make a bold decision based on outlier factors or
gut instincts. That’s why we always write up a full contest
analysis in order to put the stats into context and detail the
contest as a whole.
As far as accuracy goes, it’s easier to use the WSL game model
for measuring success. A “perfect” FS team isn’t always possible
due to salary cap constraints and, due to fluctuations in prices,
is hard to compare. Given that you are required to choose eight
surfers from a field of 36 in each contest, your base-line,
close-your-eyes-and-throw-a-dart-at-the-board chance of having the
perfect team of the best two tier A, four tier B and two tier C
surfers is 22.22%. For the 2016 season we are currently sitting at
a fairly modest 37%, but that’s partly because our algorithms were
in need of tweaking at the beginning of the season. If you look at
our recommendations post-Margs, we have selected the perfect team
with 55.36% accuracy.
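The 22.22% baseline lines up with eight picks from a field of 36. One reading, which is an assumption rather than something stated in the interview, is that it is the expected share of random picks that land in the perfect team when tiers are ignored, not the chance of getting all eight right at once. A quick sketch of that reading:

```python
from fractions import Fraction

FIELD_SIZE = 36  # surfers in each contest, per the interview
TEAM_SIZE = 8    # two tier A, four tier B, two tier C

# Assumption: tiers are ignored, so each random pick has a
# TEAM_SIZE / FIELD_SIZE chance of belonging to the perfect team.
baseline = Fraction(TEAM_SIZE, FIELD_SIZE)

# Expected number of correct picks in a randomly chosen team of eight.
expected_hits = TEAM_SIZE * baseline

print(f"baseline hit rate: {float(baseline):.2%}")
print(f"expected correct picks out of {TEAM_SIZE}: {float(expected_hits):.2f}")
```

On that reading, the 37% and 55.36% figures would be per-pick hit rates, meaning roughly three and four-and-a-half correct surfers out of eight respectively.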
As another measure, our “Numbers” team in the WSL game is ranked
655th in the world, putting it in the top 0.55% of all registered
teams (though we recognise that many of the registered teams may
not be full-season or fully active teams).
BeachGrit: Did you give the title to John John, and
when?
Our data clearly told us that Wilko couldn’t maintain his lead
across the season. While consistency was always going to be his
Achilles heel, there was also the fact that he was winning with a much
lower AHS than many of his peers. He was still surfing beyond his
previous career averages, but our data suggested that he would be
overtaken by the likes of John and Gabriel if his heat scores
stayed comparable to guys like Jack Freestone who were struggling
to requalify.
As far as calling it, we actually had the title race to be much
tighter between Gabs and Double-John. The 2016 data favoured John,
but we didn’t expect it to be wrapped up before Pipe. A big part of
that was down to Medina failing to come anywhere near his projected
results for Portugal. It’s worth noting also that, as insignificant
as it may seem, Medina’s loss to Keanu in the France final was
massive. Had he won in France he’d still be in the title race (and
he’d have won me a bit of money).
BeachGrit: What are your projections for
Pipe?
We’re still working on our final projections for Pipe, but it
goes without saying that our suggestions will depend greatly on the
forecast. Looking at our data though, and assuming it’s big Pipe
(as opposed to Backdoor), Medina and Slater dominate most areas
including reefs, lefts and eight-to-10-foot conditions. If Backdoor
is on, guys like Joel and Julian start to look good as well. John’s
figures are pretty solid, and we’ve seen some improved numbers from
him across most areas this season. He’s definitely on the radar.
He’s going to be on everyone’s team though so you won’t get much of
an advantage by playing him.
We’ll have our full sortable tables up in a few days, so anyone
wanting to check the projections or sift through the finer details
can do it on our site. We put our full contest analysis up
(summarising the data and factoring in forecasts, Triple Crown
form, injuries, wildcards etc.) a few days before the event window.
We also send a linked email reminder out to our subscribers just
before each contest (our only email contact, I swear) so that they
don’t miss events or have to go searching for the data.
BeachGrit: How’s the final top five going to
look?
I can’t see it changing a great deal. Mathematically, it looks
like the top 13 are all eligible to finish top five after Pipe, but
many of them are dependent on others failing. John John has been
crowned champ already, so that’s locked, and Medina could only slip
as low as fifth in the absolute worst-case scenario. Jordy could
possibly overtake Medina with a fifth or better. Wilko and Kolohe
are almost identically placed and would need a second at Pipe or
better to beat Medina’s current total. Kelly could climb as high as
fourth with a Pipe win, but would still need other results to go
his way just to make the top five. If you want me to name
names, I’m predicting John, Gabe, Jordy, Kolohe and Joel will round
out the top five after Pipe.
Come, come to a land where riches await here!