Thursday, March 9, 2017

So just how "hard" was the Rottnest Channel Swim 2017 and will the 2000 Solo record ever be broken?


Dear Swimmers

Following on from last week's very well received article, Rottnest Swim 2017: Managing Expectation With Reality, we are pleased to bring you this follow-up, which aims to discuss and demonstrate statistically just how "hard" the 2017 event was in comparison to other years.

This should be particularly interesting given how Saturday 25th February 2017 looked all set to be a record-breaking day. In fact, I vividly recall saying to the Channel Ten reporters how the conditions could see the mid to back pack swimmers swimming up to an hour quicker than "normal" (as was the viewpoint of many people given the weather forecast). Even though the prominent oceanographer Chari Pattiratchi from the University of WA's Ocean Institute and I correctly predicted the current and weather conditions, with the instruction to definitely head north from the start line, I think the current proved to be much stronger than everyone expected.

As we mentioned in the last blog, the "silent assassin" that is the current that lurks beneath doesn't always give the impression that the conditions are necessarily "hard" per se (especially in comparison to notably rough years like 2003 and 2006, for those who can recall the event back that far). But as we found on the 25th, times can become much slower than expected, and this in itself makes for a "tough" day due to extra exposure to the elements.

One of the challenges with open water marathon swimming events is that the conditions can play a massive role not just in your finishing time, but also in whether you finish or not! Time to complete a set distance is always a quick and easy objective measure of how someone would perceive your performance (the first thing people will always ask you is "how long did it take you?"), but it never accounts for the conditions on the day. I remember swimming across the English Channel in September 2011 in prime physical condition hoping for a crossing time of roughly 9 hours. However, as I battled the 25-30kt SW winds (head on) and huge swell that day, my time of 12h14m (or 19m50s per km, which is barely "average") has never really felt like it was a just result for all the training, effort on the day and skill of my crew. When you view the conditions here you'll instantly sympathise with why it took me that long, but the record books only ever record your finishing time of course. Conversely, when I won the world's longest and most prestigious marathon swim event (a 46km circumnavigation of Manhattan Island in New York) in 2013 in a time of 7h14m (or 9m25s per km, which is equivalent to Sun Yang's 1500m world record but repeated a crazy 30 times continuously and without push-offs!), people who don't know the assistive currents of that race would assume I was superman or at the very least that I'd improved a LOT between 2011 and 2013! Whilst I definitely made improvements in those two years, I certainly wasn't more than twice as quick of course!

Bottom line, therefore, is that conditions are everything in open water marathon swimming. Judging your performance based on time alone is not very productive at all, but it does then beg the question: is there a way to statistically show, within a large group of swimmers (like the Rottnest Channel Swim), some sort of scale of "hardness" for each year which is more objective than the simple subjective summation on the finish line of "that was a tough year" shared between competitors? Could you then even look back and retrospectively calculate what time you might have swum in a previous "good" year? Equally, can it help us identify whether someone like the Olympian Jarrod Poort, who took out this year's race in fine style against the current, could have broken the 2000 record of 4h00m15s set by Mark Saliba given more favourable conditions, and even predict what time he might feasibly have done? The answer is yes…we think so!

If you love your numbers and your statistics you're going to love the solid work which squad swimmer Mike Fischer has put together single-handedly on this. The sceptics might also like to bring up the phrase popularised by Mark Twain:
"There are three kinds of lies: lies, damn lies, and statistics!" 

…but that's OK, this is meant to be a little food for thought and perhaps some further solace for those of you who are still lingering on the feeling of under-achievement from this year's event. At the end of the day, we can't control Mother Nature, you can't change your results, but what you can do is try to understand how and why they were what they were.

Mike picks up the story and the detail...

Using data like this to draw conclusions always requires a few assumptions and the one that sits behind this analysis is that the field has been made up of swimmers with about the same abilities every year. There will always be faster and slower ones and individual performances on the day may vary from year to year, but overall we will assume that the field as a group is comparable from year to year. We have then generated graphs which show finishing time (or average pace) plotted against finishing position. The statistical "trick" we have used here is to consider the finishing position as a percentage of the overall field, rather than as an absolute number. This means that a swimmer finishing 150th in a field of 300 plots at the 50th percentile level, as would a swimmer finishing 100th in a field of 200 swimmers. This allows us to account for variations in the size of field from year to year and plot the data on the same axes.
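
As a rough illustration of the percentile "trick" described above, here is a minimal Python sketch. The function name `finish_percentile` is made up for this example and is not part of the original analysis.

```python
def finish_percentile(position: int, field_size: int) -> float:
    """Express a finishing position as a percentage of the whole field."""
    if field_size <= 0 or not 1 <= position <= field_size:
        raise ValueError("position must lie between 1 and field_size")
    return 100.0 * position / field_size


# The two examples from the text both land on the 50th percentile.
print(finish_percentile(150, 300))
print(finish_percentile(100, 200))
```

Plotting time (or pace) against this percentage, rather than against raw finishing position, is what lets fields of different sizes share the same axes.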

Displayed as finishing time in minutes

Displayed as average pace in minutes per km

From Paul: interestingly enough I've always wanted to break 5 hours but have never yet done it. I have swum Solos in 2009, 2011, 2013 and 2015. Using the data of these swims above which were between 9 and 23 minutes over 5 hours and looking at 2014 as an example (a cracker year), I would have probably swum under 5 hours in 2014, but only just…so it looks like that is still a very challenging target for me personally!

The first couple of graphs show these plots for each of the last 10 years, together with 2003 and 2006 and you will see that they all have a very similar shape, albeit that they are "shifted" vertically on the time/pace axis. The "S" shape curve reflects the fact that every field has a group of "gun" swimmers (typically around 10% of the field), a middle group (from around P10 through to P80) and a group of "steadier" swimmers bringing up the rear. The one exception is the toughest year, 2003, where the start was delayed by an hour due to conditions yet the cutoff times weren't altered. That means that the slower "tail" were either timed out or finished but didn't have their times recorded. We have replotted the data slightly to take this into account, but all the other data is exactly as the times were recorded – so every dot on the graph represents a swimmer walking out of the water at Thompsons Bay.
If we assume that the field has around the same abilities each year, then the "shift" in curves from year to year must be due to a combination of environmental conditions, with the ones towards the top (i.e. the slowest) being the toughest years and the ones towards the bottom (i.e. the fastest) being the "easier" years. This ties in very well with anecdotal evidence from swimmers who have competed in multiple years, although this analysis allows us to be a little more quantitative. It is important to note that we can't separate the effects of swell/wind/current etc., but are looking at a combined effect of all the environmental factors.
Having done this, we can then start using the data to draw some conclusions:
The first is the "degree of difficulty" (DoD) of each year's swim. If we assume that, on a scale from 1-100, the fastest years (2000 and 2014) are a "1" and the toughest year (2003) is a "100" we can then see where each year falls on that scale. Rather than use a single figure for each year, we have looked at the "gun" group, the middle group and the steady "tail" separately, characterizing them by the P20, the P50 and the P80 time/pace respectively. We have done this because although the curves have an overall similar shape there are subtle, but potentially significant, differences. As an example, look at the 2000 data where up to around P65 (corresponding to those who finished in a little under 7 hours) the field had a great swim. The curve becomes much steeper at that point, suggesting that something (either a current or, more likely, the sea breeze) has slowed the back end of the field significantly. The same trend can be seen in the data from 2006 where an already tough swim became increasingly difficult at P70 (corresponding to around 8hrs 30 mins) and 2009 at P40 (around 7 hrs). Unfortunately it always seems to get harder for the second half of the field (a steeper curve) and never easier…. 
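
To make the 1-100 scale concrete, here is a minimal Python sketch. It assumes the score is a straight-line interpolation between the fastest and toughest years at a given percentile; the text above does not spell out the exact formula, so treat that as an assumption. The function name and the times used are hypothetical and are not taken from the race data.

```python
def degree_of_difficulty(time_min: float, easiest_min: float, toughest_min: float) -> float:
    """Place a year's time on a 1-100 scale (assumed linear).

    easiest_min  -> scores 1   (the fastest year at this percentile)
    toughest_min -> scores 100 (the toughest year at this percentile)
    """
    if toughest_min <= easiest_min:
        raise ValueError("toughest_min must be greater than easiest_min")
    fraction = (time_min - easiest_min) / (toughest_min - easiest_min)
    return 1 + 99 * fraction


# Hypothetical P50 times in minutes, for illustration only.
print(degree_of_difficulty(510, easiest_min=420, toughest_min=600))
```

The same function would be applied three times per year, once each with the P20, P50 and P80 times, to score the front, middle and back of the field separately.
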
We've then plotted the "degree of difficulty" for the front, the middle and the back of the field in a number of different ways: a simple bar chart, a "pinwheel of pain" and a "triangle of torture". Using the data to rank 2017, the quick end of the field had it the "easiest", with a P20 DoD of 33%, although it still ranks as the most difficult swim since 2006. The "visible" conditions in 2017 were close to perfect, but the "silent assassin" has clearly had a major impact, even on the elite end of the pack. The day didn't get any easier unfortunately, and the curve stays very steep through the middle of the pack with a P50 DoD of 54%. There is an interesting steepening of the curve at around P40, corresponding to a finish time of around 7 hours (around 1pm or so). This corresponds to the time at which the current was forecast to increase dramatically on the CSIRO data. Remember that a current of 1 knot equates to a little under 2km/hr so, assuming a swimmer was staying on or parallel to the rhumb line, for each kilometer they were traveling over the ground they were actually swimming close to 1.5km, which may be the cause of the slowing of the field. The trend increases towards the back of the field, where it was a seriously tough day at the office: the P80 DoD is 76%, close to both 2003 and 2006 in hardness, albeit for very different reasons.
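
For anyone wanting to sanity-check the current arithmetic, here is a minimal Python sketch. It rests on assumptions that are not stated in the analysis: the current runs at right angles to the rhumb line, the swimmer angles into it just enough to hold the line, and the swimmer moves through the water at a steady 2.5km/hr (24min/km). The function name is hypothetical.

```python
import math

KMH_PER_KNOT = 1.852  # one knot is exactly 1.852 km/hr


def water_km_per_ground_km(swim_speed_kmh: float, cross_current_knots: float) -> float:
    """Kilometres swum through the water for each kilometre made good.

    Assumes a pure cross-current and a swimmer who "crabs" into it
    to stay on the rhumb line.
    """
    current_kmh = cross_current_knots * KMH_PER_KNOT
    if current_kmh >= swim_speed_kmh:
        raise ValueError("swimmer cannot hold the line against this current")
    # Speed left over along the rhumb line after cancelling the sideways drift.
    ground_speed_kmh = math.sqrt(swim_speed_kmh**2 - current_kmh**2)
    return swim_speed_kmh / ground_speed_kmh


# Assumed 2.5 km/hr swimmer in a 1 knot cross-current.
print(round(water_km_per_ground_km(2.5, 1.0), 2))
```

Under those assumptions the ratio comes out a little under 1.5, in line with the figure quoted above. A quicker swimmer loses proportionally less, which fits the picture of the back of the field being hit hardest.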

Demonstration of which years have been the most "difficult" in the last 17 years. Data for 2001, 2002, 2004 and 2005 are omitted; 2000 (a known good year) and 2003 and 2006 (known tough years) are included as special cases. 2007 was cancelled due to bad weather.

Sadly, the slower you are, the harder the tough years hit you compared with the pointy end of the field. This also demonstrates how much of an advantage an early start time can prove to be!

As above: the slower quartile receives the brunt of the environmental elements each year!

At the very sharp end of the field, we have plotted the winning time/pace against the P20 time/pace, with the latter being a "proxy" for the conditions. The data shows a nice linear trend as you would expect, with the tougher years having correspondingly slower winning times. There are, however, two exceptions to this trend: Mark Saliba in 2000, who finished in 4 hours and 15 seconds, and Jarrod Poort in 2017, who finished in 4 hours 12 mins. Both of these swimmers "outperformed" the rest of the field by a considerable amount, making them the absolute standouts.

It is also possible to "reverse engineer" the environmental conditions out of individual performances and estimate what time a swimmer could have achieved had they performed the way they did in 2017, but had actually swum in a different year. This is certainly pushing the data a very long way but it brings up some interesting numbers. In particular, we have looked at Jarrod Poort's winning time in 2017 and estimated what times he would have swum in each of the years we have looked at.
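
One simple way to picture this kind of adjustment is to scale a swimmer's time by the ratio of the P20 "proxy" times in the two years. That proportional scaling is an assumption made for this sketch and is not necessarily the method used for the graph. The function names are made up, and the P20 values below are hypothetical placeholders rather than race data.

```python
def adjusted_time_min(time_min: float, p20_actual_year_min: float, p20_target_year_min: float) -> float:
    """Estimate a time in another year by proportional scaling (an assumption).

    The ratio of the two years' P20 times stands in for how much
    quicker or slower the conditions were.
    """
    if p20_actual_year_min <= 0 or p20_target_year_min <= 0:
        raise ValueError("P20 times must be positive")
    return time_min * p20_target_year_min / p20_actual_year_min


def as_hours_minutes(total_min: float) -> str:
    """Format minutes as e.g. '3h47m' (rounded to the nearest minute)."""
    whole = round(total_min)
    return f"{whole // 60}h{whole % 60:02d}m"


# 4h12m = 252 minutes; both P20 values are hypothetical placeholders.
estimate = adjusted_time_min(252, p20_actual_year_min=400, p20_target_year_min=360)
print(as_hours_minutes(estimate))
```

Swapping in the real P20 times for each year would give one estimate per year, although a straight ratio like this ignores the way the curves change shape from year to year.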

The data shows that had Jarrod travelled back in a time machine and swum in 2000, putting in a comparable performance to this year, he would have finished in around 3 hrs 48mins. Indeed, the graph shows that he would also have beaten the 4 hour mark in 2010, 2013, 2014 and 2016.
Unfortunately the data is not predictive, meaning that we can only do this type of analysis after the race is over. So the mantra for anyone planning to swim a solo is to hope that it is a good year for conditions, but train as though it will be a tough year, and then recognize that every year will be different, although you may not know it until you are well into the crossing. Ultimately the conditions are likely to have far more effect on your time than you expect, with variations of up to an hour at the sharp end, 2 hours or more in the middle of the field and up to 3 hours towards the back of the field. Swimmers who can consistently hold a "race pace" of 20min/km in a good year may struggle to achieve 25min/km in a more difficult year.

We hope you have enjoyed this analysis of the 2017 Rottnest Channel Swim - please feel free to share with your friends. Comments / feedback to 


