Wednesday, September 24, 2014

A Closer Look at the 2014 Games Programming

Today's post will basically put a bow on the 2014 CrossFit Games season.  What's that, you say?  Didn't the Games end 2 months ago?  Haven't we moved on to the Team Series already?  Isn't the 2014 Games season in the distant past?  NO!  It's not over until I say it's over, and since I've been too busy to wrap it up until now, it's not over yet!

Sorry.  Anyway...

Like last year, I'll break things down based on the five goals I think should be driving the programming of the Games, in order of importance:
  1. Ensure that the fittest athletes win the overall championship
  2. Make the competition as fair as possible for all athletes involved
  3. Test events across broad time and modal domains (i.e., stay in keeping with CrossFit's general definition of fitness)
  4. Balance the time and modal domains so that no elements are weighted too heavily
  5. Make the event enjoyable for the spectators
For each goal, there will be some brief discussion and analysis, and I'll conclude by pointing out suggestions for improvement, because simply identifying the problems only gets us halfway there. Additionally, I'll point out things that I felt worked out particularly well.  For those who've been following this site for a while, this is basically the same way I broke things down at the end of last season.

So let's get started.

1. Ensure that the fittest athletes win the overall championship

I'll start by saying that I do think the two fittest athletes won the titles this year.  Rich Froning came through when it counted, and despite looking more human than in years past, I think there is no doubt that he deserved his fourth straight title.  Camille was ridiculously consistent the whole season, and her combination of Olympic lifting prowess and gymnastic ability is pretty much unrivaled.  Also, as discussed in this post from August, several different scoring systems would have produced the same champions.

That being said, if we're being honest, we have to acknowledge that Sam Briggs realistically could have won the Games if she had qualified.  She dominated the Open, and aside from one event (an extremely specialized event at that), she was among the best in the world at Regionals.  While maybe not the most skilled, I feel she has the best conditioning in the world.  Looking at the programming that came out at the Games, I think she would have fared well, but it would not have been easy for her.

Looking at the 36 events she competed in during 2013 and 2014, she's only finished outside the top 20 worldwide in 7 of them: 2013 Regional Event 2 (3-RM OHS), 2013 Games Zigzag Sprint, 2013 Games Clean & Jerk, 2014 Open Event 2 (C2B and OHS), 2014 Regional Event 1 (max hang squat snatch), 2014 Regional Event 2 (max HS walk) and 2014 Regional Event 7 (pull-ups and heavy OHS).  The theme here is heavy Olympic lifts and extremely short time domains.  Other than that, she's at the very top of the world.

Applying that to this year's Games, she potentially would have struggled on the 1-RM OHS, the Clean Speed Ladder, the Midline March (due to HS walks) and Thick and Quick.  Other than that, I'd expect her to be top 10 in basically everything else.  Would that have been enough to win?  It's hard to say.

How We Can Do Better: Avoid programming extremely volatile events at Regionals (like a single attempt at a HS walk).  Remember, the goal is to find the fittest athletes for the Games, not just to see who can avoid screwing up (this isn't American Ninja Warrior).  Also, I hope the qualification system gets tweaked to allow more athletes from the elite regions to make it (Central East men, for instance).
Credit Where Credit is Due: The Games test was not overly grueling this year (see discussion in my last post), and while we did have a couple of top contenders fall out due to injury (Kara Webb and Anna Tunnicliffe), it seemed like most of the athletes were competing at their best the entire week.

2. Make the competition as fair as possible for all athletes involved

I think HQ learned from the mistakes of 2013 when programming the Open this year.  Judging was pretty straightforward on every event in the Open, and judging really wasn't a major issue at any point in the season.  To me, that's a huge key for this sport moving forward.  The less we have spectators talking about the judging, the better.

I also like the improvements in the way ties were handled at the Games.  We saw very few big logjams the way we did in Cinco 1 and 2 in 2013.  The athletes were able to separate themselves from the field on every event.

How We Can Do Better: Though I personally liked the workout, the inclusion of a rower in Open Event 4 goes against the fact that HQ has consistently said that the Open is for anyone in the world.  Rowers are far more expensive than the other pieces of equipment that have been required in previous years.
Credit Where Credit is Due: Judging is becoming less and less of a factor throughout Open, Regionals and Games.  Also, there were fewer massive ties in the standings at the Games.

3. Test events across broad time and modal domains (i.e., stay in keeping with CrossFit's general definition of fitness)

Like last year, let's start by looking at a list of all the movements used this season, along with the movement subcategory I've placed each one into. I realize the subcategories are subjective, and an argument could be made to shift a few movements around or create a new subcategory. In general, I think this is a decent organizational scheme (and I've used it in the past), but I'm open to suggestions.

As we've seen in recent years, the season as a whole is testing a very wide variety of movements.  This year saw 27 different movements, compared with 29 last year.  The only movements that did not appear at all this year that have appeared in at least two other seasons are: bike, KB swing, wall climb-over and push-up. I don't think leaving those out is too much of a concern.

Another key goal is to hit a wide variety of time domains and weight loads. Below are charts showing the distribution of the times and the relative weight loads (for men) throughout the entire 2013 and 2014 seasons. The explanation behind the relative weight loads can be found here.  Two notes: 1) some of the Regional and Games movements had to be estimated because I don't have any data on them (such as weighted overhead lunge and pig flips); 2) the time domains for workouts that weren't AMRAP were rough estimates of the average finishing times.

What was lacking this year, as discussed previously, were the extremely long events.  There were no events beyond 45 minutes this season, and there were twice as many sub-5:00 events as last year.  You'll also notice that there were fewer lifts in all of the ranges except the very low (0.4-0.8) and the very high (2.4+).  Part of this is just the fact that there were more lifts last season, though that was not the case at the Games.  We have previously noted that the Regionals were particularly bodyweight-focused this season, but the Games made up for that by being particularly heavy.

How We Can Do Better: I'd like to see us go really long at least once (though probably not more than once).
Credit Where Credit is Due: The season really didn't miss out on much.  It's really not likely that an athlete could finish well these days with a hole in any area of their fitness.

4. Balance the time and modal domains so that no elements are weighted too heavily

Based on the subcategories of movements I've defined above, below is a breakdown of movements in each segment of the 2014 Games Season. As in the past, these percentages are based on the importance each movement was given in each workout, not simply the number of times the movement occurred (so an OHS in the 1 RM at the Games was worth more than the OHS in Open 14.2).

We see this year that there was a consistent focus on the Olympic lifts at all phases of the competition, but we saw changes in the other categories.  Pure conditioning (running/rowing/double-unders) increased in value steadily throughout, whereas high skill gymnastics increased dramatically after the Open.  Conversely, basic gymnastics decreased in value steadily and was not much of a factor in the Games.  As is typical, we didn't see many of the uncommon CrossFit movements until the Games.

In my opinion, the Olympic lifts were maybe a touch overvalued (a third of the competition seems like a bit much).  Also, I'd like to see the three phases of competition be a little more similar, so that you don't have athletes qualifying for the Games without being competent in a particular area, only to be exposed at the Games.  For instance, we continue to see running have little value until the Games, which means we may be letting in athletes who are poor runners.  Of course, those athletes won't win the Games anyway, but you may be keeping out athletes who would have been better Games performers.

That being said, there's no such thing as perfect balance here.  We know the Open can't have a ton of high skill gymnastics movements, and we know that Open-style workouts might not be spectator-friendly at the Games.  You could also argue that powerlifting-style lifts are undervalued, but I'm of the opinion that the Olympic lifts are a much better test, and HQ seems to agree, so I doubt we will see that change.  I don't see any glaring problems in the chart above, so that's good news.

As far as time domains, I think there was pretty good balance, even though a bit more weight was given to shorter workouts than in years past.  Like I mentioned previously, I can't complain too much about this, because it seems to keep the athletes fresher throughout the Games.  As long as HQ continues to throw in a few nasty workouts in the 15-25 minute range, plus one or two really long ones, I think that's sufficient.

Another way to check whether we're weighting one area too heavily is to look at the rank correlations between the events. If the rankings for two separate events are highly correlated, it indicates that we may be over-emphasizing one particular area.  As I did last year, I focused only on the Games: it's not really a problem if we test the same thing in two different competitions, because the scoring resets each time.  It's the overemphasis within the same competition that's a problem.

The chart below shows all the combinations of men's events at the Games, excluding the finals, since the entire field did not compete.

The cells highlighted yellow had a correlation above 50%, and the cells in red had a correlation below 0% (for some reason, the "-" sometimes doesn't come through well on the picture, but those are negative numbers).  Only two combinations had a correlation above 50%, and one of those was the pair of sprint sled events (each of which was worth only 50 points, so their high correlation isn't really a problem).  The other combination was Event 2 (1-RM OHS) and Event 9 (Clean Speed Ladder) - this shouldn't be surprising.
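For anyone who wants to replicate this kind of check, here's a minimal sketch of a rank correlation between two events' finish orders, using the Spearman formula on placements. The placements below are invented for five hypothetical athletes, and the exact correlation method behind my charts may differ slightly.

```python
# Spearman rank correlation between two events' finish orders.
# rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference
# in placements for each athlete (assumes no ties).

def spearman(places_a, places_b):
    n = len(places_a)
    d2 = sum((a - b) ** 2 for a, b in zip(places_a, places_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Five hypothetical athletes: placements on a max lift and a speed ladder.
max_lift = [1, 2, 3, 4, 5]
ladder   = [2, 1, 3, 5, 4]
print(spearman(max_lift, ladder))  # 0.8: highly correlated, like the OHS/Clean Speed Ladder pair
```

A correlation near 1 means the two events sort the field almost identically; near 0 or negative means they reward different athletes.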

For women, the results were similar, although there were also high correlations between Event 1 (The Beach) and Event 3 (Triple-3) and between Event 2 (1-RM OHS) and Event 6 (21-15-9).  Not sure about the OHS/21-15-9 combination, but the other one also makes sense intuitively since both events are very long.

It's also interesting to note the negative correlations, most notably the -31% correlation between Event 3 (Triple-3) and Event 9 (Clean Speed Ladder).  Women showed a -14% correlation here as well.

All in all, this is similar to what we saw last year, but probably a bit more balanced.  Between men and women, there were 6 correlations above 50%, compared to 8 last year, and as mentioned, 2 of those (sprint sled events for men and women) were only given half-weight.  

How We Can Do Better: I've said it before, but it bears repeating: we need to test running earlier in the season (and more than just 50-yard jogs between rope climbs).  Also, I'd prefer we lighten up just a bit on the Olympic lifting emphasis.
Credit Where Credit is Due: Things were really balanced overall, and cross-correlations between events at the Games were lower, indicating we weren't testing the same things a bunch of different times.

5. Make the event enjoyable for the spectators

I was lucky enough to attend the Games in person for the third straight year (thank you to my lovely wife), and I can say without a doubt that they are improving the spectator experience each year.  I was originally concerned about the fact that the morning and afternoon events would all be in the soccer stadium, but I think they made it work.  There's still no doubt that the tennis stadium is the more exciting venue, but the issue there is that you can't fit as many athletes in each heat, so the sheer number of heats tends to make things drag on.  In the soccer stadium, they were able to keep most events to 3 heats, which really is a big improvement over 4 heats.  Not to mention that all the silver ticket holders get to watch those events as well.

Another change was that the team events were all done in the morning prior to any individual events, so it really separated that out from the individual competition.  While some of the teams may disagree, I liked this move because it shortened the breaks between individual events.  Some of the team events can really drag on, and while the top heats are often exciting, the majority of fans probably don't want to sit through tons of the team competition if they don't have to.  And of course, those who do want to watch can show up in the morning and get great seats.

I think ultimately, the Games are going to have to have all of the events in the soccer stadium, or some other large venue.  There is simply too much demand to continue limiting the Gold tickets to only 10,000 fans.  From my understanding, they sold out in a matter of minutes this year, and even some people who were logged in prior to the tickets going on sale could not get them.  Let's hope HQ can find a way to satisfy the demand while still keeping the event as exciting as possible for those who do get tickets.

How We Can Do Better: Do not let Swizz Beatz perform again.  In fact, replace the musical act completely with those crazy gladiator dudes.
Credit Where Credit is Due: The experience continues to improve each year.  The views in the soccer stadium were improved, there were fewer lulls in the action and the spectators were more engaged than in previous years.  The conclusion of that men's Push-Pull event with Bridges narrowly holding off Froning was still the highlight of the weekend for me.

And now... the 2014 CrossFit Games season is in the books.

Saturday, August 30, 2014

The 2014 Games Were Heavier, Higher-Skill and Shorter Than Recent Years

Welcome back for another relatively quick one.  Today I'm going to hit a few highlights of the analysis I've done on the programming at this year's Games.

As the title would suggest, the big point here is that this year's Games were heavy and high-skill. Conversely, in comparison to prior years, they weren't as much about stamina, endurance and generally managing fatigue.

Let's consider the first point.  The chart below shows two key loading metrics for men for all eight CrossFit Games.  For those unfamiliar with these metrics, start here.

The load-based emphasis on lifting (LBEL) is always the first place I look when evaluating how "heavy" a competition was.  It gives us an indication of both the loads that were used as well as how often lifts were prescribed.  The LBEL at this year's Games was 0.89; the next-highest was 0.73 back in 2009.

The high LBEL in 2009 was based partly on the fact that there were two max-effort events out of eight total.  In metcons, the loadings were actually quite light at that time.  But in 2014, the metcons were heavy as well.  The average relative weight in metcons was 1.43; the next-highest was 1.36 in 2013.  For context, a 1.43 relative weight is equivalent to a 193-lb. clean, a 146-lb. snatch and a 343-lb. deadlift.  These are average weights used in metcons; the days of the bodyweight specialist competing at the CrossFit Games are over.
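As a quick illustration of how the relative weight metric works: each prescribed load is divided by a per-movement baseline load. The baselines in this sketch are reverse-engineered from the example above (chosen so that a 1.43 relative weight reproduces the 193/146/343-lb. loads), so treat them as approximations rather than the exact values behind my charts.

```python
# Relative weight = prescribed load / per-movement baseline load.
# Baselines here are approximations reverse-engineered from the text,
# not the official CFG Analysis values.
BASELINES_LB = {"clean": 135, "snatch": 102, "deadlift": 240}

def relative_weight(movement, load_lb):
    return load_lb / BASELINES_LB[movement]

for move, base in BASELINES_LB.items():
    print(move, round(base * 1.43))  # reproduces 193 / 146 / 343
print(round(relative_weight("clean", 193), 2))  # 1.43
```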

The women's numbers tell a similar story.  The LBEL was 0.62, about 30% higher than the previous high of 0.48 in 2009.  The average relative weight in metcons was 1.01, significantly higher than the next-highest (0.88 in 2013).  The 1.01 is on par with the men's loads from 2007-2010.

Not only were the Games programmed heavy, but the athletes are just flat-out getting stronger.  The chart below shows the relative weights achieved during the max-effort lifts in the Games historically. These represent the average across the entire field (except in 2007, when I limited the field to the non-scaled participants only).

Not only was the men's average of 2.71 in the overhead squat well above the previous high of 2.36, but the women's average of 1.80 is higher than the men achieved in the CrossFit Total in 2007 and the Jerk in 2010 (granted, that lift occurred within 90 seconds of the Pyramid Helen workout).

Now, as far as the high-skill comment, consider the types of movements that were emphasized at the Games.  I generally categorize movements into seven broad groups: Olympic-Style Barbell Lifts, Basic Gymnastics, Pure Conditioning, High Skill Gymnastics, Powerlifting-Style Barbell Lifts, KB/DB Lifts and Uncommon CrossFit Movements (sled pulls, object carries, etc.).  The two that require the most technical ability are High Skill Gymnastics (such as muscle-ups and HSPU) and Olympic-Style Barbell Lifts.

This season, Olympic-Style Barbell Lifts accounted for 32% of the total points, which is second all-time (in 2008 they were worth 38%).  The High Skill Gymnastics movements accounted for 17%, which was only topped once (2010).  Combined, those two groups accounted for 49%, which is second all-time.  The only year with a greater emphasis was 2010, which actually included the incredibly challenging ring HSPU.  Still, the sheer volume of high-skill movements required of athletes was far higher this year.  The muscle-up biathlon included 45 muscle-ups; in 2010, "Amanda" was crushing about half the field with just 21 muscle-ups.  The ring HSPU were tough in 2010, but were they more challenging than the 10-inch strict deficit HSPU this year?  Remember, women were only required to do regular HSPU back in 2010, and only 28 of them.  These days, 28 regular HSPU is nothing for the elite women's athletes.

On the flip side, this Games had much less volume than recent years.  The chart below shows the longest event (based on winning time), the approximate average length of all events (including finals) and the approximate total time that athletes competed, dating back to 2011.  It's clear that this year was much less grueling than the past two seasons, and it was very similar to 2011 (including starting with a ~40-minute beach workout).

The theme of this year's Games was strength and skill, not stamina.  Why?  Well, I have to believe television had something to do with it.  This year's events were more spectator-friendly across the board, and they may set the stage for future seasons in which every event is broadcast on cable (ESPN won't be showing a 90-minute rowing workout, that's for sure).  People don't like to watch events that take forever, but they do like to watch people lift heavy stuff and generally perform feats of strength and skill that make you say "I could never do that."

My hope is that the Games can continue to be spectator-friendly without losing the events with that "suck factor" that we in the community know and love.

Thursday, August 21, 2014

Rich Froning's Comeback Could Have Been Even More Amazing (and more scoring system thoughts)

Today will be the first in a series of posts breaking down the 2014 CrossFit Games in more detail.  In the past, I have combined a lot of thoughts into one or two longer posts reviewing the Games (in particular, the programming).  However, this year, due to time constraints from my work and personal life, I'm planning to get the analysis out there in smaller doses, otherwise it might be another month before my next post.  And in fact, this might be the best way to handle things going forward, but we will have to see.  Anyway, let's get moving.

Unlike the past two seasons, Rich Froning did not enter the final day of competition with a commanding lead.  In fact, he didn't even enter the final event with a commanding lead.  All it would have taken was a fifth-place finish by Froning and a first-place finish by Mathew Fraser on Double Grace for Froning to finish runner-up this season.  But what you may not have realized is that it could have been even tighter.

In the Open and the Regionals, the scoring system is simple: add up your placements across all events, and the lowest cumulative total wins.  At the Games, however, the scoring system changes to use a scoring table that translates each placement into a point value.  The athlete with the highest point total wins.  I've written plenty about this in the past (start here if you're interested), but the key difference is this: in the Games scoring system, there is a greater reward for finishes at the very top, and less punishment for finishes near the bottom.  The reason is that the point differential between high places is much higher (5 points between 1st and 2nd) than between lower places (1 point between 30th and 31st).
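Here's a toy example of the difference between the two systems. Note that the points table below is a made-up stand-in that just mimics the shape of the Games table (big gaps at the top, small gaps lower down); it is not the actual table.

```python
# A hypothetical points table with the same shape as the Games table:
# large gaps between the top places, small gaps lower down.
POINTS = {1: 100, 2: 95, 3: 91, 4: 88, 5: 86}

def games_score(placements):     # Games-style: higher total wins
    return sum(POINTS[p] for p in placements)

def regional_score(placements):  # Regional-style: lower total wins
    return sum(placements)

consistent = [2, 2, 2]  # never wins, never slips
swingy     = [1, 1, 5]  # two event wins, one bad day

# Games-style scoring rewards the event wins...
print(games_score(swingy), games_score(consistent))        # 286 vs. 285
# ...while placement-sum scoring rewards the consistency.
print(regional_score(swingy), regional_score(consistent))  # 7 vs. 6
```

The same results flip the leader depending on the system, which is exactly the Froning/Fraser dynamic described above.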

So you know that small lead Froning had going into the final event?  Well, under the regional scoring system*, he would actually have been trailing going into that event... BY 8 POINTS!  And he would have made that deficit up, because he won the event while Fraser took 11th.  I think it is safe to say that would have been the most dramatic finish to the Games we have seen (I guess Khalipa in 2008 was similar, but there were like 100 people watching, so...).

One reason the scoring would have been so close under this system is that Fraser's performance was remarkably consistent.  His lowest finish was 23rd.  All other athletes had at least one finish 26th or below, and Froning finished lower than 26th twice.  But Fraser also only won one event and had four top 5 finishes.  Froning, on the other hand, won four events and finished second one other time.

I also looked at how the scoring would have turned out under two other scoring systems:
  • Normal distribution scoring table - Similar to the Games scoring table, but the points are allocated 0-100 in a normal distribution.  See my article here for more information.
  • Standard deviation scoring** - This is based on the actual results in each event, rather than just the placement. Points are awarded based on how many standard deviations above or below average an athlete is on each event. More background on that in the article I referenced early on in this post.
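For the mechanically inclined, here's a minimal sketch of the standard deviation scoring on a single event. The reps-per-minute rates below are invented, not real Games data.

```python
# Standard-deviation scoring on one event: convert each athlete's
# rate (reps per minute, all hypothetical) to a z-score against the field.
from statistics import mean, pstdev

def z_scores(results):
    mu, sigma = mean(results), pstdev(results)
    return [(r - mu) / sigma for r in results]

event_rates = [30.0, 25.0, 20.0]  # three hypothetical athletes
print([round(z, 2) for z in z_scores(event_rates)])  # [1.22, 0.0, -1.22]
```

An athlete's season score under this system is just the sum of their z-scores across events.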
Here is how the top 5 would have shaken out for men and women using all four of these systems (including the current system):

As far as the winners go, we would not have seen any changes.  Clearly, Froning and Camille Leblanc-Bazinet were the fittest individuals this year.  Generally, what you can observe here is that the athletes doing well in the standard deviation and normal distribution system had some really outstanding performances, whereas the athletes doing well in the regional scoring system were the most consistent.

What is also nice about the standard deviation system is that it can tell us a little more about how each event played out.  For each event, I had to calculate both the mean result and the standard deviation in order to get these rankings.  That allowed me to see a few other things:

  • Which events had the most tightly bunched fields (and the most widely spread fields)?
  • Were there significant differences between men and women in how tightly scores were bunched on events?
  • Which individual event performances were most dominant?
To measure the spread of the field in each event, I looked at the coefficient of variation, which is the standard deviation divided by the mean.  For instance, the mean weight lifted for women on Event 2 was 213.6 pounds and the standard deviation was 22.1 pounds, so the coefficient of variation was about 10%.  The higher this value, the wider the spread was in the results.  And remember, if the spread is wider, the better you have to be in order to generate a great score under the standard deviation system.
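In code, the calculation is a one-liner, shown here with the women's Event 2 numbers from above:

```python
# Coefficient of variation = standard deviation / mean,
# using the women's Event 2 figures (1-RM OHS) quoted above.
mean_lb, sd_lb = 213.6, 22.1
cv = sd_lb / mean_lb
print(f"{cv:.0%}")  # 10%
```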

To see which individual event performances were most dominant, I looked at the winning score on each event.  Typically, this score was between 1.5 and 2.75 standard deviations above the mean; this is in the right ballpark if we assume a normal distribution, because there would be about a 7% chance of getting a result of 1.5 standard deviations above the mean and a 0.3% chance of getting a result of 2.75 standard deviations above the mean.
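Those probabilities come straight from the standard normal tail, which is easy to verify (3.43 is the largest winning score of the week, discussed below):

```python
# Standard normal tail probability P(Z > z) = erfc(z / sqrt(2)) / 2.
import math

def tail(z):
    return math.erfc(z / math.sqrt(2)) / 2

print(f"{tail(1.5):.1%}")     # 6.7%, the "about 7%" chance quoted above
print(f"{tail(2.75):.2%}")    # 0.30%
print(round(1 / tail(3.43)))  # roughly 1 in 3,300
```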

The chart below shows both the winning score (bars) and the coefficient of variation (line) for each event.  Note that the Clean Speed Ladder is omitted because it was a tournament-style event and does not convert easily to the standard deviation system.  For my calculations of points on the Clean Speed Ladder, I used a normal distribution assumption and applied points based on the rankings in this event.

The largest win was Neal Maddox's 3.43 in the Sprint Sled 1; a normal distribution would say this should occur about 1-in-3,000 times.  For those who watched the Games, this performance was quite impressive.  Maddox looked like he was pushing a toy sled compared to everyone else.  Also, don't sleep on Nate Schrader's result in the Sprint Carry.  It may not have appeared quite as impressive because the field was so tightly bunched (only a 9% coefficient of variation, compared to 23% on Sprint Sled 1).

The most tightly bunched event was the Triple-3 for both men (7%) and women (5%).  The Sprint Carry was next (9% men, 7% women).  The event with the largest spread was Thick-n-Quick, at 53% for men and 41% for women.  Remember, Froning won this event in 1:40 (4.2 reps per minute), while some athletes only finished 2 reps (0.5 reps per minute).

The lesson, as always: Rich Froning is a machine.

*All of the alternate scoring scenarios here assume that the sprint sled events would each be worth half value.
**In order to do this, I had to convert all time-based events from a time score to a rate of speed score (reps per minute, for example).  There are lots of intricacies to this, so another individual calculating these may have used slightly different assumptions.  The main takeaways would be the same here, I think.

Friday, August 1, 2014

Initial Games and Pick 'Em Observations

Only a few days removed from the conclusion of the 2014 CrossFit Games, I haven't quite had the time and energy to completely digest what took place.  Trust me, there is more analysis to come dealing with the Games from a variety of angles, but for now, let's start with some quick observations.
  • I had the good fortune of being able to attend the Games in person (thanks to my wife for the birthday surprise!), so I can't comment on the quality of the TV product, but the intensity in-person for the prime-time events was top-notch.  In particular, the conclusion of the Saturday night event ("Push Pull") was probably the most exciting individual event I've witnessed.  The crowd's reaction when Froning took the lead, when it looked for the first time all weekend that the real Rich Froning had arrived, was powerful.  But for Josh Bridges to go unbroken on the last set of handstand push-ups and then hold off Froning on the final sled drag was really something special.
  • One of the underrated moments of the in-person experience came as we were leaving the venue Saturday night and a buzz went through the departing crowd as the JumboTron showed the updated men's overall standings with Froning out front for the first time since Friday morning.  You really got the sense that the spectators were fans, not just CrossFitters there to support the athlete from their local box.
  • Also super-cool was the "Fra-ser" chant in a small but vocal section of the crowd before the men's final.  Don't get me wrong, Froning was still the clear fan favorite, but this was a neat moment to hear the support for the underdog.
  • The Muscle-up Biathlon was also a pretty thrilling event, both for the men and women.  For the women in particular, the race between Camille and Julie Foucher (still in contention for the title at that point) was pretty nuts.  And I'm not sure if this is a good or bad thing, but when Foucher was no-repped at the end of her round of 12, I heard the first real booing at a CrossFit event.  People friggin' LOVE Julie Foucher, and they HATE no-reps.
  • Now let's get to some numbers.  Based on the CFG Analysis predictions, Cassidy Lance finishing in the top 10 was the third-longest shot to come through dating back to the 2012 Games.  I had her with a 2.3% chance of reaching the top 10.  The only two longer shots were Anna Tunnicliffe in 2013 (2.2% chance of top 10) and Kyle Kasperbauer in 2012 (1.2% chance of podium).  There were really no major long-shots on the men's side this year.  Jason Khalipa for podium had the lowest chances at 9.1%.
  • The winner of the pool was JesseM with 134.7 points.  He had some great picks, including 6 points on Lauren Fisher to finish top 10. However, he had far from the ideal set of picks.  That would have been the following:
    • 1 Rich Froning win
    • 1 Jason Khalipa podium
    • 1 Tommy Hackenbruck top 10
    • 1 Camille Leblanc-Bazinet win
    • 1 Annie Thorisdottir podium
    • 15 Cassidy Lance top 10
    • Total score of 688.7!
  • I wrote a piece a couple weeks ago in which I mentioned that prior Games experience was worth approximately 4-5 spots at the Games.  The results from this season were consistent with that.  It seems fair to say that the advantage of having experience at the Games is real.
  • Despite not including the impact of past experience in my model, the calibration of my predictions turned out to be pretty solid again this year.  Now with three years of data, below is a chart showing the calibration of my top 10 predictions.
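For those curious what a calibration check involves: bucket the predictions by their stated probability, then compare each bucket's claimed chance with the observed hit rate. Here's a minimal sketch with invented (probability, outcome) pairs, not my actual prediction data.

```python
# Calibration check: group predictions by stated probability and compare
# each bucket's claimed chance with the observed hit rate.
# The (probability, outcome) pairs below are invented for illustration.

def calibration(preds):
    buckets = {}
    for prob, hit in preds:
        buckets.setdefault(round(prob, 1), []).append(hit)
    return {b: sum(hits) / len(hits) for b, hits in sorted(buckets.items())}

preds = [(0.1, 0), (0.1, 0), (0.1, 1), (0.9, 1), (0.9, 1), (0.9, 0)]
print(calibration(preds))  # observed hit rate per probability bucket
```

A well-calibrated model shows hit rates close to each bucket's stated probability; big gaps mean the probabilities are over- or under-confident.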

Over the next few weeks, I'll be breaking the Games down in more depth, in particular digging more into the programming, how it compares to years past and what it might tell us about the future.  That's it for today, so good luck in your training, and I'll see you back soon.

Thursday, July 24, 2014

Pick 'Em Rankings (updated after each day of competitions)

The contest rankings below represent what the payoffs would be using the current Games standings.  A highlighted cell indicates a correct pick.  Currently the chart below does not show how much was wagered on each athlete.  I'm working on a way to incorporate that without making the chart too cumbersome.


Congratulations to JesseM on winning the CFG Analysis 2014 Games Pick 'Em! Five of six picks right, including a big-money pick of Lauren Fisher for top 10. More analysis of the results, and the Games in general, will be coming over the next week. Stay tuned!

If you see a pick that looks incorrect for you, let me know. I typed these in by hand and easily could have made a mistake.

Quick Thoughts on the Games so far [UPDATED MORNING OF 7/27]:

  • Not quite sure how Rich Froning is back in front after the past three days. He's had so many finishes in the 20s that it's pretty shocking he could still be in first, but his ability to rack up first and second place finishes has been crucial with this scoring system.  I haven't had time to do the math on this yet, but my guess is that Mat Fraser would be in front if the regionals scoring system were used.
  • Camille Leblanc-Bazinet looks to be running away with this on the women's side.  I mentioned to a friend of mine prior to the Games that the fact that she has had so little hype, especially compared to Julie Foucher, might really help her mentally at the Games.  I'd like to note that Camille was a 38% shot to win the Games here on CFG Analysis, but was not in Pat Sherwood's top 8 competitors.  Just a note.
  • On the flip side, the pressure on Julie Foucher may have been too much.  For whatever reason, she was the focus of so much hype in the community this year, and the Games really haven't turned out quite like people expected for her.  That being said, she still has a shot at the podium if she can put together a couple solid finishes today.
  • Spectator-wise, Saturday's events were definitely the best of the three days so far.  The muscle-up biathlon made for some tremendous drama, and the men's final in the push-pull event had quite possibly the most intense finish that we've seen in Games history.  It certainly helped that the men's competition (finally) is actually close, and that race really meant something. [END 7/27 UPDATES]
  • [7/26 UPDATES BELOW]
  • It probably goes without saying that this will be the toughest test Rich Froning has faced since he finished second in 2010. He desperately needed that win in 21-15-9, but now that he's established that he can still dominate in a traditional CrossFit workout, I would still consider him the favorite despite sitting in fourth right now.
  • Of the men in front of him, I think Josh Bridges has the best shot to take him down.  While I don't think Khalipa will fall far, I also think that he generally has a hard time beating Froning in the traditional workouts, so I expect Froning will continue to gain on him the rest of the weekend.  But Bridges is one of just a handful of people that can beat Froning on the classic CrossFit workouts if they're in his wheelhouse.
  • If Julie Foucher isn't able to make up some ground quickly today, this is set up to be a two-horse race on the women's side.  Camille has survived the early workouts, which typically have hurt her in the past, and the remainder of the weekend should set up well for her.
  • Interesting that none of our 25 participants picked current leaders Kara Webb or Jason Khalipa to win.  Khalipa is popular enough that I am surprised no one took a flier on him at something like 70:1.  It's a little more understandable on the women's side, since Kara Webb wasn't as good a payoff at 18:1 and Julie Foucher seemed like a great value at 6:1. [END 7/26 UPDATES]

Wednesday, July 16, 2014

Quick Hits: Contest Updates and Last-Minute Games Thoughts

T-minus one week, everyone.

Not that it should be a shock to anyone who's followed the sport for the past few years, but the Games will be kicking off two days earlier than originally scheduled, beginning with "The Beach" on Wednesday, July 23. We know a little bit about what's to come at this year's Games, but as usual, most of the weekend is still unknown. As we head down the home stretch here, I wanted to get one more post in to hit on a few topics.

First, the CFG Analysis Games Pick 'Em is now almost closed, and we've got 25 people entered as of the time of this writing. I sort of consider this a test run since there's nothing to give away and I haven't really done a whole lot of advertising, but I think we've already got enough entries to at least make things interesting and iron out any kinks for next year and beyond (that being said, anyone on the fence about entering, go ahead and throw your hat in the ring - the more, the merrier). What I think is intriguing about the set-up for this contest is that there are a lot of different strategies to employ, and we've already seen a few different ones come up.

One way to evaluate that is to look at how aggressive people have been with their wagers. For each competitor so far, I've calculated the maximum payoff they could have in the event that they hit on all six picks. These range from less than 60 points to over 600 at this point. The top 4 potential payoffs so far are:
  1. jayfo - 628.5 points (including 6 on Eric Carmody to go top 10 and 3 on Kenneth Leverich to podium)
  2. J Weezy - 345.1 points (including going with Noah Ohlsen for win, podium and top 10 on the men's side, as well as 10 points on Alessandra Pichelli to go top 10)
  3. kie - 325.0 points (including 15 points on Julie Foucher to win and a high-paying pick of Lucas Parker to podium)
  4. John Nail - 298.2 points (including a clever 13-point wager on Alessandra Pichelli to go top 10)
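For anyone curious how those ceilings come together, the maximum payoff is just the sum of each wager times its payoff odds. Here's a minimal Python sketch; the wagers and odds below are made up for illustration and are not anyone's real entry:

```python
# Hypothetical sketch: a contest entry is six picks, each a
# (points wagered, payoff odds) pair. All numbers are made up.

def max_payoff(picks):
    """Total points if every pick hits: sum of wager * odds."""
    return sum(wager * odds for wager, odds in picks)

# An aggressive hypothetical entry with a couple of long shots.
entry = [
    (6, 70.0),   # long shot to go top 10 at 70:1
    (3, 40.0),   # long shot to podium at 40:1
    (15, 6.0),   # value pick to win at 6:1
    (10, 2.5),
    (5, 1.2),
    (11, 0.9),   # heavy favorite at less than 1:1
]

ceiling = max_payoff(entry)
```

The actual contest score only counts the picks that hit, of course; this number is just the best case if everything breaks right.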
It's also been fascinating to see which athletes are being picked most often. Remember, these are not necessarily the athletes people think are most likely to finish in certain spots, but rather the athletes where people believe my predictions are most understated. And since my picks are largely based on Regional results, what that really boils down to is the athletes who people expect to improve on their Regional performances.

The athletes with the most points wagered on them so far are:
  1. Julie Foucher - 92 points
  2. Rich Froning - 74 points
  3. Camille Leblanc-Bazinet - 32 points

Clearly people like Julie Foucher to win at roughly a 6:1 payoff, but even more striking is that people are almost exclusively picking Rich Froning to win despite a payoff of less than 1:1. Then again, it's hard to bet against a man who has won the Games three years in a row and has finished atop the Cross-Regional leaderboard three years in a row.

As noted before, the payoffs will be adjusted if and when athletes withdraw prior to competition. So far, Rory Zambard has withdrawn. I have adjusted the payoffs already, but the effect is minimal, so I won't be re-posting the predictions. If one of the favorites drops off, I will likely re-post the picks. You are allowed to adjust your picks at any time before Wednesday's opening event for any reason.

Now, onto the Games themselves.  At this point, we know something about five events, though three still have some portion left in doubt. For instance, all we know about "The Beach" is that, well, it's at the beach. And we do know there will be two sled push events, but we don't know the weight. The weight there could make a huge difference; I like Jason Khalipa's chances a lot better with a 300-lb. sled than a 100-lb. one.

With that in mind, here are a handful of thoughts on what has been released and what could be to come:

  • The events released so far are a pretty decent balance between strength and conditioning, so I don't think the remaining events will really be biased either way. What we haven't seen yet is any sort of high skill gymnastic movements, so I'd expect those to be coming Friday and Saturday night as well as Sunday.
  • For those who read this blog on a regular basis, you may recall that the "Triple 3" combines the three movements that fall into the "Pure Conditioning" group. Without question, this is going to be a test of exactly that: conditioning. Especially at this level, I don't see these athletes having too much trouble with 300 double-unders. The row is basically just a long warm-up here, and I don't think the athletes will separate much there. I think this is all going to come down to who has the lungs left on that 3 mile run.
  • Given the other events on Friday, I'm not sure that "The Beach" will actually include much running. We already have a 3-mile run and the two sled pushes on Friday, so they may not be doing a whole lot more running (plus you throw in the 300 double-unders, and there are going to be some sore calves come Saturday morning). My hunch (hope?) is that "The Beach" includes some unique stuff like paddle boarding or some sort of object carry through the water.
  • Interesting that they put the one-rep max overhead squat on Wednesday when hardly anyone will be watching. Typically the max lift events have been highlighted at the Games in the past, so I'm not sure why they buried this one at the beginning. It also seems a little repetitive considering the number of overhead squats and heavy snatches at Regionals.
  • I wouldn't rule out another max lift somewhere in the weekend. What I would LOVE to see is a heavy ladder, but with several reps per minute. I was in a competition a couple years ago that had 20 double-unders plus 3 clean-and-jerks each minute, and that got serious in a hurry.
  • Will the two sled events both be worth 100 points? I hope not. I like the idea of having these small events that are worth 50 points to diversify the competition without weighting one particular skill too much.
That's it for today. Unless there are any major updates needed to the contest page, I expect you won't hear from me again until the Games begin. The goal is to post daily updates on the contest standings, and maybe a few quick thoughts on each day's happenings.

Until then, good luck with your training, and get your mind right for the Games!

Thursday, July 10, 2014

Does Past Games Experience Matter? (And Other Thoughts On Predicting the Games)

Today, I'd like to tackle a topic that's been mentioned quite often in CrossFit Games commentary. It's basically an assumption that's been taken as fact: having experience competing at the CrossFit Games in the past gives an athlete an advantage over first-time Games competitors. I've generally believed this to be true, but without data to support it, we're all really just guessing.

But let's start with the reason I decided to look into the issue. About a week ago, I released my 2014 Games predictions, which were used in the CFG Analysis Pick 'Em that is going on right now. What I started to notice as picks came in is that people tended to wager much more often on past Games competitors. One reason for this is simply familiarity: people know these athletes and have seen them perform well (or perhaps they just like cheering for them). But I believe another reason is that people tend to assume that Games experience matters. And the reason that is a factor here is that my model does not take past Games experience into account.

Why? Well, in constructing these models, I wanted to be able to predict the chances that an athlete would win (or finish top 3 or top 10), not simply make a prediction about where they would finish on average. That meant I couldn't just set up some sort of linear regression model that could account for several variables (such as Regional placing, Open placing, past Games placing, etc.). I needed a model that could generate a range of outcomes, and I felt using this year's Games results was my best bet. This is a different approach than I took for the regionals predictions, for three reasons:
  1. Entering the Regionals, there had only been 5 events thus far this season, which is not really enough for me to get a sense of what types of events might come at the Regional level.
  2. Some athletes notoriously coast through the Open, so those results alone would not be a great predictor of Regionals success.
  3. Because so many more athletes compete at Regionals each year compared to the Games (~1400 vs. 85), there was enough historical data for me to build a decision-tree-style model for Regionals. There is simply not enough past Games data to learn about what characteristics give an athlete a chance of winning. Basically what we would learn is: "In order to win, be Rich Froning."
So my solution was to build a pretty unique simulation model that took into account specific results for each athlete for all 11 events this season prior to the Games. It's at this point that I'd like to share a quotation that one of my work colleagues (a predictive modeling guru if there ever was one) likes to bring up quite often:

"Essentially, all models are wrong, but some are useful." - George Box

Any model we come up with to predict the CrossFit Games will be wrong. Remember, a perfect model would predict with 100% certainty exactly who would finish in what positions. No one would have a 20% chance of winning - they would have a 100% chance or a 0% chance. But creating such a model is impossible. So with that in mind, I acknowledge that my model is wrong. But is it useful? I think so.
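To make the simulation idea concrete, here's a stripped-down Python sketch of the general approach: resample each athlete's season placements with some noise, crown a winner each run, and count how often each athlete comes out on top. To be clear, this is not my actual model; the athletes, placements, and noise term below are all made up for illustration.

```python
import random

# Toy simulation model (NOT the real model, just the general idea).
# Each athlete's season is a list of event placements (made-up data).
season = {
    "Athlete A": [1, 3, 2, 1, 4],
    "Athlete B": [2, 1, 5, 3, 2],
    "Athlete C": [6, 5, 4, 7, 5],
}

def simulate_wins(season, n_sims=10000, seed=42):
    """Estimate each athlete's win probability by repeated simulation."""
    rng = random.Random(seed)
    wins = dict.fromkeys(season, 0)
    for _ in range(n_sims):
        # Draw a "Games performance" from each athlete's season results,
        # plus arbitrary noise to represent event-day variability.
        scores = {name: rng.choice(places) + rng.gauss(0, 1)
                  for name, places in season.items()}
        wins[min(scores, key=scores.get)] += 1  # lowest score wins
    return {name: w / n_sims for name, w in wins.items()}

probs = simulate_wins(season)
```

The point is that the output is a range of outcomes rather than a single predicted finish, which is what lets a model say an athlete has, say, a 38% chance to win instead of "will finish second on average."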

The chart below shows the calibration of this model on the 2012 and 2013 Games (combined men and women for both years). This shows how often athletes finished in the top 10, compared with the chances I gave them. A perfectly calibrated model (not necessarily a perfectly accurate one) would have the blue line follow the red line exactly, so that for the athletes I predicted with a 7% chance to finish top 10, exactly 7% of them did finish in the top 10.

As we can see, the model has been pretty well calibrated the past two years. Generally speaking, athletes with a low probability of finishing in the top 10 don't finish in the top 10. The model is also much more accurate than a dull (but perfectly calibrated) model that gives all athletes an equal chance of finishing top 10: my model's mean square error was 11.6% vs. 17.1% for the equal chance model.
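For reference, that mean square error figure is computed the way a Brier score is: average the squared difference between each predicted probability and the 0/1 outcome. A minimal sketch with made-up probabilities (not my actual predictions):

```python
def brier_score(predictions, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Made-up top-10 probabilities and results (1 = finished top 10).
preds = [0.90, 0.70, 0.40, 0.20, 0.05]
actual = [1, 1, 0, 0, 0]

model_mse = brier_score(preds, actual)

# The "dull" baseline gives every athlete the same chance (here 2/5).
base_rate = sum(actual) / len(actual)
baseline_mse = brier_score([base_rate] * len(actual), actual)
# A useful model should beat the baseline: model_mse < baseline_mse.
```

Note that the baseline is perfectly calibrated by construction, which is exactly why calibration alone isn't enough; you also want the error to be lower than the equal-chance model's.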

But of course, my model is not perfect. And one area where it could be skewed is in how it accounts for (or rather, does not account for) past experience. If past experience is an advantage, then my model is understating (to some degree) the chances for returning Games athletes and overstating (to some degree) the chances for first-timers.

Which brings us back to the original question: Does past Games experience matter? To answer the question, I compiled the results from the Games and Regionals from 2011-2013 and tagged all athletes with Games experience prior to the year of competition (I went all the way back to 2007 to see if athletes had past experience). 

The simplistic way of looking at this is to compare the finishes of athletes with prior experience compared with first-timers. Looking at things this way, we find that returning athletes do finish approximately 8 spots higher than new athletes on average (18.3 vs. 27.8). However, this could simply be due to the fact that the returning athletes are just flat-out better, and their experience had nothing to do with their Games performance.

What we should do to account for this is compare Games performances to Regionals performances in the same year (using the cross-Regional rankings, adjusted for week of competition). In general, we expect athletes who fare better at the Regionals to perform better at the Games. So if Games experience is a factor, the returning competitors should perform better at the Games than their Regionals results would indicate. When we look at things this way, we see that returning competitors do indeed improve their placing by approximately 0.6 spots from Regionals, while new competitors dropped by approximately 0.8 spots from Regionals.

Unfortunately, there is still a problem with this comparison. Although Regionals performances are a good indicator of Games performance, there is still a tendency for athletes to regress toward the mean in general. That is, athletes who finish near the top at Regionals don't tend to improve their placement at the Games, while athletes near the bottom at Regionals tend to improve slightly on average. Part of this is due to the fact that if you finish near the top at Regionals, there is basically nowhere to go but down (and the reverse is true for athletes at the bottom of the Regional standings).

So to be fair, we need to compare returning athletes with first-timers who had similar Regional placements. Since we don't have a huge sample, I split the rankings into buckets of 10. Within each bucket, I found the average Regionals and Games placements of returning athletes and first-timers, as well as the average improvement or decline. The results are presented in the chart below.

For every level of competitors except those near the bottom, athletes with past Games experience showed an advantage at the Games over first-timers with similar Regional placements. While there is significant variation in how much this advantage is worth, if I had to put a number on it, I'd say Games experience is worth 4-5 spots at the Games. Remember, my current predictions assume all athletes have equal experience, so a reasonable adjustment might be to improve the average rank of past competitors by ~2 spots and drop the rank of new competitors by ~2 spots.
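For anyone who wants to replicate the bucketing, the mechanics are simple: group athletes by Regional rank in buckets of 10, split each bucket by experience, and average the rank change from Regionals to the Games. Here's a Python sketch with fabricated records (the real analysis used the 2011-2013 data described above):

```python
# Each record: (regional_rank, games_rank, has_games_experience).
# These records are fabricated purely to show the bucketing mechanics.
athletes = [
    (1, 2, True), (3, 6, False), (5, 4, True), (8, 12, False),
    (12, 9, True), (15, 20, False), (18, 14, True), (19, 25, False),
]

def avg_change_by_bucket(athletes, bucket_size=10):
    """Average (Regional rank - Games rank) per (bucket, experience) group.

    Positive values mean the group improved its placing at the Games.
    """
    groups = {}
    for reg, games, experienced in athletes:
        key = ((reg - 1) // bucket_size, experienced)
        groups.setdefault(key, []).append(reg - games)
    return {key: sum(diffs) / len(diffs) for key, diffs in groups.items()}

changes = avg_change_by_bucket(athletes)
# e.g. changes[(0, True)] is the average improvement for experienced
# athletes ranked 1-10 at Regionals.
```

Comparing the experienced and first-timer averages within the same bucket is what controls for the "returning athletes are just better" objection raised earlier.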

This analysis is, of course, not precise. It is likely that experience matters more for veterans like Rich Froning and Jason Khalipa than it does for someone who has only competed once at the Games before. Moreover, some veteran Games athletes have consistently struggled to match their Regionals performances, while newcomers have overachieved in the past (see Garrett Fisher last year).

I don't plan to adjust my predictions this year, for a few reasons:
  • There is not a simple solution of how to implement this factor into the model framework I have set up;
  • I feel that the predictions are still pretty reasonable on the whole (based on the calibration seen in the past two years);
  • For the Pick 'Em contest, I committed not to change those predictions for reasons of fairness. I suppose I could produce a second set of predictions, but I think that's just creating unnecessary confusion. Anyone entering the contest is welcome to use the information in this post to their advantage if they wish.
Still, when it comes time to make my predictions again next year, I'm going to try to find a way to account for past Games experience. The model still won't be perfect, but hopefully it will be even more useful.