I thought I’d go through the prediction engine to see how it is performing so far this season.
- Total matches where both participants are on a Division I roster: 2860
- Matches where the Prediction Engine got the winner correct: 2006 (70%)
- Matches where the Prediction Engine got the match result correct (regardless of whether the winner was predicted correctly): 1419 (49.6%)
- Matches where the Prediction Engine picked both the winner AND match result correct: 955 (33.4% of all matches, 47.6% of matches where the win prediction was correct)
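The three rates above are straightforward to compute once each match record carries both the predicted and actual outcomes. Here's a minimal sketch of that calculation; the `Match` fields and function names are my own illustration, not the engine's actual schema:

```python
# Sketch of the three accuracy rates, assuming each match record carries the
# predicted and actual winner plus the predicted and actual result type
# (DEC, MD, TF, FALL, ...). Field names are hypothetical.
from dataclasses import dataclass

@dataclass
class Match:
    predicted_winner: str
    actual_winner: str
    predicted_result: str   # e.g. "DEC", "FALL"
    actual_result: str

def accuracy_report(matches):
    total = len(matches)
    winner_ok = sum(m.predicted_winner == m.actual_winner for m in matches)
    result_ok = sum(m.predicted_result == m.actual_result for m in matches)
    both_ok = sum(m.predicted_winner == m.actual_winner and
                  m.predicted_result == m.actual_result for m in matches)
    return {
        "winner_pct": 100 * winner_ok / total,
        "result_pct": 100 * result_ok / total,
        "both_pct_of_all": 100 * both_ok / total,
        "both_pct_of_correct_winners": 100 * both_ok / winner_ok,
    }
```

Note the last rate uses only the matches where the winner was picked correctly as its denominator, which is why 955 shows up as both 33.4% (of 2860) and 47.6% (of 2006).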
Obviously, after only 2 weekends of wrestling, that's a very small sample size. But it does give a good idea of the predictability of this process. I envision the results of the prediction engine will only get better as we get further into the season, when wrestlers have more matches and their ratings move closer to reality. This is especially true for true freshmen, and for wrestlers who hadn't wrestled a college match until this season.
Sometime this season, I'll try to come up with a more elaborate reporting style for this. I'll probably add a page to the website where users will be able to look through all of these results.
Update 11/18/2016 @ 8:00AM Central
A PSU fan suggested that since wrestlers in their very first college season will not yet have accurate ratings, it would make more sense to exclude those wrestlers from the prediction engine analysis, as we don't have sufficient data for them.
I re-ran the process with these wrestlers excluded; here are the results:
- Total matches where both participants are on a Division I roster: 2081
- Matches where the Prediction Engine got the winner correct: 1581 (76%)
- Matches where the Prediction Engine got the match result correct (i.e. picked DEC correctly, or picked FALL correctly): 1076 (51.7%)
- Matches where the Prediction Engine picked both the winner AND match result correct: 744 (35.8% of all matches, 47.1% of matches where the win prediction was correct)
As expected, when you remove the wrestlers for whom we don't have sufficient data, our accuracy rate goes up. In this case, we removed 779 match results from our analysis, but if we analyze just that set of wrestlers without sufficient data, the Prediction Engine only picked the winner correctly 54.6% of the time (425/779).
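The re-run amounts to partitioning matches by whether both participants have prior college data, then scoring each subset separately. A rough sketch of that split, with a hypothetical tuple layout and a hypothetical `prior_matches` lookup standing in for the real data:

```python
# Sketch of the "exclude first-season wrestlers" re-run, assuming each match
# is a (wrestler_a, wrestler_b, predicted_winner, actual_winner) tuple and
# prior_matches maps each wrestler to their college matches before this
# season. All names here are illustrative, not the engine's actual schema.

def winner_accuracy(matches):
    correct = sum(pred == actual for _, _, pred, actual in matches)
    return 100 * correct / len(matches)

def split_by_history(matches, prior_matches):
    """Partition matches by whether both wrestlers have prior college data."""
    def has_data(m):
        return prior_matches.get(m[0], 0) > 0 and prior_matches.get(m[1], 0) > 0
    with_data = [m for m in matches if has_data(m)]
    without_data = [m for m in matches if not has_data(m)]
    return with_data, without_data
```

Scoring `with_data` and `without_data` separately is what surfaces the gap above: 76% winner accuracy on the 2081 matches with sufficient data versus 54.6% on the 779 without.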
We improved our winner predictions by 6 percentage points after removing those wrestlers, and also improved our match result type prediction by about 2 percentage points. These statistics will allow me to make adjustments to the Prediction Engine algorithm to hopefully make it more reliable in the future.