Of course head to head means something.
Bird beat Hermitage
Hermitage beat Manchester
Manchester beat Bird
They were all 1-1 against one another, so you have to look at the other 3000+ games played this year to help yourself out.
There are literally hundreds of chains like this every season, and some are much longer than this.
Football scores and winning football games do not obey the transitive property. We have to resort to probability and statistics because Boolean Algebra and standard algebra won't do in this case.
That's the whole point of this: head-to-head matters, and not just the games you want to matter, but all of the head-to-head games matter. Bird didn't just play Hermitage and Manchester, they played 13 other teams. All 15 of the teams they played had between 9 and 14 other games, and all of those teams' opponents had between 9 and 14 other games, and every one of those was a head-to-head matchup. We are trying to turn 2300 scores into a more manageable list of one number per team. It is impossible to do that and have EVERY head-to-head matchup represented exactly. We have to average them out.
Bird did lose to Manchester by 6 in regular season
Manchester did lose to Hermitage by 12 in round of 16
Hermitage did lose to Bird by 21 in round of 8
You can't reconcile those scores easily, but if they were the only data I had, I'd end up saying Bird outscored their opponents by 15, Hermitage was outscored by 9 and Manchester was outscored by 6. If we divide by 2 and make the average rating a 50, Bird's rating would be 57.5, Manchester 47, and Hermitage 45.5. This of course would not give us the correct outcome to any of the games, but it "averages" the result and would lead us to conclude Bird is, on an average night, 10.5 points "better" than Manchester. Now, we also have some other data that might make us want to adjust this more. Bird won the deepest game in the playoffs and the most recent game between the three...some people would tell you that winning more recent games or playoff games is worth more than regular season games. Bird went on to win the state championship, and one might give a little bonus for that achievement (I don't, but you could). Also Bird and Manchester played many common opponents. In almost every case Bird beat the common opponent by quite a bit more than Manchester did. Since the ratings are not used solely to compare Bird to Manchester, but to compare all teams to one another, and since Bird typically would beat a given opponent by 14 more points than Manchester would, it might be reasonable to assume when we put all these factors together that on an average night, Bird is 14 points better than Manchester, even if Bird clearly was not better on the one night they happened to play.
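For anyone who wants to check the arithmetic on that three-game cycle, here is a minimal sketch in Python. It is not the actual ratings formula, just the "add up the margins, divide by 2, center on 50" toy example from the paragraph above. The function name cycle_ratings and the parameter names baseline and scale are made up for illustration.

```python
# The three games from the cycle: (winner, loser, margin of victory).
games = [
    ("Manchester", "Bird", 6),        # regular season
    ("Hermitage", "Manchester", 12),  # round of 16
    ("Bird", "Hermitage", 21),        # round of 8
]

def cycle_ratings(games, baseline=50.0, scale=2.0):
    """Toy rating: baseline plus each team's net point margin divided by scale.

    baseline and scale are illustrative names; 50 and 2 are the numbers
    used in the post (average rating of 50, divide the net margin by 2).
    """
    net = {}
    for winner, loser, margin in games:
        net[winner] = net.get(winner, 0) + margin
        net[loser] = net.get(loser, 0) - margin
    return {team: baseline + total / scale for team, total in net.items()}

ratings = cycle_ratings(games)
print(ratings)
# {'Manchester': 47.0, 'Bird': 57.5, 'Hermitage': 45.5}
print(ratings["Bird"] - ratings["Manchester"])
# 10.5
```

Notice that none of the three actual results is reproduced by these numbers, which is exactly the point: the ratings average the results rather than replay them.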
Maybe one more way to address the Bird vs. Manchester thing (once again the ratings are concerned with comparing EVERY team to one another not just Bird vs. Manchester) is to look at the common opponents:
Manchester was 6 points better than Bird head to head
Manchester was 8 points better than Bird against Midlothian
Bird was 7 points better than Manchester against Wythe
Bird was 7 points better than Manchester against Clover Hill
Bird was 7 points better than Manchester against Monacan
Bird was 9 points better than Manchester against James River
Bird was 31 points better than Manchester against Huguenot
Bird was 33 points better than Manchester against Hermitage
Bird was 34 points better than Manchester against Cosby
So against playoff teams, Bird was much better. This does not change the fact that Manchester beat Bird, but when we're setting up ratings that compare everybody, that Cosby result is just as important as any other game. So in these 9 games Bird was 114 points better than Manchester (114/9=12.67). So the ratings would make sense. Most teams were on average 12.67 points worse against Bird than Manchester. If you throw in the fact that Manchester's remaining opponents were 16-16 and Bird's were 69-11 that might account for Bird getting a boost of another point.
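The common-opponent average can be checked the same way. This is only a sketch of the addition and division described above; the dictionary name diffs is made up, and a positive number means Bird did better than Manchester against that opponent.

```python
# Bird's result minus Manchester's result against each common opponent,
# in points (negative means Manchester did better). Figures are the nine
# listed in the post, with the head-to-head game counted as one of them.
diffs = {
    "head to head": -6,
    "Midlothian": -8,
    "Wythe": 7,
    "Clover Hill": 7,
    "Monacan": 7,
    "James River": 9,
    "Huguenot": 31,
    "Hermitage": 33,
    "Cosby": 34,
}

total = sum(diffs.values())
average = total / len(diffs)
print(total, round(average, 2))
# 114 12.67
```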
I don't get mad at my students when I teach something poorly, so I will try to be patient, but I get paid to teach them. So I'm going to try.
Here are the basics (this is two months' worth of statistics class in a paragraph).
Some things in the world have certain (or nearly certain) outcomes. If I drop a cannonball off the roof of my house, we can calculate very accurately how long it takes for it to hit the ground, how fast it will be going, and how much force it will hit the ground with. Algebra, Geometry, and Calculus and their related fields deal with this kind of stuff very nicely, but there are many other things for which the outcome just can't be predicted that easily.
If I give you an aspirin there is a chance your headache will go away within an hour. However, it's not certain. It's also not certain that if it went away it was because of the aspirin, or if it was going to go away anyway. I bought a lottery ticket just before I started typing this. I don't think I'll win any money because I usually don't, but there's a chance. We can calculate that chance using the rules of probability.
I can't tell you who will win the NFL playoff games this weekend. Too many variables involved. So many, in fact, that if we look at enough games, the results are random. When I say they are random, that is not to say that all results are equally likely. We can use the rules of probability to determine how likely a team is to win a game, or how likely a player is to get a hit or make a free throw. My latter examples are pretty simple. We just look at past track record, and if we have enough data (say 30 free throws or more), then the past rate of success is a good indicator of the probability they'll make the shot. If a player, over a long period, starts to do dramatically better or worse than in the past, that is a tip-off that we might want to look at what has changed (guy responds differently under playoff pressure, got a new girlfriend, broke his hand).
Now, determining who might win a football game in the future is much trickier. We have a lot of data, but not data specific to what we want to know. For instance, a lot of people have speculated on what would happen if Salem were to play Monacan. We all pretend we are experts on football and evaluating talent and calling plays and then tell everybody what we think like we're geniuses. The truth is, some people are experts on all that stuff and they still make the wrong prediction all the time. Also, the truth is, none of us have seen every team in the state play every game, so we're working with a limited set of data anyway.
The simple truth is that I don't think the 2014 Spartans and 2014 Chiefs played one another. They had one common opponent. An opponent who claims they were off when they played the Chiefs and a little more on when they played the Spartans. I don't think the two schools have ever played, though I suppose it's possible that they hooked up somewhere in the last 40 years.
So honestly, how do we try and make this call? More importantly, how do we try to make this call if someone brings up any two of the 308 VHSL teams?
Quick, 2009 James Wood vs. 2004 Indian River, who'd win? Contrary to what idiots on ESPN try to make you think when they're picking the NFL games, we just don't know. However, there are techniques that can help us.
First of all, let's try to make it as simple as possible. We try to find the one variable that correlates the most to winning football games (correlation is a whole chapter...sigh). What is it? Coaching experience? Running back's 40 time? Size of school? NO, it's points scored and allowed. So the most successful rating system is going to be primarily (or possibly even exclusively) based on points scored and allowed. This is nice because that's pretty much the one statistic that you probably are getting reported to you accurately (though not always, believe me). None of us have yardage numbers or time of possession numbers on all the games, and if we did, I can guarantee you that they'd be wrong anyway.
So, if we're really going to try and predict the winners of games, our variable is going to be the final scores of previous games. We can determine experimentally the average margin of victory, home field advantage rates, etc. Using means and standard deviations and the laws of probability and z-scores and other things, we can come up with an idea of how two teams would play against one another based on the numbers. Is the result certain? No. Upsets happen (statistically, the favorite seems to win about 5/6 to 6/7 of VHSL games, so roughly 1/7 to 1/6 of games are upsets).
So we can represent that probability various ways, but my favorite is simply that: just a probability that team A will beat team B. Now, exactly what this number represents is questionable.
Let's just say I told you that Salem has a 58% probability of beating Monacan (obviously, since the probabilities of A and not A must add up to 100%, that means Monacan has a 42% chance of beating Salem). Am I saying that I'm 58% sure Salem has the better team? Am I saying that there's a 42% chance the ratings have it wrong, or am I saying there's a 42% chance of an upset? Truth is I don't know, and I'm not sure I know why it matters anyway. Suffice it to say that, with the standard deviations we've observed in past games, a team rated as much higher than another team as Salem is over Monacan would win the game 58% of the time.
The other way to represent this probability is a point spread. We might say Salem is a 3 point favorite, or that Salem would win by three. Many programs (including my own) would use this number to predict a score. Say, Salem 27, Monacan 24. The truth is we really don't expect the team to actually win the game by three. We kind of think of it this way. If they played a huge number of games then on AVERAGE Salem would win by three. Late in the season the standard deviation of ratings systems always seems to be right around 14 or 15 points. I usually just throw out the number 14 because it's 2 touchdowns, but 14.5 is more accurate.
What that means is if a team is favored by 3 points and the standard deviation is 14, they have a z-score of 3/14, and we can actually calculate the probability they will win the game from that number (z-score tables, if you ever took college stats). It also means that a little more than 2/3 of the time they play, the final margin will be within 14 points of the prediction, 95% of the time it will be within 28 points, and only with great rarity will it be more than 42 points (three standard deviations) different from the prediction.
So using this stuff, we can generate a number that compares teams to other teams. Matchups matter, but more than one number just wouldn't be very useful to the average human brain, so when we rate the teams we are kind of saying that on average this number represents how well the teams have played against their opponents over the season. Any team you beat by more than the ratings suggest means you probably played better than your average against that team.
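If you don't have a z-score table handy, the same lookup can be sketched in a few lines of Python. This assumes the actual margin is normally distributed around the predicted spread, which is what using a z-table implies; the function names normal_cdf, win_probability and within are made up for illustration, and this is not the ratings program itself.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability, the number a z-table gives you."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def win_probability(spread, sd=14.0):
    """Chance the favorite wins, assuming the margin is normal around the spread."""
    return normal_cdf(spread / sd)

def within(points, sd=14.0):
    """Chance the final margin lands within `points` of the predicted spread."""
    z = points / sd
    return normal_cdf(z) - normal_cdf(-z)

p = win_probability(3)       # 3-point favorite, standard deviation of 14
print(round(p, 2))           # about 0.58
print(within(14), within(28))  # roughly 0.68 and 0.95
```

A 3-point favorite with a standard deviation of 14 comes out at about 58%, which is where the Salem over Monacan figure comes from.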
All these rules can be found in any advanced text on statistics and thanks to some great modern math theorems we can trust that they can be used to rate sports teams and we can always test our results at the end of the year to see if they are behaving as expected. If not, we know we've got a mistake somewhere or a flawed assumption somewhere and we need to debug it or tweak it (I love that part).
So am I saying that the Gilliam Ratings are right? Nope. I am saying they can't possibly be right, but no system can be, and if all you're going to go on is the final scores of games, well, this system is as strong as any and much better than what I used to publish in the past. Part of why I was so slow to get them back up was that every time I'd get started I'd think of a way to improve them. Hence, here I am 48 hours after posting the new ratings and I already have newer and better ratings.
I love it when people find mistakes or think of things to consider in the formulas that I may not have considered. If you suggest that statistics is not good math, however, I will dismiss your arguments: you have nothing to add to the business of mathematically rating teams, and you should write poems about them instead (poetry is another fine endeavor of the human mind; I am not putting it down, just suggesting that it is another way of thinking about things).
No wonder my students hate my class.
This post was edited on 1/2 10:01 PM by GilliamRatings
This post was edited on 1/2 10:45 PM by GilliamRatings