Monday, April 25, 2011

#winning

When doing rankings, what is the most important thing that ESPN looks at? Well, besides not being a West Coast team, they look at winning record. Winning record is the focus of my new formula, which is still in its infancy. In this post I will explain why I use the inputs that I do and then put up some results from the NCAA tournament "1st Round" and the real 1st Round.


Winning record can be looked at three different ways: conference wins, non-conference wins, and overall wins. The main two that I will be looking at are the conference and the non-conference wins (and of course losses). Let's use the Pac-10 as an example.

Team               Conf Record   Nonconf Record   Overall Record
Arizona            14-4          16-4             30-8
Washington         11-7          13-4             24-11
UCLA               13-5          10-6             23-11
California         10-8          9-7              19-15
Oregon State       5-13          6-7              11-20
Washington State   9-9           13-4             22-13
Stanford           7-11          8-5              15-16
USC                10-8          9-7              19-15
Oregon             7-11          14-7             21-18
Arizona State      4-14          8-5              12-19

Non-conference Record
     The nonconf record is pretty important in my opinion. That said, I think that the conference's nonconf record is slightly more important than the individual team's nonconf record. The point of nonconf games is to go against teams that will help your team work together better. In addition, with a large enough sample size, we can start getting a better picture of which conferences are better than others.
     Teams in the "better" conferences (ACC, Big Ten, Big 12, and Pac 10, to name some examples) typically play some of the weaker teams (such as WSU playing Southern and UTPA) in order to pad their overall record so that they have a better shot at making the NCAA tournament. The flip side is that teams in the weaker conferences (I'm looking at you, SWAC) play the better teams, such as Alcorn State playing, and losing to, Texas A&M, Purdue, Colorado, and Kansas State. This is slightly balanced out by the preseason invitational tournaments: the better teams get invited and end up playing other good teams, while the worse teams play other teams that do not get invited.
     A simple way of judging conference strength is to take the nonconf record of each team in a conference and average them to see how good the conference is as a whole. If the conference tends to do well in nonconf play, then its teams will make things more difficult for one another once conference play starts.
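The averaging step can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet I actually used: the function names are made up for this example, the records are the Pac-10 nonconf records from the table above, and the adjusted percentage uses the AdjWin formula, (1 + Wins)/(2 + Wins + Losses), applied to the averaged wins and losses (an assumption about the order of operations).

```python
# Pac-10 nonconference records (wins, losses), copied from the table in this post.
PAC10_NONCONF = {
    "Arizona": (16, 4),
    "Washington": (13, 4),
    "UCLA": (10, 6),
    "California": (9, 7),
    "Oregon State": (6, 7),
    "Washington State": (13, 4),
    "Stanford": (8, 5),
    "USC": (9, 7),
    "Oregon": (14, 7),
    "Arizona State": (8, 5),
}


def adj_win(wins, losses):
    """AdjWin as defined in this post: (1 + Wins) / (2 + Wins + Losses)."""
    return (1 + wins) / (2 + wins + losses)


def conference_strength(records):
    """Average each team's nonconf wins and losses, then summarize the conference.

    Assumption: percentages are computed from the averaged wins and losses,
    not by averaging each team's own percentage.
    """
    n = len(records)
    avg_wins = sum(w for w, _ in records.values()) / n
    avg_losses = sum(l for _, l in records.values()) / n
    return {
        "avg_wins": avg_wins,
        "avg_losses": avg_losses,
        "win_pct": avg_wins / (avg_wins + avg_losses),
        "adj_win_pct": adj_win(avg_wins, avg_losses),
    }


pac10 = conference_strength(PAC10_NONCONF)
for name, value in pac10.items():
    print(f"{name}: {value:.3f}")
```

Run on the Pac-10 records, this gives 10.600 average wins, 5.600 average losses, a 0.654 win percentage, and a 0.637 adjusted win percentage.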

Conference    Avg Wins   Avg Losses   Avg Nonconf Win%   Avg Nonconf AdjWin%
ACC           12.083     5.583        0.684              0.665
Am East       6.000      9.444        0.388              0.401
Atl 10        9.357      6.929        0.575              0.566
Atl Sun       4.909      7.182        0.406              0.419
Big 12        13.417     4.333        0.756              0.730
Big East      11.625     3.938        0.747              0.719
Big Sky       6.667      8.000        0.455              0.460
Big South     6.500      7.100        0.478              0.481
Big Ten       11.364     4.273        0.727              0.701
Big West      6.556      8.889        0.424              0.433
Colonial      8.500      6.083        0.583              0.573
Conf USA      11.417     5.333        0.682              0.662
Great West    6.571      13.000       0.336              0.351
Horizon       8.000      6.700        0.544              0.539
Ivy           7.625      7.125        0.517              0.515
M West        11.000     5.333        0.673              0.655
MAAC          6.700      7.800        0.462              0.467
MEAC          5.727      9.818        0.368              0.383
Mid-Am        7.667      9.333        0.451              0.456
Miss Valley   9.400      6.300        0.599              0.588
Northeast     5.000      7.667        0.395              0.409
Ohio Valley   6.300      7.500        0.457              0.462
Pac 10        10.600     5.600        0.654              0.637
Patriot       7.125      9.875        0.419              0.428
SEC           11.750     5.750        0.671              0.654
Southern      6.417      8.083        0.443              0.449
Southland     7.083      7.500        0.486              0.487
Summit        5.900      7.400        0.444              0.451
Sun Belt      6.667      8.500        0.440              0.447
SWAC          2.600      10.700       0.195              0.235
WAC           9.889      6.778        0.593              0.583
West Coast    10.750     8.750        0.551              0.547

Looking at the adjusted winning percentage, we can see the best and worst overall conferences. (Tournament teams and their respective seeds are in parentheses.)
High:
Big 12          .730
     (Kansas 1, Kansas State 5, Texas A&M 7, Texas 4, Missouri 11)
Big East        .719
     (Pitt 1, ND 2, Marq 11, G'town 6, Cuse 3, Nova 9, L'Ville 4, WV 5, St. John's 6,
      UConn 3, Cincy 6)
Big Ten        .701  
     (O State 1, M State 10, Illinois 9, Purdue 3, Penn State 10, Wisc 4, Mich 8)
ACC            .665
     (Duke 1, Fl State 10, UNC 2, Clemson 12)
Conf USA    .662
     (Memphis 12, UAB 12)

Low:
SWAC        .235  
     (Alabama State 16)
Great West  .351  

MEAC        .383  
     (Hampton 16)
Am East       .401
     (Boston University 16)
Northeast    .409  
     (Long Island 15)

As you can see, my top 5 conferences all sent multiple teams, with no seed lower than a 12. Even Conference USA, which is not known as a power conference, sent two teams this year. The bottom five conferences sent only their tournament winner to the dance (save for the Great West, which does not have an automatic bid to the NCAA tournament and instead sends its winner to the CIT). Of the four teams that made it from the bottom 5 conferences, three were 16 seeds and Long Island was a 15 seed, but they got killed by about 40 in the first round.

Conference Record
Now that we know about nonconf records, we can start looking at how teams performed against their fellow conference teams multiple times per year. This simple chart is how we can interpret their conference strength of schedule:

                 Conference Difficulty
Team Win%    Hard        Medium    Easy
.750         Increase    Same      Decrease
.500         Increase    Same      Decrease
.250         Increase    Same      Decrease

To summarize the above chart: if a team playing in a difficult conference (any of the high 5 in the previous section) were moved to a medium/average difficulty conference, it would tend to have a better conference record because its competition would be less challenging. Likewise, if a team playing in one of the low 5 conferences were moved to an average conference, it would tend to have a worse record due to the increase in competition level.

Formula
Okay, now for the formula part.

AdjWinDifficulty = (AdjWin(((1 + diff) * confwin) + nonconfwin, ((2 - diff) * confloss) + nonconfloss))

AdjWinDifficulty is the ranking system, built as an Excel formula. It is called as AdjWinDifficulty(confwin, confloss, nonconfwin, nonconfloss, diff).
AdjWin is a formula that I covered in a previous post. Just as a refresher, the formula is (1 + Wins)/(2 + Wins + Losses). In Excel it is called as AdjWin(win, loss).
diff is the difficulty of the conference, which is taken from the nonconf section of this post.
confwin is the team's conference wins.
nonconfwin is the team's non-conference wins.
confloss is the team's conference losses.
nonconfloss is the team's non-conference losses.

What this formula does is increase/decrease conference wins based on the difficulty of the conference and add them to the nonconf wins to get new adjusted overall wins. The losses are then decreased/increased based on the same difficulty rating and added to the nonconf losses. Because conference wins are multiplied by (1 + diff) and conference losses by (2 - diff), a higher diff makes each conference win count for more and each conference loss count for less. Once we have the adjusted overall wins and losses, we can find the adjusted win percentage. For ease of comparison, I assigned numbers to the different levels of rating as follows: <0.5 = 1, <0.6 = 2, <0.7 = 3, <0.8 = 4, <0.9 = 5, and <1.0 = 6.
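For anyone who would rather read code than an Excel formula, here is a rough Python sketch of the same calculation. The function names are made up for this example, and plugging in the conference's average nonconf AdjWin% as diff (0.637 for the Pac 10) is an assumption for illustration.

```python
def adj_win(wins, losses):
    """AdjWin from the earlier post: (1 + Wins) / (2 + Wins + Losses)."""
    return (1 + wins) / (2 + wins + losses)


def adj_win_difficulty(confwin, confloss, nonconfwin, nonconfloss, diff):
    """Weight conference results by conference difficulty, then apply AdjWin."""
    adj_wins = (1 + diff) * confwin + nonconfwin      # conference wins scaled up
    adj_losses = (2 - diff) * confloss + nonconfloss  # conference losses scaled by (2 - diff)
    return adj_win(adj_wins, adj_losses)


def rating_level(pct):
    """Map an adjusted win percentage to the 1-6 levels listed above."""
    for level, upper in enumerate((0.5, 0.6, 0.7, 0.8, 0.9, 1.0), start=1):
        if pct < upper:
            return level
    return 6  # exactly 1.0 is not covered by the list; treated as the top level (assumption)


# Arizona: 14-4 in conference, 16-4 out of conference.
# diff = 0.637 is the Pac 10 average nonconf AdjWin% (assumed to be the diff input).
arizona = adj_win_difficulty(14, 4, 16, 4, 0.637)
print(f"{arizona:.3f}", rating_level(arizona))
```

With those inputs Arizona comes out at roughly 0.79, which falls in level 4.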

With these levels applied to the teams that made it to the dance, we get matchup rankings. Here are some of the key ones from games that ended in upsets:

Seed   Team          Ranking
5      Vanderbilt    4
12     Richmond      5

4      Louisville    4
13     Morehead St   4

6      St. John's    3
11     Gonzaga       4

Most of the time the higher-seeded team won, which was reflected in my system (it predicted the entire first round of the West correctly).

Conclusion
So, now that a base has been established, I can start working on ways to fine-tune it. One such way that I'm trying is using points per possession and the four factors turned into an offensive rating. So far, what I have is pretty decent, in my opinion.

Something that should always be taken into account when dealing with statistics is error. We MUST always account for error and variability. In basketball terms, that means teams having off days or players fouling out early. So, with error taken into account, these predictions are not set in stone but rather are statistically strong. I have noticed that most of the stats are normally distributed, which would allow me to test what percentage of the time a team overperforms or underperforms.
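As a rough idea of what that test could look like, here is a small Python sketch using the standard library's NormalDist. The numbers are hypothetical and are not taken from any team's actual stats, and prob_above is just a name made up for this example.

```python
from statistics import NormalDist


def prob_above(mean, stdev, threshold):
    """Chance that a normally distributed stat comes in above the threshold."""
    return 1.0 - NormalDist(mu=mean, sigma=stdev).cdf(threshold)


# Hypothetical team: averages 1.05 points per possession with a
# standard deviation of 0.10 from game to game (made-up values).
mean_ppp = 1.05
sd_ppp = 0.10

# Chance of overperforming to at least 1.15 points per possession,
# i.e. one standard deviation above the team's average.
print(f"{prob_above(mean_ppp, sd_ppp, 1.15):.3f}")
```

Underperforming works the same way: the chance of falling below a threshold is just the cdf value itself.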
