March 22, 2010

Model update (updated!, updated again)

Filed under: Random,Technical — cec @ 9:09 pm

Earlier, I posted my current model for predicting the NCAA tournament. Since the whole thing is probabilistic, I figured that I would test it out against the current NCAA standings. I considered four models:

The one that I described
A random selection of which team would win (50/50 chance)
Always picking the top seeded team
A model suggested by a colleague at work

For each model, I ran 10,000 tests and compared them to the current NCAA tournament results, counting the scores for each test. Results are:

The X axis is the score (0-64 at this point), the Y axis is the number of test runs (out of 10k) that achieved that score. The number in the legend is the expected value (score) for each model. As you can see, my model had the second highest expected value. Choosing the top seeded team was the best [see the 2nd update]. Choosing randomly was the worst [see update], and my colleague’s model (cyan) was between my model and the random model. Not bad. I’ll update after the next two rounds of the tournament.
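For anyone who wants to try the same kind of test, here is a rough sketch of the procedure described above, not the code I actually ran. It assumes one point per correctly picked game (the real scoring may differ), the games list is made up, and the names score_distribution and coin_flip are just illustrative. Only the 50/50 model is shown; any other picker with the same signature can be swapped in.

```python
import random
from collections import Counter

def score_distribution(pick_winner, games, n_runs=10_000, seed=0):
    """Run a picker n_runs times against known results.

    games is a list of (team_a, team_b, actual_winner) tuples.
    Scoring is an assumption: one point per correctly picked game.
    Returns (histogram of scores, expected score).
    """
    rng = random.Random(seed)
    histogram = Counter()
    for _ in range(n_runs):
        score = sum(
            1
            for team_a, team_b, actual in games
            if pick_winner(team_a, team_b, rng) == actual
        )
        histogram[score] += 1
    expected = sum(s * count for s, count in histogram.items()) / n_runs
    return histogram, expected

def coin_flip(team_a, team_b, rng):
    """The 50/50 model: pick either team with equal probability."""
    return team_a if rng.random() < 0.5 else team_b

# Made-up results, purely for illustration.
games = [("A", "B", "A"), ("C", "D", "D"), ("E", "F", "E"), ("G", "H", "H")]

histogram, expected = score_distribution(coin_flip, games)
for score in sorted(histogram):
    print(score, histogram[score])
print("expected score:", expected)
```

The histogram is what goes on the plot (score on the X axis, number of runs on the Y axis), and the expected score is the number shown in the legend.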

Update: one interesting thing is that this suggests that there was still a lot of luck in my ESPN pick. Only about 0.5% of my model runs were as good as that one.

Update 2: So, I’m lying in bed when it occurs to me that I’m an idiot… the team with the *lowest* seed wins a game in Model 3. This is why I say I don’t really know basketball.


They laughed at my theories!

Filed under: Random,Technical — cec @ 3:09 pm

They laughed at my theories. They threw tomatoes when I presented my paper at the academy! Tomatoes I tell you! My minions cower in terror, shrinking in fright from the very ideas contained herein! But I will show them! I will PROVE IT TO THEM ONCE AND FOR ALL. The FOOLS, I WILL DESTROY THEM!! MWAHAHAHAAAA! (ask me how)

Oh, sorry. Where was I? Apparently, there’s this basketball thing going on. Some sort of NCAA tournament that will prove who has the best basketball team. But what if it doesn’t? What if it’s all just arbitrary? Could it be that the chances of any team winning a game are not deterministic, but rather stochastic? I’ll admit that I don’t know that much about basketball. I mean, I played the sport in junior high. I do know the rules. And I even think that it’s a pretty game. But I don’t follow the ins and outs of a particular season.

So what’s a guy to do when he doesn’t really follow basketball, but lives in NC, where bball is life, and it’s bracket time?

You model it. Which is exactly what I did.

The basic model:

Compute a team’s wins minus their losses. I’m sure there’s a word for this, but let’s call it demonstrated strength (D)
For a given match-up, take a draw from a Beta distribution parameterized by each team’s demonstrated strength (D1 and D2)
The resulting draw is the probability that the team representing the first parameter wins
Draw from a uniform random variable to predict if that team actually will win
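The steps above can be sketched in a few lines of Python using the standard library’s random.betavariate. This is a minimal sketch rather than my actual code: the function names and the win/loss records are made up, and it assumes both demonstrated strengths are positive, since Beta parameters have to be greater than zero.

```python
import random

def demonstrated_strength(wins, losses):
    """Wins minus losses: the D used in the steps above."""
    return wins - losses

def simulate_game(d1, d2, rng):
    """Return True if the first team wins one simulated game."""
    if d1 <= 0 or d2 <= 0:
        # Beta parameters must be positive. How to handle losing records
        # is not covered here, so this sketch simply refuses them.
        raise ValueError("demonstrated strengths must be positive")
    # Draw the probability that the first team wins from Beta(D1, D2).
    p_first_wins = rng.betavariate(d1, d2)
    # Draw a uniform number to decide whether that team actually wins.
    return rng.random() < p_first_wins

rng = random.Random(2010)  # fixed seed, only so the sketch is repeatable

d1 = demonstrated_strength(25, 15)  # made-up record, D1 = 10
d2 = demonstrated_strength(30, 10)  # made-up record, D2 = 20

n_games = 20_000
wins = sum(simulate_game(d1, d2, rng) for _ in range(n_games))
print(f"first team won {wins / n_games:.1%} of simulated games")
```

Filling out a bracket is then just a matter of calling simulate_game for each match-up and advancing whichever team comes out ahead.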

There are some flaws with the model, the two obvious ones:

Different teams have different schedules, so one team with a 30-5 record might be a lot better than another with a 30-5 record in a different conference (I’m looking at you SEC)
It’s not clear that you should parameterize directly on the demonstrated strengths. There should probably be a scaling factor in there, so that rather than drawing from Beta(D1, D2), you would draw from Beta(alpha*D1, alpha*D2)

But this is close enough. The nice features of the model are:

The expected probability that a team will win is D1/(D1+D2), the mean of Beta(D1, D2). So, a team whose wins outnumber its losses by 10 will have an expected probability of winning of 50% when playing against another team with D2=10, and only a 33% chance of winning when playing against someone with D2=20
The closer two teams’ demonstrated strength is to zero, the broader the probability distribution is. This reflects added uncertainty for two teams who win only slightly more often than they lose.
The larger two teams’ demonstrated strength is, the narrower the probability distribution is. For example, D1=20, D2=40 has the same expected probability as D1=10, D2=20; but because this is a more common pattern for the two teams, we don’t have the same variance.
This is actually pretty rigorous in Bayesian terms. Throughout the season, we can update the posterior distribution of the probability of winning based on the prior distribution and the most recent game.
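To make the mean and variance claims concrete, here is a small check using the standard closed-form mean and variance of a Beta(a, b) distribution. The helper names are my own, and the alpha values are arbitrary; they only show the direction of the scaling effect, not a fitted value.

```python
def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)

def beta_variance(a, b):
    """Variance of Beta(a, b)."""
    return (a * b) / ((a + b) ** 2 * (a + b + 1))

# Same expected probability, different spread.
for d1, d2 in [(10, 20), (20, 40)]:
    print(
        f"D1={d1}, D2={d2}: "
        f"mean={beta_mean(d1, d2):.4f}, variance={beta_variance(d1, d2):.5f}"
    )

# Scaling both parameters by alpha leaves the mean alone and only
# changes the spread (arbitrary alpha values, for illustration).
for alpha in (0.5, 1.0, 2.0):
    a, b = alpha * 10, alpha * 20
    print(
        f"alpha={alpha}: "
        f"mean={beta_mean(a, b):.4f}, variance={beta_variance(a, b):.5f}"
    )
```

Doubling both strengths keeps the mean at one third but roughly halves the variance, which is the narrowing described above.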

So, how well does the model work? Good question. I used it on ESPN, and it’s currently ranked in the 92.9th percentile, i.e., better than almost 93% of all ESPN brackets. All of my final four teams are still alive, and in general, the model predicted several of the biggest upsets in the tournament (e.g., Murray State vs Vanderbilt!). That said, this is just one random draw from the model. To test it further, I would like to go through a whole season of games and figure out if the probabilities of winning correspond to the statistics of a Beta distribution for the game’s D1 and D2. Moreover, I would like to infer the alpha parameter that I mention above.

If the model appears accurate, and we can properly infer alpha, then we get a probabilistic assessment of how feasible it is to even pick tournament champions. It may just be that at the end of the day, it comes down to luck.


Holy crap… yes we did!

Filed under: Uncategorized — cec @ 9:33 am

Almost three months ago to the day, I wrote about the Senate health care reform (HCR) bill, how they had achieved cloture and would vote on Christmas Eve. Since December, things haven’t looked all that good for HCR. A weak candidate in Massachusetts lost to a Republican underwear model (not that there’s anything wrong with that), and the Democrats started doing the Democratic thing, which mostly consists of herding all of the cats into a circle and giving them guns to take shots at each other [1]. At one point in late January, the chances of any sort of HCR passing were very close to zero (Intrade was giving it around 22%).

Since then, President Obama has gotten more involved, and Nancy Pelosi (who, love her or hate her, will go down in history as one of the most effective Speakers of the House in recent memory) started working on her colleagues and the odds went up significantly. In the past week, it looked almost certain that the House would pass the Senate’s bill and then fix the worst budgetary issues in reconciliation. It was looking so certain that the ignorant cretins in the teaparty were out in force, spitting and hurling racist and homophobic comments at legislators.

In spite of all that happened, last night the House did vote to pass the Senate bill. Then came a Republican motion to recommit the reconciliation bill in an effort to spike the whole thing by driving a wedge between the pro-choice and anti-abortion wings of the Democratic party. That failed after Bart Stupak gave an impassioned speech saying that he believed that the current Senate language plus the president’s executive order did uphold the Hyde amendment and that the bill was pro-life. In his words, the bill was pro-life because it not only protected children before they were born, but it helped to ensure that their mothers received pre- and post-natal care, that the children would have insurance and that we know that children and families with insurance are healthier than those without.

Over the past few months, I’ve called Stupak a wanker on more than one occasion, but last night he stepped up and helped to pass health care reform for everyone. After the vote to recommit, I went to bed (it was after 11pm and I was a bit tired), but the reconciliation bill was voted upon and also passed!

What’s next? Well, the Senate will probably pass the reconciliation bill today. That will clean up the crap that they had to stick into the bill in order to overcome a Republican filibuster. The President will sign the bill Tuesday. Then we’ll start seeing some changes. The bill will begin to close the doughnut hole for drug coverage that the Republicans put into Medicare Part D. It will begin to limit the insurance companies’ ability to shaft policy holders. And by 2014, we’ll see the mandate that everyone must have insurance coverage, even if it is subsidized for the poor. Sometime between now and 2014, Democrats will hopefully start to improve the bill. We still may not get to single payer any time soon, but we might get a public option.

From my standpoint, not too much will change. I’ll continue to receive insurance through my company. The Congressional Budget Office (CBO) projects that my company’s costs for insurance will go down about 3%. Best of all, I stop having to worry about losing insurance if I lose my job or decide to change jobs. Hell, this even gives me some freedom to consider starting my own business without worrying as much about how to afford health insurance. All in all, passing HCR was an amazing effort and I’m proud to have watched it happen.

[1] FWIW, this is why I still consider myself to be an Independent, even though I almost always vote Democratic – the Democrats are just too fearful of the political consequences of their own popular platform planks?! Personally, I prefer a much more muscular liberal set of policies than the Democrats are usually willing to consider… even if they agree that those policies would be better for the country.


Alkahest my heroes have always died at the end
