I will pay out 100:1 on the following wager, should anyone like to make it: The Cardinals and the Reds will finish the season first and second (in either order) in the NL Central.

60:1 - The Rockies and Dodgers will finish 1st and 2nd (in either order) in the NL West

40:1 - The Phillies and Braves " " in the NL East

80:1 - The A's and Mariners " " in the AL West

25:1 - The Twins and Tigers " " in the AL Central

150:1 - The Yankees and Blue Jays " " in the AL East

Also, I will take - and solemnly promise to fulfill - 100,000:1 bets on the Devil Rays catching the Red Sox.

## Sunday, August 26, 2007

## Thursday, August 23, 2007

## Wednesday, August 8, 2007

### Are the yanks surprising?

"Derek, are we BAD AT BASEBALL? or just unlucky"

Clearly I have been completely fixated on the idea of the Pythagorean expectation and on figuring out whether a team's below-expected performance is due to random chance or to something else (insert favorite explanation here). I decided to get away from all of the statistics and harness the power of computing to replay the entire Yankee season, to see whether where we are now could be expected randomly or if something else is going on.

Here is how I replayed the season:

(1) Imagine the number of runs scored by the Yankees in a given game is entirely independent of the number of runs scored against the Yankees. This assumption holds up pretty well, as the correlation of those two numbers is 0.014 (0 is no correlation; 1 and -1 are the extremes).

(2) We now have two lists of numbers, Yankees scores and other team scores. Instead of the outcome we observed, we can randomly select a different Yankee score to match to the other team score. From those new scores, we can figure out which team won our pretend game (with ties split in half).

Imagine the Yankees played 3 games, with scores of 2-1, 5-3 and 3-4 (Yankees' score first), leading to 2 wins.

One replayed season might have the scores 3-1, 2-3, and 5-4, again 2 wins.

Another might have the scores 2-3, 5-1, and 3-4, which only would be 1 win.

Yet another might have the scores 3-3, 5-4 and 2-1, which would be 2 wins and 1 tie.
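
The replay procedure above can be sketched in a few lines of Python. This is only a minimal sketch, not the code actually used for the post: the function names `count_wins` and `replay_seasons` are made up for illustration, and it assumes each fake season is a shuffle of the real scores (every score used exactly once), which is what the three-game examples suggest.

```python
import random

def count_wins(team_scores, opp_scores):
    """Count wins for the team, with each tie worth half a win."""
    wins = 0.0
    for ours, theirs in zip(team_scores, opp_scores):
        if ours > theirs:
            wins += 1.0
        elif ours == theirs:
            wins += 0.5
    return wins

def replay_seasons(team_scores, opp_scores, n_seasons, seed=0):
    """Randomly re-pair the team's scores with the opponents' scores, n_seasons times."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    shuffled = list(team_scores)
    results = []
    for _ in range(n_seasons):
        rng.shuffle(shuffled)  # assumption: a permutation, not sampling with replacement
        results.append(count_wins(shuffled, opp_scores))
    return results

# The three-game example from the text (Yankees' score listed first).
yankees = [2, 5, 3]
opponents = [1, 3, 4]
print(count_wins(yankees, opponents))    # the games as actually played: 2.0
print(count_wins([3, 5, 2], [3, 4, 1]))  # the replay with a tie: 2.5

fake_seasons = replay_seasons(yankees, opponents, n_seasons=10_000)
print(min(fake_seasons), max(fake_seasons))
```

With a real season you would pass in the full lists of runs scored and runs allowed, one entry per game.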

Results:

I did this rerandomization 10,000 times with the scores from Yankee games this year.

Currently, they are 63-50. Most often in our 10,000 fake seasons, they were 70-43. This is almost dead on the Pythagorean expectation with the best possible exponent (if you didn't follow that, don't worry about it). But the likelihood of them winning 63 games or fewer in these random seasons is only 1.8%. That seems surprising. In fact, calculating a p-value to determine the likelihood that we would see performance that deviates so far from the mean, we find p = 0.04. Traditionally, in cogsci we reject the null hypothesis (here, that performance is just a random assignment of the Yankees' scores to opponents' scores) when p < 0.05. So we can conclude that there is something wrong with the Yanks (insert favorite theory here).
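
For anyone who wants to see how numbers like these fall out of a pile of fake seasons, here is a rough sketch. The helper `summarize` is hypothetical, the input list is toy data invented for illustration (not the post's simulation output), and the two-sided p-value is assumed to mean "the share of fake seasons at least as far from the mean as the real one", since the exact definition isn't spelled out above.

```python
from collections import Counter
from statistics import mean

def summarize(sim_wins, actual_wins):
    """Return (most common win total, lower-tail share, two-sided share)."""
    n = len(sim_wins)
    most_common_wins = Counter(sim_wins).most_common(1)[0][0]
    # Share of fake seasons with as few wins as the real season, or fewer.
    lower_tail = sum(w <= actual_wins for w in sim_wins) / n
    # Assumed definition: share of fake seasons at least as far from the mean.
    center = mean(sim_wins)
    gap = abs(actual_wins - center)
    two_sided = sum(abs(w - center) >= gap for w in sim_wins) / n
    return most_common_wins, lower_tail, two_sided

# Toy win totals, made up for illustration only.
toy_seasons = [68, 69, 70, 70, 70, 71, 72, 63, 77, 70]
print(summarize(toy_seasons, actual_wins=63))
```

Feeding in the 10,000 simulated win totals instead of the toy list would give the "most often" record, the 63-or-fewer likelihood, and a p-value in one go.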

"For some reason - we are awesome"

Arizona Diamondbacks

Actual: 63-51

Correlation btwn Snakes scores and Opponents scores = -0.1

Expected by best Pythagorean Estimate: 53.4 - 60.6

Most often seen in random seasons: 55-59, 56-58 (tied)

Likelihood of Actual given random seasons: ~1.1%

p-value = 0.02

Zona is better this season than we might expect by chance.

2006 Cleveland Indians

Actual: 78-84

Correlation btwn Tribe scores and Opponents scores = -0.014

Expected by best Pythagorean Estimate: 88.8 - 73.2

Most often seen in random seasons: 88-74

Likelihood of Actual given random seasons: ~0.5%

p-value = 0.01

The Indians really screwed up bad last year.

Finally, (sort of) my Orioles

Actual: 52-58

Correlation btwn Birds scores and Opponents scores = +0.10

Expected by best Pythagorean Estimate: 55-55

Most often seen in random seasons: 56-54

Likelihood of Actual given random seasons: ~10%

p-value = 0.22

Maybe we shouldn't be so surprised by the Os...

## Tuesday, August 7, 2007

## Monday, August 6, 2007

### A brief pause for basketball history... and then a BASEBALL STATS QUESTION

Did you know that a man whose legal given name was actually GOD SHAMMGOD played for the Washington Wizards (until then the Bullets) in 1997-98?

One might have thought he looked like this...

Unfortunately, he only looked like this...

My interest in baseball has been fairly low this season. My semi-beloved Orioles have taken quite a beating on this blog, but currently they are expected (Pythagoreanly) to be above .500 for the first time in about a decade. (In fact, based on their runs scored and runs against, they are the middlest team in baseball. Certainly not awesome, but they should not be described multiple times as a candidate for the WORST TEAM IN BASEBALL unless that candidate set is 15 teams long.)

*Coming right at you with eight wins in a row*

However, despite scoring more runs than they have given up, the Os are 6 games under .500. To the baseball statistical purist, they would be described as underperforming by chance, on the theory that their Pythagorean expectation is somehow a purer measure of how good they are than their win-loss record, à la last year's Cleveland Indians. The Indians won 12 fewer games than their runs for and against predicted. The Orioles are on pace to win 6 fewer games. More extremely, the Yankees are currently 9 games under expectation, while Arizona is a whopping 10 games OVER expectation.

This reminds me of a conversation I had with GF last year about the Indians. The Pythagorean expectation is all well and good as a general indicator of how well a team should perform. But there are certain predictors of win-loss record that it cannot capture. For example, suppose a team has 4 pitchers, two of whom have an ERA of 0 and two of whom have an ERA of 6, and the team scores exactly 4 runs a game. The team will obviously be a .500 team (the first two pitchers will win every game and the second two will lose every game). However, scoring 4 runs a game and allowing 3 on average gives a Pythagorean expectation of 4²/(4² + 3²) = .640, so over 160 games we would expect them to win 102. Their actual performance would be -22 from Pythagorean expectation, but we don't REALLY expect them to win 102 games. At what point is a massive deviation from Pythagorean expectation not just a possible deviation that could happen to any team, but actually something which could EXPLAIN SOMETHING ABOUT A SPECIFIC TEAM? Thoughts?
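
Here is a quick sketch of that arithmetic, for anyone who wants to poke at it. It assumes the classic Pythagorean exponent of 2 and a team that scores 4 and allows 3 runs per game on average, which is the combination that reproduces the 102-win figure. The function name `pythagorean_win_pct` is just for illustration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

games = 160
scored_per_game = 4.0   # the team scores exactly 4 runs every game
allowed_per_game = 3.0  # assumption: the split staff averages out to 3 allowed per game

expected_wins = games * pythagorean_win_pct(
    games * scored_per_game, games * allowed_per_game
)
actual_wins = games // 2  # half the starts are sure wins, half are sure losses

print(round(expected_wins))                # 102
print(actual_wins - round(expected_wins))  # -22
```

The formula only ever sees season totals, so it cannot tell this all-or-nothing staff apart from one that gives up 3 runs every single night.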

*"Eureka," says Pythagoras, "the Orioles suck only a little."*

## Sunday, August 5, 2007

### *

Could #755 have come more awkwardly? I feel like Bonds is an omnipotent overlord who will smite all humans if we don't celebrate the breaking of the record to his satisfaction.

### ENOUGH SERIOUS TALK

Lady and gentlemen, I give you sweet-swinging (.306/.368/.501) Eric Byrnes - for the LOLZ:

*"The sun was bad, but was it any worse than it's been the last 10,000 years? I'm gonna say no."*

--A's outfielder Eric Byrnes, on losing a ball in the sun, barely catching the next one hit his way, then watching two more fall in for doubles (2005)

*"There was no head contact, but I couldn't stop smacking kids around in tournaments, so I kept getting disqualified."*

--Diamondbacks outfielder Eric Byrnes, on when he took karate as a kid (2006)

*"Dude, I was like John McEnroe. I was good, but I threw my racket all the time. Once, I nailed the kid on the other side of the net. I never hit the judge, but I did fire some balls at him."*

--Byrnes, on his tennis career as a kid (2006)

*"The greatest part about it is it went into someone's beer. My initial reaction was to send one of the kids in the clubhouse up there to buy them a new beer. Then I found out it was a Dodger fan."*

--Snakes outfielder Eric Byrnes on a three-run homer of his (2006)

"I didn't realize how many degenerates sit there at home and watch television and surf the Internet and look for ways to belittle people. People should not be concentrating on what I'm wearing and what my hair looks like. They should be concerned with my flow and the knowledge coming out of my mouth."

--Byrnes, in response to television viewers critical of his haircut (2006)

"I don't get much sleep anyway. Advantage: Byrnes"

--Eric Byrnes, when asked if day games after night games were a problem for him (2007)

Bonus roffles:

## Friday, August 3, 2007

### Beasts of the NL East

I'm still not sold on the Phillies. They are the NL's best hitting team, far ahead of the rest of the league with 592 runs scored. But that's only half the game: you have to be able to pitch too, and the Phillies are thirteenth in the NL with a team ERA of 4.74, which is hardly promising for a potential division champ.

The Mets, meanwhile, rank second in ERA at 3.86 and sixth in runs with a respectable 503 (hardly "lackluster"). I'm not sure where Ben's coming from when he complains the Mets don't hit for average and strike out too much - the Mets team average is .271 with 677 K's, which compares well with the Phils' .279/777 K's. It's true that Delgado has been close to "dead weight" - his line of .249/.322/.434 is pretty bad for a 1B on a contending team - but the rest of the team has picked up the slack. Beltran's line of .263/.340/.483 is a little disappointing by his standards, but is still valuable for a CF I think. And as for the Mets' starters: even if none of them are real intimidating, at least they're generally competent. That might not win them a playoff series, but it should carry them through the regular season.

So when Cole Hamels isn't throwing, the Phillies HAVE to win by outslugging their opponent, whereas the Mets and Braves (542 runs/4.12 ERA) each get it done both on the mound and at the plate. Ben predicts that the Phils are going to come out on top thanks to their ability to win high-scoring games like their victory over the Cubs last night. He tries to make that sound like a virtue, but to me the more fundamental problem is that the team is routinely getting into games like that.

Taking a look at the standings, the NL East contenders in fact have almost identical run differentials (NYM: 46; PHI: 45; ATL: 44), suggesting that all three are actually pretty close in terms of how good they "really" are. The difference, though, is that the Mets have a four-game lead with only 54 left to play. As the teams below them aren't markedly better, I expect the Mets are going to hold on.

"See you in October"

## Thursday, August 2, 2007

### Trade deadline madness

Now that my brain has recovered from the completely insane Matt Morris trade, here are some thoughts on what's happened:

Ben, the Braves are clearly going for it this year, which makes sense I think. The division is still within reach, and there's not much time left for the core of this Braves team: Larry keeps getting dinged up, Andruw may no longer be around next season, and who knows how much longer Smoltz is going to be effective. With the moves they made, their bullpen is much better, and even if Tex doesn't hit like he has in Texas, he's still a major upgrade for their lineup - I think I read that Atlanta was ranking dead last in the majors in RBIs by their first basemen. You said that Tex has 8 HRs in 144 ABs away from Arlington - sure, that ain't Ruthian, but it still adds up to 30 HRs over 550 ABs. A lot of teams could use that. I'm not really familiar with any of the prospects the Braves traded away besides Salty, so I can't assess the value they gave away. But the Braves filled many of their immediate needs, and they now look to me like a better bet than the Phils to unseat the Mets.

As for Gagne: he's not the pitcher he was pre-injury, and initially there was speculation that he would share the closer role, which had me worried about a clash of egos between him and Papelbon. But seeing as Gagne has explicitly agreed to a set-up role I guess I like the trade. He's still pretty good, he's got experience, and the Sox didn't give up anyone too important in order to get him. He can take some innings/pressure off of Okajima while providing insurance in case Papelbon gets hurt. Plus he doesn't pitch for the Yankees now.

The Proctor-Betemit deal was a good one for the Yanks for sure. They got a useful bench player for 07/possible third baseman for 08 in return for a pitcher Torre has run into the ground. I think Betemit was pretty expendable (isn't someone else already tabbed as 3B of the future for LA?) but I'm still surprised the Dodgers were willing to accept Proctor of all people in exchange for him.

Luis Castillo to the Mets: I would have been okay with running Ruben Gotay out there every day but I guess this makes both management and fans feel like something (anything) was done at the deadline. What's way more important is getting Beltran (not to mention Pedro) back from the DL.

Kyle Lohse and Tad Iguchi to the Phillies: Iguchi is obviously a stopgap and Lohse is a very (very) minor upgrade. Pretty 'whatever', especially compared to who the Braves acquired and who the Mets will get back from injuries. Also I see Lohse is out of today's game after only one IP/one ER - not sure why yet, but for whatever reason, that ain't good.
