Stat Head<br />Brad Stewart<br /><br />AB/HR and Batting Average Composition<br />Published 2008-04-09<br /><br />In our past two articles, we’ve discussed the importance of Contact Rate and Batting Average on Balls in Play (BABIP). Today, I’d like to talk about the third component of batting average and how you can put all three together to calculate batting average.<br /><br />The third component is home run rate. There are other statistics we can use to measure it (which we’ll discuss at a later date), but in the context of batting average, the simplest is At-Bats per Home Run (AB/HR). It’s calculated, as you can imagine, by dividing at-bats by home runs.<br /><br />You might not have thought that power has much to do with batting average. After all, a guy like Luis Castillo has a .294 career batting average while hitting just 24 home runs in a career that dates back to 1996.<br /><br />Think about it, though. Contact rate measures the rate at which balls are either put into play or hit for home runs. BABIP measures the rate at which balls in play become hits. AB/HR measures the rate of the one kind of contact that is not a ball in play: the home run. Home runs, while more damaging than singles, are still hits and are therefore included in batting average.<br /><br />Batting average is simply hits divided by at-bats, and home runs are included among those hits.<br /><br />I won’t go into the details yet of how we arrive at our AB/HR number (that’s more of a power discussion than a batting average one), but you can get a general sense of a hitter’s skill in this regard by looking at his history of AB/HR.<br /><br />I’ve developed a very simple, intuitive tool in Excel to help you calculate batting averages easily. 
I think examining it will help shed a lot of light on this, so I ask that you download it now and look at it while reading. Scroll to the bottom of the post to download it.<br /><br />Here is a screenshot of it:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mlbfrontoffice.com/uploaded_images/Pujols-Batting-Average-Composition-731800.jpg"><img style="cursor: pointer;" src="http://www.mlbfrontoffice.com/uploaded_images/Pujols-Batting-Average-Composition-731792.jpg" alt="" border="0" /></a><br /><br />Click on it to enlarge. To use the tool, all you have to input is what's highlighted in yellow: the hitter's At-Bats, Contact Rate, AB/HR, and BABIP. The tool does everything else.<br /><br />And as you'll notice, everything matches up perfectly. Pujols did hit .327 last year with 32 home runs and 185 hits.<br /><br />You can go through the formulas yourself, but I'll quickly explain how it works. When you input the ABs and Contact Rate, it multiplies the two together to get the number of contacted balls (BIP+HR). You then put in AB/HR, and it calculates HR by dividing AB by AB/HR. You then input BABIP, and it multiplies it by BIP (which it calculates by subtracting HR from BIP+HR) to get Hits on Balls in Play (H on BIP). It then adds this to HR to get total hits (H), and divides hits by at-bats to get batting average!<br /><br />Very simple, yet I'm sure a lot of you had never thought about batting average in this way before. All we're doing is taking the components of batting average and combining them!<br /><br />Once you have the tool, you can play around with the numbers to see what guys would hit if they had different skills. Maybe you think Pujols was lucky with BABIP last year and you want to see what he would hit if his BABIP were just .300. Just input it into the sheet, and the calculations are done for you. 
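<br /><br />For anyone who would rather read the arithmetic than open the spreadsheet, the same steps can be sketched in a few lines of Python. This is only an illustration of the calculation described above, not the formulas in the Excel file. The function name <code>component_batting_average</code> is made up for this sketch, and the 565 at-bats and 58 strikeouts used to derive the inputs are assumed 2007 totals for Pujols that the post itself does not quote (the 32 home runs and 185 hits are from the post).<br /><br />

```python
def component_batting_average(at_bats, contact_rate, ab_per_hr, babip):
    """Rebuild batting average from its three components, step by step."""
    contacted = at_bats * contact_rate      # BIP + HR: every ball the hitter made contact with
    home_runs = at_bats / ab_per_hr         # HR
    balls_in_play = contacted - home_runs   # BIP
    hits_on_bip = babip * balls_in_play     # H on BIP
    hits = hits_on_bip + home_runs          # H
    return hits / at_bats


# Assumed inputs: 565 AB and 58 strikeouts are NOT stated in the post;
# the 32 HR and 185 hits are the figures the post quotes.
AB, K, HR, H = 565, 58, 32, 185
contact_rate = (AB - K) / AB          # share of at-bats not ending in a strikeout
ab_per_hr = AB / HR                   # at-bats per home run
babip = (H - HR) / (AB - K - HR)      # hits on balls in play / balls in play

actual = component_batting_average(AB, contact_rate, ab_per_hr, babip)
adjusted = component_batting_average(AB, contact_rate, ab_per_hr, 0.300)

print(f"{actual:.3f}")    # average with the derived BABIP
print(f"{adjusted:.3f}")  # average if BABIP is swapped for .300
```

The second call leaves contact rate and AB/HR alone and changes only the BABIP input to .300, which is the adjustment described above.<br /><br />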
You should get a .309 average with that BABIP adjustment.<br /><br />I believe that this will be an invaluable tool not only when coming up with batting average projections in the off-season but during the season as well. Hanley Ramirez will not have a .600 BABIP all year and Mark Reynolds will not have an AB/HR of 6. Use this tool throughout the year to make adjustments and get an expectation for what a player should hit going forward.<br /><br />If you have any questions about this, feel free to <a href="mailto:THTFantasyFocus@gmail.com">send me an e-mail</a>!<br /><br />Download the tool:<br /><a href="http://www.mlbfrontoffice.com/uploads/Component%20Batting%20Average%20Calculator.xls">Component Batting Average Calculator.xls</a><br /><br />Posted by Derek Carty<br /><br />Batting Average on Balls in Play (BABIP) for Hitters<br />Published 2008-03-26<br /><br />There are three primary factors that drive a player’s batting average. Last week, we discussed the first: contact rate. Today, I’d like to talk about Batting Average on Balls in Play (BABIP). You may recall me talking about this before for pitchers, but today I’d like to talk about it in the context of hitters. <p class="MsoNormal">Contact rate measures how often a player puts the ball into play. Naturally, the more balls are put into play, the more have the opportunity to fall for a hit. And that’s where BABIP comes in. 
BABIP measures the percentage of these balls that actually do fall for hits. It is calculated like this:</p><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mlbfrontoffice.com/uploaded_images/BABIP-formula-743821.jpg"><img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer;" src="http://www.mlbfrontoffice.com/uploaded_images/BABIP-formula-743665.jpg" alt="" border="0" /></a><p class="MsoNormal"><br /></p><p class="MsoNormal"><br /></p><p class="MsoNormal"><br /></p><p class="MsoNormal">First, let’s look at Jake Peavy, one of the best pitchers in baseball.</p><p class="MsoNormal"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mlbfrontoffice.com/uploaded_images/Jake-Peavy-BABIP-789861.jpg"><img style="cursor: pointer;" src="http://www.mlbfrontoffice.com/uploaded_images/Jake-Peavy-BABIP-789852.jpg" alt="" border="0" /></a><br /></p><p class="MsoNormal">If you’ll recall, we determined in a previous column that the BABIP of a pitcher generally regresses towards league average – around .300. Some pitchers have the ability to do a little better than this (and some worse), but not by much. Peavy seems to regress towards .300. He never lands exactly on .300, but that is the nature of BABIP; it is prone to luck and fluctuations, and his figures fluctuate around .300, just as we would expect.</p> <p class="MsoNormal">With hitters, though, you can’t employ such a strict regression. It is true that there is luck involved anytime you deal with balls in play. All it takes is an outfielder positioned a couple of feet in the wrong direction, or the good fortune of seeing Manny Ramirez out in left field a few more times than everyone else, for an otherwise probable out to turn into a hit. When examining hitters, 
though, it becomes obvious that hitters do not follow the same set of rules for BABIP that pitchers do. Let’s look at some examples.</p><p class="MsoNormal"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mlbfrontoffice.com/uploaded_images/Derek-Jeter-BABIP-796801.jpg"><img style="cursor: pointer;" src="http://www.mlbfrontoffice.com/uploaded_images/Derek-Jeter-BABIP-796791.jpg" alt="" border="0" /></a></p> <p class="MsoNormal">There is some definite fluctuation, but notice that the fluctuation does not occur around .300, as it did with Peavy. In fact, Jeter’s BABIP never went below .317, and I would be confident in saying that even that low mark likely involved more than a reasonable share of bad luck. Instead of fluctuating around .300, it looks as though Jeter’s BABIP fluctuates around a much higher number. If we take a straight, unweighted average of these BABIPs, we get .358.</p> <p class="MsoNormal">Now let’s look at Neifi Perez.</p><p class="MsoNormal"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mlbfrontoffice.com/uploaded_images/Neifi-Perez-BABIP-722436.jpg"><img style="cursor: pointer;" src="http://www.mlbfrontoffice.com/uploaded_images/Neifi-Perez-BABIP-722422.jpg" alt="" border="0" /></a><br /></p><p class="MsoNormal">He looks much different than Jeter. The only thing they have in common is that neither ever really got very close to .300. Unlike Jeter’s, though, Neifi Perez’s BABIPs are consistently, significantly lower. He seems to regress to .270 or so.</p> <p class="MsoNormal">It is apparent that every hitter has his own distinct hitting ability and that each regresses to his own unique BABIP. It can sometimes be difficult to pinpoint what 
that number is (I generally use a three-year weighted average as a starting point), but that’s the nature of the beast. Some hitters simply hit the ball harder or have a better technique.</p> <p class="MsoNormal">Let’s look at one more player.</p><p class="MsoNormal"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mlbfrontoffice.com/uploaded_images/Jorge-Posada-BABIP-737179.jpg"><img style="cursor: pointer;" src="http://www.mlbfrontoffice.com/uploaded_images/Jorge-Posada-BABIP-737168.jpg" alt="" border="0" /></a></p>Anything look out of place? From 2002 to 2006, Posada’s BABIP hovered around .305 or .310. Then in 2007, Posada put up a .389 BABIP. This is not a product of skill. This is a product of extreme luck, and by the laws of statistics, there will always be a few players who do this every single year. Taking a standard deviation approach, roughly 5% of players in a normal distribution will be two standard deviations or more away from the mean. In 2007, Posada was one such player.<br /><br />Don’t be fooled when someone does this. Don’t be tricked into thinking that he has established a new hitting baseline. It’s possible that he has, but don’t risk your fantasy season on it. Expect regression, and wait another year or two to find out for sure – most times it will simply be luck at play. In the case of Posada in 2008, expect some serious regression.<br /><br />Sometimes, we don’t have the luxury of years of major league BABIPs to judge players by. We see a guy like Reggie Willits post a .363 BABIP or Jack Cust post a .366 BABIP or Chris Young post a .260 BABIP, and we don’t know what to think. In this case, look at the player’s minor league 
BABIPs and the respective Major League Equivalencies (a topic I discussed a few weeks ago). This should give you extra data to work with to see if a high (or low) BABIP is actually warranted or a function of luck.<br /><br />Here is a list of hitters from 2007 with BABIPs that stray far from league average. I think it’ll be useful for you guys to go through the past few years for these guys and see if you can tell which are due to regress and which are for real. Let me know if you have any questions.<p class="MsoNormal"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mlbfrontoffice.com/uploaded_images/hitter-BABIP-leaders-796894.jpg"><img style="cursor: pointer;" src="http://www.mlbfrontoffice.com/uploaded_images/hitter-BABIP-leaders-796839.jpg" alt="" border="0" /></a></p> <p class="MsoNormal"> </p> <p class="MsoNormal"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mlbfrontoffice.com/uploaded_images/hitter-BABIP-trailers-747022.jpg"><img style="cursor: pointer;" src="http://www.mlbfrontoffice.com/uploaded_images/hitter-BABIP-trailers-747005.jpg" alt="" border="0" /></a></p><p class="MsoNormal">Once you decide whether a player is going to regress, realize that this will have a significant effect on batting average. I’ll show you next week how to figure out the exact effect, but for now, just know that as a player’s BABIP regresses downward, his batting average will go down. When the BABIP regresses upward, his batting average will go up.</p> <p class="MsoNormal">Also, be sure to track BABIP throughout the season. If a player is hitting .350 through April with a .400 BABIP, realize that he is not a .350 hitter and will come down. If you happen to own this player, now might be a good time to trade him.</p> <p class="MsoNormal">If 
you have any questions, feel free to <a href="mailto:THTFantasyFocus@gmail.com">send me an e-mail</a>.</p>Posted by Derek Carty<br /><br />Contact Rate For Hitters<br />Published 2008-03-05<br /><br />Hey guys. Sorry for the long hiatus. There’s been a lot going on. Anyway, today we’re going to talk about contact rate for hitters. Contact rate is a very simple stat to calculate, but it is an enormously valuable tool. Its primary use is in evaluating a hitter’s batting average.<br /><br /> Contact rate, essentially, measures the percentage of at-bats in which a hitter puts the ball into play, or how often he doesn’t strike out. Logically, the more balls a hitter puts into play, the more have the opportunity to fall for a hit. If the ball isn’t put into play (i.e. if the hitter strikes out) it has zero chance of becoming a hit. Therefore, hitters who have high contact rates are better bets to have high batting averages. Here is the formula for contact rate:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty24-752334.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty24-752321.bmp" border="0" alt="" /></a><br />Most sites don’t provide contact rate, but some do provide strikeout rate (K%). If you find a site that has this stat – FanGraphs, for example – you can simply use the following formula, which should be pretty easy to do in your head, as long as the site calculates K% per at-bat rather than per plate appearance.<br /><br />Contact Rate = 1 - Strikeout rate<br /><br /> Check out this table. 
It takes the aggregate stats for all hitters from 2004 to 2007, broken down by contact percentage, and the corresponding batting average.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty25-784668.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty25-784659.bmp" border="0" alt="" /></a><br />That should give you a pretty good idea how important contact rate is when forecasting batting average. There are other components, which we’ll talk about in the coming weeks, but contact rate is a big one. Here is a list of the hitters from 2007 with at least a 90% contact rate and 200 at-bats, along with their corresponding batting averages.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty26-735805.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty26-735765.bmp" border="0" alt="" /></a><br />Again, we see that the majority of hitters with good contact rates also have very good batting averages. As I said, there are other factors involved (one of which is luck, by the way), but the guys with excessively low batting averages are candidates for increases in 2008.<br /><br /> Let’s actually single these guys out. Here is a list of all hitters from 2007 with at least 300 at-bats, a contact rate above 85%, and a batting average lower than .280.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty27-789818.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty27-789755.bmp" border="0" alt="" /></a><br />All of these guys are candidates for further analysis. 
The majority of them will see their batting averages rise in 2008.<br /><br /> In fact, from 2004 to 2007, there were 61 players who had at least 300 at-bats in both Year 1 and Year 2 and who, in Year 1, had a contact rate above 85% and a batting average lower than .280. Of those 61 players, 45 saw their batting averages rise the following year. That’s 74% – nearly three quarters – without taking any other factors into account! 22 of these players (36%) raised their averages by at least 20 points, and 5 more raised them by at least 18 points (44% combined). 18 players raised their averages by at least 10%.<br /><br /> One of the greatest things about contact rate that I’ve yet to mention is how stable it is. Some of the stats we’ll look at in the future fluctuate so much that it makes them difficult to predict. Contact rate isn’t like that.<br /><br /> Here are the results of a regression analysis done on year-to-year contact rate using all hitters from 2004 to 2007 with at least 350 at-bats in both Year 1 and Year 2.<br /><br />Correlation Coefficient: 0.85 <br />R<sup>2</sup>: 0.73 <br />Adjusted R<sup>2</sup>: 0.73 <br />P-value: 1.2E-131 <br />Level of Significance: 1% <br /><br /> Those are huge numbers! Unless a hitter dramatically changes his approach, his contact rate isn’t going to move much more than three or four percentage points from year to year.<br /><br /> Closing up, I think it’s pretty clear what a large impact contact rate has on batting average. When evaluating hitters, I would highly recommend taking it into consideration.<br /><br />Posted by Brad Stewart<br /><br />Minor League Equivalencies<br />Published 2008-01-23<br /><br />Brad Stewart asked if I would talk to you guys about Major League Equivalencies this week. I thought it was a great idea, so here I am. If you’ve never heard of MLEs before, you’re in for a treat. 
When people project younger players, I’ve seen several approaches taken that really put them at a disadvantage: approaches that don’t allow the projection to be as accurate as it could be.<br /><br /> The first is taken when a minor league player who doesn’t have any major league experience, like an Evan Longoria or a Jay Bruce this year, is downgraded to such an extreme extent that the projection calls for him to have, essentially, no fantasy value at all. The projector says that because the player doesn’t have any major league at-bats and because prospects are so risky, it would be irresponsible to give him anything but an extremely conservative projection.<br /><br /> Similarly, the second is when someone projects a player going into his sophomore year using only his rookie numbers, because “those are the only major league numbers we have of him.”<br /><br /> The third is when minor league numbers are used, but they are used at face value or with only subjective adjustments made in the projector’s head.<br /><br /> If you have ever taken one of these three approaches, you’ve done yourself a major disservice. While it’s true that many minor leaguers never make it to the majors, or fail once they do, and that minor leaguers are a little trickier to project than major leaguers, it can be done. And that’s where Major League Equivalencies enter the picture.<br /><br /> Major League Equivalencies were first introduced by Bill James – often thought of as the godfather of sabermetrics and current Senior Baseball Operations Advisor to the World Champion Boston Red Sox – in his 1985 Baseball Abstract. He outlined his method for hitters in the Abstract, and since then MLEs have really taken off.<br /><br /> Let’s take a step back, though, and explain what MLEs are. MLEs, essentially, approximate the numbers a given minor league hitter would have produced at the major league level. This is not to be confused with a projection. 
It serves the same purpose as, say, the 2007 stat line of Alex Rodriguez. You wouldn’t take A-Rod’s 2007 line and use it as a projection for 2008. After the minor league line is translated, though, it can be looked at on the same plane as major league stat lines. Projections can be made from there, the same as you would project major league players.<br /><br /> You don’t need to worry about the specifics of creating MLEs. Others have already done that for you. I’m sure you’d like to know a little about the concepts used to produce them, though. To simplify it, I’m going to put them in list form. What MLEs essentially do to get the final adjusted line is this:<br /><br />· Adjust for the quality of the league, or the run environment<br /><br />· Adjust for the minor league and major league ballparks<br /><br />· Adjust for competition<br /><br /> Adjusting for the competition is the tricky one. What you need to do is look at the stats of every player who has moved between levels within a season and weight them based on playing time at each level.<br /><br /> Another thing you need to know about MLEs is that a Triple-A MLE will be much more accurate than, say, a Low-A MLE. This is because far more players jump from Triple-A to the majors within a season than from Low-A, so when you adjust for competition at the lower levels you have a far smaller sample size to judge by.<br /><br /> To increase the sample size, then, you need to develop “intermediate relationships,” as Jeff Sackmann of MinorLeagueSplits put it. That is, you look at players who jumped from Single-A to Double-A, and then at players who jumped from Double-A to either Triple-A or the majors. While this solves the small sample size issue, chaining the adjustments together still costs some accuracy.<br /><br /> Double-A and Triple-A MLEs are quite reliable, but once you get into A-ball or short season leagues you need to be very cautious. 
Of course, in most fantasy leagues, you’re not going to be projecting too many guys who played in the Gulf Coast League the year before, so this shouldn’t be a major concern. It does occasionally come into play, though, so you need to be aware of it. Take Cameron Maybin as an example. He had just 20 at-bats at Double-A and none at Triple-A before being called up to the majors, where he had just 49 at-bats. Be very cautious when projecting him for 2008.<br /><br /> Major League Equivalencies can also be applied to players coming over from Japan. Because almost all of these moves take place between seasons, though, you need to add one additional step to the process: adjusting for age.<br /><br /> Okay. At this point, you might be asking: “Where can I find MLEs?” As I said earlier, people have already created them for you. MinorLeagueSplits.com is one of my favorite websites, and MLEs are available on every player’s page. They don’t yet have 2007 MLEs, and they don’t have a comprehensive list of them, so if you’re looking for them in bulk you’re going to need to go elsewhere.<br /><br /> The Ron Shandler Baseball Forecaster has MLEs this year, and I believe if you buy directly from the publisher (as in, not through Amazon or somewhere like that) you will get an electronic version of the book.<br /><br /> Or, if you’re really motivated, check out the Baseball Think Factory article in the list below. Get your base minor league stats from somewhere like MinorLeagueBaseball.com, and then follow along as Dan Szymborski walks you through exactly how to create an MLE.<br /><br /> Closing up, this article has really only scratched the surface of MLEs. Its purpose was simply to make you aware of them, because they are a truly invaluable tool.<br /><br /> For further reading about MLEs, or if you’re really interested in the specifics of how they can be calculated, check out these sites. 
The final one isn’t so much about MLEs as I discussed them, but about potential improvements that could be made in the future.<br /><br />http://seamheads.com/blog/2008/01/19/major-league-equivalencies/<br /><br />http://www.baseballthinkfactory.org/btf/scholars/czerny/articles/calculatingMLEs.htm<br /><br />http://minorleaguesplits.com/mle.html <br /><br />http://www.hardballtimes.com/main/article/rethinking-mle-the-role-of-experience/ <br /><br /> Also, if you’re interested in minor league park factors, here are a couple of resources for them. I know I’ve seen league factors before, but I can’t seem to locate them tonight. If I run across them, I’ll put the link in a future article.<br /><br />http://www.minorleaguesplits.com/pf.html <br /><br />http://firstinning.com/pf/<br /><br /> That wraps up this week. One final note, unrelated to this week’s topic. Two weeks ago I discussed K/BB ratio. Today, I posted an article at THT Fantasy Focus that looks into how pitchers achieve their K/BB ratios. Turns out, pitchers who do it in a certain way tend to put up better ERAs. If you're interested, you can find that here. K/BB ratio is a very important topic and one I’m sure I’ll be writing more about in the future.<br /><br />Posted by Brad Stewart<br /><br />Ground Balls<br />Published 2008-01-16<br /><br />Up until last week, I had talked primarily about luck indicators, that is, statistics that are prone to unexplained fluctuation. When we examine these types of statistics, we try to see who is getting lucky or unlucky. Last week, though, I talked about strikeouts and walks. Strikeouts and walks are a different type of statistic. They are performance statistics, or skills. They are generally more predictable and less influenced by luck. Put simply, they are more indicative of skill. 
This week, I’ll examine another skill statistic: ground ball rate.<br /><br /> Ground ball rate can be calculated in two ways. The first, and simplest, way is by using the following formula:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty28-701640.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty28-701635.bmp" border="0" alt="" /></a><br /> The way I calculate it, though, is like this:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty29-730203.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty29-730185.bmp" border="0" alt="" /></a><br />By calculating it like this, the result isn’t much different, but it is slightly more accurate. This way eliminates things like bunts that pitchers have, essentially, zero control over. For this article, I’ll be using the second formula to represent GB%.<br /><br /> While strikeouts and walks are the most important skills for a pitcher to have, a pitcher who gets a lot of ground balls can be successful as well. Ground balls are good, primarily, because they are not fly balls. As we discussed last week, pitchers don’t have a whole lot of control over their line drive rates, so that really leaves just two types of batted balls they can control: ground balls and fly balls.<br /><br /> The biggest reason ground balls are better than fly balls is that ground balls cannot clear the fences for a home run. 
When we consider what we learned a few weeks ago in our HR/FB article – that pitchers cannot control the rate at which their fly balls become home runs – logically, we can assume that the higher the percentage of ground balls a pitcher gives up, the lower the percentage of home runs he will give up.<br /><br />Check out this list of pitchers from 2007 who threw at least 210 innings:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty30-781884.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty30-781861.bmp" border="0" alt="" /></a><br />As you can see, the guys with the highest ground ball rates give up far fewer home runs per ball in play than those with lower ground ball rates, even when the latter group’s other skills are good.<br /><br /> Take a look at the name at the top of the list. Johan Santana gave up more homers per ball in play than anyone on this list, despite being one of the most skilled pitchers in baseball in terms of strikeouts and walks. The bottom line is that home runs are more closely tied to a pitcher’s ground ball rate than to any other meaningful statistic.<br /><br /> Getting ground balls isn’t all daisies, though. You need to consider that ground balls are, still, balls in play, and that a pitcher relies upon his defense to ultimately turn them into outs. A strikeout is far preferable. 
Furthermore, ground balls actually become hits on balls in play more often than fly balls do.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty31-736640.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty31-736625.bmp" border="0" alt="" /></a><br />*All ground balls are balls in play, so “BA on GB” would be the same as “BABIP on GB.”<br /><br /> When you include home runs, though, as we did in the third column, we see that there are slightly fewer ground ball hits than fly ball hits. Even better is that none of those ground ball hits are home runs.<br /><br /> As you can see, ground ball rate is an extremely useful stat. Still, you need to make sure you keep it in the right context. Ground ball rate alone is not enough to qualify a pitcher as a worthy target for your fantasy team. You need to check his other skills, primarily strikeouts and walks, before making a decision. Check out these tables showing pitchers’ ERAs. Data is taken from 2004-2007 among pitchers with at least 50 innings pitched.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty32-738646.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty32-738635.bmp" border="0" alt="" /></a><br />As you can see, regardless of the pitcher’s strikeout or walk rate, a high ground ball rate will provide a better ERA than a low ground ball rate. Without a good strikeout or walk rate, though, it won’t matter much, because a 4.53 ERA isn’t something to chase in most fantasy leagues.<br /><br /> What you need to do is find pitchers with high ground ball rates and either 1) a good strikeout rate or 2) a good walk rate. 
If you can find a pitcher with all three you’re in great shape, but there are only a handful of pitchers who possess all three skills.<br /><br /> Let’s check out the pitchers from 2007 who had at least a 50% ground ball rate and either a K/9 greater than 6.50 or a BB/9 lower than 2.25.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty33-789444.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty33-789423.bmp" border="0" alt="" /></a><br />You’ve probably already realized that most of these guys are good, but you might not have been thinking about Greg Maddux, Derek Lowe, Dustin McGowan, or Chad Gaudin as 2008 targets. Knowing what you know now, though, I bet they seem a lot more appealing, don’t they?<br /><br /> If we change our criteria to pitchers with at least a 50% ground ball rate, a K/9 greater than 6.50, and a BB/9 lower than 2.25, we don’t find any pitchers meeting the criteria in 2007.<br /><br /> From 2004-2006, just 3 different pitchers met the criteria. Chris Carpenter did it in all three years, Brandon Webb did it in 2006, and Roy Halladay did it in 2005. Obviously, if you notice a pitcher doing this in the future, he’s a guy you’re going to want to target.<br /><br />Posted by Brad Stewart<br /><br />Strikeouts and Walks<br />Published 2008-01-09<br /><br />“Strikeouts and walks? After spending weeks talking about things like BABIP and LOB%, you’re going to talk about strikeouts and walks?” Absolutely. If, this week, I do nothing more than stress to you how important these two categories are, then I’ll be satisfied.<br /><br /> Strikeouts and walks, hands down, are the two most important isolated stats to consider when evaluating pitchers. 
If you look at nothing else when doing your evaluations, look at these two stats. We touched on them a bit when we discussed DIPS Theory, but they are so important that they deserve their own article. Aside from strikeouts being a fantasy category unto themselves in the majority of leagues, strikeouts and walks have a significant impact on several other fantasy categories.<br /><br /> First and foremost, a pitcher has almost complete control over his strikeouts and walks. The batter, obviously, has some control as well, but the pitcher could accomplish a strikeout or walk without a single defensive player behind him and sans any luck. This makes strikeouts and walks a couple of the most predictable stats for pitchers. Because raw strikeouts and walks vary from year-to-year due to pitchers throwing different numbers of innings, we need to examine them as ratios. When I evaluate strikeouts and walks, I use the following, simple formulas:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty19-797406.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty19-797392.bmp" border="0" alt="" /></a><br />Obviously, the higher the K/9, the better, and the lower the BB/9, the better. To roughly measure their combined importance, we can use another statistic.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty20-758803.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty20-758794.bmp" border="0" alt="" /></a><br />Strikeout-to-walk ratio, or K/BB ratio, is an excellent statistic for predicting pitcher ERAs. We can run some tests to see just how effective it is. 
We’ll use data from 2004-2007 and include all pitchers with at least 50 innings pitched.<br /><br />Correlation Coefficient: -0.48 <br />R2: 0.23 <br />Adjusted R2: 0.23 <br />P-value: 9.59E-74 <br />Level of Significance: 1% <br /><br /> These tests show that there is a negative relationship between ERA and K/BB. In other words, as K/BB increases, ERA decreases – just as we would expect. While a 0.23 R2 isn’t amazing, when you consider how much variability there is with ERA and how difficult it is to predict, being able to predict 23% of ERA movement with this one simple statistic is actually a very good result. The p-value and level of significance further show that these tests are highly significant.<br /><br /> To make it even easier to understand how important K/BB ratio is in predicting ERA, let’s look at the top dozen best and worst Starting Pitcher K/BB ratios from 2007 and each pitcher’s corresponding ERA.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty21-714801.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty21-714776.bmp" border="0" alt="" /></a><br /> As you can see, 11 of the 12 players on the ‘best’ list have ERAs under 4.00. On the ‘worst’ list, though, just 2 players have ERAs under 4.00 and 6 players have ERAs over 5.00.<br /><br /> Of course, K/BB ratio isn’t the only component of ERA, as we can see some guys who don’t quite fit in with the rest. For example, Tim Redding – with an awful 1.24 K/BB – had a 3.64 ERA in 2007. 
As I’ve said many times, though, ERA is highly subject to luck-related swings, which is why we’ve spent the past few weeks explaining some of these indicators of luck.<br /><br /> Looking deeper into Redding, we can see that he had a .285 BABIP, a 9.52% HR/FB, and an 82% LOB%, all of which were better than league average and should regress in 2008.<br /><br /> Still, on the whole, you can easily see how powerful K/BB ratio is for evaluating pitchers. Just be careful that the K/BB ratio isn’t artificially inflated or in danger of falling off. This is far less common than it is for stats like BABIP or LOB%, but it still can happen. For example, look at Joe Blanton’s stats over the past three years:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty22-762632.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty22-762620.bmp" border="0" alt="" /></a><br />That 1.57 BB/9 is very low, and only a handful of pitchers are able to maintain walk rates that low. It is certainly possible Blanton will be able to maintain it, but the possibility that his BB/9 will rise above 2.00 in 2008 needs to be considered and accounted for.<br /><br /> The K/BB of a guy like Jake Peavy is much more stable, as it is supported by years of similar walk rates and a high K/9. 
<br /> <br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty23-797972.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty23-797953.bmp" border="0" alt="" /></a><br />As always, the message I’m trying to get across to you is to do your homework thoroughly, examining all components of a pitcher’s performance before making a decision.<br /><br />Brad Stewart http://www.blogger.com/profile/14729097165380454622 noreply@blogger.com<br />tag:blogger.com,1999:blog-1765513806500451300.post-7362237155190519067 2008-01-02T06:08:00.000-08:00 2008-03-19T06:21:00.820-07:00<br /><br />Line Drive Rates For Pitchers<br /><br />Happy New Year! Stat Head is back after a week off to talk about some more not-so-obvious stats that can help you improve your player evaluations and help you win a fantasy baseball championship. Today we’re going to talk about line drive rates for pitchers.<br /><br /> When we talk about batted ball types, we are referring to ground balls, outfield fly balls, infield fly balls (aka pop-ups) and line drives. You’ll occasionally hear about fliners (the type of balls that are somewhere between a fly ball and a line drive), but these are the primary four. Of these four, line drives fall for hits at the highest rate. Here is a table breaking down the hit percentages on these batted ball types over the past four years.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty14-707152.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty14-707145.bmp" border="0" alt="" /></a><br />And here is a table breaking down the BABIP of each batted ball type for the past four years. 
<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty15-750436.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty15-750426.bmp" border="0" alt="" /></a><br />As is plain to see, if all a batter is trying to do is get a hit, a line drive is what he should be hoping for. And since pitchers don’t like to give up hits, it should also be pretty obvious that they want to allow as few line drives as possible.<br /><br /> The problem is, pitchers don’t actually have much control over their line drive rates. As with most of the other stats we have looked at thus far, pitchers who deviate too far from the league average tend to regress going forward. Here are the league average rates for the past four years.<br /><br /> These are calculated by dividing line drives by the total number of line drives, fly balls, and ground balls. This formula – which will be used throughout this article when we refer to line drive rate or line drive percentage – excludes events like bunts that don’t say much about a pitcher’s skill level.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty16-705301.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty16-705294.bmp" border="0" alt="" /></a><br />Let’s use regression analysis to see just how much control pitchers have over their line drive rates. In this analysis, we’ll examine a pitcher’s line drive rate from consecutive years. If there is a strong relationship between the two years, we can assume that line drive prevention (or lack thereof) is a repeatable skill. If not, we can see just how much external noise influences a pitcher’s line drive rate. 
We’ll use data from 2004-2007 and include all pitchers with at least 75 innings pitched in both Year 1 and Year 2.<br /><br />Correlation Coefficient: 0.15 <br />R2: 0.02 <br />Adjusted R2: 0.02 <br />P-value: 0.003 <br />Level of Significance: 1% <br /><br />What this shows us is that there is a weak relationship between a pitcher’s line drive rate from one year to the next and that only 2% of its movement can be predicted using his corresponding statistic from the previous year. It is still significant, though, showing that pitchers can control it to a degree.<br /><br /> Because it isn’t easily predicted, though, we can say that much of a pitcher’s line drive rate is composed of “luck” or, as we called it two weeks ago, unexplained variation. Because of this, it is best to assume that a pitcher should put up a line drive rate near league average, and that those who put up rates at the extremes are getting either lucky or unlucky.<br /><br /> I should note that the above analysis is a bit misleading. While the majority of pitchers can’t control their line drive rates, there are a couple of subsets who have just a bit more control over it than the rest. Pitchers with extreme ground ball or fly ball rates are able to prevent line drives a little better than pitchers with non-extreme GB/FB tendencies. Below are the charts illustrating this:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty17-720980.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty17-720952.bmp" border="0" alt="" /></a><br />You can see that as a pitcher’s ground ball rate or fly ball rate rises to an extreme level (which is something pitchers do have a lot of control over), his line drive rate subsequently falls. Not dramatically, but significantly. 
This shows that we need to evaluate these types of pitchers on a slightly different plane, though the general concept still applies.<br /><br /> This means that if a pitcher with a 55% ground ball rate posts a 12% line drive rate, we should still expect regression. Even though pitchers with good ground ball rates can prevent line drives to an extent – as is the case with normal pitchers – if the line drive rate is too far from the appropriate baseline, there will be regression.<br /><br />Let’s check out a few prime candidates for regression in 2008.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty18-763881.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty18-763826.bmp" border="0" alt="" /></a><br />As is always the case, make sure you check out a pitcher’s complete skill set and his other ‘luck indicators’ before making a decision. None of the stats we’ve looked at thus far should ever be used alone. Isolated, they have a limited amount of predictive power, but when we look at the overall picture that these indicators paint we can get a good feel for the overall direction a player’s surface stats will take.<br /><br />Brad Stewart http://www.blogger.com/profile/14729097165380454622 noreply@blogger.com<br />tag:blogger.com,1999:blog-1765513806500451300.post-913153109094971177 2007-12-19T11:51:00.000-08:00 2008-03-18T11:52:09.101-07:00<br /><br />Analyzing Luck<br /><br />Over the past four weeks, I’ve discussed a number of statistics that can help you win your fantasy baseball league. In each article, I’ve mentioned the role that “luck” plays in the statistic. I realized, though, that I’ve never actually talked about what I mean by the word “luck.”<br /><br /> When we evaluate a player, there are several layers to look at. The most obvious of these layers is the results. 
This layer consists of stats like batting average, RBIs, ERA, and other categories that typical fantasy leagues use for scoring. Since they are used for scoring, lazy owners – or perhaps uninformed owners – focus solely on them. This, as I’m sure you’ve realized, isn’t especially sound. If we look at the relationship between a player’s batting average from year to year, we see that there is an unspectacular 0.37 correlation coefficient and a pretty poor 0.14 R2 (using 2004-2007 data for batters with at least 200 at-bats in both years). Other statistics, like ERA, perform even worse.<br /><br /> What we need to do instead is focus on a player’s skills and indicators. For a category like ERA, skills include the things we mentioned in our discussion on DIPS Theory, things like strikeouts, walks, and ground balls. When we talk about ERA indicators, we’re referring to stats like Left on Base Percentage, Batting Average on Balls in Play, and Home Run per Fly Ball.<br /><br /> These indicators, though, tend to fluctuate a good deal, and this fluctuation is often referred to as “luck.” What I mean when I chalk something up to luck is, generally speaking, unexplained variation in a statistic. This does not, however, mean that “luck,” in the sense that I am using it, is completely random. It could just be that we don’t have the proper stats at the moment to filter out the noise.<br /><br /> For example, there is a great deal of fluctuation with hitter BABIP. Right now, we don’t really have a great method for predicting BABIP. I may refer to the fluctuation in BABIP as “luck,” but that doesn’t mean that skill isn’t involved in the parts of BABIP that we can’t currently predict. In fact, I have a feeling that within another couple of years we will have made significant strides in predicting BABIP. Greg Rybarczyk’s wonderful program, HitTracker, keeps track of how hard batters hit the ball (Speed Off Bat). 
Once Greg gets enough help to track all batted balls (not just homers, as HitTracker currently does), I think this Speed Off Bat data will be the key to a hitter’s BABIP. Greg actually penned an article for the 2008 Hardball Times Annual that briefly examines this idea, which definitely shows promise.<br /><br /> In other instances, we might have information but have no real way of quantifying it. While we try to objectify statistics as much as possible, we need to remember that we are studying human beings. And with human beings comes unpredictability. Maybe a player’s father died or maybe the player has been diagnosed with a mental disorder, like depression. In July of this year, Matt Wise of the Milwaukee Brewers hit a player in the head with a pitch. He tanked the rest of the year, but how much of this was due to that stray pitch? These types of things can obviously affect a player’s performance, but they affect every player differently, and we therefore have no way of getting a real feel for just how much of a player’s performance variation should be attributed to them.<br /><br /> Perhaps even more prevalent than being unable to quantify this information is the absence of the info in the first place. Again, we are dealing with human beings, and just because they are baseball players and are in the media spotlight does not mean that they are required to share every detail of their lives with us. Things may be happening behind the scenes that affect a player’s performance that we are oblivious to. We have no way of knowing, so we simply classify it as “luck,” which, as I mentioned, isn’t just random chance. It is unexplained variation in stats, which these personal issues fall squarely into.<br /><br /> Other times, this unexplained variation won’t be due to a lack of available stats or information at all. In some cases, it will be random chance that affects the numbers. In some cases, it will simply be pure, dumb luck. 
This is true when working with any set of statistics, and while it can’t be accounted for, it does need to be recognized. That is part of what we mean when we talk about a stat like HR/FB regressing to the mean. Stats, especially baseball stats, are subject to unexplained external noise that is not likely to be repeated. If it isn’t repeatable, then it isn’t really a skill, is it? And if it isn’t repeatable, then what possible forecasting value does it have? We need to focus on the stats that we have to work with now (while continuing to try to come up with new and improved ones), utilize the components that we can explain, and expect the pieces that we cannot explain to even themselves out, at least for the time being.<br /><br /> Moving away from my explanation of luck, I wanted to talk about how we should be expecting luck to affect future numbers. Say a pitcher posts a 2% HR/FB rate in the first half of a season. We know that pitchers tend to regress towards the league average of around 11%. Does this mean that the pitcher should post a HR/FB around 20% in the second half to even things out?<br /><br /> I’ve known many players to point to the “law of averages,” as they call it, to try to validate this hypothesis. They say that if a player was expected at the beginning of the year to post an 11% HR/FB rate and it is only 2% in the first half (which is unsustainable), then there is a good chance he will post a HR/FB significantly higher than 11% in the second half to make up for it.<br /><br /> In baseball – and anyone who has ever played poker seriously or has some knowledge of game theory knows this – it doesn’t matter how you arrive at a particular point. All that matters is that you are there and that you have a clean slate at every single new moment that arrives. By originally projecting an 11% HR/FB rate, we were expecting this pitcher to be unaffected by luck, or to be affected by neutral luck (however you look at it). 
Just because he catches a run of really good luck doesn’t mean we should change our opinion of him in this regard. We should still expect him to post luck-neutral stats.<br /><br /> Luck, when examining a large quantity of players over a long period of time, will tend to even itself out. But when you are looking at an individual player over a short period of time, luck should not be expected to correct itself so quickly. It just doesn’t work that way. You should always expect luck to be neutral moving forward, at all times, because as we said before, our definition of luck is “unexplained variation.” If it is unexplained, why would we try to predict it? It is a fool’s errand.<br /><br />Brad Stewart http://www.blogger.com/profile/14729097165380454622 noreply@blogger.com<br />tag:blogger.com,1999:blog-1765513806500451300.post-4046257059107354140 2007-12-12T11:36:00.000-08:00 2008-03-18T11:49:08.787-07:00<br /><br />LOB %<br /><br />Over the past couple of weeks, we’ve been building a pretty good toolbox of stats... better yet, a utility belt of stats… like Batman uses… with which to evaluate pitchers. So far, we’ve discussed DIPS Theory, Batting Average on Balls in Play (BABIP), and Home Run per Fly Ball rate (HR/FB). Today, we’re going to add another gadget to our utility belt by talking about Left On Base Percentage (LOB%).<br /><br /> Left On Base Percentage measures the portion of base runners that a pitcher (or subsequent relief pitchers who inherit runners, if the original pitcher leaves mid-inning) prevents from scoring. There are some slight variations to how it is calculated, but the formula that I use is as follows:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty10-746237.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty10-746233.bmp" border="0" alt="" /></a><br />Left On Base Percentage is very important to a pitcher. 
It is really a measure of how many runs a pitcher allows given that there are runners on base. It is not surprising, then, that pitchers who allow a lot of base runners have worse LOB Percentages. Dave Studeman of Hardball Times put this concept quite succinctly:<br /><br /><blockquote>The reason is simple: baserunners accumulate. If you allow only a couple of baserunners per game, chances are very good your LOB% will be 100%. However, as you allow more runners on base, chances are better that they will get on base in the same inning, making it more likely they will score. As you allow more runners on base, your LOB% will fall at an ever-faster rate. So good pitchers—who allow fewer baserunners—will have better LOB% rates.</blockquote><br /><br /> We use Left On Base Percentage similarly to how we use BABIP and HR/FB. There is a considerable amount of luck involved, and we look for guys substantially above or below average to regress towards the mean. Like BABIP, there is also a significant amount of skill involved, as mentioned above.<br /><br /> Let’s run some tests to see exactly how much skill plays into LOB%. For this test, we’ll use Luck Independent ERA (LIPS ERA) as our independent variable. I know that I’ve mentioned it in passing, and I’ll be talking about it in more detail once we cover all of these other components, but for now just understand that LIPS ERA is a measure of a pitcher’s skill with neutral luck involved. Furthermore, we’ll use LOB% as our dependent variable.<br /><br />Correlation Coefficient: -0.40 <br />R2: 0.16 <br />Adjusted R2: 0.16 <br />P-value: 1.96E-22 <br />Level of Significance: 1% <br /><br /> Essentially, what this all means is that as LIPS ERA decreases, LOB% increases. 
It also says that LIPS ERA can predict 16% of the movement of LOB% and that the tests are highly statistically significant.<br /><br /> 16% might not seem like a lot, but it really is, and I’m not claiming that LOB% is entirely (or even mostly) skill driven anyway. It has a lot of luck involved, and that’s why we’re looking at it. In addition to luck, “clutch pitching,” if you will, can play a part in LOB%. If a pitcher tends to put up better (or worse) numbers when runners are on base, it will be reflected in this stat.<br /><br /> To further show just how important LOB% is, let’s run these same tests using LOB% as our independent variable and ERA as our dependent variable.<br /><br />Correlation Coefficient: -0.76 <br />R2: 0.56 <br />Adjusted R2: 0.56 <br />P-value: 1.3E-104 <br />Level of Significance: 1% <br /><br /> Conclusion: As LOB% goes down, ERA goes up. 56% of ERA movement can be predicted by LOB%. It is extremely statistically significant. LOB% is very important.<br /><br /> So how do we use LOB% for fantasy purposes? Well, in general terms, the pitchers that are at the extremes of LOB% should be expected to regress. League average is generally around 71%, but here’s a table breaking it down by year in case you’re curious.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty11-724103.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty11-724094.bmp" border="0" alt="" /></a><br />Check out how the 2006 LOB% leaders regressed in 2007:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty12-794255.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty12-794223.bmp" border="0" alt="" /></a><br />As we saw last week with HR/FB rates, most guys tend to fall back into the typical, league average pattern. 
There are a few who did not, though. For most of them, the reason is that their skills support an above-average LOB%.<br /><br /> Roy Oswalt generally puts up good peripheral numbers and has had a LOB% below 76% just once in his career. Johan Santana hasn’t been below 76% since 2002 (which was also, not coincidentally, the last time his K/BB was below 3.00).<br /><br /> Chuck James, though, isn’t explained as easily. He has put up excellent LOB% marks despite LIPS ERAs of 4.72 and 4.55 in 2006 and 2007, respectively. These are the only two years of data we have on him, so we can’t call this a career trend as we did with Oswalt or Santana, and his peripheral numbers aren’t very good. He isn’t particularly better with runners on base either. In 2006 there was virtually no difference between his overall peripherals and those with runners on, and in 2007 he was only marginally better. I’m calling for some serious regression next year.<br /><br />Curious who else you should be looking out for next year? Here are the best and worst rates of 2007:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty13-709625.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty13-709597.bmp" border="0" alt="" /></a><br />That concludes Stat Head for this week. You guys now have some great ways of evaluating pitchers, and I’ll be explaining how to put a lot of this stuff together within the next couple of weeks. After that we’ll move on to hitters for a while.<br /><br />Brad Stewart http://www.blogger.com/profile/14729097165380454622 noreply@blogger.com<br />tag:blogger.com,1999:blog-1765513806500451300.post-6184146754794463782 2007-12-05T11:25:00.000-08:00 2008-03-18T11:35:52.499-07:00<br /><br />HR/FB Rate<br /><br />Last week, I talked about DIPS Theory, its relevance to pitchers, and the fantasy implications that can give you an edge over your competition. 
If you recall, I digressed a couple of times to say that home runs are not completely within a pitcher’s control. I’d like to talk a little more about this today.<br /><br /> To measure a pitcher’s home run prevention, there are a couple of statistics we can use. The first is the method we use to typically measure strikeouts and walks, which is per nine innings, calculated as follows:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty4-790311.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty4-790301.bmp" border="0" alt="" /></a><br /> The second stat we can use is the Home Run per Flyball statistic. It is a simple ratio that is calculated as follows:<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty5-716194.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty5-716180.bmp" border="0" alt="" /></a><br /> If we test the year-to-year trends of the two (using 2004-2007 data), we see that HR/9 has a correlation coefficient of 0.33 and an r2 of 0.11. For HR/FB, the correlation coefficient is 0.18 and the r2 is 0.03. These tests are statistically significant and show that pitchers have a little control over their home run rates, but that control is not great. Much of it is unexplained.<br /><br /> To help explain it, and to forecast future performance in this area, we turn to the HR/FB stat. 
Tests show that pitchers tend to regress to the mean, which is typically around 11-12%.<br /><br /> Pitchers have a little bit of control over their HR/FB rates, and home ballparks can explain a small portion of deviation, but pitchers who are significantly above or below this mark should be expected to regress.<br /><br /> As a quick example, let’s check out the pitchers in 2004 that had the lowest HR/FB rates and their corresponding HR/FB rates in 2005. To qualify, a pitcher needed at least 12 starts in both years.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty6-742099.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty6-742086.bmp" border="0" alt="" /></a><br />As you see, they all got higher, and most of them put up HR/FB rates near 11% in 2005. Of course, just because a pitcher was heavily affected by luck in 2004 doesn’t mean that he can’t be heavily affected by luck in 2005. Hudson’s luck actually went the other way. But that’s why they call it luck. It happens, it is unpredictable, and the best we can do is assume that it will be neutral.<br /><br /> Here’s another thing to keep in mind when using HR/FB to make your own assessments. It is not uncommon for pitchers with extreme ground ball rates to have slightly higher HR/FB rates. Below is a breakdown of the league average HR/FB and the average among pitchers with at least 50 innings pitched and at least a 50% and 53% expected ground ball rate. 
The differences aren’t enormous, but you should be a little more lenient with these guys.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty7-706079.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty7-706068.bmp" border="0" alt="" /></a><br />So which players’ ERAs figure to significantly change when their HR/FB rates regress? In the following tables, I’ll also include the pitcher’s ERA and xFIP to highlight the affect that these rates can have on an ERA.<br /><br /> I talked a little about xFIP last week; it measures a pitcher’s ERA using his peripheral stats while normalizing his home run rate. There is a stat that I like better than xFIP – LIPS ERA – which I’ll talk about at some point in the future (in three weeks, probably), but for now xFIP does the trick. First, let’s check out the HR/FB losers.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty8-768079.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty8-768059.bmp" border="0" alt="" /></a><br />Some people will look at Chris Young’s HR/FB rate and quickly write it off as a function of Petco Park. There is no possible way, however, that a ballpark caused his HR/FB rate to be 7% lower than league average. It’s just not possible. There were other factors at play, and we shouldn’t expect him to put up anything lower than a 9% mark in 2008. Here are the winners.<br /><br />Like I said last week with BABIP, don’t take these numbers at face value. Make sure that they are supported by solid skills. Felix Hernandez has excellent skills, and the HR/FB rate in 2007 serves to suppress his value going into 2008. He is a guy who should probably be targeted. 
Dontrelle Willis, on the other, has declining skills, and it is now being said he’ll be moving to the more hitter-friendly American League. Despite his higher-than-normal HR/FB, he figures to go higher in fantasy drafts than he deserves to.<br /><br /> Also, HR/FB is not the only luck component of ERA, so the affect of it won’t always be as apparent as it is in most of the cases above. In some cases, a pitcher’s xFIP and/or LIPS ERA will be higher than his actual ERA, even with an unlucky HR/FB rate. In these cases, there are likely other indicators (like BABIP, for example) that are acting even more powerfully. HR/FB does have a significant affect, though, and should be considered by all fantasy owners who take winning seriously.<br /><br /> <a href="http://www.mlbfrontoffice.com/uploaded_images/carty9-709797.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty9-709781.bmp" border="0" alt="" /></a><br /> That wraps up Stat Head for this week. If you have any questions about HR/FB or anything else in this vein, feel free to send me an e-mail. Next week I’ll probably talk about Left on Base Percentage (also known as Strand Rate), which is another critical indicator for pitcher ERAs.Brad Stewarthttp://www.blogger.com/profile/14729097165380454622noreply@blogger.comtag:blogger.com,1999:blog-1765513806500451300.post-25642749720457084412007-11-28T11:16:00.000-08:002008-03-18T11:24:23.827-07:00DIPS Theory and Pitcher BABIPIf you’ve never before heard of Defense Independent Pitching Statistics (aka DIPS) Theory, brace yourself. DIPS Theory contradicts much of what is considered conventional baseball knowledge, but acceptance of it is an invaluable tool for any fantasy player who is committed to winning.<br /><br /> In its infantile form, DIPS Theory suggests that pitchers cannot control what happens to a baseball once it is put into the field of play. 
In other words, pitchers cannot control how many hits on balls in play they give up. This idea was originally presented by Voros McCracken, who earned a job with the Boston Red Sox for his work.<br /><br /> DIPS Theory recognizes that many statistics – like ERA – that have traditionally fallen squarely on the shoulders of the pitcher are actually influenced by a number of factors… the most prominent being a pitcher’s skill, the defense behind him, his bullpen, and luck. Logically then, why does it make any sense at all to focus on things that are out of a pitcher’s control? If they are uncontrollable and highly variable, what possible sense could it make to use them as an indicator of skill? Hopefully, you’re saying “none” to yourself right now. That means we’re on the same page.<br /><br /> Taking this knowledge into consideration, DIPS Theory attempts to separate the pitching skill from everything else in order to get a feel for how good a pitcher truly is. To do this, Voros said that we should look at all of the statistics that a pitcher can control – those that aren’t compromised by external influence – and none that a pitcher can’t control. We can then create formulas based on them. Here is how he classified different traditional statistics:<br /><br /> <center><strong>Defense Independent</strong></center><br />Walks, Strikeouts, Home Runs (essentially), Hit Batsmen, Intentional Walks<br /><br /> <center><strong>Defense Dependent</strong></center><br />Wins, Losses, Innings, Runs, Earned Runs, Hits Allowed, Sacrifice Hits, Sacrifice Flies<br /><br /> This breakdown makes complete sense. Defense has essentially nothing to do with a pitcher striking a batter out, and luck has only a minimal effect. Maybe a pitcher faces a few more Adam Dunn-type strikeout hitters per year, but that is about the extent of it. 
The same thing could be said about the rest of the Defense Independent variables.<br /><br /> If we look at the Defense Dependent variables, though, it becomes obvious how riddled with external noise they are. Wins and Losses have a lot to do with offense, runs have a lot to do with defense and bullpen, hits are heavily influenced by defense, etc.<br /><br /> Since Voros first introduced this DIPS concept in 2001, some nice strides have been made in refining it. As can be observed in his two lists above, Voros realized that Home Runs have some variability. Recently, Home Runs have been replaced in DIPS theory by batted ball types (i.e. ground balls, fly balls, line drives, and pop-ups) that can better predict home run rates than actual homers can. We’ll talk more about that at a later date, though.<br /><br /> Right now, the three most prominent statistics in DIPS Theory, and the ones I primarily focus on when evaluating a pitcher, are – in order of importance – strikeouts, walks, and ground balls.<br /><br /> Not only are these statistics the most meaningful stats that a pitcher has nearly complete control over, but they also help solve the riddle of the hits on balls in play. Henceforth, the rate of hits on balls in play will be referred to as BABIP (Batting Average on Balls in Play). This statistic is calculated as follows:<br /><br /> <center><strong>BABIP = (H-HR) / (AB-HR-K)</strong></center><br /><br /> It essentially measures how often a pitcher gives up hits on contacted balls that do not leave the park (Home Runs are treated separately because pitchers have a decent amount of control over them). In Voros’s original article, he suggested that pitchers have virtually no control over how often these balls in play become hits. 
His exact words were, “There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play.” While this was an extraordinary first step, it isn’t entirely true.<br /><br /> Before we explain why, let’s try to understand where Voros was coming from with his theory. Imagine that a pitcher makes a beautiful pitch, and the batter hits a weak-ish, shallow fly ball to left field. The problem for the pitcher is that Albert Belle is sitting out there, and he isn’t able to get to the ball in time. It falls for a hit, through no fault of the pitcher. There are numerous scenarios like this that happen countless times every year. There are even a number of times when the ball doesn’t fall in because of simple dumb luck. When we think of all the variables involved, Voros’s original hypothesis makes a whole lot of sense.<br /><br /> To restate, his exact words were: “There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play.” It was later discovered, though, that BABIP is not a random thing; it is simply highly variable. My colleague at The Hardball Times, David Gassko, penned a fantastic article at the beginning of this year, entitled Uncovering DIPS, about pitcher control over BABIP.<br /><br /> David found that 75% of a pitcher’s BABIP can be found in his peripheral numbers (strikeout rate, walk rate, hit batsmen rate, and home run rate). Voros had originally stated that his belief was “simply that hits allowed are not a particularly meaningful statistic in the evaluation of pitchers.” David’s study shows that BABIP is indeed a meaningful statistic, if treated correctly.<br /><br /> Of course, because of the amount of luck involved, BABIP will fluctuate from year to year. If you look at several years’ worth of data, though, it becomes clear that pitchers with good peripheral stats generally have better BABIPs. 
Not mind-bogglingly better, but better nevertheless. If you’re curious to see some real-life examples of this, here are some career BABIPs to peruse.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty1-795567.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty1-795340.bmp" border="0" alt="" /></a><br /> It should be noted that league average BABIP is generally between .300 and .305. Also, the difference would be even larger if we had a list of poor pitchers to look at, but pitchers who are poor don’t generally keep their jobs long enough to give us a large enough sample size.<br /><br /> This is further validated by another article by David, which mentions that one standard deviation of BABIP is 0.009, which equates to 0.20 points of ERA. He explained this succinctly:<br /><br /> “In other words, 68% of all pitchers are affected by no more than +/- .20 runs due to their ability to prevent hits on balls in play, while 95% are within +/- .40 runs. So the effect is there, but not particularly large.”<br /><br /> So I’ve rambled on for a while now, and hopefully you were able to follow everything I said. At this point, though, you might be saying, “Derek, this is all well and good, but how do I apply this stuff to my fantasy team?”<br /><br /> The most important things that I can stress to you are the three DIPS stats we’ve talked about: strikeout rate, walk rate, and ground ball rate. I’m looking for a catchy name for when I’m talking about all three. 
Let’s go with the “Triforce of DIPS” and see if it catches on… and yes, that is a Legend of Zelda reference.<br /><br /> We’ll talk more in-depth about each of these stats next week and how to put them all together into an easily understandable format (the ERA scale), but for now, know that if you are going to evaluate pitchers, these are the three most important isolated statistics to look at.<br /><br /> I will, however, talk now about how you can apply BABIP to your fantasy teams. As I’ve hopefully driven home by now, BABIP is highly variable and prone to extreme fluctuations. When you see a pitcher with a really high BABIP, expect that it will regress going forward. When it does, expect a significant effect on the pitcher’s WHIP (which is dependent on hits) and a marginal effect on ERA. The opposite holds true for a really low BABIP. Let’s try a quick exercise and see if you have the hang of it. Here are nine pitchers and their 2007 BABIPs and WHIPs. See if you can tell me which ones, judging by this data only, should have higher and lower WHIPs next year.<br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty2-750463.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty2-750448.bmp" border="0" alt="" /></a><br /> Well, judging solely by this data (we’re ignoring any progress or regression there could be in strikeout or walk rates for the moment), we would expect El Duque, Chris Young, Ubaldo Jimenez, and Carlos Zambrano to have higher WHIPs next year and Zach Duke, Scott Olsen, Matt Garza, Chris Capuano, and Felix Hernandez to have lower WHIPs.<br /><br /> Buddy Carlyle and Cliff Lee were thrown in there to trick you. While their WHIPs are high, this is not because of unlucky BABIPs. Their skills just aren’t that good. Here is a list of each of these pitchers and their DIPS WHIP, as I calculate it, from 2007. 
They should be in line with our conclusions above. <br /><br /><a href="http://www.mlbfrontoffice.com/uploaded_images/carty3-701534.bmp"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://www.mlbfrontoffice.com/uploaded_images/carty3-701521.bmp" border="0" alt="" /></a><br /> A word of warning. Just because a player’s high BABIP will regress does not mean you should automatically target him. You need to check his other skills to make sure he is a quality pitcher. While Zach Duke’s WHIP should get lower, a 1.46 WHIP is not something you should be actively pursuing.<br /><br /> That wraps it up for now. If you have any questions, feel free to e-mail me.Brad Stewarthttp://www.blogger.com/profile/14729097165380454622noreply@blogger.comtag:blogger.com,1999:blog-1765513806500451300.post-83234990025241392752007-11-24T11:15:00.000-08:002008-03-18T11:16:29.547-07:00IntroductionHello everyone, and welcome to the first edition of Stat Head. Each week, I’ll be breaking down a different statistic. Normally, this will be a statistic that is a little unconventional, one you probably wouldn’t be hearing about on, say, ESPN.<br /><br /> First, a little about myself. My name is Derek Carty. I am a student in New Jersey and an avid fantasy baseball player. I am the chief fantasy analyst for the Hardball Times, where I write for the Fantasy Focus blog. If you like the types of things I delve into here, feel free to stop by the Hardball Times, where I apply these concepts to actual players.<br /><br /> I won’t be talking about a specific stat today, but rather why the stats I will talk about in the future are important and how they can help you win your fantasy baseball league.<br /><br /> The heart of baseball statistics comes down to this: baseball is a game that is part skill, part luck. For each of the 10 primary fantasy stats, there is some measure of luck involved. 
Simply by looking at one of these stats, one cannot tell if it is truly reflective of a player’s skill. In fact, for each of these stats, you can better predict them from year to year using some measure other than the stat itself than you could by using the actual stat. Pitching strikeouts can predict themselves quite well, but even for these there are other – perhaps better – ways of forecasting.<br /><br /> If up until now you have been unfamiliar with this concept, you may be asking yourself, “So how do we predict the stats, if they can’t predict themselves? How is that even possible?” The answer, at its bare bones, is to separate the luck from the skill. If we look at the components of each stat separately, at the things a player can control (or at the things he can’t), we can see where the expected level of the stat should be. By doing this, we can better predict the path that the stat will take in the future.<br /><br /> If you’re still not convinced, look up the stats of your favorite pitcher. It really doesn’t matter who it is. Just make sure to pick someone who has been in the majors for at least three years or who you have minor league numbers for. Do that now… I’ll wait for you.<br /><br /> Now look down the column labeled “ERA.” Notice how ERA fluctuates from year to year, seemingly without any rhyme or reason. Take Johan Santana, who is pretty much the consensus top pitcher in baseball. In the past four years, his ERA has ranged from 2.61 to 3.33. While those are obviously good numbers, if you were to try to pin down an ERA for Santana in 2008 using only that information, it might be a little difficult, no?<br /><br /> You might think that the difference between putting him at a 2.60 ERA or a 3.30 ERA is negligible; I mean, they are both great figures, ones you would surely take on your fantasy team any day. The truth, though, is that there is a 0.70 difference between the two. That is a huge gap. 
That’s nearly a full ERA point.<br /><br /> Once you move away from the elite pitchers, that 0.70 could be the difference between a pretty good 4.00 ERA and a pretty poor 4.70 ERA. In many fantasy leagues, that is the difference between a #3 fantasy pitcher and one who isn’t rosterable. And the killer part is that there is just no way to tell whether it will be on the high end or the low end if you only focus on ERA. It seems a little counterintuitive at first, but you need to dig deeper than the stat itself.<br /><br /> That’s where I come in. I’ll tell you the stats you should be looking at, the stats that dictate where a player’s ERA (or any other stat for that matter) should end up. Because there is luck involved, it will be impossible to always get it right, but in the long run you will get far more right than you would by any other means, and that’s really what it’s all about.<br /><br /> Baseball is not about perfection. In conventional terms, a player who hits .300 is considered a success, even though he failed to get a hit 70% of the time. If we look at a slightly more complete statistic and an absolute freak like Barry Bonds, we see an OBP that was once an absurd .609. Still, failure occurs 40% of the time. If one consistently gets just 60% of the questions on a high school or college exam correct, that person would not graduate.<br /><br /> Baseball is not about perfection, and predicting baseball statistics is no different. Perfection is not an option because of the high variability of the stats… because of the luck factor. 
But by digging deeper into the stats, we can begin to sift through the luck, find the skill, and from there we can make better predictions than we ever could before.<br /><br /> That’s all for now, but be sure to stop by on Tuesday as I begin to discuss some of these statistics, starting with the one I like (and hate) the most: Batting Average on Balls in Play (BABIP).Brad Stewarthttp://www.blogger.com/profile/14729097165380454622noreply@blogger.com