Wednesday, March 21, 2012

First post: Dirty WAR calc for NPB

Everyone likes WAR. You like WAR, and of course I like WAR. While WAR is occasionally over-hyped, misunderstood, or misused, and all the Wins Above Replacement metrics currently available on the Web have their own limitations, it gives us a great view of baseball as long as you understand its strengths and weaknesses correctly. This time, I'm going to introduce a Wins Above Replacement metric for Nippon Professional Baseball, abbreviated as NPB.

But before the methodology, a few warnings.

[First, lack of data, data, data!]
When you have a question like "Who collected the most hits from 2008 to 2010?" or "Which pitcher has the most extreme platoon splits?", you can navigate to Baseball-Reference, roam around there for a couple of minutes, and finally get the answer. Or whenever you're hungry for geekier statistics, you can go to FanGraphs or Baseball Prospectus and may find what you want. Some of you reading this right now would be able to play around with Retrosheet. Some of you could deal with Pitch f/x too, and of course others couldn't, but there are lots of awesome Pitch f/x-focused sites like BrooksBaseball or Texas Leaguers, to name a few. Even the official Major League site can give you a good deal of baseball statistics, and as long as you have a PC or tablet and a connection to the Internet, you are able to enjoy baseball just by staring at a screen full of numbers.

However, when you shift your attention to the league across the ocean, you always have to consider the lack of available data. For example, if you want to know how run scoring varies by stadium in NPB, you have to pull raw datasets into your spreadsheet or whatever other software you like and manipulate them yourself to get an answer. Even when you just want to know league RPG year by year to see a historical trend line, you have to go through that troublesome handling. There is neither Retrosheet nor Pitch f/x. There is no one like Bill James, Pete Palmer, David Smith, or Ted Turocy. More disappointing is that nobody is attempting to establish a platform that would certainly help the sabermetric community in Japan flourish, whether it's NPB official staff, an established data company, or even an aspiring individual, as far as I know (i.e. among those active on the Web) as of March 2012.

Fortunately, however, one baseball site provides a good deal of basic NPB statistics from the 2006 season onward, and it helped me fetch the data for this project. That site is called NulData (the link is written in Japanese), and thus I'll call these six years the "NulData era" for later reference. Six years' worth of data may seem scarce to you, though I believe it's sufficient to show you the outline.

[Be aware of differences from Major League Baseball]
First of all, each team played only 144 games per season in Japan, 18 games fewer than in a Major League season. While NPB has a laxer schedule, in terms of both the share of off days and travel distance, which enables regulars to play more innings or games, a game is also declared a draw after no more than 12 innings. Coupled with a lower RPG (more on that later), this leads to less playing time for players. So all baseball statistics, whether offensive or pitching, player or team, include more variation relative to those of MLB. I should also add that teams usually fill their rotation with six starters, one more than Major League teams. That means starters may contribute less to their teams, again compared to MLB. So keep these facts in mind for the later presentation.

Secondly, NPB rarely makes use of its waiver system. If you ask any Japanese baseball fan who doesn't follow MLB at all "How does the waiver system work in NPB?", you will be countered with "What's a waiver?" 95% of the time. Players are rarely put on waivers, claimed by other clubs, and moved to a new team during the season in NPB. There are some precedents, of course, but it's fair to say the system doesn't usually function. Thus, clubs don't usually procure external replacement players during the season when they get into an emergency; they only compensate internally, by promoting scrubs from their own minor league affiliates. Since NPB teams can stock as many as 70 players across their whole organization (excluding the Ikusei system) and there is no extra restriction like player demotion options, you might think this would lower the replacement level baseline relative to that of MLB. Or it might not.

Lastly, not only do park variations in NPB stem from a stadium's intrinsic characteristics like dimensions, climate, or mound height, they are also a byproduct of the fact that each team was able to decide which baseball to use in games at its home park. This was a really annoying issue, since we can't know which team used what type of baseball and what kind of impact that baseball had on overall run production. Worse than that, we can't even know in which direction each ball pushed run scoring in the first place.

However, that situation came to an end prior to the 2011 season, when NPB abolished the discrepancy and adopted the notorious standardized baseball. That relieved us of the above concern, but this rule change, along with the league-wide unification of umpires, caused a huge run reduction in 2011. From 2006 to 2010 average runs per game was 4.18 and neither league saw its RPG fall below 3.93, yet in 2011 the Central League's RPG dropped to 3.37 and the Pacific League's to 3.30. Always bear in mind that Nippon Professional Baseball underwent a huge change this past year.

[Fielding data isn't included]
If the only reason you're here is that you want to look at fielding metrics for NPB, you'd be better off closing your browser tab and spending your time on other things. We have no "useful" fielding data. That's all.

OK, with the boring background explanation out of the way, let's go into the methodology. I'll first explain batting because it's the easiest. I borrowed wOBA coefficients from this site (again, written in Japanese) to calculate seasonal wOBA as the baseline batting metric, excluding pitchers' at-bats. Then I subtracted the league average from each player's rate stat and converted the difference to a counting scale by multiplying it by his playing time (PA), with an adjustment for his park environment. That's his total batting runs above average. Add the replacement value and convert to a wins scale, and you can see his Wins Above Replacement figure (batting plus replacement, actually).
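As a rough sketch of the batting steps just described (not the exact calculation used for the numbers in this post): the wOBA scale of 1.2, the league runs per PA of 0.11, the 10 runs per win, and the way the park factor is applied in runs are all assumptions made for illustration. Only the 18 runs per 500 PA replacement gap comes from this post (it is derived further down).

```python
def batting_war(woba, lg_woba, pa, park_factor=1.0,
                woba_scale=1.2, lg_runs_per_pa=0.11,
                repl_runs_per_500pa=18.0, runs_per_win=10.0):
    """Batting-plus-replacement WAR from wOBA (illustrative sketch only)."""
    # Rate stat above league average, converted to runs and scaled by playing time.
    raw_runs_above_avg = (woba - lg_woba) / woba_scale * pa
    # Assumed park adjustment: remove the extra runs a hitter-friendly park hands out.
    park_runs = (park_factor - 1.0) * lg_runs_per_pa * pa
    runs_above_avg = raw_runs_above_avg - park_runs
    # An average hitter is worth about 18 runs per 500 PA more than a replacement hitter.
    runs_above_repl = runs_above_avg + repl_runs_per_500pa * pa / 500.0
    # Assumed runs-to-wins converter; a low-scoring league would need fewer runs per win.
    return runs_above_repl / runs_per_win


if __name__ == "__main__":
    print(round(batting_war(woba=0.400, lg_woba=0.325, pa=600), 2))
```

Under these assumptions a league-average hitter with 500 PA comes out at 1.8 wins, all of it from the replacement term.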

Next, pitchers. Pitchers' values are generally considered harder to measure quantitatively because of the difficulty of separating the contributions of a pitcher from those of the players behind him. FanGraphs WAR, for example, deals with this issue by regarding pitchers as immune from what happens on the field and attributing no credit or debit to them once the ball is put into play. Rally, in contrast, thinks pitchers bear some degree of responsibility for their inability to retire hitters by themselves. Theoretically, Rally is right. Different pitchers have different kinds of involvement in how batters put balls in play and how fielders handle those balls. However, how much of it should be attributed to pitchers and how much to fielders, I don't know. Besides, I don't have fielding metrics in the first place! So I decided to wrestle with it by looking only at a pitcher's three true outcomes (walks, strikeouts, and home runs allowed). That's far from the reality of baseball, but it's the best approach I can take given the scarcity of data on my hands and my laziness.

(By the way, the above paragraph reflects only what fWAR assumes, not what FanGraphs members actually think.)

The calculation itself isn't so complicated, but more so than for hitters. I first calculated each pitcher's RA-scaled FIP by season, adjusted it for the stadiums he pitched in, converted it to winning percentage form thanks to PythagenPat, then multiplied his effectiveness above the replacement class by his playing time. For starters, I factored in how deep they went on average, and if a starter didn't last nine innings on average (none did, of course), I assumed the rest of the game's innings were assigned to average pitchers. Some starters made occasional relief appearances, and I don't know how many innings they pitched in those outings, so I fixed all relief outings by starters at 1 inning per appearance. Among seasons of more than 4 WAR from 2006 to 2011, only Chihiro Kaneko's 2009 season had a skewed proportion (21 starts vs. 11 relief appearances, 77 FIP- in 171 2/3 IP), and apart from him and Kazuki Yoshimi in 2009 (25 starts vs. 2 relief appearances), all pitchers relieved in no more than 1 game in a season. Not a serious problem.
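The starter part of that pipeline can be sketched roughly as follows. This is a sketch under assumptions, not the code behind the leaderboards: the PythagenPat constant of 0.287 is a commonly quoted value, dividing by the park factor is an assumed form of the stadium adjustment, the function and parameter names are made up for illustration, and the innings credited to relief outings are simply left unvalued here. The .380 default is the Central League replacement starter figure given further down.

```python
def pythagenpat_wpct(runs_scored, runs_allowed, exponent_const=0.287):
    """Expected winning percentage from per-game runs scored and allowed."""
    # PythagenPat: the exponent depends on the run environment (runs per game, both teams).
    x = (runs_scored + runs_allowed) ** exponent_const
    rs_x = runs_scored ** x
    ra_x = runs_allowed ** x
    return rs_x / (rs_x + ra_x)


def starter_war(fip_ra9, ip, starts, lg_ra9, relief_apps=0,
                park_factor=1.0, repl_wpct=0.380):
    """Wins above replacement for a starter (illustrative sketch only)."""
    # Relief outings by starters are fixed at 1 inning each, as in the text.
    start_ip = ip - relief_apps * 1.0
    ip_per_start = min(start_ip / starts, 9.0)
    # Assumed park adjustment: divide the RA-scaled FIP by the applied park factor.
    ra9 = fip_ra9 / park_factor
    # Innings the starter does not cover go to league-average pitchers.
    game_ra = (ip_per_start * ra9 + (9.0 - ip_per_start) * lg_ra9) / 9.0
    # The pitcher's team is assumed to score at the league-average rate.
    wpct = pythagenpat_wpct(lg_ra9, game_ra)
    return (wpct - repl_wpct) * starts


if __name__ == "__main__":
    print(round(starter_war(fip_ra9=2.5, ip=200.0, starts=28, lg_ra9=4.0), 1))
```

A league-average starter comes out at a .500 winning percentage by construction, so his value is just the gap to the replacement level times his starts.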

I also played around a bit with how to separate starter and reliever roles, but settled on simply checking whether a pitcher started more than 50% of the time and tagging him with an SP or RP role accordingly. Some pitchers' seasons are more or less mixed, like Mamoru Kishida's 2007 season, when he started the season as a long reliever, logged 28 outings, and was then converted to a starter and started 11 games, or that of 2011 Rookie of the Year submariner Kazuhisa Makita, who opened the season as a starter and started 10 games, and then, because of a sluggish Lions bullpen, served as the closer after interleague play and made 45 relief appearances, including 22 saves. However, those are a few exceptions, and most pitchers who can't be classified easily are not ones we are interested in. So I took this shortcut.
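The role rule is simple enough to write down directly. A minimal sketch (the function name is made up for illustration):

```python
def tag_role(starts, relief_apps):
    """Tag a pitcher-season as SP if he started more than 50% of his games, else RP."""
    games = starts + relief_apps
    if games == 0:
        raise ValueError("pitcher has no appearances")
    return "SP" if starts / games > 0.5 else "RP"


if __name__ == "__main__":
    print(tag_role(11, 28))   # Kishida 2007: 11 starts, 28 relief outings
    print(tag_role(10, 45))   # Makita 2011: 10 starts, 45 relief appearances
```

Applied as written, both mixed seasons mentioned above fall on the RP side of the line, which shows how blunt the cutoff is.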

Next, on to the replacement values. I took each team's top 8 players (in the Central League, which has no DH) or top 9 players (in the Pacific League, where the DH is used) by plate appearances in a season, set them aside as regulars, and tagged the rest as replacement guys. Then I compared the replacement group's park- and league-adjusted linear weights to league average and found that these scrubs contribute 33% less to their teams than an average group of players does, which is about 18 runs per 500 plate appearances. I found almost no difference between the two leagues.
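In code, the regular/replacement split looks roughly like this. It is a sketch with made-up toy numbers; the field names `team`, `pa`, and `raa` (park- and league-adjusted runs above average) are hypothetical, not the actual dataset layout.

```python
from collections import defaultdict


def replacement_runs_per_500pa(players, n_regulars):
    """Runs above average per 500 PA for everyone outside each team's top n_regulars by PA."""
    by_team = defaultdict(list)
    for p in players:
        by_team[p["team"]].append(p)

    repl_pa = 0
    repl_raa = 0.0
    for roster in by_team.values():
        # Regulars are the n_regulars players with the most plate appearances.
        ranked = sorted(roster, key=lambda p: p["pa"], reverse=True)
        for p in ranked[n_regulars:]:
            repl_pa += p["pa"]
            repl_raa += p["raa"]
    return repl_raa / repl_pa * 500.0


if __name__ == "__main__":
    # Toy data: two teams, two "regulars" each, purely to show the mechanics.
    toy = [
        {"team": "A", "pa": 600, "raa": 20.0},
        {"team": "A", "pa": 500, "raa": 0.0},
        {"team": "A", "pa": 100, "raa": -4.0},
        {"team": "B", "pa": 550, "raa": 5.0},
        {"team": "B", "pa": 450, "raa": -2.0},
        {"team": "B", "pa": 200, "raa": -6.0},
    ]
    print(round(replacement_runs_per_500pa(toy, n_regulars=2), 1))
```

With real data you would call it once per league and season, with `n_regulars=8` or `9` depending on the DH rule.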

Replacement-level pitchers go through the same procedure. I took the top 6 starters by innings pitched per team and season and considered the rest as replacement. Likewise for relievers. Replacement starters have a .380 winning percentage if they pitch in the Central League and .360 in the Pacific League, while scrap-heap relievers' winning percentages are .460 in the CL and .430 in the PL. The Pacific League is widely considered to be the better league (since the introduction of interleague play, the PL has won 52.3% of the time, though that's based on only 1,152 games, which means the luck effect is still relatively large at 1.5% for one SD), so this gap may reflect some of the difference seen in that track record.
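The 1.5% figure is just the binomial standard deviation of a winning percentage over 1,152 games between evenly matched sides. A quick check (treating every game as an independent coin flip and ignoring ties, which is a simplification):

```python
import math


def luck_sd(games, p=0.5):
    """One standard deviation of winning percentage from luck alone over `games` games."""
    return math.sqrt(p * (1.0 - p) / games)


if __name__ == "__main__":
    sd = luck_sd(1152)
    # How many luck SDs the PL's 52.3% interleague record sits above .500.
    z = (0.523 - 0.5) / sd
    print(round(sd, 4), round(z, 2))
```

So the observed PL edge is only about one and a half luck SDs away from an even split.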

You may think that this method is flawed in that it fails to screen out regular players who struggled with injuries and hence couldn't play a full year, or hitters who just missed the cutoff but were actually pinch-hitters rather than replacement-level players. Yes, you're right. This method crammed 2010 Josh Whitesell, who signed with the Yakult Swallows during the season and got only 268 plate appearances the rest of the way but was very productive in that short time with a slash line of .309/.399/.591, good for a .423 wOBA, into the scrub community. And when I tried excluding one more player per team from this class and recalculated the baseline, I did find that the players who had barely missed the cut in the first calculation hit better than the rest of the replacement class. However, setting the replacement baseline by playing time also risks misclassification in the other direction, because of managers' and coaches' mismanagement, like the continued use of a mediocre player who had a good first month solely by good fortune, or of a player who was struggling with injury and didn't actually deserve to play at the top level. And part of the reason some players gained so much playing time is that they benefited from good luck in the first place. That would offset some of it.

The last explanation covers park factors. Park factors were calculated by weighting the target year by 3, the two adjacent years by 2, and other years by 1, though afterward I thought this put too much weight on recent years, and I've forgotten why I chose such steep weighting. Then I regressed the figure by an amount depending on how many years of data that specific park had, and simply halved it to get the adjustment actually applied. Raw park figures were based solely on runs scored, runs allowed, and games played, which means they are biased against pitchers on better teams (like the Dragons and Giants). I regret letting this bias in, though its impact is minor. Once I computed the "applied" park factors and averaged them, the result wasn't 1. That probably comes from the fact that NPB teams sometimes host games at rural stadiums, which are usually small and push up run production. I hadn't noticed this phenomenon, though it also has little influence. The standard deviation of "applied" park factors across the 12 parks is about 4.7, and once I correct the problems mentioned above it would be around 3.5. I considered redoing the park calculations, but that would require a complete overhaul and cost a lot of time, so I decided to leave it for next time.
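A sketch of that park factor recipe, under stated assumptions: the home/road split in `raw_park_factor`, the fixed `regression` fraction (the real amount depended on how many years of data a park had and isn't spelled out here), and reading "halved" as halving the distance from 1.0 are all my illustration choices, and every name below is made up.

```python
def raw_park_factor(home_rs, home_ra, home_g, road_rs, road_ra, road_g):
    """Runs per game (both teams) at home divided by the same figure on the road."""
    return ((home_rs + home_ra) / home_g) / ((road_rs + road_ra) / road_g)


def year_weight(year, target_year):
    """3 for the target year, 2 for the adjacent years, 1 for all other years."""
    gap = abs(year - target_year)
    if gap == 0:
        return 3
    if gap == 1:
        return 2
    return 1


def applied_park_factor(raw_by_year, target_year, regression=0.2):
    """Weighted multi-year park factor, regressed toward 1.0, then halved."""
    total_weight = sum(year_weight(y, target_year) for y in raw_by_year)
    weighted = sum(year_weight(y, target_year) * pf
                   for y, pf in raw_by_year.items()) / total_weight
    # Hypothetical regression: shrink the deviation from 1.0 by a fixed fraction.
    regressed = 1.0 + (weighted - 1.0) * (1.0 - regression)
    # Halve the deviation, since a team plays only about half its games at home.
    return 1.0 + (regressed - 1.0) / 2.0


if __name__ == "__main__":
    raw = {2008: 1.00, 2010: 1.20, 2011: 1.10}
    print(round(applied_park_factor(raw, target_year=2011), 3))
```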

Finally, let's get to the most fun stuff. Here are the top 10 leaderboards for pitchers and batters (batting and replacement only) over these 6 years.

Player              IP      FIP-  WAR
Yu Darvish          1174.0  73    34.6
Masahiro Tanaka     930.0   76    25.2
Toshiya Sugiuchi    1071.3  83    24.8
Hideaki Wakui       1150.3  92    21.8
Yoshihisa Naruse    949.3   86    20.2
Hisashi Iwakuma     819.3   82    19.2
Tetsuya Utsumi      1079.3  90    18.7
Tsuyoshi Wada       945.6   90    18.5
Chihiro Kaneko      808.3   89    16.0
Daisuke Miura       932.3   93    15.2

Player               Pos  PA    wOBA  Index  WAR
Kazuhiro Wada        LF   3390  .387  152    35.0
Michihiro Ogasawara  3B   3270  .397  155    34.7
Norichika Aoki       CF   3766  .386  139    33.0
Alex Ramirez         LF   3593  .382  141    32.0
Alex Cabrera         1B   3088  .393  151    31.8
Hiroyuki Nakajima    SS   3468  .383  141    31.6
Atsunori Inaba       RF   3306  .374  141    30.0
Takashi Toritani     SS   3714  .358  131    28.9
Shinnosuke Abe       C    3029  .384  146    28.7
Takeya Nakamura      3B   2718  .395  152    28.3

Everyone would immediately disregard my calculations if Yu Darvish didn't show up as the best pitcher. Yes, he led all pitchers, and in fact his figure is by far the best among all pitchers over this short period. Second-best Masahiro Tanaka is arguably the most outstanding pitcher now that Darvish is gone, and the next candidate to climb a Major League mound. Kyuji Fujikawa, whom you may also be familiar with, exceeded all other relievers by a huge amount (14.2 WAR, 6.3 more than runner-up Takuya Asao among relievers) and ranks 13th even when mixed in with starters. Since he finally earns international free agent rights this year, he's expected to take the mound in MLB in 2013. Hisashi Iwakuma is another terrific starter, who finally came over after his posting negotiations with the Oakland Athletics broke down two years ago. He has always struggled with injuries, but while healthy, I believe he can pitch as well as an average Major League starter and bring a good return to the Mariners.

On the batters' side, Kazuhiro Wada edged out 2nd-place Michihiro Ogasawara by a small margin, though considering these two players' defensive positions and/or reputations and ages, their total figures once fielding values are included would be less than Yu Darvish's. In contrast, Norichika Aoki played mainly in center field and has been regarded as at least average there for his whole career, and thus might have gained the most wins in the NulData era.

Here are the 10 best seasons for pitchers and batters, respectively.

Player             Team      Year  IP     K%     BB%   BABIP  ERA   FIP   FIP-  WAR
Yu Darvish         Fighters  2011  232.0  31.2%  4.7%  .269   1.44  1.25  53    9.1
Masahiro Tanaka    Eagles    2011  226.3  27.8%  3.7%  .279   1.27  1.56  58    8.3
Yu Darvish         Fighters  2010  202.0  27.6%  6.7%  .292   1.78  2.14  63    7.0
Hisashi Iwakuma    Eagles    2008  201.6  20.2%  5.0%  .270   1.87  2.34  63    6.5
Daisuke Matsuzaka  Lions     2006  186.3  27.7%  5.1%  .265   2.13  2.25  64    6.0
Yu Darvish         Fighters  2008  200.6  27.2%  6.9%  .254   1.88  2.57  73    5.9
Kazumi Saito       Hawks     2006  201.0  26.0%  6.5%  .263   1.75  2.26  72    5.8
Yu Darvish         Fighters  2007  207.6  26.6%  7.7%  .224   1.82  2.39  75    5.7
Toshiya Sugiuchi   Hawks     2008  196.0  27.5%  4.8%  .288   2.66  2.53  72    5.7
Seth Greisinger    Swallows  2007  209.0  19.1%  4.0%  .273   2.84  2.81  69    5.6

Player               Team       Year  Pos  PA   AVG   OBP   SLG   BABIP  wOBA  WAR
Takeya Nakamura      Lions      2011  3B   622  .269  .373  .600  .266   .433  10.6
Kosuke Fukudome      Dragons    2006  RF   578  .351  .438  .653  .382   .466  10.1
Tyrone Woods         Dragons    2006  1B   614  .310  .402  .635  .347   .440  9.3
Kazuhiro Wada        Dragons    2010  LF   602  .339  .437  .624  .338   .451  8.6
Yoshio Itoi          Fighters   2011  CF   578  .319  .411  .448  .373   .401  8.5
Tuffy Rhodes         Buffaloes  2007  DH   554  .291  .403  .603  .336   .434  8.2
Nobuhiko Matsunaka   Hawks      2006  LF   559  .324  .453  .528  .319   .433  8.2
Shuichi Murata       BayStars   2008  3B   554  .323  .397  .665  .336   .457  8.1
Lee Seung-Yeop       Giants     2006  1B   592  .323  .389  .615  .352   .427  7.9
Michihiro Ogasawara  Fighters   2006  1B   579  .313  .397  .572  .318   .422  7.9

The single-season WAR leaderboard features Takeya Nakamura's 2011 season as the best of them all. He put up a .433 wOBA in 622 PA with a league-leading 48 HR, which beat 2nd-place Nobuhiro Matsuda of the SoftBank Hawks by 23(!!!); even all the Chiba Lotte Marines players combined fell 2 short of him. Also worth noting is Kosuke Fukudome's 2006 season, in which he batted .351/.438/.653 for a .466 wOBA in a Petco-like pitcher's park (Nagoya Dome) when the league average was .325. On the pitchers' side, Yu Darvish appears 4 times in the top 10 over the past 6 seasons. No surprise here.

Aggregated team WAR (fielding excluded, again) picks the 2011 Japan Series champion SoftBank Hawks as the best team at 59.2 WAR. Their .657 winning percentage got them 88 wins and the Pacific League pennant, 17.5 games ahead of the runner-up Nippon Ham Fighters. The Yomiuri Giants are the only other team to join the 50 WAR club, and they did it twice (2009 and 2007, with 52.1 and 50.4 WAR respectively, winning the pennant both years). On the flip side, the perennial cellar-dwelling Yokohama BayStars account for 5 of the 9 worst team seasons, and in 2009 and 2010 they gained only 17.3 and 17.1 WAR respectively.

Even without fielding, baserunning, and whatever else, the correlation coefficient between aggregated team WAR and actual wins is .79, and I'm comfortable with this result. For your information, the correlation coefficient between aggregated team WAR and actual wins in MLB, according to FanGraphs, is .89, and with fielding and baserunning excluded it comes to .82.

Average aggregated team WAR by season is 38, and the standard deviation between teams is 8.5. (By the way, the standard deviation of actual team wins is 10.3, so fielding variation would be somewhere between 30 and 40 runs.)

These calculations are far from perfect and leave many issues to be solved. Park calculations should be treated more carefully, and the same is true of other factors like the starter/reliever split. The next time I do this (maybe next year), I hope more data will have come into my hands. Until then, I hope this serves as a starting point for research and analysis on NPB.