This is a bit of a different kind of article. Usually I take existing stats and data from the sumo world and try to draw conclusions from them. Today I am going to talk about stats that, as far as I’m aware, are not tracked in sumo, but that I think would be useful for fans of the sport and even the wrestlers themselves.
In fact, a great deal of the work on here is normally descriptive. That is what it sounds like; the statistics describe what has already happened, like who ends up Yokozuna. Conversely, many of the measures I propose here could (potentially) be prescriptive or normative - meaning they yield actionable advice about what you should do. To give a preview, I think that knowing how quickly a given wrestler comes out of the tachiai (the initial charge, launched once both wrestlers have touched their hands to the ground) could be incredibly useful for wrestlers. For instance, if I knew that one wrestler can get out of his stance quickly but also often has slow starts, then as a coach I’d have normative data to work with; the data points directly to a recommendation.
Some of these are doable if you had videos of all the matches and some dedication. Others are probably less feasible or would require the kinds of changes to the sport (having wrestlers wear tracking devices, for instance) that probably won’t come about from a humble blogger’s suggestions. But if for some reason a heya is reading a niche English sumo blog and has access to historic sumo videos, please reach out at OzekiAnalytics@gmail.com. And if you’re curious about why I included PFF in the title, it’s because Pro Football Focus (PFF) is a company that pioneered a similar statistics-based approach to football.
The stats are in no particular order, but for each one I include the following: a basic description, how it could be obtained, and its potential use. If I think of more, I’ll potentially do another round of this.
New Sumo Stat 1 - Wingspan
Wingspan is what it sounds like; spread your arms flat from the shoulders and measure tip to tip from one hand to the other.
I’m actually quite surprised this isn’t a more standard measurement. In combat sports like boxing and MMA, it’s very commonly included along with various other key stats like height and weight when introducing the fighters. You would probably need the sport as a whole to get behind it and release the numbers; otherwise each heya (sumo stable) would keep the data to itself.
Wingspan is obviously important for a sport like sumo wrestling where you’re trying to push your opponent with your arms. Imagine a match-up between two rikishi of similar rank and size, but one has a 3-inch reach advantage. If he tends to win matches with tsuppari-style pushing, that wingspan suddenly becomes a more predictive feature. Wingspan is usually correlated with height, but you’ll also have the occasional guy with disproportionately long arms, which only benefits him (Floyd Mayweather had long arms). Conversely, some wrestlers on the taller end might have shorter arms than it seems.
That said, I do wonder if it’s a bit like Judo. I’m an amateur (retired) practitioner, but from my recollection and knowledge of the sport, I do believe there’s a bit of ambivalence about arm length. Some throws are easier for the longer limbed, while others are much harder to execute unless you have short arms. For this alone I’d love the wingspan data, ideally going back into history, to ascertain if there is indeed an ideal arm length for sumo wrestlers. Finally, it would be helpful for figuring out how wrestlers should focus their development - your arm length can’t be changed, but you can change your style to maximize your effectiveness with your physical gifts.
New Sumo Stat 2 - Grip and Arm Placement Frequency
Another relatively straightforward one, this could probably be done with simple video review [1]. We often hear commentators discuss one wrestler getting his preferred grip on the Mawashi (belt), or an underarm grip for instance. By looking at all of a given wrestler’s matches you could begin to classify not only how well he does when he gets certain grips on his opponents, but also how well he does when his opponents have him in certain grips.
This would be valuable data for determining which skills a wrestler needs to practice most. If only rikishi at Ozeki and above can get a grip on me or get to my belt, then focusing on that kind of grappling might be a misallocation of practice time and resources. Conversely, if a wrestler has particular trouble dealing with an opponent having a certain grip on him, it would be quite valuable to practice handling that situation. Wrestlers likely have some idea of this already, but having a stat allows you to directly quantify those weaknesses and strengths.
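To make the tallying concrete, here is a minimal Python sketch of how hand-charted bouts could be turned into win rates by grip. The bout records, the grip labels, and the win_rate_by helper are all invented for illustration; real charting would first need agreed definitions of each grip.

```python
from collections import defaultdict

# Hypothetical hand-charted bouts for one wrestler:
# (his own grip, the opponent's grip, did he win?).
# Labels and results are invented purely for illustration.
bouts = [
    ("migi-yotsu", "none", True),
    ("migi-yotsu", "none", True),
    ("migi-yotsu", "hidari-yotsu", False),
    ("none", "hidari-yotsu", False),
    ("none", "none", True),
    ("none", "hidari-yotsu", True),
]


def win_rate_by(bouts, index):
    """Group bouts by the field at `index`; return {label: (wins, total, rate)}."""
    tally = defaultdict(lambda: [0, 0])
    for bout in bouts:
        label, won = bout[index], bout[2]
        tally[label][0] += int(won)
        tally[label][1] += 1
    return {label: (w, n, w / n) for label, (w, n) in tally.items()}


own_grip = win_rate_by(bouts, 0)   # how he does when HE gets a grip
opp_grip = win_rate_by(bouts, 1)   # how he does when the OPPONENT gets a grip

for label, (wins, total, rate) in opp_grip.items():
    print(f"opponent grip {label}: {wins}/{total} ({rate:.0%})")
```

In this toy data the wrestler wins only 1 of 3 bouts when the opponent gets hidari-yotsu, which is the kind of weakness a coach might choose to drill. With real data you would want far more bouts per grip before trusting any of these rates.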
New Sumo Stat 3 - Tachiai Reaction Time
The tachiai’s Japanese characters can be more or less directly translated as “Stand Meet.” It’s when both wrestlers have touched their hands to the ground and then explode to meet each other. I would love to know how quickly they explode out and which wrestler is quicker.
This stat would actually be quite difficult to obtain, but also incredibly useful.
The problem here is quantifying when wrestlers actually “go.” I have two potential ideas, neither of which is easy.
First and more feasible: having high FPS (frames per second) footage from the same vantage point for every match. It’s still likely imperfect, but with an algorithm you could potentially identify the frame on which each wrestler begins moving. It is important to have the same vantage point for all of these, as otherwise the camera angle could interfere with the measurement and make the times non-standard.
Second: sensors in the Mawashi. With technological advancements, I think this is likely feasible but also a bit of a pipe dream at the same time. As we know, when the wrestlers make a big movement at the tachiai to start the match, they’re generating a ton of force and quite quickly. I think you could configure a wearable motion tracker to detect that accurately, but I very much doubt that the Sumo Association will be implementing something like this any time soon.
Assuming we did obtain this data and it was reliable, it’d be useful. I often compare sumo to other sports, and the tachiai is probably the one instance in which these large violent men are most comparable to the lithe drivers of Formula 1. In both cases it’s all about reaction.
Also like F1, smarter wrestlers would likely be able to take advantage of knowing their opponents’ tendencies. In an F1 restart (after the race has been slowed down due to debris on the track), the leader can gain an advantage by tactically taking off and resuming the race when least expected.
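As a rough illustration of the video idea, here is a minimal Python sketch. It assumes you already have a per-frame "motion score" for each wrestler (for instance the average pixel change inside his half of the frame), starting from the frame where both have their hands down. The scores, the 240 FPS figure, the threshold, and the first_motion_frame helper are all assumptions of mine, not a tested method.

```python
def first_motion_frame(motion, threshold, min_run=3):
    """Return the index of the first frame that starts `min_run` consecutive
    frames with a motion score above `threshold`, or None if there is none.
    Requiring a run of frames ignores one-frame blips such as a twitch."""
    run = 0
    for i, score in enumerate(motion):
        run = run + 1 if score > threshold else 0
        if run == min_run:
            return i - min_run + 1
    return None


FPS = 240  # assumed camera frame rate

# Hypothetical motion scores per frame; frame 0 = both wrestlers' hands down.
east = [0.1, 0.2, 1.3, 0.1, 0.1, 1.2, 1.5, 2.0, 2.4]  # note the blip at frame 2
west = [0.1, 0.1, 0.2, 0.1, 0.2, 0.1, 1.1, 1.8, 2.2]

east_start = first_motion_frame(east, threshold=1.0)
west_start = first_motion_frame(west, threshold=1.0)

# Positive gap means east moved first.
gap_ms = (west_start - east_start) / FPS * 1000
print(f"east frame {east_start}, west frame {west_start}, gap {gap_ms:.1f} ms")
```

The resolution of the measurement is limited by the frame rate: at 240 FPS each frame is about 4 ms, so differences smaller than that cannot be seen. The threshold would also need tuning per camera setup, which is one more reason to keep the vantage point fixed.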
Furthermore, beyond knowing if reaction times are quick or slow in general it would also be useful to know if they’re consistent. Some wrestlers likely have a quick tachiai, but only occasionally. Coaching up that consistency would be valuable.
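One simple way to put a number on consistency is to look at the spread of a wrestler’s reaction times alongside the average. Here is a small Python sketch using the standard deviation; the two wrestlers and their times (in milliseconds) are made up for illustration.

```python
import statistics

# Hypothetical tachiai reaction times in milliseconds (invented numbers).
reaction_ms = {
    "Wrestler A": [180, 185, 190, 182, 188],  # steady
    "Wrestler B": [150, 230, 160, 240, 155],  # sometimes very quick, sometimes slow
}

summary = {
    name: (statistics.mean(times), statistics.stdev(times))
    for name, times in reaction_ms.items()
}

for name, (mean_ms, spread_ms) in summary.items():
    print(f"{name}: mean {mean_ms:.0f} ms, std dev {spread_ms:.1f} ms")
```

In this made-up example the two averages are nearly identical, but Wrestler B’s spread is roughly ten times larger. The average alone would hide exactly the thing a coach would want to work on.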
New Sumo Stat 4 - First Step: Stride Length, Time
This is in certain ways a continuation of the stat above. Knowing how far the wrestler’s initial stride goes, along with how long it takes him to plant the foot on that first step, would be useful and usable information.
I’ll be honest: here’s where I think it could be tougher to implement in a reliable way. For this one you might need a multi-camera, high FPS (frames per second) system like the Hawk-Eye setup used in tennis.
The rewards would be worth it for wrestlers and coaches with access to this data, however. They would get more value out of it than the audience at home because, unlike us, they would know each wrestler’s game plan. Depending on the strategy for a match, the steps could vary greatly. In a henka, for instance, the wrestler is jumping to the side and hoping the other wrestler overcommitted to going straight. That case might be obvious, but even the choice between going for a belt grip and going for a push could change the footwork and how long (or short) a step the wrestler aims for.
Even still, knowing that time to first step would be useful for us viewers too. In running they say that you can change stride length and you can change stride frequency, and that’s all you have to work with. In a sport where one wrong step can mean a loss, more info about the first step could only enrich our understanding.
Conclusion
I joked about consulting for a sumo stable up top, but in writing this piece I tried to think like a data scientist, or a statistician, or a member of the analytics department for a professional sports team. I tried to keep the discussion relatively grounded, but I think there are some valuable lessons in there that are widely applicable.
Knowing how the data is obtained is incredibly important. In fact, sometimes data will have been poorly collected. At PFF they’ll often have multiple employees watch the same plays and chart them for whatever stats they’re trying to collect, to ensure that folks are grading them the same way. Election pollsters are always assessing whether the people they ask questions are representative - thinking about where data comes from and how reliable it is has only become more important.
There will likely be a part two of this sooner or later, but if you want to try and get a little smarter and think more like a data scientist or hedge fund manager, then I’d encourage you to think about what other stats might help. To frame it, think of the Target as who will win a given sumo match. Each stat you collect is a Feature. Let’s assume there’s a working Elo (the chess rating system), so your goal is to come up with a feature that improves the algorithm predicting the Target. Or in plain English: what’s a stat that doesn’t exist that you think would allow you to better predict who wins? For example, I think that adding wingspan as a feature to my model would allow me to more accurately predict the target, which is who will win a given match.
Extra Credit: sumo match outcomes are best modeled as a binary. In other words, the wrestler either wins or loses; two choices - binary. In computer science and stats this will often be portrayed as a 0 or a 1 (with 1 representing a win and 0 a loss, or vice versa). This is actually a bit problematic because a common method - linear regression - is a poor fit: it treats the outcome as a continuous number, so its predictions can fall below 0 or above 1. When you’re modeling a binary outcome you’ll typically use a logistic regression instead. Much of the terminology and even lots of the math is similar (but not the same), except that the result you get back will always be between 0 and 1 and can be read as a probability. So key takeaway: linear regressions are used for predicting quantities, like how much money a company will earn, whereas logistic regressions are used for estimating probabilities, like the percent chance that the company hits its predicted earnings. Going forward I think I’ll put more math/in-depth stats stuff at the end under this Extra Credit section so it can hopefully be educational but also optional and not overwhelming.
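One common way to check whether two people are charting the same thing is Cohen’s kappa, which measures how much they agree beyond what you’d expect by chance. I don’t know what PFF actually uses internally, so treat this as my own illustrative Python sketch; the grip labels and the two charters’ gradings are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.
    1.0 = perfect agreement, 0.0 = no better than chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Two hypothetical charters grading the same ten bouts (invented data).
charter_1 = ["belt", "belt", "push", "push", "none",
             "belt", "push", "belt", "none", "push"]
charter_2 = ["belt", "belt", "push", "belt", "none",
             "belt", "push", "push", "none", "push"]

kappa = cohens_kappa(charter_1, charter_2)
print(f"kappa = {kappa:.4f}")
```

Here the two charters agree on 8 of 10 bouts, but because some of that agreement would happen by chance, kappa comes out lower than 0.8. If the number were low, that would be a sign the grip definitions need tightening before anyone trusts the stat.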
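For anyone who hasn’t seen it, this is what the Elo baseline looks like in Python. It uses the standard chess formula with a 400-point scale; whether a sumo Elo should use the same scale is an assumption on my part, and the ratings below are made up.

```python
def elo_win_probability(rating_a, rating_b, scale=400):
    """Standard Elo expected score: the chance that A beats B.
    The 400-point scale is the chess convention, assumed here for sumo."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / scale))


# Hypothetical ratings for two rikishi.
p_favorite = elo_win_probability(1600, 1500)
p_underdog = elo_win_probability(1500, 1600)
p_even = elo_win_probability(1500, 1500)

print(f"favorite: {p_favorite:.3f}, underdog: {p_underdog:.3f}, even: {p_even:.1f}")
```

A 100-point edge works out to roughly a 64% chance of winning. The exercise is then to ask whether a new feature, such as a wingspan difference, lets you predict better than this rating gap does on its own.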
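Here is a tiny Python sketch of the difference. Both functions turn a rating gap into a prediction, but only the logistic one stays between 0 and 1. The slope and coefficient are numbers I picked by hand for illustration; they are not fitted to any real sumo data.

```python
import math


def linear_prediction(elo_diff, slope=0.002):
    """A straight line through 0.5; nothing stops it leaving the 0-1 range."""
    return 0.5 + slope * elo_diff


def logistic_prediction(elo_diff, coef=0.005):
    """Logistic (sigmoid) curve; the output is always between 0 and 1."""
    return 1 / (1 + math.exp(-coef * elo_diff))


# Hypothetical rating gaps between two wrestlers.
for diff in (-400, 0, 100, 400):
    print(f"gap {diff:+4d}: linear {linear_prediction(diff):.2f}, "
          f"logistic {logistic_prediction(diff):.2f}")
```

With a 400-point gap the straight line predicts a "probability" above 1, which is meaningless, while the logistic curve gives about 0.88. In practice you would fit the coefficient from bout results with a stats library rather than choosing it by hand.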
[1] The problem is that you do need some sort of standardized definition, and if you have multiple people, you need to ensure that they’re grading these grips and arm placements in relative uniformity.
For tachiai response time, I swear I've seen Dosukoi Sumo Salon on NHK World have something like that where they timed it. I assume they used video analysis. The force, though... in some cases, I think they put on sensors, but in others, I believe they used estimates of impact from estimated speed, measured weight, etc.
With some of the new AI tools out there, we could probably capture some of your proposed stats from already-extant videos, using some reference stats, like the rikishi's height (and we can correct the bullshit Midorifuji height in the JSA record).
And yeah, I'd use LOGIT or something for the model, but one of the things I particularly like tracking are the kimarite, of course.
I often wonder about tachiais and the advantages and disadvantages of putting fists to the clay first. I notice some rikishi vary on this, but others--like Hoshoryu--insist on being the last one to touch down, while Onosato touches down first and waits. If you're ready to go and are focused on your opponent, you might get some insight into what they're going to do, perhaps by detecting signs of a henka, but if you're the one to touch down last and start the bout, you are able to start before the other realizes it.
Most people don't know that there is a significant delay between when something happens and when we perceive it. Our brains do a very good job hiding this from us. In effect, our perceptions are largely based on predictions and expectations that are corrected after the fact. I wrote about this on my blog (https://www.awondrousworld.com/2024/07/constructing-our-world-what-is-real-13.html) in the third part of this page. This would give Hoshoryu an advantage by giving him a head start and it also accounts for the occasional false starts. Hakuho also insisted on touching down last. On the other hand, and a bit less likely, if you can pick up signals by closely watching your opponent prepare to launch, there might be an advantage in that. Onosato is huge and can afford to start late.
I've tried to see whether there's any custom or unofficial rule as to who touches down last. I thought it might be by rank, with the higher ranked rikishi touching down last, but I haven't noticed any pattern to it.
Mary's memory is right. That was on Dosukoi Sumo Salon. I had it in my queue to watch and so I pulled it out and watched it. They found that Hakuho was the fastest off the tachiai and in the videos they showed I noticed that he touched down last, so that gave him an advantage. But as for speed and hip height, Kisenosato said he was second to last, yet he still became yokozuna. He was always making minute adjustments to his tachiai trying to find the ideal method. There are a lot of contributing factors and variables. Dosukoi also looked at impact force, psychological intimidation, and how long the first and second steps are. Unfortunately, my copy is missing the final 13 minutes.
They said that 80% of bouts are decided by the tachiai, but that sure doesn't seem right to me. A lot of bouts look like they could go either way right up to the end.
I'm definitely interested in finding out more on this and it's worth examining.