The Submarine Lane Is Real, But Narrow: A 2026 Field Check on Kings of War Scoring Systems

Steve Tonneau’s scoring-system work gives tournament organizers better questions. The 2026 event data shows why those questions matter.

Contents

TL;DR
What Steve’s Article Gets Right
What I Could Actually Test
The Submarine Lane
The Comeback Examples
Strength of Schedule Is the Real Story
Swiss Pairing Does Improve Game Balance
What This Does Not Prove
So What Should Tournament Organizers Take From This?

A good tournament scoring article does not start with “my favorite system is best.” It starts with a better question: What are we trying to reward? That is what I appreciated most about Steve Tonneau’s recent work on Kings of War scoring systems. His article compares Northern Kings, Blackjack, WDL+SOS, pool variants, and hybrid systems using simulation. But the most valuable part is not the final recommendation. It is the framework. Steve asks tournament organizers to name the thing they care about.

Do you want the player with the most wins to finish highest?
Do you want the strongest player to win most often?
Do you want to reduce submarining?
Do you want better game balance across the room?
Do you want strength of schedule to matter?

Once you define the goal, the scoring-system debate gets much clearer.

TL;DR

Steve’s simulations point to several real tournament dynamics. I looked at six usable 2026 post-event reports to see whether those dynamics show up in actual event data. They do. Across Adepticon, Aussie Clash of Kings, Bugeater, Hoosier Storm, Northwoods GT, and Shroud of the Saint:

181 player finishes were reviewed.
466 matched games were joined to Elo and ranking data.
84 players lost in round 1.
18 of those round-1 losers finished with a positive record.
3 reached 4+ wins.
6 finished in the top quarter.
0 finished in the top 10%.

The headline finding: the submarine lane is real, but narrow. Early-loss comebacks happen, and they are part of the tournament ecosystem. But in this 2026 sample, they were not rampant, and they did not produce top-10% finishes.

The sharper finding is schedule strength. Round-1 winners averaged a .545 opponent win percentage; round-1 losers averaged .445. Among one-loss players, those whose only loss came in rounds 1–2 averaged a .462 opponent win percentage, while those whose only loss came in round 3 or later averaged .612.

What Steve’s Article Gets Right

The best part of Steve’s work is that it does not treat scoring systems as vibes. It treats them as tools, and tools have tradeoffs. Blackjack, Northern Kings, WDL+SOS, and pool systems are not just different ways to add numbers. They create different incentives. They reward different outcomes. They answer different questions.

Instead of asking which system feels best, he asks what each system does under repeated tournament conditions. His simulation framework lets him test things that are hard to isolate in real life: submarining, opponent strength, game balance, ranking accuracy, and whether wins drive final standings. So I wanted to take those questions back into real post-event data.

What I Could Actually Test

I gather a lot of tournament data, but the 2026 post-event reports I reviewed do not let me fully rescore every event under every system. The usable reports had round-by-round results, final placement, Elo information, opponent win percentage, and strength-of-schedule-style measures. They generally did not have complete scoring-margin detail in a form that would let me accurately convert every event into Blackjack, Northern Kings, Bullshroud, Bartshroud, and WDL+SOS.

So this is not a full scoring-system shootout, but it is a field check. The question is narrower: Do the structural effects Steve measured in simulation show up in real 2026 tournament data? In three areas, the answer appears to be yes:

Early-loss comeback paths exist.
Strength of schedule matters.
Swiss pairing improves game balance over time.

The Submarine Lane

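In tournament language, a “submarine” is a player who loses or draws early, drops into a softer part of the field, then climbs back up while avoiding the hardest path through the top tables. The 2026 data lands in the middle. Across six events, 84 players lost in round 1:

Outcome	Players
Lost in round 1	84
Finished with a positive record	18
Reached 4+ wins	3
Finished in the top quarter	6
Finished in the top 10%	0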

That is a useful result because it cuts against both extremes. No, an early loss does not end your event. About one in five round-1 losers still finished with a positive record. But no, the data does not show a flood of players losing early and stealing the top tables. In this sample, no round-1 loser finished in the top 10%. So the submarine lane exists, but it is narrow.
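For readers who want to check the arithmetic behind "about one in five," here is a minimal sketch that turns the round-1-loser counts from the TL;DR into percentages. The counts come straight from this article; the variable and function names are my own, purely for illustration.

```python
# Counts reported in the TL;DR for players who lost in round 1.
round1_losers = 84
outcomes = {
    "positive record": 18,
    "4+ wins": 3,
    "top quarter": 6,
    "top 10%": 0,
}

def share(count, total):
    """Return count/total as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Share of round-1 losers reaching each outcome.
shares = {label: share(n, round1_losers) for label, n in outcomes.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}% of round-1 losers")
```

The positive-record share works out to roughly 21%, which is where the "one in five" figure comes from.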

The Comeback Examples

A few individual runs are worth calling out.

Player	Event	Path	Finish
Max Altman	Bugeater	LWWWW	7th of 44
Bob Sautbine	Adepticon	LWWWW	6th of 35
Adam Storey	Aussie COK	LLWWWW	14th of 52

A round-1 loss can still leave room for a good finish, a Masters point push, or a respectable placing. But at least in this dataset, it did not create a reliable path to the very top. That feels right.
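Path strings like the ones in the table are easy to work with in code. The sketch below is one hypothetical way to summarize them. It assumes each character is one round, with W for a win, L for a loss, and D for a draw (the D code is my assumption; none of the three paths above contains one), and it treats "positive record" as more wins than losses, which is also an assumption.

```python
def summarize_path(path):
    """Summarize a result string such as 'LWWWW' (one character per round)."""
    wins = path.count("W")
    losses = path.count("L")
    draws = path.count("D")  # assumed draw code, not taken from the reports
    # 1-based round of the first loss, or None if the player never lost.
    first_loss_round = path.index("L") + 1 if "L" in path else None
    return {
        "wins": wins,
        "losses": losses,
        "draws": draws,
        "first_loss_round": first_loss_round,
        "positive_record": wins > losses,  # assumed definition
    }

# The three comeback paths from the table above.
paths = {
    "Max Altman": "LWWWW",
    "Bob Sautbine": "LWWWW",
    "Adam Storey": "LLWWWW",
}
summaries = {name: summarize_path(p) for name, p in paths.items()}

for name, s in summaries.items():
    print(name, s)
```

All three players end on four wins after a round-1 loss, which is why they stand out in the sample.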

Most events are short. Five or six rounds is not much time. An early loss usually costs too much tempo to fully recover, especially if the undefeated and one-loss players keep winning. But it can still shape the middle and upper-middle of the standings. That is where SOS starts to matter.

Strength of Schedule Is the Real Story

The cleaner signal was not final placement but opponent strength. Round-1 winners and round-1 losers did not face similar paths afterward:

Group	Avg Opponent Win %
Round-1 winners	.545
Round-1 losers	.445

Once you lose early, Swiss pairings usually move you into a softer bracket. That is not a flaw in Swiss pairing but the design: Swiss is trying to create better games by pairing players with similar records. If you lose early, you are more likely to face other players who also lost early.

But that also means final records do not always carry the same meaning. A 4-1 record against a hard path is different from a 4-1 record against a softer path. Both players did the work in front of them, but the context is different. The one-loss data makes this even clearer:

One-loss players	Avg Opponent Win %
Only loss in rounds 1–2	.462
Only loss in round 3 or later	.612

That is the heart of the article and what I think Steve was showing us: among players with one loss, timing mattered. Players who lost early tended to face softer opposition overall. Players who stayed clean longer and lost later tended to face much stronger schedules. Strength of schedule is not just a tiebreaker for spreadsheet obsessives. It is a way to describe the shape of a player’s tournament:

Who did you beat?
When did you lose?
How hard was the path?
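To make "opponent win percentage" concrete, here is a minimal sketch of one common way to compute it: take each opponent's overall win rate and average those rates. The event, the player names, and the exact definition are all assumptions for illustration; the post-event reports may calculate it differently (for example, by excluding the game against you or by counting draws as half a win).

```python
from collections import defaultdict

# Hypothetical three-round, four-player event: (round, winner, loser).
games = [
    (1, "Ana", "Ben"), (1, "Cy", "Dee"),
    (2, "Ana", "Cy"), (2, "Ben", "Dee"),
    (3, "Ana", "Dee"), (3, "Cy", "Ben"),
]

wins = defaultdict(int)
played = defaultdict(int)
opponents = defaultdict(list)
for _, winner, loser in games:
    wins[winner] += 1
    played[winner] += 1
    played[loser] += 1
    opponents[winner].append(loser)
    opponents[loser].append(winner)

# Each player's own win rate across all of their games.
win_pct = {p: wins[p] / played[p] for p in played}

def opponent_win_pct(player):
    """Average win rate of everyone this player faced."""
    opps = opponents[player]
    return sum(win_pct[o] for o in opps) / len(opps)

owp = {p: round(opponent_win_pct(p), 3) for p in played}

for player, value in sorted(owp.items()):
    print(f"{player}: {value:.3f}")
```

A four-player toy field is far too small to show the Swiss effect described above; it only shows the arithmetic behind the number.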

Swiss Pairing Does Improve Game Balance

The other positive finding is that Swiss pairing works. Using Elo gap as a rough proxy for expected game balance, average matchup distance dropped over the event.

That is exactly what we want to see. Round 1 is noisy. Pairings are less informed. Strong players can hit newer players. Middle-table players can get awkward draws. The room is unsorted. By round 5, the system knows more: Players have separated by results. The top tables are sharper. The middle tables are more clustered. The lower tables are more appropriate. Games generally tighten. That is a quiet win for Swiss pairing.
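The Elo-gap measure is simple enough to sketch: for every game, take the absolute difference between the two players' ratings, then average by round. The ratings below are invented for illustration and are not from the 2026 reports. The `expected_score` helper uses the standard Elo formula on a 400-point scale, which I am assuming here; the rating system behind the reports may be tuned differently.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pairings: (round, Elo of player A, Elo of player B).
pairings = [
    (1, 1720, 1380), (1, 1650, 1490), (1, 1540, 1510),
    (2, 1720, 1540), (2, 1650, 1510), (2, 1490, 1380),
    (3, 1720, 1650), (3, 1540, 1490), (3, 1510, 1380),
]

# Collect the absolute rating gap of every game, grouped by round.
gaps_by_round = defaultdict(list)
for rnd, elo_a, elo_b in pairings:
    gaps_by_round[rnd].append(abs(elo_a - elo_b))

avg_gap = {rnd: mean(gaps) for rnd, gaps in sorted(gaps_by_round.items())}

def expected_score(elo_a, elo_b):
    """Standard Elo expected score for player A against player B."""
    return 1 / (1 + 10 ** ((elo_b - elo_a) / 400))

for rnd, gap in avg_gap.items():
    print(f"Round {rnd}: average Elo gap {gap:.1f}")
```

A shrinking average gap means the typical game is closer to a coin flip on paper, which is what "games generally tighten" looks like in numbers.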

It also helps explain why scoring systems are so hard to evaluate. They are doing more than ranking players. They are shaping the experience of the room as the event unfolds. A system that produces a slightly more accurate final podium but worse games across the room may not be the right choice for every event. A system that is very fair on wins and SOS but less accurate at identifying the strongest player may be perfect for a local GT and less ideal for a Masters event. Different goals. Different tools.

What This Does Not Prove

This data does not prove that Northern Kings is better than Blackjack. It does not prove that WDL+SOS should be the default. It does not prove that pools are good or bad. The dataset does not support that level of claim yet. For now, the responsible conclusion is narrower:

The real-world data supports the conceptual importance of Steve’s metrics: Submarining is not imaginary. Strength of schedule is not cosmetic. Game balance changes over the course of an event. The timing of a loss matters. Those are all visible in actual 2026 tournament data.

So What Should Tournament Organizers Take From This?

I would not read Steve’s article, or this follow-up, as a command to immediately change your scoring pack. I would read them as a prompt to be more intentional. Before choosing a scoring system, ask:

Do I want wins to be the primary driver of standings?
Do I want margin of victory to matter?
Do I want to reward strength of schedule?
Do I care more about identifying the strongest player or giving the room balanced games?
Is this a Masters-style event, a local GT, or a large convention tournament?
How much complexity can my players and scorekeeper reasonably handle?

The perfect scoring system on paper may be the wrong system if nobody understands it, if it creates pairing delays, or if players leave the event feeling like the standings were opaque. There is an old Bill James idea from the 1987 Baseball Abstract that feels useful here. In his chapter on “Meaningless and Meaningful Statistics,” James argued that good statistics should be judged by three things: importance, reliability, and intelligibility. I think these matter for tournament scoring, too. A scoring system should measure something important. It should measure it consistently. And players should be able to understand what it is measuring.

That does not mean every system needs to be simple. Strength of schedule, Elo, margin, scenario points, and win-path logic can all add useful context. But if the final standings require a private explanation from the scorekeeper, something has gone wrong. Tournament scoring is not just math. It is communication. A good scoring system should let players look at the table and understand the basic story:

Who won more?
Who played the harder path?
Who performed better within the same record band?
Why did this player finish above that player?

That is the Bill James test applied to Kings of War. The metric has to matter, it has to work, and it has to make sense in plain English.

In short, Steve’s article provides a useful framework for evaluating tournament scoring systems by focusing on measurable tradeoffs rather than personal preference. The 2026 event data supports many of the dynamics he highlights, especially the impact of strength of schedule, the limited effect of submarining, and the way Swiss pairings improve game balance over time. I applaud him for digging into it and hope to see more work like this in the future!