It's the most wonderful time of the year! The holiday season? Nope. Oath of the Gatewatch spoiler hype and rage? Try again, but you're getting warmer on the "hype" and "rage" business. With 2015 wrapping up and January 2016 right around the corner, 'tis the season of endless debate, discussion, and delirium about arguably the hottest button issue in competitive Magic: the Modern banlist. Our own Trevor Holmes gave some banning history the other week, and we've already seen opinions from Anthony Lowry, Craig Wescoe, and every other OP on r/spikes and r/modernmagic. The overwhelming majority of this conversation is often devoid of data or evidence (sorry, a personal anecdote of losing on turn 1 last Friday night doesn't qualify). Today's article is the first in a series of Modern banlist-focused pieces where I'll try to add some concrete datapoints to this dialogue.
The lovely Stoneforge Mystic has been the talk of the town since Wizards announced their 2016 Grand Prix promo. Today, I'll summon my buddy Arcbound Ravager, along with his merry minions, to test the impact of a hypothetical Mystic unban.
In this inaugural "Banlist Test" article, we'll choose a banned card, stick it in a list, and throw it into the Modern octagon. We're following in the footsteps of Caleb Durward's fascinating "Banned Series" on ChannelFireball, but with fewer videos, way more rounds, and extensive context around our card and deck choices. Hopefully, this will inject some much-needed evidence into banlist discussions that are often heavy on rhetoric and light on data. We're kicking it off with a Stoneforge Mystic Abzan list battling against a stock Affinity build, spread out over 80 total test games across Games 1 through 3. In the interest of space, I'm splitting this article into a Part 1 (deck overviews and Game 1) and a Part 2 (Games 2-3 and conclusions). We'll publish Part 2 next week. For now, let's launch into the banlist testing action!
Why Abzan?
One of the biggest pitfalls in testing banned cards is picking non-representative decks. When you run Bloodbraid Elf in Tribal Shamans and determine the Elf is safe for Modern, the only thing you've really revealed is that you've been hoarding foil Rage Forgers since 2012. Cards need to be tested in those same strategies they would call home if unbanned. For Bloodbraid, that means Jund and maybe Naya Zoo. For Ancestral Vision, it would be UR Twin and Grixis Control/Midrange. What about our leading lady of the day? Archetypes across Modern would undoubtedly welcome an unbanned Stoneforge Mystic, but for testing purposes, the deck we need to worry about is one that doesn't need much help to begin with. It has already been Tier 1 on numerous occasions, and it's the deck a Mystic could most easily push over the edge: Abzan.
Abzan might have missed Tier 1 in November, but the BGx powerhouse still boasts an impressive metagame history, almost all of it following Siege Rhino's arrival in Khans. We've published ten metagame updates since our site's launch and Abzan has made the Tier 1 cutoff in five different periods. Its Pro Tour Fate Reforged performance drove the early Abzan dominance in the spring, where Bloodbraid Rhino carried Abzan to a 25%+ share in the Pro Tour's Day 2 metagame. Abzan's share fell less than a month later at Grand Prix Vancouver, but it still ruled Day 2 at around 17.5%. It's true we haven't seen this level of BGW dominance since the spring: Abzan's most recent Tier 1 stint was in September at 5% of the format. That said, if Abzan could reach these 17%-25% levels without Stoneforge's help, we have every reason to be worried about what it could do with her. Given these metagame shares, Mystic's natural fit in midrange, and Abzan's love of good-stuff creatures, there's no better deck to welcome the Artificer and her equipment arsenal.
Why Affinity?
Now that we've tapped Abzan to champion the Mystic, we need to select our challenger. Although any top-tier deck could work here, we really need to pick an opponent that fulfills two criteria. First, we need our matchup to have a documented, baseline win-rate. This lets us check if Stoneforge Abzan pushes that win-rate too far in one direction or the other. Second, we need to choose a sparring partner that directly tests Stoneforge's strengths and weaknesses. No one is too worried about Mystic skewing the Abzan vs. RG Tron matchup too heavily. Warping an aggro matchup, however, is much more in line with the Kor's talents.
Based on this, Affinity is an easy selection for our banlist-test grudge match. Numerous sources have attested to the 50-50 nature of Abzan vs. Affinity. We see this frequently in quantitative pieces, such as those published here, on MTG Goldfish, and on ChannelFireball. We also have qualitative confirmation of this contest, as seen in Andrea Mengucci's Abzan primer and Frank Karsten's Affinity primer, both published in the aftermath of Pro Tour Fate Reforged. With the datapoints aligning across the quantitative and qualitative spectrum, we can be reasonably confident this matchup is very close to 50-50. That presents a perfect opportunity to see how Stoneforge's addition could influence the duel.
From a more theoretical perspective, one of the biggest fears around a Stoneforge unbanning is reducing format diversity by depressing the metagame share of aggressive strategies. As Wizards has said, they don't want a format dominated by the Mystic. Turn three Batterskull does a number on decks trying to win through damage, especially backed up by Abzan's disruption. Affinity is happy to rise to this challenge. If the robot horde is stymied by the lifelinking Germ, you can bet lower-tier aggro decks will be in even deeper trouble. That would suggest Stoneforge is much more dangerous than its proponents admitted. On the other hand, if Ravager and friends can keep an early Batterskull in check, it's possible other aggro players can adapt as well. This wouldn't be the end of testing, but it would be a very promising start for Mystic supporters.
List Selection
Finalizing an Affinity list was easy: Aaron Webster just got 2nd at Grand Prix Pittsburgh with a no-frills 75. We made some adjustments to the sideboard to reflect Affinity battling in a post-Stoneforge world, but the maindeck was largely unchanged except for one swap.
Affinity, inspired by Aaron Webster
We added in the Decays as game 2-3 concessions to Mystic. Although the BGx staple doesn't kill Batterskull itself, Affinity can easily pull ahead in the turns Decay buys after blowing up the Germ token. It can also use Decay to blow up the Mystic, whether against Abzan or the UWx decks that would invariably wield Stoneforge too. We also swapped the lone sideboarded Champion with a maindecked Spellskite to give us more beef in Game 1. Between Thoughtseize and Decay to disrupt, Aether Grid to circumvent Stony Silence, and the full set of Champions in the main, our Affinity list has more than a few answers to Abzan. Of course, there's an entirely separate question here as to whether Affinity runs Stoneforge itself, but we didn't worry about that for these tests.
Tailoring an existing Affinity list was easy. Crafting a post-Stoneforge Abzan list, however, was a bigger puzzle.
We started with Jon Westburg's 8th place Abzan deck from the October StarCityGames Open in Dallas. This was the highest-performing Abzan build in the fall and a good beginning for our Stoneforge overhaul. The first question was determining how many Mystics to play, and then what equipment to run alongside her. We bounced around between three and four copies before realizing this was Stoneforge frikkin Mystic getting played in Modern. Of course we should be running the playset!
After that, we needed to determine what else the Kor would be Stoneforging apart from Batterskull. Without access to Umezawa's Jitte, it was a tossup between Sword of Fire and Ice and Sword of Feast and Famine, both of which saw similar degrees of Modern play throughout 2015. Feast and Famine won in the end not because it's better against Affinity (it isn't), but because it's better in a metagame where everyone is playing Stoneforge. Protection from black and green tears through opposing tokens, not to mention all the Goyfs and Decays inevitably accompanying Stoneforge into battle. Fire and Ice got shipped to the board instead. This left us with a maindeck Stoneforge suite of 4 Mystic, 1 Batterskull, and 1 Sword. Silvestri used the same equipment package when brainstorming Esper Stoneblade, further justifying our decision.
After figuring out Mystic, we reconfigured the deck's removal to be more generic (nixing cuteness like Abzan Charm and Murderous Cut). We ended with sideboard tweaks, throwing in that leftover Sword, an extra Maelstrom Pulse, and even a Slaughter Pact to address Stoneforge on the play against opposing Mystic decks. Our final Abzan list bore a strong resemblance to Willy Edel's deck at Grand Prix Pittsburgh, which suggested our reworks were on the right track.
Stoneforge Abzan, by Sheridan Lardner
We agonized over that lone Scavenging Ooze for a long time. Most builds go up to two in the main, but we thought we could get away with one if we had Stoneforge instead. There was no avoiding the slot problem at 61 cards, so it was either trimming the Ooze, going to two copies of Liliana/Path, or nudging Rhino/Decay/Inquisition to three. All of those options sucked for different reasons (especially when viewed through the lens of a grindier, post-Mystic metagame), so we settled on the least problematic of the lot.
Test Parameters
A friend of mine with over a decade of Affinity practice piloted the robots. I stayed on Abzan. We considered switching decks between games, but experience is so important in getting the most out of Affinity that I deferred to his expertise. All tests were conducted online to speed games up (especially around Abzan's shuffling). For Game 1, we played 30 total rounds: 15 with Abzan on the play and another 15 with Abzan on the draw. Then we sideboarded and played 50 Games 2-3 trials, split evenly with both decks playing and drawing.
In his Affinity primer, Karsten estimated Game 1 at 75-25 in Affinity's favor, with Games 2-3 leaning towards Abzan at 40-60. Based on this and the other sources, we wanted to see if the Stoneforge Abzan vs. Affinity matchup would be 50-50 overall, with a ~25% Abzan win-rate in Game 1 and a ~60% Abzan win rate in Games 2-3.
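As a rough check on how those per-game estimates combine into a match win rate, here is a minimal sketch. It assumes every game is independent and that Games 2 and 3 share a single post-board win rate, which are simplifications neither Karsten's primer nor our testing establishes. The function name is purely illustrative.

```python
def match_win_rate(game1: float, postboard: float) -> float:
    """Chance of winning a best-of-three match.

    Assumes independent games and one shared win rate for Games 2 and 3.
    """
    # After winning Game 1, we need at least one of the two post-board games.
    after_win = 1.0 - (1.0 - postboard) ** 2
    # After losing Game 1, we need both post-board games.
    after_loss = postboard ** 2
    return game1 * after_win + (1.0 - game1) * after_loss


# Karsten-style estimates from Abzan's side: 25% in Game 1, 60% in Games 2-3.
abzan_match = match_win_rate(0.25, 0.60)
print(f"Abzan match win rate: {abzan_match:.2f}")
```

Under those assumptions the two estimates work out to about 48% for Abzan, which is in line with treating the matchup as close to 50-50.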
Game 1 Results
In our thirty Game 1 trials, Stoneforge Abzan went 11/30 for a total win rate of 37%. Although this is higher than Karsten's 25% estimate, it's well within the expected variance (statistics talk: I bootstrapped the Game 1 sample in 10,000 resamples and then compared those results to Karsten's 25%, finding a statistically insignificant difference between the two at p=.50). This suggests Stoneforge Mystic had no statistically significant impact on the Affinity vs. Abzan contest in Game 1. She did, however, make small differences for the Abzan pilot.
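For readers who want to try a bootstrap on these numbers themselves, here is a minimal sketch of one common approach, a percentile interval. It is an assumption about how such a check could be set up, not a reproduction of the exact procedure behind the figures above. The function name, the seed, and the 95% interval are all illustrative choices.

```python
import random


def bootstrap_win_rates(wins: int, games: int, resamples: int = 10_000, seed: int = 0) -> list[float]:
    """Resample the observed results with replacement and return sorted win rates."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = [1] * wins + [0] * (games - wins)
    rates = []
    for _ in range(resamples):
        resample = rng.choices(sample, k=games)
        rates.append(sum(resample) / games)
    return sorted(rates)


rates = bootstrap_win_rates(wins=11, games=30)
# Percentile interval: cut off 2.5% of the resampled rates on each side.
low = rates[int(0.025 * len(rates))]
high = rates[int(0.975 * len(rates)) - 1]
print(f"95% percentile interval: {low:.2f} to {high:.2f}")
print("25% estimate inside interval:", low <= 0.25 <= high)
```

If the 25% estimate sits inside the interval, a sample of this size cannot distinguish the observed rate from that estimate.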
Before we dive into the themes and takeaways of this matchup, here are the high-level Abzan figures from our thirty games.
- Abzan win %: 37% (11/30)
- Abzan win % on the play: 47% (7/15)
- Abzan win % on the draw: 27% (4/15)
- Average Abzan win-turn: 9
- Average Abzan loss-turn: 6
It should come as no surprise you want to be on the play against Affinity, and Stoneforge Abzan was no exception. Similarly, the more you can prolong the Affinity matchup, the better it is for the Abzan pilot. This is particularly true when it comes to Stoneforge Mystic. Landing the turn 3 Batterskull before your opponent's fourth turn makes a world of difference, especially if you can beat Steel Overseer to its first tap. That said, Abzan couldn't push above 50-50 on the play even with Mystic in the mix, which suggests the Game 1 advantage remains solidly on Affinity's side of the court. For the sake of completeness, here are the high-level Affinity stats, which just flip the Abzan numbers.
- Affinity win %: 63% (19/30)
- Affinity win % on the play: 73% (11/15)
- Affinity win % on the draw: 53% (8/15)
- Average Affinity win-turn: 6
- Average Affinity loss-turn: 9
Let's go a little deeper. Here are some statistics around Stoneforge Mystic and her impact on games.
- Games with 1+ Mystic: 70% (21/30)
- Abzan win % with Mystic: 38% (8/21)
- Abzan win % with no Mystic: 33% (3/9)
- Abzan loss % with Mystic: 62% (13/21)
- Abzan loss % with no Mystic: 67% (6/9)
Or maybe I should say, Stoneforge Mystic's relative lack of impact. Despite seeing the Artificer in 70% of games, Abzan managed only a slight improvement when it dropped her on the board. Without Mystic, Abzan won 33% of games. With her, it won 38%. That's an insignificant difference both at a glance and statistically. Even if we doubled our sample size to 60 games (or went all the way to 100), I don't think we would see much change here. The no-Mystic win rate would likely slip down to 25%-30%. As for games with Mystic, it might eke up to 40%. This would represent a very minimal improvement over a generally bad matchup, pointing to Mystic being safer than many of her critics acknowledged, but still of measurable benefit to Abzan.
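One way to put a number on "insignificant" for the Mystic versus no-Mystic split is Fisher's exact test on the 2x2 table of wins and losses. To be clear, this swaps in a different technique from the bootstrap mentioned earlier, and the helper below is an illustrative sketch rather than the analysis behind the article's figures.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric chance of x in the top-left cell with the margins fixed.
        return comb(col1, x) * comb(total - col1, row1 - x) / comb(total, row1)

    observed = prob(a)
    lowest = max(0, row1 - (total - col1))
    highest = min(row1, col1)
    # Add up every table that is no more likely than the one we observed.
    tail = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, tail)


# Rows: games with Mystic (8 wins, 13 losses) and without Mystic (3 wins, 6 losses).
p_value = fisher_exact_two_sided(8, 13, 3, 6)
print(f"Two-sided p-value: {p_value:.3f}")
```

A p-value near 1 here means the with-Mystic and no-Mystic records are about as similar as two samples of this size can be.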
On the subject of Mystic herself, one of the main objections to a Stoneforge unbanning is the power of turn 2 Mystic into turn 3 Batterskull. How did that line play out in the Abzan vs. Affinity fight?
- Average Mystic turn: 2.75
- % of Mystic games where Mystic landed on turn 2: 62% (13/21)
- % of total games where Mystic landed on turn 2: 43% (13/30)
Win or lose, Abzan dropped a turn 2 Mystic onto the battlefield in 43% of its games. That's right around the expected odds of drawing a Mystic in your opening 7-9 cards when you're running the full playset. There were three games where I had to hold removal or Inquisition instead of living the dream (typically blowing up Steel Overseer or discarding Etched Champion or an uncast Plating), but most of the time the turn 2 Mystic was the right play. Indeed, in those games where Abzan saw a Mystic at all, she landed on turn 2 in 62% of games.
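The odds of seeing at least one Mystic in those first 7-9 cards can be worked out with a quick hypergeometric calculation. This sketch assumes a 60-card deck with four Mystics and ignores mulligans, discard, and whether the Mystic is actually castable on turn 2. The function name is illustrative.

```python
from math import comb


def chance_of_at_least_one(copies: int, deck_size: int, cards_seen: int) -> float:
    """Chance of drawing at least one copy within the first `cards_seen` cards."""
    # Probability that every card seen so far is a non-copy.
    all_misses = comb(deck_size - copies, cards_seen) / comb(deck_size, cards_seen)
    return 1.0 - all_misses


# 7 cards = opening hand, 8 = turn 2 on the play, 9 = turn 2 on the draw.
for seen in (7, 8, 9):
    odds = chance_of_at_least_one(copies=4, deck_size=60, cards_seen=seen)
    print(f"{seen} cards seen: {odds:.1%}")
```

Under these assumptions the odds run from roughly 40% at seven cards to roughly 49% at nine, which brackets the 43% we observed.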
Fortunately, we're not here to throw a fit about turn 2 Stoneforges in the abstract. We're here to look at how that turn 2 Mystic actually affected our win percentages.
- Total Abzan wins: 11
- % of Abzan wins with Mystic: 73% (8/11)
- % of Abzan wins after a turn 2 Mystic: 45% (5/11)
- % of Abzan wins after a turn 3+ Mystic: 27% (3/11)
- % of Abzan wins with no Mystics: 27% (3/11)
Looking at wins alone, the turn 2 Mystic was a clear factor in Abzan's victories. Almost half of Affinity's losses came to the dreaded turn 2 Stoneforge, on top of Mystic's involvement in 73% of Abzan wins overall. Reviewing my notes, the Kor was a major contributor to all the wins where she made an appearance, particularly when that showing came on turn 2. Although we don't know with certainty how games would have ended if the Abzan player had drawn Kitchen Finks instead of Stoneforge, my notes suggest the one-two punch of Mystic into Skull was too much for Affinity to handle. Batterskull generated massive life advantages when left alone. I got Skull up on a Spirit four times total. All of those resulted in landslide victories. Taken as a whole, when Abzan won with Stoneforge, it tended to win big.
Of course, that gets us wondering about Mystic's performance in games Abzan eventually lost.
- Total Abzan losses: 19
- % of Abzan losses with Mystic: 68% (13/19)
- % of Abzan losses after a turn 2 Mystic: 42% (8/19)
- % of Abzan losses after a turn 3+ Mystic: 26% (5/19)
- % of Abzan losses with no Mystics: 32% (6/19)
Hmm. Maybe Stoneforge wasn't so decisive after all...
Even though Stoneforge hit play on turn 2 in 42% of these losses, she was unable to avert the inevitable defeat. Indeed, Mystic entered the battlefield in 68% of the lost Abzan games overall, close to the rate she appeared in the Abzan wins (73%). My notes give some explanation around these losses. There were three factors which worked against the turn 2 Mystic, converting a potential win into a guaranteed loss. The first was removal: a stray Galvanic Blast crippled the Batterskull line and left the Abzan player on the back foot. The second was Inkmoth Nexus, which combined with Ravager or Plating to soar across for a poisonous finale. Finally, Etched Champion could sit back and block the hapless Germ token all day while Affinity's fliers chipped away for the win. These numbers and the narratives behind them show the Kor was quite beatable.
In the end, Abzan saw the Mystic in 70% of its games, but still maintained similar win percentages when it drew Mystic (38%) and when it didn't (33%). These numbers are right around Karsten's 25-75 Game 1 estimate, and although Mystic nudges the scales in Abzan's favor, it's not nearly a big enough push to cause worry. Based on these numbers and their context, I am tentatively concluding that Mystic does not break the Abzan vs. Affinity matchup in Game 1. Affinity has more than enough maindeck ways to handle the infamous Artificer, whether through direct removal, the protected Champion, venomous Inkmoths, or even through a simple airforce damage race.
Limitations
Like any testing environment, our Abzan vs. Affinity study today has limitations. For one, I know at least a handful of readers are going to see the number of test games and immediately cry foul about an insufficient N. Many of these critics wouldn't be satisfied with 50 or even 100 games, because these samples fall below N levels needed for "truly" significant results. I've compensated for this in a few ways. This includes bootstrapping our sample, checking the observations against the expected values and seeing no serious deviations, and digging into the narratives behind each game to contextualize the numbers. Social science analysis often deals with smaller samples, and these methods are all great ways of mitigating the low-N effect. Moreover, I sincerely doubt Wizards runs 10 matches of hypothetical Modern decks, let alone 30 Game 1s. If so, our testing should be more than enough to tell us something about banned cards.
A second limitation concerns the test's applicability to other matchups. Can we make conclusions about the Abzan vs. Burn Game 1 based on these results? Or Abzan vs. Gruul Zoo? These comparisons are fraught with difficulties. On the one hand, something like Burn or Zoo would lack both Champions to stonewall a Batterskull and the Inkmoths to ignore lifegain. On the other hand, both decks pack significantly more removal than Affinity's four Blasts, not to mention anti-lifegain bullets in Atarka's Command and Skullcrack. Without testing these matchups, it's hard to know if one factor would compensate for another. This underscores the need for further testing, but also the importance of looking for matchup themes. For example, the Affinity results suggest Burn would probably be okay battling through a Mystic. It has enough spells to either kill Stoneforge (a noticeable Mystic weakness in even the removal-light Affinity tests), or to ignore the Skull and blast for lethal (as Affinity could do with its flying creatures). Then again, the landlocked, creature-packed Gruul Zoo might struggle here.
Stay tuned for Round 2!
I hope you're as excited to read the Games 2-3 results as I am to report them! We'll be back next week with the conclusion of our Stoneforge Abzan vs. Affinity series, along with some final thoughts based on this round of testing. Mystic might not have caused too many problems in Game 1, but I'm sure we're all excited to see how she fares once the sideboard comes in.
Let me know if there are any additional matchup numbers you want to see or unanswered questions you want addressed. Do you have issues with the methodology? Concerns about the lists or feedback about the matchup? Any other opinions you have on Mystic and translating these Affinity-focused results to a broader metagame? Bring it into the comments and I'll see you all there!
Wow! I liked the article very much. Thanks for the analysis! As an affinity player, I am intrigued about the inclusion of Abrupt Decay in the Affinity sideboard. How easy was it to cast? I assume that the purpose of Decay would also be to kill the Affinity hate-cards like Stony Silence, but it would be extremely hard to cast once Stony Silence has landed.
Mathematically, your odds are bad. 12 colored sources (50% on the play, 58% on the draw) to tap in response to a Stony Silence. The opportunity cost to hold up that mana, though, is the real deal breaker. You can’t afford many reactive cards and they certainly shouldn’t be 50/50 to cast. Just play more Wear // Tear. It deals with Skull and many of their likely sideboard cards.
They definitely should’ve used Dismember. It’s already in sideboards for Affinity.
Kind of discredits a lot of their testing.
We’re definitely open to re-running tests based on feedback. Decay wasn’t nearly as problematic as people are theorizing (it definitely put in work), but Grudge is a strong contender for a replacement in a new round of tests. Dismember would certainly be behind Wear/Tear and Grudge in any revision of the board though.
I’m glad to see other Affinity players mentioned Decay; it was my only point of confusion in the article. I’m interested in how you said it wasn’t problematic to cast, Sheridan. Could you elaborate on how often you were able to cast it, if you have the figures, in Round 2?
Here are some high-level Decay stats.
1. We saw Decay in about 50% of games.
2. Of games where we saw Decay, we fired it without problem in 75% of them.
3. Of the Decay games, we had a small mismatch in the turn we drew Decay and the turn we could cast it (averaging 1.5 turns) in 13% of games.
4. Of the Decay games, we had a LARGE mismatch in the turn we drew Decay and the turn we could cast it in 12% of games.
All these numbers include resampling to account for a relatively small N in the games 2-3.
This points to Decay being much better than people who are just theorycrafting believe, but still not the best. I think Wear/Tear is going to be the better option overall.
I’d prefer to see all the results in a longer article, but that’s just me. Affinity is well-known for its high game 1 win rates, so I don’t think the results are out of the ordinary. Affinity is highly proactive and requires sweepers, 2:1’s, and/or prison cards to lock them down. Sure, mystic bringing G1 percentages to 60% would be a huge indicator of potential power, but I don’t see how the best-case T3 Batterskull helps the usual plan for BGx, which is to either land a blocker or remove ravager/overseer/plating on T2. I can’t imagine we want to trade T2 SFM with a battle-cried memnite, so we wind up with more damage in the hole.
I’m also curious about the “fail rate” of affinity. Before buying into Tron, I went into tappedout and goldfished 113 openers to get a rough idea of when Tron goes online. I found that in about 10% of the games the deck just whiffs (>T6), which means I probably just lost to the deck itself. I’ve been playtesting against affinity with Jund the last couple days and the only games I win are the ones with specific card combinations or affinity failing to play more than 4-5 cards in the first few turns. Mana screw and drawing poorly happen, so I’m curious how many of the wins were due to affinity losing to itself or if they exploded on the board and SFM STILL got you there (or which cards had the most impact).
[In case anyone is wondering, found that Tron was available on T3 about 50% of the time and T4 25% of the time, though you might have to tap out to achieve it – sorry, Karn!]
Depends on how you define fizzle, but I will try. Affinity has roughly 15 cards that matter (Plating, Ravagers, Overseer, Champ, Master); the rest are support cards. Truly fizzling would be not finding any in your top 10 cards (4%), but you shouldn’t keep a hand without one. I would define fizzle as only seeing one in ten cards (17.6%). This is certainly high, since plenty of decks can’t beat a turn one Plating, but the combination of opponents interacting with your one power card via discard or removal, or having your card not be right for the matchup (Champ vs. combo decks), makes this a reasonable guess.
Abrupt Decay seems like a poor choice. It’s really hard to cast, as you ideally only want one colored mana symbol on your sideboard answer cards. You are already running Ancient Grudge for the mirror and Whipflare for tokens, and Galvanic Blast already answers Mystic itself. Abrupt Decaying a Germ token seems like a really bad tempo move, whereas Ancient Grudge is not even a card loss, as you are only using half the card to answer Batterskull.
I would just max out on copies of ancient grudge, why would you waste your time killing a germ token when you can just kill batterskull? Mystic is fairly useless after the tutor and putting the equipment into play, so you really shouldn’t waste your time killing it. Not to mention that even if the Abzan player gets a sword out, you can simply flashback grudge.
We might end up re-running some of the games 2-3 with Grudge. We were worried about the countermagic and inevitable permission-based (e.g. Twinblade) style decks that might use SFM, but Grudge gets around this with flashback to some extent. If it works better, we have enough time to re-run tests: we still had some remaining game 2-3s left to do.
The reasoning makes more sense for why you would want abrupt decay. The mana cost is just hard to fulfill as sometimes you only have 1 colored mana source.
But I am a fan of this article and I hope to keep seeing quality content like it. This is much better than the baseless ban/unban discussions that one generally finds on the subject.
I would just use Dismember. I can’t see myself bringing Grudge in against Stoneforge alone.
Sheridan, this is awesome. While it may not be perfect, it is leaps and bounds better than anything I have seen. You should reach out to Wizards and see if they would be willing to factor data like this into their banlist decisions in the future.
Cheers!
Happy you liked it! I’m sure Wizards has their own internal process, although it’s probably not as involved as this. Maybe someone out there is watching and wants to consider some third-party data!
Seriously high quality writing. This site is the best!
Glad you enjoyed it! Let us know if there’s anything else you’d like to see.
I like the premise of this article and all, but I think the specific matchup you chose was a poor one. In my mind, the inclusion of Stoneforge in Modern would affect matchups against aggressive decks and midrange decks the most. Affinity is certainly an aggressive deck, but it’s of an entirely different type than most others. You even mentioned that you don’t expect the Mystic to affect this matchup much.
I think testing matchups between Jund, Twin, Grixis, Merfolk, Burn, Little zoo and big Naya zoo would be pretty useful. Additionally, coming up with a very similar pre-SFM list to test and compare results to would help filter out things like player skill differences and perceived vs actual match up numbers. Obviously, this would require twice the amount of testing so I can understand why you didn’t do it.
It’s unlikely we’d run all those tests. Jund, Grixis, and Twin aren’t aggressive decks, which creates a testing environment that doesn’t assess one of Mystic’s biggest criticisms (reducing aggro’s share). Burn is already in Abzan’s favor, and Merfolk and the Zoos don’t have well-documented win-rates against Abzan in the first place. As you said, we can account for this with doubling all our tests using pre and post-SFM lists, but that’s really a crazy amount of testing. I’m all for doing tests but there’s a certain point where we should have enough information from the tests we are already doing.
We started with Affinity because it’s the most important aggro deck in the format. If Mystic can beat Affinity, that’s probably a disaster for other decks. If it can’t, then it’s possible other decks can survive as well, provided we analyze the reasons Affinity prevailed.
Great article, unfortunately this sort of leaves an important question unanswered: whether or not SFM is too much value against fair decks like Jund. A lot of people are making the assertion that Kolaghan’s Command evens up the matchup, but I sort of doubt this and would be interested to see some Jund vs SFM testing.
I think part of the issue is that comparing Affinity to any other aggro deck in the format isn’t really correct. I agree that if Affinity was a dog to the card, it would probably mean other decks are destroyed by it, but I don’t think it follows at all that Affinity not caring about SFM in general means it’s okay for other aggressive decks.
I totally get the time argument. It’d be awesome if someone did an absurd amount of testing for various banned list cards to see how they affect specific matchups and the format, but as you implied the time just isn’t there. Basically that’s why I wish the testing was against a deck that I thought would actually be affected by SFM. Affinity is one of the first decks that comes to my mind when I think “what deck in Modern is affected by SFM the least?”
I appreciate the time you guys took to do this, so please don’t interpret my comments as bashing the article! Just trying to give constructive criticism.
In what world do you run Feast and Famine over Sword of Fire and Ice? Even with the logic provided it makes no sense.
You’re still protected against Terminate, Bolt, and the 2nd mode of K-Command, and you’re netting better CA and board control. Goyfs and ground troops can bump chests; our focus should be Lingering Souls tokens swinging through.
F&F is significantly better in a metagame where SFM is present. Protection from black and green is way more relevant when there are tons of Abzan decks fielding the sword, and the on-hit modes are better against what would likely become a grindy Mystic fest. That’s especially true when the grindy decks are also UWx themselves (just like Lily wrecks control, so too does the Sword). Protection from blue is also one of the least relevant types in Modern: the only place it really saves you is Twin tapping and Snapcaster blocks. As for red, if time is of the essence in sticking the Sword on a Spirit, they’re probably just going to Bolt/kill it anyway in response to your equipping it. The earliest you can also get this line is turn 5: turn 2 SFM, turn 3 Sword, turn 4 Souls, turn 5 equip. At that point, if the game is still going on I’m just going to keep equipping Sword until an opponent runs out of removal and the protection from burn matters less than the ability to smash through defenses.
SoFI definitely goes into the board for games 2-3 where it’s better in certain matchups. But if Mystic’s impact on previous formats is any indication, I want to hedge my bets against the BG decks for game 1.
Just a correction – F&F doesn’t protect against Abrupt Decay since it is the Abrupt Decay target.
Otherwise good article.
My hope with a SFM unbanning is the decrease in the meta share of Zoo and Burn.
There’s just too much aggression in the format with Affinity, Zoo (all variants), the various tribal decks and Burn just trying to race you as quickly as possible – and that’s ignoring a deck like Infect which Batterskull wouldn’t help against but is also aggressively trying to kill. If Batterskull halted that progress we can start getting games that go a little longer/are more interactive.
(This is not a dig at aggro decks, it’s just that the format is currently warped around very fast decks and something needs to slow the format down a little bit)
Oh, I didn’t mean it protected against Decay on the artifact. If it came off that way it was just misphrasing on my end. I meant Decay on the holder, not the artifact.
I think Mystic would be very strong against the Zoo decks, based on the results of the Affinity games and the games where Mystic was at its best. Not sure if it would be TOO good against them, but it would have an impact.
In a world where there’s combo, the power of SoFaF is incredibly relevant when paired with either countermagic or hand discard, both of which are the best ways to slow down/stop combo. I believe combo would keep its meta share in a Stoneforge world.
Can you please test against Burn? I always felt Abzan vs Burn is kind of 50/50, maybe a bit in favor of burn. Now I wonder how the numbers will change with T3 Batterskull.
Other than that: thanks for doing this extensive testing 🙂
In my own experience, I found Abzan a favorite against Burn, even if game 1 can be rough. Things get a lot better in games 2-3, especially once stuff like Duress comes in. That said, it’s totally possible we’d do this testing in the future.
Hey Sheridan,
Great article! Stuff like this is the reason I come to Modern Nexus. One thing I am interested in is how did Affinity win? Were the robots able to push through enough normal damage or did your opponent rely on the infect win a lot? I think it’s an important data point that will help us understand how Stoneforge would affect other aggro decks. I’m looking forward to the rest of this series!
Would you be willing to run some power calculations? Partly this depends on guessing what change in win percentage is too much. But assume that moving the win percentage more than 5% in Abzan’s favor would make the Mystic too powerful: how many game 1s would you have to play to be pretty sure that you could detect that change at a significance level of p=.05? My apologies if you ran the power calculation before deciding on the number of game 1s but just elected not to bore us with the statistics.
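For anyone who wants to play with the numbers, here is a rough sketch of that kind of power calculation. It uses the standard normal-approximation sample size formula for a one-sample test of a proportion. The 50% baseline, the 80% power target, the one-sided test, and the function name `games_needed` are all assumptions made for illustration, not anything taken from the article's testing.

```python
from math import ceil, sqrt
from statistics import NormalDist


def games_needed(p0, p1, alpha=0.05, power=0.80, two_sided=False):
    """Approximate games needed to detect a win rate of p1 against a
    baseline of p0, using the normal approximation to the binomial.

    All defaults are illustrative assumptions (80% power, one-sided test).
    """
    if p0 == p1:
        raise ValueError("p0 and p1 must differ to have an effect to detect")
    z = NormalDist()  # standard normal distribution
    # Critical value for the chosen significance level
    z_alpha = z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    # Quantile matching the desired power
    z_beta = z.inv_cdf(power)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


if __name__ == "__main__":
    # Hypothetical scenario: 50% baseline, Mystic shifts Abzan to 55%
    print(games_needed(0.50, 0.55))
    # Hypothetical scenario: 50% baseline, Mystic shifts Abzan to 60%
    print(games_needed(0.50, 0.60))
```

Under these assumptions, a 5-point shift takes several hundred game 1s to detect reliably, while a 10-point shift takes far fewer. Changing the baseline or asking for a two-sided test moves the answer, so treat the output as a ballpark figure.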
What I would like to see is the impact of Stoneforge played in Affinity. Getting access to 7-8 Plating (some of them uncounterable) and maybe a Batterskull could be pretty strong! It would become another must-answer turn 2 threat beside Ravager, Overseer and Plating itself. Also, the Citadel-Mox-0 cc creature-Stoneforge for Plating/Batterskull turn one is a pretty crazy start. I’m currently playing Affinity (and always played it, since the beginning) and I think it wouldn’t be too slow, as many say. And certainly Stoneforge is not comparable to Steelshaper’s Gift, since Mystic is also a good topdeck on an empty board because it can carry equipment by itself. And a stronger white presence in Affinity can be useful in upgrading Dismember to Dispatch.
You’re using the wrong Sword for the job.
Substitute for Sword of Fire and Ice. And start hitting those robots with 2 damage.
SoFI is clearly better against Affinity, but I don’t think it’s the weapon that Abzan Stoneblade decks wield maindeck if SFM gets unbanned. If SFM is freed, there’s a good chance you see a ton of Abzan and Jund decks rolling around, and SoFF is the maindeck card you want to win those matchups. We certainly swap the swords for games 2-3.
Excellent stuff. I’m not surprised by the G1 result – a T3 B-Skull would wreak tons of havoc against the likes of Gruul Zoo, but not Affinity and its flying (and sometimes infecting) swarms. I echo the thoughts of being willing to wait and see this as a big article, but I can understand the need to provide steady content in order to maintain interest (reminds me of academia, actually). I’m looking forward to both seeing the post-board article (and I echo the thoughts of some posters regarding Dismember and Ancient Grudge vs. Abrupt Decay in Affinity’s board), as well as a couple of aggro decks. Is it selfish of me to ask how Merfolk would measure up? I’ll do so anyway, though.
From experience Batterskull on turn 5 or later was rarely enough to turn things around for UR Twin (except when I was losing anyway), which was the only list I’ve ever had it played against me. Based on some test games I did with a TwinBlade list turn 3 might be soon enough to race an average or worse Merfolk draw, though not the good ones.
It made sense to split the article both from a word-count perspective (this was already almost 4k), and so we could get some feedback. We’d done enough tests that it would be easy to keep going if we didn’t see a need for changes, but could also redo tests if there was a good reason.
Merfolk seems like a deck that would struggle against SFM, especially when backed by disruption. I don’t think it would be the end of that deck (Merfolk could even run SFM itself!), but it would be an obstacle.
Do you have any opinion on a Jeskai Twin deck with SFM? I personally feel it would be a terrifying deck, as the Twin combo gets a backup plan of SFM and Batterskull. What do you think about a list along the lines of the one in this article?
http://www.starcitygames.com/article/32073_In-A-World-With-Stoneforge-Mystic.html
I would love to see the same analysis in an esper deck, I feel like the problem UWx decks have is just not having any early pressure on board until late game. I think it is also just what is needed to shake up the overall metagame a bit.
Why haven’t you put Stoneforge Mystic in Affinity, too? I mean, if she gets unbanned, I think robots would be happy about playing 8x Cranial Plating, and maybe also a Batterskull in order to be able to grind even better than now.
We’ve seen Legacy Affinity decks run her, but we’ve also seen Legacy Affinity decks not run her. This suggests it’s not an open and shut issue about whether she makes Affinity better. I think it would be interesting to run her in Affinity vs. Abzan after these tests, just to see if she improves our baseline matchup win-rates, but that’s a test for another time.
i would say there’s some kind of heteroscedasticity here, as it may be obvious that having the mystic in the deck forces us to play it on T2 as many times as possible. therefore, the “look, you lost and played a mystic on T2!” kind of comment lacks real statistical value, as it’s easily answered with “hey, i didn’t have anything better to do”. testing with a no-mystic abzan and comparing the times the T2 play is deploying a permanent to the times a spell is played or mana is held up for removal would give better information, as maybe deploying mystics on T2 is just a misplay based on the “if it stays i win” perception of the abzan player.
also, i think the “unfair” aspect of poison in affinity has a lot to say in those percentages (even when not dying to it, it influences the decisions made). maybe a test with a completely “fair” aggro deck would be better.
that said, amazing work here, congrats!
There were a number of games where T2 SFM was not the right play, especially on the draw against an Overseer, Plating, or similar threat. Sometimes you topdecked the discard and could use it to preemptively stop a Champion, or fire a second discard even if on the play. Overall, there were a lot of situations where it wasn’t necessarily the “best” play.
As for the unfair aspects of Affinity, we knew these advantages before we started testing. It was important to START with Affinity because it’s the best aggro deck in the format. If Mystic was enough to turn the 50-50 Abzan matchup into a 60-40 Abzan favorite, I would probably disqualify the safety of an SFM unbanning on the spot. But if Affinity stays fair, then we can move on to something like Burn.