Cute Mage's Tower

AI: Straight Up

In a hobby with a ton of white cishet men, of course the round all about other cultures is the round that is solved by the fewest people.

This is part of a multi-part series analyzing the various AI rounds from the 2023 MIT Mystery Hunt and seeing what they can teach us about writing puzzle hunts in general.

The Culture of the MIT Mystery Hunt

The MIT Mystery Hunt has a dominant culture present throughout it - American culture (and specifically, white upper-middle class culture). This isn’t to say that people who are outside that culture can’t participate, but understanding that culture gives you some definite advantages when puzzle hunting.

Now let’s be clear - this is not inherently a bad thing. In some ways, this is a puzzle hunt that is directed towards college students at MIT. Let’s not pretend that the culture of students there is wildly representative of all of the cultures in the world, and certainly the culture of people who are veterans of the puzzle hunt community isn’t representative either.1

Of course, as a Social Justice Warrior2, I particularly enjoy it when other cultures are represented in my hobbies in an authentic way, which is why I’m glad that these puzzles didn’t only use non-English languages in their answers, but took the time to dig into interesting bits of their respective cultures. That’s also why it sucks that Ascent was the least solved round of the whole Hunt.

Wait a minute, was it? That seems to be the general wisdom, but that’s the kind of thing that we can find out with math! Let’s make some tables!

Lots of Tables!

Fortunately, teammate gave us a ton of stats on each puzzle, each on their own individual stats page. I kinda wish that was done for other Hunts (but since many of the previous hunts have released the full guess log, it should be possible to do with some fun programming).

First, I went through the stats of each of the AI feeder puzzles3 and just recorded how many solves each puzzle got from people actually solving the puzzle and how many total solves were used, including free solves. That table is below.

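| Round | Puzzle | Real solves | Total solves (incl. free) |
| --- | --- | --- | --- |
| Wyrm | Error: File... | 1 | 15 |
| Wyrm | Flooded Caves | 9 | 16 |
| Wyrm | Folded Cards | 3 | 15 |
| Wyrm | Sea Bass | 4 | 15 |
| Wyrm | The Devil's in the Details | 10 | 16 |
| Wyrm | We Made a Quiz Bowl | 2 | 15 |
| Wyrm | 4D Geo | 4 | 18 |
| Wyrm | Degeneration... | 2 | 17 |
| Wyrm | Endless Knots | 4 | 20 |
| Wyrm | Izzy's Art Gallery | 10 | 19 |
| Wyrm | Shared Characters | 4 | 17 |
| ABCDE | 5D Barred... | 2 | 26 |
| ABCDE | Invisible | 6 | 27 |
| ABCDE | Pixel Art | 12 | 23 |
| ABCDE | Terminal | 11 | 25 |
| ABCDE | WLLP | 4 | 26 |
| Ascent | Book of Fixed Stars | 1 | 20 |
| Ascent | Fountain | 4 | 22 |
| Ascent | How Come This Crossword... | 3 | 21 |
| Ascent | Moral of the Story | 1 | 17 |
| Ascent | Mosaics | 4 | 14 |
| Ascent | My Fun Bow | 0 | 19 |
| Ascent | North Carolina Shopping List | 4 | 16 |
| Ascent | To Numbers and Back Again | 0 | 18 |
| Ascent | Walking Tour | 1 | 17 |
| Conjuri | Construct Objects | 3 | 10 |
| Conjuri | Crack Some Crypts | 9 | 14 |
| Conjuri | Cure the Werewolf's Woe | 2 | 16 |
| Conjuri | Dispel the Bees | 7 | 17 |
| Conjuri | Eat Desserts on Main | 3 | 10 |
| Conjuri | Graph It | 13 | 22 |
| Conjuri | Interpret Perplexing Texts | 19 | 23 |
| Conjuri | Investigate Relics | 0 | 9 |
| Conjuri | Laugh | 12 | 22 |
| Conjuri | Make Love or War | 14 | 20 |
| Conjuri | Pluck the Petals | 5 | 18 |
| Conjuri | Run the Gamut | 11 | 17 |
| Conjuri | Set off Fireworks | 13 | 21 |
| Conjuri | Win a Game of Bingo | 2 | 10 |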

Looking at the average number of real solves per puzzle, we get 4.8 for Wyrm, 7 for ABCDE, 2 for Ascent, and 8.1 for Conjuri. That’s a significant drop for Ascent. It seems that there’s already sufficient evidence to show that Ascent puzzles were solved less often. And yes, there is, but that’s not the full story.4
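For anyone who wants to check that arithmetic, here is a minimal Python sketch. It assumes the real-solve counts are transcribed by hand from the table above and grouped by round; the names `real_solves` and `average_real_solves` are made up for illustration and do not come from teammate's stats pages.

```python
# Real (non-free) solves per feeder puzzle, transcribed by hand from the
# table above. The grouping by round and all names here are illustrative.
real_solves = {
    "Wyrm": [1, 9, 3, 4, 10, 2, 4, 2, 4, 10, 4],
    "ABCDE": [2, 6, 12, 11, 4],
    "Ascent": [1, 4, 3, 1, 4, 0, 4, 0, 1],
    "Conjuri": [3, 9, 2, 7, 3, 13, 19, 0, 12, 14, 5, 11, 13, 2],
}


def average_real_solves(counts_by_round):
    """Return the mean number of real solves per puzzle, per round (1 decimal)."""
    return {
        name: round(sum(counts) / len(counts), 1)
        for name, counts in counts_by_round.items()
    }


averages = average_real_solves(real_solves)
for name, avg in averages.items():
    print(f"{name}: {avg}")
```

Run as written, this reproduces the four averages quoted above (4.8, 7.0, 2.0, and 8.1).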

Not all solves are created equal. This is not to say that one team’s solve is more important than another team’s solve, but there are different motivations behind different kinds of solves. A team who barely gets to the AI round and solves a Wyrm puzzle right before HQ closes has different approaches to how they solve puzzles and spend free answers than a team that is trying to complete the round. I want to take a look at those teams that finished the round, because they had to balance solving puzzles and using free answers. So for each round, I grabbed the teams that solved that round and noted whether they actually solved the puzzle (S), used a free answer (F) or didn’t solve it at all (-).

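| Team | Error:... | Flooded | Folded | Sea Bass | TDitD | Quiz Bowl | 4D Geo | Degen and Regen | Endless | Izzy's | Shared | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Death and Mayhem | - | F | F | F | S | F | F | F | F | S | F | 2/10 |
| Frumious Bandersnatch | F | F | F | S | F | F | F | F | F | F | F | 1/11 |
| Providence Planned Vacations | F | F | S | F | S | F | F | F | F | S | - | 3/10 |
| ✈✈✈ Galactic Trendsetters ✈✈✈ | F | S | F | S | S | S | S | F | S | S | F | 7/11 |
| Setec Astronomy | F | F | F | F | S | F | F | F | F | S | S | 3/11 |
| Super Team Awesome! | F | S | F | F | S | F | F | F | F | S | F | 3/11 |
| The MIT Mystery Hunt ✅ | F | S | S | S | S | F | F | F | F | F | F | 4/11 |

| Team | 5D... | Invisible | Pixel Art | Terminal | WLLP | Total |
| --- | --- | --- | --- | --- | --- | --- |
| The MIT Mystery Hunt ✅ | - | F | S | F | F | 1/4 |
| Providence Planned Vacations | F | S | F | S | - | 2/4 |
| Death and Mayhem | F | F | S | S | F | 2/5 |
| Super Team Awesome! | F | S | F | F | F | 1/5 |
| ✈ ✈ ✈ Galactic Trendsetters ✈ ✈ ✈ | S | S | S | S | S | 5/5 |
| The TSBI Swarm | F | F | S | F | F | 1/5 |
| Frumious Bandersnatch | F | F | F | F | F | 0/5 |
| Left Out | F | F | F | S | F | 1/5 |
| ET Phone in Answer | F | F | F | S | F | 1/5 |

| Team | Book | Fount | HCTCGNGO | Moral | Mosaics | My Fun Bow | NCSL | TNaBA | Walk | Total Solved |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Death and Mayhem | F | F | F | S | S | F | F | F | F | 2/9 |
| The MIT Mystery Hunt ✅ | F | F | F | F | F | F | S | F | F | 1/9 |
| Providence Planned Vacations | F | F | F | F | S | F | S | F | F | 2/9 |
| Super Team Awesome! | F | S | F | F | - | F | - | F | F | 1/7 |
| ✈✈✈ Galactic Trendsetters ✈✈✈ | S | S | S | F | S | F | S | F | S | 6/9 |
| Amateur Hour | F | F | F | F | F | F | F | F | F | 0/9 |

| Team | Construct | Crack | Cure | Dispel | Eat | Graph | Interpret | Investigate | Laugh | Make | Pluck | Run | Set Off | Win | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| The MIT Mystery Hunt ✅ | S | F | F | F | F | F | S | - | S | S | F | S | S | - | 6/12 |
| Death and Mayhem | F | S | F | S | S | S | S | F | S | S | F | S | S | F | 9/14 |
| ✈ ✈ ✈ Galactic Trendsetters ✈ ✈ ✈ | S | S | F | S | S | S | S | F | F | S | F | S | S | S | 10/14 |
| Providence Planned Vacations | F | S | - | S | F | F | S | - | S | S | S | S | F | F | 7/12 |
| Super Team Awesome | F | S | - | - | - | F | S | F | F | S | F | S | S | F | 5/11 |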
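The Total column in these tables can be recomputed from each row's marks. The sketch below assumes a row is written as one character per puzzle (S, F, or -) and that the denominator counts every puzzle the team got an answer for, whether solved or free-answered; the function name `tally` is hypothetical.

```python
def tally(row):
    """Count real solves and answered puzzles in one team's row.

    `row` holds one character per puzzle:
    "S" = actually solved, "F" = free answer used, "-" = not solved at all.
    Returns (real_solves, answered), matching the "real/answered" Total column.
    """
    unknown = set(row) - set("SF-")
    if unknown:
        raise ValueError(f"unexpected marks: {sorted(unknown)}")
    solved = row.count("S")
    answered = len(row) - row.count("-")  # solved or free-answered
    return solved, answered


# Two Wyrm rows copied from the table above.
print(tally("FSFSSSSFSSF"))  # Galactic Trendsetters
print(tally("-FFFSFFFFSF"))  # Death and Mayhem
```

These two calls give 7 of 11 and 2 of 10, which agree with the Total entries for those teams.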

The first thing that comes to mind when I see this data is the significant drop-off from Ascent to everything else5. Now ABCDE comes close, and that probably has something to do with the fact that ABCDE only has five relevant puzzles.

Of course, that’s still not the full story. Even at this level there’s still the difference between teams who are trying to solve every single round and teams who only tried to do one round. If it’s getting late in the Hunt and it’s obvious that a team isn’t going to finish everything, they might burn all of their solves on one round to try to make it so that they can accomplish something in the AI rounds. But if you’re setting a good pace to finish all four AI rounds, then that’s going to affect how you’re spending your free answers. So I decided to take a look at all the teams that solved all four AI rounds. I grabbed the percentage of puzzles that each of those teams solved from those rounds, and threw it into the table below.6

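| Team | Wyrm | ABCDE | Ascent | Conjuri | Total |
| --- | --- | --- | --- | --- | --- |
| ✈ ✈ ✈ Galactic Trendsetters ✈ ✈ ✈ | 63.64% | 100% | 66.67% | 71.43% | 71.79% |
| Death and Mayhem | 18.18% | 40% | 22.22% | 64.29% | 38.46% |
| Providence Planned Vacations | 27.27% | 40% | 22.22% | 50% | 35.90% |
| ⛎ UNICODE EQUIVALENCE | 72.73% | 80% | 22.22% | 71.43% | 61.54% |
| The MIT Mystery Hunt ✅ | 36.36% | 20% | 11.11% | 42.86% | 30.77% |
| Super Team Awesome | 27.27% | 20% | 11.11% | 35.71% | 25.64% |
| The TSBI Swarm | 27.27% | 20% | 11.11% | 57.14% | 33.33% |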
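The percentages follow the rule in footnote 6: real solves divided by the number of feeder puzzles in the round, with the total taken over all 39 feeders. A minimal sketch, assuming round sizes of 11, 5, 9, and 14 (the puzzle counts in the earlier tables) and using the hypothetical names `ROUND_SIZES` and `solve_percentages`:

```python
# Feeder puzzles per AI round, counted from the earlier tables (an assumption
# of this sketch, not an official figure).
ROUND_SIZES = {"Wyrm": 11, "ABCDE": 5, "Ascent": 9, "Conjuri": 14}


def solve_percentages(real_solves):
    """Return per-round and overall real-solve percentages for one team.

    `real_solves` maps round name -> number of puzzles the team actually solved.
    """
    pct = {
        name: round(100 * real_solves[name] / size, 2)
        for name, size in ROUND_SIZES.items()
    }
    total_solved = sum(real_solves[name] for name in ROUND_SIZES)
    pct["Total"] = round(100 * total_solved / sum(ROUND_SIZES.values()), 2)
    return pct


# Real solves for Galactic Trendsetters, read off the per-round tables.
galactic = {"Wyrm": 7, "ABCDE": 5, "Ascent": 6, "Conjuri": 10}
print(solve_percentages(galactic))
```

Note that the denominator here is the full round size, not the number of puzzles a team answered, so a team's percentage can be lower than its Total entry in the per-round tables would suggest.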

Looking at this again, it becomes incredibly clear that the teams that were focused on completing the Mystery Hunt really just ditched the Ascent round as hard as they could.7 I know that this is the prevailing wisdom, but it’s really nice to have the data to back it up, and it was a fun use of a couple hours.8

So the question is, why was this the case? To answer this, I need to talk about a puzzle genre that we don’t talk about that much.

Become an Expert

When Palindrome won the 2021 MIT Mystery Hunt, we not only were on the hook for writing the 2022 Hunt, but also for running the 2023 How to Hunt Seminar the night before Hunt. We ended up creating two seminars. The first was a video seminar that ran approximately a month before Hunt started for those who couldn’t be on campus the night before. The second was a short talk followed by a small set of puzzles that was designed to teach attendees how to solve Mystery Hunt Puzzles by giving them Mystery Hunt-style puzzles.9

One of my priorities when writing this set of puzzles was to consider which kinds of puzzles show up more often in the Mystery Hunt than in other places the students might have seen them. I included a cryptic crossword, a large picture identification, and a location puzzle. But perhaps the most important puzzle of all was the Become an Expert puzzle.

One of the frequent recurring jokes in the Mystery Hunt is that any obscure piece of knowledge could come up at any time - and there’s a reason why we keep making it. A large part of the first generation of Mystery Hunts involved getting solvers to find obscure pieces of information. As the Internet and team sizes grew, hunt writers could expect solvers to know more and be able to process more. This means that when you write a puzzle now, you have to consider what prior knowledge you’re assuming for the solver. There are a couple possibilities:

  1. There will be multiple people on the team (perhaps even a majority of people on the team) who know this thing. This is common knowledge. Even if people don’t have a particular skill, they know where to find information about it online and how to process that information. Most puzzles fit in this category.
  2. There will be one expert on most teams who has the knowledge or skills needed for this puzzle. If a team doesn’t have this expert, then they probably will not solve the puzzle. Examples of this kind of puzzle include Something Command and Opening Bids.10
  3. Probably no one knows this information, but it’s relatively easy to discover and look up, and it’s relatively quick to apply that information to the puzzle. Examples of this kind of puzzle include Harold and the Purple Crayon and A Wrinkle in Time.
  4. Probably no one knows this information, and you need to make a deep dive into it in order to be able to solve the puzzle. Examples of this kind of puzzle include Large-Scale Anthropomorphism and The MIT Mlystery Hunt…11

Now let’s be clear - while I have created four buckets to sort puzzles into, puzzles themselves are very fluid and often defy categorization. In reality, these are not buckets but the boundaries of a spectrum. But like many things that are on a spectrum, it can be useful to talk about these boundaries and then deal with the edge cases afterwards.

In particular, I would like to highlight the difference between group 2 and group 4. By its very definition, group 2 can get deeper into the knowledge of the subject than group 4 can. When you’re writing a puzzle about group 4, you are assuming that the solver doesn’t know anything about the subject in question, which means that you need to build a puzzle that is either easy enough that the solvers can take the time to learn the material or deliberately builds in ways to teach the solvers what’s going on as they are solving. The very structure of the MIT Mlystery Hunt… does this by telling you how to find an information stream as you are solving a different information stream. Compare this to Something Command, where if you don’t know how to play Magic: the Gathering, well, good luck.12

Going back to the set of puzzles for the seminar, one of the things I made was a very short puzzle written in court stenography. Very few people have ever tried to read court stenography before, so I got to see tons of people’s laptops on very different pictures of court stenographer’s keyboards, trying to interpret the letters that were on the page. There’s no one good dictionary for court stenography, so in the end, the solvers had to get good enough to truly understand what was happening in order to get the answer to the puzzle. A wonderful introduction to the MIT Mystery Hunt.

Become an Expert in Everything!

Coming back to Ascent, every one of these puzzles is a puzzle in group 4. You cannot expect a team to have multiple experts in any language other than English, and the puzzles all leaned heavily on the culture, so there was a lot of work to do. One particular thing that the puzzles did well was writing crossword clues on the presumption that you were someone from that culture, which meant that you would have to think like someone from that country in order to solve the clue. Again, this is overall a good design approach for the puzzles, but it means that every one of them falls under the realm of “Become an Expert” puzzles, so we need to look at them through this lens.

When you look at them this way, it becomes clear that the editing of the Hunt caused two different issues with this round.13 The first is that the Ascent puzzles weren’t really welcoming. A “Become an Expert” puzzle is already facing an uphill battle.14 It involves a subject that not many people know about, which means folks won’t be as excited about it, and it’s going to involve a ton of work, which means it’s probably going to take a while to solve. The Mlystery Hunt… was timed as one of the longest puzzles in testing for the 2022 Hunt, which we were okay with because it was one of only fiveish puzzles to break 3 squad-hours15 in testing. In addition, they’re often not puzzles that parallelize well, which means that a team ends up relying on just a few people trying to break through, which can be a very frustrating experience if they’re not making progress.

In addition, the Hunt’s general editing problems, which made it run over time, meant that teams who were trying to get far in the Hunt had to decide relatively quickly upon looking at a puzzle whether it was worth their time or not. If it wasn’t going to be worth their time, then it was better to just nuke it quickly to try to get to a puzzle that was. When you’re trying to make that decision from just looking at the puzzle, the “Become an Expert” puzzles are going to be among the first to go, since you know that they’re going to take a ton of time and that time is better spent somewhere else. In addition, it means that teams are likely to nuke a puzzle before solving one, which means that they’ll be spoiled as to the “answers are in a foreign language” trick.

This shows up in which Ascent puzzles got solves and which ones were completely skipped. Fountain, How Come This Crossword…, Mosaics, and North Carolina Shopping List all got 3 or 4 solves. These make sense. Fountain started off with a pretty straightforward taste test, How Come This Crossword… was a diagramless where English speakers still had some intuition, Mosaics was a series of English crosswords with some black box shenanigans, and NCSL involved clues that one could easily dig into, along with some symbols at the bottom to clue you into what was going on. You could easily jump into any of these puzzles and start solving, and they gradually got you working with their culture. Compare this to the puzzles that got 0 solves, My Fun Bow and To Numbers and Back Again, both of which were groups of minipuzzles that were quite intimidating to look at, especially once you knew you were going to have to do a deep dive into the culture to solve them.

In general, the Ascent puzzles needed someone to say “this round needs the puzzles to be easier on average because the puzzles by their nature are going to be hard.” This is very different from the other AI puzzles, where the difficulty comes mostly from the meta-level information16 about the puzzle. (Also there’s another whole bit where the puzzles at the end shouldn’t be the hardest puzzles, but that’s a different article.)

This is a long way of saying that I think Ascent’s gimmick is the coolest of the AI rounds, but it was also hurt a lot by the implementation. There’s a part of me that wishes that instead of getting 13 extra Bootes puzzles, we got 13 extra Ascent puzzles, because I would love to see them get solved outside of the framework of the Hunt.17


Oh right, I usually talk about metapuzzles. Let’s talk about the steps that are needed for this metapuzzle:

First of all, hoo boy that is a lot. Let’s just sit with that for a second.

Let’s start off with the fact that everything seems to connect to each other nicely. There are no real random steps that don’t make thematic sense, and the crossword clues are good confirmers so you know you’re on the right track. Singing plus homophones plus vocaloids is a good pairing, and so just from the feel, I’m happy.

I also appreciate that the solve path for this metapuzzle didn’t require you to translate the words into English. That would’ve totally gone against the whole idea of the round. I’m not sure that translating them all into Japanese is that much better. Yes, you’re still delving into a non-English language and culture for the meta, but you have a wonderful opportunity with feeder answers from all different languages and I don’t feel like that was really taken advantage of in the metapuzzle. This is not really a flaw - this is more of an opinion and an editing decision - but still worth mentioning.

That being said, the metapuzzle is also a “Become an Expert” puzzle. It is unreasonable to expect every team to know Japanese homophones, so you’re expecting people to research them when they have no background knowledge. This is already a long metapuzzle, it already involves an obscure reference that is vaguely clued, and you’ve added the “Become an Expert” level on top of it. That’s just a lot. There’s too much in this metapuzzle. This needed to get cut down. Honestly, out of all 4 AI final metas18, this is the one that falls flat the most for me.

Wrapping Up

There’s been one very small thing that’s bugging me. In discussions and notes after the Hunt, teammate mentioned that they were considering using puflantu as a possible language. And while that would be really cool, there’s a part of me that was thinking “Why puflantu and not Chaotian?” I know that puflantu is a meme thanks to the 2019 Galactic Hunt, but the MIT Mystery Hunt has its own constructed language that has been built upon in puzzle upon puzzle, not just by its creator.

Like, puflantu is cool and all, but I would leave it in the realm of the Galactic Puzzle Hunt unless you are writing a puzzle that is directly referencing the Galactic Puzzle Hunts. Chaotian exists!

That’s a minor thing. Ascent was a really cool idea, and I hope that this is not the only time that someone messes around with the concept of a polyglot round.

– Cute Mage

  1. This is why I’m really glad that there’s the Chinese puzzle hunt that exists. I like it when things are made for people other than me - especially people in other cultures than me. 

  2. Well, more of a Social Justice Bard. 

  3. Note that this does not include the metapuzzles from the Wyrm, as while they were used as feeders into other metas, they worked like metapuzzles. They blocked progress, and they didn’t allow you to use the free puzzle answers on them. I’m also not including Touch Grass because that wasn’t used in the metapuzzle at the time of the Hunt. 

  4. Oh come on - you knew that there was going to be a catch. The header for this said “Lots of Tables!” and there’s only been one table so far. 

  5. Okay, that’s not completely true. The first thing that comes to mind is “Holy crap ✈ ✈ ✈ Galactic Trendsetters ✈ ✈ ✈, I thought you were supposed to have a losecomm. You solved THAT many AI puzzles?” 

  6. A couple notes about this - the percentages are the number of real solves out of puzzles in the round. This doesn’t count intermediate metas of Wyrm or the Scavenger Hunt. The total is obtained by the total number of real solves divided by the total number of feeder puzzles. 

  7. Also, it becomes incredibly clear that I am putting money on ✈ ✈ ✈ Galactic Trendsetters ✈ ✈ ✈ to accidentally win the 2024 MIT Mystery Hunt.19 

  8. Again, to emphasize, I do not think solves in the last table are any more important than solves from the earlier tables. I DO NOT THINK SOLVES IN THE LAST TABLE ARE ANY MORE IMPORTANT THAN SOLVES FROM THE EARLIER TABLES. I do think that they tell different stories with different priorities, which is interesting to highlight. 

  9. This is the most math-teacher-esque thing to do for this. 

  10. As usual, I’m using examples from the 2022 MIT Mystery Hunt because it is the puzzle hunt that I am probably the most familiar with. Also I’m biased. 

  11. Yes, that is supposed to be “Mlystery”. It’s a Blaseball reference, may it RIV. 

  12. I mean, I guess it is possible to solve Something Command while learning how to play Magic in that weekend. But whooo boy. Good luck. 

  13. I know, a lot has already been said about teammate’s editing, and I’m generally trying to keep it out of my posts, but this AI round was heavily affected by the editing in interesting ways, so I’m bringing it up here. Sorry. 

  14. Hehe. Get it? Uphill battle? I definitely meant to do that. 

  15. I can’t wait for the post where I debut the unit “squad-hours” as a way of timing your Hunt.20 

  16. Okay, that’s not entirely true about the Wyrm, but that’s a much longer article. 

  17. There is also a part of me that really wants there to be extra Ascent puzzles because then it would really freak out teammate folks upon reading this. 

  18. I am counting the actual finale of ABCDE, not Space Modules. 

  19. Okay, probably not, but it’s funny to say and it’ll scare the crap out of my friends who are on ✈ ✈ ✈ Galactic Trendsetters ✈ ✈ ✈. 

  20. Man, there are a lot of footnotes in this paragraph, aren’t there?