Infinite Cryptogram (Solution)

by Anders Kaseorg

The puzzle is an infinite string of letters, presented in pages of 2000 letters at a time. Page 1 begins:

AEOLOMWSUOYCMKBSXNSKFSMDYBCSOBBKAEOLOMSXNSKCSOBBKVSWKCSOBBKHBKISXNSKZKZKAEOLOMWS…

We’ll call this level 0. It decodes with Caesar shift key K to:

QUEBECMIKEOSCARINDIAVICTORSIERRAQUEBECINDIASIERRALIMASIERRAXRAYINDIAPAPAQUEBECMI…

which is the phonetic alphabet spelling of level 1:

QMOIVSQISLSXIPQMOIIGLSXERKSIGLSXERKSIGLSKSPJPMQEIGLSZMGXSVTETEQMOIMRHMEDYPYQMOIK…

This now decodes with Caesar shift key E to:

MIKEROMEOHOTELMIKEECHOTANGOECHOTANGOECHOGOLFLIMAECHOVICTORPAPAMIKEINDIAZULUMIKEG…

which is the phonetic alphabet spelling of level 2:

MRHMETETEGLEVPMIZMGXSVHIPXEQMOIKSPJTETERSZIQFIVALMWOICHIPXEMRHMEXERKSEPTLEXERKSV…

This now decodes with Caesar shift key E to:

INDIAPAPACHARLIEVICTORDELTAMIKEGOLFPAPANOVEMBERWHISKEYDELTAINDIATANGOALPHATANGOR…

That’s about as much as we can get from just page 1, so we add on pages 2–5 to read more of level 3:

IPCVDMGPNWDITATRWDWDITAUDMIGDIEPEPAXBPCDKTBQTGWDITAGDBTDKXRIDGUDMIGDISTAIPJCXUDG…

As we continue like this, the Caesar shift keys will spell out the clue phrase, but by the time we’ve gotten through KEEP DECODING, the first letter at level 12 is spelled by 386634276 letters at level 0. Clearly a more efficient strategy is required.

Let \(a_{n,l}\) be the length of an encoded letter \(l\) at the \(n\)th level. By writing these lengths in terms of the lengths at the previous level: \begin{align*} a_{0,l} &= 1, \\ a_{3,\mathtt I} &= a_{2,e_{\mathbf E}(\mathtt I)} + a_{2,e_{\mathbf E}(\mathtt N)} + a_{2,e_{\mathbf E}(\mathtt D)} + a_{2,e_{\mathbf E}(\mathtt I)} + a_{2,e_{\mathbf E}(\mathtt A)} \\ &= a_{2,\mathtt M} + a_{2,\mathtt R} + a_{2,\mathtt H} + a_{2,\mathtt M} + a_{2,\mathtt E}, \end{align*} we can compute a table of \(a_{n,l}\):

\(n\) \(a_{n,\mathtt A}\)\(a_{n,\mathtt B}\)\(a_{n,\mathtt C}\)\(a_{n,\mathtt D}\)\(a_{n,\mathtt E}\)\(a_{n,\mathtt F}\)\(a_{n,\mathtt G}\)\(a_{n,\mathtt H}\)\(a_{n,\mathtt I}\)\(a_{n,\mathtt J}\)\(a_{n,\mathtt K}\)\(a_{n,\mathtt L}\)\(a_{n,\mathtt M}\)\(a_{n,\mathtt N}\)\(a_{n,\mathtt O}\)\(a_{n,\mathtt P}\)\(a_{n,\mathtt Q}\)\(a_{n,\mathtt R}\)\(a_{n,\mathtt S}\)\(a_{n,\mathtt T}\)\(a_{n,\mathtt U}\)\(a_{n,\mathtt X}\)\(a_{n,\mathtt W}\)\(a_{n,\mathtt X}\)\(a_{n,\mathtt Y}\)\(a_{n,\mathtt Z}\) key
011111111111111111111111111K
155754745564448546565767464E
22127312219372023223119182044271834293223392837223121E
39713714510492194100112111157979110322513684171148154121205143176105150102P
4533556831567549838549609593741408423465996548352912566622587953665932512798547D
527123433434632052421507524483063287434892533189023805348358923783704304641703718431145104204260942492403E

We can now use these encoded lengths to skip ahead to the necessary pages while decoding each successive level. This gets us as far as KEEP DECODING THIS DIG DUG PAC MA…, but once we get to about the nine quadrillionth page, we notice that the server has a “bug”: it starts redirecting our requests to round the page numbers to the nearest 64-bit double-precision floating point number.

In order to continue decoding with only the pages that the server will give us, we need the ability to predict the contents of an arbitrary page given guesses for a prefix of the clue phrase. Let \(e_{n,l}(i)\) be the \(i\)th letter of the encoding of an \(n\)th level letter \(l\). Using the lengths \(a_{n,l}\) from the table above, we can compute \(e_{n,l}(i)\) recursively. For example: \begin{align*} e_{0,l}(0) &= l, \\ e_{3,\mathtt I}(i) &= \begin{cases} e_{2,\mathtt M}(i), & 0 \le i < a_{2,\mathtt M}, \\ e_{2,\mathtt R}(i - a_{2,\mathtt M}), & a_{2,\mathtt M} \le i < a_{2,\mathtt M} + a_{2,\mathtt R}, \\ e_{2,\mathtt H}(i - a_{2,\mathtt M} - a_{2,\mathtt R}), & a_{2,\mathtt M} + a_{2,\mathtt R} \le i < a_{2,\mathtt M} + a_{2,\mathtt R} + a_{2,\mathtt H}, \\ e_{2,\mathtt M}(i - a_{2,\mathtt M} - a_{2,\mathtt R} - a_{2,\mathtt H}), & a_{2,\mathtt M} + a_{2,\mathtt R} + a_{2,\mathtt H} \le i < a_{2,\mathtt M} + a_{2,\mathtt R} + a_{2,\mathtt H} + a_{2,\mathtt M}, \\ e_{2,\mathtt E}(i - a_{2,\mathtt M} - a_{2,\mathtt R} - a_{2,\mathtt H} - a_{2,\mathtt M}), & a_{2,\mathtt M} + a_{2,\mathtt R} + a_{2,\mathtt H} + a_{2,\mathtt M} \le i < e_{3,\mathtt I}. \end{cases} \end{align*} Using the server to check various predictions based on the 26 possible guesses for the first letter of level \(n\) (and the corresponding level \(n - 1\) key), we work out more of the clue phrase:

KEEP DECODING THIS DIG DUG PACMAN GALAGA LEGIONS GALAGA XEVIOUS PACMAN CHAMPIONSHIP EDITION NEW RALLYX MS PACMAN MR DRILLER ONLINE GROBDA MOTOS THE TOWER OF DRUAGA METROCROSS GALAXIAN RALLYX KING AND BALLOON BOSCONIAN PAC AND PAL SUPER PACMAN DIG DUG II MAPPY DRAGON BUSTER SKY KID BARADUKE ROLLING THUNDER SKY KID DELUXE PACMANIA GALAGA EIGHTY EIGHT DRAGON SPIRIT POLE POSITION POLE POSITION II GALAGA ARRANGEMENT PACMAN ARRANGEMENT DIG DUG ARRANGEMENT DIG DUG PAC MAN GALAGA LEGIONS GALAGA XEVIOUS PACMAN CHAMP…

Now we really can’t go any farther—we’ve reached the last page, having exhausted the range of a 64-bit double. Fortunately the clue phrase has started repeating itself. The repeated part lists the games of the Namco Museum Virtual Arcade collection (arranged by their order in the collection’s menu screens). The answer is NAMCO MUSEUM VIRTUAL ARCADE.