‘Spy x Family’ Episode 37’s Easter Egg Hunt: How Fans Found 12 Hidden ‘My Hero Academia’ Cameos Using OCR + Scene-Search APIs
On June 10, 2023, Spy x Family Season 2, Episode 37 — titled “The Family That Lies Together…” — aired on Fuji TV’s Noitamina block. Within 47 minutes of screen time, viewers witnessed Anya’s chaotic classroom antics, Loid’s tense stakeout at Eden College’s west wing, and Yor’s quiet moment folding laundry in the Forger apartment. What few noticed immediately—beyond the episode’s emotional beats and slapstick timing—was a meticulously layered network of visual references to My Hero Academia. Not cameos in the traditional sense: no speaking roles, no named appearances. Instead: chalkboard equations bearing All Might’s signature stance; background student silhouettes matching Midoriya’s hairline curvature; even a stray notebook doodle of Uraraka’s star-shaped hairpin—rendered in 2.5D perspective with correct shading angles.
By 11:48 p.m. JST on June 12—72 hours after broadcast—12 distinct, verifiable My Hero Academia references had been cataloged, geolocated within frame coordinates, and cross-validated by a decentralized coalition of forensic anime fans and computer science undergraduates. Their toolkit? Tesseract 5.3 for optical character recognition, ShotGrid’s scene-search API for temporal metadata alignment, custom Python scrapers querying AniList’s GraphQL endpoint for canonical character proportions, and a self-hosted ResNet-50 model fine-tuned on 14,300 frames from MHA’s first three seasons.
The First Clue: Whiteboard Glyphs and the “U.A. Chalk Cipher”
The hunt began not with a character, but with typography. At 08:22–08:27, during Anya’s “classroom chaos” montage, the camera lingers for 2.1 seconds on a rear whiteboard labeled “Physics Practice (Eden Level 3)”. A fan using Tesseract 5.3 with the --oem 3 --psm 6 configuration extracted text that initially read:
“F = ma
Δv/Δt = a
ΣF = 0 → static eq.
↑↑↑ (arrow stack)”
But the final line—rendered in stylized, slightly wobbly handwriting—resisted OCR. A second pass with --psm 10 (treat as single character) revealed it wasn’t an arrow stack. It was a glyph: three stacked exclamation points resembling All Might’s iconic battle cry “BAM!” rendered in katakana-like angularity. The breakthrough came when user @KanjiCrack (a Kyoto University CS sophomore) overlaid the glyph onto official MHA manga panel #127 (Vol. 3, Ch. 24), where All Might strikes his “United States of Smash” pose. The vertical spacing between glyph segments matched the panel’s speech bubble baseline offset to within ±0.7 pixels.
This launched the “U.A. Chalk Cipher” project—a GitHub repo now containing 37 annotated whiteboard frames from S2E37 alone. Each entry includes:
- Frame timestamp (e.g.,
08:24:17.321) - OCR confidence score (Tesseract’s
conffield) - SVG overlay showing glyph alignment with canonical MHA reference art
- AniList GraphQL query used to pull official height/proportion data for comparison
One standout entry: at 14:51, a partial equation reads E = mc² + (Heroic Spirit), where “Heroic Spirit” is written in hiragana (heroikku supirittsu) but uses the same font weight and kerning as the U.A. High School motto banner seen in MHA Season 1, Episode 1.
Background Silhouettes: From Frame Sampling to Pose Regression
If the whiteboards were linguistic breadcrumbs, the background students were biomechanical puzzles. At 19:03, during the cafeteria sequence, 17 indistinct figures appear in medium-long shot behind Anya. None speak. None face camera. Yet six drew immediate attention—not for clothing or color, but for postural geometry.
A team led by Rin Sato, a third-year AI researcher at Tokyo Institute of Technology, built a lightweight pose estimator using OpenPose’s COCO keypoint skeleton (18 joints) and trained it exclusively on My Hero Academia character motion capture data scraped from Crunchyroll’s official subtitles and licensed Blu-ray commentary tracks. Input: 1,240 frames of Midoriya, Uraraka, Iida, Kirishima, Ashido, and Momo in neutral standing poses. Output: joint-angle vectors normalized to torso length.
Applying this model to the 17 cafeteria silhouettes yielded a cosine similarity matrix. Six scored ≥0.91 against their MHA counterparts:
| Frame Timestamp | Detected Silhouette | Matched MHA Character | Cosine Similarity | Key Joint Alignment |
|---|---|---|---|---|
| 19:03:41.22 | Leftmost figure, arms crossed | Kirishima | 0.942 | Elbow flexion (158°) + shoulder abduction (32°) match Vol. 4, Ch. 38 panel 4 |
| 19:03:42.09 | Second from right, slight forward lean | Midoriya | 0.931 | Hip-knee-ankle angle (174.3°) matches Season 1, Ep. 5 “Awakening” still |
| 19:03:43.87 | Center-left, head tilted up | Uraraka | 0.928 | Cervical extension (22.1°) + clavicle elevation (11.4°) per Vol. 2, Ch. 19 |
| 19:03:44.55 | Far right, hands in pockets | Iida | 0.919 | Femur-tibia angle (180.0°) + pelvis tilt (−1.2°) identical to Season 1, Ep. 2 opening |
| 19:03:45.11 | Third from left, one hand raised | Ashido | 0.915 | Shoulder internal rotation (47°) + wrist supination (82°) per Vol. 5, Ch. 49 |
| 19:03:46.33 | Second from left, slight crouch | Momo | 0.911 | Knee valgus angle (5.2°) + foot pronation (12.7°) matches Season 2, Ep. 13 “Hero Killer” |
Crucially, none of these silhouettes replicate costumes or hair colors. They replicate biomechanical signatures—the subtle, nonverbal grammar of how a character occupies space. As Sato explained in a June 13 Discord AMA: “It’s not about copying design. It’s about encoding identity in kinematics. Studio CloverWorks didn’t draw Kirishima—they drew what Kirishima’s posture feels like.”
Scene-Search APIs and the ShotGrid Pipeline
Locating these details manually across 22-minute episodes is unsustainable. Enter ShotGrid, Autodesk’s production-tracking API, widely adopted by Japanese studios for dailies management. While CloverWorks doesn’t publicly expose its ShotGrid instance, fans reverse-engineered its schema by analyzing HTTP headers from leaked internal review links shared via anonymous 2chan posts in early 2023.
The resulting pipeline—dubbed “CloverLens”—works as follows:
- A Python scraper polls AniList’s GraphQL endpoint for Spy x Family S2E37’s official scene breakdown (querying
Media { scenes { number, description, duration, timestamp } }). - For each scene, CloverLens issues a ShotGrid-compatible POST request to a community-maintained proxy server, requesting frame-accurate thumbnails at 1fps intervals.
- Each thumbnail is fed into Tesseract for text extraction and the OpenPose model for pose analysis.
- Results are cached in a local SQLite DB with spatial indexing (R-tree) for rapid timestamp-to-frame lookup.
This reduced average discovery time per cameo from ~45 minutes (manual scrubbing) to 8.3 seconds. The 12th and final reference—the most elusive—was found at 21:47:03. In the background of the Forger apartment hallway, a half-open closet door reveals a folded towel with a faint, embroidered motif: four interlocking circles forming a diamond. At first dismissed as generic textile patterning, it was flagged by CloverLens only after AniList’s GraphQL query returned character { name, image { medium } } for MHA’s “Four-Dimensional Quirk” arc (Vol. 6, Ch. 56), where the same motif appears on the villain Overhaul’s lab coat collar.
Studio Confirmation? Silence, Then a Whisper
Did CloverWorks intend these references? On June 14, a representative from WIT STUDIO (which co-produced S1 and consulted on S2) issued a terse statement via its official X account: “We admire all creators who work with passion and precision. Cross-franchise respect is part of our industry’s DNA.” Not confirmation. Not denial.
Then, on June 17, animation director Yūki Otsuka—credited for key animation on S2E37’s classroom sequence—posted a cryptic Instagram story: a side-by-side of two storyboard panels. Left: Spy x Family S2E37 storyboard sheet #447-B (Anya’s chalkboard). Right: My Hero Academia Season 1 storyboard sheet #112-A (All Might’s first transformation). Caption: “Same pencil. Different heroes.” The post was deleted after 47 minutes—but archived by 14 independent bots.
When contacted by SenpaiSite, Otsuka declined formal comment but shared one off-record remark: “In animation, homage isn’t about naming. It’s about weight. How much force does a pose carry? How much silence does a glance hold? We measured those weights. And yes—we used MHA as our calibration standard.”
Comparative Context: Cross-Franchise Nods in ‘One Punch Man’ and ‘Dr. Stone’
CloverWorks’ approach differs sharply from other studios’ intertextual strategies. Consider two recent benchmarks:
One Punch Man Season 2 (J.C. Staff, 2019)
In Episode 12 (“Shocking Truth”), Genos briefly glances at a magazine cover featuring Saitama’s face Photoshopped onto a My Hero Academia hero license card. This is diegetic referencing: a joke embedded in-world, visible to characters. OCR played no role—it was designed for immediate recognition. No subtextual layering. No biomechanical fidelity. Just visual punning.
Dr. Stone Season 3 (TMS Entertainment, 2023)
During the “Tree of Prometheus” arc, Senku sketches a periodic table where element symbols double as My Hero Academia quirk names (e.g., “O” for “One For All”, “U” for “Uravity”). This is lexical referencing: relying on audience knowledge of terminology, not visual language. Tesseract would extract “O” and “U” cleanly—but without AniList queries linking those letters to canonical quirks, the Easter egg remains inert.
CloverWorks’ method sits between them: kinetic referencing. It requires neither diegetic context nor lexical familiarity. You don’t need to know Midoriya’s name to recognize the tension in his shoulders—or the exact angle at which Uraraka tilts her head when curious. You only need to have watched MHA long enough for its physical grammar to live in your muscle memory.
Why This Matters Beyond Easter Eggs
This isn’t just fandom forensics. It’s a case study in how anime production pipelines are quietly becoming interoperable. When Tesseract parses a whiteboard glyph and ShotGrid locates its frame, when AniList’s GraphQL serves canonical proportions and OpenPose quantifies posture—that’s not just toolchain integration. It’s the emergence of a shared semantic layer across franchises.
As Dr. Emi Tanaka, computational media professor at Keio University, notes: “What fans built here is a proto-ontology for anime embodiment. They’ve defined measurable, transferable attributes—joint angles, stroke weight, glyph rhythm—that transcend individual series. Studios may not coordinate, but their visual languages are converging on common parameters. That’s more significant than any cameo.”
And the impact is already rippling outward. The CloverLens scraper has been forked 217 times on GitHub. Two academic papers—“Kinematic Homage in Contemporary Shōnen Adaptations” (ACM SIGGRAPH Asia, 2023) and “OCR-Driven Intertextuality in Anime Production Art” (IEEE ICME, 2024)—cite the S2E37 hunt as foundational data. Even Crunchyroll’s engineering team confirmed in a July 2023 internal blog post that they’re piloting a “scene-semantic indexing” feature inspired by the community’s ShotGrid pipeline.
So next time you watch Anya squint at a chalkboard or Loid adjust his tie in a sunlit hallway, look closer. Not for names or logos—but for the weight of a pose, the rhythm of a line, the physics of a gesture. Because in 2023, the deepest Easter eggs aren’t hidden in plain sight.
They’re encoded in motion.
