Somewhere in Navid Safaei’s archive is a stack of old booklets, some photocopied, some scanned with what appears to be equipment from another era. In 2006, he began collecting them. Each year, nations participating in the International Mathematical Olympiad would bring their best problems, printed in those little competition booklets, and distribute them to other delegations.
After that, the booklets would essentially disappear. No one had built a library for them. No one had cleaned and organized what was effectively one of the richest collections of expert mathematical thinking created by any community on earth. Safaei simply kept scanning. Quietly. For almost twenty years.
| Information | Details |
|---|---|
| Project Name | MathNet |
| Lead Institution | MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) |
| Lead Author | Shaden Alshammari, MIT PhD Student |
| Collaborating Institutions | King Abdullah University of Science and Technology (KAUST), HUMAIN |
| Dataset Size | 30,000+ expert-authored problems and solutions |
| Geographic Span | 47 countries across six continents |
| Languages Covered | 17 languages |
| Competitions Included | 143 competitions spanning four decades |
| Comparison to Existing Datasets | Five times larger than the next-biggest dataset of its kind |
| Presented At | International Conference on Learning Representations (ICLR), Brazil |
| Validation Team | 30+ human evaluators from Armenia, Russia, Ukraine, Vietnam, Poland |
| IMO Board Connection | Co-author Sultan Albarakati currently serves on the IMO board |
| Archive Source | 1,595 PDF volumes, 25,000+ pages, including Navid Safaei’s personal collection dating to 2006 |
That somewhat compulsive, largely thankless habit proved to be very significant. His personal archive served as a foundation for MathNet, which is being developed by researchers at MIT’s CSAIL, King Abdullah University of Science and Technology, and the company HUMAIN. MathNet is the largest high-quality dataset of proof-based math problems ever compiled, five times larger than the next-biggest dataset of its kind. More than thirty thousand problems and their solutions. 47 countries. Seventeen languages. 143 competitions. On paper, the scale is startling, and it is still difficult to map out fully what it implies for how machines might eventually help teenagers with calculus and mathematical proofs.
The project’s leader, MIT PhD student Shaden Alshammari, competed in the IMO as a student. She recalls what it was like to train largely on her own, without the support of a national infrastructure or a central place to find problems and worked solutions from mathematical traditions outside her own country.

She said, “No one in their country was training them for this kind of competition,” and there is clearly a personal component to that statement. The dataset she helped create is part of the resource she wished had been available when she was fifteen and trying to figure this out on her own.
Beyond the numbers, what makes MathNet truly interesting is the foundation on which it is built. Most math datasets currently in use scrape problems from community forums, such as Art of Problem Solving, where answers are typically brief, informal, and of varying accuracy. MathNet draws only on official national competition booklets, which contain expert-written, peer-reviewed solutions that frequently run to multiple pages and cover multiple approaches to the same problem. That depth is the point. An AI model trained on comprehensive, multi-path solutions learns something very different from one trained on single-line answers. It is the difference between reading a flash card and watching an expert teacher solve a problem.
It’s possible that this will have a greater impact on the development of AI tutoring tools in the coming years than people currently realize. A sixteen-year-old who struggles with calculus may learn more effectively from a system that has truly internalized multiple pathways through the problem—not just the answer, but also the reasoning, the dead ends, and the moments of choosing between approaches—than from software based on more superficial content. Of course, that is still speculative. However, the foundation being laid here is distinct from previous ones, and that seems important to consider.
The geographic breadth is significant in its own right. Earlier Olympiad-level datasets relied mainly on competitions from the US and China, capturing only a limited portion of global mathematical culture. MathNet spans six continents, including mathematical traditions that are hardly ever found in AI training data. The deputy head of Switzerland’s IMO team, Tanish Patil, pointed out that existing archives lack critical metadata, verified solutions, and standardized formatting. It is still unclear whether this breadth will result in AI systems that teach math in significantly more diverse or inclusive ways, but the possibility now exists in a way it did not before.
Watching this project come to fruition, it feels as though something that should have existed long ago has finally been built. The math was always there, scattered throughout those booklets. It just took someone the better part of twenty years to gather it.

