Why AI Can't Actually Do Your Math Homework (And What That Tells Us About Intelligence)

I've been a teaching assistant for six semesters now—everything from differential equations to machine learning, plus a few math camps thrown in. And I need to tell you something important: we're living in a bizarre historical moment. A window that's rapidly closing.

Right now, when students submit AI-generated homework, I can still tell. Not because I'm running it through some detection software or playing digital detective. I can tell because the AI is still stupid in very specific, very revealing ways.

But here's what keeps me up at night: an experimental OpenAI reasoning model just hit a gold-medal score at this year's International Mathematical Olympiad. The same technology that can solve problems that stump 99.9% of humans still fails your calculus homework in ways that would make any TA laugh.

This paradox isn't just funny—it's a countdown clock. And understanding why it's ticking tells us something profound about intelligence itself.

Part I: The Two Ways AI Fails (While We Can Still Notice)

After grading hundreds of problem sets, I've noticed AI-generated work falls into two distinct failure modes. These aren't random errors—they're systematic blind spots that reveal how these systems actually "think."

The Over-Complication Tell

Give an AI a first-year calculus problem without telling it the course level, and watch what happens. It'll pull out Lebesgue integration for a problem meant to test basic derivatives. It'll invoke abstract algebra where simple arithmetic would suffice.

Why? Because AI doesn't understand what "easy" means. It has no concept of pedagogical progression. When it searches its vast training data for relevant methods, it can't distinguish between an approach that would impress a PhD committee and one appropriate for someone who just learned what a limit is last week.

I once received a homework submission that used measure theory to solve a basic optimization problem. The math was technically correct. It was also like using a particle accelerator to crack a walnut. No undergraduate would think to do this—they don't even know measure theory exists. But the AI doesn't understand that the elegant solution, the one we're actually testing for, involves nothing more than setting a derivative equal to zero.
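To make the contrast concrete, here's a hypothetical problem of the kind I mean (my own example, not the actual submission), with the one-derivative solution we're actually testing for:

```latex
% A hypothetical first-year optimization problem (illustrative, not the real homework):
% minimize f(x) = x + 16/x over x > 0.
f'(x) = 1 - \frac{16}{x^2} = 0
  \;\Longrightarrow\; x^2 = 16
  \;\Longrightarrow\; x = 4,
\qquad f''(4) = \tfrac{32}{4^3} = \tfrac{1}{2} > 0,
\qquad f(4) = 8.
```

Two lines, one derivative. On a problem like this, anything heavier is a tell.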

This reveals something crucial: AI has no mathematical intuition. It can't feel the difference between elegant and overwrought. It's like someone who memorized every word in the dictionary but has never had a conversation.

The Brute Force Confession

The second tell is even more obvious. In competitive mathematics, we often ask students to find the smallest number satisfying certain constraints, or prove that something is true for all cases. A human student will look for patterns, symmetries, insights—the "aha!" moment that makes the problem tractable.

AI? It often just... calculates everything.

OpenAI recently published fascinating research on why their models do this. Here's the example that stopped me cold: imagine you're trying to predict someone's birthday. A random guess has a 1-in-365 chance of being right. But if you say "I don't know," you're guaranteed to score zero under the accuracy-only metrics these models are trained on. So they guess. Always.
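Here's a toy version of that incentive, with my own illustrative numbers rather than OpenAI's actual training objective. Under a grader that only counts exact matches, guessing has a small positive expected score and abstaining has none:

```python
# Toy model of an accuracy-only grader: 1 point for an exact match, 0 otherwise.
# The setup is illustrative; it is not OpenAI's actual metric.
p_correct_guess = 1 / 365          # chance a random birthday guess is right
expected_score_guess = p_correct_guess * 1 + (1 - p_correct_guess) * 0
expected_score_abstain = 0.0       # "I don't know" never matches, so it scores nothing

print(f"guess:   {expected_score_guess:.4f}")    # ~0.0027
print(f"abstain: {expected_score_abstain:.4f}")  # 0.0000
# Under this grading, guessing strictly dominates abstaining, so the model guesses.
```

A 0.27% expected score beats a 0% one every time, so "always guess" is exactly the behavior this kind of grading trains.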

When applied to math homework, this creates almost comical results. I've seen AI-generated solutions that literally test every integer from 1 to 1000, conclude no solution exists, and call it a day. No human would ever do this. It's not just inefficient—it's intellectually offensive to anyone who understands mathematics.
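To see the two styles side by side, here's a deliberately easy, made-up example (mine, not from any student submission): find the smallest positive integer divisible by every number from 1 to 10.

```python
from math import gcd
from functools import reduce

# Brute force, AI-homework style: march through candidates until one works.
def smallest_multiple_brute(limit: int = 10) -> int:
    n = 1
    while any(n % k for k in range(1, limit + 1)):
        n += 1
    return n

# The "aha" version: the answer is, by definition, lcm(1, 2, ..., 10).
def smallest_multiple_insight(limit: int = 10) -> int:
    return reduce(lambda a, b: a * b // gcd(a, b), range(1, limit + 1), 1)

print(smallest_multiple_brute())    # 2520, after grinding through 2520 candidates
print(smallest_multiple_insight())  # 2520, from one observation about the problem
```

Both functions return 2520. Only one of them shows you why.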

But here's the thing: for an AI, checking 1000 numbers is trivial. The computational cost is nothing. So why wouldn't it brute force? It doesn't experience the tedium we'd feel. It doesn't appreciate the satisfaction of finding a clever trick.

Every time an AI uses brute force where we'd use insight, it highlights that human intelligence isn't just about getting the right answer. It's about understanding why something is true, feeling the shape of a problem, knowing instinctively which path will be fruitful.

Part II: The Clock Is Ticking

Now here's where things get interesting—and a little scary.

The same company whose AI fails at basic homework in these laughable ways also built a system that scored in the top tier of the International Mathematical Olympiad. Think about that. We're watching AI systems that can simultaneously:

  • Solve problems that would challenge the best mathematical minds on Earth

  • Fail undergraduate homework in ways that reveal fundamental incomprehension

This isn't stable. It's a transition state.

The research on AI hallucination that OpenAI published points toward the fix: future models will be trained and scored on calibrated confidence, not just raw accuracy. They'll learn when to say "I don't know" instead of generating plausible nonsense. They'll be fine-tuned on educational materials to understand which methods are appropriate for which levels.
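What "scored on confidence" could look like is easiest to see with a toy scoring rule: reward correct answers, penalize confident wrong ones, give abstentions zero. The penalty value and threshold below are illustrative assumptions of mine, not OpenAI's published objective:

```python
# Toy confidence-aware grader: +1 if right, -penalty if wrong, 0 for "I don't know".
# The penalty of 3 is an illustrative choice, not OpenAI's actual rule.
def best_move(p_correct: float, penalty: float = 3.0) -> str:
    expected_if_answering = p_correct * 1.0 + (1 - p_correct) * (-penalty)
    expected_if_abstaining = 0.0
    return "answer" if expected_if_answering > expected_if_abstaining else "abstain"

for p in (0.01, 0.25, 0.50, 0.90):
    print(f"confidence {p:.2f}: {best_move(p)}")
# With penalty = 3, answering only pays when confidence exceeds 3/4, so a model
# graded this way learns to say "I don't know" whenever it is unsure.
```

Under accuracy-only grading, the best move is always to answer; add a real cost for being wrong, and "I don't know" becomes the rational choice at low confidence.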

More importantly, they'll get better at mimicking the human problem-solving aesthetic. They'll learn that mathematicians value elegance, that there's beauty in simplicity, that the "right" answer isn't just the one that's technically correct but the one that demonstrates understanding.

When that happens—and it will happen—we won't be able to tell anymore.

So what happens when AI can perfectly mimic a bright undergraduate's homework? When it can not just solve the problem but solve it the "right" way, with the appropriate method, showing all the work a human would show?

This isn't just about academic dishonesty, though that's certainly a crisis brewing. It's about something deeper: what happens to human mathematical thinking when machines can perfectly simulate it?

I work primarily in applied mathematics, where we model real-world problems and optimize systems. If AI can look at any system and find the optimal solution, prove it's optimal, and do so with the elegance we associate with human insight, then what?

The terrifying answer might be: then AI doesn't just do our homework. It does our research.

Think about it. If an AI can look at a math competition problem and recognize the "trick," see the pattern that makes it solvable—really see it, not just brute force it—then it has something approaching mathematical intuition. And if it has that, why couldn't it develop new mathematical insights? Prove new theorems? Solve the unsolved?

We're potentially looking at the last generation of human mathematicians who are meaningfully better than machines at mathematics.

Part III: What Do We Do with the Time We Have Left?

If we're in a countdown—and I believe we are—then what should we do?

For Students

This might be the last generation that actually has to learn math the hard way. That might sound like a raw deal, but it's actually a gift. The struggle to understand, the frustration of being stuck, the joy of breakthrough: these aren't bugs in the learning process. They're features. They're what builds not just mathematical knowledge but mathematical character.

When you're tempted to use AI for your homework, remember: you're not just cheating the system. You're cheating yourself out of possibly the last chance to develop genuinely human mathematical intuition before machines make it obsolete.

Every over-complicated AI solution reminds us that human learning is fundamentally about progression—we build from simple to complex, each step informing the next. We don't just know facts; we know their place in a larger structure of understanding. Students aren't just solving problems. They're building intuition. They're learning not just what works, but what's worth trying.

For Educators

We need to radically rethink what we're testing for. If AI can mimic procedural problem-solving, maybe we need to focus more on conceptual understanding, creative problem-posing, and mathematical communication. We need to ask questions that require not just answers but insight.

More fundamentally, we need to decide: if AI can do all the calculations, what mathematics do humans still need to know? What's the irreducible core that remains uniquely valuable?

The window where we can distinguish human from AI work is our chance to identify what really matters in mathematical education before it becomes impossible to tell the difference.

For Everyone

We're witnessing something unprecedented—the potential automation of intellectual work that we've always considered fundamentally human. Mathematics has been called the language of the universe, and we're watching machines become fluent in it.

AI's current failures are teaching us what makes human mathematical intelligence special. That lesson is only legible while the failures are still visible; once the tells are gone, so is the contrast that made it obvious.

The Bottom Line

Right now, I can still catch AI-generated homework because AI is bad at pretending to be human. It doesn't know what we find easy or hard. It doesn't understand elegance. It can't fake the particular way a student who's been struggling with a concept finally breaks through.

But that's changing. Fast.

OpenAI's research shows they know exactly what's wrong with their models. They're working on fixes. Each new version gets better at mimicking not just our answers but our problem-solving aesthetics.

We're in this unique moment where AI is smart enough to attempt your homework but dumb enough to get caught. This window—where we can still see the difference between human and artificial mathematical thinking—is closing.

The question isn't whether AI will eventually be able to do all our math homework perfectly. It will. The question is: what does that mean for human intelligence? What does it mean for education? What does it mean for the value of human mathematical thinking?

I don't have all the answers. But I do know this: every time I spot an AI's brute force solution, every time I see it over-complicate a simple problem, I'm seeing something that won't last much longer. I'm seeing the last days where human mathematical intuition is demonstrably, visibly different from machine calculation.

We should pay attention to these differences while we still can. Because once AI learns to hide them—once it learns to solve problems the way we do, with our aesthetics and our intuitions—we might forget they were ever there at all.

And in forgetting what makes human mathematical thinking special, we might lose something we can't get back: the understanding of what intelligence really means, before machines made the question moot.
