AI in mathematics is forcing big questions

rrbanffy about 6 hours ago 36 commentsRead Article on spectrum.ieee.org

⚡ Community Insights

Discussion Sentiment

48% Positive

Analyzed from 1586 words in the discussion.

Discussion (36 Comments)Read Original on HackerNews

skybrian•about 1 hour ago

Here’s one way to think about the difference between coming up with a formal proof and having something other mathematicians can use:

> A clear explanation can be found in Alex Kontorovich’s account of his own learning curve with formalized mathematics. In a nutshell: Mathlib, the dominant Lean library, is a human-curated formalization of an ever-growing fraction of existing human mathematics. It exposes clean APIs and abstractions, without which no autoformalization could take place. By contrast, Math Inc’s autoformalized proof of Viazovska’s results exposes no intelligible interface. Who in their right mind would merge a 200,000-line unaudited vibe-coded blob into the master branch of global human science?

https://davidbessis.substack.com/p/the-fall-of-the-theorem-e...

rirze•6 minutes ago

I think we’re going to find out the hard way that the proofs left to solve are very much not elegant.

fiforpg•about 3 hours ago

The use of computers in mathematics has been somewhat controversial from the very start.

There are of course all the computer-assisted proofs (see 4 color theorem), as well as the partially-assisted ones (see Viazovska et al on packing problems in dimensions 8, 24). But even finding a solution numerically, then rigorously verifying its properties can leave a lingering sense of incompleteness, of a gap in understanding. I like this one quote by (allegedly) Wigner that illustrates it well:

"It is nice to know that the computer understands the problem, but I would like to understand the problem, too."

rdedev•about 2 hours ago

Reminded me of this quote: the problem with machine learning is that it's the machine that does the learning

jackyinger•about 3 hours ago

To bluntly put it in a nutshell, and state the obvious:

If you don’t understand the problem you can’t be sure that the computer does.

avaer•about 2 hours ago

As a programmer I definitely get annoyed when I see code and I don't understand what it does.

But I also definitely don't understand the problem if I can't get the computer to understand it, with tests.

In some sense I always considered programming to be more trustworthy than maths arguments without the certainty of a solver proof.

With all of these questions in the air, epistemology might be making a comeback.

therobots927•about 2 hours ago

Tests only work for a limited set of programming verification. In many cases you don’t actually know what the output for any given input should be, so there’s no way of verifying the AI-generated code. You just kind of have to trust it. The only exception I can think of is robotics and quantitative trading. Which have already been extensively utilizing AI.

akoboldfrying•about 1 hour ago

Well, if you can formalise the problem statement (this is the hard part) sufficiently well that the computer can produce a proof, you can be very sure the proof is sound.

A fundamental property of any formal proof is that it can be checked by a fairly stupid machine, automatically, because every step is a simple mechanical operation that names one of a handful of axioms and refers to a handful of earlier steps, the truth of which has already been established. So while coming up with a proof may require genius-level thinking, checking an existing fully fleshed out proof is simple -- just potentially very tedious because of the sheer number of steps.

That said, a typical human-written proof omits many steps considered "obvious" to a trained mathematician. Converting this to a formal proof involves interpreting what the original author "must have meant", which requires a lot of expertise and can go wrong -- or it may reveal that there is some inconsistency in the original claim itself.

bsder•19 minutes ago

> checking an existing fully fleshed out proof is simple

The controversy around Mochizuki and the "abc Conjecture" proof is a contrary example.

seanmcc•about 3 hours ago

Almost another layer in the peer review process in the best case right? Just a different kind of peer you have to review.

wbl•about 1 hour ago

Look up the story of Flyspeck for this taking an entire career.

therobots927•about 2 hours ago

So… more peer review backlog. That sounds fun. Oh, you want someone to review your paper, Mr phd in mathematics with 20 years of experience? Get in line behind chatGPT.

kimjune01•41 minutes ago

lean compiles or it doesnt

andai•38 minutes ago

>more recently, a new general-purpose AI system from OpenAI disproved an important conjecture in combinatorial geometry. This result would have been worthy of publication in a major mathematics journal if humans had been the authors

The quality of the mathematics is a function of who has authored it?

cpard•about 2 hours ago

Human mathematicians could become “priests to oracles.”

Priests were interpreting the oracles (at least at a place like Delphi) according to the context of the people asking the questions aka participating in politics of that ancient times.

Subjectivity was a feature and I’m not sure that fits to mathematics though.

I wonder if mathematics as a science field moves more into engineering or if a different branch will emerge that is closer to that because to the point of the article, science is about understanding not just results.

therobots927•about 2 hours ago

Human mathematicians could become “priests to oracles.”

This is a decidedly anti-enlightenment statement.

glouwbug•about 4 hours ago

Turns out you have to be Terence Tao to know when an LLM is right or wrong

gerdesj•about 4 hours ago

"I imagine my work could be completed with AI assistance in a matter of days—maybe hours."

Would some one with tokens to burn mind checking that statement out and post back. Be sure to use long dashes too.

bijowo1676•about 2 hours ago

is the similar statement true for coding as well?

i.e. You have to be a good engineer to understand the well generated LLM code and a program

glouwbug•about 2 hours ago

Yes, that's the point I'm making

paulpauper•about 3 hours ago

Yeah, so much for AI making mathematicians obsolete.

lubujackson•about 4 hours ago

The article poses if AI will be a tool, a collaborator or an oracle. Why not all 3?

If mathematics is human understanding of logical consequences, understanding is the priority. But if AI proves something we can't understand but can utilize, that is a different sort of useful.

We are getting awfully close to "the answer of the universe is 42" and having it not be a joke...

fn-mote•about 3 hours ago

I don’t know about “close”, but there are certainly results in math that are considered deep because they require the use of a “Hard Theorem” at some point. That kind of building on top of something Very Difficult is still possible without understanding the “Very Difficult” part. I’d say a lot of not-amazing math is built by believing the platform works but not being able to built it yourself.

I couldn’t build an internal combustion engine or even a plastic box, so maybe there’s nothing wrong with this approach.

mmooss•5 minutes ago

There's yet another major issue of the centralization of power and knowledge:

> Some worry about the accessibility of AI tools. Traditionally, mathematicians have required little more than intuition, training, and a pen and paper to advance their field. If this slow, deliberative process is no longer valued by society, and particularly by research funders, then mathematics could become an elitist activity, only practiced by select organizations that can afford to work with proprietary AI models.

This can be true of anything LLMs do better than existing options. Because the best LLMs require enormous resources to develop, access to them can be very limited. Right now they are priced for market share. What happens when your small law firm attorney, or self-representation, goes up against a large firm with LLM expertise? Can the kid from the poor high school compete with the kid from the rich school with premium LLM access, in mathematics for example?

morpheos137•39 minutes ago

Much can be resolved when it is understood math is discovered not created. AI is a tool. if it makes discovery or proof easier that is still mathematics. A proof stands on its own logic regardless how it is derived. The root concern is how ai may provide uplift for mathematical discovery outside of socially expected channels.

therobots927•about 2 hours ago

It’s a well known problem in higher mathematics that even if you’ve solved a problem, often the proofs are incredibly long and complex and require an extensive amount of time spent by peers to review it.

It would be great if someone could explain to me how AI improves this situation. Even if AI thinks it’s solved a problem, unless the proof is incredibly efficient and well explained, it will be difficult to verify the correctness. One hallucination in 300 steps of logic is enough to destroy the entire proof.

hilbertseries•about 1 hour ago

In 2012 Mochizuki claimed to have proved the abc conjecture by developing a new branch of mathematics. He was a respected mathematician, but the theories he had developed were so complex no one could determine if he was correct. It took six years until two number theorists dissected the proof and found a fatal flaw in it.

jonahx•about 1 hour ago

> It would be great if someone could explain to me how AI improves this situation.

It's main utility is in the search step, not the verification step. The search is the bulk of the work and creativity. Separately, as the sibling commenter pointed out, it will likely get better at the verification step as well, with integrations of tools like Lean.

> One hallucination in 300 steps of logic is enough to destroy the entire proof.

The situation with human mathematicians is not much different. Eg, Wiles original proof of Fermat's Last Theorem contained errors found by reviewers, which he later repaired.

skipkey•about 1 hour ago

I would imagine that in the future AI will be doing proofs in Lean or whatever the successor to it, which gives you a pretty good confidence it’s correct.

cdetrio•39 minutes ago

The article addresses that, a formal verification layer provides a computational check that a proof is correct.

There's an interesting wrinkle though. The formal definitions and statements need to be correct, i.e. need to faithfully translate the human-readable (informal) definitions and statements. A few months ago, Kevin Buzzard created an auto-formalization challenge: https://gist.github.com/kbuzzard/778bc714030b3e974ab5f403878...

It's a handful of human-written lines, a skeleton that defines an "API surface". If you use AI to generate the rest of the code, the API surface (along with the Lean compiler) acts to guarantee that the whole generated solution (tens of thousands of lines) will be correct.

The discussion thread about the challenge, which was solved a couple weeks ago, is here: https://leanprover.zulipchat.com/#narrow/channel/583336-Auto...

Without the human-written or human-audited skeleton, you might get AI-generated slop Lean code that compiles, but contains nonsensical definitions/statements and thus vacuous proofs. Though from what I gather, recent models are much improved in their capability to generate correct statements and definitions and proofs, for both formal math and informal math.

A discussion thread about a different auto-formalization effort without human-written API surface, with some correct output and some slop: https://leanprover.zulipchat.com/#narrow/channel/583336-Auto...

paulpauper•about 3 hours ago

It's amazing how much attention this issue has gotten. What is lost in the hype is no AI can tell you if a proof is correct. An AI can produce a convincing looking proof, but it can have a subtle but critical error or make an assumption that is unfounded. Thus, it ultimately comes down to humans. A mathematician has to craft the prompt, and mathematician to interpret/check the results. Also, these programs are very expensive and propitiatory. They are not like the commercial AI that regular people use. It takes considerable prompting and trial an error to solve even Olympiad/Putnam problems, and tons of work by humans pouring over the results to see if it's correct. For every Erdos problem that captures the headlines, there are many where it failed or untold hours of prompting and token burn to get that result, and manhours verify it.

golly_ned•about 1 hour ago

Please read the article. You've ignored proof checkers.

treyd•about 3 hours ago

I don't think you understand the workflow. Terrence Tao has done a lot of work using them in conjunction with LEAN.

You aren't having the AI check the proof, you interactively work on the same LEAN proof, handing off between the AI assistant and having LEAN check it and provide feedback for both of you when there's a mistake.

ares623•about 3 hours ago

But just imagine...

(edit: lol didn't realize the sibling comment below is essentially my comment)

hackermailman•about 2 hours ago

AI can't yet come up with any new ideas to make the inductive leap to solve a math problem. New ideas are what get the accolades and using an old idea just means the original author missed something. We are still at the author missed something stage that AI is doing today.

It can definitely be a good research assistant though