

AI for the Working Mathematician


Summary. We suggest ways that AI can be useful to the working mathematician. Routine tasks are emphasized, but examples also demonstrate recently-developed AIs’ increasing usefulness for actual research mathematics. We finish with a brief overview of the AI product landscape, and a list of tips for effective use.

What can AI do for me? AI1 speeds up many mundane tasks, especially those that are laborious but whose correctness is easy to verify.

AI is also helpful for finding and understanding information.

Like humans, AIs often say things that are not perfectly correct, but point in the right general direction. Using them is straightforward: open a browser tab to your AI of choice (I most often use Claude), and talk to it much as you would a person. For some tasks, it is also helpful to upload files, such as PDFs of relevant papers, or to provide a brief explanation of your higher-level goal. This contextual information allows the AI to make better guesses about what you want it to do.

AIs have historically been bad at tasks requiring careful, exact reasoning or multiple layers of abstraction. New “chain-of-thought” or “thinking” models are beginning to overcome this weakness, making AIs more useful for directly discussing novel math, e.g. for generating small examples/counterexamples.

Show me the goods! We consider some concrete examples and anecdotes, remarking on salient features.

Example. TikZ code for the Hasse diagram of a partition poset can be generated by brief interaction with ChatGPT. This illustrates a routine interaction with an AI on a task well within its abilities.

We first ask:

Please output tikz code for the poset of partitions contained in (3,3), ordered by containment.

After a couple of minutes of thought, ChatGPT outputs TikZ code, which we compile to the following image.

[Image: Hasse diagram of the partitions contained in (3,3), with nodes labeled ∅, (1), (2), (1,1), (3), (2,1), (3,1), (2,2), (3,2), (3,3).]
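ChatGPT's actual code is not reproduced here, but TikZ along the following lines produces a comparable diagram (the coordinates, node names, and styling are our own illustrative choices, not the model's output):

  \documentclass[tikz]{standalone}
  \begin{document}
  \begin{tikzpicture}[every node/.style={draw, rounded corners, inner sep=2pt}]
    % nodes, arranged by number of boxes (bottom to top)
    \node (e)  at ( 0,0) {$\emptyset$};
    \node (1)  at ( 0,1) {$(1)$};
    \node (2)  at (-1,2) {$(2)$};
    \node (11) at ( 1,2) {$(1,1)$};
    \node (3)  at (-1,3) {$(3)$};
    \node (21) at ( 1,3) {$(2,1)$};
    \node (31) at (-1,4) {$(3,1)$};
    \node (22) at ( 1,4) {$(2,2)$};
    \node (32) at ( 0,5) {$(3,2)$};
    \node (33) at ( 0,6) {$(3,3)$};
    % cover relations: add a single box
    \draw (e)--(1);
    \draw (1)--(2)   (1)--(11);
    \draw (2)--(3)   (2)--(21)  (11)--(21);
    \draw (3)--(31)  (21)--(31) (21)--(22);
    \draw (31)--(32) (22)--(32);
    \draw (32)--(33);
  \end{tikzpicture}
  \end{document}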

The picture is correct, but has some ugly visual features, so we ask ChatGPT to fix them.

This looks great. Can you space out the nodes a little more, remove the outlines on the nodes, and draw a small picture of the young diagram for each partition instead of using labels like (3,3), (3,2), etc?

The updated code compiles to:

[Image: the updated Hasse diagram, with each node drawn as a small Young diagram.]
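In the updated code, each label becomes a small Young diagram; such a diagram takes only a couple of lines of TikZ. A sketch of our own (not ChatGPT's code), for the partition (3,2):

  % Young diagram of (3,2): a row of three unit boxes above a row of two
  \begin{tikzpicture}[scale=0.3]
    \foreach \c in {0,1,2} \draw (\c,1) rectangle ++(1,1);
    \foreach \c in {0,1}   \draw (\c,0) rectangle ++(1,1);
  \end{tikzpicture}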

The new picture is uncentered, but this is again a quick fix!

Can you now make it so that each row with 2 young diagrams is centered, and add in an empty set sign to represent the empty young diagram at the bottom?

We obtain the final image:

[Image: the final Hasse diagram, centered, with an empty-set symbol at the bottom.]

The model used in this example was o3-mini-high.

Example. In a recent project, the class of matroids with no U2,4, K4, or K2,3 minor appeared. We wondered if this class of matroids had a name. We learned that this is the class of matroids of “outerplanar graphs” by talking to ChatGPT. In this conversation, the AI makes some wrong assertions (e.g. that K2,3 is not series-parallel). Correcting the model and feeding it lots of miscellaneous facts about the class eventually allows the AI to identify it.

The model used in our original conversation (linked above) is ChatGPT-4o, a weak, non-thinking model that is the default for ChatGPT’s free tier.

Out of curiosity, we later ran the same query on a thinking model (Claude 3.7 Sonnet); a transcript of the conversation is below. The thinking model makes the same mistake, first guessing incorrectly that the class we’re dealing with is series-parallel matroids. After being corrected, the thinking model requires much less information than 4o before it suggests looking at outerplanar graphs. In fact, it is able to notice on its own that outerplanar graphs are precisely what we are looking for!

The model only observes this after being asked to recall the definition of outerplanar graphs, which it had previously suggested in a tone of relatively low confidence. Even thinking AIs do not always handle multi-layered abstractions well, so it can be helpful to manually prompt them to unroll the layers.

Me: Is there a name for the class of matroids with excluded minors U2,4, K4, and K2,3?

AI: I don’t believe there is a standard name in matroid theory for the specific class defined by exactly these three excluded minors (U2,4, K4, and K2,3).

The exclusion of K4 and K2,3 as minors is related to series-parallel networks, but the additional exclusion of U2,4 makes this a more specialized class that doesn’t correspond to one of the commonly named matroid classes in the literature.

Me: Does it help to know that the class is closed under parallel connection?

AI: Yes, that’s helpful information. The class you’re describing is most likely the series-parallel matroids.

Series-parallel matroids are precisely those that can be constructed from single-element matroids using series and parallel connections. They can be characterized by their excluded minors:

This class is indeed closed under parallel connection by definition, which aligns with the additional information you provided.

Series-parallel matroids also correspond to the graphic matroids of series-parallel graphs, and they have applications in network reliability and electrical circuit theory.

Me: K2,3 is a series-parallel matroid though

AI: You’re absolutely correct - I made an error. K2,3 is indeed a series-parallel graph/matroid, not an excluded minor for that class.

The class of matroids excluding U2,4, K4, and K2,3 as minors is therefore:

This appears to be a more specialized class than just series-parallel matroids. It could be related to outerplanar matroids or some other restricted class, but I don’t immediately recognize a standard name for matroids with exactly these three excluded minors.

The closure under parallel connection is still an important property but doesn’t immediately reveal the standard name for this specific class, if one exists in the literature.

Me: What are outerplanar matroids?

AI: Outerplanar matroids are the graphic matroids of outerplanar graphs.

An outerplanar graph is a graph that can be drawn in the plane with all vertices on the exterior face (the outer face) and no edges crossing. The key property of outerplanar graphs is that they are precisely the graphs that exclude K4 and K2,3 as minors.

Therefore, outerplanar matroids:

This matches exactly the excluded minor characterization you mentioned! So the class of matroids with excluded minors U2,4, K4, and K2,3 is precisely the class of outerplanar matroids.

This class is indeed closed under parallel connection, as adding parallel elements to an outerplanar graph preserves its outerplanarity.

The next few examples show new “chain-of-thought” or “thinking” models performing tasks that would have been beyond the reach of AIs less than a year ago.

Example. After reading our paper, ChatGPT produced a nontrivial example of the “Bergman fan of a polymatroid”, a complicated object that we defined in the paper. This is a complex task requiring the compilation of information from multiple locations in the paper and careful calculation using that information. This task is near the limit of current AI capabilities; performance here may not be so reliable.

The early part of the chat consists of failed attempts to get ChatGPT to access and read our paper on the web. This did not work (through a fault of the interface rather than the model), so we finally uploaded the PDF, along with the prompt

Read this PDF. Come up with an example for definition 1.7 of the Bergman fan.

After 25 seconds, the model responded with a worked-out example of a polymatroid and an explanation of how to calculate its Bergman fan. This was correct, but we wanted a more explicit description of the Bergman fan, so we followed up with

Enumerate the cones of the Bergman fan explicitly in this example.

The AI did this correctly after 98 seconds of thought.

The generated example was the polymatroid P on E = {1,2} with rank function ρ(∅) = 0, ρ(1) = 1, ρ(2) = 2, and ρ(12) = 3. To calculate the Bergman fan of P, the AI needed to first understand another definition from the paper, the “multisymmetric lift” of P. The AI described the multisymmetric lift, then listed the ten cones of the Bergman fan of P.
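As a quick sanity check (ours, not part of the AI's output): ρ is indeed a polymatroid rank function, being normalized, monotone, and submodular; since ρ(2) = 2 > 1, it is not a matroid rank function. In LaTeX, the only nontrivial submodularity instance reads:

  % Submodularity \rho(A)+\rho(B) \ge \rho(A\cup B)+\rho(A\cap B);
  % on E = \{1,2\} the only nontrivial instance is A = \{1\}, B = \{2\}:
  \[
    \rho(1) + \rho(2) = 1 + 2 = 3 \;\ge\; 3 + 0 = \rho(12) + \rho(\emptyset).
  \]
  % Monotonicity: 0 = \rho(\emptyset) \le \rho(1) \le \rho(12) and \rho(\emptyset) \le \rho(2) \le \rho(12).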

[Image: the ten cones of the Bergman fan of P, as computed in the chat.]

The model used in this example was OpenAI’s o1.

Example. A small error was recently discovered in a different paper of mine3, having crept past three authors, plus referees and readers. I asked Claude whether the offending statement was true. Claude first said (incorrectly) that the statement was likely true, giving a heuristic reason similar to our original intuition. With a little prodding to think harder, Claude correctly concluded that the statement was false and provided a small counterexample. It finished by correctly suggesting that the condition of the false statement’s if-and-only-if might be sufficient but not necessary. Prominent in this example is the usefulness of asking an AI to think more carefully about a past response.

A transcript of our short interaction is below.

Me: A math paper claims that: If M is a matroid on ground set E and A1,…,Am is a collection of subsets of E, then there is an independent transversal I ⊆ E ∖ e of A1,…,Am for any element e ∈ E if and only if for all nonempty J ⊆ [m], rk(⋃j∈J Aj) ≥ |J| + 1. Do you think this statement is true?

AI: After analyzing this statement, I believe it is true.

The condition rk(⋃j∈J Aj) ≥ |J| + 1 for all nonempty J ⊆ [m] ensures there’s enough “richness” in the collection of sets to find an independent transversal avoiding any single element.

We can understand this condition intuitively: for any subcollection of sets indexed by J, we need their union to have rank at least |J| + 1. This extra “+1” in the rank requirement gives us the flexibility to avoid any single element e while still finding an independent set with exactly one element from each Ai.

This is a natural extension of Hall’s marriage theorem to matroids with the additional constraint of avoiding a specific element.

Me: Are you sure?

AI: I need to reconsider my answer. After further analysis, I don’t think the statement is necessarily true.

Let me provide a potential counterexample:

Consider a uniform matroid U2,3 on ground set E = {1,2,3}, where any 2 elements form an independent set.

For J = {1,2}, we have:

So rk(⋃j∈J Aj) < |J| + 1, failing the condition.

Yet we can find independent transversals avoiding any element:

This suggests the condition is sufficient but not necessary.

This example was generated using API calls to Claude 3.7 Sonnet. Our interaction happened in a mix of Unicode and ASCII math; I asked Claude to transcribe it into LaTeX, producing what you see above.

Example. ChatGPT was asked to “calculate the characteristic polynomial of the Hesse configuration”. The Hesse configuration consists of the twelve lines in the affine plane over the field with three elements. The AI inferred that the characteristic polynomial of the matroid of this configuration was desired, and correctly calculated it.
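As background (this is not a transcript of the AI's computation), the characteristic polynomial of a matroid M on ground set E with rank function rk can be computed from the standard formula:

  \[
    \chi_M(t) \;=\; \sum_{S \subseteq E} (-1)^{|S|}\, t^{\,\mathrm{rk}(E) - \mathrm{rk}(S)}.
  \]

Evaluating it requires knowing the rank of every subset of the ground set.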

We do not have a link or transcript of this conversation; the model used was o1-pro.

How do I try it? The leading AI providers are Anthropic (who produce Claude), OpenAI (ChatGPT), and Google (Gemini).

To try out an AI produced by one of these companies, you will need to make an account (an existing Google account is sufficient for Gemini). Each company has several different AI models to choose from.

The least powerful models can be used free of charge. These models handle simple tasks with aplomb, but may fall down on questions that require extended or careful reasoning, or synthesis of lots of information. They also may come with daily usage caps.

With a subscription (typically about 20 USD/month), users can access “thinking” models, which write out intermediate chains of thought before replying, much as you might use scratch paper. Thinking models are slower, but more robust. They are able to handle more complex and exacting tasks than non-thinking models, are much better at logical reasoning, and can to an extent notice and correct their own mistakes. Leading examples in this category are Claude 3.7 Sonnet and ChatGPT o3-mini-high4.

OpenAI also offers a “Pro” subscription (200 USD/month), whose main draw is access to o1-pro, a thinking model that is able to think for an especially long time.

Tips and pointers.

Acknowledgements. This note was improved by conversations, comments, and examples contributed by Anna, Artane, Asvin, Colin, Daoji, June, Liban, and Trajan. Thank you all.

Exhortations. Please feel free to let me know if I’ve made a mistake or irksome omission. I am also happy to hear of new applications (even small ones: 10 minutes saved six times is a whole hour), anecdotes, or random musings on this topic.

Finally, if you hold appropriate office (e.g. OpenAI/Anthropic/Google engineer), and want to both accelerate algebraic geometry research and earn eternal gratitude from the community, please make your models write better Macaulay2 :)

Last update. March 21, 2025

1For the purposes of this essay, “AI” will refer to LLM-based chatbots such as ChatGPT, Claude, and Gemini, as this is what the term is now identified with in the popular discourse.

2Sadly, no model that I know of writes good Macaulay2, perhaps due to lack of training data. Lots of mathematicians would be very happy if this changed!

3Matt Larson noticed that Proposition 5.2.3 of this paper is incorrect as stated. A fix has been posted.

4If the “Last update” date of this page is more than 3-ish months ago, it is likely that these are not the leading models anymore.