• Mathematical reasoning resembles reinforcement learning, except that the search space is effectively infinite because an unbounded number of axioms and tactics can be applied at each step. It is also non-adversarial, and the feedback is binary: a proof is either valid or invalid (unlike games, whose positions can be assessed on a continuum).
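
To make the binary-feedback point concrete, here is a minimal Lean 4 sketch (the theorem name is illustrative): the proof checker either accepts the proof term or rejects it, with no partial credit.

```lean
-- The verifier's feedback is binary: the proof either type-checks or it does not.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b   -- accepted: the proof is valid

-- Replacing the last line with e.g. `exact rfl` fails to check, and the whole
-- proof is rejected; there is no notion of a "nearly correct" proof.
```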

Papers

  • 1 Introduces Draft, Sketch, and Prove (DSP), an approach to formal theorem proving in which an LLM drafts an informal proof, autoformalization turns it into a high-level formal proof sketch, and an automated prover fills in the remaining gaps (a pipeline sketch appears after this list).

  • 2 A survey of deep learning methods for mathematical reasoning.

  • 3 Introduces Evariste, which uses an MCTS-like search procedure, called HyperTree Proof Search (HTPS), to find proofs of theorems.

    • The system uses two encoder-decoder transformer models (one for the policy, one for evaluation) trained via online learning, which the paper shows outperforms expert iteration (a simplified sketch of the search loop appears after this list).

  • 4 Explores expert iteration combined with curriculum learning over formal statements. The goal is to have models close open proofs more quickly, in part by incentivizing shorter proofs (see the expert-iteration sketch after this list).

  • 5 Introduces the Fermi Problems challenge, which provides open-ended problems that require real-world knowledge, long chains of reasoning, creativity, and numerical estimation skills to solve.

  • 6 Introduces NATURALPROOFS, a multi-domain dataset of theorems and proofs written in natural language, aimed at bridging the gap between formal and informal mathematical reasoning. It also establishes benchmarks for retrieving and generating the references used in a proof.

  • 7 Introduces GPT-f, an automated prover and proof assistant that pairs a pre-trained transformer with a proof verification system to generate valid proofs. It also bootstraps itself: at each iteration it expands its training set with new proofs it discovers (see the self-training sketch after this list).
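
A minimal sketch of the Draft, Sketch, and Prove control flow described above. The helpers (`draft_informal_proof`, `autoformalize_sketch`, and the `llm`/`prover` interfaces) are placeholders, not the paper's API; the point is the three-stage structure.

```python
# Hypothetical sketch of the Draft, Sketch, and Prove (DSP) pipeline.
# All helpers below are placeholders standing in for an LLM and an automated prover.

def draft_informal_proof(llm, statement: str) -> str:
    """Stage 1: the LLM writes an informal, natural-language proof."""
    return llm.generate(f"Prove informally: {statement}")

def autoformalize_sketch(llm, statement: str, informal_proof: str) -> list[str]:
    """Stage 2: autoformalize the informal proof into a high-level formal
    sketch, returned as a list of intermediate goals with gaps left open."""
    return llm.generate_sketch(statement, informal_proof)

def draft_sketch_prove(llm, prover, statement: str) -> list[str] | None:
    informal = draft_informal_proof(llm, statement)
    sketch = autoformalize_sketch(llm, statement, informal)
    closed = []
    # Stage 3: an automated prover closes each remaining gap in the sketch.
    for goal in sketch:
        proof_step = prover.close(goal)   # returns None if the gap cannot be closed
        if proof_step is None:
            return None                   # the attempt fails if any gap stays open
        closed.append(proof_step)
    return closed
```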
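
A highly simplified sketch of an HTPS-style search loop: a policy model proposes tactics to expand the current goal, a critic scores the resulting subgoals, and value estimates are backed up through the tree. The `Node`, `policy`, `critic`, and `env` interfaces are assumptions for illustration; the actual paper searches over hypertrees of subgoals rather than a plain tree.

```python
import math

# Hypothetical, simplified MCTS-style proof search loosely in the spirit of
# HyperTree Proof Search. `policy` and `critic` stand in for the two
# transformer models; `env.apply(goal, tactic)` stands in for the proof assistant.

class Node:
    def __init__(self, goal, parent=None):
        self.goal, self.parent = goal, parent
        self.children, self.visits, self.value = [], 0, 0.0

def select(node, c=1.0):
    """Walk down the tree, picking the child with the best UCB-style score."""
    while node.children:
        node = max(
            node.children,
            key=lambda ch: ch.value / (ch.visits + 1)
            + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1)),
        )
    return node

def expand(node, policy, env, k=8):
    """Sample k candidate tactics from the policy and add the resulting subgoals."""
    for tactic in policy.sample_tactics(node.goal, k):
        subgoal = env.apply(node.goal, tactic)   # None if the tactic fails
        if subgoal is not None:
            node.children.append(Node(subgoal, parent=node))

def backup(node, value):
    """Propagate the critic's value estimate back to the root."""
    while node is not None:
        node.visits += 1
        node.value += value
        node = node.parent

def search(root_goal, policy, critic, env, iterations=100):
    root = Node(root_goal)
    for _ in range(iterations):
        leaf = select(root)
        if env.is_proved(leaf.goal):
            return leaf                          # binary success signal: proof found
        expand(leaf, policy, env)
        backup(leaf, critic.score(leaf.goal))
    return None
```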
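
Both the curriculum-learning work and GPT-f rely on the same basic self-improvement loop: attempt proofs with the current model, keep only the verifier-checked successes, and retrain on them. A minimal sketch under assumed interfaces (`sample_proof`, `check`, and `finetune` are placeholders):

```python
# Hypothetical sketch of an expert-iteration / self-training loop for neural
# theorem proving: only verifier-checked proofs are added to the training data.

def expert_iteration(model, statements, verifier, rounds=5, attempts_per_statement=4):
    dataset = []                                  # grows with newly found proofs
    for _ in range(rounds):
        new_proofs = []
        for statement in statements:
            for _ in range(attempts_per_statement):
                proof = model.sample_proof(statement)       # placeholder call
                if verifier.check(statement, proof):        # binary feedback
                    new_proofs.append((statement, proof))
                    break                                   # stop after the first verified proof
        dataset.extend(new_proofs)
        model = model.finetune(dataset)           # retrain on all verified proofs so far
        # A curriculum can be imposed by ordering `statements` from easy to hard
        # and only advancing once enough statements at the current level are closed.
    return model, dataset
```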

Links

Footnotes

  1. Jiang, A. Q., et al. (2023). Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

  2. Lu, P., et al. (2023). A Survey of Deep Learning for Mathematical Reasoning

  3. Lample, G., et al. (2022). HyperTree Proof Search for Neural Theorem Proving

  4. Polu, S., et al. (2022). Formal Mathematics Statement Curriculum Learning

  5. Kalyan, A., et al. (2021). How Much Coffee Was Consumed During EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI

  6. Welleck, S., et al. (2021). NaturalProofs: Mathematical Theorem Proving in Natural Language

  7. Polu, S., & Sutskever, I. (2020). Generative Language Modeling for Automated Theorem Proving