NeurIPS 2025 Reading List

Paper Title | Note
A Implies B: Circuit Analysis in LLMs for Propositional Logical Reasoning | Handles explicit logical reasoning; internal circuit analysis clarifies the types of logic involved, for interpretability
Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent | Theoretical guarantee that Transformers can learn multi-step symbolic reasoning
Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization | Formal results: CoT reasoning is provably learned, with length generalization
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought | Theoretical perspective that frames continuous thought as superposition
LogicTree: Improving Complex Reasoning of LLMs via Instantiated Multi-step Synthetic Logical Data | Multi-step synthetic logical data to improve complex reasoning
SynLogic: Synthesizing Verifiable Reasoning Data at Scale | Large-scale synthesis of verifiable logical data; useful for logical generalization
Enigmata: Scaling Logical Reasoning in LLMs with Synthetic Verifiable Puzzles | Uses logical puzzles for scalable, verifiable reasoning
Compositional Neural Network Verification via Assume-Guarantee Reasoning | Assume-Guarantee Reasoning (AGR) for modular verification; aligns with argument structure
VeriThoughts: Formal Verification Pipeline | Combines formal verification with code generation and reasoning
Evaluating Program Semantics Reasoning with Type Inference in System F | Evaluates program semantics reasoning using type theory
Reviving DSP for Advanced Theorem Proving | Advanced theorem proving (ATP) via DSP techniques
IneqSearch / Ineq-Comp | Inequality theorem proving; suitable for argument/strategy structure experiments
On Learning Verifiers for Chain-of-Thought Reasoning | Studies how to learn verifiers for CoT reasoning loops
Right for the Right Reasons: Avoiding Reasoning Shortcuts via Prototype-Augmented Neurosymbolic AI | Neurosymbolic reasoning that avoids biases and shortcuts to ensure faithfulness
Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks | Introduces formal grammars of uncertainty for trust calibration
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning | Theoretical link between internal probability and self-consistency for reasoning
SATURN: SAT-based Reinforcement Learning to Unleash Language Model Reasoning | Combines SAT solvers with RL to improve reasoning efficiency
Counterfactual reasoning: an analysis of in-context emergence | Studies counterfactual reasoning and in-context dynamics
Mathematical Reasoning Planning for Language Models | Planning framework for structured mathematical reasoning
DuetGraph: Coarse-to-Fine Knowledge Graph Reasoning | Coarse-to-fine knowledge graph (KG) reasoning pipeline
K-DeCore: Continual Structured Knowledge Reasoning | Continual learning for structured knowledge reasoning
GRIP: A Graph-Based Reasoning Instruction Producer | Produces reasoning instructions via graphs
Deliberation on Priors: Trustworthy Reasoning of LLMs on Knowledge Graphs | Explores trustworthy KG reasoning using priors
Personalized Decision Modeling: Utility Optimization or Textualized-Symbolic Reasoning | Decision-making models with symbolic reasoning
SymRTLO: Neuron-Inspired Symbolic Reasoning | Neuron-inspired symbolic reasoning for RTL optimization
Composing Global Solutions via Algebraic Objects in Neural Nets | Uses algebraic objects for compositional global reasoning
Multimodal Symbolic Logical Reasoning | Extends symbolic and logical reasoning to multimodal settings
What’s in Common? Multimodal Models Hallucinate When Reasoning Across Scenes | MLLM hallucinations in cross-scene reasoning
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning | Addresses drift in multimodal reasoning by grounding it in evidence
Collective Reasoning in Performative Prediction | Studies collective reasoning phenomena
Scientists’ First Exam: Probing Cognitive Abilities of MLLM | Probes MLLM cognitive abilities (perception, reasoning, understanding)
Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch? | Evaluates transfer of math reasoning from LLMs to MLLMs
Mechanistic Interpretability of RNNs emulating Hidden Markov Models | Mechanistic interpretability applied to RNN–HMM dynamics
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability? | Questions the limits of causal abstraction in mechanistic interpretability