NeurIPS 2025 Reading List

Paper Title	Note
A Implies B: Circuit Analysis in LLMs for Propositional Logical Reasoning	Circuit-level analysis of how LLMs carry out explicit propositional logical reasoning internally
Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent	Theoretical guarantee that multi-head Transformers trained by gradient descent can learn multi-step symbolic reasoning
Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization	Formal results showing that Transformers provably learn chain-of-thought (CoT) reasoning, with length generalization
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought	Theoretical perspective that frames chain of continuous thought as reasoning by superposition
LogicTree: Improving Complex Reasoning of LLMs via Instantiated Multi-step Synthetic Logical Data	Multi-step synthetic logical data to improve complex reasoning
SynLogic: Synthesizing Verifiable Reasoning Data at Scale	Large-scale verifiable logical data synthesis, useful for logical generalization
Enigmata: Scaling Logical Reasoning in LLMs with Synthetic Verifiable Puzzles	Uses logical puzzles for scalable, verifiable reasoning
Compositional Neural Network Verification via Assume-Guarantee Reasoning	Assume-Guarantee Reasoning (AGR) for modular verification, aligns with argument structure
VeriThoughts: Formal Verification Pipeline	Combines formal verification with code generation & reasoning
Evaluating Program Semantics Reasoning with Type Inference in System F	Evaluates program semantics reasoning using type theory
Reviving DSP for Advanced Theorem Proving	Advanced theorem proving via the Draft, Sketch, and Prove (DSP) approach
IneqSearch / Ineq-Comp	Inequality theorem proving, suitable for argument/strategy structure experiments
On Learning Verifiers for Chain-of-Thought Reasoning	Studies how to learn verifiers for CoT reasoning loops
Right for the Right Reasons: Avoiding Reasoning Shortcuts via Prototype-Augmented Neurosymbolic AI	Prototype-augmented neurosymbolic reasoning that avoids biases and reasoning shortcuts, aiming for faithful reasoning
Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks	Introduces formal grammars of uncertainty for trust calibration
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning	Theoretical link between internal probability and self-consistency in LLM reasoning
SATURN: SAT-based Reinforcement Learning to Unleash Language Model Reasoning	Combines SAT solvers with RL to improve reasoning efficiency
Counterfactual reasoning: an analysis of in-context emergence	Studies counterfactual reasoning and in-context dynamics
Mathematical Reasoning Planning for Language Models	Planning framework for structured mathematical reasoning
DuetGraph: Coarse-to-Fine Knowledge Graph Reasoning	KG reasoning pipeline, coarse-to-fine
K-DeCore: Continual Structured Knowledge Reasoning	Continual learning for structured knowledge reasoning
GRIP: A Graph-Based Reasoning Instruction Producer	Produces reasoning instructions via graphs
Deliberation on Priors: Trustworthy Reasoning of LLMs on Knowledge Graphs	Explores trustworthy KG reasoning using priors
Personalized Decision Modeling: Utility Optimization or Textualized-Symbolic Reasoning	Decision-making models with symbolic reasoning
SymRTLO: Neuron-Inspired Symbolic Reasoning	Neuron-inspired symbolic reasoning for RTL optimization
Composing Global Solutions via Algebraic Objects in Neural Nets	Uses algebraic objects for compositional global reasoning
Multimodal Symbolic Logical Reasoning	Extends symbolic + logical reasoning into multimodality
What’s in Common? Multimodal Models Hallucinate When Reasoning Across Scenes	Multimodal LLM (MLLM) hallucinations in cross-scene reasoning
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning	Addresses reasoning drift in video reasoning through evidential grounding
Collective Reasoning in Performative Prediction	Studies collective reasoning in the performative prediction setting
Scientists’ First Exam: Probing Cognitive Abilities of MLLM	Probes MLLM cognitive abilities (perception, reasoning, understanding)
Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?	Evaluates transfer of math reasoning from LLMs to MLLMs
Mechanistic Interpretability of RNNs emulating Hidden Markov Models	Mechanistic interpretability applied to RNN–HMM dynamics
The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?	Questions limits of causal abstraction in mechanistic interpretability