B. Park | Important Papers

2025

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety	(Korbak, 2025)
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs	(Wen, 2025)
Thinking fast, slow, and everywhere in between in humans and language models	(Prystawski, 2025)
Emergent Symbolic Cognition: A Unifying Computational Framework for Symbolic Thought in Humans and LLMs	(Huddleston, 2025)
AbsenceBench: Language Models Can’t Tell What’s Missing	(Fu, 2025)
Because we have LLMs, we Can and Should Pursue Agentic Interpretability	(Kim, 2025)
Open Problems in Mechanistic Interpretability	(Sharkey, 2025)
Values in the wild: Discovering and analyzing values in real-world language model interactions [url]	2025
Progress on Attention [url]	2025

2024

Kolmogorov–Arnold Transformer	2024
Physics of Language Models: Part 1, Learning Hierarchical Language Structures	(Allen-Zhu, 2024)
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process	(Allen-Zhu, 2024)
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems	(Allen-Zhu, 2024)
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction	(Allen-Zhu, 2024)
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws	(Allen-Zhu, 2024)

2023

2022

2021

2006 ~ 2010

Marker-Passing Inference in the Scone Knowledge-Base System	(Fahlman, 2006)
What Is Answer Set Programming?	(Lifschitz, 2008)
The Physical Symbol System Hypothesis: Status and Prospects	(Nilsson, 2007), PSSH

2001 ~ 2005

1990 ~ 2000

1980 ~ 1990

An Assumption-based TMS	(de Kleer, 1986)
Applications of Circumscription to Formalizing Common-Sense Knowledge	(McCarthy, 1986)
Formalizing Nonmonotonic Reasoning Systems	(Etherington, 1987)
A Logical Framework for Default Reasoning	(Poole, 1988)

1970 ~ 1980

Computer science as empirical inquiry: symbols and search	(Newell, 1976), PSSH

1960 ~ 1970

1950 ~ 1960

1900 ~ 1950

1800 ~ 1900

~ 1800