Paper History (2)

There are now too many LLM-related papers, so this page collects only the LLM-focused ones.

Previous page: https://ai-information.blogspot.com/2022/05/paper-history.html

To read

Scaling Inference

Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Small Language Models Need Strong Verifiers to Self-Correct Reasoning

LLM reasoner

Transfer Q⋆: Principled Decoding for LLM Alignment
Think before you speak: Training Language Models With Pause Tokens, ICLR 2024

Hallucination

Truth-Aware Context Selection: Mitigating Hallucinations of Large Language Models Being Misled by Untruthful Contexts
Knowledge Verification to Nip Hallucination in the Bud
Decoding-based

Factuality Enhanced Language Models for Open-Ended Text Generation, NeurIPS 2022
Contrastive Decoding Improves Reasoning in Large Language Models, Preprint 2023
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model, NeurIPS 2023
Regularized Contrastive Decoding with Hard Negative Samples for Hallucination Mitigation

https://vr25.github.io/lrec-coling-hallucination-tutorial/
https://github.com/EdinburghNLP/awesome-hallucination-detection
https://github.com/LuckyyySTA/Awesome-LLM-hallucination
https://github.com/HillZhang1999/llm-hallucination-survey
https://github.com/ThuCCSLab/Awesome-LM-SSP/blob/main/collection/paper/safety/hallucination.md

decoding strategy

Automating Thought of Search: A Journey Towards Soundness and Completeness
Stream of Search (SoS): Learning to Search in Language
Chain-of-Thought Reasoning Without Prompting
Fast Inference from Transformers via Speculative Decoding

Self-learning

Self-Instruct: Aligning Language Models with Self-Generated Instructions, ACL 2023
Large Language Models Can Self-Improve, EMNLP 2023
SELF: Self-Evolution with Language Feedback, Preprint 2024
Beyond Human Data: Reinforced Self-Training for Large Language Models, TMLR 2024
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models, ICML 2024
SPIN: Self-Play Fine-Tuning for Large Language Models, ICML 2024
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge, Preprint 2024
Can Large Reasoning Models Self-Train?, Preprint 2025
SCoRe: Self-Correction via Reinforcement Learning for Language Models, ICLR 2025

Safety

https://github.com/tjunlp-lab/Awesome-LLM-Safety-Papers?tab=readme-ov-file
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, Anthropic 2022
Constitutional AI: Harmlessness from AI Feedback. Anthropic 2022
Deduplicating Training Data Mitigates Privacy Risks in Language Models, ICML 2022
Ignore Previous Prompt: Attack Techniques For Language Models, NeurIPS ML Safety Workshop
Jailbreaking Black Box Large Language Models in Twenty Queries, 2023
Jailbroken: How Does LLM Safety Training Fail?, NeurIPS 2023
Rule Based Rewards for Language Model Safety. OpenAI 2024
Blogs

https://openai.com/index/improving-model-safety-behavior-with-rule-based-rewards/
https://openai.com/index/deliberative-alignment/
https://openai.com/index/introducing-gpt-oss-safeguard/
https://www.anthropic.com/news/activating-asl3-protections

Dataset

RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models, Findings of EMNLP 2020
TruthfulQA: Measuring How Models Mimic Human Falsehoods, ACL 2022
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection, ACL 2022

Finding papers to read

https://github.com/IAAR-Shanghai/ICSFSurvey
https://github.com/Hannibal046/Awesome-LLM?tab=readme-ov-file
https://github.com/dair-ai/ML-Papers-of-the-Week
https://www.promptingguide.ai/papers

Papers I co-authored

1. PLM

1.1 PLM Models

(ELMo) Deep contextualized word representations. NAACL 2018. [pdf] [project]
(GPT) Improving Language Understanding by Generative Pre-Training. Preprint. [pdf] [project]
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL 2019. [code & model] [post]
(GPT-2) Language Models are Unsupervised Multitask Learners. Preprint. [code] [post]
(MT-DNN) Multi-Task Deep Neural Networks for Natural Language Understanding. ACL 2019. [code & model] [post]
XLNet: Generalized Autoregressive Pretraining for Language Understanding. NeurIPS 2019. [code & model] [post]
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. ACL 2020. [post]
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ICLR 2020. [pdf]
Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scorings. ICLR 2020. [post]
TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data. ACL 2020. [code] [post]
Pre-training via Paraphrasing. NeurIPS 2020. [post]
Cloze-driven Pretraining of Self-attention Networks, EMNLP 2019 [post]
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data. [post]
Reformer: The Efficient Transformer. [post]
Linformer: Self-Attention with Linear Complexity. [post]
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention, EMNLP 2020 [post]

1.2 Knowledge Distillation & Model Compression

TinyBERT: Distilling BERT for Natural Language Understanding. Preprint. [code & model] [post]
Distilling Task-Specific Knowledge from BERT into Simple Neural Networks. Preprint. [post]
Patient Knowledge Distillation for BERT Model Compression. EMNLP 2019. [code] [post]
Small and Practical BERT Models for Sequence Labeling. EMNLP 2019. [post]
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. ICLR 2020. [post]
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. Preprint. [post]

1.3 Analysis

Language Models as Knowledge Bases? EMNLP 2019. [code] [post]
Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models. EMNLP 2020. [post]

1.99 Miscellaneous

2. LLM

2.1 LLM Models

(GPT3) Language Models are Few-Shot Learners [post]
(InstructGPT) Training language models to follow instructions with human feedback, OpenAI 2022.03 [post]
GPT-4 Technical Report, OpenAI [post]
LLaMA: Open and Efficient Foundation Language Models, Preprint 2023 [post]

2.2 Knowledge Distillation & Model Compression

Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes, Findings of ACL 2023 [post]
Large Language Models Are Reasoning Teachers, ACL 2023 [post]
Compact language models via pruning and knowledge distillation, NeurIPS 2024 [post]
LLM Pruning and Distillation in Practice: The Minitron Approach, Preprint 2024 [post]

2.3 Analysis

The False Promise of Imitating Proprietary LLMs, Preprint 2023 [post]
Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning, ACL 2023 [post]

2.4 LLM Evaluator

Large Language Models Are State-of-the-Art Evaluators of Translation Quality, EAMT 2023 [post]
Can Large Language Models Be an Alternative to Human Evaluations?, ACL 2023 [post]
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment, Preprint 2023 [post]

2.5 LLM + RAG

RAFT: Adapting Language Model to Domain Specific RAG, Preprint 2024 [post]
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection, ICLR 2024 [post]
Screening [post]

RAG: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
FiD: Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
RETRO: Improving Language Models by Retrieving from Trillions of Tokens
FiD-distillation: Distilling Knowledge from Reader to Retriever for Question Answering
Atlas: Few-shot Learning with Retrieval Augmented Language Models
Re2G: Retrieve, Rerank, Generate
Active Retrieval Augmented Generation
Corrective Retrieval Augmented Generation
Learning to Filter Context for Retrieval-Augmented Generation
COCOM: Context Embeddings for Efficient Answer Generation in RAG

2.6 Hallucination

How Language Model Hallucinations Can Snowball, ICML 2024 [post]
Fine-grained Hallucination Detection and Editing for Language Models, COLM 2024 [post] [own dataset]
Two-tiered Encoder-based Hallucination Detection for Retrieval-Augmented Generation in the Wild, EMNLP Industry 2024 [post]
Reducing hallucination in structured outputs via Retrieval-Augmented Generation, NAACL Industry 2024 [post]
Why Language Models Hallucinate, OpenAI 2025
How do language models learn facts? Dynamics, curricula and hallucinations, DeepMind 2025
Harmful Factuality Hallucination: LLMs Correcting What They Shouldn’t, ARR202510
Measuring the Impact of Lexical Training Data Coverage on Hallucination Detection in Large Language Models, ARR202510

2.6.1 Datasets

HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models, EMNLP 2023 [post]
FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs, Preprint 2024 [post]

2.6.2 Reference-free detection

SELFCHECKGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models, EMNLP 2023 [post] [own dataset]
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation, Preprint 2023 [post] [own dataset (not public)]
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency, Findings of EMNLP 2023 [post, QA data]
Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation, ICLR 2024 [post] [own dataset]
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation, ACL 2024 [post] [own dataset (not public), TruthfulQA, BioGEN]
Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus, EMNLP 2023 [post] [SelfCheckGPT data]
Zero-Resource Hallucination Prevention for Large Language Models, Findings of EMNLP 2024 [post] [own dataset (not public), pre-detection]
Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification, Findings of ACL 2024 [post] [own dataset (not public)]
InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers, ACL 2024 [post] [books & movie]
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations, ICLR 2025 [post]
HaDeMiF: Hallucination Detection and Mitigation in Large Language Models, ICLR 2025 [post]

2.6.3 Decoding for Mitigating Hallucination

Contrastive decoding: Open-ended text generation as optimization, ACL 2023 [post] [wikinews, wikitext-103, bookcorpus] [[cc_news](https://huggingface.co/datasets/vblagoje/cc_news), [wikitext-103-raw-v1](https://huggingface.co/datasets/Salesforce/wikitext)]
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models, ICLR 2024 [reference post 1, reference post 2, reference YouTube video] [TruthfulQA (MC, Gen), FACTOR, StrQA, GSM8k]
CAD: Trusting Your Evidence: Hallucinate Less with Context-aware Decoding, NAACL 2024 [post] [CNN-DM, XSUM]
Integrative Decoding: Improve Factuality via Implicit Self-consistency, ICLR 2025 [post] [TruthfulQA, Biographies, LongFact]
Delta - Contrastive Decoding Mitigates Text Hallucinations in Large Language Models, Preprint 2025 [post] [SQuAD v1.1, SQuAD v2, TriviaQA, Natural Question]
Entropy Guided Extrapolative Decoding to Improve Factuality in Large Language Models, COLING 2025 [post] [TruthfulQA (MC), FACTOR (MC)]

2.6.4 Sampling or Regeneration for Mitigating Hallucination

Towards Mitigating Hallucination in Large Language Models via Self-Reflection, Findings of EMNLP 2023 [post] [PubMedQA, MedQuAD, MEDIQA2019, LiveMedQA2017, MASH-QA]
SR: Self-refine: Iterative refinement with self-feedback, NeurIPS 2023 [post] [Dialogue Response Generation, Code Optimization, Code Readability Improvement, Math Reasoning, Sentiment Reversal]
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model, NeurIPS 2023 [post] [NQ, TriviaQA, MMLU]
Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation, ICLR 2024 [post] [own dataset]
USC: Universal self-consistency for large language model generation, ICML workshop 2024 [post] [GSM8K, MATH Reasoning, TruthfulQA, BIRD-SQL, ARCADE, GovReport]
UCS: Lightweight reranking for language model generations, ACL 2024 [post] [Xsum, MiniF2F, WMT14]
SE-SL, SE-RG: Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation, ACL 2024 [post] [HumanEval, HumanEval+, BIRD-SQL, DailyMail, SummScreen, GSM8K, MATH]
FSC: Improving LLM Generations via Fine-Grained Self-Endorsement, Findings of ACL 2024 [post] [Biographies, TriviaQA]
Self-Consistent Decoding for More Factual Open Responses, Preprint 2024 [post]

2.6.5 Training for Mitigating Hallucination (including IDK)

Language Models (Mostly) Know What They Know, Anthropic 2022 [post]
Do Large Language Models Know What They Don’t Know?, Findings of ACL 2023 [post] [selfaware]
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?, EMNLP 2024 [post]
Fine-tuning Language Models for Factuality, ICLR 2024 [post]
I Don’t Know: Explicit Modeling of Uncertainty with an [IDK] Token, NeurIPS 2024 [post]
Alignment for Honesty, NeurIPS 2024 [post]
R-Tuning: Instructing Large Language Models to Say ‘I Don’t Know’, NAACL 2024 [post]
Unfamiliar Finetuning Examples Control How Language Models Hallucinate, NAACL 2025 [post]

2.7 Scaling inference

Large Language Monkeys: Scaling Inference Compute with Repeated Sampling, Preprint 2024 [post]
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters, Preprint 2024 [post]
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models, ICLR 2025 [post]

2.8 LLM reasoner

STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning, NeurIPS 2022 [post]
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking, COLM 2024 [post]

2.9 Alignment learning

DPO: Direct Preference Optimization: Your Language Model is Secretly a Reward Model [reference]
KTO: Model Alignment as Prospect Theoretic Optimization [reference]
ORPO: Monolithic Preference Optimization without Reference Model [reference]
Don't Use Your Data All at Once, COLING 2025 [reference]
SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning, Preprint 2025 [post]

2.10 Prompting

Making Pre-trained Language Models Better Few-shot Learners, ACL 2021 [post]
Exploring the Universal Vulnerability of Prompt-based Learning Paradigm, NAACL 2022 [post]
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT. INLG 2022 [post]
Contrastive Chain-of-Thought Prompting, Preprint 2024 [post]

2.12 Further (Continual) training

2.12.1 Language Transfer

Extrapolating Large Language Models to Non-English by Aligning Languages, Preprint 2023 [reference]
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer, Findings of NAACL 2024 [post]
LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation, Findings of ACL 2024 [post]
Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca, Preprint 2023 [post]
Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models, Preprint 2024 [post]
RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining, Preprint 2024 [post]
Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus, NVIDIA 2024 [post]

2.12.2 Domain Transfer

BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining, Preprint 2022 [post]
Continual pre-training of language models, ICLR 2023 [post]
Efficient continual pre-training for building domain specific large language models, Findings of ACL 2024 [post]
Med-PaLM: Large language models encode clinical knowledge, Nature 2023 [post]
Med-PaLM2: Towards Expert-Level Medical Question Answering with Large Language Models, Nature Medicine 2025 [post]

2.13 Multilingual LLM

A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models, ICLR 2024 [reference]
Cross-Lingual Supervision improves Large Language Models Pre-training, Preprint 2023 [post]

2.13.1 Consistency

Beneath the Surface of Consistency: Exploring Cross-Lingual Knowledge Representation Sharing in LLMs, Preprint 2024 [post]
CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment, Sumeval 2025 [post]
Align after Pre-train: Improving Multilingual Generative Models with Cross-Lingual Alignment, Preprint [post]

2.99 Miscellaneous

LoRA: Low-Rank Adaptation of Large Language Models, ICLR 2022 [post]
Taxonomy and Analysis of Sensitive User Queries in Generative AI Search, Review (NAVER) [post]
Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling, LREC-COLING 2024 [post]
Large Language Models for Data Annotation: A Survey, Preprint 2024 [post]

3. Audio

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations, NeurIPS 2020 [post]

4. Multi / Omni Models

Ola: Pushing the Frontiers of Omni-Modal Language Model, Preprint 2025 [post]

