Paper History (2)
There are now too many LLM-related papers, so this page collects only the LLM-focused ones.
Previous page: https://ai-information.blogspot.com/2022/05/paper-history.html
To read
- Scaling Inference
- Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
- Small Language Models Need Strong Verifiers to Self-Correct Reasoning
- LLM reasoner
- Transfer Q⋆: Principled Decoding for LLM Alignment
- Think before you speak: Training Language Models With Pause Tokens, ICLR 2024
- Omni
- LLaMA-Omni: Seamless Speech Interaction with Large Language Models
- OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding
- Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities
- VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
- Qwen2.5-Omni Technical Report
- Hallucination
- Truth-Aware Context Selection: Mitigating Hallucinations of Large Language Models Being Misled by Untruthful Contexts
- Knowledge Verification to Nip Hallucination in the Bud
- Decoding-based
- Factuality Enhanced Language Models for Open-Ended Text Generation, NeurIPS 2022
- Contrastive Decoding Improves Reasoning in Large Language Models, Preprint 2023
- Inference-Time Intervention: Eliciting Truthful Answers from a Language Model, NeurIPS 2023
- A Single Model Ensemble Framework for Neural Machine Translation using Pivot Translation, Preprint 2025
- Regularized Contrastive Decoding with Hard Negative Samples for Hallucination Mitigation
- Training-based
- Self-training
- Self-Instruct: Aligning Language Models with Self-Generated Instructions, ACL 2023
- Large Language Models Can Self-Improve, EMNLP 2023
- SELF: Self-Evolution with Language Feedback, Preprint 2024
- Beyond Human Data: Reinforced Self-Training for Large Language Models, TMLR 2024
- SPIN: Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models, ICML 2024
- Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge, Preprint 2024
- Can Large Reasoning Models Self-Train?, Preprint 2025
- SCoRe: Training Language Models to Self-Correct via Reinforcement Learning, ICLR 2025
- https://vr25.github.io/lrec-coling-hallucination-tutorial/
- https://github.com/EdinburghNLP/awesome-hallucination-detection
- https://github.com/LuckyyySTA/Awesome-LLM-hallucination
- https://github.com/HillZhang1999/llm-hallucination-survey
- https://github.com/ThuCCSLab/Awesome-LM-SSP/blob/main/collection/paper/safety/hallucination.md
- decoding strategy
- Automating Thought of Search: A Journey Towards Soundness and Completeness
- Stream of Search (SoS): Learning to Search in Language
- Chain-of-Thought Reasoning Without Prompting
- Fast Inference from Transformers via Speculative Decoding
- Continual pre-training
- Investigating Continual Pretraining in Large Language Models: Insights and Implications, 2024
- BloombergGPT: A Large Language Model for Finance, Preprint 2023
- FinGPT: Open-Source Financial Large Language Models, Preprint 2023
- Galactica: A Large Language Model for Science, Preprint 2022
- AceGPT: Localizing Large Language Models in Arabic, NAACL 2024
- ALLaM: Large Language Models for Arabic and English, ICLR 2025
- Reuse, Don’t Retrain: A Recipe for Continued Pretraining of Language Models, NVIDIA 2024
- VRCP: Vocabulary Replacement Continued Pretraining for Efficient Multilingual Language Models, Sumeval 2025
- DISTILLM: Towards Streamlined Distillation for Large Language Models, ICML 2024
- Finding papers to read
- https://github.com/IAAR-Shanghai/ICSFSurvey
- https://github.com/Hannibal046/Awesome-LLM?tab=readme-ov-file
- https://github.com/dair-ai/ML-Papers-of-the-Week
- https://www.promptingguide.ai/papers
- Papers I co-authored
1. PLM
1.1 PLM Models
- (ELMo) Deep contextualized word representations. NAACL 2018. [pdf] [project]
- (GPT) Improving Language Understanding by Generative Pre-Training. Preprint. [pdf] [project]
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL 2019. [code & model] [post]
- (GPT-2) Language Models are Unsupervised Multitask Learners. Preprint. [code] [post]
- (MT-DNN) Multi-Task Deep Neural Networks for Natural Language Understanding. ACL 2019. [code & model] [post]
- XLNet: Generalized Autoregressive Pretraining for Language Understanding. NeurIPS 2019. [code & model] [post]
- BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. ACL 2020. [post]
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. ICLR 2020. [pdf]
- Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring. ICLR 2020. [post]
- TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data. ACL 2020. [code] [post]
- Pre-training via Paraphrasing. NeurIPS 2020. [post]
- Cloze-driven Pretraining of Self-attention Networks. EMNLP 2019. [post]
- ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data. [post]
- Reformer: The Efficient Transformer. [post]
- Linformer: Self-Attention with Linear Complexity. [post]
- LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention. EMNLP 2020. [post]
1.2 Knowledge Distillation & Model Compression
- TinyBERT: Distilling BERT for Natural Language Understanding. Preprint. [code & model] [post]
- Distilling Task-Specific Knowledge from BERT into Simple Neural Networks. Preprint. [post]
- Patient Knowledge Distillation for BERT Model Compression. EMNLP 2019. [code] [post]
- Small and Practical BERT Models for Sequence Labeling. EMNLP 2019. [post]
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. ICLR 2020. [post]
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. Preprint. [post]
1.3 Analysis
2. LLM
2.1 LLM Models
2.2 Knowledge Distillation & Model Compression
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes, Findings of ACL 2023 [post]
- Large Language Models Are Reasoning Teachers, ACL 2023 [post]
- Compact Language Models via Pruning and Knowledge Distillation, NeurIPS 2024 [post]
- LLM Pruning and Distillation in Practice: The Minitron Approach, Preprint 2024 [post]
2.3 Analysis
- The False Promise of Imitating Proprietary LLMs, Preprint 2023 [post]
- Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning, ACL 2023 [post]
2.4 LLM Evaluator
- Large Language Models Are State-of-the-Art Evaluators of Translation Quality, EAMT 2023 [post]
- Can Large Language Models Be an Alternative to Human Evaluations?, ACL 2023 [post]
- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment, Preprint 2023 [post]
2.5 LLM + RAG
- RAFT: Adapting Language Model to Domain Specific RAG, Preprint 2024 [post]
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection, ICLR 2024 [post]
- Screening [post]
- RAG: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- FiD: Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
- RETRO: Improving Language Models by Retrieving from Trillions of Tokens
- FiD-distillation: Distilling Knowledge from Reader to Retriever for Question Answering
- Atlas: Few-shot Learning with Retrieval Augmented Language Models
- Re2G: Retrieve, Rerank, Generate
- Active Retrieval Augmented Generation
- Corrective Retrieval Augmented Generation
- Learning to Filter Context for Retrieval-Augmented Generation
- COCOM: Context Embeddings for Efficient Answer Generation in RAG
2.6 Hallucination
- Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?, EMNLP 2024 [post]
- How Language Model Hallucinations Can Snowball, ICML 2024 [post]
- Fine-grained Hallucination Detection and Editing for Language Models, COLM 2024 [post] [own dataset]
- Two-tiered Encoder-based Hallucination Detection for Retrieval-Augmented Generation in the Wild, EMNLP Industry 2024 [post]
- Reducing Hallucination in Structured Outputs via Retrieval-Augmented Generation, NAACL Industry 2024 [post]
2.6.1 Datasets
2.6.2 Reference-free detection
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models, EMNLP 2023 [post] [own dataset]
- A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation, Preprint 2023 [post] [own dataset (not public)]
- SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency, Findings of EMNLP 2023 [post, QA data]
- Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation, ICLR 2024 [post] [own dataset]
- Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation, ACL 2024 [post] [own dataset (not public), TruthfulQA, BioGEN]
- Enhancing Uncertainty-Based Hallucination Detection with Stronger Focus, EMNLP 2023 [post] [SelfCheckGPT data]
- Zero-Resource Hallucination Prevention for Large Language Models, Findings of EMNLP 2024 [post] [own dataset (not public), pre-detection]
- Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification, Findings of ACL 2024 [post] [own dataset (not public)]
- InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers, ACL 2024 [post] [books & movie]
- LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations, ICLR 2025 [post]
- HaDeMiF: Hallucination Detection and Mitigation in Large Language Models, ICLR 2025 [post]
2.6.3 Decoding for Mitigating Hallucination
- Contrastive Decoding: Open-ended Text Generation as Optimization, ACL 2023 [post] [wikinews, wikitext-103, bookcorpus] [[cc_news](https://huggingface.co/datasets/vblagoje/cc_news), [wikitext-103-raw-v1](https://huggingface.co/datasets/Salesforce/wikitext)]
- DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models, ICLR 2024 [reference post 1, reference post 2, reference YouTube video] [TruthfulQA (MC, Gen), FACTOR, StrQA, GSM8K]
- CAD: Trusting Your Evidence: Hallucinate Less with Context-aware Decoding, NAACL 2024 [post] [CNN-DM, XSum]
- Integrative Decoding: Improve Factuality via Implicit Self-consistency, ICLR 2025 [post] [TruthfulQA, Biographies, LongFact]
- Delta - Contrastive Decoding Mitigates Text Hallucinations in Large Language Models, Preprint 2025 [post] [SQuAD v1.1, SQuAD v2, TriviaQA, Natural Questions]
- Entropy Guided Extrapolative Decoding to Improve Factuality in Large Language Models, COLING 2025 [post] [TruthfulQA (MC), FACTOR (MC)]
2.6.4 Sampling or Regeneration for Mitigating Hallucination
- Towards Mitigating Hallucination in Large Language Models via Self-Reflection, Findings of EMNLP 2023 [post] [PubMedQA, MedQuAD, MEDIQA2019, LiveMedQA2017, MASH-QA]
- SR: Self-Refine: Iterative Refinement with Self-Feedback, NeurIPS 2023 [post] [Dialogue Response Generation, Code Optimization, Code Readability Improvement, Math Reasoning, Sentiment Reversal]
- Inference-Time Intervention: Eliciting Truthful Answers from a Language Model, NeurIPS 2023 [post] [NQ, TriviaQA, MMLU]
- Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation, ICLR 2024 [post] [own dataset]
- USC: Universal Self-Consistency for Large Language Model Generation, ICML workshop 2024 [post] [GSM8K, MATH Reasoning, TruthfulQA, BIRD-SQL, ARCADE, GovReport]
- UCS: Lightweight Reranking for Language Model Generations, ACL 2024 [post] [XSum, MiniF2F, WMT14]
- SE-SL, SE-RG: Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation, ACL 2024 [post] [HumanEval, HumanEval+, BIRD-SQL, DailyMail, SummScreen, GSM8K, MATH]
- FSC: Improving LLM Generations via Fine-Grained Self-Endorsement, Findings of ACL 2024 [post] [Biographies, TriviaQA]
- Self-Consistent Decoding for More Factual Open Responses, Preprint 2024 [post]
2.6.5 Training for Mitigating Hallucination (including IDK)
- Language Models (Mostly) Know What They Know, Anthropic 2022 [post]
- Do Large Language Models Know What They Don’t Know?, Findings of ACL 2023 [post] [selfaware]
- Fine-tuning Language Models for Factuality, ICLR 2024 [post]
- I Don’t Know: Explicit Modeling of Uncertainty with an [IDK] Token, NeurIPS 2024 [post]
- Alignment for Honesty, NeurIPS 2024 [post]
- R-Tuning: Instructing Large Language Models to Say ‘I Don’t Know’, NAACL 2024 [post]
- Unfamiliar Finetuning Examples Control How Language Models Hallucinate, NAACL 2025 [post]
2.7 Scaling inference
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling, Preprint 2024 [post]
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters, Preprint 2024 [post]
- Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models, ICLR 2025 [post]
2.8 LLM reasoner
2.9 Alignment learning
- DPO: Direct Preference Optimization: Your Language Model is Secretly a Reward Model [reference]
- KTO: Model Alignment as Prospect Theoretic Optimization [reference]
- ORPO: Monolithic Preference Optimization without Reference Model [reference]
- Don't Use Your Data All at Once, COLING 2025 [reference]
- SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning, Preprint 2025 [post]
2.10 Prompting
- Making Pre-trained Language Models Better Few-shot Learners, ACL 2021 [post]
- Exploring the Universal Vulnerability of Prompt-based Learning Paradigm, NAACL 2022 [post]
- Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT, INLG 2022 [post]
- Contrastive Chain-of-Thought Prompting, Preprint 2024 [post]
2.12 Further (Continual) training
2.12.1 Language Transfer
- Extrapolating Large Language Models to Non-English by Aligning Languages, Preprint 2023 [reference]
- Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer, Findings of NAACL 2024 [post]
- LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation, Findings of ACL 2024 [post]
- Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca, Preprint 2023 [post]
- Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models, Preprint 2024 [post]
- RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining, Preprint 2024 [post]
- Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus, NVIDIA 2024 [post]
2.12.2 Domain Transfer
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining, Preprint 2022 [post]
- Continual Pre-training of Language Models, ICLR 2023 [post]
- Efficient Continual Pre-training for Building Domain Specific Large Language Models, Findings of ACL 2024 [post]
- Med-PaLM: Large Language Models Encode Clinical Knowledge, Nature 2023 [post]
- Med-PaLM2: Towards Expert-Level Medical Question Answering with Large Language Models, Nature Medicine 2025 [post]
2.13 Multilingual LLM
2.13.1 Consistency
- Beneath the Surface of Consistency: Exploring Cross-Lingual Knowledge Representation Sharing in LLMs, Preprint 2024 [post]
- CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment, Sumeval 2025 [post]
- Align after Pre-train: Improving Multilingual Generative Models with Cross-Lingual Alignment, Preprint [post]
2.99 Miscellaneous
- LoRA: Low-Rank Adaptation of Large Language Models, ICLR 2022 [post]
- Taxonomy and Analysis of Sensitive User Queries in Generative AI Search, Review (NAVER) [post]
- Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling, LREC-COLING 2024 [post]
- Large Language Models for Data Annotation: A Survey, Preprint 2024 [post]
3. Multi / Omni Models
- Ola: Pushing the Frontiers of Omni-Modal Language Model, Preprint 2025 [post]