Yerin Hwang

Ph.D. Candidate · NLP & LLMs

Seoul National University, Seoul, Republic of Korea

Actively seeking postdoc positions (starting September 2026) in NLP/LLM evaluation, bias, and robustness.

LLM · LLM Evaluation · LLM-as-a-Judge · Bias & Robustness · Dialogue Generation

About

I am a Ph.D. candidate at SNU MILAB focusing on LLM evaluation—especially bias and robustness. I study how LLM-as-a-Judge behaves under framing and uncertainty cues, and I design methods and protocols that make judgments more reliable and interpretable.

Beyond evaluation, I work on automatic data generation for benchmarking and training (e.g., dialogue and evaluation datasets), and I am actively engaged in Korean NLP, including building Korean-specific datasets and metrics and analyzing model behavior in Korean settings.

Publications

Selected (recent)

LLMs can be easily Confused by Instructional Distractions
Y. Hwang, Y. Kim, J. Koo, T. Kang, H. Bae, K. Jung · ACL 2025

We formalize instructional distraction (inputs that look like instructions) and show that even advanced LLMs frequently follow the distracting input rather than the user’s true instruction.

Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation
Y. Hwang*, D. Lee*, K. Min, T. Kang, Y. Kim, K. Jung · EMNLP 2025

We systematically characterize visual biases in LVLM-based judgment and show how they distort alignment evaluations.

Can You Trick the Grader? Adversarial Persuasion of LLM Judges
Y. Hwang, D. Lee, T. Kang, Y. Kim, K. Jung · Findings of EMNLP 2025

We demonstrate that persuasive perturbations can shift LLM-judge decisions and discuss practical defenses.

Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the Effect of Epistemic Markers on LLM-based Evaluation
D. Lee*, Y. Hwang*, Y. Kim, J. Park, K. Jung · NAACL 2025

We measure how uncertainty expressions influence LLM-based evaluation outcomes.

Full list

International Conferences
  • Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation — Y. Hwang*, D. Lee*, K. Min, T. Kang, Y. Kim, K. Jung. EMNLP 2025.
  • Can You Trick the Grader? Adversarial Persuasion of LLM Judges — Y. Hwang, D. Lee, T. Kang, Y. Kim, K. Jung. Findings of EMNLP 2025.
  • LLMs can be easily Confused by Instructional Distractions — Y. Hwang, Y. Kim, J. Koo, T. Kang, H. Bae, K. Jung. ACL 2025.
  • Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the Effect of Epistemic Markers on LLM-based Evaluation — D. Lee*, Y. Hwang*, Y. Kim, J. Park, K. Jung. NAACL 2025.
  • SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models — J. Koo, Y. Hwang, Y. Kim, T. Kang, H. Bae, K. Jung. Findings of NAACL 2025.
  • MP2D: An Automated Topic Shift Dialogue Generation Framework Leveraging Knowledge Graphs — Y. Hwang, Y. Kim, Y. Jang, J. Bang, H. Bae, K. Jung. EMNLP 2024.
  • Kosmic: Korean Text Similarity Metric Reflecting Honorific Distinctions — Y. Hwang, Y. Kim, J. Bang, H. Bae, H. Lee, K. Jung. COLING 2024.
  • Dialogizer: Context-aware Conversational-QA Dataset Generation from Textual Sources — Y. Hwang*, Y. Kim*, H. Bae, H. Lee, J. Bang, K. Jung. EMNLP 2023.
  • PR-MCS: Perturbation Robust Metric for Multilingual Image Captioning — Y. Kim, Y. Hwang, H. Yun, S. Yoon, T. Bui, K. Jung. Findings of EMNLP 2023.
  • Injecting Comparison Skills in Task-Oriented Dialogue Systems for Database Search Results Disambiguation — Y. Kim*, Y. Hwang*, J. Shin, H. Bae, K. Jung. Findings of ACL 2023.
  • Improving Cross-Modal Attention via Object Detection — Y. Kim, Y. Hwang, S. Yoon, H. Yun, K. Jung. NeurIPS Workshops 2022.
International Journal
  • Flowlogue: A Novel Framework for Synthetic Dialogue Generation with Structured Flow from Text Passages — Y. Kim, Y. Hwang, H. Bae, T. Kang, K. Jung. IEEE Access, 2024.
Domestic Conferences & Journal
  • A Study on the Evaluation Consistency of Korean LLM-as-a-Judge Models in Mathematical Problems — Y. Hwang, D. Lee, J. Moon, K. Min, K. Jung. KCC 2025.
  • Analysis of Stylistic Bias in Korean LLM-as-a-Judge — Y. Hwang, D. Lee, J. Moon, K. Min, K. Jung. KCC 2025.
  • Evaluating the Robustness of LLM-Judges to Epistemic Markers in Korean — D. Lee, Y. Hwang, J. Moon, K. Min, K. Jung. KCC 2025.
  • TSDG: A Framework for Generating Natural Topic-Shift Dialogue Data — Y. Hwang, D. Lee, Y. Kim, K. Jung. KSC 2024.
  • Error-Correction Chain-of-Thought (ECOCoT): Enhancing Accuracy in Mathematical Reasoning through Error-Correction Framework — Y. Hwang, Y. Kim, D. Lee, T. Kang, H. Bae, K. Jung. KSC 2024.
  • Reference-Centric QA Evaluation Leveraging Contrastive Decoding — D. Lee, K. Min, Y. Hwang, J. Park, K. Jung. KSC 2024.
  • KLIPScore: A Highly Human-Correlated Korean Image Captioning Metric (Oral) — Y. Kim, Y. Hwang, Y. Chae, S. Yoon, K. Jung. KCC 2023.
  • Thinking Fast and Slow in Multimodal Emotion Recognition Task — Y. Hwang, Y. Kim, Y. Chae, K. Jung. KCC 2023.
  • Improving Cross-Modal Attention via Object Detection — Y. Kim, H. Yun, Y. Hwang, K. Jung. KCC 2022.
  • COVID-19 Severity Prediction using Deep Transfer Learning — Y. Hwang, Y. Kim, K. Jung. KCC 2022.

* equal contribution.

Experience

Research Intern
Max Planck Institute for Security and Privacy (MPI-SP)
Bochum, Germany · Aug 2025 – Oct 2025
Intern
LG AI Research
Seoul, Republic of Korea · Aug 2022 – Oct 2022

Education

Seoul National University
Ph.D. Candidate, Interdisciplinary Program in Artificial Intelligence
Seoul, Republic of Korea · Sep 2021 – Present
Seoul National University
B.S., Electrical & Computer Engineering (Cum Laude)
Seoul, Republic of Korea · Mar 2016 – Aug 2021

Awards & Scholarships

National Science & Technology Scholarship
Republic of Korea · 2018 – 2019
SNU IPAI Support Scholarship (Ph.D.)
Republic of Korea
SNU IPAI Support Scholarship (M.S.)
Republic of Korea
Winner — Research Paper Competition, SNU IPAI
Seoul, Republic of Korea · Jun 2025
Winner — Research Paper Competition, SNU IPAI
Seoul, Republic of Korea · Dec 2024
Winner — Creative Autonomous Research Competition, SNU IPAI
Seoul, Republic of Korea · Dec 2023
Winner — ETRI Human-Understanding AI Paper Contest
Republic of Korea · Jun 2023

Materials

CV
PDF (for LinkedIn / applications)

Contact