Yerin Hwang

Ph.D. Candidate · NLP & LLMs

Seoul National University, Seoul, Republic of Korea

LLM · LLM Evaluation · LLM-as-a-Judge · Bias & Robustness · Dialogue Generation

About

I am a Ph.D. candidate at SNU MILAB focusing on LLM evaluation—especially bias and robustness. I study how LLM-as-a-Judge behaves under framing and uncertainty cues, and I design methods and protocols that make judgments more reliable and interpretable.

Beyond evaluation, I work on automatic data generation for benchmarking and training (e.g., dialogue and evaluation datasets), and I am actively engaged in Korean NLP, including building Korean-specific datasets and metrics and analyzing model behavior in Korean settings.

Publications

Selected (recent)

LLMs can be easily Confused by Instructional Distractions

Y. Hwang, Y. Kim, J. Koo, T. Kang, H. Bae, K. Jung · ACL 2025

We formalize instructional distraction (inputs that look like instructions) and show that even advanced LLMs frequently follow the distracting input rather than the user’s true instruction.

Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation

Y. Hwang*, D. Lee*, K. Min, T. Kang, Y. Kim, K. Jung · EMNLP 2025

We systematically characterize visual biases in LVLM-based judgment and show how they distort alignment evaluations.

Can You Trick the Grader? Adversarial Persuasion of LLM Judges

Y. Hwang, D. Lee, T. Kang, Y. Kim, K. Jung · Findings of EMNLP 2025

We demonstrate that persuasive perturbations can shift LLM-judge decisions and discuss practical defenses.

Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the Effect of Epistemic Markers on LLM-based Evaluation

D. Lee*, Y. Hwang*, Y. Kim, J. Park, K. Jung · NAACL 2025

We measure how uncertainty expressions influence LLM-based evaluation outcomes.

International Conferences

Don’t Judge Code by Its Cover: Exploring Biases in LLM Judges for Code Evaluation — J. Moon*, Y. Hwang*, D. Lee, T. Kang, Y. Kim, K. Jung. Findings of EACL 2026.
Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation — Y. Hwang*, D. Lee*, K. Min, T. Kang, Y. Kim, K. Jung. EMNLP 2025.
Can You Trick the Grader? Adversarial Persuasion of LLM Judges — Y. Hwang, D. Lee, T. Kang, Y. Kim, K. Jung. Findings of EMNLP 2025.
LLMs can be easily Confused by Instructional Distractions — Y. Hwang, Y. Kim, J. Koo, T. Kang, H. Bae, K. Jung. ACL 2025.
Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the Effect of Epistemic Markers on LLM-based Evaluation (Oral) — D. Lee*, Y. Hwang*, Y. Kim, J. Park, K. Jung. NAACL 2025.
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models — J. Koo, Y. Hwang, Y. Kim, T. Kang, H. Bae, K. Jung. Findings of NAACL 2025.
MP2D: An Automated Topic Shift Dialogue Generation Framework Leveraging Knowledge Graphs — Y. Hwang, Y. Kim, Y. Jang, J. Bang, H. Bae, K. Jung. EMNLP 2024.
Kosmic: Korean Text Similarity Metric Reflecting Honorific Distinctions — Y. Hwang, Y. Kim, J. Bang, H. Bae, H. Lee, K. Jung. COLING 2024.
Dialogizer: Context-aware Conversational-QA Dataset Generation from Textual Sources — Y. Hwang*, Y. Kim*, H. Bae, H. Lee, J. Bang, K. Jung. EMNLP 2023.
PR-MCS: Perturbation Robust Metric for Multilingual Image Captioning — Y. Kim, Y. Hwang, H. Yun, S. Yoon, T. Bui, K. Jung. Findings of EMNLP 2023.
Injecting Comparison Skills in Task-Oriented Dialogue Systems for Database Search Results Disambiguation — Y. Kim*, Y. Hwang*, J. Shin, H. Bae, K. Jung. Findings of ACL 2023.
Improving Cross-Modal Attention via Object Detection — Y. Kim, Y. Hwang, S. Yoon, H. Yun, K. Jung. NeurIPS Workshops 2022.

International Journal

Flowlogue: A Novel Framework for Synthetic Dialogue Generation with Structured Flow from Text Passages — Y. Kim, Y. Hwang, H. Bae, T. Kang, K. Jung. IEEE Access, 2024.

Domestic Conferences & Journal

A Study on the Evaluation Consistency of Korean LLM-as-a-Judge Models in Mathematical Problems — Y. Hwang, D. Lee, J. Moon, K. Min, K. Jung. KCC 2025.
Analysis of Stylistic Bias in Korean LLM-as-a-Judge — Y. Hwang, D. Lee, J. Moon, K. Min, K. Jung. KCC 2025.
Evaluating the Robustness of LLM-Judges to Epistemic Markers in Korean — D. Lee, Y. Hwang, J. Moon, K. Min, K. Jung. KCC 2025.
TSDG: A Framework for Generating Natural Topic-Shift Dialogue Data — Y. Hwang, D. Lee, Y. Kim, K. Jung. KSC 2024.
Error-Correction Chain-of-Thought (ECOCoT): Enhancing Accuracy in Mathematical Reasoning through Error-Correction Framework — Y. Hwang, Y. Kim, D. Lee, T. Kang, H. Bae, K. Jung. KSC 2024.
Reference-Centric QA Evaluation Leveraging Contrastive Decoding — D. Lee, K. Min, Y. Hwang, J. Park, K. Jung. KSC 2024.
KLIPScore: A Highly Human-Correlated Korean Image Captioning Metric (Oral) — Y. Kim, Y. Hwang, Y. Chae, S. Yoon, K. Jung. KCC 2023.
Thinking Fast and Slow in Multimodal Emotion Recognition Task — Y. Hwang, Y. Kim, Y. Chae, K. Jung. KCC 2023.
Improving Cross-Modal Attention via Object Detection — Y. Kim, H. Yun, Y. Hwang, K. Jung. KCC 2022.
COVID-19 Severity Prediction using Deep Transfer Learning — Y. Hwang, Y. Kim, K. Jung. KCC 2022.

* equal contribution.

Experience

Research Intern

LGAI Research

Seoul, Republic of Korea · Nov 2025 – Present

Research Intern

Max Planck Institute for Security and Privacy (MPI-SP)

Bochum, Germany · Aug 2025 – Oct 2025

Intern

LGAI Research

Seoul, Republic of Korea · Aug 2022 – Oct 2022

Education

Seoul National University

Ph.D. Candidate, Interdisciplinary Program in Artificial Intelligence

Seoul, Republic of Korea · Sep 2021 – Present

Seoul National University

B.S., Electrical & Computer Engineering (Cum Laude)

Seoul, Republic of Korea · Mar 2016 – Aug 2021

Awards & Scholarships

National Science & Technology Scholarship

Republic of Korea · 2018 – 2019

SNU IPAI Support Scholarship (Ph.D.)

Republic of Korea

SNU IPAI Support Scholarship (M.S.)

Republic of Korea

Winner — Research Paper Competition, SNU IPAI

Seoul, Republic of Korea · Dec 2025

Winner — Research Paper Competition, SNU IPAI

Seoul, Republic of Korea · Jun 2025

Winner — Research Paper Competition, SNU IPAI

Seoul, Republic of Korea · Dec 2024

Winner — Creative Autonomous Research Competition, SNU IPAI

Seoul, Republic of Korea · Dec 2023

Winner — ETRI Human-Understanding AI Paper Contest

Republic of Korea · Jun 2023

Materials

PDF (for LinkedIn / applications)

Contact

✉️ dpfls589@snu.ac.kr
🔗 LinkedIn