NL-159, Towards Evaluation of Multi-party Dialogue Systems, INLG 2022

Main Motivation

Prolific research in NLG evaluation

Multiple taxonomies have been presented [1, 2, 3, 4]
Studies on the importance of automatic and human metrics [5, 6, 7, 8]
Also, confusion surrounding the inconsistent evaluation methods used [9]

However, little work addresses evaluation specifically for Multi-party Conversation (MPC)

Hence the need to discuss MPC-specific challenges and needs

MPC Challenges

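The presence of multiple participants introduces new and interesting challenges from a dialogue modeling perspective.

Participant roles - speaker-specific and addressee-specific information must be maintained alongside dialogue modeling.
Dialogue structure - closer to a graph than to a sequence.
Threads within a dialogue - multiple topic threads can coexist within subgroups.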
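
The three challenges above can be made concrete with a small data structure. The following is a minimal sketch, not taken from the paper: it assumes each utterance records its speaker, an optional addressee, and an optional reply-to link, so that the dialogue forms a graph and threads fall out as connected reply chains. The names `Utterance`, `reply_to`, and `threads` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    uid: int
    speaker: str               # participant role: who is talking
    addressee: Optional[str]   # None when addressed to the whole group (assumption)
    reply_to: Optional[int]    # uid of the parent utterance; None starts a new thread
    text: str


def threads(utterances):
    """Group utterance ids into threads by following reply_to links to their root."""
    parent = {u.uid: u.reply_to for u in utterances}

    def root(uid):
        # Walk up the reply chain until an utterance with no parent is reached.
        while parent[uid] is not None:
            uid = parent[uid]
        return uid

    grouped = defaultdict(list)
    for u in utterances:
        grouped[root(u.uid)].append(u.uid)
    return dict(grouped)


# Two interleaved threads among four participants (made-up example data).
dialogue = [
    Utterance(0, "A", None, None, "Anyone tried the new parser?"),
    Utterance(1, "B", "A", 0, "Yes, it works for me."),
    Utterance(2, "C", None, None, "Lunch at noon?"),
    Utterance(3, "A", "B", 1, "Which version?"),
    Utterance(4, "D", "C", 2, "Sure."),
]

print(threads(dialogue))  # {0: [0, 1, 3], 2: [2, 4]}
```

Read in temporal order, utterances 0 to 4 look like one sequence, but the reply links show two separate threads held by different subgroups. That is the structure a purely sequential, two-party representation would lose.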

Contributions

Propose an expanded taxonomy focusing on the specific challenges introduced by multi-party dialogue, or group conversations

Such as the need to maintain speaker-specific context and recognize the proper addressees

Synthesize evaluation measures utilized in existing MPC research, and relate them to the expanded taxonomy introduced

Report important inconsistencies in current research

Reference
