NL-159, Towards Evaluation of Multi-party Dialogue Systems, INLG 2022
Main Motivation
- Prolific research in NLG evaluation
- Multiple taxonomies presented[1, 2, 3, 4]
- Studies on the importance of automatic and human metrics[5, 6, 7, 8]
- + Confusion surrounding inconsistent evaluation methods used[9]
- However, little work addresses evaluation specifically for Multi-party Conversation (MPC)
- = Need for discussing MPC specific challenges and needs
MPC Challenges
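- The presence of multiple participants introduces new and interesting challenges for dialogue modeling.
- Participant roles: speaker-specific and addressee-specific information must be maintained alongside the dialogue model.
- Conversation structure: closer to a graph than to a sequence.
- Threads within a conversation: multiple topic threads can coexist within subgroups.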
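To make the challenges above concrete, here is a minimal Python sketch of how a multi-party conversation might be stored as a reply graph rather than a flat sequence. It is not taken from the paper. The `Utterance` class, its field names, and the helper functions are hypothetical, and the sketch assumes each utterance is annotated with an explicit `reply_to` link and an optional addressee.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    uid: int
    speaker: str
    addressee: Optional[str]  # None when the whole group is addressed (assumption)
    reply_to: Optional[int]   # uid of the parent utterance; None starts a new thread
    text: str


def build_reply_graph(utterances):
    """Map each utterance id to the ids of the utterances that reply to it."""
    children = defaultdict(list)
    for u in utterances:
        if u.reply_to is not None:
            children[u.reply_to].append(u.uid)
    return dict(children)


def split_threads(utterances):
    """Group utterance ids by thread root, following reply_to links.

    Assumes utterances are listed in chronological order, so every parent
    appears before its replies.
    """
    root_of = {}
    threads = defaultdict(list)
    for u in utterances:
        root = u.uid if u.reply_to is None else root_of[u.reply_to]
        root_of[u.uid] = root
        threads[root].append(u.uid)
    return dict(threads)


# Toy conversation: two topic threads interleaved among three participants.
dialogue = [
    Utterance(1, "A", None, None, "Anyone tried the new parser?"),
    Utterance(2, "B", "A", 1, "Yes, it works well."),
    Utterance(3, "C", None, None, "Lunch at noon?"),
    Utterance(4, "A", "B", 2, "Good to hear."),
    Utterance(5, "B", "C", 3, "Sure."),
]

reply_graph = build_reply_graph(dialogue)
threads = split_threads(dialogue)
```

In this toy example the utterances alternate between two topics in time order, yet the reply links recover two separate threads, which is the sense in which the structure is closer to a graph than to a sequence.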
Contributions
- Propose an expanded taxonomy focusing on the specific challenges introduced by multi-party dialogue, or group conversations
- Such as the need to maintain speaker-specific context and recognize the proper addressees
- Synthesize evaluation measures utilized in existing MPC research, and relate them to the expanded taxonomy introduced
- Report important inconsistencies in current research
Reference