SPARROW Benchmark

Introduction

SPARROW is an evaluation benchmark for sociopragmatic meaning understanding. It comprises 169 datasets covering 13 task types across six primary categories (e.g., anti-social language detection, emotion recognition). The datasets span 64 languages from 12 language families, written in 16 scripts.

Citation

If you use the SPARROW benchmark in a scientific publication, or if you find the resources on this website useful, please cite our paper as follows.

@misc{zhang2023skipped,
  title={The Skipped Beat: A Study of Sociopragmatic Understanding in LLMs for 64 Languages},
  author={Chiyu Zhang and Khai Duy Doan and Qisheng Liao and Muhammad Abdul-Mageed},
  year={2023},
  eprint={2310.14557},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}