ViCon is a dataset of lexical contrast pairs that distinguishes between similarity (synonymy) and dissimilarity (antonymy). ViSim-400 is a dataset of semantic relation pairs that reflects the continuum between similarity and relatedness.

See here for how to obtain the data.


Kim-Anh Nguyen, Sabine Schulte im Walde, Ngoc Thang Vu (2018)
Introducing Two Vietnamese Datasets for Evaluating Semantic Models of (Dis-)Similarity and Relatedness
In: Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). New Orleans, LA.