Investigating and Recognizing Lavender Language in a GenderNLP Perspective
以性別自然語言處理觀點分析與預測同志語言
2018
Education
2018
M.A. in Linguistics, National Taiwan University
Academic Output
Affiliated Publications
Exploring Lavender Tongue from Social Media Texts [In Chinese]
Hsiao-Han Wu, Shu-Kai Hsieh
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing (ROCLING 2017), 2017
In research on gender and Natural Language Processing (NLP), most papers target gender-normative language, that is, language spoken by biological males and females with opposite-sex desires. This study instead takes the perspective of sexual orientation and presents the first work on the task of Chinese homosexual identification. First, we collect homosexual texts from social media; second, we examine the linguistic behavior found in gay and lesbian texts. We also provide sets of linguistic features to automatically predict homosexual language, using Support Vector Machine (SVM) and Naive Bayes (NB) models evaluated with 5-fold cross-validation. Training yielded a promising F-score of around 70% with a particular lexicon-based feature set.
@inproceedings{wu_exploring_2017,
title = {Exploring Lavender Tongue from Social Media Texts [In Chinese]},
author = {Hsiao-Han Wu and Shu-Kai Hsieh},
booktitle = {Proceedings of the 29th Conference on Computational Linguistics and Speech Processing (ROCLING 2017)},
year = {2017},
}
Crowdsourcing Experiment Designs for Chinese Word Sense Annotation
Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016), 2016
This paper presents our exploratory efforts to tackle the “high accuracy-low quantity” problem of human word sense annotation in Chinese, with the ultimate goal of automatic word sense annotation. Our proposed annotation architecture combines explicit and implicit crowdsourcing approaches. The explicit method addresses general crowdsourcing issues and adjusts the current MTurk framework. The implicit method builds on a Game with a Purpose (GWAP) design, which originates from the well-known video game Super Mario.
@inproceedings{da63c34f0702456b8183d892390d3b01,
title = "Crowdsourcing experiment designs for Chinese word sense annotation",
abstract = "This paper presents our exploratory efforts to tackle the “high accuracy-low quantity” problem of human word sense annotation in Chinese, with the ultimate goal of automatic word sense annotation. Our proposed annotation architecture combines explicit and implicit crowdsourcing approaches. The explicit method addresses general crowdsourcing issues and adjusts the current MTurk framework. The implicit method builds on a Game with a Purpose (GWAP) design, which originates from the well-known video game Super Mario.",
author = "Huang, {Tzu Yun} and Wu, Hsiao-Han and Lee, Chia-Chen and Lee, Shao-Man and Li, Guan-Wei and Hsieh, Shu-Kai",
note = "Publisher Copyright: {\textcopyright} Proceedings of the 28th Conference on Computational Linguistics and Speech Processing, ROCLING 2016.; 28th Conference on Computational Linguistics and Speech Processing, ROCLING 2016 ; Conference date: 06-10-2016 Through 07-10-2016",
year = "2016",
month = oct,
day = "1",
language = "English",
series = "Proceedings of the 28th Conference on Computational Linguistics and Speech Processing, ROCLING 2016",
publisher = "The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)",
pages = "82--99",
editor = "Chung-Hsien Wu and Yuen-Hsien Tseng and Hung-Yu Kao and Lun-Wei Ku and Yu Tsao and Shih-Hung Wu",
booktitle = "Proceedings of the 28th Conference on Computational Linguistics and Speech Processing, ROCLING 2016",
}