Sentiment detection in micro-blogs using unsupervised chunk extraction
Pierre Magistry, Shu-Kai Hsieh, Yu-Yun Chang
Lingua Sinica, 2016
In this paper, we present a system for sentiment detection in Chinese micro-blog data. Perhaps surprisingly, our system benefits from the lack of word boundaries in the Chinese writing system, which lets it shift the focus directly to larger and more relevant chunks. We use an unsupervised Chinese word segmentation system and a binomial test to extract specific, endogenous lexicon chunks from the training corpus. We combine these lexicon chunks with external resources to train a maximum entropy model for document classification. With this method, we obtain an average F1 score of 87.2, which outperforms the state-of-the-art approach on the data released for the second SocialNLP shared task.
@article{magistry_sentiment_2016,
title = {Sentiment detection in micro-blogs using unsupervised chunk extraction},
author = {Pierre Magistry and Shu-Kai Hsieh and Yu-Yun Chang},
journal = {Lingua Sinica},
year = {2016},
}
Skillex: a graph-based lexical score for measuring the semantic efficiency of used verbs by human subjects describing actions
Bruno Gaume, Karine Duvignau, Emmanuel Navarro, Yann Desalle, Hintat Cheung, Shu-Kai Hsieh, Pierre Magistry, Laurent Prevot
Traitement Automatique des Langues, 2014
ABSTRACT. Dictionaries are sociocultural objects that can be used as underlying structures for modelling in cognitive science. We first show that lexical networks built from dictionaries, despite surface disagreement at the level of their links, share a common topological structure. Assuming that this deep structure reflects the semantic organisation of the lexicon shared by the members of a linguistic community, we propose a model based on exploring this specific structure to analyse and compare the semantic efficiency of [Children/Adults] productions in an action labelling task. We define a generic score of semantic efficiency, Skillex. Assigned to participants of the Approx protocol, this score allows us to classify them accurately into the Children and Adults categories.
@article{gaume_skillex_2014,
title = {Skillex: a graph-based lexical score for measuring the semantic efficiency of used verbs by human subjects describing actions},
author = {Bruno Gaume and Karine Duvignau and Emmanuel Navarro and Yann Desalle and Hintat Cheung and Shu-Kai Hsieh and Pierre Magistry and Laurent Prevot},
journal = {Traitement Automatique des Langues},
year = {2014},
}
Skillex, an action labelling efficiency score: the case for French and Mandarin
Yann Desalle, Bruno Gaume, Karine Duvignau, Hintat Cheung, Shu-Kai Hsieh, Pierre Magistry, Jean-Luc Nespoulous
Proceedings of the Annual Meeting of the Cognitive Science Society, 2014
We propose a model to compute two measurements of semantic efficiency of verbs as action labels. It is based on the exploration of the specific structure of synonymy networks of verbs. We use these measurements to analyse and compare the semantic efficiency of [Children/Adults] productions in action labelling tasks, in French and Mandarin. The combination of these two measurements leads to a generic score of semantic efficiency, Skillex. Assigned to participants of the Approx protocol experiment, this score enables us to accurately classify them into Children and Adults categories, be they French or Mandarin native speakers.
@inproceedings{desalle_skillex_2014,
title = {Skillex, an action labelling efficiency score: the case for {F}rench and {M}andarin},
author = {Yann Desalle and Bruno Gaume and Karine Duvignau and Hintat Cheung and Shu-Kai Hsieh and Pierre Magistry and Jean-Luc Nespoulous},
booktitle = {Proceedings of the Annual Meeting of the Cognitive Science Society},
year = {2014},
}
Graph representation of synonymy and translation resources for crosslinguistic modelisation of meaning
Benoît Gaillard, Yannick Chudy, Pierre Magistry, Shu-Kai Hsieh, Emmanuel Navarro
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation, 2010
In this paper we describe the data that will be used to compare the semantic structures that emerge from synonymy in French and in Mandarin. We aim at studying these semantic structures at both a global, lexicographic level, using lexicons, synonymy and translation dictionaries and at a more localised, experimental level, using data collected in parallel psycholinguistic experiments in French and Mandarin. After presenting our research project, the data we need to carry it out and the available resources, we analyse several linguistic issues arising from the structural differences between the French and Mandarin lexicons. We then explain the construction of the synonymy and translation networks from the available resources and detail specific choices that will enable us to produce meaningful experimental results based on this prepared data. Two kinds of networks are built: lexicographic networks and smaller movie-based networks extracted from experimental recordings. We conclude by describing how we intend to use this data.
@inproceedings{gaillard-etal-2010-graph,
title = "Graph Representation of Synonymy and Translation Resources for Crosslinguistic Modelisation of Meaning",
author = "Gaillard, Beno{\^i}t and Chudy, Yannick and Magistry, Pierre and Hsieh, Shu-Kai and Navarro, Emmanuel",
editor = "Otoguro, Ryo and Ishikawa, Kiyoshi and Umemoto, Hiroshi and Yoshimoto, Kei and Harada, Yasunari",
booktitle = "Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation",
month = nov,
year = "2010",
address = "Tohoku University, Sendai, Japan",
publisher = "Institute of Digital Enhancement of Cognitive Processing, Waseda University",
url = "https://aclanthology.org/Y10-1094/",
pages = "819--830"
}