Pierre Magistry

Sentiment detection in micro-blogs using unsupervised chunk extraction

Pierre Magistry, Shu-Kai Hsieh, Yu-Yun Chang

Lingua Sinica 2016

In this paper, we present a system for sentiment detection in Chinese micro-blog data. Our system, perhaps surprisingly, benefits from the lack of word boundaries in the Chinese writing system and shifts the focus directly to larger and more relevant chunks. We use an unsupervised Chinese word segmentation system and a binomial test to extract specific, endogenous lexicon chunks from the training corpus. We combine these lexicon chunks with other external resources to train a maximum entropy model for document classification. With this method, we obtained an average F1 score of 87.2, which outperforms the state-of-the-art approach on the data released for the second SocialNLP shared task.

Skillex: a graph-based lexical score for measuring the semantic efficiency of used verbs by human subjects describing actions

Bruno Gaume, Karine Duvignau, Emmanuel Navarro, Yann Desalle, Hintat Cheung, Shu-Kai Hsieh, Pierre Magistry, Laurent Prevot

Traitement Automatique des Langues 2014

ABSTRACT. Dictionaries are sociocultural objects that can be used as underlying structures for modelling in the cognitive sciences. We first show that lexical networks built from dictionaries, despite surface disagreement at the level of links, share a common topological structure. Assuming that this deep structure reflects the semantic organisation of the lexicon shared by the members of a linguistic community, we propose a model based on the exploration of this specific structure to analyse and compare the semantic efficiency of [Children/Adults] productions in an action labelling task. We define a generic score of semantic efficiency, SKILLEX. Assigned to the participants of the APPROX protocol, this score allows us to classify them accurately into the Children and Adults categories.

Skillex, an action labelling efficiency score: the case for French and Mandarin

Yann Desalle, Bruno Gaume, Karine Duvignau, Hintat Cheung, Shu-Kai Hsieh, Pierre Magistry, Jean-Luc Nespoulous

Proceedings of the Annual Meeting of the Cognitive Science Society 2014

We propose a model to compute two measurements of semantic efficiency of verbs as action labels. It is based on the exploration of the specific structure of synonymy networks of verbs. We use these measurements to analyse and compare the semantic efficiency of [Children/Adults] productions in action labelling tasks, in French and Mandarin. The combination of these two measurements leads to a generic score of semantic efficiency, Skillex. Assigned to participants of the Approx protocol experiment, this score enables us to accurately classify them into Children and Adults categories, be they French or Mandarin native speakers.

Graph representation of synonymy and translation resources for crosslinguistic modelisation of meaning

Benoît Gaillard, Yannick Chudy, Pierre Magistry, Shu-Kai Hsieh, Emmanuel Navarro

Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation 2010

In this paper we describe the data that will be used to compare the semantic structures that emerge from synonymy in French and in Mandarin. We aim to study these semantic structures both at a global, lexicographic level, using lexicons and synonymy and translation dictionaries, and at a more localised, experimental level, using data collected in parallel psycholinguistic experiments in French and Mandarin. After presenting our research project, the data we need to carry it out, and the available resources, we analyse several linguistic issues arising from the structural differences between the French and Mandarin lexicons. We then explain how the synonymy and translation networks are constructed from the available resources and detail the specific choices that will enable us to produce meaningful experimental results from this prepared data. Two kinds of networks are built: lexicographic networks and smaller movie-based networks extracted from experimental recordings. We conclude by describing how we intend to use this data.

馬基石
