I am an associate professor in the Department of Philology, Literature, and Linguistics of the University of Pisa and the director of the Computational Linguistics Laboratory (CoLing Lab).
My current research focuses on the following themes:
- distributional semantics - the distributional hypothesis (i.e. words with similar distributional properties have similar semantic properties) lies at the heart of Distributional Semantic Models (DSMs), a family of computational approaches that share the assumption that semantic representations of the lexical space can be built dynamically through the statistical analysis of the contexts in which words occur.
Together with Marco Baroni, I have developed a new DSM called Distributional Memory (DM). I am particularly interested in using DM and similar models for linguistic and cognitive research, in particular to investigate semantic memory and the role of distributional information in the formation, organization and dynamics of concepts and meanings.
- computational methods to investigate verb argument structure - the automatic acquisition of verb information from corpora represents a longstanding research avenue in computational linguistics. Efforts have mostly focused on developing methods to extract verb subcategorization frames, to identify verb selectional preferences, and to automatically detect diathesis alternations. I apply such methods to investigate the argument structure of Italian verbs, with the goal of building a semantic classification based on shared distributional properties, similar to Levin's classes.
- event types - event types (i.e. Vendler's standard classification of predicates into states, activities, accomplishments and achievements) play a crucial role in verb semantics, contributing to the temporal constitution of the sentence.
Together with Alessandra Zarcone and Pier Marco Bertinetto, I have been investigating event types from both the cognitive and the computational points of view. We have carried out psycholinguistic experiments to analyze the role of event types in the mental lexicon, and we have built various models for the automatic classification of verb event types.
- tools and resources for Natural Language Processing (NLP) - in collaboration with the Dylan Lab of the ILC-CNR, we have developed tools for Italian NLP and text annotation, such as part-of-speech taggers, named entity recognizers, and terminology extractors.
I am also actively involved in the construction of syntactically and semantically annotated corpora, to be used both to train NLP tools and as information resources for linguistic research.
- Distributional Memory (DM) - a general distributional semantic model, developed in collaboration with Marco Baroni.
- LexIt - an online database developed at the Laboratory for Computational Linguistics of the University of Pisa, containing automatically corpus-derived information on the argument structure properties of Italian verbs.
- Word Combinations in Italian: theoretical and descriptive analysis, computational models, lexicographic layout and creation of a dictionary - a 3-year project funded by the Italian Ministry of Research (PRIN 2010-2011), coordinated by Raffaele Simone (University of Rome 3). I am the Head of the Research Unit at the University of Pisa. The goal of my Research Unit is to develop advanced computational linguistics methods for the extraction of distributional information from text corpora.
- Semantic representations in congenital blind subjects - a 2-year project funded by the Italian Ministry of Research (PRIN 2008), in collaboration with Giovanna Marotta (University of Pisa, Project Director), Pietro Pietrini (University of Pisa), and Marco Baroni (University of Trento). The overall goal of the project is to conduct linguistic, computational and neuro-cognitive analyses of semantic representations in the congenitally blind.
- Paisà - a 3-year project funded by the Italian Ministry of Research (Firb 2007), in collaboration with the University of Bologna (Project Director Sergio Scalise), ILC-CNR, the University of Trento and Eurac (Bolzano). The project will build a large, freely available, richly annotated corpus of Italian, and lexical databases that will be automatically acquired from it.
- Semawiki - a 2-year project funded by the Fondazione Cassa di Risparmio di Pisa. The project has developed various computational tools and resources for Italian NLP, and was carried out in collaboration with the Department of Computer Science of the University of Pisa and ILC-CNR.
As a member of the Department of Philology, Literature, and Linguistics of the University of Pisa, I teach Computational Linguistics in the Bachelor and Master Degree Courses in Informatica Umanistica (Humanities Computing). This is a highly innovative study program jointly run by the Faculty of Letters and the Faculty of Sciences, in particular by the Department of Computer Science. The master program has a special track for Language Technology.
I also teach a course on Computational Linguistics in the Master program (Laurea Magistrale) in Linguistics.
I am a member of the PhD Program in Linguistics at the University of Pisa. Each year I give short courses and seminars on various topics in computational linguistics and lexical semantics.
I am a member of the research group TRIPLE (Research Desk on Word and Lexicon), coordinated by Raffaele Simone (University of Rome 3), and each year I teach a short course at the Scuola Invernale TRIPLE (TRIPLE Winter School), mainly on computational methods for lexical research.
I have given courses and seminars at several Italian and international universities and summer schools (e.g. ESSLLI 2009, with Stefan Evert).
Publications and CV
Email address: alessandro.lenci AT unipi.it
Dipartimento di Filologia, Letteratura e Linguistica
via Santa Maria 36
56126 PISA, Italy
Phone numbers: +39-050-2215638 (University of Pisa)
Skype name: alessandro.lenci