Resumen

In this paper we focus on collocations, which have been studied in computational linguistics since they constitute a key factor when processing natural languages. For instance, they usually represent a challenge in automatic translation because the association of two terms is not easily computed. We proposed that the parser should be provided with a lexical database in order to make more effective the identification of collocations during the parsing process. We assessed this claim by using a corpus of 6’000 sentences retrieved from the British magazine The Economist Espresso. The corpus was parsed twice, first with the collocation detection component turned on and then with it turned off, and to make the comparison the Fips tagger was used. The results showed an improvement of the quality when the parser has access to collocation knowledge. 

Palabras clave: collocations, multiword expressions, sentence parsing, computational linguistics, natural language processing