Abstract
This work describes the identification and evaluation of candidate text markers for sentiment analysis, and their use in the feature extraction step that sentiment analysis requires when working from plain text. The markers were obtained through systematic analysis of one corpus and then evaluated on a second corpus. This evaluation showed that emphasized positive words tend to appear in positive posts, and it allowed us to assess the relation between the polarity of morphological text markers and the polarity of the text in which they appear. Used for polarity detection in combination with a polarized dictionary, only three markers produced an average polarity classification precision of 0.56. This is a promising result when compared with the top value of 0.69, obtained for the same task using more features and specialized dictionaries.