Abstract
The growth of scientific production makes it difficult to identify the patterns and traits that characterize researchers. Establishing levels of compatibility and similarity between actors in a scientific research context from their profiles requires a fast and appropriate process. The objective of this article is to evaluate the levels of similarity, Euclidean distance and compatibility between researcher vectors, using clustering algorithms, multidimensional scaling, principles of the vector-space model and attributes of the researchers' scientific profiles, taking into account the terminology used in their scientific production. Theoretical and empirical methods were used, including text mining techniques and tools. Application of the procedure at the Advanced Energy and Technology Study Center in Cuba and the Cotopaxi Technical University in Ecuador demonstrated its effectiveness. As a result, it was possible to identify professionals with higher levels of coincidence in areas and lines of research, which favors the establishment of Collective Communities of Knowledge. The results also show that the methods used can be integrated with ICT to obtain perceptual relationships between researchers and to express the groups formed from clusters of observations in each subcategory and knowledge domain of the two case studies analyzed.
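As a minimal illustration of the vector-space idea summarized above, the following Python sketch builds term-frequency vectors from short profile texts and compares them with cosine similarity and Euclidean distance. The profile texts, researcher names and helper functions are hypothetical and are not taken from the study. Raw term counts are an assumption made for brevity; the article's actual weighting scheme is not stated here.

```python
import math
from collections import Counter


def term_vector(text, vocabulary):
    """Return raw term counts for `text`, ordered by `vocabulary`."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def euclidean_distance(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical profile texts standing in for terms from scientific production.
profiles = {
    "researcher_a": "solar energy storage solar cells",
    "researcher_b": "solar cells energy conversion",
    "researcher_c": "text mining clustering text",
}

# Shared vocabulary: every distinct term across all profiles, in sorted order.
vocabulary = sorted({t for text in profiles.values() for t in text.split()})
vectors = {name: term_vector(text, vocabulary) for name, text in profiles.items()}

for first, second in [("researcher_a", "researcher_b"),
                      ("researcher_a", "researcher_c")]:
    sim = cosine_similarity(vectors[first], vectors[second])
    dist = euclidean_distance(vectors[first], vectors[second])
    print(f"{first} vs {second}: similarity={sim:.3f}, distance={dist:.3f}")
```

In this toy data, the two profiles that share energy-related terms score a higher similarity than the pair with no terms in common. A pairwise distance matrix built this way is the kind of input that clustering and multidimensional scaling operate on.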