Una metodología para encontrar el mejor clasificador en decisión empresarial

José C. Vega Vilca; David A. Torres Núñez

doi:10.15517/rce.v33i1.19971

Vol. 33 No. 1 (2015), Articles

A methodology to find the best classifier in business decision

Published July 2, 2015

José C. Vega Vilca

Universidad de Puerto Rico, P.O. Box 23332, Postal Code 00931, San Juan, Puerto Rico

David A. Torres Núñez

Universidad de Puerto Rico, P.O. Box 23332, Postal Code 00931, San Juan, Puerto Rico

Keywords

SUPERVISED CLASSIFICATION
CROSS VALIDATION
ERROR RATE
CUSTOMER
STATISTICAL DECISION
MULTIVARIATE ANALYSIS
CLASIFICACIÓN SUPERVISADA
VALIDACIÓN CRUZADA
TASA DE ERROR
CLIENTE
DECISIÓN ESTADÍSTICA
ANÁLISIS MULTIVARIABLE

How to Cite

Vega Vilca, J. C., & Torres Núñez, D. A. (2015). A methodology to find the best classifier in business decision. Revista De Ciencias Económicas, 33(1), 63–73. https://doi.org/10.15517/rce.v33i1.19971

Abstract

This research presents a methodology for improving analysis strategies in situations where supervised classification is the fundamental tool for business decisions. The need to assign new customers to one of several groups, according to each customer's characteristics, is analyzed through the calculation of the error rate. Programs were written in the statistical software package R to calculate the error rate of each of nine classifiers, using 10-fold cross-validation (Stone, 1974) on 50 permutations of the data under consideration. For each data set analyzed, ANOVA showed significant differences among the classifiers' average error rates (reported p = 0.00); the best classifier is therefore taken to be the one with the lowest average error rate.
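
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' R code: it is written in Python with the standard library only, the data are synthetic, and the three classifiers (`majority`, `nearest_centroid`, `one_nn`) are simple stand-ins chosen for illustration rather than the nine classifiers compared in the article. All function names are hypothetical. The sketch computes a 10-fold cross-validated error rate for each classifier on 50 random permutations of the data and then a one-way ANOVA F statistic across the classifiers' error rates.

```python
import random
from statistics import mean

def make_data(n_per_class=30, seed=0):
    """Synthetic two-class data (assumed stand-in for customer records)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data

def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def majority(train):
    """Baseline: always predict the most frequent training label."""
    labels = [y for _, y in train]
    top = max(set(labels), key=labels.count)
    return lambda x: top

def nearest_centroid(train):
    """Predict the class whose mean feature vector is closest."""
    cents = {}
    for c in {y for _, y in train}:
        pts = [x for x, y in train if y == c]
        cents[c] = tuple(mean(col) for col in zip(*pts))
    return lambda x: min(cents, key=lambda c: dist2(x, cents[c]))

def one_nn(train):
    """Predict the label of the single nearest training point."""
    return lambda x: min(train, key=lambda t: dist2(x, t[0]))[1]

def cv_error(data, fit, k=10):
    """k-fold cross-validated error rate for the data in its current order."""
    wrong = 0
    for i in range(k):
        test = data[i::k]  # every k-th row, starting at i, forms fold i
        train = [row for j, row in enumerate(data) if j % k != i]
        predict = fit(train)
        wrong += sum(predict(x) != y for x, y in test)
    return wrong / len(data)

def error_table(data, classifiers, n_perm=50, seed=1):
    """Error rate of every classifier on each of n_perm random permutations."""
    rng = random.Random(seed)
    table = {name: [] for name in classifiers}
    for _ in range(n_perm):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for name, fit in classifiers.items():
            table[name].append(cv_error(shuffled, fit))
    return table

def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_vals = [v for g in groups for v in g]
    grand = mean(all_vals)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_vals) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

classifiers = {"majority": majority, "centroid": nearest_centroid, "1-NN": one_nn}
table = error_table(make_data(), classifiers)
means = {name: mean(errs) for name, errs in table.items()}
best = min(means, key=means.get)  # classifier with the lowest mean error rate
f_stat = anova_f(list(table.values()))
print(means, best, f_stat)
```

Each permutation changes which observations fall into which fold, so the 50 error rates per classifier act as replicates for the ANOVA. A large F statistic indicates that the mean error rates differ across classifiers; turning it into a p-value requires the F distribution, which the standard library does not provide.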

References

Antipov, E., & Pokryshevskaya, E. (2010). Applying CHAID for logistic regression diagnostics and classification accuracy improvement. Journal of Targeting, Measurement and Analysis for Marketing, 18 (2), 109–117.

Blake, C. L., & Merz, C. J. (1998). Churn Data Set. University of California, Department of Information and Computer Science, Irvine, CA. Retrieved from: http://www.sgi.com/tech/mlc/db/churn.data

Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. Boca Raton, FL: CRC Press LLC.

Dobson, A. (2002). An Introduction to Generalized Linear Models. Boca Raton, FL: CRC Press LLC. doi:10.1002/sim.1493

Hothorn, T., Hornik, K., van de Wiel, M., & Zeileis, A. (2006). A Lego System for Conditional Inference. The American Statistician, 60 (3), 257–263. doi:10.1198/000313006X118430

Manning, C., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. London: Cambridge University Press.

Ripley, B. D. (1996). Pattern Recognition and Neural Networks. London: Cambridge University Press.

Smith, C. (1947). Some examples of discrimination. Annals of Eugenics, 13, 272–282.

Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions (with discussion). Journal of the Royal Statistical Society, Series B, 36 (2), 111–133.

Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S. New York, NY: Springer-Verlag. doi:10.1007/978-0-387-21706-2

Witten, I., Frank, E., & Hall, M. (2011). Data Mining: Practical Machine Learning Tools and Techniques. Burlington, MA: Morgan Kaufmann.
