Kunc, Vladimír and Kléma, Jiří and Anděl, Michael

Kunc, V., Kléma, J., & Anděl, M. (2015). Increasing Weak Classifiers Diversity by Omics Networks. In Proceedings of 2nd Workshop on Machine Learning in Life Sciences (pp. 16–28). ENGINE - European Research Centre of Network Intelligence and Innovation.

Abstract

Common problems in machine learning from omics data are the scarcity of samples, the high number of features, and their complex interaction structure. Models built solely from measured data often suffer from overfitting. One possible way to deal with overfitting is to use prior knowledge for regularization. This work analyzes the contribution of feature interaction networks to the regularization of ensemble classifiers, which represent another approach to reducing overfitting. We study how the use of feature interaction networks influences the diversity of weak classifiers and thus the accuracy of the resulting ensemble model. The network and its random walks are used to control feature randomization during the construction of weak classifiers, which makes them more diverse than in the well-known random forest. We experiment with different types of weak classifiers (trees, logistic regression, naïve Bayes) and different random walk lengths, and demonstrate that the diversity of weak classifiers grows with their increasing network locality.
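
The idea of letting random walks on a feature interaction network choose the features for each weak classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_walk_features`, its parameters, the restart rule, and the toy network are all assumptions made for illustration. Each walk starts at a randomly chosen feature and takes up to `walk_length` steps along network edges; walks restart until the requested number of features has been collected.

```python
import random


def random_walk_features(adjacency, walk_length, n_features, rng):
    """Sample a feature subset by random walks over a feature network.

    Hypothetical helper, not taken from the paper.
    adjacency:   dict mapping each feature to a list of neighbouring features
    walk_length: number of steps taken from each seed before restarting
    n_features:  target subset size (capped at the number of nodes)
    rng:         a random.Random instance, passed in for reproducibility
    """
    nodes = sorted(adjacency)
    target = min(n_features, len(nodes))
    selected = set()
    while len(selected) < target:
        current = rng.choice(nodes)        # restart from a random seed feature
        selected.add(current)
        for _ in range(walk_length):
            if len(selected) >= target:
                break
            neighbours = adjacency[current]
            if not neighbours:             # isolated node: abandon this walk
                break
            current = rng.choice(neighbours)
            selected.add(current)
    return selected


# Toy network (assumed): g1 - g2 - g3 form a chain, g4 is isolated.
network = {"g1": ["g2"], "g2": ["g1", "g3"], "g3": ["g2"], "g4": []}
subset = random_walk_features(network, walk_length=2, n_features=3,
                              rng=random.Random(0))
```

In this sketch, `walk_length=0` reduces to sampling features uniformly at random without using the network, while longer walks draw more of each subset from connected regions of the network. Each weak classifier in an ensemble would then be trained on its own sampled subset.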

Citation

@inproceedings{kunc2015,
  title = {Increasing Weak Classifiers Diversity by Omics Networks},
  author = {Kunc, Vladimír and Kléma, Jiří and Anděl, Michael},
  year = {2015},
  booktitle = {Proceedings of 2nd Workshop on Machine Learning in Life Sciences},
  publisher = {ENGINE - European Research Centre of Network Intelligence and Innovation},
  location = {Wroclaw},
  pages = {16--28},
  isbn = {978-83-943803-0-4},
  url = {http://ida.felk.cvut.cz/klema/publications/Biotex/kunc_mlls2015.pdf}
}