Noise-Tolerant Occam Algorithms and Their Applications to Learning Decision Trees

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)

Abstract

In Valiant's distribution-independent model of concept learning, Angluin and Laird introduced a formal model of a noise process, called the classification noise process, to study how to compensate for randomly introduced errors, or noise, in the classification of example data. In this article, we investigate the problem of designing efficient learning algorithms in the presence of classification noise. First, we develop a technique for building efficient robust learning algorithms, called noise-tolerant Occam algorithms, and show that, using them, one can construct a polynomial-time algorithm for learning a class of Boolean functions in the presence of classification noise. Next, as an instance of such problems of learning in the presence of classification noise, we focus on the problem of learning Boolean functions represented by decision trees. We present a noise-tolerant Occam algorithm for k-DL (the class of decision lists with conjunctive clauses of size at most k at each decision, introduced by Rivest) and hence conclude that k-DL is polynomially learnable in the presence of classification noise. Further, we extend the noise-tolerant Occam algorithm for k-DL to one for r-DT (the class of decision trees of rank at most r, introduced by Ehrenfeucht and Haussler) and conclude that r-DT is polynomially learnable in the presence of classification noise.
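To make the k-DL setting concrete, the following is a minimal sketch of Rivest's greedy learner for decision lists in the noise-free case, not the paper's noise-tolerant algorithm: it repeatedly searches for a conjunctive clause of size at most k that covers some remaining examples all sharing one label, emits that (clause, label) item, and discards the covered examples. All function names here are illustrative; the noise-tolerant Occam algorithm of the paper would instead select items that keep the number of disagreements with the (noisy) sample suitably small.

```python
from itertools import combinations, product

def learn_k_dl(examples, n_vars, k):
    """Greedy noise-free k-DL learner (Rivest-style sketch).

    examples: list of (assignment, label) pairs, where assignment is a
    tuple of 0/1 values of length n_vars and label is 0 or 1.
    Returns a decision list: a list of (clause, label) items, where a
    clause is a tuple of (variable_index, required_value) literals and
    the empty clause () matches every assignment.
    """
    # Enumerate all conjunctive clauses of size at most k, including the
    # empty clause, skipping clauses that mention a variable twice.
    literals = [(i, v) for i in range(n_vars) for v in (0, 1)]
    clauses = [()]
    for size in range(1, k + 1):
        for combo in combinations(literals, size):
            if len({i for i, _ in combo}) == size:
                clauses.append(combo)

    def satisfies(x, clause):
        return all(x[i] == v for i, v in clause)

    remaining = list(examples)
    decision_list = []
    while remaining:
        for clause in clauses:
            covered = [(x, y) for x, y in remaining if satisfies(x, clause)]
            # Accept the clause only if every covered example agrees on
            # one label; the noise-tolerant variant relaxes this test.
            if covered and len({y for _, y in covered}) == 1:
                decision_list.append((clause, covered[0][1]))
                remaining = [(x, y) for x, y in remaining
                             if not satisfies(x, clause)]
                break
        else:
            raise ValueError("sample is not consistent with any k-DL")
    return decision_list

def predict(decision_list, x):
    """Evaluate a decision list: return the label of the first item
    whose clause is satisfied by x (default 0 if none fires)."""
    for clause, label in decision_list:
        if all(x[i] == v for i, v in clause):
            return label
    return 0
```

For instance, labeling all assignments over three variables with the 1-decision-list "if x0 then 1, else if x1 then 0, else 1" and running `learn_k_dl` with k = 1 recovers a list consistent with every example.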

Original language: English
Pages (from-to): 37-62
Number of pages: 26
Journal: Machine Learning
Volume: 11
Issue number: 1
DOIs
Publication status: Published - 1993 Apr
Externally published: Yes

Keywords

  • decision lists
  • decision trees
  • learning from examples
  • noisy examples
  • polynomial-time learnability
  • probably approximately correct learning

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

