
Advances in Instance-Based Learning

D. Randall Wilson and Tony R. Martinez

Neural Network & Machine Learning Laboratory
World-Wide Web:
Computer Science Department
Brigham Young University
Provo, Utah 84602, USA
Tel. (801) 378-6464

The basic nearest-neighbor rule generalizes well in many domains but has several shortcomings, including inappropriate distance functions, large storage requirements, slow execution time, sensitivity to noise, and an inability to adjust its decision boundaries after storing the training data. This paper proposes methods for overcoming each of these weaknesses and combines them into a comprehensive learning system called the Integrated Decremental Instance-Based Learning Algorithm (IDIBL), which seeks to reduce storage, improve execution speed, and increase generalization accuracy compared with the basic nearest-neighbor algorithm and other learning models. In our experiments, IDIBL achieves higher generalization accuracy than other, less comprehensive instance-based learning algorithms, while requiring less than one-fifth the storage of the nearest-neighbor algorithm and improving execution speed by a corresponding factor. In experiments on 31 datasets, IDIBL also achieves higher generalization accuracy than that reported for 16 major machine learning and neural network models.

Key words: instance-based learning, classification, pruning, distance-weighting, voting.

Running head: Advances in IBL