Transcending the Bayes Limit Through
Repeated Sampling*
Jiangying Zhou Daniel Lopresti
Matsushita Information Technology Laboratory
Panasonic Technologies, Inc.
Two Research Way
Princeton, NJ 08540
USA
[jz,dpl]@mitl.research.panasonic.com
August 24, 1994
Abstract
In statistical pattern recognition, the Bayes Risk serves as a reference: a limit of excellence that cannot be surpassed. In this paper, we show that by relaxing the assumption that the input be sampled only once, a classification system can be built that beats the Bayes error bound. We present a detailed analysis of the effects of repeated sampling, including proofs that it always yields a net improvement in recognition accuracy for common distributions of interest. Upper and lower bounds on the net improvement are also discussed.
Keywords: pattern recognition, statistical classifier, Bayes Risk, repeated sampling, optical character recognition, consensus sequence voting.
1 Introduction
A fundamental problem in pattern recognition is to take an unidentified object and associate it with one of a set of pre-defined classes according to measurements of some number of its physical attributes. It is well known that the error rate for any statistical classifier based on a specific collection of attributes, or feature set, is lower-bounded by the Bayes Risk [Cho57]. In this paper, we show that by relaxing a basic assumption, namely that the input be sampled only once, a classification system can be built that beats the Bayes error bound. This result is not just a theoretical curiosity, but appears to have practical applications in real-world recognition problems.
According to Bayes' theorem, the design of a statistical classifier is dictated by the characteristics of the a priori class probabilities and by the conditional probability distributions of the measured features for each class. Once the distributions of these random variables are known, the optimal classification boundaries are determined by the Bayes decision rule. Errors arise when the distributions for different classes overlap (e.g., Figure 2). In the traditional case, such mistakes are unavoidable; the classifier is "optimal" in the sense that it minimizes this base error rate.
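To make the role of overlapping distributions concrete, the following minimal sketch applies the Bayes decision rule to a single feature and computes the resulting base error rate. It assumes two classes with equal priors and Gaussian class-conditional densities of equal variance; these distributions, the parameter values, and the function names are illustrative assumptions, not taken from the paper.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, std):
    """Cumulative distribution function of the same Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def bayes_decide(x, priors, means, std):
    """Bayes decision rule: pick the class maximizing prior * likelihood."""
    scores = [p * normal_pdf(x, m, std) for p, m in zip(priors, means)]
    return max(range(len(scores)), key=scores.__getitem__)


def bayes_error_two_class(means, std):
    """Bayes error for two equal-prior, equal-variance Gaussian classes.

    Under these assumptions the optimal boundary is the midpoint of the
    two means; the error is the probability mass of each class that
    falls on the wrong side of it, weighted by the prior (0.5 each).
    """
    lo, hi = sorted(means)
    boundary = 0.5 * (lo + hi)
    err_lo = 1.0 - normal_cdf(boundary, lo, std)  # lower class lands above
    err_hi = normal_cdf(boundary, hi, std)        # upper class lands below
    return 0.5 * err_lo + 0.5 * err_hi


if __name__ == "__main__":
    # Hypothetical example: class means 0 and 2, unit standard deviation.
    print(bayes_decide(0.4, [0.5, 0.5], [0.0, 2.0], 1.0))
    print(bayes_error_two_class((0.0, 2.0), 1.0))
```

With these assumed parameters the boundary sits at 1.0 and the error is about 0.159: even the optimal single-sample classifier mislabels roughly one input in six, because the two densities overlap around the boundary.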
In a previous paper, we introduced a methodology that reduces the residual error rate in optical character recognition (OCR) by sampling the input repeatedly and combining the results through
*To be presented at the IAPR Workshop on Machine Vision Applications, Kawasaki, Japan, December 1994.