Learning Efficient Rules by Maintaining the Explanation Structure
Jihie Kim and Paul S. Rosenbloom
Information Sciences Institute and Computer Science Department University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292, U.S.A.
Many learning systems suffer from the utility problem; that is, that time after learning is greater than time before learning. Discovering how to assure that learned knowledge will in fact speed up system performance has been a focus of research in explanationbased learning (EBL). One way to analyze the utility problem is by examining the differences between the match process (match search) of the learned rule and the problem-solving process from which it is learned. Prior work along these lines examined one such difference. It showed that if the search-control knowledge used during problem solving is not maintained in the match process for learned rules, then learning can engender a slowdown; but that this slowdown could be eliminated if the match is constrained by the original search-control knowledge. This article examines a second difference ? when the structure of the problem solving differs from the structure of the match process for the learned rules, time after learning can be greater than time before learning. This article also shows that this slowdown can be eliminated by making the learning mechanism sensitive to the problem-solving structure; i.e., by reflecting such structure in the match of the learned rule.
Efficiency is a major concern for all problem solving systems.
One way of achieving efficiency is the application
of learning techniques to speed up problem solving.
Explanation-based learning (EBL)(Mitchell, Keller, &
Kedar-Cabelli 1986; DeJong & Mooney 1986) can improve
performance by acquiring new search-control rules1. Given
its four informational components ? the goal concept, the
training example, the domain theory, and the operationality
criterion ? EBL generates a new search control rule that is
intended to reduce the search required in subsequent problems.
Unfortunately, EBL suffers from the utility problem,
so that the cost of using learned rules often overwhelms
Research on the utility problem can be divided up into
1EBL can also be used to acquire other types of structures, such as macro-operators, but we focus on search-control rules here.
two key issues. The first issue is the expensive chunk 2
problem (Tambe 1991), in which individual learned rules are so expensive to match that the system suffers a slow down from learning (Minton 1988; Tambe 1991; Etzioni 1990; Shell & Carbonell 1991; Subramanian & Feldman 1990). The second issue is the average growth effect (Doorenbos, Tambe, & Newell 1992), in which the interactions across the rules slow down the system, even if none of the rules individually are all that expensive. Recent work on the average growth effect has shown that it is possible to learn over one million rules while still allowing their efficient use (Doorenbos, Tambe, & Newell 1992; Doorenbos 1993). In this article we focus on the expensive chunk problem.
Previous work on the expensive chunk problem has investigated how to produce cheaper rules (Prieditis & Mostow 1987; Minton 1988; Shell & Carbonell 1991; Shavlik 1990; Etzioni 1990) and how to filter out expensive rules (Minton 1988; Greiner & Jurisica 1992; Gratch & Dejong 1992; Markovitch & Scott 1993). However, none of these approaches can generally guarantee that the cost of using the learned rules will always be bounded by the cost of the problem solving episode from which they are learned. That is, the cost of a learned rule can be greater than the cost of solving the problem with the original set of rules. There has been developed a technique for restricting the expressiveness of the rules to bound the match cost of the rules (Tambe 1991). However, the restriction reduces the expressibility of the rules, requiring a large number of rules to encode tasks. Also, the learned rules may become very specific. One way of finding a solution which can guarantee cost boundedness without such a restriction is to investigate the differences between the match process (i.e., the search performed during match3) of the learned rule and the problem-solving process from which it is learned. By analyzing the differences, we can identify a set of sources which can make the output rule expensive. Prior work on this topic has examined one such difference: in chunking (and other EBL systems which use search control in problem solving), eliminating search control in learning can increase the cost of the learned rules (Kim &
2Chunk means any learned rule. This is a generalization of the
term used in the Soar system.
3What is referred to as k-search in (Tambe 1991).