Reference: Cheng, T.-P. Towards a Formalization of Partial Domain Theories, Landmark Value Selection and Incremental Learning for Inductive Learning Systems. KSL, June, 1989.
Abstract: Today's learning systems lack the ability to perform truly automated learning. In many successful cases of concept learning, the learning systems require a great deal of fine-tuning by human experts. This paper addresses three deficiencies in the current state of the art of automated learning systems. This research, directed towards the goal of a general automated learning system, was done using the inductive rule-learning system RL4. The first area of work involves the specification of a partial domain theory in the learning system. Learning from examples is a form of inductive learning in which a learning system induces descriptions of a concept given a training set of positive and negative instances of that concept. The learning system must be given a partial domain theory of the problem domain, on which the inductive leap of formulating concept descriptions is based. Using RL4 as a test-bed, this paper addresses several issues in the design of a specification language that allows a partial domain theory to be described. The main motivation behind the design is that, in handling a wide range of problem domains, a general learning system must have the flexibility to switch partial theories between different problem domains with ease. Furthermore, in light of constructive induction [Michalski, 83], a learning system should also be allowed to modify the current partial theory in the system. Thus, by having a uniform specification language for partial domain theories, RL4 takes a step towards the sharing of information among problem domains, which brings it closer to the ultimate goal of automating the specification of partial domain theories. The second area of work involves the automatic selection of landmark values. An important component of a partial domain theory is a concept description language that is used to describe a concept as it is learned.
Many current learning systems [Michalski, 83; Breiman et al., 84; Quinlan, 86], including RL4, use descriptors, called features or attributes, that take values from sets specified in a partial domain theory. Often these sets are very large or even infinite. Thus, the success of a learning system in learning a concept depends largely on human experts' choices of these values. This paper introduces a technique that automates the selection of relevant landmark values for each feature, which in turn automates part of the specification of a partial domain theory. The third area lies in incremental learning. For inductive learning in general, not all training instances may be available at the time of learning. As mentioned in [Sullivan et al., 88], it may be necessary or desirable to learn on a set of training instances and later update the theory as more instances become available, especially when learning time-varying concepts (i.e., concept drift). Thus, incremental learning can be viewed as a process whereby a theory evolves through continuous refinement. This paper discusses a method that determines the information to be included in an evolving theory. Furthermore, a criterion for evaluating evolving theories is also introduced. Section 2 gives an overview of the inductive learning system RL4, in which the ideas are tested. The subsequent three sections are devoted to the three areas of work mentioned above, in the same order. In section 6, the experimental results for landmark value selection and the incremental batch learning criterion are discussed. Finally, section 7 outlines several suggestions for future work, followed by a summary of this paper in section 8.