Learning of Indexed Language Families from Stochastic Input

Shyam Kapur
Department of Computer Science, James Cook University, Townsville Q 4811, Australia.
kapur@cs.jcu.edu.au

Gianfranco Bilardi
Dipartimento di Elettronica ed Informatica, Universita di Padova, Via Gradenigo 6/A, 35131 Padova, Italy.
bilardi@artemide.dei.unipd.it

Abstract

Language learning from positive data in the Gold model of inductive inference is investigated in a setting where the data can be modeled as a stochastic process. Specifically, the input strings are assumed to be generated from a sequence of identically distributed, independent random variables, where the distribution depends on the language being presented. It is shown that the indexed families of languages learnable in a distribution-free fashion with probability greater than 1/2 are exactly those that can be learned from all texts. Conditions are established under which a technique can be employed to boost the probability of success at learning.

Conference Home Page