Learning of Indexed Language Families from Stochastic Input

Shyam Kapur
Department of Computer Science, James Cook University, Townsville Q 4811, Australia.

Gianfranco Bilardi
Dipartimento di Elettronica ed Informatica, Universita di Padova, Via Gradenigo 6/A, 35131 Padova, Italy.


Language learning from positive data in the Gold model of inductive inference is investigated in a setting where the data can be modeled as a stochastic process. Specifically, the input strings are assumed to be generated from a sequence of identically distributed, independent random variables, where the distribution depends on the language being presented. It is shown that the indexed families of languages learnable in a distribution-free fashion with probability greater than 1/2 are exactly those that can be learned from all texts. Conditions are established under which a technique can be employed to boost the probability of success at learning.
Conference Home Page