• Distributional Learning of Parallel Multiple Context-Free Grammars

    • Date presented: 1392/07/24
    • Date published on TPBin: 1392/07/24
     Natural languages require grammars beyond context-free for their description. Here we extend a family of distributional learning algorithms for context-free grammars to the class of parallel multiple context-free grammars (PMCFGs). These grammars have two operations beyond the simple context-free operation of concatenation: the ability to interleave strings of symbols, and the ability to copy or duplicate strings. This allows the grammars to generate some non-semilinear languages, which lie outside the class of mildly context-sensitive languages. These grammars, if augmented with a suitable feature mechanism, are capable of representing all of the syntactic phenomena that have been claimed to exist in natural language. We present a learning algorithm for a large subclass of these grammars that includes all regular languages but not all context-free languages. This algorithm relies on a generalisation of the notion of distribution as a function from tuples of strings to entire sentences; we define nonterminals using finite sets of these functions. Our learning algorithm uses a nonprobabilistic learning paradigm that allows for membership queries as well as positive samples; it runs in polynomial time.
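
As an illustration only, the two extra operations and the generalised notion of context can be sketched in a few lines of Python. This is not the learning algorithm from the paper: the function names (`copy_rule`, `interleave_rule`, `make_context`, `is_member`) and the example language {a^(2^n)} are assumptions chosen for the sketch.

```python
def copy_rule(x):
    """Copying: a rule of the form S(x x) <- A(x) uses its variable twice."""
    return x + x


def interleave_rule(pair_a, pair_b):
    """Interleaving: combine two string pairs as (x1, x2), (y1, y2) -> x1 y1 x2 y2."""
    (x1, x2), (y1, y2) = pair_a, pair_b
    return x1 + y1 + x2 + y2


def double_power(n):
    """Apply the copy rule n times to 'a', giving a string of length 2**n.

    The set of such strings is a standard example of a non-semilinear language.
    """
    s = "a"
    for _ in range(n):
        s = copy_rule(s)
    return s


def make_context(template):
    """Build a generalised context: a function from a tuple of strings to a sentence.

    `template` is a list whose items are either literal strings or integer
    indices into the tuple; an index may occur more than once (copying).
    """
    def context(strings):
        return "".join(strings[p] if isinstance(p, int) else p for p in template)
    return context


def is_member(sentence):
    """Hypothetical membership oracle for the example language {a^(2^n) : n >= 0}."""
    n = len(sentence)
    return n > 0 and set(sentence) == {"a"} and (n & (n - 1)) == 0


# A context that duplicates its single argument: f(x) = x x.
doubling = make_context([0, 0])
# A membership query asks whether the sentence built by a context is in the language.
accepted = is_member(doubling(("aaaa",)))
rejected = is_member(doubling(("aaa",)))
```

Here `doubling(("aaaa",))` builds a sentence of length 8, which the oracle accepts, while `doubling(("aaa",))` builds one of length 6, which it rejects. In this sketch a learner would distinguish substrings by the set of such contexts that accept them.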

