Rewrite rule grammars with multitape automata

abstract

The majority of computational implementations of phonological and morphophonological alternations rely on composing together individual finite state transducers that represent sound changes. Standard composition algorithms do not maintain the intermediate representations between the ultimate input and output forms. These intermediate strings, however, can be very helpful for various tasks: enriching information (indispensable for models of historical linguistics), providing new avenues to debugging complex grammars, and offering explicit alignment information between morphemes, sound segments, and tags. This paper describes a multitape automaton approach to creating full models of sequences of sound alternation that implement phonological and morphological grammars. A model and a practical implementation of multitape automata are provided together with a multitape composition algorithm tailored to the representation used in this paper. Practical use cases of the approach are illustrated through two common examples: a phonological example of a complex rewrite rule grammar where multiple rules interact and a diachronic example of modeling sound change over time.

introduction
Finite-state transducer based phonological and morphological models tend to be built by the composition of individual transducers that encode morphotactics and morphophonological alternations (Beesley and Karttunen 2003). Apart from cases such as nonconcatenative morphologies where augmented techniques tend to be favored (Beesley and Karttunen 2000; Habash et al. 2005; Hulden 2009d; Kiraz 2001), this well-established approach is indeed quite successful and streamlined in the domain of morphophonology if the goal is to produce a single transducer that maps underlying forms (parses) to surface forms and vice versa. Some types of grammatical information are difficult to include in such a design, however. In morphological modeling, one may want to recover the alignment of morphological tags to the actual morphemes; in phonological modeling, one may want to recover intermediate representations that show how a particular phonological alternation targets specific segments in a word, what order phonological alternations occur in, and what they were conditioned on. This is particularly important in developing finite-state models of historical sound change, where it is imperative to retain intermediate alignment information so that the model may indicate what sound laws proto-segments are subject to and in what order changes occur. In some respect, the "intermediate representations" in diachronic derivations are more crucial to the linguist than their counterparts in synchronic models since in the latter case they are bound to a particular model of phonology. The ability to model such sequences would make finite-state devices more attractive for linguistic research, where computational methods could help streamline the work of lining up large amounts of data and testing hypothetical generalizations; it might therefore increase linguists' use of finite-state methods, whose potential has to date been underexploited in the linguistics literature (Karttunen 2003).
In this paper, I show that a multitape model constructed by composition of individual multitape lexicon or alternation transducers offers a simple framework that addresses the problem of intermediate forms, while at the same time retaining the straightforward design of morphology and morphophonology. Apart from expanding the expressive power of the grammar, the method also offers the grammar designer the option to re-convert the multitape grammar to a simple underlying-to-surface transducer, if desired, as may be the case if the multitape representation is only used for obtaining debugging information. Indeed, debugging the alternation rules and lexicon description involved in drafting a morphological grammar becomes much less burdensome under the multitape model, since information about each step in the process of mapping from underlying to surface form is retained and is available for inspection.
The methods described in this paper are implemented as a standalone library in Python. The library itself is built on top of the foma library (Hulden 2009b) which provides a backbone implementation of standard transducer algorithms. The implementation allows users to develop multitape grammars in standard regular expression and rewrite-rule notation, automatically and transparently converting the compiled transducers to multitape equivalents and performing multitape composition on the components. This enables a relatively linguist-friendly grammar design procedure that relies on well-known formalisms and offers the possibility of quick conversion of existing grammars into a multitape representation where word-forms can be parsed and generated with a rich intermediate structure.
This paper is structured as follows: first, some background on rewrite-rule grammars is presented, motivating the need for more richly structured representations; this is followed by a description of the multitape encoding with special focus on the composition algorithm for multitape automata; following this, a system for augmenting the multitape automata with extra annotation (such as rule names) is presented; two case studies are then provided to illustrate in concrete terms the possibilities of the multitape formalism.

traditional rewrite rule grammars
A significant portion of morphological analysis tools are written with the design described above: (1) a transducer that encodes morphotactics and tag sequences, and (2) a series of transducers that model morphophonological/orthographic alternation. The latter may be expressed as Sound Pattern of English-inspired 'rewrite rules' (Chomsky and Halle 1968) or as two-level parallel constraints (Koskenniemi 1983), the former being arguably the more popular choice at present due to simplicity of debugging complex rule interactions (Alegria et al. 2010). The result of composing the lexicon transducer and the morphophonological transducers is one monolithic transducer that directly performs the bidirectional mapping from underlying-to-surface forms (generation) and vice versa (parsing). The prevalence of this design is probably partly due to known algorithms (Kaplan and Kay 1994; Kempe and Karttunen 1996; Mohri and Sproat 1996; Hulden 2009c) or software tools designed around this paradigm (such as lexc/xfst/twol by Xerox (Beesley and Karttunen 2003), foma (Hulden 2009b), or Kleene (Beesley 2012)). In the following, I shall assume the more common 'rewrite-rule' paradigm.
Table 1 illustrates this standard design using some example words from a grammar of Lardil (ISO 639-3: lbz, a Pama-Nyungan language spoken on Mornington Island in Australia). This is an example language often used to illustrate complex rule ordering and word-final phonology with rules that are sensitive to ordering. The table is laid out in a manner often employed by phonologists to quickly give an overview of interacting processes. The original data stems from Hale (1973), and I follow analyses by Kenstowicz and Kisseberth (1979); Hayes (2011); Round (2011). Due to the rich interaction of word-final deletion rules, this is a commonly cited data set that has been a target of many analyses, all of which illustrate the difficulty of marshaling a complex set of phonological alternations. In the language, we find three independently motivated deletion rules (apocope, cluster reduction, non-apical truncation) which interact in complex ways, sometimes conspiring to elide multiple segments word-finally. The rules in question are shown here in traditional phonological notation:
To explain the workings of the grammar, the table shows all the intermediate steps in mapping from lemma-and-inflection forms to actual surface realizations. In actuality, however, if modeled by transducer composition, all the intermediate forms are lost through the composition process, which is one of the shortcomings addressed below. That is, a final composite transducer simply provides mappings between parse and surface. For phonological analysis, possible grammar debugging, and perhaps language documentation purposes, it would be very desirable to be able to produce a rich representation such as any of the columns shown in Table 1 from either an underlying form (morphological information) or the surface form showing all the processes that the word undergoes step-by-step.
Under the standard composition model, there is no easy way to do this, save by applying an underlying form to each of the individual transducers representing the alternation rules in order, saving the results, and passing them on as input to the next transducer. However, in the inverse direction, such a strategy is not directly feasible, in addition to the fact that not composing the transducers partly defeats the purpose of using a finite-state model in the first place.
There is no principled reason, however, why the composition algorithm should destroy the intermediate representations. In other words, when creating a composite transducer modeling x:z from transducers x:y and y:z, one can in principle expand the composition algorithm to yield x:y:z in some representation, retaining all the intermediate information. As will be seen below, a combination of a multitape design together with a rule-decoration mechanism allows us to automatically produce rich analyses very much like the ones given in Table 1.

previous work
Multitape automata in general have been proposed as viable models for morphology and phonology, particularly when addressing nonconcatenative phenomena abundant in Semitic languages such as Arabic, Hebrew, and Syriac (Altantawy et al. 2010; Kay 1987; Habash et al. 2005; Habash and Rambow 2006; Hulden 2009d; Kiraz 2000, 2001). In these approaches, different phonological tiers are represented by different tapes in a multitape model. Most of these earlier models could in fact be called multitape transducer models, since they typically work akin to transducers, although with an extended symbol representation where instead of manipulating symbol pairs, as in the transducer case, transitions are labeled with n-tuples of symbols. Specialized algorithms are then used to handle this representation and to enforce symbol correspondences across tapes; Kiraz (2000), for example, works with a constraint formalism similar to that of two-level morphology (Koskenniemi 1983), extended to operate in a multitape transducer scenario.
By contrast, the current work assumes as a starting point that regularities across multiple levels of representation will be captured not by constraints across multiple tapes, but that adjacent tapes will be constrained by (morpho)phonological rewrite rules.To make this feasible, the compilation of rewrite rules must be extended to a multitape scenario, and a composition algorithm is required that is able to join multitape representations together, preserving intermediate information.
The importance of the preservation of intermediate results in composition has been noted and partly addressed in Kempe et al. (2004), among others.The formulation presented below differs from earlier work in both representation and algorithms, and also in that it is intended to be simple and easily implementable without special algorithms for multitape automata, i.e. using only established algorithms for single-tape automata and transducers.The same representation (without a composition design) has been used earlier for the construction of Arabic multitape grammars (Hulden 2009a).In that work, conversion from transducers is not considered, and no composition algorithm is given, as the assumption is that multitape automata are constructed through intersections of constraints on co-occurrence of symbols on the various tapes, analogously to two-level grammars (Koskenniemi 1983).The multitape representation in this paper uses the encoding from (Hulden 2009d) and builds upon extensions to it given in Hulden (2015).

notation
In discussing algorithmic aspects, familiarity with standard regular expression notation to construct automata and transducers is assumed. For regular languages or automata X and Y, the description below will make use of the operations union (X ∪ Y), concatenation (X Y), Kleene closure (X*), Kleene plus (X+), intersection (X ∩ Y), complement (¬X), and difference (X − Y). The n-ary concatenation of a language X with itself is denoted X^n. From two languages represented as automata, their string-wise cross-product and resulting regular relation (representable as a transducer) is denoted with X:Y. If X and Y are transducers, their composition is (X • Y). The input and output projections of a relation/transducer X are denoted domain(X) and range(X). Whenever a regular language (or automaton) X appears in a transducer context, it is assumed to represent the identity relation, i.e. a transducer that simply repeats the set of words accepted by X. In some algorithms subtraction is performed in a transducer context (X − Y); in such cases the subtraction refers to transducer path subtraction and not relation subtraction, under which regular relations are not closed, i.e. the result represents the set of valid sequences of symbol pairs in X but not in Y. We use the special symbol ? to represent any single symbol.
When describing linguistic grammars, the well-known Xerox regular expression notation (Beesley and Karttunen 2003) is used in this paper to define and manipulate automata and transducers, rewrite rule transducers in particular; the examples should be directly compilable with the foma library.The formalism used is summarized in Table 2. Multitape additions are implemented through a Python interface discussed in Section 9.

a multitape encoding
In the implementations below, a multitape representation is assumed to be a simple single-tape automaton that either accepts or rejects a string s in the standard way. However, the strings in question are intended to represent valid computations of a multitape automaton where certain positions in s pertain to certain tapes. Which symbol in the linear string s belongs to which tape is modeled by a simple "interleaving" encoding where the length of any accepted string s is always an exact multiple of the number of tapes in the multitape model it is intended to represent. Informally, the string first encodes the first column of the legal contents of an n-tape multitape automaton, top-down, then the second column, and so on. Every symbol in position k in the linear string representation corresponds, in the case of n tapes, to position ⌊k/n⌋ on tape (k mod n). A special representation for empty symbols (ε-symbols) in the single-tape model is assumed whereby they are represented with the symbol □, a so-called "hard zero". A string of length l×n in the single-tape string would correspond to the multitape representation as follows, where, in parentheses, the position within a tape is shown first, followed by the tape number in the multitape representation.
For example, if a single-tape representation contains in its language the string abcde□, this is assumed to correspond to a valid configuration seen from the multitape point-of-view (a 3-tape configuration); i.e. a multitape automaton that accepts the string ad as input, translates it into be, and then translates this into c (the □-symbol representing the empty string).
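The interleaving can be made concrete with a minimal Python sketch. It assumes every symbol is a single character, and the helper names (`interleave`, `deinterleave`, `tape_content`) are hypothetical illustrations rather than part of the library described in this paper.

```python
HARD_ZERO = "□"  # stands in for an epsilon on a tape


def interleave(tapes):
    """Encode equal-length tapes column by column into one linear string."""
    length = len(tapes[0])
    if any(len(t) != length for t in tapes):
        raise ValueError("pad shorter tapes with the hard zero first")
    return "".join("".join(column) for column in zip(*tapes))


def deinterleave(s, n):
    """Recover n tapes: the symbol at index k sits at position k // n on tape k % n."""
    if len(s) % n != 0:
        raise ValueError("length must be a multiple of the number of tapes")
    return [s[tape::n] for tape in range(n)]


def tape_content(tape):
    """Read a tape as an ordinary string by dropping hard zeros."""
    return tape.replace(HARD_ZERO, "")


tapes = deinterleave("abcde□", 3)
print(tapes)                             # the three aligned tapes
print([tape_content(t) for t in tapes])  # the strings they spell out
```

Applied to the string abcde□ with three tapes, the sketch recovers the tapes ad, be, and c□, the last of which reads as c once the hard zero is dropped.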

conversion from transducers
It is evident that an existing transducer can be converted to this multitape representation (that is, to a 2-tape representation) without much effort. To convert a standard transducer, where transitions are encoded as symbol pairs, one simply expands each transition with a symbol pair x:y to a two-symbol sequence x y in the corresponding n-tape automaton. This operation will be referred to as "flattening." If the original transducer T maps a string x_1 ... x_n to y_1 ... y_n by a sequence of transitions with labels ((x_1, y_1), ..., (x_n, y_n)), then the automaton flatten(T) accepts a string (x_1 y_1 ... x_n y_n). In the result, ε-symbols are replaced with the □-symbol. This □-symbol is only used to mark the alignment of epsilons and need not be specified by the user in any way, as will be discussed below. So-called Unknown symbols (placeholders for future alphabet expansion in incremental construction of automata) are denoted by @. These are symbols that match any symbol outside the alphabet of an automaton. Note that this is different from the semantics of the ?-symbol in regular expressions, which represents any single symbol at all with no reference to an alphabet (Beesley and Karttunen 2003).
Conversion of transducers is particularly convenient since we can take advantage of existing algorithms for building complex transducers for NLP use. This includes replacement-rule transducers available in many toolkits, as well as lexicon transducers constructed through essentially right-linear grammars. Figure 1 shows a replacement rule that deletes x-symbols at the end of a string compiled into a transducer, and the result of subsequently converting that transducer to a standard automaton representing a 2-tape layout in the encoding used here. In other words, we can rely on existing algorithms to build phonological transducers, and only convert them to 2-tape automata before multitape composition.
Figure 1: Illustration of a replacement-rule encoded as a transducer (left) and subsequently converted to a 2-tape automaton using the encoding presented here
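As a rough illustration of flattening, the following sketch represents a transducer as a plain list of transitions and splits each x:y transition into two single-symbol transitions through a fresh intermediate state. The data layout, the function names, and the hand-built toy transducer (which deletes one word-final x over the alphabet {a, x}, in the spirit of the rule in Figure 1) are all assumptions made for this example; they do not reproduce the foma data structures.

```python
HARD_ZERO = "□"


def flatten(transitions, start, finals):
    """Turn (src, in_sym, out_sym, dst) transitions into a single-tape automaton.

    The empty string stands for epsilon and is replaced by the hard zero.
    """
    flat = []
    for i, (src, a, b, dst) in enumerate(transitions):
        mid = ("mid", i)  # fresh state between the two halves of the pair
        flat.append((src, a or HARD_ZERO, mid))
        flat.append((mid, b or HARD_ZERO, dst))
    return flat, start, set(finals)


def accepts(fsa, s):
    """Simulate the (possibly nondeterministic) flattened automaton on s."""
    flat, start, finals = fsa
    current = {start}
    for sym in s:
        current = {dst for (src, lab, dst) in flat if src in current and lab == sym}
    return bool(current & finals)


# Toy transducer: state 1 means "an x was kept, so more input must follow";
# state 2 means "the final x was deleted".
TOY = [
    (0, "a", "a", 0), (0, "x", "x", 1), (1, "x", "x", 1),
    (1, "a", "a", 0), (0, "x", "", 2), (1, "x", "", 2),
]
two_tape = flatten(TOY, start=0, finals={0, 2})
print(accepts(two_tape, "aax□"))  # ax over a
print(accepts(two_tape, "aaxx"))  # ax over ax
```

The first string pairs the input ax with the output a and lies in the flattened language; the second would leave a word-final x untouched and does not.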

multitape composition
The overall usefulness of converting transducers to 2-tape automata, and then combining a number of individual such 2-tape automata by composition, is illustrated in Figure 2. With this representation, all the intermediate representations which would normally be destroyed in a series of compositions can be preserved. As will be seen below, if such a strategy is augmented with the possibility of adding decoration and comment symbols to the individual tapes, very user-friendly grammars for parsing and generation can be developed.
Interestingly, a generic multitape composition algorithm in this representation can be encoded entirely algebraically, which is to say, as regular expressions. Given two multitape automata, A and B, encoded as above, representing m and n tapes respectively, the core idea is to break down their composition into a two-step process, which yields an (m + n − 1)-tape representation of the composite. Informally, this multitape composition process for any m-tape and n-tape automata in the representation at hand can be described as follows:
1. Force automata A and B to have the same number of tapes (m + n − 1), either by inserting columns of empty (□) symbols followed (in A) or preceded (in B) by arbitrary symbols, or by retaining the original columns in A and B but inserting arbitrary symbols after each column (in A) or before each column (in B).
Input: A = FSM with m tapes, B = FSM with n tapes
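The effect of the padding-and-matching idea can be illustrated on individual paths. The sketch below is not the algebraic construction over automata described in this paper: it enumerates, for one path of A and one path of B given as lists of column tuples, every way of aligning them so that the last tape of A agrees with the first tape of B. The function name `compose_paths` and the column-tuple layout are assumptions made for this example.

```python
HARD_ZERO = "□"


def compose_paths(a, b, m, n):
    """Yield every (m + n - 1)-tape alignment of path a (m-tuples) with path b (n-tuples)."""
    if not a and not b:
        yield []
        return
    # Both paths advance: A's last tape must agree with B's first tape.
    if a and b and a[0][-1] == b[0][0]:
        for rest in compose_paths(a[1:], b[1:], m, n):
            yield [a[0] + b[0][1:]] + rest
    # Only A advances: it writes nothing on the shared tape, so B's tapes get blanks.
    if a and a[0][-1] == HARD_ZERO:
        for rest in compose_paths(a[1:], b, m, n):
            yield [a[0] + (HARD_ZERO,) * (n - 1)] + rest
    # Only B advances: it reads nothing from the shared tape, so A's tapes get blanks.
    if b and b[0][0] == HARD_ZERO:
        for rest in compose_paths(a, b[1:], m, n):
            yield [(HARD_ZERO,) * (m - 1) + b[0]] + rest


def to_string(columns):
    """Flatten aligned columns back into the interleaved single-tape string."""
    return "".join("".join(col) for col in columns)


# abc -> ac (b deleted), then ac -> ac (identity): a single 3-tape result.
A = [("a", "a"), ("b", HARD_ZERO), ("c", "c")]
B = [("a", "a"), ("c", "c")]
for path in compose_paths(A, B, 2, 2):
    print(to_string(path))
```

When both paths contain blanks on the shared tape, the same enumeration returns several equivalent alignments, which is the redundancy that path filtering is meant to remove.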

Path filtering
A well-known problem of standard composition algorithms for transducers also carries over to the multitape representation; this is the problem of producing multiple alternate paths in the resulting transducer when epsilon-symbols are present (ε-multiplicity). The cause of this is that there exist many equivalent paths that yield the same transduction: e.g. a:ε • ε:b can be represented as a:b, a sequence a:ε ε:b, or a sequence ε:b a:ε. Figure 4 illustrates different but equivalent outputs for the composition of two multitape automata. None of the multiple paths for describing a relation are incorrect, but the inconvenience of handling the possibility of multiple equivalent parses or generations motivates an attempt to provide unambiguous paths for each composition during the process itself. Furthermore, in a weighted automaton/transducer scenario (which we will not specifically deal with here) use of a non-idempotent semiring can yield incorrect results if multiple paths are not filtered out.
The common solution in the classical transducer domain is to either design a separate filter transducer that serves to prefer some specific order of epsilon-interleaving (Mohri et al. 2002) or to incorporate this filter mechanism directly into the composition algorithm (Hulden 2009a). In the multitape case, however, this filtering mechanism can be encoded entirely as a regular language filter which disallows certain interleavings of epsilon-symbols in the string representation, in particular those where an x:□-transition (when automaton A has an epsilon on the last tape in some position) immediately follows or precedes a □:y-transition (when automaton B inserts a symbol on its first pair of tapes). This regular expression (Filter) can then simply be intersected with the output of the earlier algorithm to remove redundant paths in the composition (shown in lines 7-11 in the algorithm).
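A much simplified version of this idea, restricted to the 3-tape case and to paths given as column tuples, can be sketched as follows. The predicate names are hypothetical, the check only covers the adjacency condition stated above, and it is not claimed to reproduce the full Filter expression or to leave exactly one path in every case.

```python
HARD_ZERO = "□"


def a_only(col):
    """A consumed a symbol but wrote nothing on the shared tape; B is idle."""
    return col[0] != HARD_ZERO and col[1] == HARD_ZERO and col[2] == HARD_ZERO


def b_only(col):
    """B produced a symbol without reading from the shared tape; A is idle."""
    return col[0] == HARD_ZERO and col[1] == HARD_ZERO and col[2] != HARD_ZERO


def passes_filter(path):
    """Reject paths where an A-only column is adjacent to a B-only column."""
    for left, right in zip(path, path[1:]):
        if (a_only(left) and b_only(right)) or (b_only(left) and a_only(right)):
            return False
    return True


# The three equivalent ways of writing a:ε composed with ε:b on three tapes.
candidates = [
    [("a", HARD_ZERO, "b")],
    [("a", HARD_ZERO, HARD_ZERO), (HARD_ZERO, HARD_ZERO, "b")],
    [(HARD_ZERO, HARD_ZERO, "b"), ("a", HARD_ZERO, HARD_ZERO)],
]
survivors = [p for p in candidates if passes_filter(p)]
print(survivors)
```

Of the three candidates, only the path that merges both steps into one column passes the check.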

Algorithm details
Algorithm 1 essentially reiterates the above, with a few details worth mentioning. In lines 1-4, constants that perform the insertion and padding are declared. Lines 5 and 6 create the transducers A_extend and B_extend. Lines 7-11 create the filter automaton, which is independent of A and B, and the three-element intersection at line 12 yields the result of the final composition.

composition in grammars
The composition algorithm is the only extension needed to retain all the intermediate information in an ordered rewrite-rule grammar.One can simply convert any individual transducers to a multitape representation and proceed with the composition, yielding a multitape representation of the same grammar.Parsing and generation of a string s can be performed by creating a padded multitape automaton where either the underlying representation or the surface representation is in place, with arbitrary symbols present on the other tapes.This multitape automaton can then be intersected with the grammar G, yielding a string representation of the set of legal parses or generations, with their intermediate representations intact.
That is to say, if we have an n-tape automaton grammar G and want to parse a string s, we can convert the string to an automaton that accepts that string (ignoring possible intervening blanks □), pad the automaton to match the number of tapes in G (making sure s is on the last tape), and then intersect with G. The padding operation may be performed by the standard method of composing with a transducer that inserts the right amount of arbitrary symbols, and then extracting the range of the transducer.

Parse(s, G)
Likewise, to generate, we may perform the same calculation with the padding done in such a manner that s is on the first tape.
Again, these functions are intended to make the system transparent to the user so that no knowledge of the actual multitape representation is needed to design and apply grammars.
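The outcome of the pad-and-intersect procedure can be illustrated on a finite set of interleaved strings standing in for the grammar G. In the sketch below, selecting the strings whose last (or first) tape spells out s once blanks are ignored plays the role of the intersection with the padded automaton. The function names and the toy 3-tape grammar, built from the hypothetical rule sequence b -> x followed by x -> 0, are assumptions made for this example.

```python
HARD_ZERO = "□"


def tape(path, k, n):
    """Content of tape k in an n-tape interleaved string, ignoring blanks."""
    return path[k::n].replace(HARD_ZERO, "")


def parse(surface, grammar_paths, n):
    """Keep the paths whose last tape spells the given surface form."""
    return [p for p in grammar_paths if tape(p, n - 1, n) == surface]


def generate(underlying, grammar_paths, n):
    """Keep the paths whose first tape spells the given underlying form."""
    return [p for p in grammar_paths if tape(p, 0, n) == underlying]


# Toy 3-tape grammar over a handful of words.
G = ["aaabx□ccc", "aaaxx□ccc", "aaaccc", "aaadddccc"]

for p in parse("ac", G, 3):
    print([tape(p, k, 3) for k in range(3)])
print(generate("abc", G, 3))
```

Parsing ac returns three paths with the first tapes abc, axc, and ac, each with its intermediate form intact; generating from abc returns the single path through axc to ac.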

Adding intermediate information
It was hinted above that annotating the effect of various transducers is a very useful feature (as seen in Table 1) for debugging or phonological analysis. Incorporating such information can be done separately from the multitape encoding; that is, one can first incorporate the desired decorative information in a standard transducer and then perform the conversion to a multitape representation, retaining the decoration. For morphophonological processes, it suffices to modify the transducers that encode the relevant replacement rules in such a way as to add information about each process. In most cases, this would only entail naming the process in question. Such an annotation mechanism can be added separately to each rule transducer before converting it to a 2-tape representation.

Decoration example
In the examples below, each alternation rule transducer is augmented with a textual description of that rule.This allows us to pair up rule descriptions with rules, so that when parsing or generating with a multitape automaton, informative descriptions will appear for each rule in a chain of compositions.In essence, this allows for the inclusion of comments whenever a phonological alternation rule fires, similar to those given in Table 1.
For example, a rule that deletes the latter of consecutive vowels can be encoded as follows as a rule-description pair: and would have the following effect on input words (a) papiin and (b) papi, respectively, when generating words: making it clear to the user that this particular rule applies at that point in the derivation.
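The general idea of a rule-description pair can be sketched in plain Python on aligned columns. This is only an illustration under stated assumptions: the rule is a hand-written function rather than a compiled rewrite-rule transducer, the description is appended as one multi-character symbol at the end of the output tape whenever the rule has changed something, and the names `delete_second_vowel`, `decorate`, and the label `[V-deletion]` are hypothetical.

```python
HARD_ZERO = "□"
VOWELS = set("aeiou")


def delete_second_vowel(word):
    """Return aligned (input, output) columns; a vowel right after a vowel is deleted."""
    columns = []
    previous_is_vowel = False
    for ch in word:
        is_vowel = ch in VOWELS
        columns.append((ch, HARD_ZERO if is_vowel and previous_is_vowel else ch))
        previous_is_vowel = is_vowel
    return columns


def decorate(columns, description):
    """Append the description on the output tape only if the rule changed something."""
    fired = any(upper != lower for upper, lower in columns)
    return columns + [(HARD_ZERO, description)] if fired else columns


def show(columns):
    """Print one tape per line, symbols separated by spaces."""
    for symbols in zip(*columns):
        print(" ".join(symbols))


show(decorate(delete_second_vowel("papiin"), "[V-deletion]"))
show(decorate(delete_second_vowel("papi"), "[V-deletion]"))
```

For papiin the second i is aligned with a blank and the label is attached to the output tape; for papi the two tapes are identical and no label is added.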

implementation
As the foma tool has existing Python bindings that can be used to call the underlying standard algorithms for manipulating automata and transducers, providing an extension to that library becomes a matter of implementing the above algorithms. The multitape encoding has been implemented as a standard Python class that (1) provides a multitape automaton data type MTFSM and (2) can perform composition together with rule decoration on arbitrary transducers. This allows for a certain level of transparency in the bookkeeping needed. For example, the information about how many tapes are encoded in an FSM is auxiliary information that it is necessary to store during a composition process, since the multitape encoding does not inherently contain this information. The interface to the foma formalism allows for automatic conversion of transducers to 2-tape automata, which may then be incrementally composed to yield representations with multiple tapes.
In effect, designing a complete grammar does not require the user to possess knowledge about or keep track of the underlying machinery, such as the number of tapes used, the padding performed, etc. Even the padding symbols, though helpful for debugging individual rules, can be omitted from the output as they are only used internally to produce a consistent alignment of different-length strings.
For example, to simply compose two rules, without any decoration, the user may enter arbitrary regular expressions (in this example rewrite rules) which automatically convert to two-tape representations that can be composed and inspected:
Entire grammars can be compiled through a separate and more involved mtgrammar module. This module allows for the type of rule decoration described above, and provides for a method of composing the different multitape automata in order, as well as parsing and generation functionality:
printparses('ac', G, dir='up')
Here, two rewrite rules are compiled, converted automatically to multitape automata through the compilemt statement and composed in the order given. After this, the resulting 3-tape automaton is used to parse the word ac in the "upward" direction, that is, assuming that the string is on the output tape. This produces the three aligned outputs:
Here, we see that there are three ways the two phonological rules in question could produce the output ac: by starting from the underlying forms abc, axc, and ac, respectively. The blanks are automatically positioned in their correct positions without the user having to specify anything except the input string to be parsed and the direction of parsing (up = from surface form to underlying form, down = from underlying form to surface form).

Illustrative example 1: phonology (Lardil)
Returning now to the original Lardil example: annotating replacement rules with additional descriptive symbols to be inserted at the ends of strings every time a rule fires, in combination with the multitape composition mechanism, allows us to essentially automatically replicate the linguist-friendly representation given in Table 1. The following snippet illustrates some key points in the design of such grammars:
automatically added to the right end of each tier, as illustrated in two different parses below:
In the generation direction, the same procedure applies, and the library offers an up/down parameter to control for the direction of operation; a command printparses(u'putuka[Uninflected]', G, dir='down') in the above would have produced the same output as the example on the right-hand side.

Illustrative example 2: Historical Linguistics (Proto-Indo-European)
As alluded to above, another scenario where intermediate, possibly annotated strings provide important information is in the modeling of historical sound change by finite-state means. In the development of models of diachronic sound change, this makes it possible to provide annotated parses from modern variants to proto-language forms given hypothesized chronological sound changes. The following parses show the behavior of an ordered set of rewrite rules in multitape form that model the path of sound changes from Proto-Indo-European (PIE) to German and Latin. The relevant rules are implemented as rewrite transducers as in the Lardil example above. Here, we see the parsing of the Latin form for the word "father", pátēr, as well as the German form fā tɐr, using two different grammars that share part of the rewrite rules (the early sound changes affecting both). Both correspond to the underlying, hypothesized PIE form ph₂tḗrs. The relevant sound changes in this grammar were modeled following Beekes (2011); Trask (1996). As opposed to synchronic phonological grammars, the chains of sound changes over long periods can grow quite extensive. For example, the German surface form is subject to a number of them: first, a sound change called Szemerényi's law deleting coda fricatives takes place, followed by a process of laryngeal vocalization (see note 3), Grimm's and Verner's Laws, a stress shift, as well as a number of processes that affect vowels. The multitape parse in this case illustrates the value of such a design in checking correctness of very complex sequences of sound changes. Such sequences could plausibly be generated in the chronological direction through non-finite-state means, but the direction of interest for the linguist is generally the inverse one: parsing from surface form to underlying form, which is what is calculated here.
More advanced usage scenarios can also be explored with the method through more complex intersections of individual tapes in multitape representations for different languages. For example, having postulated a sequence of sound changes that two modern languages have undergone from the proto-language, we can calculate the set of possible proto-forms for some modern cognates x and y in two languages. In the above parses of "father", only a single parse per cognate is given, since we have included the postulated proto-form in the grammar. There might, however, exist other plausible PIE forms that fit the sequence of sound changes. For example, removing the proto-form from the grammar yields two plausible parses in the intersection of Latin and German, patḗr and patḗrs. Such techniques can be extended to a larger scale to support the endeavor of verifying the consistency of postulated sound changes, with the possibility of immediate feedback when minor changes are made in the various sound laws.
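The intersection of proto-form sets can be sketched by brute force over a finite candidate list, as a stand-in for the finite-state calculation described above. Everything in the block is an assumption made for illustration: the rule sets are toy simplifications rather than the grammar used in this paper, the second chain is cut off after Grimm's and Verner's laws (so its target is an intermediate Germanic-like string, not the attested German form), and `apply_rules`, `proto_candidates`, and `CANDIDATES` are hypothetical names.

```python
import re

# Toy rule sets (assumptions for illustration, not the paper's grammar).
SHARED = [
    (r"(?<=r)s$", ""),  # Szemerényi-style loss of final s after r
    (r"h₂", "a"),       # laryngeal vocalization
]
LATIN = SHARED + [
    (r"a(.*)ḗ", r"á\1ē"),  # stress shift to the first syllable
]
GERMANIC = SHARED + [
    (r"p", "f"),       # Grimm's law (p > f)
    (r"t", "θ"),       # Grimm's law (t > θ)
    (r"θ(?=ḗ)", "ð"),  # Verner's law before the stressed vowel
]


def apply_rules(form, rules):
    """Apply an ordered list of (pattern, replacement) rules to a form."""
    for pattern, repl in rules:
        form = re.sub(pattern, repl, form)
    return form


def proto_candidates(surface, rules, candidates):
    """Candidates that the rule chain maps onto the given surface form."""
    return {c for c in candidates if apply_rules(c, rules) == surface}


CANDIDATES = ["ph₂tḗrs", "ph₂tḗr", "patḗrs", "patḗr", "pitḗr"]
latin = proto_candidates("pátēr", LATIN, CANDIDATES)
germanic = proto_candidates("faðḗr", GERMANIC, CANDIDATES)

# Proto-forms compatible with both daughter forms under the toy rules.
print(sorted(latin & germanic))
```

Under these toy rules, four of the five candidates survive in both sets and only pitḗr is rejected. The count differs from the result reported above because the rule sets and the candidate list are not those of the actual grammar. A finite-state implementation would compute the same kind of intersection over an unbounded candidate set rather than an enumerated list.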
³ Laryngeals are abstract segments proposed to have been present in Proto-Indo-European (De Saussure 1879) but to have later disappeared, leaving behind different vowel qualities and compensatory lengthening. The laryngeals are commonly labeled *h₁, *h₂, and *h₃, and *H is used as a cover symbol for all three.
conclusion
This paper has presented a general, automatic method for extending finite-state grammars in the composed rewrite-rule tradition. The method in effect replaces the use of transducers with multitape automata, which are shown to have the capacity to provide rich parses and to support elaborate annotation of intermediate forms. Existing algorithms for constructing transducers from rewrite-rule specifications can still be used, with the resulting transducers then converted to multitape representations. We can also take advantage of specialized string-rewriting and constraint systems to handle syllabification (Hulden 2006), Semitic interdigitation (Beesley and Karttunen 2000), and, with some caution, unification features such as flag diacritics to model long-distance dependencies (Beesley 1998). Potentially, steps in candidate removal in Optimality Theoretic grammars could also be implemented by incorporating proposals to model such processes by finite-state composition (Karttunen 1998; Gerdemann and van Noord 2000; Gerdemann and Hulden 2012).
The model itself assumes little machinery beyond the ability to compose the resulting multitape automata, but offers a way to produce rich representations of grammars constructed in this vein. If desired (for memory efficiency reasons), the resulting multitape automata can still be re-converted to transducers by eliminating the intermediate representations. This offers the possibility to only use the multitape representation for debugging purposes, if the final intent is to produce a simpler underlying-to-surface mapping or vice versa.
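What eliminating the intermediate representations amounts to can be sketched on a finite set of parses. This is an illustration of the projection only, under the assumption that each multitape parse is stored as a tuple of tier strings ordered from underlying to surface; the helper `to_transducer_pairs` and the toy data are hypothetical, and the sketch does not operate on automata.

```python
def to_transducer_pairs(parses):
    """Drop intermediate tiers, keeping only (underlying, surface) pairs."""
    return {(tiers[0], tiers[-1]) for tiers in parses}


# Toy multitape parses: (underlying, intermediate, surface).
parses = {
    ("putuka", "putuk", "putu"),
    ("abc", "abd", "xbd"),  # two derivations with the same endpoints
    ("abc", "xbc", "xbd"),  # but different intermediate tiers
}

pairs = to_transducer_pairs(parses)
print(sorted(pairs))
```

The two toy derivations of abc collapse into a single pair, which shows why the projected mapping can be smaller than the multitape representation, and also why the derivation history cannot be recovered once the projection has been made.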
The above techniques may be useful for applications outside standard designs of morphophonological grammars. In modeling historical sound changes, for example, 'debugging' problems similar to those in phonology and morphology tend to arise, much exacerbated by the fact that one is often dealing with multiple languages at the same time. Keeping track of hundreds of proposed sound laws together with their effect on lexical items across languages is a task that is well suited for the type of modeling presented in this paper.
Although the application focus of this paper has been on modeling traditional non-probabilistic grammars, the methods presented above, the composition algorithm in particular, are also adaptable to weighted automata.

Figure 2: Workflow for converting rewrite rule specifications to transducers, then 2-tape automata, then n-tape automata

Figure 3: Illustration of multitape composition: the shaded areas show possible contents of the original multitape automata A and B, while the remaining areas show the result of insertions to coerce the automata to have the same dimensions and epsilon-behavior before intersection of A and B

Figure 4: Composition of automata A and B, illustrating different alignments of epsilon-symbols. This shows composition behavior with respect to two particular configurations in A and B. A subsequent filter, expressed as an automaton, removes all the solutions except the upper leftmost one

Table 1: Interaction of multiple phonological processes in Lardil
