6.863 Lab 1 Part I - Mitchell Berger

6.863 Lab 1 Part I
Mitchell Berger
[Collaborated with Tao B. Schardl]

Question 1

When you generate from the lexical string refer+ing, the result that you get is refering instead of referring. The direct technical problem here is that whenever Kimmo attempts to add the second "r", gemination15 fails. Essentially, the doubling of the letter "r" is only permitted by gemination15 if it is in state 14. However, without a symbol to indicate the stressed syllable in the word, it won't ever do anything aside from staying in state 1 or rejecting. Practically, we need to always provide the stress indicator for these machines to work, and indeed generating from the string re`fer+ing works properly, as shown in this screenshot:

Question 2

The result of recognizing the surface string dogs is `dog+s (Noun(dog)+PL). I added "dog" as a verb by putting the following line in the V_ROOT_NO_PREF section of the lexicon:

`dog    AfterVerb Verb(dog)

The word "dogs" is now recognized both as the form of the noun that it was before, and as a form of the verb, as seen in this screenshot:

Question 3

When the system is asked to recognize flier, it at first appears that the system is making more than one left-to-right pass across the word to find it. However, this is not strictly speaking the case. It is making a single pass across the word, and backtracking as far as necessary when it fails. In this case, it fails after it has gone through most of the word and finds that it needs to backtrack pretty much all the way to the beginning. That is, essentially Kimmo is running a depth-first search. This means that while Lauri Kartunnen's quote claiming that "the recognizer only makes a single left-to-right pass through the string as it homes in on its target in the lexicon" is technically not wrong, there are some cases in which it will appear to be misleading.

The actual issue that causes Kimmo's recognizer to have to backtrack all the way in this case can be identified by stepping through the recognition steps and watching the states that the lexicon is in. Up until the point of backtracking, most of the states hint that it is trying to recognize "flier" as a form of the noun "fly." Beyond the backtracking, most of the lexicon states hint that it is trying to recognize "flier" as a form of the verb "fly." Thus, we may expect to see this sort of behavior when the root of a word may be used as multiple different parts of speech. This behavior will not occur with generation because generation only cares about spelling change rules, and since it doesn't care about parts of speech at all, it makes no use of the lexicon. This is the key way in which it differs from recognition.

Question 4

Question 4.1

After announcing Success, the machine does not actually loop back at all. What is actually happening is just a continuation of the depth-first search that it has been conducting all along. Misunderstanding this is a result of failing to grasp the crucial point that the rules should not be thought of as driving the process of generation to some end; instead, the rules are filters, whose only purpose is to weed out and reject the impermissible words. Kimmo needs to generate all allowable words, and so it must eventually travel all the branches in its depth-first search until each one either results in a complete valid word, or is rejected by one or more of the filters (rules).

Question 4.2

Background from the questions posed in the setup for this question, but not actually asked explicitly in Question 4.2 itself: Regardless of which order the two original rules from rule12.yaml appear in, the output is kampat. The lab asks us to compare this with what we hypothesized, and in fact this exactly matches what I expected to happen, having read the Kimmo documentation that was suggested. However, someone who had not yet familiarized themselves with the way Kimmo worked would have expected the rules to be applied in the opposite order from that order which they appeared to be applied in previously.

This is not the case because Kimmo does not order rules at all; all the rules apply simultaneously, and can reference information in both the underlying lexical representation and the surface form. They don't feed one another; they each simply limit what the outputs from the total system may be. There is, of course, another reason someone with the above hypothesis would be confused... if rule 1 had been applied and then rule 2 had been applied, the expected output would be kammat, but it was actually kampat. Perhaps this is actually an error in the way these rules are ordered in the lab.

Rather than take the rules as originally written in the lab, I felt that the syntax used there was insufficiently expressive in that it doesn't reflect the true two-level nature of the Kimmo engine. Specifically, it doesn't specify whether the various elements of the environments are constrained with respect to the lexical form, the surface realization, or both. Thus, considering the generation we wished to happen, I rewrote the rules in the much more expressive syntax outlined in the Kimmo documentation, and proceeded to compile those into FST tables, discarding the original implementations entirely. In essence, I think the problem was posed in a strange way, and that suggesting that we modify the machines to interact "better" is going about the problem the wrong way - properly written two-level rules should not have to care much about one another, and if the rules are expressed correctly at the outset, this problem shouldn't exist in the first place.

The rules I implemented in my file, corresponding to the functions performed by the original rules, are as follows:

# p:m <= @:m ___

  p realized as m: |
    FSA
        p @ p @
        m m @ @
    1:  1 2 1 1
    2.  1 1 0 1

# N:m <= ___ p:@

  N realized as m: |
    FSA
       N N p @
       m @ @ @
    1: 1 2 1 1
    2. 1 2 0 1

Here is a transcript of a trace that shows the revised system working as desired; it outputs kammat from kaNpat while rejecting kampat.

>>> k=load("rule12mine.yaml")
>>> print list(k.generate('kaNpat', TextTrace(3)))


p realized as m: 1 => 1
N realized as m: 1 => 1

 k
 k
 p realized as m: 1 => 1
 N realized as m: 1 => 1

  ka
  ka
  p realized as m: 1 => 2
  N realized as m: 1 => 1

   kaN
   kam
   p realized as m: 2 => 1
   N realized as m: 1 => 1

    kaNp
    kamm
    p realized as m: 1 => 1
    N realized as m: 1 => 1

     kaNpa
     kamma
     p realized as m: 1 => 1
     N realized as m: 1 => 1

      kaNpat<#>
      kammat<#>
      p realized as m: 1 => 1
      N realized as m: 1 => 1

       kaNpat#
       kammat#
       SUCCESS: kaNpat# <=> kammat#


   kaN

   kam

   p realized as m: 2 => 0
   N realized as m: 1 => 1

['kammat']

Question 4.3

I commented above on some of the reasons that I chose this approach - it reflects the two-level nature of rules in Kimmo, and allows each of them to not really care about the other. This is an excellent characteristic for computational extendibility, because even if we add more rules, they will each be able to remain relatively small and simple, whereas if they had either been merged into a single rule or had each gained more specific state or symbol knowledge related to what the other rule would be doing in each state, the addition of more and more rules would cause the complexity of the rule(s) to grow nonlinearly.

In addition, it's reasonably easy to look at each rule's table in this system and understand the two-level rule it is trying to represent (by looking for the cases that it will reject - it is, after all, really a filter). This means that this method has good linguistic transparency. In a system that either merged rules or incorporated into each rule specialized knowledge about the other rules that it will be coexisting with, this task can become exceptionally hard.

Question 4.4

For the reasons described in my answer to the previous question, I believe that my system would remain at roughly the same level of complexity if there were 20 rules interacting instead of just 2. Of course, the length of the rule file would grow linearly, and the actual two-level rules would need to be thought about with some care so as not to contradict each other. However, given 20 properly written two-level rules, they could each be compiled into a table much as the 2 rules above, each of these tables would be of a sane size, and would be readable to understand what rule it was expressing, and for all its designed-in simplicity, the system would be able to handle considerably complex transformations.