Transforming a GF grammar
This is an explanation of my thoughts behind issue #4.
There is also a GF crash course for Haskell programmers.
An example grammar in English and Swedish is given later in this document. Both languages inflect for Number (and Swedish also for Gender). Now I want to make this grammar recognise sentences with disagreement in Number (while keeping the Gender agreement). Here's how I imagine an automatic transformation could be done.
- Input: a multilingual GF grammar; a parameter to be merged
- Output: a GF grammar where the parameter is conflated to Any
- (Manually) Decide which parameters should be merged. In this example we want to merge Sg and Pl...
param Number = Sg | Pl;
...into a single parameter:
param Number = Any;
- All functions with a linearisation table involving Sg or Pl...
-- English concrete syntax
love = {s = table {Sg => "loves"; Pl => "love"}};
elk = {s = table {Sg => "elk"; Pl => "elks"}};
deer = {s = table {_ => "deer"}};
-- Swedish concrete syntax
love = {s = "älskar"};
elk = {s = table {Sg => "älg"; Pl => "älgar"}; gen = Utr};
deer = {s = table {_ => "rådjur"}; gen = Neu};
...should be split into several functions, one each for Sg and Pl:
-- English concrete syntax
love_Sg = {s = table {Sg => "loves"}};
love_Pl = {s = table {Pl => "love"}};
elk_Sg = {s = table {Sg => "elk"}};
elk_Pl = {s = table {Pl => "elks"}};
deer = {s = table {_ => "deer"}};
-- Swedish concrete syntax
love = {s = "älskar"};
elk_Sg = {s = table {Sg => "älg"}; gen = Utr};
elk_Pl = {s = table {Pl => "älgar"}; gen = Utr};
deer = {s = table {_ => "rådjur"}; gen = Neu};
- The corresponding functions in the abstract GF grammar should also be split.
-- Abstract syntax
love_Sg, love_Pl : Verb;
elk_Sg, elk_Pl : Noun;
Some additional functions in a language also have to be split if they are split in another language. E.g., love has to be split in Swedish because it was split in English:
-- Swedish concrete syntax
love_Sg = {s = "älskar"};
love_Pl = {s = "älskar"};
Note that not all nouns in the example have to be split. E.g., deer does not inflect for Number in either English or Swedish, so it's not necessary to split it.
- Change all occurrences of Sg and Pl into Any:
-- English concrete syntax
a = {s = "a"; num = Any};
love_Sg = {s = table {Any => "loves"}};
love_Pl = {s = table {Any => "love"}};
elk_Sg = {s = table {Any => "elk"}};
elk_Pl = {s = table {Any => "elks"}};
-- Swedish concrete syntax
a = {s = table {Neu => "ett"; Utr => "en"}; num = Any};
elk_Sg = {s = table {Any => "älg"}; gen = Utr};
elk_Pl = {s = table {Any => "älgar"}; gen = Utr};
With these transformation steps, the grammars should be able to recognise things like "all deer loves a elks" == "alla rådjur älskar en älgar".
Now, this is my idea. Issue #4 is about implementing this grammar transformation automatically, as an add-on to GF.
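As a rough illustration of steps 2 to 4, here is a minimal Python sketch on a toy representation. It is not GF code and does not touch the GF compiler. Everything in it is an assumption made for illustration: lexicons are plain dictionaries, a table is a dictionary keyed by Number value (so the wildcard table for deer is written out for both values), the gen field is left out, and the helper names `inflects` and `merge_number` are hypothetical. The sketch also decides to split a function when its forms actually differ, which is one possible reading of "a linearisation table involving Sg or Pl".

```python
NUMBER = ("Sg", "Pl")  # step 1 (manual): the parameter values to merge into Any

# Toy lexicons. A dict is a table over Number; a plain string has no table.
eng = {
    "love": {"Sg": "loves", "Pl": "love"},
    "elk": {"Sg": "elk", "Pl": "elks"},
    "deer": {"Sg": "deer", "Pl": "deer"},  # stands for table {_ => "deer"}
}
swe = {
    "love": "älskar",
    "elk": {"Sg": "älg", "Pl": "älgar"},
    "deer": {"Sg": "rådjur", "Pl": "rådjur"},
}


def inflects(entry):
    """True if the entry is a table whose forms differ between Number values."""
    return isinstance(entry, dict) and len(set(entry.values())) > 1


def merge_number(lexicons):
    """Split inflecting functions per Number value, then rename the key to Any."""
    names = set().union(*lexicons.values())
    # Steps 2 and 3: a function is split everywhere if it inflects in any language.
    to_split = {f for f in names if any(inflects(lex[f]) for lex in lexicons.values())}
    merged = {}
    for lang, lex in lexicons.items():
        new = {}
        for fun, entry in lex.items():
            for n in NUMBER:
                name = f"{fun}_{n}" if fun in to_split else fun
                # Step 4: tables keep one row, keyed by Any; plain strings are kept.
                new[name] = entry if isinstance(entry, str) else {"Any": entry[n]}
        merged[lang] = new
    return merged


merged = merge_number({"eng": eng, "swe": swe})
for lang, lex in merged.items():
    print(lang, lex)
```

In this sketch love is split in Swedish only because it inflects in English, and deer stays a single function in both languages, matching the notes in step 3.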
Important notes:
- GF grammars in general make use of grammar libraries, such as the RGL, so the linearisation types are much more complex than in this example. But that should not change the basic idea of the transformation.
- GF abstract syntax also allows dependent types, but we can assume that the grammar does not use any.
- For simplicity we can assume that both languages have exactly the same Number parameters. I.e., we disallow this for now:
-- English concrete syntax
param Number = Sg | Pl;
-- Russian concrete syntax
param Number = Sg | Pl | Dual;
- GF concrete syntax has a lot of "sugaring" constructions, such as operations, lambdas, let-constructs. All these are compiled away somewhere by the GF compiler, into a "canonical" form. Unfortunately, the GF source code is not very well commented...
Here is the information I've got so far:
What you need to do is to insert a new phase between the typechecker and the backend code generation. If you look at GF.CompileOne.compileSourceModule, then there is this sequence:
generateGFO <=< ifComplete (backend <=< middle) <=< frontend
Your new phase seems to fit between backend and middle.
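Since `<=<` composes from right to left, the rightmost phase runs first. The following Python sketch only illustrates that ordering and where a new phase would sit. It does not call GF, the phases are stand-ins that record their own name, the `ifComplete` wrapper is left out because its behaviour is not described here, and `mergeParams` is a hypothetical name for the new phase.

```python
from functools import reduce


def compose_right_to_left(*phases):
    """Mimic the order of `f <=< g <=< h`: h runs first, then g, then f."""
    return lambda module: reduce(lambda m, phase: phase(m), reversed(phases), module)


def phase(name):
    # Stand-in for a compiler phase: it only appends its name to a trace.
    return lambda trace: trace + [name]


frontend = phase("frontend")
middle = phase("middle")
backend = phase("backend")
generate_gfo = phase("generateGFO")
merge_params = phase("mergeParams")  # hypothetical new phase

compile_module = compose_right_to_left(
    generate_gfo,
    compose_right_to_left(backend, merge_params, middle),  # ifComplete omitted
    frontend,
)

order = compile_module([])
print(" -> ".join(order))
# frontend -> middle -> mergeParams -> backend -> generateGFO
```

Read this way, "between backend and middle" means the new phase sees the output of middle and hands its result to backend.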
abstract Mini = {
  cat S; VP; NP; Verb; Det; Noun;
  fun
    mkS : NP -> VP -> S;
    mkVP : Verb -> NP -> VP;
    mkNP : Det -> Noun -> NP;
    love, hate : Verb;
    a, all : Det;
    elk, deer : Noun;
}
concrete MiniEng of Mini = {
  param
    Number = Sg | Pl;
  lincat
    S = {s : Str};
    NP, Det = {s : Str; num : Number};
    VP, Verb, Noun = {s : Number => Str};
  lin
    mkS np vp = {s = np.s ++ vp.s!np.num};
    mkVP verb np = {s = \\num => verb.s!num ++ np.s};
    mkNP det noun = {s = det.s ++ noun.s!det.num; num = det.num};
    love = {s = table {Sg => "loves"; Pl => "love"}};
    hate = {s = table {Sg => "hates"; Pl => "hate"}};
    a = {s = "a"; num = Sg};
    all = {s = "all"; num = Pl};
    elk = {s = table {Sg => "elk"; Pl => "elks"}};
    deer = {s = table {Sg => "deer"; Pl => "deer"}};
}
concrete MiniSwe of Mini = {
  param
    Number = Sg | Pl;
    Gender = Neu | Utr;
  lincat
    S, NP, VP, Verb = {s : Str};
    Det = {s : Gender => Str; num : Number};
    Noun = {s : Number => Str; gen : Gender};
  lin
    mkS np vp = {s = np.s ++ vp.s};
    mkVP verb np = {s = verb.s ++ np.s};
    mkNP det noun = {s = det.s!noun.gen ++ noun.s!det.num};
    love = {s = "älskar"};
    hate = {s = "hatar"};
    a = {s = table {Neu => "ett"; Utr => "en"}; num = Sg};
    all = {s = \\_ => "alla"; num = Pl};
    elk = {s = table {Sg => "älg"; Pl => "älgar"}; gen = Utr};
    deer = {s = \\_ => "rådjur"; gen = Neu};
}
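To check the claim that the transformed grammars accept "all deer loves a elks" == "alla rådjur älskar en älgar", the linearisation rules above can be mimicked by hand in Python. This is only a sketch under assumptions: records are dictionaries, tables are dictionaries keyed by parameter value, `++` is modelled as joining with a space, and the entries for all are assumed to get num = Any in the same way as a. The tree being linearised is mkS (mkNP all deer) (mkVP love_Sg (mkNP a elk_Pl)).

```python
# Transformed MiniEng: every Number table has the single key "Any".
def mkNP_eng(det, noun):
    return {"s": det["s"] + " " + noun["s"][det["num"]], "num": det["num"]}

def mkVP_eng(verb, np):
    return {"s": {num: verb["s"][num] + " " + np["s"] for num in ("Any",)}}

def mkS_eng(np, vp):
    return {"s": np["s"] + " " + vp["s"][np["num"]]}

all_eng = {"s": "all", "num": "Any"}
a_eng = {"s": "a", "num": "Any"}
deer_eng = {"s": {"Any": "deer"}}
elk_Pl_eng = {"s": {"Any": "elks"}}
love_Sg_eng = {"s": {"Any": "loves"}}

# Transformed MiniSwe: Gender agreement is untouched, Number is always "Any".
def mkNP_swe(det, noun):
    return {"s": det["s"][noun["gen"]] + " " + noun["s"][det["num"]]}

def mkVP_swe(verb, np):
    return {"s": verb["s"] + " " + np["s"]}

def mkS_swe(np, vp):
    return {"s": np["s"] + " " + vp["s"]}

all_swe = {"s": {"Neu": "alla", "Utr": "alla"}, "num": "Any"}
a_swe = {"s": {"Neu": "ett", "Utr": "en"}, "num": "Any"}
deer_swe = {"s": {"Any": "rådjur"}, "gen": "Neu"}
elk_Pl_swe = {"s": {"Any": "älgar"}, "gen": "Utr"}
love_Sg_swe = {"s": "älskar"}

# mkS (mkNP all deer) (mkVP love_Sg (mkNP a elk_Pl)) in both languages.
eng_sentence = mkS_eng(
    mkNP_eng(all_eng, deer_eng),
    mkVP_eng(love_Sg_eng, mkNP_eng(a_eng, elk_Pl_eng)),
)["s"]
swe_sentence = mkS_swe(
    mkNP_swe(all_swe, deer_swe),
    mkVP_swe(love_Sg_swe, mkNP_swe(a_swe, elk_Pl_swe)),
)["s"]

print(eng_sentence)  # all deer loves a elks
print(swe_sentence)  # alla rådjur älskar en älgar
```

Because Number has only the value Any, the selections on det.num and np.num can no longer rule out a combination, while the Swedish selection on noun.gen still picks "en" for the Utr noun.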