<quote>
The word “grammar” evokes, for many people, the idea of useless formal knowledge which is inflicted upon students as a kind of punishment. Language learners usually think they should learn the grammar of the new language but don’t want to. Computer programmers usually don’t look at the formal Backus-Naur grammar of the programming language they are learning or using, and it probably wouldn't help them much if they did, just as it doesn't help (human) language learners to study the grammar of the language they are learning.
In fact, studying the grammar of a language is arguably counterproductive to actually learning it, because it pushes the learner's brain into a “meta” mode of analysis, rather than the reactive mode of listen/repeat/express. Infants never study grammar, and they learn their own language or languages fluently, in the same way that they never consult a dictionary to find the meaning of a word. Despite this, Grammar was one of the three subjects of the medieval European “Trivium”, which were considered of first-order importance. In Greek, gramma means simply “letter” or “something written”; grammata is the plural.
If grammar is not important in language learning, why is it considered so important? My answer is that it represents the structure of textual or linguistic patterns. So it is, essentially, the patterns of the patterns, and that is an interesting idea for people to cogitate on.
In the last few weeks, I have been developing a language which I call
syntagma*. The development and ideas have flowed easily because syntagma*
builds on the ideas and structure of pep* and nom* and the syntagma
compiler uses nom to compile into nom. Because nom* is a language
for recognising or compiling “simple” languages, it gives rise to all sorts
of self-reflectivity and recursive patterns. I have often noted these
before but here are some concrete examples:
/compile.pss )
/tr/nom.toperl.pss and
/tr/nom.torust.pss and others)
pep -f nom.toperl.pss nom.toperl.pss > nom.toperl.pl;
cat nom.toperl.pss | perl nom.toperl.pl > nom.toperl.2nd.pl
diff nom.toperl.pl nom.toperl.2nd.pl
The final diff produces no output because the files “nom.toperl.pl” and “nom.toperl.2nd.pl” have the same content (a nom-to-perl translation script, in perl). This fact seems to me bizarrely profound.
/tr/syntagma.pss )
Here is a concrete example.
# turn brackets into 'literal' parse tokens using class [...] and
# alternation | syntax. literal tokens must be defined here, before
# they can be used in the parse section of the script.
literal: [{}()] | '[' | ']';
# delete all other input (spaces, uppercase letters etc)
delete;
# define the parse rules for nested balanced brackets
# literal tokens can be put in quotes "" or '' and must be if they
# are special characters. Other parse tokens (nest,list)
# are written without quotes.
nest = '(' ')' | '{' '}' | '[' ']' |
       '(' nest ')' | '{' nest '}' | '[' nest ']' |
       '(' list ')' | '{' list '}' | '[' list ']';
# the list parse token is not necessary (I could just write "nest = nest nest;")
# but it may make the semantics of the grammar clearer (a list is not a nest!)
list = nest nest | list nest;
# print a success/fail message at the end of input
eof {
  stack (nest|list) { println "Balanced brackets!"; exit; }
  println "Terribly sorry, but you're not balanced"; exit 1;
}
The code above solves the nested and balanced brackets problem that is often used as an elementary parsing challenge, for the simple reason that it is just beyond the reach of (true) regular expressions, which cannot keep track of arbitrarily deep nesting. The syntagma script is a “recogniser” or “syntax checker” in the sense that it only determines whether the input adheres to the given grammatical rules; it doesn't try to translate the input into some different output, but it could. The code looks to me clear and expressive of the grammatical structure and doesn't have too much unnecessary 'noise' (apart from the comments).
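To make the grammar rules a little more tangible, here is a rough Python sketch of how the same recognition could be done by hand with a token stack: shift one bracket at a time, then reduce the top of the stack with the nest and list rules until nothing more applies. This is only my illustration of the rules as written in the script; it is not how syntagma or nom work internally, and the names (PAIRS, reduce_once, is_balanced) are invented for the example.

```python
# A sketch only: mirrors the grammar rules of the syntagma script with a stack.
# All names here are hypothetical; none of them come from syntagma or nom.

PAIRS = {"(": ")", "{": "}", "[": "]"}
BRACKETS = set(PAIRS) | set(PAIRS.values())


def reduce_once(stack):
    """Apply one grammar rule to the top of the stack; return True if one fired."""
    # nest = '(' ')' | '{' '}' | '[' ']'
    if len(stack) >= 2 and PAIRS.get(stack[-2]) == stack[-1]:
        stack[-2:] = ["nest"]
        return True
    # nest = '(' nest ')' | '(' list ')' (and the same for '{' '}' and '[' ']')
    if (len(stack) >= 3 and stack[-2] in ("nest", "list")
            and PAIRS.get(stack[-3]) == stack[-1]):
        stack[-3:] = ["nest"]
        return True
    # list = nest nest | list nest
    if len(stack) >= 2 and stack[-2] in ("nest", "list") and stack[-1] == "nest":
        stack[-2:] = ["list"]
        return True
    return False


def is_balanced(text):
    """Return True if the brackets in text are nested and balanced."""
    stack = []
    for ch in text:
        if ch not in BRACKETS:
            continue  # like "delete;": ignore everything that is not a bracket
        stack.append(ch)  # shift the literal token
        while reduce_once(stack):  # reduce for as long as a rule applies
            pass
    # like the eof block: succeed only if a single nest or list remains
    return stack in (["nest"], ["list"])


if __name__ == "__main__":
    sample = "{}[{(){}[()]}](text ignored)([()]{{}})"
    if is_balanced(sample):
        print("Balanced brackets!")
    else:
        print("Terribly sorry, but you're not balanced")
```

Compared with the syntagma script, the Python version has to spell out the stack handling explicitly, which is exactly the 'noise' that the grammar notation hides. Note that empty input is reported as not balanced here, because no nest or list token is ever produced.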
Now, this syntagma script can be run immediately using the interpreter,
which translates syntagma ↦ perl and then runs the result
with the given input, for example:
echo "{}[{(){}[()]}](text ignored)([()]{{}})" | ./engine.perl.sh -f script.txt
Also, this code can be compiled to nom with any of the following:
pep -f syntagma.pss script.txt > script.pss
cat script.txt | ./syntagma.tonom.pl > script.pss
cat script.txt | ./syntagma.tonom.lua > script.pss
# or use any other language for which a nom translator exists.
Then the nom code can be interpreted with input
pep -f script.pss input.txt; # or pep -f script.pss -i "some input"
Or the nom code can be compiled to another language
pep -f nom.torust.pss script.pss > script.rs
rustc -o script.exe script.rs
cat input.txt | ./script.exe
# or
pep -f nom.toperl.pss script.pss > script.pl
cat input.txt | perl script.pl
# or
pep -f nom.tolua.pss script.pss > script.lua
cat input.txt | lua script.lua
Or do the same for any other language for which a working translator exists in the nom translation folder.
This very technical blog post was essentially about the expressive power
of syntagma* and how it builds on the structure of the nom language to
provide a much more intuitive language for expressing and manipulating textual and
linguistic patterns.