
#*
ABOUT

 This is supposed to be the simplest grammar 'recogniser' for
 the nom* language. The script checks for valid command names 
 and parameter counts but not for abbreviations of commands.
 
 This script returns 0 if the nom syntax is good, otherwise it returns:

 * current exit codes
 -----
 1 general syntax error
 2 bad word or command
 3 parameter error
 4 strange character in script
 ,,,,

 This is quite useful because it can be combined with a bash one-liner
 to check multiple nom .pss script files at once by checking the $? exit
 value. See below for an example.
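
 The exit codes listed above can be turned back into readable messages
 on the shell side. This is only a minimal sketch: nom_exit_message is a
 hypothetical helper name, and the messages are copied from the table
 above rather than from anything that pep itself provides.

```shell
# Hypothetical helper: translate an exit code of this script into
# the matching description from the exit-code table.
nom_exit_message() {
  case "$1" in
    0) echo "syntax ok" ;;
    1) echo "general syntax error" ;;
    2) echo "bad word or command" ;;
    3) echo "parameter error" ;;
    4) echo "strange character in script" ;;
    *) echo "unknown exit code $1" ;;
  esac
}

# usage (assumes pep is on the PATH and grammar.nom.pss is in this folder):
#   pep -f grammar.nom.pss some-script.pss > /dev/null
#   nom_exit_message "$?"
```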

GRAMMAR NOTES

  This script actually uses a slightly different grammar to the 
  translation scripts and the script /eg/nom.syntax.reference.pss .
  I like the parseblock* token because it ensures that only one
  parse> label can exist without using the accumulator.

  I don't use abbreviated commands much and delim; and echar; currently
  have the same abbreviation 'z' ! Also abbreviations for short commands
  like 'a+' seem silly. But these may be useful in a microcontroller
  context. These abbreviations are copied from the pep interpreter 
  using the *pep -C* command.

  A parse> label at the end or start of the script is allowed but
  (I hope) only one.

  This doesn't allow [while 'x';] or [whilenot 'y';] although other
  nom parsers may. This is because the quote syntax is not necessary
  and I prefer to enforce the while [abc]; syntax.

  In /eg/machine.snippet.tohtml.pss and other snippet scripts, the 
  parse> label is within a block, which is probably not a good idea.
  In fact I don't even know if the translation scripts can handle this.
  This script marks that as an error. The parse stack is displayed as
  >> ...{*statementset*parse>*statementset*}* at EOF

TESTING

  Just run any nom script that you know or think is ok through this
  script with 'less' and you can see how everything is parsing.

  * test this script
  >> pep -f grammar.nom.pss some-script.pss | less

  If it looks like this script is misdiagnosing or mis-recognising a 
  nom script you can scroll through less and search for a particular 
  parse-token sequence in order to find the line and character where
  the mis-parsing is occurring. This is a powerful technique.

  * first run without less 
  >> pep -f grammar.nom.pss script-with-syntax-error.pss 

  Now examine the sequence of tokens to see where the parse error 
  is (eg: extra } brace ). Then run the same line but with 'less' 
  and search specifically for that token sequence with '/'
  For example: search for /[^{]\*statementset\*}\*

  This should jump straight to the line and character number where the
  parse error starts to occur.
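
  The same search can be scripted with grep instead of paging through
  less. This is a minimal sketch, assuming the trace lines keep the
  '#(count) line N char M: stack' layout that is printed just after the
  parse> label. The name first_bad_trace is a hypothetical helper and the
  token sequence is the example one from above, so adjust it to the
  sequence you actually see.

```shell
# Hypothetical helper: read the parse trace on stdin and print only the
# first line where a closing brace follows a statementset that is not
# directly preceded by an opening brace token.
first_bad_trace() {
  grep -m 1 '[^{]\*statementset\*}\*'
}

# usage (assumes pep is installed and on the PATH):
#   pep -f grammar.nom.pss script-with-syntax-error.pss | first_bad_trace
```

  The printed line starts with the line and character number, which is
  where the parse error starts to occur.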

  Below is quite a cool way to test all nom scripts in a folder for
  correctness. If a syntax error is detected then the loop breaks and
  prints the script name.

  >> for f in *.pss; do pep -f grammar.nom.pss "$f"; if [ "$?" != "0" ]; then echo "$f"; break; fi; done

  The snippet below is great to syntax check all nom scripts and keep 
  going even if some fail. The 'sed' trick removes the parse-stack 
  information from the output. This can also be disabled by commenting 
  out the 3 lines just after the parse> label in this script.

  * test all nom scripts in a folder and show results
  -------
   for f in *.pss; do 
     echo "$f"; pep -f grammar.nom.pss "$f" | sed '/^#/d' 
   done | less
  ,,,
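
  The two loops above can be combined so that every script is checked
  and its exit code is recorded as well. This is a rough sketch only:
  check_all is a hypothetical function name, and the checker command is
  passed in as the first argument so that nothing about pep is hard-coded.

```shell
# check_all CHECKER FILE...
# Runs "CHECKER FILE" for each file, prints one "code name" line per
# file and returns the number of failing files (shell statuses stop at
# 255, so this is only meaningful for smaller folders).
check_all() {
  checker=$1; shift
  failed=0
  for f in "$@"; do
    # $checker is deliberately unquoted so that "pep -f x.pss" splits
    # into a command and its arguments
    $checker "$f" > /dev/null 2>&1
    code=$?
    echo "$code $f"
    if [ "$code" -ne 0 ]; then failed=$((failed + 1)); fi
  done
  return "$failed"
}

# usage (assumes pep is on the PATH):
#   check_all "pep -f grammar.nom.pss" *.pss
```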

HISTORY

 12 may 2026 
   continued testing and debugging. This script seems very useful
   to check a large number of nom scripts for syntactical correctness.
   fix: need to transform abbreviations into long names so that param
   checks will work?
   
 11 may 2026 
   began this script, and apparently the script is working and 
   returns an exit code according to the type of error. I could
   expand these exit codes to be more comprehensive, but I would
   like to keep the script simple. The script took about 2 hours to
   write. Script is about 200 lines.

*#

 read; put; 
 "\n" { clear; nochars; !(eof) { .restart } .reparse }
 "#".(eof) { clear; !(eof) { .restart } .reparse }
 "#" { 
   read; "#\n" { clear; } "#*" { until "*#"; } !B"#*".!"" { until "\n"; }
   put; clear; add "comment*"; push; .reparse 
 }
 "[" { until "]"; put; clear; add "class*"; push; .reparse }
 "'" { until "'"; put; clear; add "quoted*"; push; .reparse }
 '"' { until '"'; put; clear; add "quoted*"; push; .reparse }
 [{};EB!,.] { 
   # count braces as a debug help 
   "{" { a+; } "}" { a-; } add "*"; push; .reparse
 } 

 [abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ+-0><()=] {
   while [abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ+-0><()=];
   put; clear; add "word*"; push; .reparse 
 }

 [:space:] { while [:space:]; clear; } 
 !"" { 
   zero; a+; a+; a+; a+; clear; 
   add "[nom bad char: "; get; add "]\n";
   add "[exit code="; count; add "]\n"; print;
   clear; quit; 
 }

parse>

 # watch the parse-stack resolve. also include a brace counter
 add "#("; count; add ") line "; lines; 
 add " char "; chars; add ": "; print; clear; 
 unstack; print; stack; (eof) { add " EOF"; } add "\n"; print; clear;

 #-----------------------
 # 1 token reductions
 pop;

 # add a dummy parseblock to simplify parse rules if the script 
 # ends with parse> (which may be silly but should not be an error)
 "parse>*".(eof) { clear; add "parseblock*"; push; .reparse }

 # ignore comments
 "comment*" { clear; .reparse }
 "word*" {
   clear; get;

   # abbreviations and commands 
   "a+","+","a-","-","++",">","--","<",
   "add","a","clip","k","clop","K","replace","D",
   "upper","u","lower","U","cap","A","clear","d",
   "print","t","state","S","pop","p","push","P",
   "unstack","U","stack","u","put","G","get","g","swap","x",
   "mark","m","go","M","system","X","read","r","until","R",
   "while","w","whilenot","W","count","n","zero","0",
   "chars","c","lines","l","nochars","C","nolines","L",
   "escape","^","unescape","V","echar","z","delim","z",
   "quit","q","write","s","writefile","f","appendfile","F","nop","o" {
     clear; add "command*"; push; .reparse
   }

   "(==)","<==>","(eof)","(EOF)","<eof>","<EOF>" {
     clear; add "test*"; push; .reparse
   }

   "parse>","begin","reparse","restart" { add "*"; push; .reparse }

   clear; zero; a+; a+; 
   add '[nom unknown word: "'; get; add '" line:"'; lines; add ']\n'; 
   add "[exit code="; count; add "]\n"; print; quit;
 }

 #-----------------------
 # 2 token reductions
 pop;
 "E*quoted*","B*quoted*","!*quoted*" { clear; add "test*"; push; .reparse } 
 "!*class*" { clear; add "test*"; push; .reparse } 
 "!*test*" { clear; add "test*"; push; .reparse } 

 "statement*statement*" { clear; add "statementset*"; push; .reparse }
 "statementset*statement*" { clear; add "statementset*"; push; .reparse }

 (eof)."parse>*statementset*" { clear; add "parseblock*"; push; .reparse }
 (eof)."parse>*statement*" { clear; add "parseblock*"; push; .reparse }

 #-----------------------

 # parameter error
 "command*quoted*" {
   clear; get; 
   !"add".!"a".!"replace".!"D".!"mark".!"m".!"go".!"M".!"until".!"R".
   !"delim".!"z".!"escape".!"^".!"unescape".!"V".!"echar".!"z" {
     zero; a+; a+; a+; clear; 
     add "[nom parameter syntax error: line "; lines; add "]\n"; 
     add "[exit code="; count; add "]\n"; print;
     quit; 
   }
   clear; add "command*quoted*";
 }

 # parameter error
 "command*class*" {
   clear; get; !"while".!"w".!"whilenot".!"W" {
     zero; a+; a+; a+; clear; 
     add "[nom parameter syntax error: line "; lines; add "]\n"; 
     add "[exit code="; count; add "]\n"; print;
     quit; 
   }
   clear; add "command*class*";
 }

 # parameter error
 "command*;*" {
   clear; get; 
   !"a+".!"+".!"a-".!"-".!"zero".!"0".!"++".!">".!"--".!"<".
   !"clip".!"k".!"clop".!"K".!"upper".!"u".!"lower".!"U".
   !"cap".!"A".!"clear".!"d".!"print".!"t".!"state".!"S".
   !"pop".!"p".!"push".!"P".!"unstack".!"U".!"stack".!"u".
   !"put".!"G".!"get".!"g".!"swap".!"x".!"mark".!"m".!"go".!"M".
   !"system".!"X".!"read".!"r".!"count".!"n".
   !"chars".!"c".!"lines".!"l".!"nochars".!"C".!"nolines".!"L".
   !"quit".!"q".!"write".!"s".!"nop".!"o" {
     zero; a+; a+; a+; clear; 
     add "[nom parameter syntax error: line "; lines; add "]\n"; 
     add "[exit code="; count; add "]\n"; print;
     quit; 
   }
   clear; add "command*;*";
 }

 "command*;*" { clear; add "statement*"; push; .reparse }
 ".*reparse*",".*restart*" { clear; add "statement*"; push; .reparse }



 #-----------------------
 # 3 token reductions
 pop;

 # check for parameters?
 "command*quoted*;*" { clear; add "statement*"; push; .reparse }
 "command*class*;*" { clear; add "statement*"; push; .reparse }

 "{*statement*}*" { clear; add "{*statementset*}*"; }

 "quoted*,*test*" { clear; add "ortest*"; push; .reparse }
 "class*,*test*" { clear; add "ortest*"; push; .reparse }
 "class*,*class*","quoted*,*class*" { clear; add "ortest*"; push; .reparse }
 "test*,*class*","ortest*,*class*" { clear; add "ortest*"; push; .reparse }
 "test*,*quoted*","ortest*,*quoted*" { clear; add "ortest*"; push; .reparse }
 "class*,*quoted*","quoted*,*quoted*" { clear; add "ortest*"; push; .reparse }
 "test*,*test*","ortest*,*test*" { clear; add "ortest*"; push; .reparse }

 "quoted*.*test*" { clear; add "andtest*"; push; .reparse }
 "class*.*test*" { clear; add "andtest*"; push; .reparse }
 "class*.*class*","quoted*.*class*" { clear; add "andtest*"; push; .reparse }
 "test*.*class*","andtest*.*class*" { clear; add "andtest*"; push; .reparse }
 "test*.*quoted*","andtest*.*quoted*" { clear; add "andtest*"; push; .reparse }
 "class*.*quoted*","quoted*.*quoted*" { clear; add "andtest*"; push; .reparse }
 "test*.*test*","andtest*.*test*" { clear; add "andtest*"; push; .reparse }


 #-----------------------
 # parameter error
 "command*quoted*quoted*" {
   clear; get; !"replace".!"D" { 
     zero; a+; a+; a+; clear; 
     add "[nom parameter syntax error: line "; lines; add "]\n"; 
     add "[exit code="; count; add "]\n"; print;
     quit; 
   }
   clear; add "command*quoted*quoted*";
 }

 #-----------------------
 # 4 token reductions
 pop;

 "begin*{*statementset*}*" { clear; add "beginblock*"; push; .reparse }

 "command*quoted*quoted*;*" { clear; add "statement*"; push; .reparse }
 "word*{*statementset*}*" { clear; add "statement*"; push; .reparse }
 "test*{*statementset*}*" { clear; add "statement*"; push; .reparse }
 "quoted*{*statementset*}*" { clear; add "statement*"; push; .reparse }
 "class*{*statementset*}*" { clear; add "statement*"; push; .reparse }
 "ortest*{*statementset*}*" { clear; add "statement*"; push; .reparse }
 "andtest*{*statementset*}*" { clear; add "statement*"; push; .reparse }

 #-----------------------
 # 5 token reductions
 pop;


 (eof) {

   "parseblock*","beginblock*","statement*","statementset*","script*" {
      clear; add "[nom syntax ok]\n"; print; 
      clear; zero; quit;
   }
   "beginblock*statement*parseblock*","beginblock*statementset*parseblock*" {
      clear; add "[nom syntax ok]\n"; print; 
      clear; zero; quit;
   }
   "beginblock*statement*","beginblock*statementset*" {
      clear; add "[nom syntax ok]\n"; print; 
      clear; zero; quit;
   }
   "beginblock*parseblock*" {
      clear; add "[nom syntax ok]\n"; print; 
      clear; zero; quit;
   }
   "statement*parseblock*","statementset*parseblock*" {
      clear; add "[nom syntax ok]\n"; print; 
      clear; zero; quit;
   }
   unstack; put; zero; a+; clear;
   add "[nom syntax error]\n";
   add "[exit code="; count; add "]\n"; 
   add "[parse stack: "; get; add "]\n"; print;
   quit;
 }
 push; push; push; push; push;

