A set of questions and answers about the ℙ𝕖𝕡 virtual machine and the ℕ𝕠𝕞 scripting language.
If you see something strange and unusual, just enjoy it while you can. Mr. Curly
Yes, it’s simple because pretty-printers (or source code colourisers) don’t normally parse the source code. They only tokenise, which is to say they recognise patterns such as quoted text, command names and punctuation, and format them. An example of a pretty-printer in Nom is /eg/nom.tohtml.pss, or /eg/nom.tolatex.pss, which does the same thing but produces LaTeX rather than HTML.
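To make the "tokenise, don't parse" idea concrete, here is a minimal sketch in Python (not Nom, and not the code in /eg/nom.tohtml.pss). The token classes `quoted`, `word` and `punct` and the function name `to_html` are invented for illustration and are not the real Nom lexical rules.

```python
import html
import re

# Hypothetical token classes, chosen only for this example.
TOKEN_PATTERN = re.compile(
    r'(?P<quoted>"[^"]*")'      # double-quoted text
    r'|(?P<word>[A-Za-z]+)'     # command names and other words
    r'|(?P<punct>[{};,])'       # a little punctuation
    r'|(?P<other>\s+|.)',       # whitespace or any other single character
    re.DOTALL,
)

def to_html(source: str) -> str:
    """Wrap each recognised token in a <span>; no grammar is involved."""
    parts = []
    for match in TOKEN_PATTERN.finditer(source):
        kind = match.lastgroup
        text = html.escape(match.group())
        if kind == "other":
            parts.append(text)  # pass through unstyled
        else:
            parts.append(f'<span class="{kind}">{text}</span>')
    return "".join(parts)

print(to_html('add "x"; print;'))
```

Because the colouriser never builds a parse tree, it cannot know about nesting depth, which is why proper indentation needs a real parser.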
If you want to write something that properly indents and formats the source code then you will need to actually parse the language (ℕ𝕠𝕞 can do that too). Possibly the most difficult task is then deciding where to split lines that are too long. But you are not flying an aeroplane; just do something that doesn't look too bad.
Follow the instructions at www.nomlang.org/download.
I believe so. It can certainly be used to parse and compile context-free languages, but I have not yet written a successful type checker using nom. But it is definitely possible, especially using the nom commands mark and go in their non-parameter versions (this essentially allows the ℙ𝕖𝕡 tape structure to be used as an en.wikipedia.org/wiki/associative_array).
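The associative-array idea can be sketched as a toy model in Python. This is not the ℙ𝕖𝕡 implementation: the class `Tape` and its methods are hypothetical, and the sketch assumes that mark tags the current tape cell and go moves the pointer to the cell carrying a given tag.

```python
class Tape:
    """Toy model of a tape: a list of cells, a pointer, and one mark per cell."""

    def __init__(self, size: int = 8) -> None:
        self.cells = [""] * size
        self.marks = [None] * size
        self.pointer = 0

    def mark(self, tag: str) -> None:
        """Tag the current cell (assumed behaviour of 'mark')."""
        self.marks[self.pointer] = tag

    def go(self, tag: str) -> None:
        """Move to the first cell carrying the tag (assumed behaviour of 'go')."""
        self.pointer = self.marks.index(tag)  # raises ValueError if the tag is unknown

# Use the tape as a symbol table: variable name -> declared type.
tape = Tape()
for name, type_name in [("x", "int"), ("s", "string")]:
    tape.mark(name)
    tape.cells[tape.pointer] = type_name
    tape.pointer += 1

tape.go("s")
print(tape.cells[tape.pointer])
```

A type checker needs exactly this kind of lookup: when a name is used, jump to the cell where it was declared and compare types.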
I have been developing, or trying to develop, ℙ𝕖𝕡 🙵 ℕ𝕠𝕞 for a long time. I first thought of the idea several decades ago, but judging by my notes and work-diaries I didn't really achieve a decent interpreting implementation until around 2019. Since then I have worked on the system off and on. Sometimes I take long breaks from it to do other things, but when I come back to it, it always seems very interesting and powerful.
So the answer is “yes” (as of 2025).
No, it isn't. I believe that almost nobody knows about it, which is why I am writing this website and also this FAQ. I believe the system is extremely interesting and potentially very useful and, what is more, it's free and open source.
You can use it for parsing, translating, compiling or transpiling context-free languages such as Lisp, JSON data, CSS code, palindromes, regular expressions, CSV data, XML and so on. You can use it to create your own mini-languages. You can use it to explore and understand the grammar of languages and patterns without having to code in a full computer language. Nom and Pep can also handle certain types of context-sensitive languages or patterns (XML may be context-sensitive).
A linguist who came up with an interesting classification of formal languages.
The Nom language closely reflects the virtual machine which it manipulates. This machine has a stack, a tape (an array with a pointer) and a workspace buffer, as well as some other minor registers. So, for example, the ++ command increments the pointer to the tape element and the push command pushes one parse token from the beginning of the workspace onto the top of the stack. So once you are familiar with the virtual machine you will find the language not so cryptic.
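The machine described above can be modelled in a few lines of Python. This is only a sketch for building intuition, not the ℙ𝕖𝕡 source: the class `Machine` is hypothetical, the `*` token delimiter is an assumption, and the real commands may have side effects that are not modelled here.

```python
class Machine:
    """Toy model: a stack, a tape with a pointer, and a workspace buffer."""

    def __init__(self, tape_size: int = 8) -> None:
        self.stack = []
        self.tape = [""] * tape_size
        self.pointer = 0
        self.workspace = ""

    def increment(self) -> None:
        """Model of '++': move the tape pointer one cell forward."""
        self.pointer += 1

    def push(self, delimiter: str = "*") -> bool:
        """Model of 'push': move one token from the start of the workspace
        onto the top of the stack. The delimiter is an assumption."""
        end = self.workspace.find(delimiter)
        if end == -1:
            return False  # no complete token in the workspace
        self.stack.append(self.workspace[: end + 1])
        self.workspace = self.workspace[end + 1 :]
        return True

machine = Machine()
machine.workspace = "noun*verb*"
machine.push()
machine.increment()
print(machine.stack, repr(machine.workspace), machine.pointer)
```

Reading a script then becomes a matter of tracking these three structures, one command at a time.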
Having said that, it was actually my intention, when I first thought of pep&nom, to write a more expressive and natural language “on top” of it. In other words, I would use nom to write a compiler for a language which would more closely reflect a kind of BNF grammar.
The page /doc/syntax/nom.syntax.html contains an example of compiling a simple (toy) BNF format into ℕ𝕠𝕞, which can be used as the basis of a more expressive language that compiles into the nom script language.
In my defence, I could say that all languages or mini-languages that are based on virtual machines tend to be cryptic, for example SED, AWK, FORTH etc.
I work mostly on Linux or MacOS, so the tools I use are those available on those machines, such as SED, VIM, the C language and the GCC compiler (for the pep interpreter), although the ℙ𝕖𝕡 interpreter also compiles with TCC and should compile with any standard C compiler.
Nom is not a general-purpose computer language; it is a “domain-specific” computer language designed to recognise and 'translate' text patterns (more formally called context-free languages).
Nom is also important because it highlights that a compiler is really just a map function from one 'language' to another. A formal language is a set of strings in a given alphabet. So a compiler (if we ignore the final stage of converting to binary code) is a map function from one set of strings to another.
language A: { 'and','with','for','to','from' }
--> (maps to)
language B: { 'y','con','para','a','de' }
But, of course, the language map function does not have to be one-to-one. When compilers are written as computer code, this map function is hidden or obfuscated. But when the compiler is written as a text-based map function, we are able to “multiply” map functions (that is, “chain” compilers together to create new compilers). Nom already does this chaining with its translation scripts.
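The "multiplying" of map functions can be illustrated with plain dictionaries in Python. The first map is the language A to language B example above; the second map (Spanish to French) and the helper `compose` are hypothetical additions made only to show the chaining.

```python
# Language A -> language B, taken from the example above.
EN_TO_ES = {"and": "y", "with": "con", "for": "para", "to": "a", "from": "de"}

# A second, hypothetical map used only to demonstrate chaining.
ES_TO_FR = {"y": "et", "con": "avec", "para": "pour", "a": "à", "de": "de"}

def compose(first: dict, second: dict) -> dict:
    """Chain two maps: apply 'first', then 'second', giving one new map."""
    return {source: second[target] for source, target in first.items()}

EN_TO_FR = compose(EN_TO_ES, ES_TO_FR)
print(EN_TO_FR["with"])
```

A real compiler maps an infinite set of strings, so it cannot be written out as a finite table like this, but the composition principle is the same.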
The nom language is still code-oriented rather than data-oriented but it is an important step towards regarding compilers as language map functions.
As far as I know, in February 2025, only me.
No, it is an idea that I feel could be useful and interesting in many software circumstances. I believe, for example, that it could be used to write very small, simple compilers for microcontrollers (or other 'small' machines) that would be capable of running on the microcontroller itself, like FORTH or micro-python.
This is the central text buffer of the ℙ𝕖𝕡 virtual machine. Each part of the machine is documented in /doc/machine/, the workspace buffer is documented at workspace, and the parse-token stack is documented at stack.
Because, like a number of things in the ℙ𝕖𝕡 🙵 ℕ𝕠𝕞 system, it was influenced by the SED stream editor and I believe that the GNU sed program uses similar terminology.
Sed reads the text input-stream line by line and uses regular expression patterns to transform (or edit) the text stream. ℕ𝕠𝕞 can read the input stream character by character, or any other way you wish, and uses grammar rules to transform (compile/transpile/translate) the input-stream.
I think so, but it is not really designed (at the moment) for translating human languages. But it may be useful for linguists to understand how formal grammars function (and their limitations).
There are many complexities to parsing human languages (for example the nexus between semantic considerations and grammatical rules) but one of the more obvious ones is the “ambiguity” of certain sentences: that is, one sentence may have several different possible ways to parse, or initial parsings. Since ℙ𝕖𝕡 🙵 ℕ𝕠𝕞 cannot “backtrack” in the input stream it cannot choose between multiple parsings.
Yes, unfortunately there will be. This is also because I have been working on this by myself, off and on. I just found a bug in the mark and go code where more than one tape cell is marked with the same 'tag'. But generally the system works remarkably well. For a list of known bugs, see the document /doc/pepnom.doc.bugs.html
The reason is that you need to maintain the state of the ℙ𝕖𝕡 machine in your head as you are manipulating it. That is similar to the way that a blindfold chess player maintains the position of the board in his or her head during the game. But with nom you have several tools to 'peek' into the state of the machine during the development and execution of a script. These tools are outlined in the document /doc/howto/nom.debug.html. One simple one is the state command, which prints the state of the machine at the time that the command is executed.