Myrr is a custom Forth-based interpreter which enables the embedding of Forth commands within a text document and outputs HTML formatted files (one of which you are reading right now).
"Data is code." A powerful mantra in an era when data are considered only insofar as their ability to be consumed. Indeed, large language models have taken the exact opposite position: "Code is data". Forth enables orthogonal perspectives in other ways too, the stack-based nature of the language requires the programmer to be hyper-aware of the state of the processes they write. Efficiency and care become unspoken yet obvious requirements of Forth programs. Programmers exposed only to modern languages may never consider the implications of creating, populating, and modifying a dataframe Megabytes in size. Of course, this is isn't a blunder or a misgiving on their part, such nonchalant behavior is encouraged by every language of the 21st century. It makes life easier, and the resources available to even beginners have redefined "efficiency" and "considerate programming" so many times and to such extreme degress that they have become antonyms of their original selves from the mid 1900s.
The beauty of Forth is in it's low-level nature. Being half a step above Assembly predicates the strict memory management and "walking on eggshells" approach to programming described earlier (indeed, if C gives you the ability to shoot yourself in the foot, with Forth you may blow both legs clean off). Yet, a half-step above Assembly means Forth enables you to do almost anything. You can take full advantage of the polymorphic nature of the language and define your own custom syntax, abstracting the language from the most basic of stack manipulations to a subroutine-based code that reads almost like a poem.
I've written a Forth-based interpreter of custom Markdown-like .bth files whose text executes code in the Forth language itself. I wanted to dive deep into a project that fully embraces "Data is code". Indeed, any language can parse Markdown into HTML, but it feels so different when the markdown text is not simply being replaced with strings from a lookup table, but instead is invoking calls to functions which modify the flow of the program and organization of the stack to manipulate the output data. In a way, the input data is writing itself. A sample .bth file might look something like this:
^html ^h2 A beautiful heading ^h2 ^p A paragraph about Forth ^+p Another paragraph about Forth ^/p ^href https://slewis.wiki link to my blog ^/href ^/html
Where the ^
sign informs the Forth program to break out of the continuous character dump to the outfile buffer to execute the Forth word that follow. For example, the program will interpret ^href
as the word href
which itself is defined as outputing the string "<a href="
as well as the following token, https://slewis.wiki
in the example above, then closing the a
element with ">"
. The following text "link to my blog" is then read into the output buffer and the a
element is closed with ^/href
.
A curious self-referencial loop is taking place here. I am describing the program that parsed and wrote this text into the text you are reading which itself can be parsed by the same program. I don't really know what to make of that but it tickles me.
As such, links are easy to embed . You can add images too:
And Forth commands can be nested within themselves in the source text. Hence, you can hotlink images:
To take the previous meta self-references to the extreme, the following is the code necessary to process the .bth sample and is the same code that constructs this page. About 60 lines of Forth.
( configuration ) : srcfile S" myrr.bth" ; : outfile S" myrr.html" ; ( input buffer ) VARIABLE 'src ( address ) VARIABLE #src ( size of buffer ) VARIABLE 'out VARIABLE #out VARIABLE fh : open srcfile R/O OPEN-FILE THROW fh ! ; : close fh @ CLOSE-FILE THROW ; : read BEGIN HERE 4096 fh @ READ-FILE THROW DUP ALLOT 0= UNTIL ; : gulp open read close ; : start HERE 'src ! ; : finish HERE 'src @ - #src ! ; : slurp start gulp finish ; : open outfile R/W CREATE-FILE THROW fh ! ; : write 'out @ #out @ fh @ WRITE-FILE THROW ; : spew open write close ; ( command dispatcher) variable offset variable 'token variable #token : addr offset @ 'src @ + ; : chr addr C@ ; : -ws 32 U> ; : advance 1 offset +! ; : seek begin chr -ws WHILE advance REPEAT ; : token addr seek addr over - advance 2DUP #token ! 'token ! ; : .token 'token @ #token @ TYPE ; : error CR CR .token -1 ABORT" Command was not found" ; : command token sfind IF EXECUTE ELSE error THEN ; ( vectored output ) : b-emit C, ; : b-type BEGIN DUP WHILE OVER C@ EMIT 1 /string REPEAT 2DROP ; : buffered ['] b-emit IS EMIT ['] b-type IS TYPE ; : interactive ['] (EMIT) IS EMIT ['] (TYPE) IS TYPE ; : start HERE 'out ! buffered ; : finish HERE 'out @ - #out ! interactive ; ( process input buffer ) : rdrop POSTPONE R> POSTPONE DROP ; IMMEDIATE : call >R ; : entity [CHAR] & emit type [CHAR] ; emit ; : ===> OVER = IF DROP R> call entity THEN rdrop ; : either& [CHAR] & ===> S" amp" ; ( mapping symbols to entities ) : or< [CHAR] < ===> S" lt" ; : or> [CHAR] > ===> S" gt" ; : orESC DUP [CHAR] ^ = IF DROP command RDROP EXIT THEN ; : interpret chr advance either& or< or> orEsc ( else ) emit ; : -end offset @ #src @ U< ; : format 0 offset ! BEGIN -end WHILE interpret REPEAT ; : process start format finish ; ( commands ) : html ." <!DOCTYPE html><head><title>" srcfile TYPE ." </title></head><body>" CR ; : /html ." </body></html>" ; : p ." <p>" ; : /p ." </p>" ; : +p /p CR p ; : h2 ." <h2>" ; : /h2 ." </h2>" ; : href ." <a href=" token TYPE ." >" ; : /href ." </a>" ; ( kick off procedures ) : myrr slurp process spew ; myrr BYE
This code is an extension of the sample provided by Samuel A Falvo II in a tutorial video from 2008.
The syntax of Forth is very foreign to programmers like me. I taught myself Python and have used it professionally for a decade. Forth, in comparison, approaches programming upside down and inside out. Here are some quick syntax and convention notes that I frequently forget
( ! - store ... ( value address -- , STORE value TO address in memory ) ( @ - fetch ... ( address -- value , FETCH value FROM address in memory ) ( ? - look ... ( address -- , LOOK at variable ) ( VARIABLE ... ( <name> -- , define a 4 byte memory storage location ) 'var ( address ) #var ( size ) : start HERE 'var ! ; : finish HERE 'var @ - #var ! ;