\ Finite State Machine DSL
\ Peter Salvi, 2009

\ The program tells us whether the machine accepts or rejects a given input.
\ Using only a prefix of the input to reach the final state is valid.

\ A state is represented by the execution token of its decision function.
\ For simple states (defined by state / end-state) this is the `decision'
\ function, which takes its data from subsequent cells. These have the
\ format: <# of cases> <case1-value> <case1-counted-str> <case2-value> ...,
\ where every case is aligned (ie. at a word boundary).

\ >data ( c-addr -- state-addr )
\ Finds the address of the state data, given a counted string representation.
: >data find drop execute ;

\ next-cell ( c-addr -- case-addr )
\ Jumps to the next case (after the given counted string)
: next-cell count + aligned ;

\ check ( input case-addr -- state-addr true | cell-addr false )
\ If input equals the current case, return the corresponding state and true.
\ Otherwise return the address of the next case and false.
: check over @ = if cell+ >data true else cell+ next-cell false then ;

\ decision ( input state-addr -- state-addr | false )
\ Return the next state address, if the input is found, false otherwise.
: decision cell+ dup cell+ swap @ 0 ?do
        over check if nip unloop exit then
    loop 2drop false ;

\ state ( "word" -- count-addr )
\ Creates the state word, using the xt of `decision'. Also allocates a cell
\ for the case count, and leaves its address on the stack.
: state create ['] decision , here 0 , ;

\ => ( count-addr case "word" -- )
\ Saves the new transition case as a cell and a counted string, and updates
\ the case count of the current state.
: => , 1 over +! bl word count dup c, here over allot swap chars move align ;

\ end-state ( count-addr -- )
\ Removes the case count address, thus ending the state creation process.
: end-state drop ;

\ state: ( "word" -- xt-addr ) colon-sys
\ Enters compilation mode, but the program gets compiled indirectly -
\ the word's first data field is an execution token to the real program.
\ The program also has a built-in drop command, so its stack effect will
\ be ( input -- state-addr | false ).
: state: create here 0 , :noname postpone drop ;

\ state; ( xt-addr -- )
\ This is where the execution token actually gets saved.
: state; postpone ; swap ! ; immediate

\ cliteral ( c-addr -- )
\ Compiles the given counted string as a counted string.
: cliteral dup c@ char+ postpone sliteral postpone drop ; immediate

\ next-state ( "word" -- )
\ Compiles a search for the parsed state's data.
: next-state bl word postpone cliteral postpone >data ;

\ input ( -- count-addr )
\ Starts a counted array, leaving the count address on the stack.
: input here 0 , ;

\ end-input ( count-addr n -- )
\ Finishes the counted array (also saves the top of the stack) and updates
\ the count.
: end-input , here over - cell / 1- over ! ;

\ stop ( input -- input false )
\ Retains the input as a sign that this is a valid final state
\ (other states can only return a single false).
\ This is only called if there are still data after reaching this state,
\ so it is an "early accept".
state: stop ." early accept" false state;

\ run ( addr -- )
\ Starts the execution at the `start' state. It halts if a state gives false,
\ and takes 2 values from the stack which results in an error for every state
\ except `stop'. If all the data were used up, it is accepted only if we are
\ in the stop state.
: run c" start" >data swap dup @ 0 ?do
        cell+ dup @ rot dup @ execute ?dup if swap else 2drop unloop exit then
    loop drop stop = if ." accept" else ." reject" then ;

\ Usage example:

\ state start
\   0 => foo
\   1 => bar
\ end-state

\ state foo
\   0 => foo
\   1 => baz
\ end-state

\ state: bar 0 = if next-state bar else next-state foo then state;

\ state baz
\   0 => stop
\ end-state

\ input 1 , 0 , 1 , 1 , 0 end-input run \ accept
\ input 0 , 1 , 0 , 0 , 1 end-input run \ early accept
\ input 1 , 0 , 1 , 0 , 0 end-input run \ reject
\ input 1 , 0 , 1 , 1 , 1 end-input run \ (error)