\ Finite State Machine DSL \ Peter Salvi, 2009 \ The program tells us whether the machine accepts or rejects a given input. \ Using only a prefix of the input to reach the final state is valid. \ A state is represented by the execution token of its decision function. \ For simple states (defined by state / end-state) this is the `decision' \ function, which takes its data from subsequent cells. These have the \ format: <# of cases> <case1-value> <case1-counted-str> <case2-value> ..., \ where every case is aligned (ie. at a word boundary). \ >data ( c-addr -- state-addr ) \ Finds the address of the state data, given a counted string representation. : >data find drop execute ; \ next-cell ( c-addr -- case-addr ) \ Jumps to the next case (after the given counted string) : next-cell count + aligned ; \ check ( input case-addr -- state-addr true | cell-addr false ) \ If input equals the current case, return the corresponding state and true. \ Otherwise return the address of the next case and false. : check over @ = if cell+ >data true else cell+ next-cell false then ; \ decision ( input state-addr -- state-addr | false ) \ Return the next state address, if the input is found, false otherwise. : decision cell+ dup cell+ swap @ 0 ?do over check if nip unloop exit then loop 2drop false ; \ state ( "word" -- count-addr ) \ Creates the state word, using the xt of `decision'. Also allocates a cell \ for the case count, and leaves its address on the stack. : state create ['] decision , here 0 , ; \ => ( count-addr case "word" -- ) \ Saves the new transition case as a cell and a counted string, and updates \ the case count of the current state. : => , 1 over +! bl word count dup c, here over allot swap chars move align ; \ end-state ( count-addr -- ) \ Removes the case count address, thus ending the state creation process. : end-state drop ; \ state: ( "word" -- xt-addr ) colon-sys \ Enters compilation mode, but the program gets compiled indirectly - \ the word's first data field is an execution token to the real program. \ The program also has a built-in drop command, so its stack effect will \ be ( input -- state-addr | false ). : state: create here 0 , :noname postpone drop ; \ state; ( xt-addr -- ) \ This is where the execution token actually gets saved. : state; postpone ; swap ! ; immediate \ cliteral ( c-addr -- ) \ Compiles the given counted string as a counted string. : cliteral dup c@ char+ postpone sliteral postpone drop ; immediate \ next-state ( "word" -- ) \ Compiles a search for the parsed state's data. : next-state bl word postpone cliteral postpone >data ; \ input ( -- count-addr ) \ Starts a counted array, leaving the count address on the stack. : input here 0 , ; \ end-input ( count-addr n -- ) \ Finishes the counted array (also saves the top of the stack) and updates \ the count. : end-input , here over - cell / 1- over ! ; \ stop ( input -- input false ) \ Retains the input as a sign that this is a valid final state \ (other states can only return a single false). \ This is only called if there are still data after reaching this state, \ so it is an "early accept". state: stop ." early accept" false state; \ run ( addr -- ) \ Starts the execution at the `start' state. It halts if a state gives false, \ and takes 2 values from the stack which results in an error for every state \ except `stop'. If all the data were used up, it is accepted only if we are \ in the stop state. : run c" start" >data swap dup @ 0 ?do cell+ dup @ rot dup @ execute ?dup if swap else 2drop unloop exit then loop drop stop = if ." accept" else ." reject" then ; \ Usage example: \ state start \ 0 => foo \ 1 => bar \ end-state \ state foo \ 0 => foo \ 1 => baz \ end-state \ state: bar 0 = if next-state bar else next-state foo then state; \ state baz \ 0 => stop \ end-state \ input 1 , 0 , 1 , 1 , 0 end-input run \ accept \ input 0 , 1 , 0 , 0 , 1 end-input run \ early accept \ input 1 , 0 , 1 , 0 , 0 end-input run \ reject \ input 1 , 0 , 1 , 1 , 1 end-input run \ (error)