Thursday, September 28, 2017

SIL: Explorations in non-Turing Completeness


I want to demonstrate that Turing Completeness is unnecessary for general purpose computing, through something I have called SIL.

SIL either stands for Stand-In Language, or Simple Intermediary Language. I intended it to stand for one thing, then forgot what it was, then backronymed the other meaning.

SIL has a very simple core grammar, based on the simply typed lambda calculus, consisting of 10 instructions. On top of this sits a Prelude.sil file that defines common functions found in functional programming languages, like fold, map, and zip.

SIL's type system is very simple. It consists of a generic "data" type, a pair type, and an arrow type (taking a type to a type). I'm working on adding refinement typing, which will be the key to runtime optimizations.

At this stage, I believe I have demonstrated that general purpose computing is possible in SIL. For evidence, please look at the tictactoe.sil file, which implements a 2-player (no AI) game. If you're not convinced this is an adequate demonstration, please tell me something else to write!

(note: the master branch is outdated; for the best mix of up-to-dateness and non-brokenness, try the pairType branch)

The Core Grammar

  1. Zero: this can represent the number zero, or a terminator
  2. Pair: this is a pair of other expressions. We can use nested pairs to represent lists, numbers, or really any form of data.
  3. Var: this is misnamed. It actually represents the current closure environment. By manipulating it, we can pull out specific variables.
  4. App: this feeds data to a function
  5. Anno: this is a deprecated instruction, applying one expression encoding a type annotation to another expression, to be evaluated at compile time. In the refinementTypes branch, work is underway to replace this with a more generic "Check" expression that checks an expression against a precondition function.
  6. Gate: creates conditional behavior. What is passed in acts as a switch, which must then be applied to a Pair. If the input is Zero, it selects the left side of the pair; anything else selects the right side.
  7. PLeft: returns the left side of a Pair
  8. PRight: returns the right side of a Pair
  9. Trace: this instruction is simply used for debugging while designing the rest of the language. It echoes the expression within to the console, then evaluates it.
  10. Closure: Originally this was a lambda expression, representing a function. For historical reasons (and possibly future reasons?), I changed this to a closure, but it mostly serves the purpose of a lambda - creating a function.

Using nested pairs we can represent a stack, and this representation is used for closure environments. Values on the stack are the left sides of pairs, and the right sides hold the remainder of the stack. Thus, PLeft Var is an expression representing the first value on the stack, and PLeft (PRight Var) represents the second value.
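As a sketch, the whole instruction set (and the stack lookup just described) can be modeled as a Haskell data type. The constructor names below are illustrative; the definitions in the actual repository may differ:

```haskell
-- Illustrative sketch of the SIL core grammar as a Haskell data type.
data Expr
  = Zero            -- zero, or a terminator
  | Pair Expr Expr  -- nested pairs encode lists, numbers, any data
  | Var             -- the current closure environment
  | App Expr Expr   -- feed data to a function
  | Anno Expr Expr  -- deprecated type annotation
  | Gate Expr       -- Zero selects the left of a pair, anything else the right
  | PLeft Expr      -- left side of a pair
  | PRight Expr     -- right side of a pair
  | Trace Expr      -- echo to the console, then evaluate
  | Closure Expr    -- create a function
  deriving (Eq, Show)

-- The nth value on the environment stack: n PRights, then a PLeft.
envLookup :: Int -> Expr
envLookup n = PLeft (iterate PRight Var !! n)
```

So envLookup 0 is PLeft Var and envLookup 1 is PLeft (PRight Var), matching the description above.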

Functional programmers may be aware that a fixed point combinator can be used to make non-recursive functions recursive. Through this same principle, we can apply a non-recursive function to a Church numeral, together with a function returning a default value, and get a function whose recursion is bounded by the magnitude of the Church numeral. This is how SIL handles recursion.
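To make the principle concrete, here is a Haskell model of the same trick (this models the idea, not SIL's actual encoding): a Church numeral applies a step function a fixed number of times to a default function, so the resulting recursion is bounded by the numeral.

```haskell
-- A Church numeral: apply f to x n times.
church :: Int -> (a -> a) -> a -> a
church 0 _ x = x
church n f x = f (church (n - 1) f x)

-- A non-recursive step for factorial: given a function that handles
-- smaller inputs, produce a function for the next input.
factStep :: (Int -> Int) -> (Int -> Int)
factStep recur n = if n <= 0 then 1 else n * recur (n - 1)

-- Bounded recursion: iterate the step, starting from a default function.
boundedFact :: Int -> Int -> Int
boundedFact bound = church bound factStep (const 1)
```

As long as the bound is at least the input, this computes the real factorial; if the bound is too small, evaluation bottoms out at the default function instead of diverging.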


Most practical programs can potentially run forever. How can a language where all expressions terminate be useful?

The key to this is plugging the core grammar into input/output loop "frames". Each frame takes a pair of the input and the program state, and returns a pair of the output and the new program state. If running the SIL expression generates a null program state, the program is aborted; otherwise the state is fed back into the frame and the program continues executing.
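A minimal Haskell sketch of such a loop, under the assumption that a null program state is modeled as Nothing (all names here are hypothetical):

```haskell
-- A frame maps (input, state) to (output, new state);
-- Nothing for the new state aborts the program.
type Frame i o s = (i, s) -> (o, Maybe s)

-- Run a frame repeatedly. Unbounded runtime comes from this loop,
-- not from the (always-terminating) SIL expression inside the frame.
runLoop :: Frame i o s -> IO i -> (o -> IO ()) -> s -> IO ()
runLoop frame getInput putOutput state = do
  input <- getInput
  let (output, next) = frame (input, state)
  putOutput output
  case next of
    Nothing     -> pure ()  -- null state: abort
    Just state' -> runLoop frame getInput putOutput state'
```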


The Parser

Since constructing SIL ASTs by hand to write programs isn't fun, I've written a simple parser that translates a Haskell-like syntax into the AST.

Identifiers (variables) in the syntax are translated to a series of instructions that look up that variable in the relevant closure environment.
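A sketch of that translation, assuming the parser tracks which names are bound at each position of the environment stack (innermost first); the function and constructor names here are illustrative:

```haskell
import Data.List (elemIndex)

-- Just the constructors needed for environment lookups.
data Expr = Var | PLeft Expr | PRight Expr
  deriving (Eq, Show)

-- Translate an identifier into the instructions that look it up
-- on the closure environment stack.
lookupVar :: [String] -> String -> Maybe Expr
lookupVar names name = do
  i <- elemIndex name names
  pure (PLeft (iterate PRight Var !! i))
```

For example, with the environment ["x", "y"], the identifier y becomes PLeft (PRight Var), and an unbound name yields Nothing.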


The Type Checker

I am in the middle of rewriting the type checker to support refinement types, so I don't want to give too much information here that will be quickly out of date. With refinement types, type checking will be a two stage process.

The key function of the first stage is making sure that the expression will terminate. For each element of the core grammar, we can infer a simple type composed of four type elements: ZeroType, PairType, ArrowType, and TypeVariable (which should really be called EnvironmentType).

Since SIL is based on simply typed lambda calculus, when we infer a simple type for an expression, if that type is self-referential (as in the case of a fixed-point combinator), that expression will not terminate, and we should generate a type error.
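In other words, the first stage amounts to an occurs check during unification. A sketch of the type elements and the check (illustrative; the repository's actual code may differ):

```haskell
-- Illustrative sketch of the four simple type elements.
data PartialType
  = ZeroType
  | PairType PartialType PartialType
  | ArrowType PartialType PartialType
  | TypeVariable Int  -- arguably better named EnvironmentType
  deriving (Eq, Show)

-- The occurs check: if a type variable appears inside the type it is
-- being unified with, the type is self-referential (as when typing a
-- fixed-point combinator) and we should emit a type error.
occurs :: Int -> PartialType -> Bool
occurs v (TypeVariable w) = v == w
occurs v (PairType a b)   = occurs v a || occurs v b
occurs v (ArrowType a b)  = occurs v a || occurs v b
occurs _ ZeroType         = False
```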

With refinement typing, the whole SIL language will be made available to validate expressions. Each expression can be paired with a validation function written in SIL. If the expression has no free variables, it can be processed at compile time, alerting the programmer to semantic errors.

To handle expressions with free variables, we can run an optimization pass on the AST that moves all refinement checks to the top level. At the top level, a refinement check either depends on no free variables and can be run at compile time, or depends on non-deterministic runtime values.

If a user of a program written in SIL inputs data that does not pass runtime checks, the error can be reported to the user before any other execution is performed, and the user can instead input different data. This prevents the SIL program from entering many unrecoverable states. The exception (pun unintended) is when there is no possible user input that will pass validation.


Summary

Hopefully this gives a good view of the current state and near future of SIL. I'm hoping for other contributors and/or feedback.