Program Synthesis
Tikhon Jelvis (tikhon@jelv.is)
Press → key to advance.
Zoom in/out: Ctrl or Command + +/-
1 Synthesis
- Find a program to some specification (\(\phi(input, output)\))
- input/output pairs
- executable specification
- \(\exists P \forall x. \phi(x, P(x))\)
2 Why?
- easy
- spec might be easier than program
- correct
- verifying spec easier than verifying program
3 Compilers
- compilers are deterministic
- only consider correct optimizations
- hard to write
4 Synthesizer
- can be non-deterministic
- considers potentially incorrect optimizations
- hard to scale
- accept partial specifications
5 Programming by Demonstration
- user provides inputs/outputs
- generate program to match
- interactive
6 Example
First | Last | Initials |
---|---|---|
John | Doe | |
Bob | Smith | |
Tikhon | Jelvis | |
… | … |
7 Example
First | Last | Initials |
---|---|---|
John | Doe | J. D. |
Bob | Smith | |
Tikhon | Jelvis | |
… | … |
8 Example
First | Last | Initials |
---|---|---|
John | Doe | J. D. |
Bob | Smith | B. S. |
Tikhon | Jelvis | T. J. |
… | … | … |
9 Spec
- One input/output:
- "John" "Doe" → "J. D."
- Usually only a few needed
- Interactive
10 Small Language
- Targets a custom language
- Heavily limits looping
- Much smaller space of possible programs
11 Rough Approach
- Uses Version Space Algebra (VSA)
- "Space" of possible programs
- inputs/outputs trim space
- Haskell package: HaVSA
12 Superoptimization
- optimize existing program
- synthesize equivalent program
- much easier to implement
- limited scalability
13 Example
- GreenArrays
- stack-based architecture
- uses Forth
- difficult to use
- Haskell package: array-forth
14 MCMC
- Markov Chain Monte Carlo (MCMC)
- randomized hill climbing
- random mutations
- incorrect code
- Haskell package: mcmc-synthesis (limited)
- needs cluster
15 SMT
- SMT: SAT modulo theories
- Solve logic formulas
- reasonably fast
16 Formulas
- compile program to formula
- \(\phi(input, program, output)\)
- fix \(input\), \(program\): interpreter
- fix \(program\), \(output\): reverse interpreter
- non-deterministic
- fix \(input\), \(output\): synthesis
17 Verification
- compare two programs exhaustively
- problem: \(\forall input. spec(input) = program(input)\)
- actual: \(\exists input. spec(input) \ne program(input)\)
18 CEGIS
- Counter-example guided inductive synthesis (CEGIS)
- solve for program
- limited set of inputs/outputs
- verify against spec
- if verified: done
- else: new input/output pair; repeat
- few pairs needed
19 Scaling (or not)
- formulas are hard to scale
- exponential in program size
- maybe 100 instructions
- with luck
- if you're patient
20 Sketching
- some things are easy for programmers
- some things are easy for solvers
- let programmers write the easy parts!
- specify a program with "holes" and solve for the holes
21 SBV
- SMT-based verification (sbv)
- Haskell package for SMT solvers
- write symbolic Haskell program
- prove facts about it
- solve for variables
- synthesis!
22 Credit
- Most of the presentation heavily influenced by:
- Professor Bodik at the Berkeley ParLab
- Especially slides from his class on synthesis
- Programming by example:
- Flash Fill, a team at MSR led by Sumit Gulwani
- GreenArrays:
- Based on a synthesis project I worked on along with Professor Bodik, Mangpo, Rohin Shah and Nishant Totla
- MCMC-synthesis:
- Based on a side-project around GreenArrays that I worked on with Jessica Taylor