As mentioned in my previous post, I’ve been reading “Collective Intelligence” by Toby Segaram and I’m really enjoying in it. It’s different to a lot of programming books, in that rather than focusing a specific language or API it focus on a particular set of problems and shows techniques that can be used to crack them.

For example I’ve known for a long time that it’s easy to create an abstract syntax tree (AST) in F#:

/// Untyped expression tree/o:p

type Expression =/o:p

    | Multiply of Expression * Expression/o:p

    | Add of Expression * Expression/o:p

    | Subtract of Expression * Expression/o:p

    | GreaterThan of Expression * Expression/o:p

    | If of Expression * Expression * Expression/o:p

    | Constant of int/o:p

    | Parameter of int

Here we present a simple AST for representing numeric expressions, while it’s easy to provide a function to evaluate such a tree:

/// Given a list of parameters evaluate the tree/o:p

let evaluateExpression parameters =/o:p

    let rec innerEval tree =/o:p

        match tree with/o:p

        | Multiply (x, y) -> innerEval x * innerEval y/o:p

        | Add (x, y) -> innerEval x + innerEval y/o:p

        | Subtract (x, y) -> innerEval x - innerEval y/o:p

        | GreaterThan (x, y) -> if innerEval x > innerEval y then 0 else 1/o:p

        | If (pred, tval, fval) -> if innerEval pred = 0 then innerEval fval else innerEval tval/o:p

        | Constant value -> value/o:p

        | Parameter pos -> List.nth parameters pos/o:p

    innerEval

I’ve often had problems coming up with nice simple examples of what you might want to do with such an expression. Well “Collective Intelligence” gave me a great idea of what you can do with such an expression: genetic programming. The example we’re going to look at is presented in chapter 11, pages 251 to 268. The idea is straight forward if you can abstractly represent a program as we can with our numeric expression AST then it’s possible to make random changes to our program. Wants we have made the random changes we can measure which programs are successful and use the successful ones to create a new generation of programs. So what kind of problems can we solve with this approach? Well here’s a nice one, suppose we have a set of points:

X             Y              Result

36           38          1485

17           13          371

13           2             217

24           31          715/o:p

13           0             213

38           3             1569

10           11          157

11           32          223/o:p

 18          25          433

And we want to find out what function fits best to these points, we can use genetic programming to solve this problem. I’ve given a program below that can solve this problem though genetic programming. I don’t want to go into too many details, you should really buy the book if you want a good explanation, but in a nutshell the algorithm works in the following way:

-          We create a bunch of random function

-          We test how close these get to the predefined points

-          If we find a function that fits we return it

-          Otherwise we breed the most successful function and loop to the testing step

Anyway the full implementation is here, and version can be downloaded here [EDIT: link fixed]:

#light/o:p

open System/o:p

open Microsoft.FSharp.Math/o:p

 /o:p

/// Untyped expression tree/o:p

type Expression =/o:p

    | Multiply of Expression * Expression/o:p

    | Add of Expression * Expression/o:p

    | Subtract of Expression * Expression/o:p

    | GreaterThan of Expression * Expression/o:p

    | If of Expression * Expression * Expression/o:p

    | Constant of int/o:p

    | Parameter of int/o:p

 /o:p

/// Given a list of parameters evaluate the tree/o:p

let evaluateExpression parameters =/o:p

    let rec innerEval tree =/o:p

        match tree with/o:p

        | Multiply (x, y -> innerEval x * innerEval y/o:p

        | Add (x, y -> innerEval x + innerEval y/o:p

        | Subtract (x, y -> innerEval x - innerEval y/o:p

        | GreaterThan (x, y -> if innerEval x > innerEval y then 0 else 1/o:p

        | If (pred, tval, fval -> if innerEval pred = 0 then innerEval fval else innerEval tval/o:p

        | Constant value -> value/o:p

        | Parameter pos -> List.nth parameters pos/o:p

    innerEval/o:p

 /o:p

let simplifyExpression =/o:p

    let rec innerEval tree =/o:p

        match tree with/o:p

        | Multiply (Constant 0, _ -> Constant (0/o:p

        | Multiply (_, Constant 0 -> Constant (0/o:p

        | Multiply (Constant x, Constant y -> Constant (x * y/o:p

        | Add (Constant x, Constant y -> Constant (x + y/o:p

        | Subtract (Constant x, Constant y -> Constant (x - y/o:p

        | GreaterThan (Constant x, Constant y -> if x > y then Constant 0 else Constant 1/o:p

        | If (Constant pred, tval, fval -> if pred = 0 then fval else tval/o:p

        | x -> x/o:p

    let rec loop tree =/o:p

        let tree’ = innerEval tree/o:p

        if tree’ = tree then/o:p

            tree/o:p

        else/o:p

            loop tree/o:p

    loop/o:p

 /o:p

/// print the expression to the console/o:p

let printExpression =/o:p

    let rec innerPrint ppf tree =/o:p

        match tree with/o:p

        | Multiply (x, y -> Printf.fprintf ppf ”(%a * %a” innerPrint x innerPrint y/o:p

        | Add (x, y -> Printf.fprintf ppf ”(%a + %a” innerPrint x innerPrint y/o:p

        | Subtract (x, y -> Printf.fprintf ppf ”(%a - %a” innerPrint x innerPrint y/o:p

        | GreaterThan (x, y -> Printf.fprintf ppf ”(%a > %a” innerPrint x innerPrint y/o:p

        | If (pred, tval, fval -> Printf.fprintf ppf ”(if %a then %a else %a” innerPrint pred innerPrint fval innerPrint tval/o:p

        | Constant value -> Printf.fprintf ppf ”%i” value/o:p

        | Parameter pos -> Printf.fprintf ppf “p%i” pos/o:p

    innerPrint System.Console.Out/o:p

 /o:p

let rand = new Random(/o:p

 /o:p

/// build a random expression with limited depth, a maximum constants value,/o:p

/// and a limited number of parameters/o:p

let buildRandomExpression maxDepth maxConst noParams =/o:p

    let rec innerBuild curDepth =/o:p

        if curDepth < maxDepth then/o:p

            let nextDepth = curDepth + 1/o:p

            match rand.Next(7 with/o:p

            | 0 -> Multiply (innerBuild nextDepth, innerBuild nextDepth/o:p

            | 1 -> Add (innerBuild nextDepth, innerBuild nextDepth/o:p

            | 2 -> Subtract (innerBuild nextDepth, innerBuild nextDepth /o:p

            | 3 -> GreaterThan (innerBuild nextDepth, innerBuild nextDepth/o:p

            | 4 -> If (innerBuild nextDepth, innerBuild nextDepth, innerBuild nextDepth /o:p

            | 5 -> Constant (rand.Next(maxConst/o:p

            | 6 -> Parameter (rand.Next(noParams/o:p

            | _ -> failwith “assert false”/o:p

        else /o:p

            match rand.Next(2 with/o:p

            | 0 -> Constant (rand.Next(maxConst/o:p

            | 1 -> Parameter (rand.Next(noParams/o:p

            | _ -> failwith “assert false”/o:p

    innerBuild 0/o:p

 /o:p

/// make a change to an existing tree by replace a node/o:p

/// with a randomly generated tree/o:p

let mutateExpression maxConst maxParam rate =/o:p

    let rec innerMutate currDepth tree =/o:p

        let mutate node = /o:p

            let newNode =/o:p

                if rand.NextDouble( < rate then /o:p

                    buildRandomExpression maxConst maxParam (currDepth + 1 /o:p

                else node/o:p

            innerMutate (currDepth + 1 node/o:p

        match tree with/o:p

        | Multiply (x, y -> Multiply (mutate x, mutate y/o:p

        | Add (x, y -> Add(mutate  x, mutate  y/o:p

        | Subtract (x, y -> Subtract (mutate  x, mutate  y/o:p

        | GreaterThan (x, y -> GreaterThan (mutate  x, mutate  y/o:p

        | If (pred, tval, fval -> If (mutate  pred, mutate  fval, mutate  tval/o:p

        | Constant value -> Constant( value /o:p

        | Parameter pos -> Parameter ( pos /o:p

    innerMutate 0/o:p

 /o:p

 /o:p

let (|Binary|Nullary| = function /o:p

    | Add(x,y -> Binaryfun(x,y -> Add(x,y,x,y/o:p

    | Subtract(x,y -> Binaryfun(x,y -> Subtract(x,y,x,y/o:p

    | Multiply(x,y -> Binaryfun(x,y -> Multiply(x,y,x,y/o:p

    | GreaterThan(x,y -> Binary(fun (x,y -> GreaterThan(x,y,x,y/o:p

    | If(pred,tval,fval -> Binaryfun (x,y -> If (pred,x,y,tval,fval/o:p

    | x -> Nullary(x/o:p

 /o:p

type HoleTree =/o:p

  | LeftHole of (Expression * Expression -> Expression * HoleTree * Expression/o:p

  | RightHole of (Expression * Expression -> Expression * Expression * HoleTree/o:p

  | Hole/o:p

 /o:p

let rec plug = function/o:p

  | LeftHole(con,h,r,t -> con(plug(h,t, r/o:p

  | RightHole(con,l,h,t -> con(l, plug(h,t/o:p

  | Hole,t -> t/o:p

 /o:p

 /o:p

let rec descendTree top p = function/o:p

  | Nullary(x -> Hole, x/o:p

  | t when not top && rand.NextDouble( < p -> Hole, t/o:p

  | Binary(con,l,r -> /o:p

      if rand.NextDouble( < 0.5 then/o:p

        let h,t = descendTree false p l/o:p

        LeftHole(con,h,r,t/o:p

      else/o:p

        let h,t = descendTree false p r/o:p

        RightHole(con,l,h,t/o:p

 /o:p

let crossOverExpressions p t1 t2 =/o:p

    let h,_ = descendTree true p t1/o:p

    let _,t = descendTree true p t2/o:p

    plug(h,t/o:p

 /o:p

let evolve scoreFunction mutRate crossRate breedChance pop maxGen maxDepth maxConst noParams =/o:p

 /o:p

    let initPop = List.init pop (fun _ -> buildRandomExpression maxDepth maxConst noParams/o:p

 /o:p

    // the inner loop which will handle each generation /o:p

    let rec innerGenEvolve currPop currGen =/o:p

    /o:p

        // calculate score sort list to find the winner/o:p

        let res =/o:p

            [ for expr in currPop ->/o:p

                scoreFunction expr, expr ]/o:p

        let res = List.sort (fun (score1,_ (score2,_ -> compare score1 score2 res/o:p

        let score,winner = List.hd res/o:p

        /o:p

        // print the winner … just for info/o:p

        printfn “\nGen:%i score:%A” currGen score/o:p

        printExpression winner/o:p

        /o:p

        // if we’ve found winner or reached the maxium gens return/o:p

        if score = 0I || currGen = maxGen then/o:p

            winner/o:p

        else/o:p

            // get rid of scores, no longer needed/o:p

            let res = List.map snd res/o:p

            /o:p

            // always keep winner and second/o:p

            let winner, second = /o:p

                match res with /o:p

                | winner :: second :: _ -> winner, second /o:p

                | _ -> failwith “assert false”/o:p

            let newpop = winner :: second :: []/o:p

            /o:p

            // select an expression probably towards to top of the list/o:p

            let selectExpr( = List.nth res (min (List.length res - 1 (int(log(rand.NextDouble( / log(breedChance/o:p

            /o:p

            // loop to calculate the new population/o:p

            let rec addExpress acc=/o:p

                if  List.length acc = pop then/o:p

                    acc/o:p

                else/o:p

                    // cross two expressions then mutate/o:p

                    let crossExpress = (crossOverExpressions crossRate (selectExpr( (selectExpr(/o:p

                    let newExp = mutateExpression maxConst noParams mutRate crossExpress/o:p

                    addExpress (newExp :: acc/o:p

                    /o:p

            let newpop = addExpress newpop/o:p

            // loop recursively/o:p

            innerGenEvolve newpop (currGen + 1/o:p

    // start the loop/o:p

    innerGenEvolve initPop 0/o:p

 /o:p

// define a secret funtion we’re trying to find/o:p

let secertFunction x y = (x * x + (2 * y + (3 * x + 5/o:p

 /o:p

// calculate some data from the secret function/o:p

let data = [ for x in [0 .. 200] -> /o:p

                let x = rand.Next(40/o:p

                let y = rand.Next(40/o:p

                (x,y, secertFunction x y ]/o:p

 /o:p

// evaluate the an expression to see how close to the secret function it is/o:p

let scoreFunction expr =/o:p

    let results =/o:p

        [ for (x,y,res in data ->/o:p

            res - evaluateExpression [x;y] expr ]/o:p

    results |> List.fold_left (fun acc x -> BigInt.Abs (BigInt x + acc 0I/o:p

 /o:p

// call the evolve function/o:p

evolve scoreFunction 0.2 0.7 0.91 800 100 6 10 2

Happy hacking, and a big thanks to “Keith” from the cs.hubsf.net who helped me with the tree “crossover” function.

Feedback:

Feedback was imported from my only blog engine, it’s no longer possible to post feedback here.

re: Genetic Programming – A Language Oriented Programming Example - Damon Wilder Carr

As a fan of genetic algorithms I was really happy to find this (and I can leverage some of the work on creating the recent Linq provider we created over global financial markets)..

Thanks

Damon Wilder Carr,
http://blog.domaindotnet.com

re: Genetic Programming – A Language Oriented Programming Example - jm

If I may point out, the download link appears to be broken. Thank you for your excellent thoughts & works on F#.

re: Genetic Programming – A Language Oriented Programming Example - Robert Pickering

My web server didn’t like the fsx extenstion! I’ve zipped the script and fixed the link.

Cheers,
Rob

re: Genetic Programming – A Language Oriented Programming Example - Chris

All your right parentheses are missing, at least the way my browser displays this. Nice technique though!

re: Genetic Programming – A Language Oriented Programming Example - Robert Pickering

Wow. Your right, don’t know what went wrong there. The downloadable version is okay so just use that for now. Will fix the code ASAP.

Cheers,
Rob

re: Genetic Programming – A Language Oriented Programming Example - Michael de la Maza

Just a quick note to let everyone know that the right parentheses still do not appear, at least not in Firefox 3.0.4.

re: Genetic Programming – A Language Oriented Programming Example - Ryan Flynn

Thanks for the example code; I was interested and ended up porting it to scheme. I found that the convergence could be greatly sped up by subtituting the "winner" as a parameter to crossOverExpressions; I also ended up reducing the number of tests per generation.

Anyways, very interesting stuff and thanks for the code.

bugs - Jordan

First, thanks for posting this it is very cool and helpful.

just an fyi in case others try out this code – I found a couple of bugs:
1. simplifyExpression doesn’t recurse, so you only end up simplifying the top level
2. simplifyExpression needs to call loop with tree’ or it goes into infinite loop.
3. mutateExpression calls buildRandomExpression with the args in the wrong order (should be maxDepth– curDepth + 1–first. Also not sure if this is even the right max depth to use, since you are building a random expression that is at most as deep as the current depth from the top, not the bottom of the tree.