Sunday, May 19, 2013

Word Games (The Domain Model)

This is a continuation of the previous blog entry (Word Games).

We've now created the logic for generating the Word Puzzles, however, we need to store the generated puzzles somewhere (in a database).

Ideally I would like to store all possible paths for a given word combination/path length as a single record.
For example, I know that there are 2 possible ways to get from aback to stank in 3 intermediate steps

aback -> alack -> slack -> slank -> stank
aback -> alack -> slack -> stack -> stank


I would like a single record that has a start word of aback, an end word of stank and stores 2 possible paths (the ones above).

With that requirement, here is the domain model.
It consists of a Puzzle object and a Path object. The Puzzle object consists of information about the Puzzle (start word, end word etc) and a collection of paths.

To see this in action, here is the sample Groovy code to retrieve it

import com.wordpath.Puzzle

def puzzle = Puzzle.findByStartWordAndEndWord('aback', 'stank')
println puzzle
puzzle.paths.each{ path ->
    List lst = path.toString().tokenize(',[]')   //Convert a string presentation of a list to a List
    println lst
   
}

The output is as follows
{161} aback stank possible paths = 2 [[aback, alack, slack, slank, stank], [aback, alack, slack, stack, stank]]
[aback,  alack,  slack,  slank,  stank]
[aback,  alack,  slack,  stack,  stank]


Here are the 2 domain classes. Note the unique constraint (startWord, endWord, length).

Puzzle
package com.wordpath
import com.wordpath.Path

class Puzzle {
 
 String startWord
 List paths
 String endWord
 Integer possiblePaths
 Integer wordLength
 boolean isActive 
 
 static hasMany = [paths: Path]

    static constraints = {
  id generator: 'identity', column: 'id'
  startWord blank: false
  paths nullable: true
  endWord blank: false
  possiblePaths nullable: false
  wordLength nullable: false
  isActive nullable: false
  startWord(unique: ['endWord','wordLength'])
    }
 
 String toString(){
  String recordInfo = "{$id} ${startWord} ${endWord} possible paths = ${possiblePaths} ${paths}"
  return recordInfo  
 }
}


Path
package com.wordpath
import com.wordpath.Puzzle

class Path {
 String path
 Puzzle puzzle
 
 static belongsTo = [puzzle: Puzzle]
 
 String toString(){
  String recordInfo = "${path}"
  return recordInfo
 }
}


I am using a shortcut in storing the path as a String that is easily converted to a List using Groovy (see the comment in the first code listing above)

Monday, May 13, 2013

Word Games

A few months ago on a flight to San Diego I opened up an inflight magazine and found one of those MENSA puzzles.

The goal of the puzzle was to get from one word to another in four steps only being able to change a single letter at a time. I cannot remember the exact words, but I couldnt do it because I was unfamiliar with one of the intermediate words. In a nutshell, the game is as follows.

In 3 steps, get from MOOD to GOAT, only changing one letter in the word at each step.
A possible solution would be

MOOD
MOOT
BOOT
BOAT
GOAT


though they may be other solutions. The correct solution is the fastest one possible
This is an interesting problem, and I wondered whether it would be possible to create an application that would generate such puzzles. There are a number of prerequisites

1: A list of all valid english words (this is tougher than it sounds, there are multiple lists and they dont agree)
2: A way to specify word length and step length (I want a puzzle where the words are 4 characters long and the solution is 4 steps).

I begun playing around with it and came up with the basic generation code. For example, given a starting word of POT, create a Tree with all possible combinations. Because the tree could be arbitrarily deep, specify a termination length (a given path cannot be longer than X).

For example, assume the only valid 3 letter english word were [pot,pit,pet,lot,bot,pin,tin,tan,pep,bet] and given a start word of pot and a max depth of 5, here is the Tree structure built.

The total possible paths (irrespective of path length) are

[pot, pit, pin, tin, tan]
[pot, pet, pep]
[pot, pet, bet]
[pot, lot]
[pot, bot, bet]

In fact, the Groovy unit test for this code would look as below

@Test
void testGetAllPaths(){
	def allowedWords = ['pot','bot','lot','pet','pit','bet','pep','pin','tin','tan']
	Node node = new Node(null,'pot',allowedWords,0,5)
	def result = node.getAllPaths()
	assert result == [['pot', 'pit', 'pin', 'tin', 'tan'], ['pot', 'pet', 'pep'], ['pot', 'pet', 'bet'], ['pot', 'lot'], ['pot', 'bot', 'bet']]

}


The code for the Node class is below

package com.wordPath.utilities;

import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
/**
 * This class represents a Tree structure of word combinations with a single letter in the word changed as you progress
 * down the tree structure
 * frog -> frag -> brag -> brat
 * @author wkimeria
 *
 */

public class Node {

	/**
	 *  Create tree starting from the given start word (value), terminate branches at specified level
	 * @param parent			Parent Node
	 * @param value				Node value
	 * @param allowedWords		List of allowed words (must be valid english words of a given length)
	 * @param level				level of this node in given tree structure. Root is 0
	 * @param requestedDepth 	Level at which to terminate branches
	 */

	String value;
	List children;
	Node parent;
	int level;
	boolean visited;

	Node(Node parent, String value, List allowedWords, int level,
			int requestedDepth) {
		List allowedWordsForPath = new ArrayList();
		allowedWordsForPath.addAll(allowedWords);
		allowedWordsForPath.remove(value);
		children = new ArrayList();
		this.value = value;
		this.level = level;		
		this.parent = parent;
		if (level >= requestedDepth-1) {
			return;
		}
		List variations = new ArrayList();
		int wordLength = value.length();
		for (int i = 0; i <= wordLength; i++) {
			variations.addAll(getVariations(value, i));
		}
		variations = getAllowedWords(variations, allowedWordsForPath);
		if (variations.size() != 0) {
			level++;
			for (String v : variations) {
				Node n = new Node(this, v, allowedWordsForPath, level,
						requestedDepth);
				children.add(n);
			}
		}
	}

	/**
	 * Given a list of words generated by combinations, remove invalid words and only
	 * return valid english words.
	 * @param words
	 * @param allowedWords
	 * @return
	 */
	List getAllowedWords(List words, List allowedWords) {
		List allowedWordsFound = new ArrayList();
		for (String w : words) {
			if (allowedWords.contains(w)) {
				allowedWordsFound.add(w);
				allowedWords.remove(w);
			}
		}
		return allowedWordsFound;
	}

	/**
	 * For a given word, get all variations with the letter at position s cycled
	 * @param word
	 * @param pos
	 * @return
	 */
	List getVariations(String word, int pos) {
		List variations = new ArrayList();
		if (word != null && pos < word.length()) {
			String letter = word.substring(pos, pos + 1);
			char next = (char) (letter.toCharArray()[0] + 1);
			char stop = (char) (next - 1);
			while (next != stop) {
				String current = word.substring(0, pos) + next
						+ word.substring(pos + 1, word.length());
				variations.add(current);
				next++;
				if (next == (char) 'z' + 1) {
					next = (char) 'a';
				}
			}
		}
		return variations;
	}

	/**
	 * Get a list of lists of all possible paths (essentially capturing each depth first search from root to leaf
	 * @return
	 */
	public List> getAllPaths() {
		List> paths = new ArrayList>();
		Deque queue = new LinkedList();
		queue.clear();
		queue.add(this);
		List currentPath = new ArrayList();
		Deque currentPathQueue = new LinkedList();
		while (!queue.isEmpty()) {
			Node n = queue.removeLast();
			if (n.children != null && n.children.size() > 0) {
				queue.addAll(n.children);
			} else {
				currentPathQueue = new LinkedList();
				currentPath = new ArrayList();
				Node walker = n;
				while (walker != null) {
					currentPathQueue.add(walker.value);
					walker = walker.parent;
				}
				while (!currentPathQueue.isEmpty()) {
					currentPath.add(currentPathQueue.removeLast());
				}
				paths.add(currentPath);				
			}
		}
		return paths;
	}

	/**
	 * Return string representation of Tree with value and node depth (uses Breadth First Search)
	 */
	public String toString() {
		String nodeValues = "";
		Queue queue = new LinkedList();		
		queue.clear();
		queue.add(this);
		while (!queue.isEmpty()) {
			Node n = queue.remove();
			String parent = "";
			if (n.parent != null) {
				parent = n.parent.value;
			}
			nodeValues += n.value + " " + n.level + " parent = " + parent
					+ "\n";
			if (n.children != null)
				queue.addAll(n.children);
		}
		return nodeValues;
	}
}


This class is not perfect. You could call it and start with a particularly unlucky word and building the Node could take an inordinate amount of time, but it is a first pass. The idea is to have a running process that generates puzzles and stores them, that way one could request a puzzle that starts with x word and ends with y word of length n, and get back a list of all puzzles that fit the given criteria

int wordLength = 4
int puzzleLength = 3
List allowedWords = new ArrayList<>()
getWordsOfLength(wordLength, allowedWords)
System.out.println("Words = " + allowedWords.size())
Double randomWordIndex = Math.random() * allowedWords.size()
for(String s:allowedWords){
	String randomWord = s;
	Node node = new Node(null, randomWord, allowedWords, 0, puzzleLength);
	println("--------------- Node built for ${s} ---------------------");
	List> paths = node.getAllPaths();
	for (List path : paths) {
		println(path);		//Save paths somewhere
	}
	System.out.println("Possible Paths = ${paths.size()}");
}



Ongoing code at

WordPath on GitHub

Saturday, February 9, 2013

Fun with Grails (part 1)

I've been playing on and off with a personal project written in Grails over the last few weeks.

 The Massachusetts Bay Transportation Authority (MBTA) has an API that provides real-time train information for the Red Line, Blue Line and Orange Line (sadly, nothing for the Green Line).

 I wrote a Grails application that displays real time Train Arrival information at Train Stops (the same data that the MBTA uses for their displays in some MBTA stations). I wrote this application because I take the Red Line to and from work and it would be nice to see real time data as I took the bus from home to the train station and as I walk to the Red Line after leaving work.

So far it works great.

 The project is still in it's infancy and the look and feel is pretty basic, but I will improve the appearance over the next few weeks.
It is currently deployed in the cloud (Cloudbees.net, a service we previously used for one of our projects at work).

http://wheresmytrain.wmk.cloudbees.net/

I will be doing a series of posts later on as the application evolves.

There are a lot of things that Grails and its plugin architecture enables (like using the Quartz plugin to poll MBTA data periodically, the MBTA User Agreement states that you cannot poll more than once every 10 seconds, and there are 3 endpoints, one for each line so I assume you cannot poll any ONE endpoint more than once every 10 seconds).


http://grails.org/plugin/quartz

The class that does the polling is as follows

package com.trainschedule
class GetScheduleBlueLineJob {

 static triggers = {  
  cron name:'cronTriggerBlue', startDelay:1000, cronExpression: '0/30 * * * * ?'
 }
  
 def scheduleInformationService
 def concurrent = false
 def grailsApplication
  
 def group = "MyGroupBlue"
   def execute(){
    log.info "called for blue info at ..." + new Date()
    scheduleInformationService.cacheNewDataForLineFromMBTA('blue')  
 }
}

Sunday, November 18, 2012

Performance costs of Dynamic Typing (Groovy vs Java)



Recently, someone at work introduced me to www.projecteuler.net, which is a web site that posts logic/math problems and asks users to solve them using code.

The web site is weirdly addictive, and the only requirement is that the solution arrived should run in under a minute (which is a rather arbitrary limit).

I begun solving the problems in Groovy until I arrived at a problem that asked the user to compute the 10001th Prime number. My initial solution, written in Groovy using a very simple Prime number function (code snippet below)


def getPrimesTo(start,upperLimit){    
    def nums = start..upperLimit       
    for(int i=2;i<=nums.last();i++){
        nums = remove(nums,i) 
        if(nums.size()==0){
            break
        }  
    }
    return nums   
}

My initial solution iterated through all numbers beginning at 2, adding those that were determined to be Prime to a List and checking the size of the list. I realized this solution would be a no go when after 4 minutes it was still runnning on my machine which is an Intel i7 with 8 Gig of RAM. Obviously this was a no go.

I then made a modification and used a Sieve of Erastothenes approach (get a List of numbers from 2 -x, and then starting at 2, remove numbers from the List that are not that number and multiples of that number).
So for example, to find all the prime numbers between 2 and 10, start with

[2,3,4,5,6,7,8,9,10]

remove all numbers that are multiples of 2

[2,3,5,7,9]

remove all numbers that are multiples of 3

[2,3,5,7]

and so on. The relevant methods in Groovy looks like

def getPrimesTo(start,upperLimit){    
    def nums = start..upperLimit       
    for(int i=2;i<=nums.last();i++){
        nums = remove(nums,i) 
        if(nums.size()==0){
            break
        }  
    }
    return nums   
}

def remove(nums,n){
    return  nums.findAll{it==n || it%n!=0}    
}



Much better, but computing the 10001th Prime on my machine  still took 90 seconds, meaning it would be much slower on an average machine. The solution above can be tweaked (if you're removed the multiples of 2, there is no need to test for the multiples of 4 or 8 etc). However, I didn't think that even that change would get me to 60 seconds, and in reality, it needed to be much less than 60 seconds to be considered an optimal solution. I begun to wonder how much Groovy's dynamic Typing was costing me in performance.

The answer was, a lot. I re-wrote the the solution in Java (Java does not have Groovys awesome findall method/closure used in the remove method above, so it is a bit more verbose. In Java, the solution using Sieve of Erastothenes takes 4 seconds. The full solution is below (I doubt this is the optimal solution, but I think a run time of 4 seconds fulfills the 'run under a minute' requirement.

import java.util.ArrayList;
import java.util.Date;

public class MainApp {
import java.util.List;

 /**
  * @param args
  */
 public static void main(String[] args) {
  
  Date start = new Date();
  Integer n = getNthPrime(10001);  
  Date end = new Date();
  long totalTime =  (end.getTime()-start.getTime())/1000;
  System.out.println("Result = " + n);  
  System.out.println("Calculation took " + totalTime + " seconds");
  
  
  

 }  
 static Integer getNthPrime(Integer n){
  Integer stepSize = n;
     Integer start = n*2;
  Integer bottom = 2;
  List<Integer> existingPrimes = new ArrayList<>();
  List<Integer> primes = new ArrayList<>();
  
  while(primes.size()<n){   
   existingPrimes = getPrimesTo(bottom,start);
   if(existingPrimes.size()>0){       
      primes.addAll(existingPrimes);
     }
     bottom = start + 1;
     start = bottom + stepSize -1; 
  }
  
  return primes.get(n-1);
 }
 
 static List<Integer> getPrimesTo(Integer start,Integer upperLimit){  
  List<Integer>nums = new ArrayList<>();
  for(int i=start;i<=upperLimit;i++){
   nums.add(i);   
  }  
  for(int i=2;i<=nums.get(nums.size()-1);i++){
   nums = remove(nums,i);   
   if(nums.size()==0){return nums;}
  }  
  return nums;
 }
 
 static List<Integer> remove(List<Integer> nums,Integer n){   
  for(int i=0;i<nums.size();i++){
   if(!nums.get(i).equals(n)){
    if(nums.get(i).intValue()%n==0){
     nums.remove(i); 
    }
   }
  }
  return nums;
 }
}