Archive for May, 2009

Probability

*** I’m not a mathematician, the explanation here is probably crap, so have a look on the internet on factorials and sets.

A friend of mine asked me today if I had a script lying around to produce anagrams. After saying no and secretly checking what the heck an anagram was again, I started thinking about what it reminded me of.

My thoughts diverged on the symbol “!”, which is used in probability to calculate what your chance might be when you eliminate your previous results from the equation. So for example if you have a set of [1,2,3,4,5,6] – after rolling 3 with a dice, the set will become [1,2,4,5,6] so you can’t roll 3 again.

Anagrams are a combination of a set of predefined characters, and then mixed around. And while you might think you’ll have 6^6 possibilities, you actually have less, because you can’t use more of the same characters that are available.

Say for example you have a word of “abc”, you can make “acb”, but you can’t make “aab”, because you only have 1 “a”. Thus in the example of [1,2,3,4,5,6] you actually have 6! possibilities. We can write 6! as 6*5*4*3*2*1 = 720 possibilities.

And the rewrite of 6! is exactly what made me think of this simple algorithm:

procedure factorWord( const sBeginning: string; const sWord: string; const lstPermutations: TStrings );
var
  i, c: integer;
begin
  c := Length( sWord );

  if c = 0 then
  begin
    lstPermutations.Add( sBeginning );
  end;

  for i := 1 to c do
  begin
    factorWord( sBeginning + sWord[i], Copy( sWord, 1, i - 1 ) + Copy( sWord, i + 1 ), lstPermutations );
  end;
end;

 

You’ll initially call it for example with factorWord( ”, Edit1.Text, Memo1.Lines ); and you’ll end up with all the possible anagrams. The function works recursively on 2 sets, one that defines the entire set (Edit1.Text), and one thats starts out empty. With every recursion the possibilities of new characters to add gets smaller, while the anagram is filled with the same amount of characters in a different order, and is added to the stringlist when the recursionpath ends up with 0 remaining characters.

The biggest problem however of making anagrams is that there are simply too much possibilities when you go over about 6 characters. The worst of it, and this is after some code rewrites to append the words to a file occasionally to avoid an out-of-memory error, is that a word with 14 characters will make your textfile about the size of 2.4GB until a file streaming error occurs…

14! = 87178291200 lines, multiplied by “14 byte anagrams plus 2 bytes for the crlf”, you’ll end up with 1394852659200 bytes which amounts to 1299GB of data.

Have fun storing that on your desktop pc…

Add comment May 26, 2009

Search cmp to 0

Search: optimization “cmp eax,0″

You don’t have to compare EAX to 0 to know it’s 0, just make sure the item in EAX is edited last, with any opcode that you see in your cpu manual with the appendage that it modifies the ZF. That’s a fancy flag that says a value you just put in a register is zero.

Any mathematical, bitwise or just plain MOV into a register, modifies the Zero flag. Skip the CMP and just use either JZ (jump if zero) or JNZ (jump if not zero) to see if the zero flag is set and jump somewhere accordingly.

Add comment May 7, 2009


RSS Twitter

 

May 2009
S M T W T F S
« Apr   Jul »
 12
3456789
10111213141516
17181920212223
24252627282930
31  

Categories

Blogroll

Meta

Top Posts