最適化

In a recent programming competition, I wrote a program to execute a breadth first search through a state space of 9-character strings, in which each character was a distinct digit 1-9. The goal was, given a 3 by 3 matrix of digits 1-9, permute them so that the final matrix is
1 2 3
4 5 6
7 8 9
Permutations allowed are swaps of adjacent (horizontally or vertically) digits whose sum is prime. In
1 6 7
2 3 8
4 5 9
Swapping 1 and 6 would be allowed (7 is prime), but swapping 7 and 8 would not (15 is not prime).
The breadth first search (keeping the leaves of the tree and their parents in memory) worked, but seems to run too slow. I prevented cycles (e.g. checking the grid 132465798 twice) by caching the grids already searched in an associative array (c++map).