Go Back   Wordtwist Forums > Wordtwist.org Discussion > FAQs and Help Area

Reply
 
Thread Tools Display Modes
  #11  
Old 03-13-2018, 06:00 AM
bwt1213 bwt1213 is offline
Senior Member
 
Join Date: Oct 2013
Location: Pleasant Prairie, Wisconsin
Posts: 305
bwt1213 is an unknown quantity at this point
Default

Spike, look at it from the other end. START with the dictionary, then go word by word to see if the word will fit in the grid. You would have only about a quarter million words to check, and almost all of them would be eliminated after the first two letters.
Reply With Quote
  #12  
Old 03-13-2018, 06:31 AM
Spike1007 Spike1007 is online now
Senior Member
 
Join Date: Aug 2017
Posts: 416
Spike1007 has received several accolades
Default

I thought about going from that end, but without thinking about it a lot, I decided that was even less efficient. Then again, there are probably fast ways to eliminate a word as a possibility without doing a full search, so you're probably right.

Going from both ends might be reasonable. First, get a list of all 3-letter combinations that appear on the board (If my program is right, there are only 768 of those.) and arrange them in an easily-searchable format. Then go through the dictionary and compare the first three (or any string of three) letters in each word with this database. Like you noted, most words will be eliminated immediately. Extended searches will be occasionally required (especially in cases where the word is there, or almost there). I'll have to think about what it would cost to actually find a 24-letter word if it's there.

Last edited by Spike1007 : 03-13-2018 at 07:03 AM.
Reply With Quote
  #13  
Old 03-13-2018, 07:08 AM
bwt1213 bwt1213 is offline
Senior Member
 
Join Date: Oct 2013
Location: Pleasant Prairie, Wisconsin
Posts: 305
bwt1213 is an unknown quantity at this point
Default

Start with the grid and the first word in the dictionary. Make a copy of the grid. Start with the upper left corner and see if the first letter of the word appears. If not, go to the next position (any way that is most convenient). If/when you find a match on the first letter, clear that letter position on your copy of the grid. Make another copy of what you have so far. Then look at every adjacent square. If it is not cleared and matches the next letter, make another copy of the grid, clear the current letter position, and check to see if the next letter position occurs in an adjacent uncleared position. This sets up as a nice recursive routine. Any time you are stopped -- there is no match for the next letter -- return a value of FALSE. If you match all letters, return TRUE. When a value is returned at your position, if the value is true put the word on the list as "found". If the value is false, return to the original grid at that level and check the next position. When all 25 positions on the original grid have been checked, or when the word has been found, return to the original grid and the next word in the dictionary and repeat. No word is checked past the minimum number of letters. You may need to hold 25 copies of the 5 X 5 grid in memory, and you may need to recurse through 24 levels at most. For nearly all words, the first letter of the word will match a letter in the grid only a few times. The second letter of the word will be adjacent to that letter only rarely, and so on. There is no need to calculate all the possible paths or all the different ways you might place a word in the grid, no worry about possibly using the same letter twice, and you'd probably dispose of any reasonable sized dictionary in a few seconds.
Reply With Quote
  #14  
Old 03-13-2018, 08:33 AM
Spike1007 Spike1007 is online now
Senior Member
 
Join Date: Aug 2017
Posts: 416
Spike1007 has received several accolades
Default

OK, I wasn't thinking clearly. I was thinking of the number of possible paths, as opposed to finding a matching path for a particular word as a cost in finding it. That seems reasonable, but I'll still have to think some more.

You can also do it without making up to 25 copies of the grid. (In my little program, I did something similar to what I'm describing.) For a 5x5 board, number the squares from 1,...,25. For each square, build a list of permissible neighbors. (There will be from 3 to 8 of these.) I defined a "marker" array (let's call it mark(i), to tell whether square i has been used, and initialized to zero). Now, for a given word (of length N), define arrays pos(1:N) and nbr(1:N). (pos(L) tells the square position of character L in my current word path, and nbr(L) tells which neighbor (actually a pointer into the list of neighbors) I'll use for my next (L+1st) letter. Anyway, if possible, find a letter on the board that matches the first letter of the word. Suppose that's position i. (Of course, if it's not found, stop.) Mark that position (set mark(i) = 1 and set L = 1.) That kind of initializes things.

Now, for any L < N, sweep over neighbors of pos(L) looking for a match for the next letter of the word.(using nbr(L) to tell where you are in this neighborhood list). If an unmarked match is found, set L = L+1, pos(L) = the square number of the matching neighbor. If L = N, you found the word. If not, repeat. (nbr(L) should also be initialized.) If no matching neighbor is found, you have a dead end. (That doesn't mean that the search failed, only that you have made a wrong choice or false partial match.) Zero out mark(pos(L)), and set L = L-1. (If L = 0 here, you're done and the word hasn't been found.) Go back to sweeping over neighbors of pos(L), seeing if there's another match, but start from where you left off in nbr(L).)

By the way, the way you describe it, there could be multiple promising paths , but you happened on a dead end & decided the word wasn't there. (Say the next letter is an A, but there's more than one A. You just picked the wrong one.) That's why I'll back up & try a different path if one exists. I only return FALSE if I back up to the first square & run out of matching neighbors, ending up with L = 0. (Edit: Your recursive use of multiple grids may accomplish the same thing. I just can't quite see it at the moment. I've always programmed in Fortran, which (in older versions) doesn't allow for recursion. I have to do it the hard way, setting up my own stacks. That's what L, pos, and nbr are for. Also, I kind of erase my tracks as I decrement L.)

After all this, I decided that you're right to start from the dictionary, but maybe my (or Asmarandana's) "start from both ends" approach can filter out a lot of stuff right away.

By the way, if we're going through the dictionary in what seems like no time at all (especially if all the -NESS & -NESSES & whatnot have expanded it to a million words or so), we should do single word searches in milliseconds, if not microseconds. I think we have a good start (not that either of us will try it), but I'd still be interested in how we'd compare to what's actually used.

Again, I apologize to those who'd rather be reading essay-writing posts. (In my defense, I think this is somewhat relevant to WordTwist.)

Last edited by Spike1007 : 03-13-2018 at 10:00 AM.
Reply With Quote
  #15  
Old 03-13-2018, 10:12 AM
bwt1213 bwt1213 is offline
Senior Member
 
Join Date: Oct 2013
Location: Pleasant Prairie, Wisconsin
Posts: 305
bwt1213 is an unknown quantity at this point
Default

I just realized: if you set the starting board with an array indexed at zero and all the letters starting at array position 1, then the board would be bordered by empty entries. You could use the same logic as always to decide if a cell had been used (that it's empty) and you'd have no special cases to deal with for cells on the edges. SO: outer loop: get next word and set result to FALSE. Next loop in: loop over the rows, from 1 to 5. Innermost loop: loop over the columns, 1 to 5. Check the letter at current position. Does it match the first letter of the word? If no, move to next column, until you're at column 6. Then go to next row and reset column to 1. If you're at row 6, get the next word. If yes, then make a copy of the array. Set the current position to blank. Call the recursive routine. In the recursive routine, check all the array locations within one row and one column of the current position for the next letter in the word (note that the array positions bordering the letter array are all set to blank and will be skipped automatically). If there is a position that matches, make a copy of the array, set the current position to blank, and call the recursive routine to check the next letter in the word. If there is a match and there are no more letters in the word, return TRUE. If there is no match, return one level. The recursive routine will then check the next position near the previous letter until it finds another match or runs out of positions to check. When it runs out of positions to check, it returns one level and continues. At every level of recursion, if the result is TRUE at this level or some lower one, stop checking and return; it doesn't matter if the word is found in multiple places, just that it occurs once. If a word is present in the grid, it will certainly be found. The nice thing about this method is that the letters don't even have to be treated as letters. Most computers are especially fast at byte comparisons, and an "A" is just a 65. The blank spots could be set to zero or 32 and it won't make a difference because they'll never match anyway. The routine could be run in BASIC, FORTRAN, Assembly, C, Python, pretty much anything.
Reply With Quote
  #16  
Old 03-13-2018, 10:46 AM
Spike1007 Spike1007 is online now
Senior Member
 
Join Date: Aug 2017
Posts: 416
Spike1007 has received several accolades
Default

It's getting later (for me, for thinking) but I think I understand what you're saying. In spite of the way I described things, I also used row/column addressing. I dimensioned the board marker as mark(0:6,0:6) and set all "out-of-bounds" points as "used" (and hopefully never changed). For every letter position, I'd sweep over all 8 neighbors (including those "outside" the grid, but they'd be marked as used and therefore ignored. (I think that the way I described it, it would be more efficient, since there never any of these "out-of-bounds" things. They were just addressed once and for all in some a global neighborhood vector, which kind of forgot to address specifically.)

I'll have to read what you said more carefully, but I think we're just doing the same thing in different ways. (By the way, I'm using integers for my "mark" array, but yes, tests would probably be faster using shorter integers or just logical values.)

I've also thought about using integer values for 3-letter strings (aaa = 1, aab = 2, and so on) and just keeping a vector for what's available on the board (after a first pass examination, which is fast) with just TRUE/FALSE stored. Keep a vector based on these numerical values, based on whether the string appears or not. That would allow for checking a 3-letter match based on a simple arithmetic operation & a test, rather than a search. (For the hell of it, we could store the possible locations of the three-letter strings that do appear, cutting down search times if there's a match.)

Like I said, the dinosaur Fortran I use doesn't use recursion. However, I assume that it's fast in character comparisons, although I've never tested that. It could be compiler-dependent though.

Last edited by Spike1007 : 03-13-2018 at 11:16 AM.
Reply With Quote
  #17  
Old 03-14-2018, 03:57 AM
kblicksterPremium Member kblickster is offline
Premium Member
 
Join Date: Oct 2011
Posts: 7
kblickster is an unknown quantity at this point
Default

Typed the word enose by mistake and it scored so I typed enoses and it scored. I've tried it many times since and it never scores.
Reply With Quote
  #18  
Old 03-14-2018, 05:34 AM
Spike1007 Spike1007 is online now
Senior Member
 
Join Date: Aug 2017
Posts: 416
Spike1007 has received several accolades
Default

Quote:
Originally Posted by kblickster View Post
Typed the word enose by mistake and it scored so I typed enoses and it scored. I've tried it many times since and it never scores.
Do you mean that ENOSE or ENOSES doesn't work?

Last edited by Spike1007 : 03-14-2018 at 05:36 AM.
Reply With Quote
  #19  
Old 03-14-2018, 11:16 PM
kblicksterPremium Member kblickster is offline
Premium Member
 
Join Date: Oct 2011
Posts: 7
kblickster is an unknown quantity at this point
Default

I've tried both on several other boards neither one of them work. Drives me nuts because I see the word a lot and try it every time.
Reply With Quote
  #20  
Old 03-15-2018, 01:29 AM
Spike1007 Spike1007 is online now
Senior Member
 
Join Date: Aug 2017
Posts: 416
Spike1007 has received several accolades
Default

I saw them yesterday on 5x5 boards. ENOSES worked for me, but ENOSE didn't.
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Forum Jump


All times are GMT -5. The time now is 01:32 PM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2018, Jelsoft Enterprises Ltd.

About Puzzle Baron

The Puzzle Baron family of web sites has served millions and millions of puzzle enthusiasts since its inception in 2006. From cryptograms to acrostics, logic puzzles to drop quotes, patchwords to wordtwist and even sudoku, we run the gamut in word puzzles, printable puzzles and logic games.

Questions or Comments?

The word 'bought' has how many letters in it?