TAGS :Viewed: 10 - Published at: a few seconds ago

[ How can I use ByteArray to access a graph tree of 1 million leaves? ]

I have a dictionary of words and I want to make a search algorithm to determine if a given string (with length at least 3, maximum 10) exists in the dictionary.

What I thought to do was a tree where each level is a consecutive letter from the tested word. If I try to get a child for the next letter and there is none, then the word does not exist.

For example, for the word "weed", the root is w, is there a child "e" ? Yes? Does that have a child "e" ? Yes? Does THAT have a child "d" ? No? Word does not exist. Yes? Word exists.

The problem is the sheer size of the dictionary. It takes so much time to build that immense tree from a text file that my application freezes and it takes too many seconds (around 8, depends on the pc) and might trigger browsers with "swf stopped responding, stop it?"

What I want is to pre-build the tree in AIR then save it as binary. The last step is to extract the prebuilt tree, somehow. Not using readObject because that builds the giant tree with new I somehow want to cast the bytearrary as Object and access that from memory, but I have no clue how to start doing this.

Answer 1


Pre-calculating all those objects would cost a lot of time and more important - tons of memory!

If you want to search for a specific word ("weed", not all words starting with "wee") - the absolute one is a simple Object like this:

var dictionary = {
    'weed': 1,
    'other_word': 1
}

So your 'search' would be:

var search:String = 'weed';
if (dictionary[search])
    trace('exists');
else
    trace('does not exist');

Now, if you want to search for all words that start with specific symbols, there are a few options: 1) you can loop through properties in this array and collect all that start with your searched pattern into a separate array; 2) construct some data-structure based on queries

First one is trivial and will do the job most of the time, especially when you don't want to get all of the words starting with "wee", just a fixed number (break the loop). The second one is similar to what you are thinking, but you should optimize it. Saving it as a binary wont help much, I even think it will worsen the things. It's because those objects do not exist in the memory, so no matter how - you will need to create them (even if it's from BA).

You can always do some magic by yourself (again only if we are speaking for searching for words starting with specific text; for explicit word search - use Object). For example - you can put the words in specific arrays, depending on the first letter. Let's say that the words are equally spread, this means you are decreasing the search size around 30 times. It doesn't need to be perfect, as long as you get the desired results :)

Good luck, would love to see how you've managed to work it out!

Answer 2


My first thought is that you could use a Worker with a ByteArray with sharable=true to build the data without hanging the UI. This would not make the process any faster exactly, but it would make the UI behave.

What I want is to pre-build the tree in AIR then save it as binary. The last step is to extract the prebuilt tree, somehow. Not using readObject because that builds the giant tree with new I somehow want to cast the bytearrary as Object and access that from memory, but I have no clue how to start doing this.

I'm not really sure I understand your comment about readObject(). You can't just cast a ByteArray into something else. It's just an API for raw binary data. The readObject() is exactly how you would decode AMF binary data into AS3 objects in memory. This is almost certainly the fastest and most efficient way to build an object tree from binary data.