Would you like to make this site your homepage? It's fast and easy...
Yes, Please make this my home page!
Array Searching Techniques
- Introduction
Once you start programming, it doesn't
take long before you find a requirement to load information into an array, and
then retrieve the information by searching with some form of key. Like
with everything else: in programming, there are 'many ways to skin a
cat'! I have put together 7 different examples of searching an
array, from the simplest ( and slowest) to the more complex (but F a s t
!). I have attached 2 programs that contains all seven examples,
with a timer for each, so that you can make some simple changes to test out
which of the techniques is the most suitable for your particular
application. The first program uses a single-dimensioned array of keys,
the second example is identical, but has the extra coding for a two
dimensional array.
- Assumptions
- For the following examples we will assume that we have an array
called "Array$("
- and that the number of elements in the array is stored in a variable:
'numA'
- that the element we are searching for is in the string
"search.a$"
- The result of our search will be put in a variable called
"search.fnd". It will be set to zero if an element wasn't found.
- Example 1. Simple but slow:
search.fnd = 0 for
search.i = 1 to numA
if Array$(search.i) =
search.a$ then search.fnd = search.i
next
search.i |
I feel confident that your first
program that searched an array used a technique similar to this. That's
fine, it works, and for an array with a small number of elements, it wont be
too time consuming.
So what's wrong with this method? Very
simple, it searches every element of the array whether it gets a match or
not. So if there are 100 elements in the array, and the one we want is
at the first position, we still look at the other 99 elements - just wasting
time!
- Example 2: An improvement on Example 1.
search.fnd = 0 for
search.i = 1 to numA
if Array$(search.i) =
search.a$ then
search.fnd = search. i
search.i = numA
end if next
search.i |
Notice the small change I made
to the program :- when a match is found, I drop out of the For...Next loop by
setting 'search.i' to the number of elements.
Say an array has 100
elements, then on average, we need to look at 50 elements to find a match
(this is average only, many factors can change this). So typically, the
program will run twice as fast as the first example. By the way, take
particular note of the code I use to get out of the For....Next Loop - this is
very important. Do NOT use the method in the following example - without
going into the technical details, it can cause all sorts of problems:
Do NOT use this technique to drop out of a
For..Next loop: search.fnd =
0 for search.i = 1 to numA
if Array$(search.i) =
search.a$ then
search.fnd = search. i
goto [done] 'this
technique is WRONG
end if next
search.i [done] |
- Example 3. Reducing the number of accesses even more
If your application for some reason is more likely NOT to find a match,
then in these cases you will still be searching the entire Array. To
avoid this, we can SORT the array, and then give up the search when we reach
the point where the element should have been. Consider the following
code:
search.fnd = 0 for
search.i = 1 to numA
if Array$(search.i) =
search.a$ then
search.fnd = search.i
search.i = numA
end if if
Array$(search.i) >search.a$ then search.i = numA
next
search.i |
Notice the new line of
code. If we have sorted the array, we will know when we get past the
point where the element should have been, and can stop searching.
However, unless you are more likely to get a MISS, rather than a HIT, this
code will (on average) still take the same amount of time to execute.
Why is that? Very simple, this extra line of code is know being executed for
every element, so remembering that on average, we find a hit after 50
searches, we have still done the equivalent of 100 searches because we access
the array twice!
- Example 4. Writing the same method, but with neater code.
There is always more than one way to achieve the same result, so if
timing is critical, write some test code with a timer (see the samples
attached) to see which method is the fastest>
Asit happens, this
example will execute at much the same speed as Example 4, but to my mind, it
is far neater code, and therefore easier to understand.
search.fnd = 0
search.i = 1 while (search.a$ >
Array$(search.i)) and (search.i < numA)
search.i = search.i +
1 wend
if Array$(search.i) = search.a$ then search.fnd =
search.i |
The reason that I prefer this
technique over the one used in Example 3, is it avoids the messy bit of code
to drop out of the for...next loop. You will notice that it also only
accesses the array once for each iteration of the loop.
So why
isn't it quicker than Example 3? The answer lies in the compound 'WHILE"
statement. We only access the array once, but we also have to check on
the value of "search.i" and "AND" the result with only array
search.
- Example 5. Speeding up example 4
If we could find some way
of taking out the checking on the value of "search.i" in example 4, we would
speed up the code considerable. We do this by putting a HIGH key in the
top position of the array:
search.fnd = 0
search.i = 1 Array$(numA) =
chr$(255) while search.a$ >
Array$(search.i)
search.i = search.i +
1 wend
if Array$(search.i) = search.a$ then search.fnd =
search.i |
chr$(255) or hex value 'FF' is the
highest character in the sort order, so unless you are using this character in
your keys, all your entries will be below this one in the sorted array. Notice
now that the "WHILE" statement only needs to make one comparison.
This is the code to use for searching an array with less than about 30
elements. If you have more than 30 elements then you need Example 6 or
7.
- Example 6. Binary Search - For speed freaks!!!!!!!!
Since
you are learning about programming, Im sure you've heard the word 'binary'
before. It means: ON or OFF, 0 or 1, one of two, etc.
This
technique involves splitting our array into two parts (the bottom half, and
the top half) hence the name binary, and then deciding which half our element
should be in, and then repeat the process over and over again, until the
'half' has only one element left in it. So, if we have an array with 256
elements, it works something like this:
- 256 elements - divide into 128 and 128 - which half? 128 left to
search!
128 elements - divide into 64 and 64 - which half? 64
left to search!
64 elements - divide into 32 and 32 - which half?
32 left to search!
32 elements - divide into 16 and 16 - which
half? 16 left to search!
8 elements - divide into 4 and 4 - which
half? 4 left to search!
4 elements - divide into 2 and 2 - which
half? 2 left to search!
2 elements - divide into 1 and 1 - which
half? 1 left to search!
1 element left - Found? - OR not
found?
So we start with 256 elements and get a HIT or a MISS
after just 8 comparisons! Its not exactly like that, but close
enough. There's a couple of extra comparisons to execute so on small
arrays there is very little benefit, but once the array starts getting
bigger (more than 30 elements) - WOW - you wont believe the speed!
search.top =
numA
' Set the top of the array search.bot =
1
' set the bottom of the array search.mp =
int(numA /
2) '
set the midpoint of the array
search.fnd = 0
search.done = 0
while search.done =
0 ' repeat until
the value of done changes
if search.a$ >
Array$(search.mp)
then ' calc new
midpoint
search.bot = search.mp ' new bottom of
array
search.mp = int((search.mp+search.top) / 2)
else
if search.a$ < Array$(search.mp) then '
calc new midpoint
search.top = search.mp ' new top of array
search.mp = int((search.mp+search.bot) / 2)
else
' Aha! not GreaterThan and not LessThan -we found it
search.done = -1
search.fnd = search.mp
end if
end if if search.mp
= search.bot then search.done = -1 ' Nothing left to search!
wend
if search.a$ = Array$(search.top) then search.fnd
= search.top if search.a$ =
Array$(search.mp) then search.fnd = search.mp
|
When you're
not going to use your PC for an hour or so, run my sample programs with
numA=1000, or even more, and then compare the timings between Example 1 and
Example 6 - its just staggering! By the way, turn off your screen saver if
you are going to rest with a large number of elements - the screen saver will
take away CPU resiurces and make the timings invalid!
- Example 7. Using Windows Search Facilities
Windows provides
us with a way to search arrays using the facilities of Listbox or
Combobox. It takes just a few lines of code and will give similar speed
ratings to the binary sort techniques above.
combobox #wb.cb, Array$(, [cbClick], 0, 0, 100,
400 open "test ms search" for window as
#wb . . [search7]
print #wb.cb, "select
";search.a$ 'Tell Windows to look for our
string print #wb.cb,
"selectionindex?" ' What is
the index of the string input #wb.cb,
search.fnd
' will return 0 if not found |
- If you dont want the user to be able to see the Combobox, you can hide
it from the user (demonstrated in the download program).
- ADVANTAGES:
- Fast - about the sane as the Binary search technique
- Easy - just a couple of lines of code
- The array does not need to be sorted
- DISADVANTAGES
- Only works on single dimension arrays
- Only works on arrays of strings - not numeric data
- Array elements cannot have a value of "".
All the above examples are in a complete program that you can download from
here. The program is designed so that you can change the variable 'numA'
at the start of the program and then run timing tests to determine which
technique suits your particular application. This HTM page is also included in
the ZIP (and is the only documentation you get ... sorry).
Well that's it! You now know how to speed up your searches and
hopefully have learnt a couple of good programming techniques as well. In
case you're interested: I'm new to Liberty Basic (got my copy mid December
1997), but I have been a computer professional for 30 years and have coded in
about 10 different programming languages. The techniques above are not specific
to Liberty Basic, experienced programmers use identical techniques all the time,
no matter what language they use.
Please give me some feedback as to whether
this article has useful, or if I'm too technical, not technical enough, or
...... whatever. With a bit of encouragement, I may consider to
documenting a few more techniques on other areas.
Email: Brosco
This document
(and associated programs) is free from any copyright. Please feel free, to
copy, cut and paste, utilise, distribute, .... whatever. The only
stipulation that I have is: if you publish the contents in any form, be
courteous enough to give credit to me for the original content!