Index Quiz!‎ > ‎

Index Quiz!

What follows is a series of questions you can use to measure your IQ  (Indexing Quotient.)  This list does not by any means cover all the issues you may run into, but should serve as a good starting point for new Progress programmers.

Let's say your database table CUSTOMER contains the following fields, among others:

Name character
Region character
Active logical
Balance decimal
ContactName character
Sells character
CustNum integer

And the table has (only) the following indexes:

Idx1 Idx2 Idx3  (word index) Idx4  (primary unique)
Region asc
ContactName
asc
Sells

CustNum
asc
Active asc






Balance asc








For each of the queries below, determine whether the query will return records efficiently.  In other words, identify whether just the desired records are retrieved, or whether a larger set will be read; if a larger set is read, identify the set.  Also determine which indexes are used, and what order (if any) the records will be returned.

(Note: We will ignore, for space and readability reasons, the NO-LOCK or EXCLUSIVE-LOCK that should be added to every one of these queries, and the NO-UNDO that should be on each DEFINE VARIABLE statement. Because you DID notice that. Right?)

1.  for each customer where 
customer.region = "SOUTH":
2.  for each customer where 
customer.active = yes:
3.  for each customer where 
customer.region = "MEX" and
customer.balance > 10000:
4.  for each customer where 
customer.active = yes and
customer.balance > 10000:
5.  for each customer where 
customer.region = "ASIA" and
customer.active = yes and
customer.balance > 10000:
5a. Same query as #5 above, but add
"
by customer.balance". Explain
whether, how, and why, the behavior
of the query is changed.
6.  for each customer where
customer.sells = "apparel":
6a. Assume that in query #6 the word
"apparel" is unique within the
table. Does this affect your
answer?

If it does not, give another query
that returns the same result set
more efficiently (disregarding the
ordering of the records.)
The uniqueness or commonality of a value within the table has no effect on how efficient a search is. The only way to improve the query would be to use an index -- either by changing the = to CONTAINS or to create a new index in which Customer.Sells is the first field.

We would prefer to change the code (and that is the request that was made!)  So what about:
  for each customer where
customer.sells contains "apparel":
Sorry, no.  This will use an index, but will return a different set of results -- records with "apparel" or "shoes apparel hardware" will both be returned.  You've broken your query by doing this.

How do you fix it?  Any ideas, before reading on?

How about this not-so-obvious code?
  for each customer where
customer.sells contains "apparel" and
customer.sells = "apparel":
Progress will use the word index to locate only those records with the word "apparel" in the string, and then reject those that aren't only that.  All the records containing "apparel" will still be read, but that should be a smaller set of records than the entire table, which is what the original query read.
7.  for each customer where
customer.region = "EAST" and
(customer.active = yes or
customer.active = no) and
customer.balance > 10000:
8.  for each customer where 
customer.region <> "WEST" and
customer.active = yes and
customer.balance > 10000
by customer.name:
9.  for each customer where 
customer.region = "WEST" and
customer.active = yes and
customer.balance > 10000
by customer.name:
10. for each customer where 
customer.region = "MEX" and
customer.contactname = "Leroy":
11. for each customer where 
customer.region = "MEX" and
customer.active = yes and
customer.balance > 10000 and
customer.sells contains "shoes":
11a. In query #11, are multiple
brackets constructed? Why or
why not?
11b. Describe the effect, for query #11,
of data distribution on query
completion times. How might you
be able to apply this knowledge
to make the query run faster?
12. for each customer where 
customer.region = "WEST" and
customer.active = yes and
((customer.balance > 10000 and
customer.balance < 11000) or
(customer.balance > 20000 and
customer.balance < 21000)) :
12a. Query #12 is an attempt to create
multiple brackets. Is it
successful? If it is not,
rewrite it so that it is.
12b. Is query #12 as written more 
efficient, less efficient, or
equally as efficient as query #5
above?
13.  define somevar as char init "".
for each customer where
customer.sells begins somevar:
14.  define somevar as char init "".
for each customer where
customer.sells contains somevar:


It cannot be stressed enough that no amount of performance tuning can correct application-wide indexing errors! It follows that every Progress programmer should be comfortable with index rules and efficient query construction. This topic and related ones are covered in the Progress course 4GL Performance Tuning, which I teach through Progress Software's Education Department. I recommend that course to all new Progress programmers; it is my opinion that it should be taken somewhere around three to six months after being introduced to the language, to help set good habits among those who have mastered database concepts and are now familiar with building queries in Progress.