Index Quiz!‎ > ‎

Index Quiz!

What follows is a series of questions you can use to measure your IQ  (Indexing Quotient.)  This list does not by any means cover all the issues you may run into, but should serve as a good starting point for new Progress programmers.

Let's say your database table CUSTOMER contains the following fields, among others:

Name character
Region character
Active logical
Balance decimal
ContactName character
Sells character
CustNum integer

And the table has (only) the following indexes:

Idx1 Idx2 Idx3  (word index) Idx4  (primary unique)
Region asc
ContactName
asc
Sells

CustNum
asc
Active asc






Balance asc








For each of the queries below, determine whether the query will return records efficiently.  In other words, identify whether just the desired records are retrieved, or whether a larger set will be read; if a larger set is read, identify the set.  Also determine which indexes are used, and what order (if any) the records will be returned.

(Note: We will ignore, for space and readability reasons, the NO-LOCK or EXCLUSIVE-LOCK that should be added to every one of these queries, and the NO-UNDO that should be on each DEFINE VARIABLE statement. Because you DID notice that. Right?)

1.  for each customer where 
customer.region = "SOUTH":
2.  for each customer where 
customer.active = yes:
3.  for each customer where 
customer.region = "MEX" and
customer.balance > 10000:
By specifying the first field of Idx1, this query correctly eliminates all non-"MEX" records up front, but must read all of the records and reject those without sufficient balances.  With the specified criteria, this query is as efficient as possible.

But consider whether you might be able to sneak in an "active = yes"; if you can, you will instantly jump two fields ahead in the index and the query will only read the records you want.  (In fact, this behavior is so much preferred that depending on the data distribution in the database it may be worth some code gyrations to force it to happen.  We'll come back to this topic in a bit.)

No sorting is done, so all of the inactives will be returned, followed by all of the actives, followed by (if there are any) all the records with the unknown value in the active field, with each grouping in order by balance.
4.  for each customer where 
customer.active = yes and
customer.balance > 10000:
5.  for each customer where 
customer.region = "ASIA" and
customer.active = yes and
customer.balance > 10000:
5a. Same query as #5 above, but add
"
by customer.balance". Explain
whether, how, and why, the behavior
of the query is changed.
6.  for each customer where
customer.sells = "apparel":
6a. Assume that in query #6 the word
"apparel" is unique within the
table. Does this affect your
answer?

If it does not, give another query
that returns the same result set
more efficiently (disregarding the
ordering of the records.)
7.  for each customer where
customer.region = "EAST" and
(customer.active = yes or
customer.active = no) and
customer.balance > 10000:
8.  for each customer where 
customer.region <> "WEST" and
customer.active = yes and
customer.balance > 10000
by customer.name:
9.  for each customer where 
customer.region = "WEST" and
customer.active = yes and
customer.balance > 10000
by customer.name:
10. for each customer where 
customer.region = "MEX" and
customer.contactname = "Leroy":
11. for each customer where 
customer.region = "MEX" and
customer.active = yes and
customer.balance > 10000 and
customer.sells contains "shoes":
11a. In query #11, are multiple
brackets constructed? Why or
why not?
11b. Describe the effect, for query #11,
of data distribution on query
completion times. How might you
be able to apply this knowledge
to make the query run faster?
12. for each customer where 
customer.region = "WEST" and
customer.active = yes and
((customer.balance > 10000 and
customer.balance < 11000) or
(customer.balance > 20000 and
customer.balance < 21000)) :
12a. Query #12 is an attempt to create
multiple brackets. Is it
successful? If it is not,
rewrite it so that it is.
12b. Is query #12 as written more 
efficient, less efficient, or
equally as efficient as query #5
above?
13.  define somevar as char init "".
for each customer where
customer.sells begins somevar:
14.  define somevar as char init "".
for each customer where
customer.sells contains somevar:


It cannot be stressed enough that no amount of performance tuning can correct application-wide indexing errors! It follows that every Progress programmer should be comfortable with index rules and efficient query construction. This topic and related ones are covered in the Progress course 4GL Performance Tuning, which I teach through Progress Software's Education Department. I recommend that course to all new Progress programmers; it is my opinion that it should be taken somewhere around three to six months after being introduced to the language, to help set good habits among those who have mastered database concepts and are now familiar with building queries in Progress.