Figure 1. DB2
optimization in
action.
The
optimizer is
equivalent in
function to an
expert system.
An expert system
is a set of
standard rules
that, when
combined with
situational
data, returns an
"expert"
opinion. For
example, a
medical expert
system takes the
set of rules
determining
which medication
is useful for
which illness,
combines it with
data describing
the symptoms of
ailments, and
applies that
knowledge base
to a list of
input symptoms.
The DB2
optimizer
renders expert
opinions on data
retrieval
methods based on
the situational
data housed in
DB2's system
catalog and a
query input in
SQL format.
The notion of
optimizing data
access in the
DBMS is one of
the most
powerful
capabilities of
DB2. Remember,
you access DB2
data by telling
DB2 what to
retrieve, not
how to retrieve
it. Regardless
of how the data
is physically
stored and
manipulated, DB2
and SQL can
still access
that data. This
separation of
access criteria
from physical
storage
characteristics
is called
physical data
independence.
DB2's optimizer
is the component
that
accomplishes
this physical
data
independence.
If you remove
the indexes, DB2
can still access
the data
(although less
efficiently). If
you add a column
to the table
being accessed,
DB2 can still
manipulate the
data without
changing the
program code.
This is all
possible because
the physical
access paths to
DB2 data are not
coded by
programmers in
application
programs, but
are generated by
DB2.
Compare this
with non-DBMS
systems in which
the programmer
must know the
physical
structure of the
data. If there
is an index, the
programmer must
write
appropriate code
to use the
index. If
someone removes
the index, the
program will not
work unless the
programmer makes
changes. Not so
with DB2 and
SQL. All this
flexibility is
attributable to
DB2's capability
to optimize data
manipulation
requests
automatically.
The optimizer
performs complex
calculations
based on a host
of information.
To visualize how
the optimizer
works, picture
the optimizer as
performing a
four-step
process:
- Receive
and verify
the syntax
of the SQL
statement.
- Analyze
the
environment
and optimize
the method
of
satisfying
the SQL
statement.
- Create
machine-readable
instructions
to execute
the
optimized
SQL.
- Execute
the
instructions
or store
them for
future
execution.
The second
step of this
process is the
most intriguing.
How does the
optimizer decide
how to execute
the vast array
of SQL
statements that
you can send its
way?
The optimizer
has many types
of strategies
for optimizing
SQL. How does it
choose which of
these strategies
to use in the
optimized access
paths? IBM does
not publish the
actual, in-depth
details of how
the optimizer
determines the
best access
path, but the
optimizer is a
cost-based
optimizer. This
means the
optimizer will
always attempt
to formulate an
access path for
each query that
reduces overall
cost. To
accomplish this,
the DB2
optimizer
applies query
cost formulas
that evaluate
and weigh four
factors for each
potential access
path: the CPU
cost, the I/O
cost,
statistical
information in
the DB2 system
catalog, and the
actual SQL
statement.
Back
Next
|