REDMOND, Wash., March 16, 1999 -- On November 23, 1998, Microsoft was given a "Million Dollar
Challenge" by Oracle: $1 million to anyone who could "demonstrate that SQL
Challenge" by Oracle: $1 million to anyone who could "demonstrate that SQL
Server 7.0 is not 100 times slower than the Oracle database when running a
standard business query against a large database."
The challenge was based on Oracle's time of 71.5
seconds on query five of the TPC-D benchmark. Administered by the
Transaction Processing Performance Council, the TPC-D benchmark was designed
to test the efficiency and speed of very large databases in "an ad hoc
business environment where users submitted more or less random queries
against a data warehouse." The challenge seemed straightforward enough:
build a terabyte-scale database that includes customers, suppliers, orders,
parts, and revenue over time and geography, and then answer the basic
business question of "how much of last year's revenue was shipped nationally
vs. internationally."
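In relational terms, that business question reduces to a join of orders against customer geography followed by an aggregation. A minimal sketch in Python (the rows, country codes, and home-country choice below are illustrative, not the actual TPC-D tables, which join LINEITEM, ORDERS, CUSTOMER, and NATION):

```python
from datetime import date

# Illustrative order records: (ship_date, destination_country, revenue).
orders = [
    (date(1998, 3, 1), "US", 1200.0),
    (date(1998, 6, 15), "DE", 850.0),
    (date(1998, 11, 2), "US", 430.0),
    (date(1997, 12, 30), "JP", 990.0),  # prior year, excluded from the answer
]

HOME_COUNTRY = "US"  # hypothetical "national" reference point
LAST_YEAR = 1998

# "How much of last year's revenue was shipped nationally vs. internationally?"
national = sum(r for d, c, r in orders if d.year == LAST_YEAR and c == HOME_COUNTRY)
international = sum(r for d, c, r in orders if d.year == LAST_YEAR and c != HOME_COUNTRY)

print(national, international)  # 1630.0 850.0
```

The hard part of the benchmark is not the logic, which is a one-pass aggregation, but doing it against a terabyte of data.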
Not that Oracle's query five results were causing
the Microsoft SQL group to lose sleep; while 71.5 seconds was a decent
enough mark, it certainly wasn't ground-breaking news. What stung was that
Oracle CEO Larry Ellison had turned TPC-D into such a solid public
relations coup. For in truth, although the Oracle Challenge sounded simple,
it was anything but straightforward.
"On its face the Oracle Challenge seemed credible
and fair," says Doug Leland, SQL Server group product manager for Microsoft.
"But it was a loaded deck. To put Oracle's offer in any kind of perspective,
you had to understand the minutiae of benchmark deadlines and database
creation. All in all, it was a very clever ploy on their part."
At issue was a new approach to database
architecture called "materialized views" that was making it possible for
companies to report TPC-D query execution times of less than one second, an
incredible improvement compared with the 1,000 seconds that was typical in
April 1998. But materialized views had fundamentally altered the nature of
the test and, as a result, the value of the TPC-D benchmark for evaluating
database technologies had become very unclear. So unclear, in fact, that in
February the General Council of the Transaction Processing Performance
Council voted to recommend that TPC-D be abolished and split into two
separate benchmarks.
Beyond the question of materialized views, there
was an even more important issue at stake, says Leland, and that was the
value of TPC-D as a measure of the business benefits of database
technologies. Oracle's execution times for TPC-D were achieved with a
solution that cost over $10 million, a cost far beyond the means of many
companies that would benefit from terabyte-level data warehouses and
decision support databases.
"The Oracle Challenge wasn't terribly relevant to
customers," says Leland. "It was based on a single query in a larger
benchmark, with terms and conditions that don't reflect real business
conditions, and it required an extremely expensive system. We believe that
customer benefits derive from a broader picture that takes into
consideration such issues as affordability and return on investment."
The OLAP Approach
This month, as part of the premiere event of the
"Getting Results" Webcast series, Microsoft released its response to
the same query issued in the Challenge, announcing that it had achieved an
execution time of 1.075 seconds on query five, significantly faster than
Oracle's original mark and on par with Oracle's recent result of 0.7
seconds. Microsoft's results were achieved using Microsoft SQL Server 7.0
Enterprise Edition for a total cost of less than $600,000.
"Our solution not only matches Oracle's
performance, but it does so at about one-sixteenth the price," says Leland.
"It demonstrates that Microsoft offers powerful data warehousing and
business intelligence solutions at a cost of ownership that is in line with
real-world business realities. That's the core of our approach: to provide
business solutions that drive down the cost of ownership and maximize
return."
The Microsoft solution, developed jointly with
Hewlett-Packard and running on HP hardware, is based on Microsoft's
OLAP (on-line analytical processing) technologies. An integrated component
of SQL Server 7.0, OLAP is an extremely flexible, high-performance approach
to accessing, viewing and analyzing data.
OLAP technology uses a multidimensional approach
allowing data to be ordered into descriptive categories called dimensions
(such as time, geography, product, channel, and organization), and
quantitative values called measures (including dollar sales, unit sales,
inventory, headcount, income, and expenses). Dimensions are then organized
into hierarchies: time, for example, can be broken down into years, quarters,
months, and days.
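The dimension/measure model can be sketched with plain data structures. The facts and the year → quarter → month hierarchy below are hypothetical examples, not the SQL Server OLAP Services API:

```python
# Each fact carries coordinates in several dimensions (time, geography,
# product) plus a quantitative measure (dollar sales).
facts = [
    {"time": ("1998", "Q1", "Jan"), "geo": "East", "product": "Widget", "sales": 100.0},
    {"time": ("1998", "Q1", "Feb"), "geo": "East", "product": "Widget", "sales": 120.0},
    {"time": ("1998", "Q2", "Apr"), "geo": "West", "product": "Gadget", "sales": 200.0},
]

def rollup(facts, level):
    """Aggregate sales up the time hierarchy: level 1 = year, 2 = quarter, 3 = month."""
    totals = {}
    for f in facts:
        key = f["time"][:level]  # truncate the hierarchy path to the chosen level
        totals[key] = totals.get(key, 0.0) + f["sales"]
    return totals

print(rollup(facts, 1))  # {('1998',): 420.0}
print(rollup(facts, 2))  # {('1998', 'Q1'): 220.0, ('1998', 'Q2'): 200.0}
```

Moving from summary to detail is just a change of `level`; filtering a dimension (say, `geo == "East"`) carves out a meaningful subset the same way.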
The OLAP data model makes it simple for users to
formulate complex queries, arrange data, move from summary to detailed data,
and filter data into meaningful subsets. It offers a natural, intuitive
navigational method that allows end users to view and understand the
information in their data warehouses more easily and more effectively,
helping organizations reap the greatest value from their data.
OLAP, data transformation, and metadata
management are among the key data warehousing technologies
integrated into SQL Server 7.0 as standard components. They are an important
part of the Microsoft Data Warehousing Framework, a set of open interfaces
and specifications designed to enable third-parties to develop tightly
integrated solutions that address the broadest range of real-world business
issues, while reducing the cost of acquiring, deploying, and managing
large-scale business intelligence solutions.
Putting the "Ad-Hocness" Back in TPC-D
When Microsoft and Hewlett-Packard sat down
together to address the Challenge, they knew that Oracle had heavily
optimized for TPC-D through the use of materialized views, and that the
benchmark results were therefore not very meaningful for understanding ad hoc
query performance. Industry-wide, the same optimizations had driven the rapid
escalation in TPC-D query performance that was causing the Transaction
Processing Performance Council to seriously consider drastic changes to TPC-D.
Materialized views are, in fact, a useful
technology for some types of database queries. With materialized views,
databases are optimized so that the results from specific queries can be
computed in advance, saved, and then returned at incredible speeds.
Materialized views work by pre-aggregating data for the query to be
answered. This moves most of the work that used to be done at the execution
of a query into the database design and load phase. As a result, when a
query is sent to the database, the answer has already been calculated by the
materialized view, rendering nearly instantaneous results. One significant
downside to this technology is that queries that update the materialized
view's underlying data take longer because they have to update both the base
tables and the materialized view itself.
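The trade-off can be sketched in a few lines. This toy aggregate cache stands in for a materialized view and is not how any particular database engine implements them:

```python
# Base table: rows of (region, revenue).
base = [("national", 1000.0), ("international", 400.0), ("national", 250.0)]

# "Materialize" at database load time: pre-aggregate revenue by region,
# moving the work out of query execution.
view = {}
for region, revenue in base:
    view[region] = view.get(region, 0.0) + revenue

# Query time: the answer is a dictionary lookup, not a table scan.
assert view["national"] == 1250.0

def insert_row(region, revenue):
    # Update time: every insert now pays twice, once for the base table
    # and once to keep the materialized aggregate in sync.
    base.append((region, revenue))
    view[region] = view.get(region, 0.0) + revenue

insert_row("international", 100.0)
print(view["international"])  # 500.0
```

This is exactly why materialized views reward queries that are known in advance and penalize write-heavy or ad hoc workloads.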
According to the March issue of the Transaction
Processing Performance Council's newsletter TPC Benchmark Status, "[Materialized
views are] very useful when . . . very knowledgeable users like database
administrators know the queries and the domain well in advance, can create
auxiliary structures like aggregated columns, and can optimize their
databases to run these queries. . . . The problem is that TPC-D was intended
to represent an ad hoc environment in which queries are submitted on a
random basis and are not known in advance."
To resolve the confusion over materialized
views and TPC-D, the Transaction Processing Performance Council has proposed
that TPC-D be replaced by two new benchmarks, TPC-R and TPC-H. TPC-R will
allow for continued use of materialized views. According to the TPC
Benchmark Status, TPC-H "restores the 'ad hocness' of the original benchmark
workload" of TPC-D.
Rather than try to match a result in a benchmark
test that was already in dispute, Microsoft and Hewlett-Packard chose to
craft a solution that would more closely match actual business conditions,
including the need to get maximum value from any enterprise solution. They
also wanted to develop a solution that met the original intention of TPC-D.
"There was no merit in responding to the aspect of
the challenge that was nothing more than a marketing stunt," explains
Microsoft's Douglas Leland. "Instead, what we found interesting was the
opportunity that the challenge raised for us to use our technology in a way
that focuses on what customers really need, which is affordable, scalable,
and powerful solutions that are flexible enough to answer a wide range of
business needs and issues."
The joint Microsoft-HP project used SQL Server 7.0
Enterprise Edition, SQL Server OLAP Services, and an HP NetServer LXr 8000
system with four 450MHz Pentium II Xeon processors and 4 GB of DRAM. Nine HP
NetRAID-3Si disk array controller cards were used, attached to 560 disk
drives. All told, the entire hardware setup cost $512,899.
To test the system, 21 runs were executed. Times
ranged from a high of 1.532 seconds to a low of 0.062. The mean result was
1.075 seconds. "Our joint test results prove that large-scale databases can
be created, loaded, indexed, and deployed with industry standard technology
at a low cost," says Michael Mahon, who manages the Software and System
Development Lab at Hewlett-Packard. "We have accomplished a ten-fold
increase in the size of our databases without added expense or development
time. We are committed to continuing our joint efforts and passing our
expertise along to our enterprise customers."
"The old
school of thought was that data warehousing had to be difficult and
expensive to be useful," adds Microsoft's Doug Leland. "What we've
demonstrated very clearly here is that the innovations of the Microsoft
business intelligence platform make it possible to accomplish the same
business tasks with equivalent performance to Oracle for a fraction of the
price. That shows tremendous customer value."

More Information Sources
SQL Server Web Site
SQL Server OLAP Services
©1999 Microsoft Corporation. All rights reserved.