The On-Line Executive Journal for Data-Intensive Decision Support
*** December 9, 1997: Vol. 1, No. 10 ***
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Article retrieval instructions are at the end of this file.
For subscription information, email dssub@tgc.com
-------------------- * --------------------
IN THIS ISSUE:
PITFALLS OF SAMPLING AND SUMMARIZATION BY KAMRAN PARSAYE
THE DATA WAREHOUSE DATABASE EXPLOSION, PART II BY SID ADELMAN
UNDERSTANDING DATA MINING BY KURT THEARLING
ANALYSIS & COMMENTARY
SMALL DATA, SMALL KNOWLEDGE:
THE PITFALLS OF SAMPLING AND SUMMARIZATION
by Dr. Kamran Parsaye, Information Discovery, Inc.
Kamran Parsaye, PhD, is CEO of Information Discovery, Inc. He is one of the original developers of the concept of "data mining", having developed commercial programs for this purpose in the mid-1980s. He originated the concept of an "intelligent database", and is the author of "Intelligent Database Tools & Applications" (published by John Wiley). Dr. Parsaye has a wide range of experience in the software industry, both as a research scientist and as a high-technology business leader, having provided guidance and direction to top-level management of leading industrial and financial organizations, as well as key government entities such as the US Air Force. He received his BS and MS degrees in Mathematics from King's College, London, and his PhD in Computer Science from the University of California, Los Angeles. His books on databases are used in universities worldwide.
Parsaye writes: "When it is too daunting a task to look a large data warehouse straight in the eyes, it is tempting to try and obtain a smaller "sample" of the data for analysis. While sampling may seem to offer a short-cut to data analysis, the end results may often be less than desirable. The shyness to look at the whole data is often more expensive in the long term because we get lower quality information."
THE DATA WAREHOUSE DATABASE EXPLOSION, PART II
by Sid Adelman
Sid Adelman is President of Sid Adelman & Associates, a Sherman Oaks, California-based consulting firm specializing in data warehousing and strategic data architecture. He co-authored a methodology and project planning tool tailored for data warehouses. Sid is an international speaker at data warehouse and industry conferences. He has written a number of articles on data warehousing and has chapters on data quality and organizational and cultural issues in Data Warehouse: A Practical Guide from the Experts.
In Part I last week, Adelman noted important features of the trend toward burgeoning databases. In the concluding section, he tackles the problem of how organizations can come to grips with this incredible influx of information. He observes: "A company recently monitored which of their reports were actually being used. The sad results were that less than 25% of the reports were ever read. The company shrewdly halted the unread reports and only reinstated them when users demanded their return. Understanding the situation can reveal opportunities to improve."
UNDERSTANDING DATA MINING: IT'S ALL IN THE INTERACTION
by Kurt Thearling
Kurt Thearling is Director of Advanced Analytics at Exchange Applications, a Boston-based database marketing company, where he directs the use of data mining and visualization technology in EA's database marketing software and consulting practice. Over the past decade he has developed a number of data mining software products, including Thinking Machines' Darwin and Pilot Software's Discovery Server. He is also an independent consultant in areas related to data mining and decision support technologies. His data mining web page can be found at http://www.santafe.edu/~kurt
Thearling observes: "Data mining...extracts information from a database that the user did not know existed. Relationships between variables and customer behaviors that are non-intuitive are the jewels that data mining hopes to figure out. And because the user does not know beforehand what the data mining process has discovered, it is a much bigger leap to take the output of the system and translate it into a solution to a business problem."
ACTION ITEMS
Users awaiting Cabletron Systems, Inc.'s data warehouse for network management said they have one all-important goal: to extract business-relevant information from a glut of statistics. Cabletron and partners recently outlined plans to provide the first SQL-based management repository that correlates diverse data about networks, servers and applications. It will ship next quarter.
Tandem, a Compaq company, has announced that Europe's largest home-improvements retailer, the 400-store OBI chain, headquartered in Germany, is relying on a decision support solution from Tandem to put real-time customer buying information at the fingertips of its franchise managers. With a better understanding of their customers, store managers expect to improve their merchandise mix and in-stock positions, reduce handling costs and strengthen shopper loyalty, leading to greatly enhanced profitability.
Seagate Software, a provider of business intelligence software, has announced that Bankers Trust, after a rigorous evaluation process, will deploy Seagate Crystal Info 6 as its worldwide enterprise reporting and analysis system.
D S * INFORMATION
D S * welcomes bylined comments for publication. All comments regarding editorial content should be sent to: dseditor@tgc.com