Q&A
From: Wray Buntine wray@dynaptics.com
Date: Wed, 21 Feb 2001 13:06:57 -0800
Subject: Suitability of MYSQL for Data Mining

In response to Alan Mclean's (?) question in KDnuggets News 2001 : n04 : item38 about MySQL.

MySQL has its strength as a back-end for an internet server, not in traditional database applications. It's routinely twice as fast on many apps where it is suited. MySQL has previously had the following differences with major commercial SQL systems:

* Transaction support not as good, i.e., don't use it for financial transactions.
* Row locking not traditionally supported, which makes things difficult when multiple statements are required for your transactions.
* SQL statements cannot be precompiled for efficient reuse.
* Not as well supported in distributed environments.
* Thread-safe versions require special compiling. But precompiled thread-safe versions are now around.

Some of these are being addressed in the latest versions, and some are not as critical in data mining.

* VA Linux, for instance, is using the latest distributed support in MySQL to deliver systems targeted at server farms using reader/writer CPUs sharing disks. Thus they are going for a Linux/MySQL solution to an area dominated by SUN.
* Transaction support and row locking are not as critical in data mining if you're using it as a load-rarely, read-often system.

Of course, given these caveats, MySQL is a great system. Being a successful open source project means that it is robust and is being developed and updated at a rapid pace.

Now if you're building a dynamic/embedded data mining system, such as a personalization engine, then SQL might not be the way to go anyway. You might want to use an embedded btree system for a 10 times speed-up in performance by working directly in binary and avoiding socket communication for database work. However, this assumes you have some high-quality programmers around who know about building some of the architecture needed to make this work in a distributed/multi-processor environment.
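
To make the embedded approach a little more concrete, here is a minimal sketch in Python. It is only an illustration, not how Dynaptics builds its systems. It uses the standard-library dbm module as a stand-in for an embedded btree library (depending on the backend, dbm may be hash-based rather than a true btree), and the record layout (a user id mapped to an item id and a score) and the put/get helpers are made up for the example. The point is that reads and writes are in-process calls on packed binary records, with no SQL parsing and no socket round trip.

```python
import dbm
import os
import struct
import tempfile

# Hypothetical record layout (an assumption for illustration only):
# key   = user id, 4-byte big-endian unsigned int
# value = item id (unsigned int) + score (32-bit float)
KEY = struct.Struct(">I")
VALUE = struct.Struct(">If")


def put(db, user_id, item_id, score):
    """Store one fixed-width binary record under the packed user id."""
    db[KEY.pack(user_id)] = VALUE.pack(item_id, score)


def get(db, user_id):
    """Return (item_id, score) for a user, or None if there is no record."""
    raw = db.get(KEY.pack(user_id))
    return None if raw is None else VALUE.unpack(raw)


# The store is just a file opened inside this process; there is no server.
path = os.path.join(tempfile.mkdtemp(), "profiles")
with dbm.open(path, "c") as db:
    put(db, 42, 7, 0.5)
    found = get(db, 42)
    missing = get(db, 99)
```

Note what the sketch leaves out: concurrent writers, sharing the store across machines, and recovery after a crash are all left to the application, which is exactly the architecture work mentioned above.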
Systems based on SQL are intrinsically easier to maintain if you're relying on grunt/plug-compatible programmers!!

At Dynaptics, we're not a traditional data mining company trawling large databases, but an embedded data mining company, and MySQL is a great part of our solutions.

Wray Buntine
Dir. of Advanced Dev.
Dynaptics Inc.
Copyright © 2001 KDnuggets.