KDD Nuggets Index




Data Mining and Knowledge Discovery Nuggets 96:4, e-mailed 96-01-29

Contents:
News:
* GPS, Recent Data Mining and Knowledge Discovery applications ?
* R. Mantaras, Strategic Task Force KDD of MLnet -- email discussion
* R. Gibson, Request for speakers on DM/KDD at U. of Central Florida?
Publications:
* B. Wuthrich, new version of 'Knowledge Discovery in Databases'
manuscript, http://www.cs.ust.hk/faculty/beat/bio.html
Siftware:
* S. Tafolla, Q-Yield, data analysis tool for semiconductor
manufacturing, http://www.quadrillion.com/quad/
Meetings:
* T. Catarci, AVI'96, Advanced Visual Interfaces workshop program:
http://www.dis.uniroma1.it/AVI96/info.html
--
KDD Nuggets is a newsletter for the Data Mining and Knowledge Discovery community,
focusing on the latest research and applications.

Contributions are most welcome and should be emailed,
with a DESCRIPTIVE subject line (and a URL, when available) to (kdd@gte.com).
E-mail add/delete requests to (kdd-request@gte.com).

Nuggets frequency is approximately weekly.
Back issues of Nuggets, a catalog of S*i*ftware (data mining tools),
and a wealth of other information on Data Mining and Knowledge Discovery
are available at the Knowledge Discovery Mine site, URL http://info.gte.com/~kdd.

-- Gregory Piatetsky-Shapiro (moderator)

********************* Official disclaimer ***********************************
* All opinions expressed herein are those of the writers (or the moderator) *
* and not necessarily of their respective employers (or GTE Laboratories) *
*****************************************************************************

~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'A good book can replace a guide,
A bad book can replace a mother-in-law'
heard from GPS's mother-in-law recently

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Mon, 29 Jan 1996 11:36:55 -0500
From: gps@gte.com (Gregory Piatetsky-Shapiro)
Subject: Recent Data Mining and Knowledge Discovery Applications ?

For a survey of industrial applications of DM/KD that will appear in
Communications of the ACM, we are looking for recent (within the last two
years) application examples. CACM is one of the largest-circulation and
most important computer publications, and we want to present the DM/KD
community and its results in the best way.

If you are or have been working on DM/KD applications, or know about good
ones, please email brief info about each to gps@gte.com,
describing:

Application name:
Description: what does it do?
Comments: other comments
Discovery methods: prediction, description, visualization, clustering, etc
Tools used: commercial DM tools, built own DM tools, DBMS, etc
Platform: hardware/OS
Current status: (deployed?, how many users, results achieved)
Contact person:
Source of information: company, developer, publication, ...

I will also summarize the received information in Nuggets.

Your input will be greatly appreciated.

-- Gregory Piatetsky-Shapiro


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Thu, 25 Jan 1996 12:57:38 +0100
From: mantaras@sinera.iiia.csic.es (Ramon Lopez de Mantaras)
Subject: announcement -- general discussion on KDD via e-mail.

Strategic Task Force KDD of MLnet (European Network of Excellence in ML)

In view of the great interest of ML in KDD, and considering the expertise
that members of MLnet can offer to this new field, it is proposed that
a general discussion on KDD take place via e-mail.

The e-mail discussion can extend beyond MLnet; in particular, it should be
coordinated with KDD Nuggets, the Machine Learning list, and the
AI-stats list.

The purpose of this e-mail discussion is to identify the most crucial issues
in KDD that the ML community can contribute to. Below, some topics for
discussion are proposed.

The e-mail discussion should run for less than 2 months.

On the basis of the contributions, people may apply to become one of the 7
experts that will write a document, based on the e-mail discussion and
additional discussions among themselves.

The coordinator of MLnet (Ramon Lopez de Mantaras : mantaras@iiia.csic.es)
will decide on the applications for becoming a member of the expert group.


Possible TOPICS FOR DISCUSSION:

1) Who will be the users of KDD tools?

AI has often claimed to offer information directly to managers or
experts. It then turned out that even AI products are used and
maintained by computer experts. Concerning KDD, some (e.g. Tej
Anand from AT&T) perceive the business user as using and maintaining
a KDD system. What is your experience? For whom do you design your
tool? Are there any real users out there?

2) What are the goals of KDD?
A variety of goals has been put forward in the KDD literature:
- allowing for better answers from a database system (e.g., the query
'who are the customers of product X' will be answered not by a
table listing the customers but by a characterization of the
customers).
- supporting data quality maintenance (the KDD tool will deliver
dependencies or rules hidden in the data; if these contradict the
domain expert's knowledge, the data must be incomplete or wrong).
- database query optimization (e.g., the KDD tool discovers queries that
cannot have a positive answer, so the database system need not look up
any entry).
- overview of database content.
- prediction of new data.

3) What is the relation between statistics and ML techniques in KDD?
Statistics might play the role of a pre-processor for a learning algorithm.
It may also be part of the kernel of a KDD system. Or it may be considered an
ancestor of KDD, now superseded by ML techniques!

4) What is the role of a knowledge base in KDD? It is often claimed
that a KB can be exploited in order to guide KDD. What are the
restrictions and constraints this imposes on the KB? Another claim is
that the KB can (partly) be constructed by KDD. What does this mean for
the relation between KA and KDD? In particular, does KDD redescribe a
given KB? Alternatively, a KB can be used to redescribe a dataset before
KDD techniques are applied.

5) Which DBMS technologies are prerequisites for KDD? The
demand for KDD is to some extent based on data handling that lags
behind the state of the art. It is hoped that KDD will give insight into
data that have been managed poorly. On the other hand, a data warehouse
is stated as a prerequisite for KDD applications.

6) The KDD search space is very large: many techniques can be applied
and application of one technique may yield a large number of clusters,
rules, classes etc. How can a user be supported in navigating this
large space?

7) Clever data preparation is often a key to success in ML. This works
when one knows what one is looking for. In KDD one may not know what
one is looking for and hence data preparation has to be performed in
the dark. How to solve this?

8) Currently there is no clear framework or methodology for KDD and its
application in an industrial context. Is such a methodology (Life
Cycle, set of methods, guidelines) needed, and what would industrial users
expect from it?


=========================================================================



_________________________________________________________________________

Ramon Lopez de Mantaras
IIIA - Artificial Intelligence Research Institute
CSIC - Spanish Scientific Research Council
Campus Universitat Autonoma de Barcelona
08193 Bellaterra, Spain

voice: +34-3-580 95 70 fax: +34-3-580 96 61
mantaras@iiia.csic.es http://www.iiia.csic.es


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Tue, 23 Jan 1996 22:07:18 -0800
From: Ryder Gibson (ryderg@iag.net)
Subject: PR?


Does your company have a PR department that would be interested in
sending a person to (sunny) Orlando, Florida to speak about data mining
to a group of approximately 30 serious Management Information Systems
students at the University of Central Florida? Whew! That was a
mouthful! Anything would help.

Ryder Gibson


>~~~Publications:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Thu, 25 Jan 1996 09:50:58 +0800
From: 'DR. BEAT WUTHRICH' (beat@cs.ust.hk)
Subject: KDD Nuggets

Dear Gregory:

A new version of the manuscript 'Knowledge Discovery in Databases' is
available via ftp. Please announce this in a forthcoming KDD Nuggets
email. In particular, three more chapters have been added.

Go to my homepage

http://www.cs.ust.hk/faculty/beat/bio.html

and click on 'Technical Report HKUST-CS96-4'.

Best regards,
Beat

------------------------------------------------------------

Beat Wuthrich, PhD
Assistant Professor, CS Dept
The Hong Kong University of Science and Technology
Clear Water Bay
Kowloon, Hong Kong


Tel. (852) 2358 7013
Fax: (852) 2358 1477
email: beat@cs.ust.hk
http://www.cs.ust.hk/faculty/beat/bio.html


>~~~Siftware:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Wed, 24 Jan 1996 12:08:54 -0600
From: stafolla@cis.usouthal.edu (Susan Tafolla)
Subject: Data Analysis tool for semiconductor manufacturing

*Name: Q-YIELD

*URL: http://www.quadrillion.com/quad/

*Description: Quadrillion has developed this data analysis product for the semiconductor manufacturing market. Q-YIELD can:


*Discovery Methods: Rule generation, visualization

*Comments: Developed in association with the National Research Council's Knowledge Systems Laboratory.
Q-YIELD has analyzed up to 2000 variables in a single analysis.

*Platform(s): SUN OS 4.1.3 or MS Windows 3.1
Hardware requirements - UNIX: 16MB RAM (32 recommended), 60MB hard disk space; PC:
8MB RAM, 3.5MB disk space. SUN Solaris 2, HP, DEC and IBM UNIX under development

*Contact: Quadrillion Corporation, 380 Pinhey Point Road, Dunrobin, Ontario, Canada K0A 1T0 Tel: (613) 832-3393, Fax: (613) 832-0547

*Status: product

*Source of Information: URL above, and company brochure

*Updated by: Susan Tafolla on 1996-01-23


>~~~Meetings:~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Mon, 29 Jan 1996 12:10:39 +0100
From: 'Tiziana Catarci' (catarci@infokit.dis.uniroma1.it)
Subject: AVI
------------------------------------------------------------------
Please accept our apologies if you receive this message multiple
times.
------------------------------------------------------------------
AVI'96

Advanced Visual Interfaces: an International Workshop

Gubbio, Italy, May 27-29, 1996


Under the patronage of the Universita' di Bari
Under the patronage of the Universita' di Roma 'La Sapienza'
Sponsored by ACM SIGMultimedia
Sponsored by IDOMENEUS (Esprit Network of Excellence No. 6606)
In cooperation with ACM-SIGCHI
In cooperation with the Esprit 8422 'FADIVA' Working Group

*******************************************************************
The Preliminary Program and Registration Form are now available at:

http://www.dis.uniroma1.it/AVI96/info.html

********************************************************************

Tiziana Catarci
Dipartimento di Informatica e Sistemistica
Universita' degli Studi di Roma 'La Sapienza'
Via Salaria 113, 00198 Roma
ITALY

Tel. +39-6-49918331
Fax. +39-6-49918331 or +39-6-85300849
E-mail catarci@infokit.dis.uniroma1.it or tic@cs.brown.edu
URL http://www.dis.uniroma1.it/AVI96/tchome.html


>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~