Knowledge Discovery Nuggets Index
Knowledge Discovery Nuggets(tm) 98:17, e-mailed 98-07-29
Requests:
M2 Ingrid, Questions about practical data mining projects
Publications:
Alex Buchner, New Book: 'Decision Support using Data Mining'
http://www.ftmanagement.com/
Gerd Wagner, New Book: Foundations of Knowledge Systems
http://www.informatik.uni-leipzig.de/~gwagner/ks-abstract.html
Jerome Friedman, Paper: Additive Logistic Regression: a
Statistical View of Boosting
ftp://stat.stanford.edu/pub/friedman/boost.ps.Z
Steven Salzberg, New Book: Computational Methods in Molecular Biology
http://www.cs.jhu.edu/~salzberg/compbio-book.html
Tools/Services:
Anant Kishore, Wizsoft Press Release
http://www.wizsoft.com/
Positions:
Ed DeRouin, FL-US-Orlando, Data modeler/developer
http://www.itcTX.com
Liu Bing, Research Positions in Data Mining,
National University of Singapore, Singapore
Meetings:
Thomas Reinartz, 3rd CRISP-DM SIG Workshop,
New York, NY, 1 Sep 1998
http://www.ncr.dk/CRISP
Michael Berthold, IDA-99 Call for Papers,
Amsterdam, The Netherlands, 9-11 August 1999
http://www.wi.leidenuniv.nl/~ida99/
Hiroshi Motoda, Final CFP: PKAW98 new submission deadline, July 31
http://www.ar.sanken.osaka-u.ac.jp/PKAW98.html
Riccardo Bellazzi, IDAMAP 98 final announcement,
Brighton, UK, 25 August 1998
http://aim.unipv.it/~ric/idamap98
--
KD Nuggets is a newsletter on the latest news, publications, tools, meetings,
and other relevant items in the Data Mining and Knowledge Discovery field.
KD Nuggets is currently reaching over 4800 readers in 65+ countries
2-3 times a month.
Items relevant to data mining and knowledge discovery are welcome
and should be emailed to gps
in ASCII text or HTML format.
An item should have a subject line which clearly describes
what it is about to KDNuggets readers.
Please keep calls for papers and meeting announcements
short (50 lines or fewer, each up to 80 characters), and provide a web site for
details, such as paper submission guidelines.
All items may be edited for size.
To subscribe, see http://www.kdnuggets.com/subscribe.html
Back issues of KD Nuggets, a catalog of data mining tools
('Siftware'), pointers to data mining companies, relevant websites,
meetings, etc. are available at the KDNuggets Directory at
http://www.kdnuggets.com/
-- Gregory Piatetsky-Shapiro (editor)
gps
********************* Official disclaimer ***************************
All opinions expressed herein are those of the contributors and not
necessarily of their respective employers (or of KD Nuggets)
*********************************************************************
~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If a packet hits a pocket on a socket on a port,
And the bus is interrupted as a very last resort,
And the address of the memory makes your floppy disk abort,
Then the socket packet pocket has an error to report!
Thanks to Vincent Sabio, humournet.com
Date: Thu, 16 Jul 1998 17:04:57 +0900 (JST)
From: ingrid@soft.comp.kyutech.ac.jp (M2 Ingrid)
Subject: Questions about practical data mining projects
Hi,
Recently I have been searching frantically for detailed information
about practical data mining/knowledge discovery projects, covering:
- Application area
- Problem description
- Data description (# records, attributes,...)
- Preprocessing methods
- Data Mining technique(s)/algorithm(s) used
- Postprocessing methods
- Evaluation criteria
- Software used
Could somebody suggest possible references? Thanks, it would be greatly appreciated.
Ingrid.
e-mail: ingrid@soft.comp.kyutech.ac.jp
Date: Thu, 23 Jul 1998 12:12:54 +0100
From: 'Alex Buchner' ag.buchner@ulst.ac.uk
Subject: New Book: 'Decision Support using Data Mining'
Web: http://www.ftmanagement.com/
Our latest book 'Decision Support using Data Mining' by Sarab S. Anand
and Alex Buchner has just been published by Financial Times Pitman
Publishers (ISBN 0-273-63269-8), 168 pages.
See http://www.ftmanagement.com/
for more details.
Table of contents:
1 INTRODUCTION
2 A HISTORICAL PERSPECTIVE OF DATA MANAGEMENT
3 RELATED DISCIPLINES
4 THE DATA MINING PROCESS
5 DATA MINING GOALS AND METHODOLOGIES
6 THE DATA MINING STRATEGY
Appendix I: Case Studies
Appendix II: A Brief Market Overview
Appendix III: Glossary
Appendix IV: References
Appendix V: Further Reading
Appendix VI: Internet Resources
Check http://www.ftmanagement.com/
for more details.
Alex G. Buchner, Research Fellow
Northern Ireland Knowledge Engineering Laboratory
University of Ulster, Newtownabbey, BT37 0QB, UK
Phone:+44 (0)1232 368394 Fax: +44 (0)1232 366068
URL: http://www.infj.ulst.ac.uk/staff/ag.buchner
Date: Wed, 22 Jul 1998 12:02:38 +0200
From: Gerd Wagner gw@inf.fu-berlin.de
Subject: new book on knowledge systems
Web: http://www.informatik.uni-leipzig.de/~gwagner/ks-abstract.html
* new book * new book * new book * new book * new book *
FOUNDATIONS OF KNOWLEDGE SYSTEMS
with Applications to Databases and Agents
by Gerd Wagner
Kluwer Academic Publishers
The book presents formal concepts and models of advanced database
and knowledge base systems which may be of interest to DM&KD
researchers. From the content:
- object-relational, temporal, fuzzy, possibilistic, MLS and lineage databases
- representing explicit negative information in 'bitables'
- reaction rules and their transition systems semantics
- deduction rules and their stable generated closure semantics
An extended abstract (and the order URL) is available from
http://www.informatik.uni-leipzig.de/~gwagner/ks-abstract.html
Date: Thu, 23 Jul 1998 17:23:34 -0700 (PDT)
From: 'Jerome H. Friedman' jhf@stat.Stanford.EDU
Subject: Paper: Additive Logistic Regression: a Statistical View of Boosting
Web: ftp://stat.stanford.edu/pub/friedman/boost.ps.Z
Jerome Friedman
(jhf@stat.stanford.edu)
Trevor Hastie
(trevor@stat.stanford.edu)
Robert Tibshirani
(tibs@utstat.toronto.edu)
ABSTRACT
Boosting (Freund & Schapire 1996, Schapire & Singer 1998) is one of
the most important recent developments in classification
methodology. The performance of many classification algorithms often
can be dramatically improved by sequentially applying them to
reweighted versions of the input data, and taking a weighted majority
vote of the sequence of classifiers thereby produced. We show that
this seemingly mysterious phenomenon can be understood in terms of
well known statistical principles, namely additive modeling and
maximum likelihood. For the two-class problem, boosting can be viewed
as an approximation to additive modeling on the logistic scale using
maximum Bernoulli likelihood as a criterion. We develop more direct
approximations and show that they exhibit nearly identical results to
that of boosting. Direct multi-class generalizations based on
multinomial likelihood are derived that exhibit performance comparable
to other recently proposed multi-class generalizations of boosting in
most situations, and far superior in some. We suggest a minor
modification to boosting that can reduce computation, often by factors
of 10 to 50. Finally, we apply these insights to produce an
alternative formulation of boosting decision trees. This approach,
based on best-first truncated tree induction, often leads to better
performance, and can provide interpretable descriptions of the
aggregate decision rule. It is also much faster computationally, making
it more suitable for large-scale data mining applications.
Available by ftp from:
'ftp://stat.stanford.edu/pub/friedman/boost.ps.Z'
or 'ftp://utstat.toronto.edu/pub/tibs/boost.ps.Z'
Comments welcomed.
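The procedure summarized in the abstract (fit a classifier to reweighted
data, repeat, then take a weighted majority vote) can be illustrated with a
small sketch. The Python below is a minimal version of discrete AdaBoost
with one-feature threshold rules ("stumps") as the base classifier. It is
not code from the paper: the function names (fit_stump, stump_predict,
adaboost, predict) and the toy data in the usage note are assumptions made
for illustration only.

```python
import math


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump.

    A stump predicts `sign` when row[feature] <= threshold, else `-sign`.
    Labels in y are assumed to be +1 or -1.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign


def adaboost(X, y, rounds=10):
    """Fit stumps sequentially on reweighted data; return [(alpha, stump)]."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform observation weights
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        # Clip the error away from 0 and 1 so the logarithm stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase weights of misclassified points, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted majority vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

For example, with X = [[0], [1], [2], [3], [4], [5]] and labels
y = [1, 1, -1, -1, 1, 1], no single stump separates the two classes, but
adaboost(X, y, rounds=3) combines three stumps whose weighted vote
classifies all six training points correctly.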
Date: Thu, 16 Jul 1998 17:26:20 -0400
From: Steven Salzberg salzberg@tigr.org
Subject: New Book: Computational Methods in Molecular Biology
Web: http://www.cs.jhu.edu/~salzberg/compbio-book.html
NEW BOOK ON COMPUTATIONAL BIOLOGY
Computational Methods in Molecular Biology
Edited by Steven Salzberg, David Searls, and Simon Kasif
Published by Elsevier Science, 1998
Announcing the publication of a new book in computational biology,
with a heavy emphasis on machine learning and data mining techniques.
This book contains considerable tutorial material for those wishing
to 'break into' computational biology (a.k.a. bioinformatics) from
either the biological or the computational side. The authors include
many leading researchers from the machine learning and computational
biology communities.
For the complete table of contents, go to:
http://www.cs.jhu.edu/~salzberg/compbio-book.html
For purchasing information, go to http://www.elsevier.com, click
on 'search' and then search for 'Salzberg'. Note that very substantial
bulk discounts are available if you wish to order this book for a course!
Please contact Jane Kerr at Elsevier, j.kerr@elsevier.nl, for details on
such discounts.
==========================================================================
Steven Salzberg, Ph.D. Email: salzberg@tigr.org
The Institute for Genomic Research http://www.cs.jhu.edu/~salzberg
9712 Medical Center Drive
Rockville, MD 20850
Ph: (301)315-2537 Fax: (301)838-0208
Currently on leave from:
Department of Computer Science Ph. 410-516-8438
Johns Hopkins University Email: salzberg@cs.jhu.edu
Baltimore, MD 21218
Date: Fri, 17 Jul 98 12:03:44 -0500
From: 'Anant Kishore' Kishore@mclnet.com
Subject: Wizsoft Press Release For KDNuggets Newsletter
Web: http://www.wizsoft.com/
WizSoft, Inc. Press Release For KDNuggets Newsletter
Announcement
WizSoft Inc. recently unveiled a new sales strategy aimed at developing
strategic alliances and partnerships with major data warehouse, OLAP and
data mining vendors. For the past two years, WizSoft has focused on
fulfilling the needs of end users and has built a strong North American
customer base. Clients such as the Metropolitan Transportation Authority
of New York, Turner Broadcasting, and Lucent
Technologies have all used WizSoft products for data mining purposes.
According to Abraham Meidan, WizSoft's Chairman and CEO, 'The expansion
of the data mining market creates an excellent opportunity for DBMS and
data warehouse vendors to add significant value to their existing products.
The companies can leverage WizSoft technology by quickly embedding our OCX.
It's a win-win for both partners. Our partners can brand our products under
their own names while we leverage additional channels to proliferate our
industrial strength data mining technology more quickly.' WizSoft is
currently in the process of completing several OEM and partnership deals
with major software vendors.
About WizSoft
A leader in the data mining and knowledge discovery software industry,
WizSoft specializes in the development of products based on
association rules. WizSoft offers its customers two data mining products:
WizWhy for issuing predictions and WizRule for data auditing.
Applications for WizSoft products include Fraud Detection, Market
Research, Credit Scoring and Data Quality. The firm markets its
software through distributors to over 30,000 customers in various
businesses and institutions worldwide. Established in 1983, WizSoft
is based in Tel Aviv, Israel with US offices in the New York and
Washington, DC areas.
Corporate Contact: Ms. Irina Sered, Executive Vice President
E-mail: Isered@wizsoft.com
Sales Contact: Ms. Stacie Cayci, Account Manager
E-mail: cayci@mclnet.com
www.wizsoft.com
voice: (703) 351-7772
Date: Tue, 21 Jul 1998 14:28:02 -0400
From: 'Ed DeRouin' ed@itctx.com
Subject: Job: FL-US-Orlando Data modeler/developer
Web: http://www.itcTX.com
Intelligent Technologies Corporation (ITC) is a market leader in intelligent
fraud detection software for the healthcare, insurance, and financial
sectors and has several openings as a result of its recent growth.
DATA MODELER/DEVELOPER
We are looking for a candidate with an applied mathematics/engineering/CS
degree and several years of experience in data modeling and software
development in a UNIX C/C++ environment. The ideal candidate will have a
broad background covering the following skill sets.
- Applications of neural networks, genetic algorithms, fuzzy logic and
statistics.
- C, C++, Java programming.
- Experience in UNIX
- Good communication skills.
We offer excellent compensation plus a complete benefits package including
an employee stock options purchase plan. For consideration, send your resume
including salary history to: ITC, Job Code N, 455 Douglas Ave., Suite
2155-23, Altamonte Springs, FL 32714. Or fax to (407) 862-2490.
Or e-mail to: ed@itcTX.com.
Check out our Web site at: www.itcTX.com
Date: Sun, 26 Jul 1998 20:49:55 +0800 (GMT-8)
From: Liu Bing liub@comp.nus.edu.sg
Subject: Research Positions in Data Mining ...
******************************************************************************
RESEARCH FELLOW and RESEARCH ASSISTANT
for a Data Mining project
School of Computing
National University of Singapore, Singapore
--------------------------------------------------------------------------
Salary (Research Fellow)    : S$50,000 -- S$70,000 per annum (US$1 = S$1.68 approx).
Salary (Research Assistant) : S$30,000 -- S$50,000 per annum (US$1 = S$1.68 approx).
----------------------------------------------------------------------------
Seeking to employ a Research Fellow and two Research Assistants on a data
mining project for a period of 3 years. The project involves the design
and development of data mining tools and techniques.
Applicants for the Research Fellow position should have a Ph.D. degree
in Computer Science or a related area and be specialized in one of the
following fields:
data mining,
pattern recognition,
natural language understanding,
neural networks for data mining,
statistics,
machine learning.
Applicants for the Research Assistant position should have a Master's or
honours degree in Computer Science or a related field (preferably with
good knowledge of data mining, machine learning, and statistics), and will
be required to have excellent C and/or C++ programming skills and to be
familiar with programming in a Microsoft environment.
Applications (with a resume, transcripts and 2 references) should be sent
to the following (resume and references can be sent via e-mail).
Dr. Bing Liu
School of Computing
National University of Singapore
Lower Kent Ridge Road
Singapore 119260
E-mail: liub@comp.nus.edu.sg
web: http://www.comp.nus.edu.sg/~liub
Tel: (65) 874 6736
Fax: (65) 779 4580
Date: Tue, 7 Jul 1998 15:14:32 +0200
From: reinartz@dbag.ulm.DaimlerBenz.COM (Thomas Reinartz)
Subject: 3rd CRISP-DM SIG Workshop - Announcement in Kdnuggets
Web: http://www.ncr.dk/CRISP
* CRISP-DM Special Interest Group *
* 3rd Workshop *
* 1st September 1998 *
* 10.00 a.m. - 5 p.m. *
* NCR New York *
* 1290 6th Avenue *
* New York, NY *
* USA *
* (immediately following KDD '98 *
* only 5-6 blocks from KDD '98 venue) *
CRISP-DM - CRoss-Industry Standard Process for Data Mining - is an
initiative which aims to develop, validate and promote a standard
process model for data mining.
The core CRISP-DM consortium consists of NCR, Daimler-Benz, Integral
Solutions and OHRA. Key to the project is a Special Interest Group (SIG)
comprising data mining vendors, suppliers of related products and
services, and large-scale commercial users. The SIG's role is to provide
input on the requirements for and design of the CRISP-DM process model,
and to comment on draft versions. The CRISP-DM work is non-proprietary,
and the final process model will be made public at the end of the
project (January 1999); SIG members get early access to the materials
produced. The SIG currently has over 90 member organisations around the
world.
Two very successful one-day workshops have already been held for SIG
members in Europe (Amsterdam, November 1997; London, May 1998). The next
SIG Workshop will be held in New York on 1st September 1998 (immediately
following KDD '98).
The day will include:
- updates on the current state of the CRISP-DM process model
- input on user requirements
- feedback on the draft process model
- discussion
The previous workshops have succeeded because of the high level of input
from members. We would like to invite you to give brief presentations
(10-20 minutes) on:
- process model requirements, based on your data mining experience
- comments on the current draft CRISP-DM process model (circulated
to SIG members end of March 1998)
There will be a nominal charge of approximately $20 for the workshop,
including buffet lunch and refreshments.
(The workshop is only open to CRISP-DM SIG members, but there is no
charge for membership. If you are not already a SIG member, please tick
the appropriate box below and you will be sent the appropriate form.)
To participate, please return the attached form to:
crisp@dbag.ulm.daimlerbenz.com
You can also email this address for more information.
Looking forward to meeting you in New York.
-- The CRISP-DM Consortium.
=====================================================================
[ ] I would like to attend the CRISP-DM SIG workshop in New
York on 1st September 1998
[ ] I am already a CRISP-DM SIG member.
[ ] I am not currently a CRISP-DM SIG member, but would like to join.
[ ] I would be willing to give a presentation on my data mining
experiences and requirements for a data mining process model
[ ] I would be willing to present my comments on the current draft
process model
[ ] I am happy for workshop participants to receive a copy of my
presentation(s)
Name :
Organisation :
Postal Address:
Email :
Date: Wed, 15 Jul 1998 13:57:46 -0700 (PDT)
From: Michael Berthold berthold@ICSI.Berkeley.EDU
Subject: IDA-99 Call for Papers
Web: http://www.wi.leidenuniv.nl/~ida99/
Call for Papers
Third International Symposium on Intelligent Data Analysis (IDA-99)
Center for Mathematics and Computer Science,
Amsterdam, The Netherlands
9th-11th August 1999
Call for papers
===============
IDA-99 will take place in Amsterdam from 9th to 11th August 1999, and is
organised by Leiden University in cooperation with AAAI and NVKI. It will
consist of a stimulating program of tutorials, invited talks by leading
international experts in intelligent data analysis, contributed papers,
poster sessions, and an exciting social program.
Our aim is for IDA-99 to bring together a wide variety of researchers
concerned with extracting knowledge from data, including people from
statistics, machine learning, neural networks, computer science, pattern
recognition, database management, and other areas. The strategies adopted by
people from these areas are often different, and a synergy results if this
is recognised. IDA-99 is intended to stimulate interaction between these
different areas, so that more powerful tools emerge for extracting knowledge
from data and a better understanding is developed of the process of
intelligent data analysis.
It is the third symposium on Intelligent Data Analysis after the successful
symposia Intelligent Data Analysis 97 (http://www.dcs.bbk.ac.uk/ida97.html/)
and Intelligent Data Analysis 95.
IDA-99 Organisation
===================
General Chair: David Hand, Open University, UK
Program Chair: Joost Kok, Leiden University, The Netherlands
Program Co-Chairs: Michael Berthold, University of California, USA
Doug Fisher, Vanderbilt University
Important Dates
===============
February 1st, 1999 Deadline for submitting papers
April 15th, 1999 Notification of acceptance
May 15th, 1999 Deadline for submission of final papers
Publications
============
The proceedings will be published in the Lecture Notes in Computer Science
series of Springer (http://www.springer.de/comp/lncs/). The proceedings of
Intelligent Data Analysis 97 appeared in this series as LNCS 1280
(http://www.springer.de/comp/lncs/volumes/1280.htm).
Additional Information
======================
A list of topics of interest, guidelines for submissions, and information
about the conference site can be found on the World Wide Web Server of the
Leiden Institute for Advanced Computer Science:
http://www.wi.leidenuniv.nl/~ida99/
Date: Tue, 14 Jul 1998 14:05:56 +0900
From: motoda@ar.sanken.osaka-u.ac.jp (Hiroshi Motoda)
Subject: Final CFP: PKAW98 -- New submission deadline, July 31
Web: http://www.ar.sanken.osaka-u.ac.jp/PKAW98.html
Call for Papers
PKAW98, The 1998 Pacific Rim Knowledge Acquisition Workshop
Sponsored by PRICAI98
Venue & Date
Singapore, November 22-23, 1998
1. Introduction
The objective of this workshop is to assemble theoreticians and
practitioners concerned with developing methods and systems that
assist the knowledge acquisition process and assessing the suitability
of such methods. Thus, the workshop includes all aspects of
eliciting, acquiring, modeling and managing knowledge, and their role
in the construction of knowledge-intensive systems. Knowledge
acquisition still remains the bottleneck for building a knowledge-based
system. Reuse and sharing of knowledge bases are major issues and
no satisfactory solutions have been agreed upon yet. There is a wide
range of research. Much of the work in this field has been knowledge
acquisition from human experts. The advent of the age of digital
information has brought the problem of data overload. Our ability to
analyze and understand massive datasets lags far behind our ability to
gather and store the data. A new generation of computational
techniques and tools is required to support the acquisition of useful
knowledge from the rapidly growing volume of data. All of these are to
be discussed in this workshop.
This workshop offers an opportunity to draw together both aspects of
dealing with the situated nature of human knowledge and expertise and
of developing methods that depend more on their algorithmic adequacy
than on the expertise of the knowledge engineer.
4. Important Dates
Papers due by: July 31, 1998
Notification of Acceptance: September 10, 1998
Camera-ready version of Final Paper due: October 10, 1998
Date of Workshop: November 22-23, 1998
For the latest information, please visit
http://www.ar.sanken.osaka-u.ac.jp/PKAW98.html.
Hiroshi Motoda
Division of Intelligent Systems Science,
The Institute of Scientific and Industrial Research,
Osaka University
8-1 Mihogaoka, Ibaraki, Osaka 567-0047
Japan
E-mail motoda@sanken.osaka-u.ac.jp
Phone : 81-6-879-8540
Fax : 81-6-879-8544
Date: Wed, 22 Jul 1998 14:28:12 +0200
From: Riccardo Bellazzi ric@aim.unipv.it
Subject: IDAMAP 98 final announcement
Web: http://aim.unipv.it/~ric/idamap98
IDAMAP-98
Intelligent Data Analysis in Medicine and Pharmacology
A Workshop at the 13th European Conference on Artificial Intelligence
IDAMAP-98 is a one-day ECAI-98 workshop that will be held in Brighton,
UK, on Tuesday, August 25, 1998 prior to the start of the main ECAI
conference. The topics of the workshop are computational methods for
data analysis able to exploit the available knowledge to narrow the
gap between data gathering and data comprehension, as well as their
applications in medicine and pharmacology.
The final schedule of the workshop and the abstracts of all
accepted papers are available at the workshop web site:
http://aim.unipv.it/~ric/idamap98
=====================================================================
Riccardo Bellazzi, PhD
Dipartimento di Informatica e Sistemistica
Universita' di Pavia, via Ferrata 1, 27100 Pavia, Italy
tel: 39-382-505-511, fax:39-382-505-373
e-mail: ric@ipvaimed3.unipv.it