Knowledge Discovery Nuggets Index



Knowledge Discovery Nuggets(tm) 98:17, e-mailed 98-07-29



Requests:
  • M2 Ingrid, Questions about practical data mining projects

Publications:
  • Alex Buchner, New Book: 'Decision Support using Data Mining'
    http://www.ftmanagement.com/
  • Gerd Wagner, New Book: Foundations of Knowledge Systems
    http://www.informatik.uni-leipzig.de/~gwagner/ks-abstract.html
  • Jerome Friedman, Paper: Additive Logistic Regression: a
    Statistical View of Boosting
    ftp://stat.stanford.edu/pub/friedman/boost.ps.Z
  • Steven Salzberg, New Book: Computational Methods in Molecular Biology,
    http://www.cs.jhu.edu/~salzberg/compbio-book.html

Tools/Services:
  • Anant Kishore, Wizsoft Press Release
    http://www.wizsoft.com/

Positions:
  • Ed DeRouin, FL-US-Orlando, Data modeler/developer
    http://www.itcTX.com
  • Liu Bing, Research Positions in Data Mining,
    National University of Singapore, Singapore

Meetings:
  • Thomas Reinartz, 3rd CRISP-DM SIG Workshop,
    New York, NY, 1 Sep 1998
    http://www.ncr.dk/CRISP
  • Michael Berthold, IDA-99 Call for Papers,
    Amsterdam, The Netherlands, 9-11 August 1999
    http://www.wi.leidenuniv.nl/~ida99/
  • Hiroshi Motoda, Final CFP: PKAW98 new submission deadline, July 31
    http://www.ar.sanken.osaka-u.ac.jp/PKAW98.html
  • Riccardo Bellazzi, IDAMAP 98 final announcement,
    Brighton, UK, 25 August 1998
    http://aim.unipv.it/~ric/idamap98
    --
    KD Nuggets is a newsletter on the latest news, publications, tools,
    meetings, and other relevant items
    in the Data Mining and Knowledge Discovery field.
    KD Nuggets is currently reaching over 4800 readers in 65+ countries
    2-3 times a month.

    Items relevant to data mining and knowledge discovery are welcome
    and should be emailed to gps in ASCII text or HTML format.
    An item should have a subject line which clearly describes
    what it is about to KDNuggets readers.
    Please keep calls for papers and meeting announcements
    short (50 lines or fewer, each up to 80 characters), and provide a web site for
    details, such as paper submission guidelines.
    All items may be edited for size.

    To subscribe, see http://www.kdnuggets.com/subscribe.html

    Back issues of KD Nuggets, a catalog of data mining tools
    ('Siftware'), pointers to data mining companies, relevant websites,
    meetings, etc. are available in the KDNuggets Directory at
    http://www.kdnuggets.com/

    -- Gregory Piatetsky-Shapiro (editor)
    gps

    ********************* Official disclaimer ***************************
    All opinions expressed herein are those of the contributors and not
    necessarily of their respective employers (or of KD Nuggets)
    *********************************************************************

    ~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    If a packet hits a pocket on a socket on a port,
    And the bus is interrupted as a very last resort,
    And the address of the memory makes your floppy disk abort,
    Then the socket packet pocket has an error to report!
    Thanks to Vincent Sabio, humournet.com


    Date: Thu, 16 Jul 1998 17:04:57 +0900 (JST)
    From: ingrid@soft.comp.kyutech.ac.jp (M2 Ingrid )
    Subject: Questions about practical data mining projects

    Hi,

    Recently I have been searching frantically for detailed information
    with respect to data mining/knowledge discovery i.e. practical
    projects concerning:

    - Application area
    - Problem description
    - Data description (# records, attributes,...)
    - Preprocessing methods
    - Data Mining technique(s)/algorithm(s) used
    - Postprocessing methods
    - Evaluation criteria
    - Software used

    Could somebody suggest possible references? Thanks, it would be greatly appreciated.

    Ingrid.

    e-mail: ingrid@soft.comp.kyutech.ac.jp


    Date: Thu, 23 Jul 1998 12:12:54 +0100
    From: 'Alex Buchner' ag.buchner@ulst.ac.uk
    Subject: New Book: 'Decision Support using Data Mining'
    Web: http://www.ftmanagement.com/


    Our latest book 'Decision Support using Data Mining' by Sarab S. Anand
    and Alex Buchner has just been published by Financial Times Pitman
    Publishers (ISBN 0-273-63269-8), 168 pages.

    See http://www.ftmanagement.com/ for more details.

    Table of contents:

    1 INTRODUCTION
    2 A HISTORICAL PERSPECTIVE OF DATA MANAGEMENT
    3 RELATED DISCIPLINES
    4 THE DATA MINING PROCESS
    5 DATA MINING GOALS AND METHODOLOGIES
    6 THE DATA MINING STRATEGY
    Appendix I: Case Studies
    Appendix II: A Brief Market Overview
    Appendix III: Glossary
    Appendix IV: References
    Appendix V: Further Reading
    Appendix VI: Internet Resources

    Check http://www.ftmanagement.com/ for more details.

    Alex G. Buchner, Research Fellow
    Northern Ireland Knowledge Engineering Laboratory
    University of Ulster, Newtownabbey, BT37 0QB, UK
    Phone:+44 (0)1232 368394 Fax: +44 (0)1232 366068
    URL: http://www.infj.ulst.ac.uk/staff/ag.buchner


    Date: Wed, 22 Jul 1998 12:02:38 +0200
    From: Gerd Wagner gw@inf.fu-berlin.de
    Subject: new book on knowledge systems
    Web: http://www.informatik.uni-leipzig.de/~gwagner/ks-abstract.html

    * new book * new book * new book * new book * new book *

    FOUNDATIONS OF KNOWLEDGE SYSTEMS
    with Applications to Databases and Agents
    by Gerd Wagner
    Kluwer Academic Publishers

    The book presents formal concepts and models of advanced database
    and knowledge base systems which may be of interest to DM&KD
    researchers. From the content:

    - object-relational, temporal, fuzzy, possibilistic, MLS and lineage databases
    - representing explicit negative information in 'bitables'
    - reaction rules and their transition systems semantics
    - deduction rules and their stable generated closure semantics

    An extended abstract (and the order URL) is available from
    http://www.informatik.uni-leipzig.de/~gwagner/ks-abstract.html


    Date: Thu, 23 Jul 1998 17:23:34 -0700 (PDT)
    From: 'Jerome H. Friedman' jhf@stat.Stanford.EDU
    Subject: Paper: Additive Logistic Regression: a Statistical View of Boosting
    Web: ftp://stat.stanford.edu/pub/friedman/boost.ps.Z

    Jerome Friedman
    (jhf@stat.stanford.edu)

    Trevor Hastie
    (trevor@stat.stanford.edu)

    Robert Tibshirani
    (tibs@utstat.toronto.edu)

    ABSTRACT

    Boosting (Freund & Schapire 1996, Schapire & Singer 1998) is one of
    the most important recent developments in classification
    methodology. The performance of many classification algorithms often
    can be dramatically improved by sequentially applying them to
    reweighted versions of the input data, and taking a weighted majority
    vote of the sequence of classifiers thereby produced. We show that
    this seemingly mysterious phenomenon can be understood in terms of
    well known statistical principles, namely additive modeling and
    maximum likelihood. For the two-class problem, boosting can be viewed
    as an approximation to additive modeling on the logistic scale using
    maximum Bernoulli likelihood as a criterion. We develop more direct
    approximations and show that they exhibit results nearly identical to
    those of boosting. Direct multi-class generalizations based on
    multinomial likelihood are derived that exhibit performance comparable
    to other recently proposed multi-class generalizations of boosting in
    most situations, and far superior in some. We suggest a minor
    modification to boosting that can reduce computation, often by factors
    of 10 to 50. Finally, we apply these insights to produce an
    alternative formulation of boosting decision trees. This approach,
    based on best-first truncated tree induction, often leads to better
    performance, and can provide interpretable descriptions of the
    aggregate decision rule. It is also much faster computationally, making
    it more suitable for large-scale data mining applications.

    Available by ftp from:
    'ftp://stat.stanford.edu/pub/friedman/boost.ps.Z'
    or 'ftp://utstat.toronto.edu/pub/tibs/boost.ps.Z'

    Comments welcomed.
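
    The reweight-and-vote procedure described in the abstract can be
    illustrated with a minimal sketch of discrete AdaBoost (Freund &
    Schapire), using one-feature threshold "stumps" as the base classifier.
    This is not the authors' code, and it does not implement the additive
    logistic regression variants the paper develops. The function names
    (stump_predict, fit_stump, adaboost, predict) and the toy data are
    assumptions made purely for illustration.

```python
import math


def stump_predict(stump, row):
    """Predict +1 or -1 from a (feature, threshold, sign) stump."""
    j, t, s = stump
    return s if row[j] <= t else -s


def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best_err, best_stump = None, None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for s in (1, -1):
                stump = (j, t, s)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if best_err is None or err < best_err:
                    best_err, best_stump = err, stump
    return best_err, best_stump


def adaboost(X, y, rounds):
    """Fit `rounds` stumps sequentially on reweighted data (labels are +1/-1)."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform weights
    ensemble = []
    for _ in range(rounds):
        err, stump = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        # Up-weight misclassified points, down-weight correct ones, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted majority vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Hypothetical 1-D toy data: no single threshold separates the classes.
    X = [[0], [1], [2], [3], [4], [5]]
    y = [1, 1, -1, -1, 1, 1]
    for rounds in (1, 3):
        model = adaboost(X, y, rounds)
        errors = sum(predict(model, row) != yi for row, yi in zip(X, y))
        print(rounds, "round(s):", errors, "training errors")
```

    On this toy set a single stump misclassifies two of the six points,
    while the weighted vote of three reweighted stumps classifies all six
    correctly, which is the effect the abstract refers to.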


    Date: Thu, 16 Jul 1998 17:26:20 -0400
    From: Steven Salzberg salzberg@tigr.org
    Subject: New Book: Computational Methods in Molecular Biology
    Web: http://www.cs.jhu.edu/~salzberg/compbio-book.html

    NEW BOOK ON COMPUTATIONAL BIOLOGY

    Computational Methods in Molecular Biology

    Edited by Steven Salzberg, David Searls, and Simon Kasif
    Published by Elsevier Science, 1998

    Announcing the publication of a new book in computational biology,
    with a heavy emphasis on machine learning and data mining techniques.
    This book contains considerable tutorial material for those wishing
    to 'break into' computational biology (a.k.a. bioinformatics) from
    either the biological or the computational side. The authors include
    many leading researchers from the machine learning and computational
    biology communities.

    For the complete table of contents, go to:
    http://www.cs.jhu.edu/~salzberg/compbio-book.html

    For purchasing information, go to http://www.elsevier.com, click
    on 'search' and then search for 'Salzberg'. Note that very substantial
    bulk discounts are available if you wish to order this book for a course!
    Please contact Jane Kerr at Elsevier, j.kerr@elsevier.nl, for details on
    such discounts.

    ==========================================================================
    Steven Salzberg, Ph.D. Email: salzberg@tigr.org
    The Institute for Genomic Research http://www.cs.jhu.edu/~salzberg
    9712 Medical Center Drive
    Rockville, MD 20850
    Ph: (301)315-2537 Fax: (301)838-0208
    Currently on leave from:
    Department of Computer Science Ph. 410-516-8438
    Johns Hopkins University Email: salzberg@cs.jhu.edu
    Baltimore, MD 21218


    Date: Fri, 17 Jul 98 12:03:44 -0500
    From: 'Anant Kishore' Kishore@mclnet.com
    Subject: Wizsoft Press Release For KDNuggets Newsletter
    Web: http://www.wizsoft.com/

    WizSoft, Inc. Press Release For KDNuggets Newsletter

    Announcement

    WizSoft Inc. recently unveiled a new sales strategy aimed at developing
    strategic alliances and partnerships with major data warehouse, OLAP and
    data mining vendors. For the past two years, WizSoft has focused on
    fulfilling the needs of end users and has built a strong North American
    customer base. Clients such as the Metropolitan Transportation Authority
    of New York, Turner Broadcasting, and Lucent
    Technologies have all used WizSoft products for data mining purposes.

    According to Abraham Meidan, WizSoft's Chairman and CEO, 'The expansion
    of the data mining market creates an excellent opportunity for DBMS and
    data warehouse vendors to add significant value to their existing products.
    The companies can leverage WizSoft technology by quickly embedding our OCX.
    It's a win-win for both partners. Our partners can brand our products under
    their own names while we leverage additional channels to proliferate our
    industrial-strength data mining technology more quickly.' WizSoft is
    currently in the process of completing several OEM and partnership deals
    with major software vendors.

    About WizSoft

    A leader in the data mining and knowledge discovery software industry,
    WizSoft specializes in the development of products based on
    association rules. WizSoft offers its customers two data mining products:
    WizWhy for issuing predictions and WizRule for data auditing.

    Applications for WizSoft products include Fraud Detection, Market
    Research, Credit Scoring and Data Quality. The firm markets its
    software through distributors to over 30,000 customers in various
    businesses and institutions worldwide. Established in 1983, WizSoft
    is based in Tel Aviv, Israel with US offices in the New York and
    Washington, DC areas.

    Corporate Contact:
    Ms. Irina Sered, Executive Vice President
    E-mail: Isered@wizsoft.com

    Sales Contact:
    Ms. Stacie Cayci, Account Manager
    E-mail: cayci@mclnet.com

    www.wizsoft.com
    voice: (703) 351-7772


    Date: Tue, 21 Jul 1998 14:28:02 -0400
    From: 'Ed DeRouin' ed@itctx.com
    Subject: Job: FL-US-Orlando Data modeler/developer
    Web: http://www.itcTX.com

    Intelligent Technologies Corporation (ITC) is a market leader in intelligent
    fraud detection software for the healthcare, insurance, and financial
    industries and has several openings as a result of its recent growth.

    DATA MODELER/DEVELOPER

    We are looking for a candidate with an applied mathematics/engineering/CS
    degree and several years of experience in data modeling and software
    development in a UNIX C/C++ environment. The ideal candidate will have a
    broad background covering the following skill sets.



    We offer excellent compensation plus a complete benefits package including
    an employee stock options purchase plan. For consideration, send your resume
    including salary history to: ITC, Job Code N, 455 Douglas Ave., Suite
    2155-23, Altamonte Springs, FL 32714. Or fax to (407) 862-2490.

    Or e-mail to: ed@itcTX.com.

    Check out our Web site at: www.itcTX.com


    Date: Sun, 26 Jul 1998 20:49:55 +0800 (GMT-8)
    From: Liu Bing liub@comp.nus.edu.sg
    Subject: Research Positions in Data Mining ...

    ******************************************************************************
    RESEARCH FELLOW and RESEARCH ASSISTANT
    for a Data Mining project
    School of Computing
    National University of Singapore, Singapore
    --------------------------------------------------------------------------
    Salary (Research Fellow) :
    S$50,000 -- S$70,000 per annum (US$1 = S$1.68 approx).
    (Research Assistant) :
    S$30,000 -- S$50,000 per annum (US$1 = S$1.68 approx).
    ----------------------------------------------------------------------------

    Seeking to employ a Research Fellow and two Research Assistants on a data
    mining project for a period of 3 years. The project involves the design
    and development of data mining tools and techniques.

    Applicants for the Research Fellow position should have a Ph.D. degree
    in Computer Science or a related area and be specialized in one of the
    following fields:
    data mining,
    pattern recognition,
    natural language understanding,
    neural networks for data mining,
    statistics,
    machine learning.

    Applicants for the Research Assistant position should have a Master's or
    honours degree in Computer Science or a related field (preferably with
    good knowledge of data mining, machine learning, and statistics), and will
    be required to have excellent C and/or C++ programming skills and to be
    familiar with programming in a Microsoft environment.

    Applications (with a resume, transcripts and 2 references) should be sent
    to the following (resume and references can be sent via e-mail).

    Dr. Bing Liu
    School of Computing
    National University of Singapore
    Lower Kent Ridge Road
    Singapore 119260

    E-mail: liub@comp.nus.edu.sg
    web: http://www.comp.nus.edu.sg/~liub
    Tel: (65) 874 6736
    Fax: (65) 779 4580


    Date: Tue, 7 Jul 1998 15:14:32 +0200
    From: reinartz@dbag.ulm.DaimlerBenz.COM (Thomas Reinartz)
    Subject: 3rd CRISP-DM SIG Workshop - Announcement in Kdnuggets
    Web: http://www.ncr.dk/CRISP


    * CRISP-DM Special Interest Group *
    * 3rd Workshop *
    * 1st September 1998 *
    * 10.00 a.m. - 5 p.m. *
    * NCR New York *
    * 1290 6th Avenue *
    * New York, NY *
    * USA *
    * (immediately following KDD '98; *
    * only 5-6 blocks from KDD '98 venue) *


    CRISP-DM - CRoss-Industry Standard Process for Data Mining - is an
    initiative which aims to develop, validate and promote a standard
    process model for data mining.

    The core CRISP-DM consortium consists of NCR, Daimler-Benz, Integral
    Solutions and OHRA. Key to the project is a Special Interest Group (SIG)
    comprising data mining vendors, suppliers of related products and
    services, and large-scale commercial users. The SIG's role is to provide
    input on the requirements for and design of the CRISP-DM process model,
    and to comment on draft versions. The CRISP-DM work is non-proprietary,
    and the final process model will be made public at the end of the
    project (January 1999); SIG members get early access to the materials
    produced. The SIG currently has over 90 member organisations around the
    world.

    Two very successful one-day workshops have already been held for SIG
    members in Europe (Amsterdam, November 1997; London, May 1998). The next
    SIG Workshop will be held in New York on 1st September 1998 (immediately
    following KDD '98).

    The day will include:

    - updates on the current state of the CRISP-DM process model

    - input on user requirements

    - feedback on the draft process model

    - discussion

    The previous workshops have succeeded because of the high level of input
    from members. We would like to invite you to give brief presentations
    (10-20 minutes) on:

    - process model requirements, based on your data mining experience

    - comments on the current draft CRISP-DM process model (circulated
    to SIG members end of March 1998)

    There will be a nominal charge of approximately $20 for the workshop,
    including buffet lunch and refreshments.

    (The workshop is only open to CRISP-DM SIG members, but there is no
    charge for membership. If you are not already a SIG member, please tick
    the appropriate box below and you will be sent the appropriate form.)

    To participate, please return the attached form to:

    crisp@dbag.ulm.daimlerbenz.com

    You can also email this address for more information.

    Looking forward to meeting you in New York.

    -- The CRISP-DM Consortium.

    =====================================================================

    [ ] I would like to attend the CRISP-DM SIG workshop in New
    York on 1st September 1998

    [ ] I am already a CRISP-DM SIG member.
    [ ] I am not currently a CRISP-DM SIG member, but would like to join.

    [ ] I would be willing to give a presentation on my data mining
    experiences and requirements for a data mining process model

    [ ] I would be willing to present my comments on the current draft
    process model

    [ ] I am happy for workshop participants to receive a copy of my
    presentation(s)

    Name :
    Organisation :
    Postal Address:

    Email :


    Date: Wed, 15 Jul 1998 13:57:46 -0700 (PDT)
    From: Michael Berthold berthold@ICSI.Berkeley.EDU
    Subject: IDA-99 Call for Papers
    Web: http://www.wi.leidenuniv.nl/~ida99/

    Call for Papers

    Third International Symposium on Intelligent Data Analysis (IDA-99)
    Center for Mathematics and Computer Science,
    Amsterdam, The Netherlands
    9th-11th August 1999

    Call for papers
    ===============
    IDA-99 will take place in Amsterdam from 9th to 11th August 1999, and is
    organised by Leiden University in cooperation with AAAI and NVKI. It will
    consist of a stimulating program of tutorials, invited talks by leading
    international experts in intelligent data analysis, contributed papers,
    poster sessions, and an exciting social program.
    Our aim is for IDA-99 to bring together a wide variety of researchers
    concerned with extracting knowledge from data, including people from
    statistics, machine learning, neural networks, computer science, pattern
    recognition, database management, and other areas. The strategies adopted by
    people from these areas are often different, and a synergy results if this
    is recognised. IDA-99 is intended to stimulate interaction between these
    different areas, so that more powerful tools emerge for extracting knowledge
    from data and a better understanding is developed of the process of
    intelligent data analysis.

    It is the third symposium on Intelligent Data Analysis after the successful
    symposia Intelligent Data Analysis 97 (http://www.dcs.bbk.ac.uk/ida97.html/)
    and Intelligent Data Analysis 95.

    IDA-99 Organisation
    ===================
    General Chair: David Hand, Open University, UK
    Program Chair: Joost Kok, Leiden University, The Netherlands
    Program Co-Chairs: Michael Berthold, University of California, USA
    Doug Fisher, Vanderbilt University

    Important Dates
    ===============
    February 1st, 1999 Deadline for submitting papers
    April 15th, 1999 Notification of acceptance
    May 15th, 1999 Deadline for submission of final papers

    Publications
    ============
    The proceedings will be published in the Lecture Notes in Computer Science
    series of Springer (http://www.springer.de/comp/lncs/). The proceedings of
    Intelligent Data Analysis 97 appeared in this series as LNCS 1280
    (http://www.springer.de/comp/lncs/volumes/1280.htm).

    Additional Information
    ======================
    A list of topics of interest, guidelines for submissions, and information
    about the conference-site can be found on the World Wide Web Server of the
    Leiden Institute for Advanced Computer Science:

    http://www.wi.leidenuniv.nl/~ida99/


    Date: Tue, 14 Jul 1998 14:05:56 +0900
    From: motoda@ar.sanken.osaka-u.ac.jp (Hiroshi Motoda)
    Subject: Final CFP: PKAW98 -- New submission deadline, July 31
    Web: http://www.ar.sanken.osaka-u.ac.jp/PKAW98.html

    Call for Papers
    PKAW98, The 1998 Pacific Rim Knowledge Acquisition Workshop
    Sponsored by PRICAI98

    Venue & Date
    Singapore, November 22-23, 1998

    1. Introduction

    The objective of this workshop is to assemble theoreticians and
    practitioners concerned with developing methods and systems that
    assist the knowledge acquisition process and assessing the suitability
    of such methods. Thus, the workshop includes all aspects of
    eliciting, acquiring, modeling and managing knowledge, and their role
    in the construction of knowledge-intensive systems. Knowledge
    acquisition still remains the bottleneck for building a knowledge-based
    system. Reuse and sharing of knowledge bases are major issues and
    no satisfactory solutions have been agreed upon yet. There is a wide
    range of research. Much of the work in this field has been knowledge
    acquisition from human experts. The advent of the age of digital
    information has brought the problem of data overload. Our ability to
    analyze and understand massive datasets lags far behind our ability to
    gather and store the data. A new generation of computational
    techniques and tools is required to support the acquisition of useful
    knowledge from the rapidly growing volume of data. All of these are to
    be discussed in this workshop.

    This workshop offers an opportunity to draw together both aspects of
    dealing with the situated nature of human knowledge and expertise and
    of developing methods that depend more on their algorithmic adequacy
    than on the expertise of the knowledge engineer.

    4. Important Dates

    Papers due by: July 31, 1998
    Notification of Acceptance: September 10, 1998
    Camera-ready version of Final Paper due: October 10, 1998
    Date of Workshop: November 22-23, 1998

    For the latest information, please visit
    http://www.ar.sanken.osaka-u.ac.jp/PKAW98.html.

    Hiroshi Motoda
    Division of Intelligent Systems Science,
    The Institute of Scientific and Industrial Research,
    Osaka University
    8-1 Mihogaoka, Ibaraki, Osaka 567-0047
    Japan

    E-mail motoda@sanken.osaka-u.ac.jp
    Phone : 81-6-879-8540
    Fax : 81-6-879-8544


    Date: Wed, 22 Jul 1998 14:28:12 +0200
    From: Riccardo Bellazzi ric@aim.unipv.it
    Subject: IDAMAP 98 final announcement
    Web: http://aim.unipv.it/~ric/idamap98

    IDAMAP-98
    Intelligent Data Analysis in Medicine and Pharmacology
    A Workshop at the 13th European Conference on Artificial Intelligence

    IDAMAP-98 is a one-day ECAI-98 workshop that will be held in Brighton,
    UK, on Tuesday, August 25, 1998, prior to the start of the main ECAI
    conference. The topics of the workshop are computational methods for
    data analysis able to exploit the available knowledge to narrow the
    gap between data gathering and data comprehension, as well as their
    applications in medicine and pharmacology.

    The final schedule of the workshop and the abstracts of all
    accepted papers are available at the workshop web site:

    http://aim.unipv.it/~ric/idamap98

    =====================================================================

    Riccardo Bellazzi, PhD
    Dipartimento di Informatica e Sistemistica
    Universita' di Pavia, via Ferrata 1, 27100 Pavia, Italy
    tel: 39-382-505-511, fax:39-382-505-373
    e-mail: ric@ipvaimed3.unipv.it

