Knowledge Discovery Nuggets Index



Knowledge Discovery Nuggets(tm) 98:1, e-mailed 98-01-05


News:
  • GPS, What Wal-Mart might do with Barbie association rules

Publications:
  • Ronny Kohavi, MineSet mailing list and
    article in Data Management Strategies
  • Shivakumar Vaithyanathan, AI Review: Special Issue on Data Mining
    on the Internet, deadline extended to Jan 30, 1998
  • Stephen Koo, Data Mining articles in Chinese,
    http://www.hkstar.com/~skoo

Siftware:
  • Raphaelle Thomas, ALICE on the Web,
    http://www.alice.fr/

Meetings:
  • Aurora Perez, SCI'98-ISAS'98 FOCUS SYMPOSIUM ON KDD,
    Orlando, Florida, July 12-16, 1998,
    http://orion.ls.fi.upm.es/~aurora/sci98/sci98.html
  • Mehran Sahami, AAAI/ICML-98 Workshop on Learning for
    Text Categorization,
    http://www.cs.cmu.edu/~mccallum/textcat.html
  • Floor Verdenius, AAAI/ICML-98 WS The Methodology of Applying
    Machine Learning,
    http://www.aifb.uni-karlsruhe.de/WBS/AAAI98/AAAIWS98.html
  • DG, EUFIT '98, Aachen, Germany, Sep 7-10, 1998,
    http://www.mitgmbh.de/elite/eufit.html
    --
    KD Nuggets is a newsletter for the
    Data Mining and Knowledge Discovery community, focusing on the
    latest research and applications.

    Submissions are most welcome and should be emailed, with a
    DESCRIPTIVE subject line (and a URL) to gps.
    Please keep CFP and meetings announcements short and provide
    a URL for details.

    To subscribe, see http://www.kdnuggets.com/subscribe.html

    KD Nuggets frequency is 2-3 times a month.
    Back issues of KD Nuggets, a catalog of data mining tools
    ('Siftware'), pointers to Data Mining Companies, Relevant Websites,
    Meetings, and more are available at the Knowledge Discovery Mine site
    at http://www.kdnuggets.com/

    -- Gregory Piatetsky-Shapiro (editor)
    gps

    ********************* Official disclaimer ***************************
    All opinions expressed herein are those of the contributors and not
    necessarily of their respective employers (or of KD Nuggets)
    *********************************************************************

    ~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'If I had six hours to chop down a tree,
    I'd spend the first four sharpening the axe'
    - Abraham Lincoln (thanks to Stephen Koo)

    Date: Mon, 5 Jan 1998
    From: GPS gps
    Subject: What Wal-Mart might do with Barbie association rules

    Here is a summary of answers to a question asked by
    Charles P. Elkan elkan@cs.columbia.edu in KDNuggets 97:35,
    who wrote:

    Wal-Mart knows that customers who buy Barbie dolls (it sells one
    every 20 seconds) have a 60% likelihood of buying one of three types of
    candy bars. What does Wal-Mart do with information like that? 'I don't
    have a clue,' says Wal-Mart's chief of merchandising, Lee Scott.

    Source: Palmeri, Christopher. Believe in yourself, believe in the merchandise.
    Forbes v160, n5 (Sep 8, 1997):118-124.
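
    The 60% figure in the question is what association-rule mining calls
    the confidence of a rule such as "Barbie -> candy". As a minimal sketch
    of how such a number could be computed, the Python below counts baskets
    in a small made-up data set. The baskets, the item names and the helper
    support_and_confidence are all hypothetical and are not Wal-Mart data.

```python
def support_and_confidence(baskets, antecedent, consequents):
    """Support and confidence of the rule: antecedent -> any item in consequents.

    baskets     -- list of sets of item names (one set per transaction)
    antecedent  -- a single item name, e.g. "barbie"
    consequents -- a set of item names, e.g. three candy bar types
    """
    with_antecedent = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_antecedent if consequents & b]
    support = len(with_both) / len(baskets)
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical transactions, chosen so the rule has 60% confidence.
CANDY = {"candy_a", "candy_b", "candy_c"}
baskets = [
    {"barbie", "candy_a"},
    {"barbie", "candy_b", "soap"},
    {"barbie", "candy_c"},
    {"barbie", "batteries"},
    {"barbie"},
    {"candy_a", "soap"},
    {"soap", "batteries"},
    {"candy_b"},
    {"batteries"},
    {"soap"},
]

support, confidence = support_and_confidence(baskets, "barbie", CANDY)
print(f"support = {support:.0%}, confidence = {confidence:.0%}")
```

    In this toy data, five baskets contain a Barbie doll and three of
    those also contain one of the candy bars, so the confidence is 3/5.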

    The best answer was given by Doron Shalvi doron@eng.umd.edu,
    who suggested the mentioned candies should be manufactured in
    the shape of a Barbie doll!

    Most other suggestions were to put the Barbie and the candy closer together,
    to package Barbie, candy and perhaps other products together,
    or to offer an 'affinity program' that gives Barbie
    accessories in exchange for proofs of purchase.

    Here are suggestions from other readers (edited for space):

    Gregory
    ---
    From: yerrams2@cis.uab.edu (Ramesh Yerramseti)

    By increasing the price of the Barbie doll and giving the candy bar away free,
    Wal-Mart can reinforce the buying habits of that particular type of buyer.
    Once this is done, the next time a buyer buys something else along with
    this combination, those items from a name-brand substitute manufacturer can
    be placed suitably in the next aisle. A reorganization of the aisle content
    will happen, which will differ geographically. Users can buy more stuff
    while spending less time, increasing sales (and credit debt, of course).

    Quantity discounts can be obtained from the manufacturer of that candy, since
    some amount of sales along with Barbie is always assured.

    ---
    From: Clemens van Brunschot c.van.brunschot@wxs.nl

    My idea (probably pretty obvious) would be that retail outlets might use
    the association rules (in general) for choosing what kind of products
    they put close to each other. Now they probably won't put the candy bars
    next to the Barbie dolls, but then they might introduce a sort of
    cross-reference sign: e.g. a sign above the Barbie dolls stating where the
    candy bars are (or the other way around).

    ---
    From: 'Desmond Lim' desmondl@tech.singalab.com.sg

    (1) Highest margin candy to be placed near dolls.
    (2) Special promotions for Barbie dolls with candy at a slightly higher
    margin.
    (3) Coupons for dolls and candy at different times or places.
    (4) Package 2 or more candy bars with Barbie dolls.

    ---

    From: Ethan E.Collopy@cs.ucl.ac.uk

    It seems likely that there will be a high correlation between
    any customer purchases and whatever is placed for sale
    at the checkout (i.e. candy) as an impulse buy. In these circumstances
    Wal-mart should attempt to maximise its profits by placing higher
    profit-margin goods near the checkout and during a specific
    period discover the maximum expenditure/profit ratio customers will
    devote to an impulse purchase.

    Otherwise, if a strong
    correlation truly exists between two products, Wal-mart can:

    - exploit discovered associations with the companies who
    manufacture the products with tie-ins
    - create buy one, get one type offers (e.g. buy a candy multipack for
    a free Barbie hairbrush!) to increase sales based on this association
    - Take a poorly selling product X and incorporate an offer on this which
    is based on buying Barbie and Candy. If the customer is likely to buy
    these two products anyway then why not try to increase sales on X?

    ---
    From: 'VPAMPATT.US.ORACLE.COM' VPAMPATT@us.oracle.com

    Probably they can not only bundle candy of type A with Barbie dolls,
    but can also introduce a new candy of type N in this bundle while
    offering a discount on the whole bundle. As the bundle is going to sell
    because of the Barbie dolls and candy of type A, candy of type N gets a
    free ride to customers' houses. And given that you come to like something
    if you see it often, candy of type N can become popular.

    Also, they can try to increase sales by hiking the price of candy of type
    A while lowering the price of Barbie dolls. The cut in the dolls' price is
    thus covered by the hike in candy of type A, bringing in more money for
    the pair together.
    As they introduce more and more new candies into this bundle, they can
    lower the dolls' price and keep hiking the candies' price. Now you have
    many candies in this bundle, so a little hike in each candy's price will
    mint you a lot of money.

    Now can you guess which software empire built its company with this idea
    in mind?

    ---
    From: 'Barry Cohen, IS' bcohen@surveys.com

    It's true that association rules and unsupervised learning can have
    good applications in business, but not every discovery is a gold
    nugget. Nevertheless, this one might have some practical
    implications.

    I wouldn't package candy bars with Barbie dolls -- the dolls may stay
    on the shelf longer than the life of the candy. In addition, the
    dolls are manufactured off-shore, while the candy is made here,
    requiring extra steps for 'in-packing.'

    Instead, how about an 'affinity program' that offers Barbie
    accessories in exchange for proofs of purchase. This rewards
    consumers for maintaining the association and benefits both
    companies.

    ---

    From: AmitSeth AmitSeth@aol.com

    A) Positioning of items.
    Marketers know that for every extra minute of 'quality' time that
    one spends in a store, there is a high degree of likelihood that that
    person will spend one extra dollar in that store (research states that
    this is true for large retail stores). With this in mind, Walmart can
    place Barbie dolls in one corner (Toys section) and Candy in a section
    that is further away from the Toys section. And, in the path that lies
    between these two sections, place special 'Kid' items - which may
    be either promotion items, high margin items, Walmart's own brand
    items, etc.

    B) Co-packaging or Co-positioning
    As you suggested, co-packaging is definitely a solution. But, this could
    have the adverse effect of ultimately bringing down the sales of the dolls -
    because the packaged items will cost more than the price of an individual
    item. Co-packaging can be helpful in promotional cases, where the candy
    may be being given away for free.

    Co-positioning allows us to increase the sales of the follower items - in
    this case candy. Barbie dolls and candy may be placed together or
    very close to each other. The classic case of chips and dips is a good
    example of this - where special racks were made in the chips section
    to accommodate dips which ultimately helped increase the sales of dips.

    C) The Bottom-Line issue.
    Consider the following ...

    Suppose the profit margin on the leader item (A) is $1 and that on the
    follower (B) is 75 cents. And, say, the confidence of the rule A -> B is
    known to be 60%.

    Suppose that we sell 10,000 of A every day. Thus, this amounts to a
    sale of 6,000 of B every day. Suppose we have a promotion offering
    A on sale, reducing our margin to 95 cents. At the same time we increase
    our margin on B to 85 cents. By our sale, we increase our sales to 11,000
    of A every day (which amounts to 6,600 of B).

    Pre-sale scenario ....
    Profit on A = 10,000 * 1 = $10,000 per day
    Profit on B = 6,000 * 0.75 = $4,500 per day
    Total Profit = $14,500 per day

    On-sale scenario ....
    Profit on A = 11,000 * 0.95 = $10,450 per day
    Profit on B = 6,600 * 0.85 = $5,610 per day
    Total Profit = $16,060 per day

    Thus, by carefully adjusting promotions and prices, we can better our
    bottom line.
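
    As a rough check of the arithmetic in (C), the sketch below recomputes
    both scenarios in Python. The function name daily_profit and the use of
    integer cents and an integer percentage are choices made here for
    illustration and are not part of the original letter. It also assumes,
    as the letter does, that follower sales stay at 60% of leader sales
    while the promotion runs.

```python
def daily_profit(units_a, margin_a_cents, margin_b_cents, confidence_pct):
    """Return (profit_a, profit_b, total) in cents for one day.

    Units of the follower B are derived from the leader A via the rule
    confidence: units_b = units_a * confidence_pct / 100.
    Integer cents avoid floating-point rounding in the totals.
    """
    units_b = units_a * confidence_pct // 100
    profit_a = units_a * margin_a_cents
    profit_b = units_b * margin_b_cents
    return profit_a, profit_b, profit_a + profit_b


def dollars(cents):
    """Format an integer number of cents as a dollar amount."""
    return f"${cents / 100:,.2f}"


# Figures taken from the letter: margins in cents, confidence of 60%.
pre_sale = daily_profit(10_000, 100, 75, 60)
on_sale = daily_profit(11_000, 95, 85, 60)

for label, (profit_a, profit_b, total) in (("Pre-sale", pre_sale), ("On-sale", on_sale)):
    print(f"{label}: A = {dollars(profit_a)}, B = {dollars(profit_b)}, total = {dollars(total)}")

print("Daily gain from the promotion:", dollars(on_sale[2] - pre_sale[2]))
```

    Under these assumptions the promotion adds $1,560 per day. The gain
    depends on the assumed 10% lift in sales of A and on the confidence
    holding steady, neither of which the letter verifies.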

    D) The Advertising Lesson.
    Retail stores advertise a lot - their advertisement budgets touch the sky.
    Most of this money is going down the drain. These inflated advertisement
    expenditures are begging to be better managed. The lesson to be learnt
    here is that one should not advertise both a leader and a follower. By
    advertising the leader, the follower automatically posts an increase in its
    sales. By using these rules for better advertisement management, companies
    can either save on advertisement, or make their dollar reach out more than
    before.


    Date: Fri, 26 Dec 1997 22:59:24 -0800
    From: Ronny Kohavi ronnyk@starry.engr.sgi.com
    Subject: MineSet mailing list and article in Data Management Strategies

    We have set up a mailing list for people interested in Silicon
    Graphics' MineSet announcements and discussions.

    To subscribe to the mineset_list mailing list, send e-mail to
    external-majordomo@postofc.corp.sgi.com
    with the BODY (subject is ignored) containing one line:
    subscribe mineset_list _your_email_address_here_

    In addition, we would like to tell you about a very nice article
    evaluating MineSet 2.0 that appeared in Data Management Strategies.

    An electronic copy is available at:
    http://mineset.sgi.com/DMStrategies/

    Some excerpts from the article:

    I examined MineSet about a year ago and was impressed with its
    capabilities. MineSet 2.0 includes some important new additions that
    make this very capable data mining tool even more impressive.

    MineSet has exceptional data visualization capabilities. But more
    important for data analysts, MineSet's data mining and data
    visualization capabilities are tightly integrated with each other...

    ...MineSet's three-dimensional landscape format, which uses a
    'fly-through' navigational format, presents uncovered patterns and
    trends to the user in a manner that is intuitive yet avoids
    cluttering and overwhelming the analyst.

    MineSet's capabilities will appeal to a variety of analysts, ranging
    from database marketing and brand managers to retail buyers and
    stockbrokers on Wall Street. In addition, scientists and engineers
    will love its data modeling, visualization, and animation facilities.
    --

    Ronny Kohavi (ronnyk@sgi.com, http://robotics.stanford.edu/~ronnyk)
    Engineering Manager, MineSet.
    Maximize the value of your data with data mining and visualization.


    Date: Sun, 21 Dec 97 08:37:25 -0800
    From: 'Shivakumar Vaithyanathan' SHIV@almaden.ibm.com
    Subject: 2nd notice: AI Review: Special Issue on Data Mining on the Internet

    Due to several requests, the deadline for the following special issue
    has been extended to the 30th of January.

    Artificial Intelligence Review:
    Special Issue on Data Mining on the Internet


    The advent of the World Wide Web has caused a dramatic increase in usage
    of the Internet. The resulting growth in on-line information combined
    with the almost chaotic nature of the web necessitates the development
    of powerful yet computationally efficient algorithms to track and tame
    this constantly evolving complex system.

    While traditionally the data mining community has dealt with
    structured databases, web mining poses problems not only due to the
    lack of structure, but also due to the intrinsic distributed nature of
    the data. Furthermore, mining on the Internet also involves dealing
    with multi-media content consisting of not only natural language
    documents but also images, audio and video streams. Several
    interesting and potentially useful applications have already been
    developed by academic researchers and industry practitioners to address
    these challenges. It is important to learn from these initial endeavors,
    if we are to develop new algorithms and interesting applications.

    The purpose of this special issue is to provide a comprehensive
    state-of-the-art overview of the technical challenges and successes
    in mining of the Internet. Of particular interest are papers
    describing both the development of novel algorithms and applications.
    Topics of interest could include but are not limited to:

    * Resource Discovery
    * Collaborative Filtering
    * Information Filtering
    * Content Mining (text, images, video, etc.)
    * Information Extraction
    * User Profiling
    * Applications, e.g., one-to-one marketing

    In addition to the call for full-length papers, we request that any
    researchers working in this area submit abstracts and/or pointers to
    recently published applications for the purposes of compiling a
    comprehensive survey of the current state-of-the-art.


    **** Instructions for submitting papers ***

    Papers should be no more than 30 printed pages (approximately 15,000
    words) with a 12-point font and 18-point spacing, including figures
    and tables. Papers must not have appeared in, nor be under
    consideration by other journals. Include a separate page specifying
    the paper's title and providing the address of the contact author for
    correspondence (including postal, telephone number, fax number, and
    e-mail address). Send FOUR copies of each submission to the guest
    editor listed below. Papers in ascii or postscript form may be
    submitted electronically. Instructions for on-line submission are
    given below.

    ==================================
    Information For on-line submission
    ==================================
    Kluwer Academic Publishers allows on-line submission
    of scientific articles via ftp and e-mail. We will make
    this system more user-friendly by incorporating it into our
    KAPIS WWW server and use Netscape as the user-interface.
    This is currently being prepared and will be implemented by
    the end of this year. Below, please find the procedure that
    should be used until then.

    - an author sends an e-mail message to 'submit@wkap.nl' containing the
    following line
    REQUEST SUBMISSIONFORM AIRE

    AIRE = Artificial Intelligence Review (the 4-letter code that is used
    at Kluwer)

    - the author receives the electronic submission form (see attachment)
    via e-mail with a dedicated file name filled in (and also the
    information that is given at point 4: the journal's four-letter code plus
    the full journal title)

    - the author fills in the submission form and sends it back to:
    'submit@wkap.nl'

    - at the same time, the author submits his/her article via anonymous ftp
    at the following address: 'ftp.wkap.nl' in the subdirectory
    INCOMING/SUBMIT, using the dedicated file name with an appropriate
    extension

    - at Kluwer, the article is registered and taken into production in the
    usual way

    ========================================================================

    ** Important Dates **

    Papers Due: January 30, 1998
    Acceptance Notification: April 1, 1998
    Final Manuscript due: July 1, 1998
    Guest Editor: Shivakumar Vaithyanathan, net.Mining, IBM Almaden Research Center,
    650 Harry Road, San Jose, CA 95120
    (408)927-2465 (Phone)
    (408)927-2240 (Fax)
    e-mail: shiv@almaden.ibm.com


    Date: Wed, 17 Dec 1997 21:32:45 +0800
    From: Stephen Koo skoo@hkstar.com
    Subject: Datamining articles in Chinese

    [The following site has data mining related articles,
    but in Chinese ! GPS]

    Dear Gregory,

    I am one of your Nuggets subscribers, and a freelance journalist
    writing datamining articles in Chinese. The articles are published
    weekly in a local Chinese newspaper. The audience is the general public
    as well as computing professionals and practitioners. I know your site is
    almost an official site for datamining sources. Please go to my site,
    http://www.hkstar.com/~skoo. If you find it useful and beneficial to the
    datamining community and to technology sharing, please add my site to
    your links. If you have any queries, please feel free to contact me.

    Stephen Koo.



    Date: Thu, 18 Dec 1997 12:49:57 +0100
    From: Raphaelle THOMAS rthomas@isoftfr.isoft.fr
    Subject: Alice on the Web
    Web: http://www.alice.fr/

    For your information, ISoft has announced ALICE on the Web, which gives
    complete Internet access to ALICE d'ISoft.

    New Features:

    Mine your data on a remote site
    Export mining results
    Visualize Decision Tree Model with editing functions on a remote workstation.

    Other news: ISoft and Valoris Technologies have signed a
    partnership agreement.

    Valoris now possesses datamining competence and offers services and training
    based on ISoft datamining solutions. In particular, Alice d'ISoft is
    integrated into Valoris Technology Workshops.

    Raphaelle Thomas
    International Development Manager
    Tel : + 33 (0)1 69 35 37 37
    Fax : + 33 (0)1 69 35 37 39
    Email : rthomas@isoft.fr


    Date: Fri, 19 Dec 1997 20:36:35 +0100
    From: Aurora Perez aurora@fi.upm.es
    Subject: SCI'98-ISAS'98 FOCUS SYMPOSIUM ON KDD
    Web: http://orion.ls.fi.upm.es/~aurora/sci98/sci98.html

    SCI'98-ISAS'98 FOCUS SYMPOSIUM ON KDD, Orlando, Florida, July 12-16, 1998

    In the context of the 1998 WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS
    AND INFORMATICS and the 1998 INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS
    ANALYSIS AND SYNTHESIS, to be held in Orlando (Florida) next July 12-16, we
    are organizing a FOCUS SYMPOSIUM on Knowledge Discovery in Databases (KDD).


    FOCUS SYMPOSIUM ORGANIZERS

    Technical University of Madrid Database Research Group
    * Maria C. Fernandez cfbaizan@fi.upm.es
    * Aurora Perez aurora@fi.upm.es

    University of Oviedo Database Research Group

    * Concepcion Perez Llera cpllera@etsiig.uniovi.es

    The topics of interest include, but are not limited to:

    * High dimensional data sets and data preprocessing
    * Data and knowledge acquisition and representation
    * Use of prior domain knowledge and re-use of discovered knowledge
    * Algorithmic complexity, efficiency and scalability issues in data
    mining
    * Distributed discovering algorithms and parallel processing
    * Unsupervised discovery
    * Clustering techniques
    * Probabilistic and statistical models and methods
    * Uncertainty management
    * Data mining tools
    * Methods for evaluating subjective interestingness and utility
    * Data and knowledge visualization
    * Applications of KDD systems

    ------------------------------------------------------------------------
    SUBMISSIONS

    Participants who wish to present a paper are requested to submit a
    condensed first draft including title, author name(s), affiliation(s),
    e-mail address(es), together with an abstract (500-1500 words) by February
    15th, 1998. Submissions must be sent by e-mail to any of the addresses of
    the Focus Symposium Organizers.

    All submitted abstracts will be reviewed on the basis of technical quality,
    novelty, significance and clarity. Acceptance notifications will be sent by
    April 15th, 1998.

    Final versions should be sent by May 15th, 1998. Accepted papers will be
    included in proceedings.

    For further information about the event see: http://www.iiis.org

    ------------------------------------------------------------------------
    IMPORTANT DATES

    * February 15th, 1998: Abstract submission deadline
    * April 15th, 1998: Author notification
    * May 15th, 1998: Final version deadline


    Date: Wed, 17 Dec 97 23:40:38 PST
    From: Mehran Sahami sahami@Robotics.Stanford.EDU
    Subject: CFP: Workshop on Learning for Text Categorization at AAAI/ICML
    Web: http://www.cs.cmu.edu/~mccallum/textcat.html

    CALL FOR PAPERS

    AAAI/ICML-98 Workshop on
    Learning for Text Categorization

    to be held July 27, 1998 in Madison, WI

    The enormous growth of on-line information has led to a comparable
    growth in the need for methods that help users organize such
    information. One area in particular that has seen much recent
    research activity is the use of automated learning techniques to
    categorize text documents. Such methods are useful for addressing
    problems including, but not limited to: keyword tagging, word sense
    disambiguation, information filtering and routing, sentence parsing,
    clustering of related documents and classification of documents into
    pre-defined topics.

    The aim of this workshop is to examine recent theoretical,
    methodological, and practical innovations from the various communities
    interested in text categorization. The workshop will cover recent
    advances from such fields as Machine Learning, Bayesian Networks,
    Information Retrieval, Natural Language Processing, Case-Based
    Reasoning, Language Modeling and Speech Recognition. By analyzing the
    different underlying assumptions and state-of-the-art methodologies
    used in text categorization research, as well as successful
    applications of this work, we hope to foster new interactions between
    researchers in this area.


    For further information, see: http://www.cs.cmu.edu/~mccallum/textcat.html

    Workshop Committee:
    Mehran Sahami (Chair) sahami@cs.stanford.edu
    Phone: (650) 725-8784 FAX: (650) 725-1449

    Mark Craven (mark.craven@cs.cmu.edu) Carnegie Mellon University
    Thorsten Joachims (thorsten@informatik.uni-dortmund.de) Universitaet Dortmund
    Andrew McCallum (mccallum@jprc.com) Justsystem Pittsburgh Research Center


    Date: Fri, 19 Dec 1997 10:40:52 +0000 (MED)
    From: Floor Verdenius F.VERDENIUS@ato.dlo.nl
    Subject: WS CfP The Methodology of Applying Machine Learning (AAAI/ICML)

    AAAI/ICML 1998 Workshop
    The Methodology of Applying Machine Learning
    (Problem Definition, Task Decomposition and Technique Selection)
    http://www.aifb.uni-karlsruhe.de/WBS/AAAI98/AAAIWS98.html

    Madison (WI), July 27, 1998

    OBJECTIVES: This workshop will focus on refining the state-of-the-art
    in applying machine learning (ML) techniques rather than documenting
    application experiences. Our objective is to analyze existing
    experience to extract guidelines for developing ML applications.

    TOPICS: (including but not limited to)
    * Frameworks for creating ML applications and reusing parts of
    previously developed applications
    * Methodologies for applying ML techniques
    * The roles of knowledge necessary for applying ML
    * Matching problem definitions to specific technique configurations
    * Relating and characterizing ML techniques with problem types
    * Embedding the ML application process in knowledge
    acquisition and system development methodologies
    * Comparing ML applications with applications of related techniques
    * Approaches that combine human and automated learning agents

    << edited GPS>>

    for SUBMISSION REQUIREMENTS and all other information see:
    http://www.aifb.uni-karlsruhe.de/WBS/AAAI98/AAAIWS98.html

    WORKSHOP ORGANISERS:
    * Floor Verdenius, ATO-DLO, Netherlands, f.verdenius@ato.dlo.nl
    * Robert Engels, University of Karlsruhe, Germany, engels@aifb.uni-karlsruhe.de
    * David W. Aha, Naval Research Laboratory (USA), aha@aic.nrl.navy.mil


    From: dg@mitgmbh.de
    Subject: EUFIT '98
    Date: 19 Dec 97 16:09:26 UT
    Web: http://www.mitgmbh.de/elite/eufit.html

    EUFIT`98 - EUFIT`98 - EUFIT`98 - EUFIT`98 - EUFIT`98 - EUFIT`98 - EUFIT`98
    - Fuzzy Logic
    - Neural Networks
    - Evolutionary Computation

    Announcement

    The ELITE Foundation (European Laboratory for Intelligent Techniques
    Engineering) is pleased to announce EUFIT `98 - The 6th European
    Congress on Intelligent Techniques and Soft Computing. EUFIT `98 aims
    to bring together scientists and practitioners from academic,
    governmental, and industrial institutions to discuss new developments
    and results in the field of intelligent technologies.

    The congress will take place in Aachen (Aix-la-Chapelle), Germany, on
    September 7-10, 1998.


    Structure of the Congress

    Tutorials: September 7, 1998
    Conference: September 8 - 10, 1998
    Exhibition: September 7 - 9, 1998
    Working Groups: September 7 - 11, 1998

    You are invited to show your interest in EUFIT `98 simply by completing the
    preregistration form and returning it promptly.

    For full information, see http://www.mitgmbh.de/elite/eufit.html

