Features
- Poll Results: Largest Dataset Analyzed - May 10, 2012. The median largest dataset is now in the 11-100 GB range for almost all regions, and about 20% of all analysts now have experience with terabyte-size datasets.
- Successful Text Analytics Case Studies, June 12-13, Boston - KDnuggets Discount - May 7, 2012. The Text Analytics Summit is the meeting for maximizing the commercial benefits of text analytics. Learn from leading companies including AMEX, eBay, and Disney, and save with the KDnuggets discount.
- IBM ad: using analytics to predict and reduce crime - May 6, 2012. Police analysis of crime data and finding patterns (using IBM software) helped several US cities cut serious crime by up to 30%. In the ad, a policeman stops an attempted robbery before it happens, letting the criminal walk away. Is that really a good way to reduce crime?
- Election prediction using Twitter Data: it does not work - May 8, 2012. A researcher wants to predict elections with Twitter and instead writes a survey on why election prediction using Twitter does not work.
- Top news for Apr 29 - May 5: Million Song Challenge, Data Without Borders, To Big Data and Beyond - May 6, 2012. Million Song Dataset Challenge; Data Without Borders; To Big Data and Beyond; Top jobs: Data Mining Engineer at Apple; Data Scientist / Data Miner at nPario.
Courses, Events
- Mine with the experts this summer - May 9, 2012. Through real-world knowledge and practical application, SAS Business Knowledge Series instructors teach you how to apply data mining techniques to generate true business intelligence.
- Stanford: New Online Certificates in Data Mining, Statistics - May 9, 2012. Earn world-class credentials from Stanford and demonstrate advanced knowledge in areas like Mining Massive Datasets, Financial Risk Analysis and Management, Quantitative Methods in Finance.
- MapReduce, Hive and Pig courses, in NYC and Mountain View - May 2, 2012. MapR Hive makes Hadoop accessible to users who already know SQL. Pig is a data flow language for complex data transformation. This two-day course teaches you both tools and how to use them for many standard data processing and analysis tasks.
Software
- Provalis: Sentiment analysis dictionaries for financial, political, general news - May 8, 2012. These dictionaries enhance the abilities of WordStat, a flexible content analysis and text mining tool that can quickly extract themes, trends and patterns from large collections of text data.
- Google BigQuery analytics are now available to the public - May 5, 2012. Google BigQuery is a scalable, easy-to-use web service that lets you do interactive analysis of massive datasets, up to billions of rows. BigQuery is now available to the public - sign up now.
- Dedoose 4.2: Coding and Analysis Of Qualitative Data as SaaS - May 3, 2012. Dedoose is a SaaS application which facilitates the coding and analysis of qualitative data and their integration with demographic and other quantitative data.
Jobs
- Analytics Solutions Architect at Predixion Software, telecommute - May 7, 2012. Design, model, develop and support enterprise scale predictive analytics software solutions.
- Data Scientist at Inkiru, Palo Alto - May 4, 2012. Inkiru combines predictive modeling and real-time transaction analysis with a unique combination of relevant private and public data to provide insights and actionable recommendations.
- Analytic/Forensic Technology professionals, ALL levels at Deloitte LLP, Atlanta, Boston, Chicago, Dallas, Houston, New York, San Francisco, Washington D.C. - May 3, 2012. Commercial investigations and litigations increasingly rely on collecting, preserving and analyzing vast amounts of data. AFT professionals combine forensic accounting and investigative skills with advanced technology to assist clients and their legal counsel.
- Data Mining Engineer at Apple, Inc., Cupertino, CA - May 2, 2012. The Advanced Analytics team within the Internet Services group has an opening for a craftsman skilled in Data Mining and Machine Learning.
- Data Scientist / Data Miner at nPario, Palo Alto, CA - May 2, 2012. The nPario platform uses a big-data columnar database to power the next generation of in-database analytics with specific emphasis on internet advertising and monetization. nPario has a global partnership with WPP, the world's largest ad group, and experienced leadership.
Academic/Research positions
- Researchers at Huawei Noah's Ark Lab, Hong Kong - May 8, 2012. Huawei, a large Chinese telco, is opening a new research lab (named the "Noah's Ark Lab") in Hong Kong to focus on AI and Big Data Mining research, and is looking for outstanding researchers at junior, senior and leadership levels.
- Assistant Professor, CS, data mining, machine learning, DBMS at U. of Delhi, India - May 3, 2012. The position is intended to strengthen research in the area of Intelligent Data Analysis; applicants with expertise in analytics, data mining, machine learning, computational intelligence, or database systems are encouraged.
Competitions
- RecSys Challenge 2012 - May 7, 2012. The RecSys Challenge 2012 features 2 tracks: the Benchmarking Track on Context-Aware Movie Recommendation, and the Exploratory Track on Scientific Paper Recommendation. Submissions due 8 June 2012.
- The GitHub Data Challenge - May 6, 2012. The GitHub public timeline is now easy to query and analyze. With hundreds of thousands of events in the timeline every day, there are many stories to tell and visualize. Submit your graph and description by May 21.
Publications
- Interview with Judea Pearl, Turing Award Winner, Graphical Probability Pioneer - May 9, 2012. Pearl's attempt to filter out uncertainty and noisy data has profound implications for many applications, including AI, machine learning, and natural language processing.
- Top KDnuggets tweets, Apr 30 - May 6 - May 7, 2012. Online classes from Coursera/MIT/Harvard; Data scientists will become new rocket scientists when ...; Very cool: #BigData, R and HANA. Google BigQuery now available.
- NPR: Game Giant Forced to Play Catch Up - May 4, 2012. Electronic Arts tries to catch up to Zynga's lead in social games by using data mining, machine learning, and AI. Can EA "out-Zynga" Zynga?
- Data Science Global Hackathon Report: Incompetence borne of excessive cleverness - May 4, 2012. Derek Jones reports on the Data Science Global Hackathon: what they did with the air-quality training dataset and what they learned from their mistakes.
- Video: Jake Porway on Data Without Borders - May 4, 2012. Jake Porway, Data Without Borders founder, explored how to connect data scientists with social organizations at the second Big Data for the Public Good seminar.
- Interview with Mike Stonebraker: "One Size does not fit all" - May 3, 2012. I believe that "one size does not fit all". I.e. in every vertical market I can think of, there is a way to beat legacy relational DBMSs by 1-2 orders of magnitude, says Mike Stonebraker.
- Nuts and Bolts of Data Mining: Correlation & Scatter Plots - May 2, 2012. We review two intertwined tools in the data mining arsenal: correlation and scatter plots and discuss the good, bad, and ugly things that can happen.
News Briefs
- SAS High-Performance Analytics pulls valuable insight from text, big data - May 9, 2012. SAS will add high-performance text mining to its in-memory analytics software in Q3 of 2012; partnership with Capgemini for fraud detection; success story with one of the largest bank-led credit data consortia.
- Rapid Insight Launches Predictive Healthcare Analytics Solution - May 9, 2012. Software solution enables healthcare organizations to reduce hospital readmissions through predictive modeling.
- Accel Partners Big Data Conference, May 9 - May 8, 2012. Accel Partners, the VC firm that is one of the backers of Facebook, is holding a big data conference at Stanford University.
- SlashBI: Slashdot site for Business Intelligence, Analytics - May 5, 2012. SlashBI will offer IT pros access to the latest information about business intelligence applications and analytics.
CFP - Calls for Papers
- UDM: Ubiquitous Data Mining Workshop, due May 31
- RSWEB-2012: Recommender Systems & the Social Web, due Jun 8
- KDIR: Int. Conf. on Knowledge Discovery and Information Retrieval, due Jun 14
- IEEE ICDM 2012: The 12th IEEE International Conference on Data Mining, due Jun 18
- BDA2012: Big Data Analytics, due Aug 1
- WPDM: Declarative Data Mining, due Aug 10
- COSTS: Cost Sensitive Data Mining, due Aug 10
- BioDM: Biological Data Mining and its Applications in Healthcare, due Aug 10
- WEMA: Web Entity Modeling and Applications, due Aug 10
Quote
Most damning is the lack of a single actual prediction (using Twitter). Every analysis on elections so far has been done after the fact. "I have not found a single paper predicting a future result," says Gayo-Avello. www.technologyreview.com/blog/arxiv/27812/