More Data Mining with Weka
This online course teaches both the principles and the practical techniques of data mining, and lets students work on very big datasets, classify text, experiment with clustering, and much more.
on Jan 30, 2014 in Association Rules, Clustering, Data Mining with Weka, Online Education, Text Classification, Weka
Determining the Value of Insights
With the value of consumer insights being questioned and ROI needing to be justified, market research professionals need to find ways to quantify the value of those insights. Determining that value is no easy task and requires focus on three key components.
on Jan 30, 2014 in Efficiency, Insight Effectiveness, Insight Quality, Market Research
Viewpoint: Why your company should NOT use “Big Data”
Hardcore analytics (and Big Data) can add value, but only marginally and only for companies that have already mastered using the data they already have. The ‘obvious’ information from your own data can get you 90%+ of the total impact, so start there. The hard part is executing the basic insights across the organization.
on Jan 27, 2014 in 80/20 Principle, Hardcore Analytics, Pair Search, Quality Score, Sort Order
Using Data Mining to Predict the Winter Olympics Medal Counts in Sochi
Could data mining techniques accurately predict the medal counts at the Olympics? A predictive model could give us an estimate of the number of medals each nation might win; but how close could we get to the actual outcomes? It was a tantalizing project …
on Jan 25, 2014 in Olympics, Russia, Sports
Split on Data Science Skills: Individual vs Team Approach
The results of the latest KDnuggets poll show an almost equal split between those who favor the individual approach and those who favor the team approach. See the counterintuitive regional differences and interesting comments.
on Jan 21, 2014 in Data Science, Poll, Skills, Team
PAN Competition: Plagiarism Detection, Author Identification, Author Profiling
Take part in one of three tasks: Plagiarism Detection - given a document, is it an original? Author Identification - given a document, who wrote it? Author Profiling - given a document, what are the author's age and gender?
on Jan 15, 2014 in Author Detection, Author Profiling, Competition, Plagiarism Detection
Interpreting Model Performance with Cost Functions
Cost functions are critical for correctly assessing the performance of data mining and predictive models. This series goes deep into the statistical properties and mathematical understanding of each cost function and explores their similarities and differences.
on Jan 13, 2014 in Cost Function, Model Performance, Online Education, Salford Systems
MADlib: Big Data Machine Learning in SQL for Data Scientists
MADlib is open source with a commercially usable BSD license; it supports Postgres and Pivotal Greenplum DBMS, and provides classification, regression, clustering, topic modeling, and other analytics for Big Data.
on Jan 6, 2014 in