KDnuggets : News : 2008 : n08 : item1
Features
From: Gregory Piatetsky-Shapiro
Date: 21 Apr 2008
Subject: Poll Results: More data or Better Algorithm?
The previous KDnuggets Poll asked:
45% voted for more data, while 20% voted for a more advanced algorithm, confirming my rule of thumb: more data (especially more relevant features) produces a larger improvement than a more advanced algorithm. Of course, as with all such general sayings, a lot depends on specifics.
Dean Abbott commented: ... just more data is not enough, but better features (particularly multi-variate features) can provide significant model improvement.
Greg Safarz wrote: More attributes and features wins hands down.
Jozo Kovac wrote: But what are "results"? Model accuracy, model benefits in real world, new extracted knowledge (rules) about your customers?
Alexandru Floares suggested: If the number of cases is less than 10 times number of features, and the quality is reasonable, adding data can improve the accuracy. If the data quality is low, adding data can improve the accuracy, by increasing the number of informative cases, which remain in the data set after pre-processing or cleaning the initial data.
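The first part of Alexandru Floares's suggestion can be written as a simple check. The sketch below is only an illustration: the function name `more_data_likely_helps` and the `ratio` parameter are hypothetical, and the default of 10 is taken from his comment. It encodes the cases-to-features rule of thumb, not a statistical test, and it assumes the case count is taken after cleaning.

```python
def more_data_likely_helps(n_cases: int, n_features: int, ratio: float = 10.0) -> bool:
    """Rule-of-thumb check (hypothetical helper, not from the poll):
    returns True when there are fewer than `ratio` cases per feature,
    the situation in which adding data is suggested to improve accuracy.
    """
    if n_cases < 0 or n_features <= 0:
        raise ValueError("n_cases must be >= 0 and n_features must be > 0")
    return n_cases < ratio * n_features


# Illustrative numbers only: 500 cleaned cases with 80 features
# is below the 10x threshold (800); 5000 cases is above it.
print(more_data_likely_helps(500, 80))
print(more_data_likely_helps(5000, 80))
```

If the data quality is low, the count passed as `n_cases` would be the number of informative cases left after pre-processing, which is why collecting more raw data can still help in that situation.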
For full results and more interesting comments, see KDnuggets 2008 Poll: More data or Better algorithm?
Copyright © 2008 KDnuggets.