Do We Really Need More Complex Models?

Simplicity might be a better solution.

By Cornellius Yudha Wijaya, KDnuggets Technical Content Specialist on October 1, 2024 in Data Science

Image by Author | Ideogram

Today, much of machine learning research and many of its applied solutions are dominated by generative models such as Large Language Models (LLMs). Their popularity has risen with AI products like ChatGPT and Midjourney, which have prompted many people to actively learn about deep learning models.

Even before AI products became as prominent as they are now, complex models were often the more popular option. Complex models such as neural networks are used in many use cases, even the simplest ones. Many data practitioners jump straight to a complex model without considering a simpler one because complex models seem more appealing.

However, do we really need complex models in every machine learning project? Let’s explore it.

What is a Complex Model?

There is no exact definition of a complex model. A deep neural network is clearly a complex model, while linear regression is clearly a simple one. Something like Random Forest is not generally considered a simple model, but it’s not necessarily a complex model either.

So, what makes a model complex? Several characteristics determine its complexity, most often the following:

Number of Parameters
Interpretability
Multiple Structures
Computational Efficiency

Image by Author

The number of parameters refers to the model’s internal values that are learned during the training process, such as the weights of a neural network or the coefficients of a linear regression. These are distinct from hyperparameters, which are set before training starts. Complex models generally have more parameters than simpler models.
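To make the difference in scale concrete, here is a minimal sketch that counts learned parameters for a linear regression and for a small fully connected neural network. The helper names, the 10 input features, and the layer sizes are assumptions chosen only for illustration; they do not come from any particular project.

```python
def linear_regression_params(n_features: int) -> int:
    """One learned coefficient per feature plus one intercept."""
    return n_features + 1


def mlp_params(layer_sizes: list[int]) -> int:
    """Learned weights and biases of a fully connected network.

    Each pair of consecutive layers contributes n_in * n_out weights
    and n_out biases.
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


n_features = 10  # hypothetical number of input features

linear_count = linear_regression_params(n_features)
# Hypothetical network: 10 inputs, two hidden layers of 64 units, 1 output.
mlp_count = mlp_params([n_features, 64, 64, 1])

print(f"Linear regression: {linear_count} parameters")  # 11
print(f"Small neural network: {mlp_count} parameters")  # 4929
```

Even this modest network has several hundred times more learned values than the linear model on the same inputs. Note that the layer sizes themselves are hyperparameters: they are chosen before training, while the weights and biases counted here are learned.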

Interpretability means being able to explain why the model produces a given prediction. Complex models are harder to interpret because their larger number of parameters makes it difficult to trace how inputs lead to outputs, while simpler models are easier to interpret.

Multiple structures refers to how the model is designed. Complex models often consist of multiple components, such as the multiple layers of a neural network or the multiple models combined in an ensemble.

Computational efficiency is lower for complex models than for simpler ones, as the training time and resources required to train a complex model are much higher. This is also a direct effect of the number of parameters.

Those are the characteristics of complex models. So, do we need more complex models when simpler models work?

When to Work with Complex Models

I have briefly described what distinguishes complex models from simpler ones. Now let’s look at how these characteristics affect model selection.

We understand that the number of parameters affects a model’s complexity: more parameters mean a more complex model. With more parameters, the model can capture patterns better than a simpler model, especially non-linear patterns that a simple linear model may miss.

However, a higher number of parameters also increases the risk of overfitting. Overfitting is a condition where the model generalizes poorly to new data because it has learned the noise in the training dataset. In contrast, a simpler model is harder to overfit but easier to underfit, as it can’t learn more complex patterns.
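The trade-off between underfitting and overfitting can be sketched with polynomial regression, where the polynomial degree stands in for model complexity. The data below is synthetic (a sine curve plus random noise), and the degrees, sample sizes, and noise level are assumptions picked for illustration. The sketch uses NumPy's `Polynomial.fit`.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)


def make_data(n_points: int):
    """Synthetic non-linear data: a sine curve plus Gaussian noise."""
    x = np.linspace(0.0, 1.0, n_points)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n_points)
    return x, y


def mse(model, x, y) -> float:
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


x_train, y_train = make_data(15)   # small training set
x_test, y_test = make_data(200)    # larger held-out set

results = {}
for degree in (1, 3, 10):
    # A degree-d polynomial has d + 1 learned coefficients.
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (
        mse(model, x_train, y_train),
        mse(model, x_test, y_test),
    )
    train_err, test_err = results[degree]
    print(f"degree={degree:2d}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")
```

The training error shrinks as the degree grows, because a higher-degree polynomial can always fit the training points at least as well as a lower-degree one. The straight line (degree 1) underfits the sine curve on both sets. What to watch is the gap between training and held-out error for the high-degree model: a training error far below the test error is the usual sign that the model has started fitting noise.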

A higher number of parameters and multiple structures also affect interpretability and computational efficiency.

As mentioned previously, a complex model is more challenging to interpret than a simpler model. In many business use cases, we prefer a model with higher interpretability, even at the cost of lower model performance. This is because we want to avoid bias and have confidence in the model’s predictions.

The decision is also affected by the production environment. Complex models require more resources than simpler models. A simple model uses fewer resources, which means lower costs to deploy and maintain.

All of the above should be considered when choosing between a simple and a complex model.

So, do we really need more complex models? Well, the answer is: it depends on your situation.

A simple rule of thumb you can follow: a simple model is your go-to model if it already solves your problem. Only move to a more complex model if it’s required.
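One way to put this rule of thumb into practice is to evaluate candidates from simplest to most complex and keep the simplest one that meets the project's requirement. The sketch below is only an illustration: the `Candidate` type, the `pick_simplest_adequate` helper, the complexity ranks, and the validation scores are all hypothetical and not measured results.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    complexity_rank: int      # lower means simpler (hypothetical ordering)
    validation_score: float   # higher means better, e.g. accuracy


def pick_simplest_adequate(
    candidates: list[Candidate], required_score: float
) -> Optional[Candidate]:
    """Return the simplest candidate that meets the required score.

    Returns None if no candidate is good enough.
    """
    adequate = [c for c in candidates if c.validation_score >= required_score]
    if not adequate:
        return None
    return min(adequate, key=lambda c: c.complexity_rank)


# Made-up scores, for illustration only.
candidates = [
    Candidate("logistic_regression", complexity_rank=1, validation_score=0.86),
    Candidate("random_forest", complexity_rank=2, validation_score=0.88),
    Candidate("neural_network", complexity_rank=3, validation_score=0.89),
]

chosen = pick_simplest_adequate(candidates, required_score=0.85)
print(chosen.name)  # logistic_regression

stricter = pick_simplest_adequate(candidates, required_score=0.885)
print(stricter.name)  # neural_network
```

With the looser requirement, the simplest model is already good enough, so the extra complexity buys nothing. Only when the requirement is raised beyond what the simpler models reach does the complex model become the right choice.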

Conclusion

A complex model often looks fancier, and that complexity attracts many people to use it. However, there are several characteristics you should understand before using a complex model: the number of parameters, interpretability, structure, and computational efficiency.

You don’t always need to use complex models. If a simple model already works, then it’s a better solution than a complex one.

Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and writing media. Cornellius writes on a variety of AI and machine learning topics.