Search results for FK

    Found 363 documents, 6051 searched:

  • Building a Formula 1 Streaming Data Pipeline With Kafka and Risingwave

    Build a streaming data pipeline using Formula 1 data, Python, Kafka, RisingWave as the streaming database, and visualize all the real-time data in Grafana.

    https://www.kdnuggets.com/building-a-formula-1-streaming-data-pipeline-with-kafka-and-risingwave

  • How to Build a Scalable Data Architecture with Apache Kafka

    Learn about Apache Kafka architecture and its implementation using a real-world use case of a taxi booking app.

    https://www.kdnuggets.com/2023/04/build-scalable-data-architecture-apache-kafka.html

  • Speed up Machine Learning with Fast Kriging (FKR)

    Machine Learning has revolutionized the world, yet the high computational cost of model training is often a major limitation, especially for large data sets or elevated precisions. VMC Consulting offers a new algorithm called Fast Kriging (FKR), which allows training models with the high precision of Kriging at a speed 100+ times faster, without compromising precision, for any data set size.

    https://www.kdnuggets.com/2022/06/vmc-speed-machine-learning-fast-kriging.html

  • How to Use Kafka Connect to Create an Open Source Data Pipeline for Processing Real-Time Data

    This article shows you how to create a real-time data pipeline using only pure open source technologies. These include Kafka Connect, Apache Kafka, Kibana and more.

    https://www.kdnuggets.com/2021/07/kafka-open-source-data-pipeline-processing-real-time-data.html

  • Brain Monitoring with Kafka, OpenTSDB, and Grafana

    Interested in using open source software to monitor brain activity, and control your devices? Sure you are! Read this fantastic post for some insight and direction.

    https://www.kdnuggets.com/2016/08/brain-monitoring-kafka-opentsb-grafana.html

  • Navigating Your Data Science Career: From Learning to Earning

    Is earning worth learning in today’s data science landscape? Short answer: yes. The long answer calls for an article.

    https://www.kdnuggets.com/navigating-your-data-science-career-from-learning-to-earning

  • 7 Python Libraries Every Data Engineer Should Know

    Interested in switching to data engineering? Here’s a list of Python libraries you’ll find super helpful.

    https://www.kdnuggets.com/7-python-libraries-every-data-engineer-should-know

  • 7 Steps to Mastering Data Engineering

    The only data engineering roadmap you need for an introduction to concepts, tools, and techniques to collect, store, transform, analyze, and model data.

    https://www.kdnuggets.com/7-steps-to-mastering-data-engineering

  • What is a Database? Everything You Need to Know

    Unlocking Database Basics.

    https://www.kdnuggets.com/what-is-a-database-everything-you-need-to-know

  • Boost Your Data Science Skills: The Essential SQL Certifications You Need

    If you are a data scientist who works with large amounts of data and hasn’t learned SQL yet - now might be the time.

    https://www.kdnuggets.com/boost-your-data-science-skills-the-essential-sql-certifications-you-need

  • 2024 Reading List: 5 Essential Reads on Artificial Intelligence

    Transform your understanding of current and future tech with these top 5 AI reads to explore the minds shaping our future.

    https://www.kdnuggets.com/2024-reading-list-5-essential-reads-on-artificial-intelligence

  • Collection of Free Courses to Learn Data Science, Data Engineering, Machine Learning, MLOps, and LLMOps

    Begin your data professional journey from the basics of statistics to building a production-grade AI application.

    https://www.kdnuggets.com/collection-of-free-courses-to-learn-data-science-data-engineering-machine-learning-mlops-and-llmops

  • 2024 Tech Trends: AI Breakthroughs & Development Insights from O’Reilly’s Free Report

    Want to prepare your tech career for 2024 and onwards? Have a look at O’Reilly’s FREE technology trends report.

    https://www.kdnuggets.com/2024-tech-trends-ai-breakthroughs-development-insights-oreilly-free-report

  • The Only Free Course You Need To Become a Professional Data Engineer

    Data Engineering ZoomCamp offers free access to reading materials, video tutorials, assignments, homeworks, projects, and workshops.

    https://www.kdnuggets.com/the-only-free-course-you-need-to-become-a-professional-data-engineer

  • How Big Data Is Saving Lives in Real Time: IoV Data Analytics Helps Prevent Accidents

    This post talks about what needs to be taken care of in IoV data analysis, and shows the difference between a near real-time analytic platform and an actual real-time analytic platform with a real-world example.

    https://www.kdnuggets.com/how-big-data-is-saving-lives-in-real-time-iov-data-analytics-helps-prevent-accidents

  • Working with Big Data: Tools and Techniques

    Where do you start in a field as vast as big data? Which tools and techniques to use? We explore this and talk about the most common tools in big data.

    https://www.kdnuggets.com/working-with-big-data-tools-and-techniques

  • KDnuggets News, September 6: Happy 30th Anniversary KDnuggets! • Getting Started with Python Data Structures in 5 Steps

    Happy 30th Anniversary KDnuggets! • Getting Started with Python Data Structures in 5 Steps • KDnuggets 30th Anniversary Interview with Founder Gregory Piatetsky-Shapiro

    https://www.kdnuggets.com/2023/n32.html

  • How to Digest 15 Billion Logs Per Day and Keep Big Queries Within 1 Second

    This article describes a large-scale data warehousing use case to provide reference for data engineers who are looking for log analytic solutions. It introduces the log processing architecture and real-case practice in data ingestion, storage, and queries.

    https://www.kdnuggets.com/how-to-digest-15-billion-logs-per-day-and-keep-big-queries-within-1-second

  • LangChain + Streamlit + Llama: Bringing Conversational AI to Your Local Machine

    Integrating Open Source LLMs and LangChain for Free Generative Question Answering (No API Key required).

    https://www.kdnuggets.com/2023/08/langchain-streamlit-llama-bringing-conversational-ai-local-machine.html

  • OpenAI’s Whisper API for Transcription and Translation

    This article will show you how to use OpenAI's Whisper API to transcribe audio into text. It will also show you how to use it in your own projects and how to integrate it into your data science projects.

    https://www.kdnuggets.com/2023/06/openai-whisper-api-transcription-translation.html

  • GPT-4 is Vulnerable to Prompt Injection Attacks on Causing Misinformation

    ChatGPT might have some loophole to provide unreliable facts.

    https://www.kdnuggets.com/2023/05/gpt4-vulnerable-prompt-injection-attacks-causing-misinformation.html

  • Data Engineering Landscape in the AI-Driven World

    Generative AI has just started to capture the imagination of data engineers, so the impact thus far has been just a fraction of what it will be a year or two from now.

    https://www.kdnuggets.com/2023/05/data-engineering-landscape-aidriven-world.html

  • How to Efficiently Scale Data Science Projects with Cloud Computing

    This article discusses the key components that contribute to the successful scaling of data science projects. It covers how to collect data using APIs, how to store data in the cloud, how to clean and process data, how to visualize data, and how to harness the power of data visualization through interactive dashboards.

    https://www.kdnuggets.com/2023/05/efficiently-scale-data-science-projects-cloud-computing.html

  • Introducing Healthcare-Specific Large Language Models from John Snow Labs

    John Snow Labs recently released a new LLM called BioGPT-JSL and capabilities tuned specifically to the medical domain. This article summarizes three things you should know about it. 

    https://www.kdnuggets.com/2023/04/john-snow-introducing-healthcare-specific-large-language-models-john-snow-labs.html

  • 6 ChatGPT mind-blowing extensions to use anywhere

    And how to make ChatGPT our daily assistant using them.

    https://www.kdnuggets.com/2023/04/6-chatgpt-mindblowing-extensions-anywhere.html

  • KDnuggets News, April 12: Top 19 Skills for a Data Scientist in 2023 • 8 ChatGPT Open-Source Alternatives

    Top 19 Skills You Need to Know in 2023 to Be a Data Scientist • 8 Open-Source Alternatives to ChatGPT and Bard • Free eBook: 10 Practical Python Programming Tricks • DataLang: A New Programming Language for Data Scientists… Created by ChatGPT? • How to Build a Scalable Data Architecture with Apache Kafka

    https://www.kdnuggets.com/2023/n13.html

  • Beyond Accuracy: Evaluating & Improving a Model with the NLP Test Library

    John Snow Labs has learned a lot about testing and delivering Responsible NLP models: not only in terms of policies and goals, but by building day-to-day tools for data scientists. The nlptest library aims to share these tools with the open-source community.

    https://www.kdnuggets.com/2023/04/john-snow-beyond-accuracy-nlp-test-library.html

  • 3 Hard Python Coding Interview Questions For Data Science

    No mercy today! I have three hard-level Python coding interview questions that require you to be on top of your game in Python and solve business problems.

    https://www.kdnuggets.com/2023/03/3-hard-python-coding-interview-questions-data-science.html

  • Top Free Resources To Learn ChatGPT

    Learn about ChatGPT through Cheat Sheets, Guides, Books, Tutorials, and Blogs.

    https://www.kdnuggets.com/2023/02/top-free-resources-learn-chatgpt.html

  • Building a Recommender System for Amazon Products with Python

    I built a recommender system for Amazon’s electronics category.

    https://www.kdnuggets.com/2023/02/building-recommender-system-amazon-products-python.html

  • 5 Ways to Deal with the Lack of Data in Machine Learning

    Effective solutions exist when you don't have enough data for your models. While there is no perfect approach, five proven ways will get your model to production.

    https://www.kdnuggets.com/2019/06/5-ways-lack-data-machine-learning.html

  • 7 Essential Cheat Sheets for Data Engineering

    Learn about the data life cycle, PySpark, dbt, Kafka, BigQuery, Airflow, and Docker.

    https://www.kdnuggets.com/2022/12/7-essential-cheat-sheets-data-engineering.html

  • How I Got 4 Data Science Offers and Doubled My Income 2 Months After Being Laid Off

    In this blog, I shared my story on getting 4 data science job offers including Airbnb, Lyft and Twitter after being laid off. Any data scientist who was laid off due to the pandemic or who is actively looking for a data science position can find something here to which they can relate.

    https://www.kdnuggets.com/2021/01/data-science-offers-doubled-income-2-months.html

  • Getting Started with PyTorch Lightning

    Introduction to PyTorch Lightning and how it can be used for the model building process. It also provides a brief overview of the PyTorch characteristics and how they are different from TensorFlow.

    https://www.kdnuggets.com/2022/12/getting-started-pytorch-lightning.html

  • What Google Recommends You do Before Taking Their Machine Learning or Data Science Course

    First steps to learning data science & machine learning are the foundations.

    https://www.kdnuggets.com/2021/10/google-recommends-before-machine-learning-data-science-course.html

  • The Complete Data Engineering Study Roadmap

    Everything you need to know to start your career in Data Engineering.

    https://www.kdnuggets.com/2022/11/complete-data-engineering-study-roadmap.html

  • SHAP: Explain Any Machine Learning Model in Python

    A Comprehensive Guide to SHAP and Shapley Values

    https://www.kdnuggets.com/2022/11/shap-explain-machine-learning-model-python.html

  • 9 Skills You Need to Become a Data Engineer

    A data engineer is a fast-growing profession with amazing challenges and rewards. Which skills do you need to become a data engineer? In this post, we’ll take a look at both hard and soft skills.

    https://www.kdnuggets.com/2021/03/9-skills-become-data-engineer.html

  • Essential Books You Need to Become a Data Engineer

    In this article, I will go through the roadmap of books you need to become a Data Engineer.

    https://www.kdnuggets.com/2022/10/essential-books-need-become-data-engineer.html

  • 11 Questions About Data Engineers: What’s the profession about, and where’s it heading?

    I hope my answers will be useful to novice data engineers and anyone interested in data engineering.

    https://www.kdnuggets.com/2022/10/11-questions-data-engineers-profession-heading.html

  • How to Correctly Select a Sample From a Huge Dataset in Machine Learning

    We explain how choosing a small, representative dataset from a large population can improve model training reliability.

    https://www.kdnuggets.com/2019/05/sample-huge-dataset-machine-learning.html

  • 7 Things You Didn’t Know You Could do with a Low Code Tool

    Surprisingly easy solutions for complex data problems.

    https://www.kdnuggets.com/2022/09/7-things-didnt-know-could-low-code-tool.html

  • How to land an ML job: Advice from engineers at Meta, Google Brain, and SAP

    Check out this video, summary and transcript of a discussion between co:rise co-founder Jake Samuelson and three outstanding ML engineers — Kaushik Rangadurai, Shalvi Mahajan, and Frank Chen — to hear their advice on landing a job in machine learning.

    https://www.kdnuggets.com/2022/08/corise-land-ml-job-advice-engineers-meta-google-brain-sap.html

  • Most In-demand Artificial Intelligence Skills To Learn In 2022

    Artificial Intelligence (AI) is the process of programming a computer that can reason and learn like a human being and make decisions for itself.

    https://www.kdnuggets.com/2022/08/indemand-artificial-intelligence-skills-learn-2022.html

  • 10 Modern Data Engineering Tools

    Learn about the modern tools for data orchestration, data storage, analytical engineering, batch processing, and data streaming.

    https://www.kdnuggets.com/2022/07/10-modern-data-engineering-tools.html

  • KDnuggets News, June 29: 20 Basic Linux Commands for Data Science Beginners; Market Data and News: A Time Series Analysis

    20 Basic Linux Commands for Data Science Beginners; Market Data and News: A Time Series Analysis; Data Science Career: 7 Expectations vs Reality; Machine Learning Is Not Like Your Brain Part 4: The Neuron’s Limited Ability to Represent Precise Values; Comprehensive Guide to the Normal Distribution

    https://www.kdnuggets.com/2022/n26.html

  • Top Data Science Podcasts for 2022

    Here are some data science related podcasts to help you either grow your interest in the field, increase your current knowledge, or help you develop yourself.

    https://www.kdnuggets.com/2022/06/top-data-science-podcasts-2022.html

  • Free Data Engineering Courses

    Get into the highly in-demand world of data engineering for free and earn a six-figure salary.

    https://www.kdnuggets.com/2022/05/free-data-engineering-courses.html

  • The Complete Collection of Data Science Books – Part 1

    Read the best books on Programming, Statistics, Data Engineering, Web Scraping, Data Analytics, Business Intelligence, Data Applications, Data Management, Big Data, and Cloud Architecture.

    https://www.kdnuggets.com/2022/05/complete-collection-data-science-books-part-1.html

  • Should The Data Warehouse Be Immutable?

    Is the data warehouse broken? Is the "immutable data warehouse" the right path for your data team? Learn more here.

    https://www.kdnuggets.com/2022/05/data-warehouse-immutable.html

  • How to Build Strong Data Science Portfolio as a Beginner

    After learning the basics of data science, you can start to work on real-world problems. But how do you showcase your work? In this article, we are going to learn a unique way to create a data science portfolio.

    https://www.kdnuggets.com/2021/10/strong-data-science-portfolio-as-beginner.html

  • Feature Stores for Real-time AI & Machine Learning

    Real-time AI/ML is on the rise and feature stores are key to successfully deploying them. Read on to see how the choice of online store and the feature store architecture play important roles in determining its performance and cost.

    https://www.kdnuggets.com/2022/03/feature-stores-realtime-ai-machine-learning.html

  • Top 7 YouTube Courses on Data Analytics

    Learn data analytics by taking the best YouTube courses. These courses will cover data analysis with Python, R, SQL, PowerBI, Tableau, Excel, and SPSS.

    https://www.kdnuggets.com/2022/02/top-7-youtube-courses-data-analytics.html

  • The Complete Collection of Data Science Cheat Sheets – Part 2

    A collection of cheat sheets that will help you prepare for a technical interview on Data Structures & Algorithms, Machine learning, Deep Learning, Natural Language Processing, Data Engineering, Web Frameworks.

    https://www.kdnuggets.com/2022/02/complete-collection-data-science-cheat-sheets-part-2.html

  • 19 Data Science Project Ideas for Beginners

    This article features 19 data science projects for beginners, categorized into 7 full project tutorials, 5 places to come up with your own data science projects using data, and 7 skills-based data science projects.

    https://www.kdnuggets.com/2021/11/19-data-science-project-ideas-beginners.html

  • Data Science Programming Languages and When To Use Them

    Read this guide through the most common data science programming languages and when to use them in data science.

    https://www.kdnuggets.com/2022/02/data-science-programming-languages.html

  • Transfer Learning for Image Recognition and Natural Language Processing

    Read the second article in this series on Transfer Learning, and learn how to apply it to Image Recognition and Natural Language Processing.

    https://www.kdnuggets.com/2022/01/transfer-learning-image-recognition-natural-language-processing.html

  • Automate Microsoft Excel and Word Using Python

    Integrate Excel with Word to generate automated reports seamlessly.

    https://www.kdnuggets.com/2021/08/automate-microsoft-excel-word-python.html

  • Learn Deep Learning by Building 15 Neural Network Projects in 2022

    Here are 15 neural network projects you can take on in 2022 to build your skills, your know-how, and your portfolio.

    https://www.kdnuggets.com/2022/01/15-neural-network-projects-build-2022.html

  • Hands-On Reinforcement Learning Course, Part 2

    Continue your learning journey in Reinforcement Learning with this second of two part tutorial that covers the foundations of the technique with examples and Python code.

    https://www.kdnuggets.com/2021/12/hands-on-reinforcement-learning-part-2.html

  • Feature Selection: Where Science Meets Art

    From heuristic to algorithmic feature selection techniques for data science projects.

    https://www.kdnuggets.com/2021/12/feature-selection-science-meets-art.html

  • Inside DeepMind’s New Efforts to Use Deep Learning to Advance Mathematics

    Using deep learning techniques can help mathematicians develop intuitions about the toughest problems in the field.

    https://www.kdnuggets.com/2021/12/inside-deepmind-new-efforts-deep-learning-advance-mathematics.html

  • A Beginner’s Guide to End to End Machine Learning

    Learn to train, tune, deploy and monitor machine learning models.

    https://www.kdnuggets.com/2021/12/beginner-guide-end-end-machine-learning.html

  • Avoid These Mistakes with Time Series Forecasting

    A few checks to make before training a Machine Learning model on data that could be random.

    https://www.kdnuggets.com/2021/12/avoid-mistakes-time-series-forecasting.html

  • Sentiment Analysis API vs Custom Text Classification: Which one to choose?

    In this article, we are going to compare the sentiment extraction performance between Sentiment Analysis engines and Custom Text classification engines. The idea is to show pros and cons of these two types of engines on a concrete dataset.

    https://www.kdnuggets.com/2021/11/sentiment-analysis-api-custom-text-classification.html

  • New Poll: What Percentage of Your Machine Learning Models Have Been Deployed?

    Take a moment to participate in the latest KDnuggets poll and let the community know what percentage of your machine learning models have been deployed.

    https://www.kdnuggets.com/2021/11/percentage-machine-learning-models-deployed.html

  • A Spreadsheet that Generates Python: The Mito JupyterLab Extension

    You can call Mito into your Jupyter Environment and each edit you make will generate the equivalent Python in the code cell below.

    https://www.kdnuggets.com/2021/11/spreadsheet-generates-python-mito-jupyterlab-extension.html

  • Dask DataFrame is not Pandas

    This article is the second article of an ongoing series on using Dask in practice. Each article in this series will be simple enough for beginners, but provide useful tips for real work. The next article in the series is about parallelizing for loops, and other embarrassingly parallel operations with dask.delayed.

    https://www.kdnuggets.com/2021/11/dask-dataframe-not-pandas.html

  • Visual Scoring Techniques for Classification Models

    Read this article on assessing a model's performance in a broader context.

    https://www.kdnuggets.com/2021/11/visual-scoring-techniques-classification-models.html

  • Top 5 Time Series Methods

    Data that varies in time can offer powerful applications and use cases for data scientists to analyze. This overview considers the top techniques you can learn to understand and gain insight from time-series data.

    https://www.kdnuggets.com/2021/11/top-5-time-series-methods.html

  • Is the Modern Data Stack Leaving You Behind?

    The modern data stack narrative is largely dominated by analytics engineering. Where does that leave data engineers? Discover the difference between the MDS for data engineers & analytics engineers.

    https://www.kdnuggets.com/2021/11/modern-data-stack-leaving-behind.html

  • AutoML: An Introduction Using Auto-Sklearn and Auto-PyTorch

    AutoML is a broad category of techniques and tools for applying automated search to your automated search and learning to your learning. In addition to Auto-Sklearn, the Freiburg-Hannover AutoML group has also developed an Auto-PyTorch library. We’ll use both of these as our entry point into AutoML in the following simple tutorial.

    https://www.kdnuggets.com/2021/10/automl-introduction-auto-sklearn-auto-pytorch.html

  • How to do “Limitless” Math in Python

    How to perform arbitrary-precision computation and much more math (and fast too) than what is possible with the built-in math library in Python.

    https://www.kdnuggets.com/2021/10/limitless-math-python.html

  • Parallelizing Python Code

    This article reviews some common options for parallelizing Python code, including process-based parallelism, specialized libraries, ipython parallel, and Ray.

    https://www.kdnuggets.com/2021/10/parallelizing-python-code.html

  • GitHub Desktop for Data Scientists

    Less scary than version control in the command line.

    https://www.kdnuggets.com/2021/09/github-desktop-data-scientists.html

  • Important Statistics Data Scientists Need to Know

    Several fundamental statistical concepts must be well appreciated by every data scientist -- from the enthusiast to the professional. Here, we provide code snippets in Python to increase understanding and bring you key tools that offer early insight into your data.

    https://www.kdnuggets.com/2021/09/important-statistics-data-scientists.html

  • How To Build A Database Using Python

    Implement your database without handling the SQL using the Flask-SQLAlchemy library.

    https://www.kdnuggets.com/2021/09/build-database-using-python.html

  • Path to Full Stack Data Science

    Start your journey toward mastering all aspects of the field of Data Science with this focused list of in-depth self-learning resources. Curated with the beginner in mind, these recommendations will help you learn efficiently, and can also offer existing professionals useful highlights for review or help filling in any gaps in skills.

    https://www.kdnuggets.com/2021/09/path-full-stack-data-science.html

  • Data Engineering Technologies 2021

    Emerging technologies supporting the field of data engineering are growing at a rapid clip. This curated list includes the most important offerings available in 2021.

    https://www.kdnuggets.com/2021/09/data-engineering-technologies-2021.html

  • What 2 years of self-teaching data science taught me

    Many of us self-learn data science from the very beginning. While continuing to self-learn on demand is crucial, especially after you become a professional, there can be many pitfalls early on for learning the wrong way or missing out on key ideas that are important for the real-world application of data science.

    https://www.kdnuggets.com/2021/09/2-years-self-teaching-data-science.html

  • Speeding up Neural Network Training With Multiple GPUs and Dask

    A common moment when training a neural network is when you realize the model isn’t training quickly enough on a CPU and you need to switch to using a GPU. It turns out multi-GPU model training across multiple machines is pretty easy with Dask. This blog post is about my first experiment in using multiple GPUs with Dask and the results.

    https://www.kdnuggets.com/2021/09/speeding-neural-network-training-multiple-gpus-dask.html

  • An Introduction to Reinforcement Learning with OpenAI Gym, RLlib, and Google Colab

    Get an Introduction to Reinforcement Learning by attempting to balance a virtual CartPole with OpenAI Gym, RLlib, and Google Colab.

    https://www.kdnuggets.com/2021/09/intro-reinforcement-learning-openai-gym-rllib-colab.html

  • The Prefect Way to Automate & Orchestrate Data Pipelines

    I am migrating all my ETL work from Airflow to this super-cool framework.

    https://www.kdnuggets.com/2021/09/prefect-way-automate-orchestrate-data-pipelines.html

  • How Many AI Neurons Does It Take to Simulate a Brain Neuron?

    New research shows some shocking answers to that question.

    https://www.kdnuggets.com/2021/09/ai-neurons-simulate-brain-neuron.html

  • Working with Python APIs For Data Science Project

    In this article, we will work with YouTube Python API to collect video statistics from our channel using the requests python library to make an API call and save it as a Pandas DataFrame.

    https://www.kdnuggets.com/2021/09/python-apis-data-science-project.html

  • 6 Cool Python Libraries That I Came Across Recently

    Check out these awesome Python libraries for Machine Learning.

    https://www.kdnuggets.com/2021/09/6-cool-python-libraries-recently.html

  • Learning Data Science and Machine Learning: First Steps After The Roadmap

    Just getting into learning data science may seem as daunting as (if not more than) trying to land your first job in the field. With so many options and resources online and in traditional academia to consider, these pre-requisites and pre-work are recommended before diving deep into data science and AI/ML.

    https://www.kdnuggets.com/2021/08/learn-data-science-machine-learning.html

  • 15 Things I Look for in Data Science Candidates

    This article presents advice for anyone looking or hiring for data science jobs, written by someone with practical and useful insight.

    https://www.kdnuggets.com/2021/08/15-things-data-science-candidates.html

  • Open Source Datasets for Computer Vision

    Access to high-quality, noise-free, large-scale datasets is crucial for training complex deep neural network models for computer vision applications. Many open-source datasets are developed for use in image classification, pose estimation, image captioning, autonomous driving, and object segmentation. These datasets must be paired with the appropriate hardware and benchmarking strategies to optimize performance.

    https://www.kdnuggets.com/2021/08/open-source-datasets-computer-vision.html

  • Model Drift in Machine Learning – How To Handle It In Big Data

    Rendezvous Architecture helps you run and choose outputs from a Champion model and many Challenger models running in parallel without many overheads. The original approach works well for smaller data sets, so how can this idea adapt to big data pipelines?

    https://www.kdnuggets.com/2021/08/model-drift-machine-learning-big-data.html

  • Not Only for Deep Learning: How GPUs Accelerate Data Science & Data Analytics

    Modern AI/ML systems’ success has been critically dependent on their ability to process massive amounts of raw data in a parallel fashion using task-optimized hardware. Can we leverage the power of GPU and distributed computing for regular data processing jobs too?

    https://www.kdnuggets.com/2021/07/deep-learning-gpu-accelerate-data-science-data-analytics.html

  • Understanding BERT with Hugging Face

    We don’t really understand something before we implement it ourselves. So in this post, we will implement a Question Answering Neural Network using BERT and a Hugging Face Library.

    https://www.kdnuggets.com/2021/07/understanding-bert-hugging-face.html

  • How to Create Unbiased Machine Learning Models

    In this post we discuss the concepts of bias and fairness in the Machine Learning world, and show how ML biases often reflect existing biases in society. Additionally, we discuss various methods for testing and enforcing fairness in ML models.

    https://www.kdnuggets.com/2021/07/create-unbiased-machine-learning-models.html

  • 7 Open Source Libraries for Deep Learning Graphs

    In this article we’ll go through 7 up-and-coming open source libraries for graph deep learning, ranked in order of increasing popularity.

    https://www.kdnuggets.com/2021/07/7-open-source-libraries-deep-learning-graphs.html

  • Become an Analytics Engineer in 90 Days

    A new role of the Analytics Engineer is an exciting opportunity that crosses the skill sets of a Data Analyst and Data Engineer. Here, we describe how this position can evolve at an organization, and recommend self-learning resources that can be used to prepare for the multifaceted responsibilities.

    https://www.kdnuggets.com/2021/07/become-analytics-engineer-90-days.html

  • Predict Customer Churn (the right way) using PyCaret

    A step-by-step guide on how to predict customer churn the right way using PyCaret that actually optimizes the business objective and improves ROI.

    https://www.kdnuggets.com/2021/07/pycaret-predict-customer-churn-right-way.html

  • How to Use NVIDIA GPU Accelerated Libraries

    If you are wondering how you can take advantage of NVIDIA GPU accelerated libraries for your AI projects, this guide will help answer questions and get you started on the right path.

    https://www.kdnuggets.com/2021/07/nvidia-gpu-accelerated-libraries.html

  • How to Train a Joint Entities and Relation Extraction Classifier using BERT Transformer with spaCy 3

    A step-by-step guide on how to train a relation extraction classifier using Transformer and spaCy3.

    https://www.kdnuggets.com/2021/06/train-joint-entities-relation-extraction-classifier-bert-spacy.html

  • Get Interactive Plots Directly With Pandas

    Telling a story with data is a core function for any Data Scientist, and creating data visualizations that are simultaneously illuminating and appealing can be challenging. This tutorial reviews how to create Plotly and Bokeh plots directly through Pandas plotting syntax, which will help you convert static visualizations into interactive counterparts -- and take your analysis to the next level.

    https://www.kdnuggets.com/2021/06/interactive-plots-directly-pandas.html

  • The 7 Best Open Source AI Libraries You May Not Have Heard Of

    AI researchers today have many exciting options for working with specialized tools. Although starting original projects from scratch is often not necessary, knowing which existing library to leverage remains a challenge. This list of generally unknown yet awesome, open-source libraries offers an interesting collection to consider for state-of-the-art research that spans from automatic machine learning to differentiable quantum circuits.

    https://www.kdnuggets.com/2021/06/7-open-source-ai-libraries.html

  • How to pitch to VCs, explained: The Deck We Used to Raise Capital For Our Open-Source ELT Platform

    Winning seed funding from venture capitalists is a daunting task, and the pitch is key. Learn how one effective slide deck resulted in a successful early funding round for an open-source start-up, Airbyte.

    https://www.kdnuggets.com/2021/05/vc-pitch-deck-open-source-elt-platform.html

  • Awesome list of datasets in 100+ categories

    With an estimated 44 zettabytes of data in existence in our digital world today and approximately 2.5 quintillion bytes of new data generated daily, there is a lot of data out there you could tap into for your data science projects. It's pretty hard to curate through such a massive universe of data, but this collection is a great start. Here, you can find data from cancer genomes to UFO reports, as well as years of air quality data to 200,000 jokes. Dive into this ocean of data to explore as you learn how to apply data science techniques or leverage your expertise to discover something new.

    https://www.kdnuggets.com/2021/05/awesome-list-datasets.html

  • Feature Engineering of DateTime Variables for Data Science, Machine Learning

    Learn how to make more meaningful features from DateTime type variables to be used by Machine Learning Models.

    https://www.kdnuggets.com/2021/04/feature-engineering-datetime-variables-data-science-machine-learning.html

  • Data Science Books You Should Start Reading in 2021

    Check out this curated list of the best data science books for any level.

    https://www.kdnuggets.com/2021/04/data-science-books-start-reading-2021.html

  • Continuous Training for Machine Learning – a Framework for a Successful Strategy

    A basic appreciation by anyone who builds machine learning models is that the model is not useful without useful data. This doesn't change after a model is deployed to production. Effectively monitoring and retraining models with updated data is key to maintaining valuable ML solutions, and can be accomplished with effective approaches to production-level continuous training that is guided by the data.

    https://www.kdnuggets.com/2021/04/continuous-training-machine-learning.html

  • Deep Learning Recommendation Models (DLRM): A Deep Dive

    The currency in the 21st century is no longer just data. It's the attention of people. This deep dive article presents the architecture and deployment issues experienced with the deep learning recommendation model, DLRM, which was open-sourced by Facebook in March 2019.

    https://www.kdnuggets.com/2021/04/deep-learning-recommendation-models-dlrm-deep-dive.html

  • The Inferential Statistics Data Scientists Should Know

    The foundations of Data Science and machine learning algorithms are in mathematics and statistics. To be the best Data Scientists you can be, your skills in statistical understanding should be well-established. The more you appreciate statistics, the better you will understand how machine learning performs its apparent magic.

    https://www.kdnuggets.com/2021/03/statistics-data-scientists-should-know.html

  • 11 Essential Code Blocks for Complete EDA (Exploratory Data Analysis)

    This article is a practical guide to exploring any data science project and gaining valuable insights.

    https://www.kdnuggets.com/2021/03/11-essential-code-blocks-exploratory-data-analysis.html

  • Dask and Pandas: No Such Thing as Too Much Data

    Do you love pandas, but don't love it when you reach the limits of your memory or compute resources? Dask provides you with the option to use the pandas API with distributed data and computing. Learn how it works, how to use it, and why it’s worth the switch when you need it most.

    https://www.kdnuggets.com/2021/03/dask-pandas-data.html

  • Getting Started with Distributed Machine Learning with PyTorch and Ray

    Ray is a popular framework for distributed Python that can be paired with PyTorch to rapidly scale machine learning applications.

    https://www.kdnuggets.com/2021/03/getting-started-distributed-machine-learning-pytorch-ray.html

  • Top YouTube Channels for Data Science

    Have a look at the top 15 YouTube channels for data science by number of subscribers, along with some additional data on the channels to help you decide if they may have some content useful for you.

    https://www.kdnuggets.com/2021/03/top-youtube-channels-data-science.html

  • 10 Statistical Concepts You Should Know For Data Science Interviews

    Data Science is founded on time-honored concepts from statistics and probability theory. Having a strong understanding of the ten ideas and techniques highlighted here is key to your career in the field, and also a favorite topic for concept checks during interviews.

    https://www.kdnuggets.com/2021/02/10-statistical-concepts-data-science-interviews.html

  • Saving and loading models in TensorFlow — why it is important and how to do it

    So much time and effort can go into training your machine learning models. But, shut down the notebook or system, and all those trained weights and more vanish with the memory flush. Saving your models to maximize reusability is key for efficient productivity.

    https://www.kdnuggets.com/2021/02/saving-loading-models-tensorflow.html

  • Machine learning is going real-time

    Extracting immediate predictions from machine learning algorithms on the spot based on brand-new data can offer a next level of interaction and potential value to its consumers. The infrastructure and tech stack required to implement such real-time systems is also next level, and many organizations -- especially in the US -- seem to be resisting. But, what even is real-time ML, and how can it deliver a better experience?

    https://www.kdnuggets.com/2021/01/machine-learning-real-time.html

  • Is M.Tech in Data Science Worth It?

    Is an M.Tech in Data Science worth it, or should you learn using just online courses and projects? Let's try to find the answer to that question.

    https://www.kdnuggets.com/2021/01/greatlearning-mtech-data-science.html

  • Want to Be a Data Scientist? Don’t Start With Machine Learning

    Machine learning may appear to be the go-to topic to start learning for the aspiring data scientist. But thinking these techniques are the key aspects of the role is the biggest misconception. So much more goes into becoming a successful data scientist, and machine learning is only one component of broader skills around processing, managing, and understanding the science behind the data.

    https://www.kdnuggets.com/2021/01/data-scientist-dont-start-machine-learning.html

  • Data Engineering — the Cousin of Data Science, is Troublesome

    A Data Scientist must be a jack of many, many trades. Especially when working in broader teams, understanding the roles of others, such as data engineering, can help you validate progress and be aware of potential pitfalls. So, how can you convince your analysts to realize the importance of expanding their toolkit? Examples from real life often provide great insight.

    https://www.kdnuggets.com/2021/01/data-engineering-troublesome.html

  • Comprehensive Guide to the Normal Distribution

    Drop in for some tips on how this fundamental statistics concept can improve your data science.

    https://www.kdnuggets.com/2021/01/comprehensive-guide-normal-distribution.html

  • Essential Math for Data Science: Information Theory

    In the context of machine learning, some of the concepts of information theory are used to characterize or compare probability distributions. Read up on the underlying math to gain a solid understanding of relevant aspects of information theory.

    https://www.kdnuggets.com/2021/01/essential-math-data-science-information-theory.html

  • My Data Science Learning Journey So Far

    These are some obstacles the author faced in their data science learning journey in the past year, including how much time it took to overcome each obstacle and what it has taught the author.

    https://www.kdnuggets.com/2021/01/data-science-learning-journey.html

  • JupyterLab 3 is Here: Key reasons to upgrade now

    Read about these 3 reasons for checking out JupyterLab 3 today.

    https://www.kdnuggets.com/2021/01/jupyterlab-3-here-reasons-upgrade.html

  • Top 10 Computer Vision Papers 2020

    The top 10 computer vision papers in 2020 with video demos, articles, code, and paper reference.

    https://www.kdnuggets.com/2021/01/top-10-computer-vision-papers-2020.html

  • CatalyzeX: A must-have browser extension for machine learning engineers and researchers

    CatalyzeX is a free browser extension that finds code implementations for ML/AI papers anywhere on the internet (Google, Arxiv, Twitter, Scholar, and other sites).

    https://www.kdnuggets.com/2021/01/catalyzex-browser-extension-machine-learning.html

  • Model Experiments, Tracking and Registration using MLflow on Databricks

    This post covers how StreamSets can help expedite operations at some of the most crucial stages of Machine Learning Lifecycle and MLOps, and demonstrates integration with Databricks and MLflow.

    https://www.kdnuggets.com/2021/01/model-experiments-tracking-registration-mlflow-databricks.html

  • 2020: A Year Full of Amazing AI Papers — A Review

    So much happened in the world during 2020 that it may have been easy to miss the great progress in the world of AI. To catch you up quickly, check out this curated list of the latest breakthroughs in AI by release date, along with a video explanation, link to an in-depth article, and code.

    https://www.kdnuggets.com/2020/12/2020-amazing-ai-papers.html

  • Monte Carlo integration in Python

    A famous Casino-inspired trick for data science, statistics, and all of science. How to do it in Python?

    https://www.kdnuggets.com/2020/12/monte-carlo-integration-python.html

  • Production Machine Learning Monitoring: Outliers, Drift, Explainers & Statistical Performance

    A practical deep dive on production monitoring architectures for machine learning at scale using real-time metrics, outlier detectors, drift detectors, metrics servers and explainers.

    https://www.kdnuggets.com/2020/12/production-machine-learning-monitoring-outliers-drift-explainers-statistical-performance.html

  • Fast and Intuitive Statistical Modeling with Pomegranate

    Pomegranate is a delicious fruit. It can also be a super useful Python library for statistical analysis. We will show how in this article.

    https://www.kdnuggets.com/2020/12/fast-intuitive-statistical-modeling-pomegranate.html

  • Industry 2021 Predictions for AI, Analytics, Data Science, Machine Learning

    We bring you industry predictions from 12 innovative companies: what key trends do they expect in 2021 in AI, Analytics, Data Science, and Machine Learning?

    https://www.kdnuggets.com/2020/12/industry-2021-predictions-ai-data-science-machine-learning.html

  • How to Create Custom Real-time Plots in Deep Learning

    How to generate real-time visualizations of custom metrics while training a deep learning model using Keras callbacks.

    https://www.kdnuggets.com/2020/12/create-custom-real-time-plots-deep-learning.html
