
Eric Goh Ming Hui
3 min read · Mar 14, 2023


What Are They? dplyr, ggplot2, caret, R Markdown... R Libraries

When you first get into data science, you will hear terms like dplyr, ggplot2, caret, and R Markdown. What are they? The names can be confusing.

This article describes what dplyr, ggplot2, caret, and R Markdown are, and how they differ.

For the data mining process, we usually follow CRISP-DM (Cross-Industry Standard Process for Data Mining):

Extracted from: https://www.datascience-pm.com/crisp-dm-2/

The data mining process consists of six steps: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.

- Business Understanding step - we need to understand the business and establish the question that the data mining should answer.

- Data Understanding step - we need to understand the data. We can use statistics, such as descriptive statistics and regression analysis, to do so.

- Data Preparation step - we clean the data, for example by removing duplicates.

- Modeling step - we create clustering, prediction, and classification models.

- Evaluation step - we evaluate which model is more accurate and select it.

- Deployment step - we can create data products.
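To make the middle steps more concrete, here is a minimal sketch in base R, so it runs without installing any extra package. It is only an illustration under stated assumptions: the built-in `mtcars` dataset, the artificially duplicated rows, the two candidate formulas, and the `rmse` helper are not from this article.

```r
# Data Understanding: descriptive statistics and a simple regression
data(mtcars)
desc <- summary(mtcars$mpg)            # min, quartiles, mean, max of mpg
fit  <- lm(mpg ~ wt, data = mtcars)    # how mpg changes with car weight

# Data Preparation: remove duplicate rows
# (assumption: we append 3 duplicated rows ourselves, just to have something to clean)
raw   <- rbind(mtcars, mtcars[1:3, ])
clean <- raw[!duplicated(raw), ]       # keep the first copy of each row

# Modeling: fit two candidate prediction models on a training split
set.seed(42)
test_idx <- sample(nrow(clean), 10)    # hold out 10 rows for evaluation
train <- clean[-test_idx, ]
test  <- clean[test_idx, ]
m1 <- lm(mpg ~ wt, data = train)
m2 <- lm(mpg ~ wt + hp, data = train)

# Evaluation: compare the models on the held-out rows and select one
# (rmse is a hypothetical helper: root mean squared prediction error)
rmse <- function(model, newdata) {
  sqrt(mean((newdata$mpg - predict(model, newdata))^2))
}
scores <- c(wt_only = rmse(m1, test), wt_hp = rmse(m2, test))
best   <- names(which.min(scores))     # name of the model with the lowest error
```

Printing `desc`, `scores`, and `best` shows the summary statistics, the error of each model, and which model was selected. With dplyr loaded, `distinct(raw)` is an alternative way to drop the duplicated rows.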



For Data Science, at the Deployment steps, we…


Written by Eric Goh Ming Hui

(G.Dip, M.Tech, eMBA) | Author of "Learn R for Applied Statistics" | Founder of SVBook Pte. Ltd. : http://svbook.great-site.net
