So you are all excited about making predictions and ready to get started with predictive analytics!
Well, this is just the right place to begin. I’m going to introduce you to R, a free and powerful statistical programming language, and make you awesome with predictive analytics.
If you are wondering why I’m using R rather than another language to demonstrate problem solving, here’s the reason: why not SAS or Python.
Over the next 12 posts on Problem Solving with R basics, I’ll ease you into R and its syntax, step by step. If you follow these steps sincerely, you should be able to write your own algorithms and crack problems on your own within a week or two. The list of these 12 posts is available at the end of this post. Also, at the end of each post you will find a link to the next suggested post.
The basics are for those who are new to data science and are learning the tricks of the trade, and for those who have already learnt the R language but are still looking for ways to reach the top 5% on Kaggle.
If you have already learnt the basics and are gearing up to climb the Kaggle competition ranks, or to take on some freelance consulting or internships, you might be interested in our mentorship for dream analytics job program. It gives you direct exposure to analytic product development, Kaggle team-ups with competent partners, advice on which analytics skills best suit your existing profile, participation in in-house competitions, mock interviews and guidance on building a strong analytics resume.
Getting Started with Analytics
In the first few posts, I’ll start with the installation of R, some important packages, basic tips on syntax, working with GitHub, some useful datasets, creating a Kaggle account and basic handling of a Kaggle dataset. If you already know the basics and want to jump straight to the problem solving part, visit predictive modelling problems. The predictive modelling section will introduce you to some challenging problems from Kaggle, covering the selection of algorithms and frequently used feature engineering concepts that will give you a wide range of choices for attacking a problem. Here is a guide to all 12 tutorials:
- Fire up your analytics skills with R
- RStudio and GUIs
- Installing R
- Installing and loading important packages in R (ggplot2, party, Hmisc, car, MASS, plyr)
- Running R 32-bit vs 64-bit
- Useful datasets
- Get familiar with the World of Analytics
- Creating a Kaggle account (for practice)
- Training and test datasets
- GitHub basics
- Some commonly used terms
- Shortcuts (keyboard and others)
Techniques at a glance
- Regression vs Classification techniques in R
- Examples of a few regression techniques
- Examples of a few classification techniques
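
To make the regression vs classification distinction concrete, here is a minimal sketch in base R. It assumes the built-in `mtcars` dataset and two common techniques (linear regression with `lm` and logistic regression with `glm`); these are illustrative choices on my part, not necessarily the ones the later posts use.

```r
# Illustrative sketch only: dataset and model choices are assumptions.
data(mtcars)

# Regression: the target is a continuous number (miles per gallon).
reg_fit <- lm(mpg ~ wt + hp, data = mtcars)
reg_pred <- predict(reg_fit, newdata = mtcars)
head(reg_pred)

# Classification: the target is a category (am: 0 = automatic, 1 = manual).
clf_fit <- glm(am ~ wt, data = mtcars, family = binomial)
clf_prob <- predict(clf_fit, newdata = mtcars, type = "response")

# Turn predicted probabilities into class labels with a 0.5 cut-off.
clf_pred <- ifelse(clf_prob > 0.5, 1, 0)
table(predicted = clf_pred, actual = mtcars$am)
```

The regression model returns numbers on the same scale as `mpg`, while the classification model returns probabilities that are then cut into labels; the 0.5 threshold is just a conventional default.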
Approach to Predictive Analytics
- Understanding Predictive Analytics
- Difference with other forms of analytics
- Emphasis on large data sets
- Types of predictive analytics problems
- Challenges in predictive analytics
- 5 Steps for Mastering Data Analytics
- How to start data exploration in R
- Setting current directory
- Loading data sets
- Working with datasets
- Summary view of data
- Finding missing data
- Replacing missing data
- Modifying variables such as dates
- Combining and separating data sets
- Handling factor values
- Basic Statistics for Data Science
- Basic rules of probability
- Expected value
- Sample and population quantities
- Signal and noise
- Probability densities and mass functions
- Variability and distribution
- Statistical meaning of overfitting
- Success criterion for Predictive Model
- Understanding Data sets using visualization in R
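
As a taste of the data exploration steps listed above, here is a rough sketch in base R. It assumes the built-in `airquality` dataset as a stand-in for a file you would normally load yourself, and it uses median imputation purely as one simple way of replacing missing values; the later posts may take a different route.

```r
# Illustrative sketch only: airquality stands in for your own data,
# which you would usually load with setwd() and read.csv().
data(airquality)
aq <- airquality

# Summary view of the data
str(aq)
summary(aq)

# Finding missing data: count the NAs in each column
na_counts <- colSums(is.na(aq))
print(na_counts)

# Replacing missing data: median imputation (one simple option among many)
aq$Ozone[is.na(aq$Ozone)] <- median(aq$Ozone, na.rm = TRUE)
aq$Solar.R[is.na(aq$Solar.R)] <- median(aq$Solar.R, na.rm = TRUE)

# Modifying variables: build a Date column (the measurements are from 1973)
aq$Date <- as.Date(paste(1973, aq$Month, aq$Day, sep = "-"))

# Handling factor values: turn the numeric month into a labelled factor
aq$Month <- factor(aq$Month, levels = 5:9,
                   labels = c("May", "Jun", "Jul", "Aug", "Sep"))
table(aq$Month)

# Separating and combining data sets
may <- subset(aq, Month == "May")
rest <- subset(aq, Month != "May")
combined <- rbind(may, rest)
```

Each step maps onto one item in the list, so you can run the lines one at a time and inspect `aq` after each change.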
Predictive analytics tutorials
- Slicing and Dicing the data with R
- Understanding the models in R
- Starting with feature engineering