By Ani Rud

Getting Started
So you are all excited about making predictions and ready to get started with predictive analytics!
Well, this is just the right place to begin. I’m going to introduce you to R, a free and powerful statistical programming language, and make you awesome with predictive analytics.
If you are wondering why I’m using R and not another language to demonstrate problem solving, here’s the reason why not SAS or Python.
Over the next 12 posts on Problem Solving with R basics, I’ll ease you into R and its syntax step by step. If you follow these steps sincerely, you should be able to write your own algorithms and crack problems on your own within a week or two. The list of these 12 posts is available at the end of this post, and at the end of each post you will find a link to the next suggested post.
The basics are for those who are new to data science and learning the tricks of the trade, and for those who have already learnt the R language but are still looking for ways to reach the top 5% on Kaggle.
If you have already learnt the basics and are gearing up to climb the Kaggle competition ranks or to undertake some freelance consulting or internships, you might be interested in our mentorship for dream analytics job program. It offers direct exposure to analytic product development, teaming up with competent partners for Kaggle, advice on which analytics skills best suit your existing profile, participation in in-house competitions, mock interviews, and guidance on building a strong analytics resume.
Getting Started with Analytics
In the initial few posts, I’ll start with the installation of R, some important packages, basic tips on syntax, working with GitHub, some useful datasets, creating a Kaggle account, and basic handling of a Kaggle dataset. If you already know the basics and want to jump straight to the problem solving part, visit predictive modelling problems. The predictive modelling section will introduce you to some challenging problems from Kaggle, along with the selection of algorithms and frequently used feature engineering concepts that will give you a wide range of choices for attacking a problem. Here is a guide to all 12 tutorials:
 Fire up your analytics skills with R
 RStudio and GUIs
 Installing R
 Installing and loading important packages in R (ggplot2, party, Hmisc, car, MASS, plyr)
 Running R 32 bit vs 64 bit
 Useful datasets
 Get familiar with the World of Analytics
 Creating a Kaggle account (for practice)
 Training and test datasets
 GitHub basics
 Some commonly used terms
 Shortcuts (keyboard and others)
Techniques at a glance
 Regression vs Classification techniques in R
 Examples of a few regression techniques
 Examples of a few classification techniques
Approach to Predictive Analytics
 Understanding Predictive Analytics
 Difference from other forms of analytics
 Emphasis on large data sets
 Types of predictive analytics problems
 Challenges in predictive analytics
 5 Steps for Mastering Data Analytics
 How to start data exploration in R
 Setting current directory
 Loading data sets
 Working with datasets
 Summary view of data
 Finding missing data
 Replacing missing data
 Modifying variables such as dates
 Combining and separating data sets
 Handling factor values
 Basic Statistics for Data Science
 Basic rules of probability
 Expected value
 Sample and population quantities
 Signal and Noise
 Probability densities and mass functions
 Variability and distribution
 Statistical meaning of overfitting
 Success criterion for a predictive model
 Understanding Data sets using visualization in R
Predictive analytics tutorials
 Slicing and Dicing the data with R
 Understanding the models in R
 Starting with feature engineering