Dec 29

Learning Predictive Analytics: Kaggle Competition Solutions

By ganpati | Tutorials

Success on Kaggle depends on several things: your machine learning experience, the type of competition, and your ability to work in a team. Try a few beginner problems before moving on to the advanced ones. Working hands-on with the data before looking at the solutions also makes it easier to understand the rationale behind a particular technique and its feature engineering.

You may also refer to the post secret sauce to Kaggle solutions to understand the approach to the solutions, the feature engineering, and the algorithms used. Here are links to some amazing solutions to Kaggle problems:

Walmart Recruiting – Store Sales Forecasting

Competition Link
Link to a detailed Tutorial
Solution Thread in Kaggle

Allstate Purchase Prediction Challenge

Competition Link
Solution Link

Amazon Employee Access Challenge

Competition Link
Link to Code and Solution for Leaderboard 146
A Blog with Solution Approach
A Solution Approach in Data Science Geek

Driver Telematics Analysis

Competition Link
Link to Winner’s Interview

StumbleUpon Evergreen Classification Challenge

Competition Link
Solution Approach 1
Solution Approach 2

Detecting Insults in Social Commentary

Competition Link
Rank 1 Solution with Approach in Brief

Facebook Recruiting IV: Human or Robot?

Competition Link
Solution Approach

Bike Sharing Demand

Competition Link
Solution Approach

Algorithmic Trading Challenge

Competition Link
Solution Approach

Loan Default Prediction – Imperial College London

Competition Link
Solution Code on GitHub

Dec 27

7 Important Qualities for a Data Scientist

By ganpati | Getting Started

I’ve often been asked, both by recruiters and by people mulling a career switch to Data Science or related fields, which qualities make a candidate a good fit for Data Scientist roles. Since there is no checklist of the exact skills or technologies required for such roles, recruiters often look for certain qualities and for skills that transfer from a candidate’s current job to the job of a Data Scientist. I’ve tried to list some of the qualities that came out as most desirable in potential Data Scientists:

Intuition about Practical Scenarios-

Though the job of a Data Scientist or analyst revolves around data, there are many real-life situations for which data is never captured or is not readily available. In such situations the data scientist needs to make assumptions that fit the purpose of the analysis. Intuition plays an important role here. An intuitive researcher can achieve far better results by trying new methods or making better assumptions than one who follows conventional methods. At times this intuition is rooted in experience, which helps the person form hypotheses and test them.

Ability to Communicate

Passion for Storytelling-

I would have said great communication skills, but Data Science is not just about communicating findings. It’s more about telling a story with the data. Insights drawn from data may vary from analyst to analyst, but your story should be convincing enough to justify your conclusions through reasoning and references to real-life scenarios. The important thing here is to communicate insights in a clear, concise, and valid way, so that others in the company can act on them effectively.

Data visualization and presentation-

A picture is worth a thousand words, all the more so when time is short and you need to communicate the output of an analysis to an audience. Sometimes nothing is more effective or satisfying than a good graph for making a point.

Written or Oral Communication-

Good general communication helps build trust and understanding with the people you interact with on a day-to-day basis, which is incredibly important for someone entrusted with being a steward of the data.

A Pursuit of Knowledge-

For a Data Scientist, learning is a never-ending process. Every project is a new challenge with something new to teach. Since this field overlaps so many skills and domains, the key to success is admitting what one doesn’t know and being receptive to new ideas.

Organizational Skills-

You’ll often be overwhelmed by the tasks at hand. There are so many insights to find, so many different ways to tackle a problem, and so many ways to optimize a model. The ability to prioritize those tasks and a knack for picking only the ideas that will make the maximum impact are essential to achieving your goals.

Driven by Curiosity-

To derive insights from data, it’s important to come up with your own questions about the data. The right questions will direct you towards the right path while analyzing it.

Optimism and Resolve-

Most of the time the problems are unstructured and there is no end in sight. End goals may seem out of reach, which can be frustrating. An optimistic outlook, along with a resolve to overcome barriers by breaking large problems into small parts and solving them one at a time with patience and perseverance, will yield positive results.

Desire for Making an Impact-

Last but not least, a strong desire to make an impact on the business, be it by improving the bottom line, increasing customer satisfaction, or reducing defects in a process, is what drives the Data Scientist. This strong desire will keep you up at night working on daunting problems and help you overcome the challenges.