Here are some other frequently asked data science questions:
- How will you compare two or more algorithms and decide which one is better?
- Have you optimized an algorithm for speed? How, and by how much?
- How will you choose between parallel processing and a faster algorithm? Explain with examples.
- How can you verify that an improvement you’ve brought to an algorithm is really an improvement?
- How will you define a good clustering algorithm?
- How would you improve a spam detection algorithm that uses naive Bayes?
- What is the gradient descent method? (The intuition is usually enough.)
- Which clustering methods are you familiar with?
- You are given a data set. The data set has missing values which spread along 2 standard deviations from the median. What percentage of data would remain unaffected? Why?
- What is the difference between covariance and correlation?
- Is it possible to capture the correlation between a continuous and a categorical variable? If yes, how?
- Explain prior probability, likelihood, and marginal likelihood in the context of the naive Bayes algorithm.
- You came to know that your model is suffering from low bias and high variance. Which algorithm should you use to tackle it? Why?
- How is kNN different from k-means clustering?
- How are true positive rate and recall related? Write the equation.
- You were told that your regression model is suffering from multicollinearity. How would you check if that’s true? Without losing any information, can you still build a better model?
- When is Ridge regression favorable over Lasso regression?
- How would you choose between two tree-based algorithms? How is random forest different from the gradient boosting algorithm (GBM)?
- Python or R – Which one would you prefer for text analytics?
- What is a p-value?
- What is Regularization? Which problem does Regularization try to solve?
- How can you fit a non-linear relation between X (say, Age) and Y (say, Income) in a linear model? Show mathematically the marginal effect of X on Y based on your proposed solution.
- What is the probability of getting a sum of 2 if I roll 2 fair dice? What about a sum of 4? Of 7?
- Which Python libraries for analytics/data science are you familiar with?
- Describe a data science project that you led or participated in.
- What is an eigenvalue? (linear algebra)
- Time series: if you have a data set with 100 observations for each Xi, and 3 lag-effect variables of X1, how many predictions will you have if you run a simple linear regression?
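Several of these questions are easy to sanity-check in code before an interview. As one illustration, here is a minimal sketch for the dice question above, assuming two fair six-sided dice rolled independently. The function name `two_dice_sum_probabilities` is made up for this example; it simply enumerates all 36 equally likely outcomes and counts how many give each sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_sum_probabilities(sides=6):
    """Return {sum: probability} for two fair dice with `sides` faces each.

    Hypothetical helper for illustration: it enumerates every ordered
    pair of faces, which are all equally likely for fair dice.
    """
    faces = range(1, sides + 1)
    # Count how many ordered (first die, second die) pairs produce each sum.
    counts = Counter(a + b for a, b in product(faces, repeat=2))
    total_outcomes = sides ** 2
    return {s: Fraction(c, total_outcomes) for s, c in counts.items()}


probs = two_dice_sum_probabilities()
for target in (2, 4, 7):
    print(f"P(sum = {target}) = {probs[target]}")
```

Exact fractions are used instead of floats so the counting argument stays visible: a sum of 2 can happen one way out of 36, a sum of 4 three ways, and a sum of 7 six ways.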
Some Important Resources for Data Science Interviews
Now that you have spent days preparing for interviews, you’re ready to put yourself on the job market. Thankfully, many people have written about their experiences interviewing for data science roles.
- Crushed it! Landing a data science job (Erin Shellman)
- How to Land a High-Paying Data Science Job (Even If You Have the Wrong Background) (Minda Zetlin)
- What it’s like to be on the data science job market (Trey Causey)
- Building a data science portfolio: Storytelling with data (DataQuest.io)
- 5 secrets for writing the perfect data scientist resume (The Data Incubator)
- [VIDEO] What it’s Like to Interview as a Data Scientist (Dose of Data)
- [VIDEO] Lessons Learned the Hard Way: Hacking the Data Science Interview (Galvanize)