## Data Science Interview Questions

By ganpati | Interview Corner

Here are some other frequently asked Data Science Questions:

## Algorithms

- How will you compare two or more algorithms and decide which one is better?
- Have you optimized an algorithm for speed? How, and by how much?
- How will you choose between parallel processing and/or faster algorithms? Explain with examples.
- How can you verify that an improvement you’ve brought to an algorithm is really an improvement?
- How will you define a good clustering algorithm?
- How would you improve a spam detection algorithm that uses naive Bayes?
- What is the gradient descent method (the intuition is mostly enough)?
- Which clustering methods are you familiar with?
- You are given a data set. The data set has missing values that are spread within 2 standard deviations of the median. What percentage of the data would remain unaffected? Why?
- What is the difference between covariance and correlation?
- Is it possible to capture the correlation between a continuous and a categorical variable? If yes, how?
- Explain prior probability, likelihood and marginal likelihood in the context of the naive Bayes algorithm.
- You learn that your model is suffering from low bias and high variance. Which algorithm should you use to tackle it? Why?
- How is kNN different from k-means clustering?
- How are true positive rate and recall related? Write the equation.
- You were told that your regression model is suffering from multicollinearity. How would you check if that’s true? Without losing any information, can you still build a better model?
- When is Ridge regression favorable over Lasso regression?
- How would you choose between two tree-based algorithms? How is a random forest different from a gradient boosting machine (GBM)?
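
As one way to approach the covariance versus correlation question in the list above, here is a minimal Python sketch. The function names and the height/weight numbers are made up for illustration. It assumes sample (n - 1) covariance and Pearson correlation, and shows that covariance changes with the units of measurement while correlation does not.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical data: the same heights expressed in metres and in centimetres.
heights_m = [1.5, 1.6, 1.7, 1.8]
heights_cm = [h * 100 for h in heights_m]
weights_kg = [50.0, 58.0, 66.0, 74.0]

# Covariance grows by a factor of 100 when the units change ...
print(covariance(heights_m, weights_kg), covariance(heights_cm, weights_kg))
# ... while correlation stays the same, because it is unit-free.
print(correlation(heights_m, weights_kg), correlation(heights_cm, weights_kg))
```

In an interview answer, the point to draw out is that correlation is covariance rescaled to lie between -1 and 1, which makes it comparable across variable pairs.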

## Statistics

## Naïve Bayes

## Models

- How would you train and deploy a logistic regression model? A recommender system?
- How would you monitor that the performance of a model you trained does not degrade over time?
- What is the curse of dimensionality and how should one deal with it when building machine-learning models?
- What’s more important: predictive power or interpretability of a model?
- Explain to the company management what model lift is and why it is important.
- Explain the statement: “Algorithm can be universal but not the model”.
- What are Recommender Systems?
- Why does data cleaning play a vital role in analysis?
- Differentiate between univariate, bivariate and multivariate analysis.
- What is power analysis?
- What is Collaborative filtering?
- What is the difference between cluster sampling and systematic sampling?
- How can you assess a good logistic model?
- How can you iterate over a list and also retrieve element indices at the same time?
- Explain the Box-Cox transformation in regression models.
- Write a function that takes in two sorted lists and outputs a sorted list that is their union.
- What is the difference between Bayesian Inference and Maximum Likelihood Estimation (MLE)?
- What is Regularization and what kind of problems does regularization solve?
- What is multicollinearity and how can you overcome it?
- What is the curse of dimensionality?
- How do you decide whether your linear regression model fits the data?
- What is the difference between squared error and absolute error?
- Which technique is used to predict categorical responses?
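
Two of the questions above are small coding exercises: iterating over a list while retrieving indices, and merging two sorted lists into a sorted union. A possible Python sketch follows. The helper names `indexed_pairs` and `sorted_union` are made up for illustration, and the sketch assumes that "union" means set-style union, so duplicate values are emitted once. If the interviewer wants duplicates kept, the two de-duplication checks can be dropped.

```python
def indexed_pairs(items):
    """Return (index, element) pairs using the built-in enumerate()."""
    return [(index, value) for index, value in enumerate(items)]


def sorted_union(a, b):
    """Merge two sorted lists into one sorted list without duplicates.

    Uses the two-pointer merge from merge sort, so it runs in
    O(len(a) + len(b)) time and never re-sorts the data.
    """
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            candidate = a[i]
            i += 1
        elif b[j] < a[i]:
            candidate = b[j]
            j += 1
        else:  # equal values: advance both pointers, keep one copy
            candidate = a[i]
            i += 1
            j += 1
        if not result or result[-1] != candidate:
            result.append(candidate)
    # At most one of the two lists still has elements left.
    for rest in (a[i:], b[j:]):
        for value in rest:
            if not result or result[-1] != value:
                result.append(value)
    return result


print(indexed_pairs(["a", "b", "c"]))            # [(0, 'a'), (1, 'b'), (2, 'c')]
print(sorted_union([1, 2, 2, 5], [2, 3, 5, 8]))  # [1, 2, 3, 5, 8]
```

A shorter answer such as `sorted(set(a) | set(b))` also works, but it ignores the fact that the inputs are already sorted, which is usually what the question is probing.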

## Other Questions

- Python or R – Which one would you prefer for text analytics?
- What is a p-value?
- What is Regularization? Which problem does Regularization try to solve?
- How can you fit a non-linear relationship between X (say, Age) and Y (say, Income) in a linear model? Show mathematically the marginal effect of X on Y based on your proposed solution.
- What is the probability of getting a sum of 2 when rolling two fair dice? A sum of 4? A sum of 7?
- Which Python libraries for analytics/data science are you familiar with?
- Describe a data science project that you led or participated in.
- What is an eigenvalue? (linear algebra)
- Time series: if you have a data set with 100 observations for each Xi, and 3 lag-effect variables of X1, how many predictions will you have if you run a simple linear regression?
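
The dice question in the list above can be checked by brute-force enumeration. The sketch below assumes two fair six-sided dice, and the function name `prob_of_sum` is made up for illustration. It counts the outcomes that hit the target sum and divides by the 36 equally likely rolls.

```python
from fractions import Fraction
from itertools import product


def prob_of_sum(target, num_dice=2, sides=6):
    """Exact probability that fair dice add up to `target`."""
    # Every ordered roll is equally likely, e.g. (1, 3) and (3, 1) both count.
    outcomes = list(product(range(1, sides + 1), repeat=num_dice))
    favourable = sum(1 for roll in outcomes if sum(roll) == target)
    return Fraction(favourable, len(outcomes))


for target in (2, 4, 7):
    print(target, prob_of_sum(target))
# 2 1/36
# 4 1/12
# 7 1/6
```

Counting by hand gives the same result: only (1, 1) sums to 2, three rolls sum to 4, and six rolls sum to 7.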

## Some Important Resources for Data Science Interviews

Now that you have spent days preparing for the interviews, you’re ready to put yourself on the job market. Thankfully, plenty of people have written about their experiences interviewing for data science roles.

- Crushed it! Landing a data science job (Erin Shellman)
- How to Land a High-Paying Data Science Job (Even If You Have the Wrong Background) (Minda Zetlin)
- What it’s like to be on the data science job market (Trey Causey)
- Building a data science portfolio: Storytelling with data (DataQuest.io)
- 5 secrets for writing the perfect data scientist resume (The Data Incubator)
- [VIDEO] What it’s Like to Interview as a Data Scientist (Dose of Data)
- [VIDEO] Lessons Learned the Hard Way: Hacking the Data Science Interview (Galvanize)