Introduction To Machine Learning Interview Questions
In this article, we discuss the top machine learning interview questions and their answers, which can help you crack your next machine learning interview.
Q1. What are the different types of machine learning?
This is one of the most important questions in this series. Machine learning is commonly divided into three types:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
- Supervised Learning
Supervised learning uses labeled data, meaning the correct output is known for every training example. We train the model on the training data and evaluate it on the test data.
Regression and classification problems are solved with supervised learning.
- Unsupervised learning
Unsupervised learning uses unlabeled data, meaning we do not know the correct output, and the model is left to find structure in the data without guidance. A typical example is grouping the data into clusters according to feature similarity.
Clustering problems are solved with unsupervised learning.
- Reinforcement Learning
Reinforcement learning is a part of machine learning in which an agent is placed in an environment and learns how to behave by performing actions and observing the rewards it gets from them.
Reward-based problems are solved with reinforcement learning.
Q2. Explain Classification and Regression
- Regression:- A regression problem has a continuous, quantitative output. For example, predicting salary from years of experience, or predicting a stock price over a period of time. These problems are solved with supervised learning algorithms such as linear regression and multiple linear regression.
- Classification:- A classification problem has a categorical output. For example, predicting whether a person has a disease or not, or whether an email is spam or not. These problems are solved with supervised classification algorithms such as logistic regression and k-nearest neighbors.
Q3. What is the confusion matrix?
A confusion matrix is a table that summarizes the performance of a classification model by comparing the predicted classes with the actual classes. For a binary problem it has four cells:
- TN:- True Negative
- TP:- True Positive
- FP:- False Positive (Type I error)
- FN:- False Negative (Type II error)
Q4. Which is more important to you – model accuracy or model performance?
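Model accuracy is a subset of model performance: it is only one of the measures used to judge how well a model performs. Better model performance usually goes along with better accuracy, but accuracy alone can be misleading (for example, on imbalanced data), so overall model performance is the more important of the two.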
Q5. Explain false negative, false positive, true negative and true positive with a simple example.
- True Positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
- True Negatives (TN): We predicted no, and they don’t have the disease.
- False Positives (FP): We predicted yes, but they don’t actually have the disease. (Also known as a “Type I error.”)
- False Negatives (FN): We predicted no, but they actually do have the disease. (Also known as a “Type II error.”)
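The four cases above can be counted directly from a list of actual and predicted labels. The following is a minimal sketch in plain Python; the function name `confusion_counts` and the sample labels are made up for illustration and are not taken from any library.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count TP, TN, FP and FN for a binary classification problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted yes, actually yes
        elif p != positive and a != positive:
            tn += 1  # predicted no, actually no
        elif p == positive and a != positive:
            fp += 1  # predicted yes, actually no (Type I error)
        else:
            fn += 1  # predicted no, actually yes (Type II error)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels: "yes" means the patient has the disease.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```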
Q6. Difference between K-Nearest neighbor and K Means
- K-Nearest Neighbor:- A supervised machine learning technique, mainly used for classification problems. The k in KNN is the number of nearest points (neighbors) considered.
- K-Means:- An unsupervised machine learning technique used for clustering problems. The k in k-means is the number of clusters.
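To make the contrast concrete, here is a small sketch of both techniques on one-dimensional data. It is a simplified illustration, not a production implementation: the function names, the sample points, and the starting centroids are all assumptions chosen for the example. Note that KNN needs labels, while k-means only needs the points.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Supervised: vote among the labels of the k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def kmeans_1d(points, centroids, iterations=10):
    """Unsupervised: group unlabeled points around k centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its closest centroid.
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Move each centroid to the mean of its cluster (unchanged if empty).
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


xs = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
labels = ["low", "low", "low", "high", "high", "high"]

# k = number of neighbors that vote
print(knn_predict(xs, labels, query=2.5, k=3))  # low

# k = number of clusters (here 2, one per starting centroid)
centroids, clusters = kmeans_1d(xs, centroids=[0.0, 10.0])
print(centroids)  # [1.5, 8.5]
```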
Q7. What is the difference between Gini Impurity and Entropy in a Decision Tree?
Gini impurity and entropy are both used to find the best attribute for the root node (and for each later split) of a decision tree.
Entropy measures the randomness in the data. From it we compute the information gain of each attribute, and the attribute with the highest information gain becomes the root node.
Gini impurity is the probability that a randomly chosen sample would be misclassified if it were labeled at random according to the class distribution. It can also be used to select the root: the attribute with the smallest Gini impurity becomes the root node.
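Both measures can be computed from the class proportions in a node. The sketch below uses the standard formulas (entropy as the sum of -p * log2(p), Gini impurity as 1 minus the sum of p squared); the function names and sample labels are illustrative only.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Proportion of each class among the labels."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def entropy(labels):
    """Entropy in bits: 0 for a pure node, 1 for a 50/50 binary node."""
    return sum(-p * math.log2(p) for p in class_probabilities(labels))


def gini_impurity(labels):
    """Gini impurity: 0 for a pure node, 0.5 for a 50/50 binary node."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


mixed = ["yes", "yes", "no", "no"]
pure = ["yes", "yes", "yes", "yes"]

print(entropy(mixed), gini_impurity(mixed))  # 1.0 0.5
print(entropy(pure), gini_impurity(pure))    # 0.0 0.0
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, which is why a tree prefers splits that drive them down.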
Q8. What is the difference between Entropy and Information Gain?
Entropy measures the randomness in the data. It generally decreases as you get closer to the leaf nodes.
Information gain is the reduction in entropy obtained by splitting the dataset on an attribute. The attribute with the highest information gain is chosen as the root node.
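As a rough illustration, information gain can be computed as the entropy of the parent node minus the size-weighted entropy of the child nodes produced by a split. The function names and the two example splits below are assumptions made for this sketch.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy in bits of a list of class labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes", "yes", "yes", "no", "no", "no"]

# A split that separates the classes completely removes all entropy.
perfect_split = [["yes", "yes", "yes"], ["no", "no", "no"]]

# A split that leaves every child as mixed as the parent gains nothing.
useless_split = [["yes", "no"], ["yes", "no"], ["yes", "no"]]

print(round(information_gain(parent, perfect_split), 3))  # 1.0
print(round(abs(information_gain(parent, useless_split)), 3))  # 0.0
```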
Q9. What is Overfitting? And how do you ensure you’re not overfitting with a model?
Overfitting occurs when a model fits the training data too closely, including its noise, and as a result performs poorly on the test data.
Methods to avoid overfitting:
- Use ensemble methods such as random forest, which reduces variance through bagging: it trains multiple decision trees and combines their results.
- Collect more data so that the model is trained on more varied samples.
- Use cross-validation.
- Choose the right algorithm.
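Cross-validation, one of the methods listed above, can be sketched as follows: the data is divided into k folds, and each fold takes a turn as the validation set while the rest is used for training. The helper `k_fold_indices` is a hypothetical name written for this example, and it does not shuffle the data, which real workflows usually do.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


folds = list(k_fold_indices(n_samples=6, k=3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
# train: [2, 3, 4, 5] validation: [0, 1]
# train: [0, 1, 4, 5] validation: [2, 3]
# train: [0, 1, 2, 3] validation: [4, 5]
```

A model that scores well on the training folds but poorly on the validation folds is likely overfitting.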
Q10. Explain the Ensemble learning technique in Machine Learning.
Ensemble learning builds multiple machine learning models and combines them to get better accuracy than a single model. The training dataset is split into multiple subsets, and a separate model is built on each subset. After the models are trained, their predictions are combined to predict the outcome in a way that reduces the variance of the output.
Q11. What are collinearity and multicollinearity?
Collinearity occurs when two independent variables (e.g., x1 and x2) in a multiple regression are correlated with each other.
Multicollinearity occurs when more than two independent variables (e.g., x1, x2, and x3) are inter-correlated.
Q12. What is bagging and boosting in Machine Learning?
Bagging trains similar learners on small samples of the population and then takes the mean of all their predictions.
In generalized bagging, you can use different learners on different populations. This helps to reduce the variance error.
Boosting is an iterative technique that adjusts the weight of an observation based on the last classification.
If an observation was classified incorrectly, boosting increases the weight of that observation, and vice versa.
In general, boosting decreases the bias error and builds strong predictive models.
Q13. What do you understand by Precision and Recall?
Precision: When it predicts yes, how often is it correct?
TP/predicted yes = 100/110 = 0.91
Recall: When it’s actually yes, how often does it predict yes?
TP/actual yes = 100/105 = 0.95
Also known as “Sensitivity” or “True Positive Rate”
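The arithmetic above can be reproduced in a few lines. The counts are derived from the figures in the answer: 100 true positives out of 110 predicted yes implies 10 false positives, and 100 out of 105 actual yes implies 5 false negatives. The function names are illustrative.

```python
def precision(tp, fp):
    """Of everything predicted yes, the share that is actually yes."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Of everything that is actually yes, the share predicted yes."""
    return tp / (tp + fn)


tp, fp, fn = 100, 10, 5

print(round(precision(tp, fp), 2))  # 0.91
print(round(recall(tp, fn), 2))     # 0.95
```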
Q14. What is ROC curve and what does it represent?
The Receiver Operating Characteristic curve (or ROC curve) is a fundamental tool for diagnostic test evaluation. It is a plot of the true positive rate (Sensitivity) against the false positive rate (1 - Specificity) for the different possible cut-off points of a diagnostic test.
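One way to see where the points of a ROC curve come from is to compute the two rates at several cut-offs. The sketch below does this for a handful of made-up scores; the function name `roc_points`, the scores, and the thresholds are assumptions for illustration.

```python
def roc_points(actual, scores, thresholds):
    """Return (false_positive_rate, true_positive_rate) for each cut-off."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = []
    for t in thresholds:
        # Predict positive whenever the score reaches the cut-off.
        predicted = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
        fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
        points.append((fp / negatives, tp / positives))
    return points


actual = [1, 1, 0, 1, 0, 0]                # 1 = positive, 0 = negative
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]    # hypothetical model scores

points = roc_points(actual, scores, thresholds=[1.0, 0.75, 0.5, 0.0])
for fpr, tpr in points:
    print(round(fpr, 2), round(tpr, 2))
# 0.0 0.0
# 0.0 0.67
# 0.33 1.0
# 1.0 1.0
```

Lowering the cut-off moves the point up and to the right: more true positives are caught, at the cost of more false positives.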
Q15. What is the difference between R Square and Adjusted R Square?
Coefficient of Determination (R Square):- It gives the proportion of the variation in Y that can be explained by the independent variables. Mathematically:
R2 = SSR/SST or R2 = Explained variation / Total variation
Adjusted R Square:- Adding more independent variables or predictors to a regression model tends to increase the R-squared value, which tempts makers of the model to add even more. This is called overfitting and can return an unwarrantedly high R-squared value. Adjusted R-squared penalizes the addition of independent variables, so it shows how reliable the fit is and how much of it is due merely to adding variables.
Adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1)
n = number of points in the dataset
p = number of independent variables in the model
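The two formulas can be checked with a short calculation. This sketch computes R2 as 1 - (residual variation / total variation), which equals explained variation / total variation for a least-squares fit with an intercept; the sample values and function names are invented for the example.

```python
def r_squared(y_true, y_pred):
    """R2 = 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_total = sum((y - mean_y) ** 2 for y in y_true)
    ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    return 1 - ss_residual / ss_total


def adjusted_r_squared(r2, n, p):
    """Adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]  # hypothetical predictions

r2 = r_squared(y_true, y_pred)
print(round(r2, 4))                                  # 0.99
print(round(adjusted_r_squared(r2, n=5, p=1), 4))    # 0.9867
print(round(adjusted_r_squared(r2, n=5, p=3), 4))    # 0.96
```

With the same R2, the adjusted value drops as p grows, which is the penalty for adding predictors.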
Q16. What are the assumptions of linear regression?
- Linear relationship
- Multivariate normality
- No or little multicollinearity
- No autocorrelation
- Homoscedasticity
Q17. What Is autocorrelation? Which test is used to find autocorrelation?
If the values of a column or feature are correlated with earlier values of that same column, the column is said to be autocorrelated. In other words, it is correlation within a column.
The Durbin-Watson (DW) test is generally used to check for autocorrelation.
Its statistic ranges from 0 to 4:
- values from 0 to below 2 indicate positive autocorrelation
- 2 means no autocorrelation
- values above 2 up to 4 indicate negative autocorrelation
The Durbin-Watson test works well for first-order autocorrelation, whereas the Breusch-Godfrey test is used for higher orders (2, 3, 4).
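The Durbin-Watson statistic itself is simple to compute from a series of residuals: the sum of squared differences between consecutive residuals, divided by the sum of squared residuals. The sketch below uses two made-up residual series to show values on either side of 2; the function name is illustrative.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Neighbouring residuals mostly share a sign: positive autocorrelation.
trending = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

# Neighbouring residuals always flip sign: negative autocorrelation.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(round(durbin_watson(trending), 2))     # 0.67
print(round(durbin_watson(alternating), 2))  # 3.33
```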
Q18. Difference between homoscedasticity and heteroscedasticity?
Homoscedasticity:- If the residuals have constant variance, so that they are spread evenly around the residual line, the data is said to be homoscedastic.
Heteroscedasticity:- If the variance of the residuals is unequal along the residual line, the data is said to be heteroscedastic. In this case, the residuals can form a bow-tie, arrow, or other non-symmetric shape.
Q19. Difference between Parameter and Hyperparameter?
Parameter:- A model parameter is a configuration variable that is internal to the model and whose value can be estimated from the given data. Parameters are required by the model when making predictions. Their values define the skill of the model on your problem. They are estimated or learned from data. They are often not set manually by the practitioner. They are often saved as part of the learned model.
Hyperparameter:- A model hyperparameter is a configuration that is external to the model and whose value cannot be estimated from data. Hyperparameters are often used in processes to help estimate model parameters. They are often specified by the practitioner. They can often be set using heuristics. They are often tuned for a given predictive modeling problem.
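A small gradient-descent example may help separate the two ideas. In this sketch the slope `w` is the parameter, learned from the data, while `learning_rate` and `epochs` are hyperparameters chosen by the practitioner. The function name, data, and chosen values are assumptions for illustration.

```python
def fit_slope(xs, ys, learning_rate=0.01, epochs=500):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # parameter: internal to the model, learned from the data
    n = len(xs)
    for _ in range(epochs):  # epochs: hyperparameter, set before training
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * gradient  # learning_rate: hyperparameter
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by y = 2x

w = fit_slope(xs, ys, learning_rate=0.01, epochs=500)
print(round(w, 3))  # 2.0
```

Only `w` would be saved with the trained model; the learning rate and epoch count only steer how `w` is found.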
Q20. Suppose you found that your model is suffering from low bias and high variance. Which algorithm do you think could handle this situation, and why?
Low bias and high variance mean that the model is overfitting.
To handle high variance, use random forest (a bagging method).
Bagging creates subsets of the data and builds multiple decision trees on them. For a classification problem, the prediction is the majority class across the trees; for a regression problem, it is the average of the trees' predictions.
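The two steps described above, drawing subsets and combining the trees' outputs, can be sketched without training real trees. In the example below the per-tree predictions are hard-coded stand-ins, and all function names are hypothetical; it only illustrates bootstrap sampling and the two aggregation rules.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement to form one training subset."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Classification: the class predicted by most trees wins."""
    return Counter(predictions).most_common(1)[0][0]


def average(predictions):
    """Regression: the mean of the trees' predictions."""
    return sum(predictions) / len(predictions)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = list(range(10))

# One bootstrap subset per tree; each tree would be trained on its own subset.
samples = [bootstrap_sample(data, rng) for _ in range(3)]

# Stand-in predictions from three trees for a single input.
tree_votes = ["spam", "not spam", "spam"]
tree_values = [10.0, 12.0, 14.0]

print(majority_vote(tree_votes))  # spam
print(average(tree_values))       # 12.0
```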
In this article, we discussed the most important machine learning interview questions and their answers. We hope this helps you prepare for your next machine learning interview.