Commit bafb8c8

Author: Amogh Singhal
Update interview_prep.md
1 parent 648b98c commit bafb8c8

1 file changed: +33 -0 lines changed

interview_prep.md

Lines changed: 33 additions & 0 deletions
@@ -2,3 +2,36 @@

When two or more predictors are highly correlated with each other, such that one predictor can be derived as a linear combination of the other predictors, the predictors are said to be collinear.
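
As a minimal sketch of this definition, the snippet below checks the simplest case: one predictor that is an exact linear function of another. The data and the names `height_cm`, `height_in`, and `weight_kg` are made up for illustration. A pairwise correlation near 1 flags the collinear pair; detecting a combination of several predictors would need something like a variance inflation factor, which is not shown here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical predictors: the same height recorded in two units.
height_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
height_in = [h / 2.54 for h in height_cm]    # exact linear function of height_cm
weight_kg = [52.0, 70.0, 61.0, 85.0, 77.0]   # related, but not a linear function

r_collinear = pearson(height_cm, height_in)  # essentially 1.0: collinear pair
r_other = pearson(height_cm, weight_kg)      # clearly below 1.0

print(round(r_collinear, 6), round(r_other, 3))
```

In practice, one of the two height columns would usually be dropped before fitting a linear model.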

### 2. What is the difference between standardization and normalization? Why is each useful?

### 3. What is the central limit theorem? Why is it useful?

### 4. What is the interquartile range? Why is it useful?

### 5. What is the difference between a t-test and a z-test? When is each useful?

### 6. Why do we divide by n-1 when calculating the sample variance? Why is it useful?

Read about Bessel's correction.
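
A small sketch of what the correction changes, using a made-up sample: dividing the sum of squared deviations by n gives the population formula, while dividing by n-1 gives the sample variance. The helper `variance` and its `ddof` parameter are names chosen for this illustration.

```python
import math
import statistics

def variance(values, ddof):
    """Sum of squared deviations from the mean, divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample, mean = 5

biased = variance(sample, ddof=0)    # divides by n:     32 / 8
unbiased = variance(sample, ddof=1)  # divides by n - 1: 32 / 7

# The standard library draws the same distinction.
matches_pvariance = math.isclose(biased, statistics.pvariance(sample))
matches_variance = math.isclose(unbiased, statistics.variance(sample))

print(biased, round(unbiased, 3), matches_pvariance, matches_variance)
```

The n-1 version is always the larger of the two, which compensates for the deviations being measured from the sample mean rather than the unknown population mean.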

### 7. What are the assumptions of the normal distribution? Why is it useful?

### 8. What are the different approaches to outlier detection? How will you handle the outliers? Why is it useful?

### 9. When is RMSE a poor choice of metric? How do we address this?

### 10. What are the loss functions used in logistic regression?

Log loss (binary cross-entropy).
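
For reference, a minimal sketch of binary log loss, averaged over examples. The function name `log_loss`, the clipping constant `eps`, and the example labels and probabilities are assumptions made for this illustration.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1]                                # hypothetical ground truth
confident_right = log_loss(labels, [0.9, 0.1, 0.8])
uninformed = log_loss(labels, [0.5, 0.5, 0.5])    # equals ln(2)
confident_wrong = log_loss(labels, [0.1, 0.9, 0.2])

print(round(confident_right, 4), round(uninformed, 4), round(confident_wrong, 4))
```

The loss grows sharply as the model becomes confident about the wrong class, which is the behaviour that makes it suitable for training a probabilistic classifier.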

### 11. Explain random forest in layman's terms.

### 12. How does logistic regression work in layman's terms?

### 13. Why is logistic regression a bad idea for multiclass classification?

### 14. How do you perform the train-test split in time series modelling?

### 15. What is the impact on a time series model when the data has large variation?

### 16. How do you decide the value of K (the number of clusters) in K-means clustering?

### 17. What are the advantages and disadvantages of undersampling and oversampling?

### 18. Which supervised algorithms are not impacted by imbalanced data?

### 19. You are a placement coordinator. How would you design a resume recommendation system aligned to a company's requirements?

_First Strategy_

a. Use K-means clustering to group the resumes into clusters
b. Use a ranking algorithm to sort them by relevance

_Second Strategy_

a. Perform document similarity using Hamming distance (a distance-based approach)
b. Compute the distance between the job description (JD) document and each resume
c. Shortlist the top K resumes
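
The second strategy can be sketched as follows. Hamming distance needs vectors of equal length, so this sketch assumes each document is first turned into a binary vector over a fixed vocabulary. The vocabulary `VOCAB`, the helper names, and the sample texts are all hypothetical.

```python
# Hypothetical fixed vocabulary of skills; a real system would derive this from data.
VOCAB = ["python", "sql", "statistics", "java", "excel", "communication"]

def to_binary_vector(text, vocab=VOCAB):
    """1 if the vocabulary term appears in the text, else 0."""
    tokens = set(text.lower().split())
    return [1 if term in tokens else 0 for term in vocab]

def hamming_distance(a, b):
    """Number of positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))

def shortlist(jd_text, resumes, k):
    """Return the names of the k resumes closest to the job description."""
    jd_vec = to_binary_vector(jd_text)
    scored = sorted(
        (hamming_distance(jd_vec, to_binary_vector(text)), name)
        for name, text in resumes.items()
    )
    return [name for _, name in scored[:k]]

jd = "python sql statistics"
resumes = {
    "resume_a": "python sql statistics excel",  # distance 1
    "resume_b": "java communication",           # distance 5
    "resume_c": "python sql statistics",        # distance 0
}

top_two = shortlist(jd, resumes, k=2)
print(top_two)  # ['resume_c', 'resume_a']
```

Ties are broken alphabetically by name here, purely as a side effect of sorting `(distance, name)` pairs.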

### 20. How will you encode a feature like PinCode that has a very large number of discrete values?

Target mean encoding.
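
A minimal sketch of target mean encoding, assuming a binary target and made-up pin codes: each category is replaced by the mean of the target over the rows that share it. The function name and the fallback to the global mean for unseen categories are choices made for this illustration.

```python
from collections import defaultdict

def fit_target_mean(categories, targets):
    """Return (per-category target mean, global target mean)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    means = {category: sums[category] / counts[category] for category in sums}
    global_mean = sum(targets) / len(targets)
    return means, global_mean

# Hypothetical training data: pin code and a 0/1 outcome.
pincodes = ["560001", "560001", "110001", "110001", "110001", "400001"]
outcomes = [1, 0, 1, 1, 0, 1]

mapping, global_mean = fit_target_mean(pincodes, outcomes)

# Unseen categories fall back to the global mean (an assumption of this sketch).
encoded = [mapping.get(p, global_mean) for p in pincodes + ["999999"]]
print(encoded)
```

The mapping should be fitted on training data only, since computing it on rows that are later used for evaluation leaks the target into the feature.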

### 21. How do you design the architecture of a neural network?
