You are not supposed to be here, which is awkward. Maybe you should head back!
This is still in development. And if you are here, that probably means you are interested in this. So please, please, please contribute. You can SUBMIT simple text/markdown content, and I will format it!
\ No newline at end of file
+ Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/Django/index.html b/Cheat-Sheets/Django/index.html
index 176077a..158c785 100644
--- a/Cheat-Sheets/Django/index.html
+++ b/Cheat-Sheets/Django/index.html
@@ -1 +1 @@
- Django - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/Flask/index.html b/Cheat-Sheets/Flask/index.html
index 864896b..753f79c 100644
--- a/Cheat-Sheets/Flask/index.html
+++ b/Cheat-Sheets/Flask/index.html
@@ -1 +1 @@
- Flask - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/Hypothesis-Tests/index.html b/Cheat-Sheets/Hypothesis-Tests/index.html
index 285ad15..164cbf3 100644
--- a/Cheat-Sheets/Hypothesis-Tests/index.html
+++ b/Cheat-Sheets/Hypothesis-Tests/index.html
@@ -1,4 +1,4 @@
- Hypothesis Tests in Python (Cheat Sheet) - Data Science Interview preparation
A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters.
A few notes:
When it comes to assumptions such as the expected distribution of data or sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable if an assumption is violated.
Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.
In some cases, the data can be corrected to meet the assumptions, such as correcting a nearly normal distribution to be normal by removing outliers, or using a correction to the degrees of freedom in a statistical test when samples have differing variance, to name two examples.
Normality Tests
This section lists statistical tests that you can use to check if your data has a Gaussian distribution.
Tests whether a data sample has a Gaussian (normal) distribution.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Interpretation
H0: the sample has a Gaussian distribution.
H1: the sample does not have a Gaussian distribution.
Python Code
# Example of the Shapiro-Wilk Normality Test
from scipy.stats import shapiro
data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, p = shapiro(data)
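The snippet stops after computing the statistic and p-value. A minimal sketch of the interpretation step, assuming the conventional significance level alpha = 0.05 (the threshold is a convention you choose, not part of the test itself):

```python
# Interpret the p-value returned by a normality test such as shapiro().
# alpha = 0.05 is an assumed, conventional significance level; pick the
# level appropriate for your application.
def interpret_normality(p, alpha=0.05):
    if p > alpha:
        return "Fail to reject H0: sample looks Gaussian"
    return "Reject H0: sample does not look Gaussian"

# e.g. with the p-value computed above:
# print(interpret_normality(p))
```

Note that failing to reject H0 does not prove the sample is Gaussian; it only means the test found no strong evidence against it.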
diff --git a/Cheat-Sheets/Keras/index.html b/Cheat-Sheets/Keras/index.html
index b86e122..a8644ec 100644
--- a/Cheat-Sheets/Keras/index.html
+++ b/Cheat-Sheets/Keras/index.html
@@ -1 +1 @@
- Keras - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/NumPy/index.html b/Cheat-Sheets/NumPy/index.html
index cef5099..d99b41f 100644
--- a/Cheat-Sheets/NumPy/index.html
+++ b/Cheat-Sheets/NumPy/index.html
@@ -1 +1 @@
- NumPy - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/Pandas/index.html b/Cheat-Sheets/Pandas/index.html
index ffad173..274789b 100644
--- a/Cheat-Sheets/Pandas/index.html
+++ b/Cheat-Sheets/Pandas/index.html
@@ -1 +1 @@
- Pandas - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/PySpark/index.html b/Cheat-Sheets/PySpark/index.html
index c5d88f3..eb29b82 100644
--- a/Cheat-Sheets/PySpark/index.html
+++ b/Cheat-Sheets/PySpark/index.html
@@ -1 +1 @@
- PySpark - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/PyTorch/index.html b/Cheat-Sheets/PyTorch/index.html
index 9bce0db..9373aa1 100644
--- a/Cheat-Sheets/PyTorch/index.html
+++ b/Cheat-Sheets/PyTorch/index.html
@@ -1 +1 @@
- PyTorch - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/Python/index.html b/Cheat-Sheets/Python/index.html
index 492aabc..c0012ec 100644
--- a/Cheat-Sheets/Python/index.html
+++ b/Cheat-Sheets/Python/index.html
@@ -1 +1 @@
- Python - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/SQL/index.html b/Cheat-Sheets/SQL/index.html
index 99a95df..54c62c9 100644
--- a/Cheat-Sheets/SQL/index.html
+++ b/Cheat-Sheets/SQL/index.html
@@ -1 +1 @@
- SQL - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/Sk-learn/index.html b/Cheat-Sheets/Sk-learn/index.html
index f3095cf..684190e 100644
--- a/Cheat-Sheets/Sk-learn/index.html
+++ b/Cheat-Sheets/Sk-learn/index.html
@@ -1 +1 @@
- Scikit Learn - Data Science Interview preparation
\ No newline at end of file
diff --git a/Cheat-Sheets/tensorflow/index.html b/Cheat-Sheets/tensorflow/index.html
index 548eb8a..1e8dacc 100644
--- a/Cheat-Sheets/tensorflow/index.html
+++ b/Cheat-Sheets/tensorflow/index.html
@@ -1 +1 @@
- TensorFlow - Data Science Interview preparation
\ No newline at end of file
diff --git a/Deploying-ML-models/deploying-ml-models/index.html b/Deploying-ML-models/deploying-ml-models/index.html
index c398aa1..81654e4 100644
--- a/Deploying-ML-models/deploying-ml-models/index.html
+++ b/Deploying-ML-models/deploying-ml-models/index.html
@@ -1,4 +1,4 @@
- Home - Data Science Interview preparation
This is a completely open-source platform for maintaining a curated list of interview questions and answers for people looking for and preparing for data science opportunities.
Not only this, the platform will also serve as a one-stop destination for all your needs, like tutorials, online materials, etc.
This platform is maintained by you! You can help us by answering/improving existing questions as well as by sharing any new questions that you faced during your interviews.
Contribute to the platform
Contribution in any form will be deeply appreciated.
Add questions
Add your questions here. Please ensure you provide a detailed description, to allow your fellow contributors to understand your questions and answer them to your satisfaction.
Please note that, as of now, you cannot directly add a question via a pull request. This helps us maintain the quality of the content for you.
Add answers/topics
These are the answers/topics that need your help at the moment
These are currently the most commonly asked questions. Questions may be removed if they are no longer popular in interview circles, and new ones added as new question banks are released.
\ No newline at end of file
+ Interview Questions - Data Science Interview preparation
\ No newline at end of file
diff --git a/Interview-Questions/Natural-Language-Processing/index.html b/Interview-Questions/Natural-Language-Processing/index.html
index 4875606..c3bb0b1 100644
--- a/Interview-Questions/Natural-Language-Processing/index.html
+++ b/Interview-Questions/Natural-Language-Processing/index.html
@@ -1 +1 @@
- NLP Questions - Data Science Interview preparation
\ No newline at end of file
diff --git a/Interview-Questions/Probability/index.html b/Interview-Questions/Probability/index.html
index 1bc224f..e9e4915 100644
--- a/Interview-Questions/Probability/index.html
+++ b/Interview-Questions/Probability/index.html
@@ -1,4 +1,4 @@
- Probability Questions - Data Science Interview preparation
+ Probability Questions - Data Science Interview preparation
1.1 Average score on a die rolled at most 3 times
Question
Consider a fair 6-sided die. Your aim is to get the highest score you can, in at most 3 rolls.
A score is defined as the number that appears on the face of the die facing up after the roll. You can roll at most 3 times, but every time you roll it is up to you to decide whether you want to roll again.
The last score will be counted as your final score.
Find the average score if you rolled the die only once.
Find the average score that you can get with at most 3 rolls.
If the die is fair, why is the average score for at most 3 rolls not the same as for 1 roll?
Hint 1
Find the expected score of a single roll.
You go for the next roll exactly when the score of the current roll is less than the expected score of a single roll.
E.g., if the expected score of a single roll came out to be 4.5, you would only roll again on 1, 2, 3, 4 and not on 5, 6.
The average score if you rolled the die only once is 3.5.
For at most 3 rolls, let's try backtracking. Say you just did your second roll and have to decide whether to do your 3rd roll!
We just found that if we roll the die once, on average we can expect a score of 3.5. So we will only roll the 3rd time if the score on the 2nd roll is less than 3.5, i.e. 1, 2 or 3.
Possibilities:
2nd roll score | Probability | 3rd roll expected score | Probability
1              | 1/6         | 3.5                     | 1/6
2              | 1/6         | 3.5                     | 1/6
3              | 1/6         | 3.5                     | 1/6
4              | 1/6         | NA                      | -
5              | 1/6         | NA                      | -
6              | 1/6         | NA                      | -
(We won't roll a 3rd time if we get a score > 3 on the 2nd roll.)
So if we had 2 rolls, the average score would be:
[We roll again if the current score is less than 3.5]
(3.5)*(1/6) + (3.5)*(1/6) + (3.5)*(1/6) [roll again on 1, 2 or 3]
+
(4)*(1/6) + (5)*(1/6) + (6)*(1/6) [decide not to roll again]
= 1.75 + 2.5 = 4.25
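The backtracking argument above can be checked numerically. A minimal sketch in pure Python (no external libraries): compute the expected final score with at most n rolls by backward induction, keeping a face only when it beats the expected value of continuing.

```python
from fractions import Fraction

def expected_score(rolls_left):
    """Expected final score for a fair 6-sided die with at most
    `rolls_left` rolls, under the optimal stop/continue strategy."""
    if rolls_left == 1:
        # No choice left: the expectation of a single roll is 3.5
        return Fraction(21, 6)
    cont = expected_score(rolls_left - 1)
    # For each face, keep it if it beats the value of continuing,
    # otherwise roll again and collect the continuation value.
    return sum(max(Fraction(face), cont) for face in range(1, 7)) / 6

print(expected_score(1))  # 7/2  = 3.5
print(expected_score(2))  # 17/4 = 4.25
print(expected_score(3))  # 14/3 ≈ 4.67
```

With 3 rolls the threshold after the first roll is 4.25, so you keep only a 5 or 6, giving (5 + 6)/6 + 4*(17/4)/6 = 14/3, which answers the second part of the question.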
diff --git a/Interview-Questions/System-design/index.html b/Interview-Questions/System-design/index.html
index d224a0d..70e9459 100644
--- a/Interview-Questions/System-design/index.html
+++ b/Interview-Questions/System-design/index.html
@@ -1 +1 @@
- System Design - Data Science Interview preparation
\ No newline at end of file
diff --git a/Interview-Questions/data-structures-algorithms/index.html b/Interview-Questions/data-structures-algorithms/index.html
index 769285b..f28fae2 100644
--- a/Interview-Questions/data-structures-algorithms/index.html
+++ b/Interview-Questions/data-structures-algorithms/index.html
@@ -1,4 +1,4 @@
- Data Structure and Algorithms - Data Science Interview preparation
This project is in the early stages of development. Please contribute content if possible! You can SUBMIT simple text/markdown content, and I will format it!
This is a completely open-source platform for maintaining a curated list of interview questions and answers for people looking for and preparing for data science opportunities.
Not only this, the platform will also serve as a one-stop destination for all your needs, like tutorials, online materials, etc.
This platform is maintained by you! You can help us by answering/improving existing questions as well as by sharing any new questions that you faced during your interviews.
Contribute to the platform
Contribution in any form will be deeply appreciated.
Add questions
Add your questions here. Please ensure you provide a detailed description, to allow your fellow contributors to understand your questions and answer them to your satisfaction.
Please note that, as of now, you cannot directly add a question via a pull request. This helps us maintain the quality of the content for you.
Add answers/topics
These are the answers/topics that need your help at the moment
You can also solve existing issues on GitHub and create a pull request.
Say Thanks
If this platform helped you in any way, it would be great if you could share it with others.
Check out this platform for data science content:
+ https://singhsidhukuldeep.github.io/data-science-interview-prep/
+
You can also star the repository on GitHub and watch out for any updates
Features
Beautiful: The design is built on top of the most popular libraries, like MkDocs and Material, which allows the platform to be responsive and to work on all sorts of devices, from mobile phones to wide screens. The underlying fluid layout will always adapt perfectly to the available screen space.
Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in search (server-less) is fast and accurate in responding to any query.
Accessible:
Easy to use: The website is hosted on GitHub Pages and is free and open to use for the over 40 million users of GitHub in 100+ countries.
Easy to contribute: The website embodies the concept of collaboration to the letter, allowing anyone to add/improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
Setup
No setup is required to use the platform.
Important: It is strongly advised to use a virtual environment and not to change anything in gh-pages.
mkdocs serve - Start the live-reloading docs server.
mkdocs build - Build the documentation site.
mkdocs -h - Print help message and exit.
mkdocs gh-deploy - Use mkdocs gh-deploy --help to get a full list of options available for the gh-deploy command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the build or serve commands and reviewing the built files locally.
mkdocs new [dir-name] - Create a new project. (Not needed here; the project already exists.)
As much as this platform aims to help you with your interview preparation, it is not a shortcut to cracking one. Think of this platform as a practice field to help you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic for you.
This doesn't mean that such a feature won't be added in the future. "Never say never."
But as of now there is neither a plan nor the data to do so.
Why is this platform free?
Currently there is no major cost involved in maintaining this platform other than the time and effort put in by every contributor. If you want to help, you can contribute here.
If you still want to pay for something that is free, we would request you to donate to a charity of your choice instead.
The full list of all the contributors is available here.
Current Status
\ No newline at end of file
diff --git a/Machine-Learning/ARIMA/index.html b/Machine-Learning/ARIMA/index.html
index aec2957..865dfab 100644
--- a/Machine-Learning/ARIMA/index.html
+++ b/Machine-Learning/ARIMA/index.html
@@ -1 +1 @@
- ARIMA - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/DBSCAN/index.html b/Machine-Learning/DBSCAN/index.html
index 50c973c..3c2ba11 100644
--- a/Machine-Learning/DBSCAN/index.html
+++ b/Machine-Learning/DBSCAN/index.html
@@ -1 +1 @@
- DBSCAN - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/Decision Trees/index.html b/Machine-Learning/Decision Trees/index.html
index aee62a9..c667a71 100644
--- a/Machine-Learning/Decision Trees/index.html
+++ b/Machine-Learning/Decision Trees/index.html
@@ -1 +1 @@
- Decision Trees - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/K-means clustering/index.html b/Machine-Learning/K-means clustering/index.html
index 8e20866..5307ec9 100644
--- a/Machine-Learning/K-means clustering/index.html
+++ b/Machine-Learning/K-means clustering/index.html
@@ -1 +1 @@
- K means clustering - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/Linear Regression/index.html b/Machine-Learning/Linear Regression/index.html
index a62eabc..99e0b82 100644
--- a/Machine-Learning/Linear Regression/index.html
+++ b/Machine-Learning/Linear Regression/index.html
@@ -1 +1 @@
- Linear Regression - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/Loss Function MAE, RMSE/index.html b/Machine-Learning/Loss Function MAE, RMSE/index.html
index 49a993a..ec45b3a 100644
--- a/Machine-Learning/Loss Function MAE, RMSE/index.html
+++ b/Machine-Learning/Loss Function MAE, RMSE/index.html
@@ -1 +1 @@
- Loss Function MAE, RMSE - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/Normal Distribution/index.html b/Machine-Learning/Normal Distribution/index.html
index 27ce658..45c80c2 100644
--- a/Machine-Learning/Normal Distribution/index.html
+++ b/Machine-Learning/Normal Distribution/index.html
@@ -1 +1 @@
- Normal Distribution - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/PCA/index.html b/Machine-Learning/PCA/index.html
index 7caf80d..c4d0097 100644
--- a/Machine-Learning/PCA/index.html
+++ b/Machine-Learning/PCA/index.html
@@ -1 +1 @@
- PCA - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/Random Forest/index.html b/Machine-Learning/Random Forest/index.html
index 68d5356..d2e081e 100644
--- a/Machine-Learning/Random Forest/index.html
+++ b/Machine-Learning/Random Forest/index.html
@@ -1 +1 @@
- Random Forest - Data Science Interview preparation
\ No newline at end of file
diff --git a/Machine-Learning/kNN/index.html b/Machine-Learning/kNN/index.html
index bfdfd48..dcdba22 100644
--- a/Machine-Learning/kNN/index.html
+++ b/Machine-Learning/kNN/index.html
@@ -1 +1 @@
- kNN - Data Science Interview preparation
\ No newline at end of file
diff --git a/Online-Material/Online-Material-for-Learning/index.html b/Online-Material/Online-Material-for-Learning/index.html
index fac3230..8ee51bc 100644
--- a/Online-Material/Online-Material-for-Learning/index.html
+++ b/Online-Material/Online-Material-for-Learning/index.html
@@ -1 +1 @@
- Online Study Material - Data Science Interview preparation
\ No newline at end of file
diff --git a/Online-Material/popular-resouces/index.html b/Online-Material/popular-resouces/index.html
index 2e25cfe..422df4d 100644
--- a/Online-Material/popular-resouces/index.html
+++ b/Online-Material/popular-resouces/index.html
@@ -1 +1 @@
- Popular Blogs - Data Science Interview preparation
\ No newline at end of file
diff --git a/as-fast-as-possible/Deep-CV/index.html b/as-fast-as-possible/Deep-CV/index.html
index 36431d1..c6b45bc 100644
--- a/as-fast-as-possible/Deep-CV/index.html
+++ b/as-fast-as-possible/Deep-CV/index.html
@@ -1 +1 @@
- Deep Computer Vision - Data Science Interview preparation
\ No newline at end of file
diff --git a/as-fast-as-possible/Deep-NLP/index.html b/as-fast-as-possible/Deep-NLP/index.html
index eb2bfb9..57050f3 100644
--- a/as-fast-as-possible/Deep-NLP/index.html
+++ b/as-fast-as-possible/Deep-NLP/index.html
@@ -1 +1 @@
- Deep Natural Language Processing - Data Science Interview preparation
\ No newline at end of file
diff --git a/as-fast-as-possible/Neural-Networks/index.html b/as-fast-as-possible/Neural-Networks/index.html
index 2369ce2..f2a8b83 100644
--- a/as-fast-as-possible/Neural-Networks/index.html
+++ b/as-fast-as-possible/Neural-Networks/index.html
@@ -1 +1 @@
- Neural Networks - Data Science Interview preparation
\ No newline at end of file
diff --git a/as-fast-as-possible/TF2-Keras/index.html b/as-fast-as-possible/TF2-Keras/index.html
index 49ef79e..2913ff0 100644
--- a/as-fast-as-possible/TF2-Keras/index.html
+++ b/as-fast-as-possible/TF2-Keras/index.html
@@ -1 +1 @@
- Tensorflow 2 with Keras - Data Science Interview preparation
\ No newline at end of file
diff --git a/as-fast-as-possible/index.html b/as-fast-as-possible/index.html
index cceacc8..27389e2 100644
--- a/as-fast-as-possible/index.html
+++ b/as-fast-as-possible/index.html
@@ -1 +1 @@
- Introduction - Data Science Interview preparation
\ No newline at end of file
diff --git a/contact/index.html b/contact/index.html
index 58571dc..e3109e8 100644
--- a/contact/index.html
+++ b/contact/index.html
@@ -1 +1 @@
- Contact - Data Science Interview preparation
\ No newline at end of file
diff --git a/index.html b/index.html
index 67950a8..6d0ec88 100644
--- a/index.html
+++ b/index.html
@@ -1,33 +1 @@
- Data Science - Data Science Interview preparation
This is a completely open-source platform for maintaining a curated list of interview questions and answers for people looking for and preparing for data science opportunities.
Not only this, the platform will also serve as a one-stop destination for all your needs, like tutorials, online materials, etc.
This platform is maintained by you! You can help us by answering/improving existing questions as well as by sharing any new questions that you faced during your interviews.
Contribute to the platform
Contribution in any form will be deeply appreciated.
Add questions
Add your questions here. Please ensure you provide a detailed description, to allow your fellow contributors to understand your questions and answer them to your satisfaction.
Please note that, as of now, you cannot directly add a question via a pull request. This helps us maintain the quality of the content for you.
Add answers/topics
These are the answers/topics that need your help at the moment
You can also solve existing issues on GitHub and create a pull request.
Say Thanks
If this platform helped you in any way, it would be great if you could share it with others.
Check out this platform for data science content:
- https://singhsidhukuldeep.github.io/data-science-interview-prep/
-
You can also star the repository on GitHub and watch out for any updates
Features
Beautiful: The design is built on top of the most popular libraries, like MkDocs and Material, which allows the platform to be responsive and to work on all sorts of devices, from mobile phones to wide screens. The underlying fluid layout will always adapt perfectly to the available screen space.
Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in search (server-less) is fast and accurate in responding to any query.
Accessible:
Easy to use: The website is hosted on GitHub Pages and is free and open to use for the over 40 million users of GitHub in 100+ countries.
Easy to contribute: The website embodies the concept of collaboration to the letter, allowing anyone to add/improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
Setup
No setup is required to use the platform.
Important: It is strongly advised to use a virtual environment and not to change anything in gh-pages.
mkdocs serve - Start the live-reloading docs server.
mkdocs build - Build the documentation site.
mkdocs -h - Print help message and exit.
mkdocs gh-deploy - Use mkdocs gh-deploy --help to get a full list of options available for the gh-deploy command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the build or serve commands and reviewing the built files locally.
mkdocs new [dir-name] - Create a new project. (Not needed here; the project already exists.)
As much as this platform aims to help you with your interview preparation, it is not a shortcut to cracking one. Think of this platform as a practice field to help you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic for you.
This doesn't mean that such a feature won't be added in the future. "Never say never."
But as of now there is neither a plan nor the data to do so.
Why is this platform free?
Currently there is no major cost involved in maintaining this platform other than the time and effort put in by every contributor. If you want to help, you can contribute here.
If you still want to pay for something that is free, we would request you to donate to a charity of your choice instead.
Interview Questions
These are currently the most commonly asked interview questions.
Questions may be removed if they are no longer popular in interview circles, and new ones added as new question banks are released.
\ No newline at end of file
diff --git a/privacy/index.html b/privacy/index.html
index cdba481..293a075 100644
--- a/privacy/index.html
+++ b/privacy/index.html
@@ -1 +1 @@
- Privacy Policy - Data Science Interview preparation
Welcome to https://singhsidhukuldeep.github.io/ (the "Website"). Your privacy is important to us, and we are committed to protecting the personal information you share with us. This Privacy Policy explains how we collect, use, and disclose your information, and our commitment to ensuring that your personal data is handled with care and security.
This policy complies with the General Data Protection Regulation (GDPR), ePrivacy Directive (EPD), California Privacy Rights Act (CPRA), Colorado Privacy Act (CPA), Virginia Consumer Data Protection Act (VCDPA), and Brazil's Lei Geral de Proteção de Dados (LGPD).
Information We Collect
Personal Information
We may collect personally identifiable information about you, such as:
Name
Email address
IP address
Other information you voluntarily provide through contact forms or interactions with the Website
Non-Personal Information
We may also collect non-personal information such as:
Browser type
Language preference
Referring site
Date and time of each visitor request
Aggregated data on how visitors use the Website
Cookies and Web Beacons
Our Website uses cookies to enhance your experience. A cookie is a small file that is placed on your device when you visit our Website. Cookies help us to:
Remember your preferences and settings
Understand how you interact with our Website
Track and analyze usage patterns
You can disable cookies through your browser settings; however, doing so may affect your ability to access certain features of the Website.
Google AdSense
Welcome to https://singhsidhukuldeep.github.io/ (the "Website"). Your privacy is important to us, and we are committed to protecting the personal information you share with us. This Privacy Policy explains how we collect, use, and disclose your information, and our commitment to ensuring that your personal data is handled with care and security.
This policy complies with the General Data Protection Regulation (GDPR), ePrivacy Directive (EPD), California Privacy Rights Act (CPRA), Colorado Privacy Act (CPA), Virginia Consumer Data Protection Act (VCDPA), and Brazil's Lei Geral de Proteção de Dados (LGPD).
Information We Collect
Personal Information
We may collect personally identifiable information about you, such as:
Name
Email address
IP address
Other information you voluntarily provide through contact forms or interactions with the Website
Non-Personal Information
We may also collect non-personal information such as:
Browser type
Language preference
Referring site
Date and time of each visitor request
Aggregated data on how visitors use the Website
Cookies and Web Beacons
Our Website uses cookies to enhance your experience. A cookie is a small file that is placed on your device when you visit our Website. Cookies help us to:
Remember your preferences and settings
Understand how you interact with our Website
Track and analyze usage patterns
You can disable cookies through your browser settings; however, doing so may affect your ability to access certain features of the Website.
Google AdSense
We use Google AdSense to display advertisements on our Website. Google AdSense may use cookies and web beacons to collect information about your interaction with the ads displayed on our Website. This information may include:
Your IP address
The type of browser you use
The pages you visit on our Website
Google may use this information to show you personalized ads based on your interests and browsing history. For more information on how Google uses your data, please visit the Google Privacy & Terms page.
Legal Bases for Processing Your Data (GDPR Compliance)
We process your personal data under the following legal bases:
Consent: When you have given explicit consent for us to process your data for specific purposes.
Contract: When processing your data is necessary to fulfill a contract with you or to take steps at your request before entering into a contract.
Legitimate Interests: When the processing is necessary for our legitimate interests, such as improving our services, provided these are not overridden by your rights.
Compliance with Legal Obligations: When we need to process your data to comply with a legal obligation.
How Your Data Will Be Used to Show Ads
We work with third-party vendors, including Google, to serve ads on our Website. These vendors use cookies and similar technologies to collect and use data about your visits to this and other websites to show you ads that are more relevant to your interests.
Types of Data Used
The data used to show you ads may include:
Demographic Information: Age, gender, and other demographic details
Location Data: Approximate geographical location based on your IP address
Behavioral Data: Your browsing behavior, such as pages visited, links clicked, and time spent on our Website
Interests and Preferences: Based on your browsing history, the types of ads you interact with, and your preferences across websites
Purpose of Data Usage
The primary purpose of collecting and using this data is to:
Serve ads that are relevant and tailored to your interests
Improve ad targeting and effectiveness
Analyze and optimize the performance of ads on our Website
Opting Out of Personalized Ads
You can opt out of personalized ads by adjusting your ad settings with Google and other third-party vendors. For more information on how to opt out of personalized ads, please visit the Google Ads Settings page and review the options available to manage your preferences.
Data Subject Rights (GDPR, CPRA, CPA, VCDPA, LGPD Compliance)
Depending on your jurisdiction, you have the following rights regarding your personal data:
Right to Access
You have the right to request access to the personal data we hold about you and to receive a copy of this data.
Right to Rectification
You have the right to request that we correct any inaccuracies in the personal data we hold about you.
Right to Erasure (Right to Be Forgotten)
You have the right to request that we delete your personal data, subject to certain conditions and legal obligations.
Right to Restriction of Processing
You have the right to request that we restrict the processing of your personal data in certain circumstances, such as when you contest the accuracy of the data.
Right to Data Portability
You have the right to receive your personal data in a structured, commonly used, and machine-readable format and to transmit this data to another controller.
Right to Object
You have the right to object to the processing of your personal data based on legitimate interests or for direct marketing purposes.
Right to Withdraw Consent
Where we rely on your consent to process your personal data, you have the right to withdraw your consent at any time.
Right to Non-Discrimination (CPRA Compliance)
We will not discriminate against you for exercising any of your privacy rights under CPRA or any other applicable laws.
Exercising Your Rights
To exercise any of these rights, please contact us at:
Email: singhsidhukuldeep@gmail.com
We will respond to your request within the timeframes required by applicable law.
How We Use Your Information
We use the information collected from you to:
Improve the content and functionality of our Website
Display relevant advertisements through Google AdSense and other ad networks
Respond to your inquiries and provide customer support
Analyze usage patterns and improve our services
Data Sharing and Disclosure
Third-Party Service Providers
We may share your personal data with third-party service providers who assist us in operating our Website, conducting our business, or servicing you, as long as these parties agree to keep this information confidential.
Legal Obligations
We may disclose your personal data when required by law or to comply with legal processes, such as a court order or subpoena.
Business Transfers
In the event of a merger, acquisition, or sale of all or a portion of our assets, your personal data may be transferred to the acquiring entity.
Data Retention
We will retain your personal data only for as long as necessary to fulfill the purposes outlined in this Privacy Policy unless a longer retention period is required or permitted by law.
Data Security
We take reasonable measures to protect your information from unauthorized access, alteration, disclosure, or destruction. However, no method of transmission over the internet or electronic storage is 100% secure, and we cannot guarantee absolute security.
Cross-Border Data Transfers
Your personal data may be transferred to, and processed in, countries other than the country in which you are resident. These countries may have data protection laws that are different from the laws of your country.
Where we transfer your personal data to other countries, we will take appropriate measures to ensure that your personal data remains protected in accordance with this Privacy Policy and applicable data protection laws.
Your Consent
By using our Website, you consent to our Privacy Policy and agree to its terms.
Changes to This Privacy Policy
We may update this Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page. You are advised to review this Privacy Policy periodically for any changes.
Contact Us
If you have any questions about this Privacy Policy, or if you would like to exercise your rights under GDPR, CPRA, CPA, VCDPA, or LGPD, please contact us at:
Email: singhsidhukuldeep@gmail.com
Mailing Address:
Kuldeep Singh Sidhu
Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India
\ No newline at end of file
diff --git a/projects/index.html b/projects/index.html
index d188539..25211f5 100644
--- a/projects/index.html
+++ b/projects/index.html
@@ -1 +1 @@
- Projects - Data Science Interview preparation
\ No newline at end of file
diff --git a/search/search_index.json b/search/search_index.json
index 87c9b28..747a927 100644
--- a/search/search_index.json
+++ b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":""},{"location":"#introduction","title":"Introduction","text":"
This is a completely open-source platform for maintaining a curated list of interview questions and answers for people preparing for data science opportunities.
Not only this, the platform will also serve as a one-stop destination for all your needs, like tutorials, online materials, etc.
This platform is maintained by you! \ud83e\udd17 You can help us by answering/improving existing questions as well as by sharing any new questions that you faced during your interviews.
"},{"location":"#contribute-to-the-platform","title":"Contribute to the platform","text":"
Contribution in any form will be deeply appreciated. \ud83d\ude4f
\u2753 Add your questions here. Please make sure to provide a detailed description so that your fellow contributors can understand your questions and answer them to your satisfaction.
\ud83e\udd1d Please note that as of now, you cannot directly add a question via a pull request. This will help us to maintain the quality of the content for you.
\ud83d\ude0a If this platform helped you in any way, it would be great if you could share it with others.
Check out this \ud83d\udc47 platform \ud83d\udc47 for data science content:\n\ud83d\udc49 https://singhsidhukuldeep.github.io/data-science-interview-prep/ \ud83d\udc48\n
You can also star the repository on GitHub and watch out for any updates
\ud83c\udfa8 Beautiful: The design is built on top of popular libraries like MkDocs and Material, which allows the platform to be responsive and to work on all sorts of devices \u2013 from mobile phones to wide screens. The underlying fluid layout will always adapt perfectly to the available screen space.
\ud83e\uddd0 Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in, serverless search is fast and returns accurate responses to any query.
\ud83d\ude4c Accessible:
Easy to use: \ud83d\udc4c The website is hosted on GitHub Pages and is free and open to use for over 40 million users of GitHub in 100+ countries.
Easy to contribute: \ud83e\udd1d The website embodies the concept of collaboration to the letter, allowing anyone to add/improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
"},{"location":"#setup","title":"Setup","text":"
No setup is required for usage of the platform
Important: It is strongly advised to use a virtual environment and not to change anything in the gh-pages branch
mkdocs serve - Start the live-reloading docs server.
mkdocs build - Build the documentation site.
mkdocs -h - Print help message and exit.
mkdocs gh-deploy - Use\u00a0mkdocs gh-deploy --help\u00a0to get a full list of options available for the\u00a0gh-deploy\u00a0command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the\u00a0build\u00a0or\u00a0serve\u00a0commands and reviewing the built files locally.
mkdocs new [dir-name] - Create a new project. (Not needed here, as the project already exists.)
Can I filter questions based on companies? \ud83e\udd2a
As much as this platform aims to help you with your interview preparation, it is not a shortcut to cracking one. Think of this platform as a practice field to help you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic for you. \ud83e\udd13
This doesn't mean that such a feature won't be added in the future. \"Never say Never\"
But as of now, there is neither a plan nor the data to do so. \ud83d\ude22
Why is this platform free? \ud83e\udd17
Currently there is no major cost involved in maintaining this platform other than the time and effort that is put in by every contributor. If you want to help, you can contribute here.
If you still want to pay for something that is free, we would request that you donate to a charity of your choice instead. \ud83d\ude07
\ud83d\ude0e The full list of all the contributors is available here
"},{"location":"#current-status","title":"Current Status","text":""},{"location":"contact/","title":"Contact for https://singhsidhukuldeep.github.io","text":"
Welcome to https://singhsidhukuldeep.github.io/
For any information, request or official correspondence please email to: singhsidhukuldeep@gmail.com
Mailing Address:
Kuldeep Singh Sidhu
Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India
"},{"location":"contact/#follow-on-social-media","title":"Follow on Social Media","text":"Platform Link GitHub https://github.com/singhsidhukuldeep LinkedIn https://www.linkedin.com/in/singhsidhukuldeep/ Twitter (X) https://twitter.com/kuldeep_s_s HuggingFace https://huggingface.co/singhsidhukuldeep StackOverflow https://stackoverflow.com/users/7182350 Website http://kuldeepsinghsidhu.com/"},{"location":"privacy/","title":"Privacy Policy for https://singhsidhukuldeep.github.io","text":""},{"location":"privacy/#introduction","title":"Introduction","text":"
Welcome to https://singhsidhukuldeep.github.io/ (the \"Website\"). Your privacy is important to us, and we are committed to protecting the personal information you share with us. This Privacy Policy explains how we collect, use, and disclose your information, and our commitment to ensuring that your personal data is handled with care and security.
This policy complies with the General Data Protection Regulation (GDPR), ePrivacy Directive (EPD), California Privacy Rights Act (CPRA), Colorado Privacy Act (CPA), Virginia Consumer Data Protection Act (VCDPA), and Brazil's Lei Geral de Prote\u00e7\u00e3o de Dados (LGPD).
"},{"location":"privacy/#information-we-collect","title":"Information We Collect","text":""},{"location":"privacy/#personal-information","title":"Personal Information","text":"
We may collect personally identifiable information about you, such as:
Name
Email address
IP address
Other information you voluntarily provide through contact forms or interactions with the Website
We may also collect non-personal information such as:
Browser type
Language preference
Referring site
Date and time of each visitor request
Aggregated data on how visitors use the Website
"},{"location":"privacy/#cookies-and-web-beacons","title":"Cookies and Web Beacons","text":"
Our Website uses cookies to enhance your experience. A cookie is a small file that is placed on your device when you visit our Website. Cookies help us to:
Remember your preferences and settings
Understand how you interact with our Website
Track and analyze usage patterns
You can disable cookies through your browser settings; however, doing so may affect your ability to access certain features of the Website.
We use Google AdSense to display advertisements on our Website. Google AdSense may use cookies and web beacons to collect information about your interaction with the ads displayed on our Website. This information may include:
Your IP address
The type of browser you use
The pages you visit on our Website
Google may use this information to show you personalized ads based on your interests and browsing history. For more information on how Google uses your data, please visit the Google Privacy & Terms page.
"},{"location":"privacy/#legal-bases-for-processing-your-data-gdpr-compliance","title":"Legal Bases for Processing Your Data (GDPR Compliance)","text":"
We process your personal data under the following legal bases:
Consent: When you have given explicit consent for us to process your data for specific purposes.
Contract: When processing your data is necessary to fulfill a contract with you or to take steps at your request before entering into a contract.
Legitimate Interests: When the processing is necessary for our legitimate interests, such as improving our services, provided these are not overridden by your rights.
Compliance with Legal Obligations: When we need to process your data to comply with a legal obligation.
"},{"location":"privacy/#how-your-data-will-be-used-to-show-ads","title":"How Your Data Will Be Used to Show Ads","text":"
We work with third-party vendors, including Google, to serve ads on our Website. These vendors use cookies and similar technologies to collect and use data about your visits to this and other websites to show you ads that are more relevant to your interests.
"},{"location":"privacy/#types-of-data-used","title":"Types of Data Used","text":"
The data used to show you ads may include:
Demographic Information: Age, gender, and other demographic details
Location Data: Approximate geographical location based on your IP address
Behavioral Data: Your browsing behavior, such as pages visited, links clicked, and time spent on our Website
Interests and Preferences: Based on your browsing history, the types of ads you interact with, and your preferences across websites
"},{"location":"privacy/#purpose-of-data-usage","title":"Purpose of Data Usage","text":"
The primary purpose of collecting and using this data is to:
Serve ads that are relevant and tailored to your interests
Improve ad targeting and effectiveness
Analyze and optimize the performance of ads on our Website
"},{"location":"privacy/#opting-out-of-personalized-ads","title":"Opting Out of Personalized Ads","text":"
You can opt out of personalized ads by adjusting your ad settings with Google and other third-party vendors. For more information on how to opt out of personalized ads, please visit the Google Ads Settings page and review the options available to manage your preferences.
"},{"location":"privacy/#data-subject-rights-gdpr-cpra-cpa-vcdpa-lgpd-compliance","title":"Data Subject Rights (GDPR, CPRA, CPA, VCDPA, LGPD Compliance)","text":"
Depending on your jurisdiction, you have the following rights regarding your personal data:
"},{"location":"privacy/#right-to-access","title":"Right to Access","text":"
You have the right to request access to the personal data we hold about you and to receive a copy of this data.
"},{"location":"privacy/#right-to-rectification","title":"Right to Rectification","text":"
You have the right to request that we correct any inaccuracies in the personal data we hold about you.
"},{"location":"privacy/#right-to-erasure-right-to-be-forgotten","title":"Right to Erasure (Right to Be Forgotten)","text":"
You have the right to request that we delete your personal data, subject to certain conditions and legal obligations.
"},{"location":"privacy/#right-to-restriction-of-processing","title":"Right to Restriction of Processing","text":"
You have the right to request that we restrict the processing of your personal data in certain circumstances, such as when you contest the accuracy of the data.
"},{"location":"privacy/#right-to-data-portability","title":"Right to Data Portability","text":"
You have the right to receive your personal data in a structured, commonly used, and machine-readable format and to transmit this data to another controller.
"},{"location":"privacy/#right-to-object","title":"Right to Object","text":"
You have the right to object to the processing of your personal data based on legitimate interests or for direct marketing purposes.
"},{"location":"privacy/#right-to-withdraw-consent","title":"Right to Withdraw Consent","text":"
Where we rely on your consent to process your personal data, you have the right to withdraw your consent at any time.
"},{"location":"privacy/#right-to-non-discrimination-cpra-compliance","title":"Right to Non-Discrimination (CPRA Compliance)","text":"
We will not discriminate against you for exercising any of your privacy rights under CPRA or any other applicable laws.
"},{"location":"privacy/#exercising-your-rights","title":"Exercising Your Rights","text":"
To exercise any of these rights, please contact us at:
Email: singhsidhukuldeep@gmail.com
We will respond to your request within the timeframes required by applicable law.
"},{"location":"privacy/#how-we-use-your-information","title":"How We Use Your Information","text":"
We use the information collected from you to:
Improve the content and functionality of our Website
Display relevant advertisements through Google AdSense and other ad networks
Respond to your inquiries and provide customer support
Analyze usage patterns and improve our services
"},{"location":"privacy/#data-sharing-and-disclosure","title":"Data Sharing and Disclosure","text":""},{"location":"privacy/#third-party-service-providers","title":"Third-Party Service Providers","text":"
We may share your personal data with third-party service providers who assist us in operating our Website, conducting our business, or servicing you, as long as these parties agree to keep this information confidential.
We will retain your personal data only for as long as necessary to fulfill the purposes outlined in this Privacy Policy unless a longer retention period is required or permitted by law.
We take reasonable measures to protect your information from unauthorized access, alteration, disclosure, or destruction. However, no method of transmission over the internet or electronic storage is 100% secure, and we cannot guarantee absolute security.
"},{"location":"privacy/#cross-border-data-transfers","title":"Cross-Border Data Transfers","text":"
Your personal data may be transferred to, and processed in, countries other than the country in which you are resident. These countries may have data protection laws that are different from the laws of your country.
Where we transfer your personal data to other countries, we will take appropriate measures to ensure that your personal data remains protected in accordance with this Privacy Policy and applicable data protection laws.
By using our Website, you consent to our Privacy Policy and agree to its terms.
"},{"location":"privacy/#changes-to-this-privacy-policy","title":"Changes to This Privacy Policy","text":"
We may update this Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page. You are advised to review this Privacy Policy periodically for any changes.
If you have any questions about this Privacy Policy, or if you would like to exercise your rights under GDPR, CPRA, CPA, VCDPA, or LGPD, please contact us at:
Email: singhsidhukuldeep@gmail.com
Mailing Address:
Kuldeep Singh Sidhu
Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India
These are the projects that you can take inspiration from and try to improve on them. \u270d\ufe0f
"},{"location":"projects/#popular-sources","title":"Popular Sources","text":""},{"location":"projects/#list-of-projects","title":"List of projects","text":""},{"location":"projects/#natural-language-processing-nlp","title":"Natural Language processing (NLP)","text":"Title Description Source Author Text Classification with Facebook fasttext Building the User Review Model with fastText (Text Classification) with response time of less than one second Kuldeep Singh Sidhu Chat-bot using ChatterBot ChatterBot is a Python library that makes it easy to generate automated responses to a user\u2019s input. Kuldeep Singh Sidhu Text Summarizer Comparing state of the art models for text summary generation Kuldeep Singh Sidhu NLP with Spacy Building NLP pipeline using Spacy Kuldeep Singh Sidhu"},{"location":"projects/#recommendation-engine","title":"Recommendation Engine","text":"Title Description Source Author Recommendation Engine with Surprise Comparing different recommendation systems algorithms like SVD, SVDpp (Matrix Factorization), KNN Baseline, KNN Basic, KNN Means, KNN ZScore), Baseline, Co Clustering Kuldeep Singh Sidhu"},{"location":"projects/#image-processing","title":"Image Processing","text":"Title Description Source Author Facial Landmarks Using Dlib, a library capable of giving you 68 points (land marks) of the face. Kuldeep Singh Sidhu"},{"location":"projects/#reinforcement-learning","title":"Reinforcement Learning","text":"Title Description Source Author Google Dopamine Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. Kuldeep Singh Sidhu Tic Tac Toe Training a computer to play Tic Tac Toe using reinforcement learning algorithms. Kuldeep Singh Sidhu"},{"location":"projects/#others","title":"Others","text":"Title Description Source Author TensorFlow Eager Execution Eager Execution (EE) enables you to run operations immediately. 
Kuldeep Singh Sidhu"},{"location":"Cheat-Sheets/Hypothesis-Tests/","title":"Hypothesis Tests in Python","text":"
A\u00a0statistical hypothesis test\u00a0is a method of\u00a0statistical inference\u00a0used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters.
Few Notes:
When it comes to assumptions such as the expected distribution of data or sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable if an assumption is violated.
Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.
In some cases, the data can be corrected to meet the assumptions, such as correcting a nearly normal distribution to be normal by removing outliers, or using a correction to the degrees of freedom in a statistical test when samples have differing variance, to name two examples.
Tests whether a data sample has a Gaussian (normal) distribution.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Interpretation
H0: the sample has a Gaussian distribution.
H1: the sample does not have a Gaussian distribution.
Python Code
# Example of the Anderson-Darling Normality Test\nfrom scipy.stats import anderson\ndata = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\nresult = anderson(data)\nprint('stat=%.3f' % (result.statistic))\nfor i in range(len(result.critical_values)):\n sl, cv = result.significance_level[i], result.critical_values[i]\n if result.statistic < cv:\n print('Probably Gaussian at the %.1f%% level' % (sl))\n else:\n print('Probably not Gaussian at the %.1f%% level' % (sl))\n
Non-parametric tests make no assumptions about the parameters of the population being studied; in fact, these tests don't depend on the population at all. There is no fixed set of parameters, and no particular distribution (normal distribution, etc.) is assumed.
"},{"location":"Cheat-Sheets/Hypothesis-Tests/#mann-whitney-u-test","title":"Mann-Whitney U Test","text":"
Tests whether the distributions of two independent samples are equal or not.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Observations in each sample can be ranked.
Interpretation
H0: the distributions of both samples are equal.
H1: the distributions of both samples are not equal.
Python Code
# Example of the Mann-Whitney U Test\nfrom scipy.stats import mannwhitneyu\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = mannwhitneyu(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n print('Probably the same distribution')\nelse:\n print('Probably different distributions')\n
Levene\u2019s test is used to assess the equality of variance between two or more different samples.
Assumptions
The samples from the populations under consideration are independent.
The populations under consideration are approximately normally distributed.
Interpretation
H0: all the sample variances are equal.
H1: at least one variance is different from the rest.
Python Code
# Example of the Levene's test\nfrom scipy.stats import levene\na = [8.88, 9.12, 9.04, 8.98, 9.00, 9.08, 9.01, 8.85, 9.06, 8.99]\nb = [8.88, 8.95, 9.29, 9.44, 9.15, 9.58, 8.36, 9.18, 8.67, 9.05]\nc = [8.95, 9.12, 8.95, 8.85, 9.03, 8.84, 9.07, 8.98, 8.86, 8.98]\nstat, p = levene(a, b, c)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n print('Probably the same variances')\nelse:\n print('Probably at least one variance is different from the rest')\n
This is a completely open-source platform for maintaining a curated list of interview questions and answers for people preparing for data science opportunities.
Not only this, the platform also serves as a one-point destination for all your needs, like tutorials, online materials, etc.
This platform is maintained by you! \ud83e\udd17 You can help us by answering/improving existing questions as well as by sharing any new questions that you faced during your interviews.
"},{"location":"Deploying-ML-models/deploying-ml-models/#contribute-to-the-platform","title":"Contribute to the platform","text":"
Contribution in any form will be deeply appreciated. \ud83d\ude4f
\u2753 Add your questions here. Please ensure to provide a detailed description to allow your fellow contributors to understand your questions and answer them to your satisfaction.
\ud83e\udd1d Please note that as of now, you cannot directly add a question via a pull request. This will help us to maintain the quality of the content for you.
\ud83d\ude0a If this platform helped you in any way, it would be great if you could share it with others.
Check out this \ud83d\udc47 platform \ud83d\udc47 for data science content:\n\ud83d\udc49 https://singhsidhukuldeep.github.io/data-science-interview-prep/ \ud83d\udc48\n\n#data-science #machine-learning #interview-preparation \n
You can also star the repository on GitHub and watch out for any updates
\ud83c\udfa8 Beautiful: The design is built on top of the most popular libraries, like MkDocs and Material, which allow the platform to be responsive and to work on all sorts of devices \u2013 from mobile phones to wide-screens. The underlying fluid layout will always adapt perfectly to the available screen space.
\ud83e\uddd0 Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in search \u2013 serverless \u2013 is fast and accurate in response to any query.
\ud83d\ude4c Accessible:
Easy to use: \ud83d\udc4c The website is hosted on GitHub Pages and is free and open to use for over 40 million users of GitHub in 100+ countries.
Easy to contribute: \ud83e\udd1d The website embodies the concept of collaboration to the letter, allowing anyone to add/improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
mkdocs serve - Start the live-reloading docs server.
mkdocs build - Build the documentation site.
mkdocs -h - Print help message and exit.
mkdocs gh-deploy - Use\u00a0mkdocs gh-deploy --help\u00a0to get a full list of options available for the\u00a0gh-deploy\u00a0command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the\u00a0build\u00a0or\u00a0serve\u00a0commands and reviewing the built files locally.
mkdocs new [dir-name] - Create a new project. (Not needed here; the project already exists.)
Can I filter questions based on companies? \ud83e\udd2a
As much as this platform aims to help you with your interview preparation, it is not a short-cut to cracking one. Think of this platform as a practice field to help you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic. \ud83e\udd13
This doesn't mean that such a feature won't be added in the future. \"Never say never!\"
But as of now there is neither plan nor data to do so. \ud83d\ude22
Why is this platform free? \ud83e\udd17
Currently there is no major cost involved in maintaining this platform other than the time and effort that is put in by every contributor. If you want to help, you can contribute here.
If you still want to pay for something that is free, we would request you to donate it to a charity of your choice instead. \ud83d\ude07
These are currently the most commonly asked questions. Questions can be removed if they are no longer popular in interview circles, and new ones added as fresh question banks are released.
"},{"location":"Interview-Questions/Probability/#average-score-on-a-dice-role-of-at-most-3-times","title":"Average score on a dice role of at most 3 times","text":"
Question
Consider a fair 6-sided dice. Your aim is to get the highest score you can, in at most 3 rolls.
A score is defined as the number that appears on the face of the dice facing up after the roll. You can roll at most 3 times, and after every roll it is up to you to decide whether you want to roll again.
The last score will be counted as your final score.
What is the average score if you roll the dice only once?
What is the average score you can get with at most 3 rolls?
If the dice is fair, why are the average scores for at most 3 rolls and for 1 roll not the same?
Hint 1
Find the expected score of a single roll.
You will roll again only when the current score is less than the expected score of a single roll.
E.g., if the expected score of a single roll is 4.5, you would roll again on 1, 2, 3 or 4 but not on 5 or 6.
The average score if you rolled the dice only once is 3.5
For at most 3 rolls, let's try back-tracking. Say you have just made your second roll and have to decide whether to make a 3rd roll!
We just found that a single roll gives an expected score of 3.5. So we will only roll the 3rd time if the score on the 2nd roll is less than 3.5, i.e. 1, 2 or 3.
Possibilities
2nd roll score Probability 3rd roll score Probability 1 \u2159 3.5 \u2159 2 \u2159 3.5 \u2159 3 \u2159 3.5 \u2159 4 \u2159 NA 5 \u2159 NA 6 \u2159 NA (we won't roll a 3rd time if the 2nd roll scores more than 3.5)
So with at most 2 rolls, the average score would be:
[We roll again if current score is less than 3.5]\n(3.5)*(1/6) + (3.5)*(1/6) + (3.5)*(1/6) \n+\n(4)*(1/6) + (5)*(1/6) + (6)*(1/6) [Decide not to roll again]\n=\n1.75 + 2.5 = 4.25\n
The average score with at most 2 rolls is 4.25
So now, looking from the perspective of the first roll: we will only roll again if our score is less than 4.25, i.e. 1, 2, 3 or 4.
Possibilities
1st roll score Probability 2nd and 3rd roll score Probability 1 \u2159 4.25 \u2159 2 \u2159 4.25 \u2159 3 \u2159 4.25 \u2159 4 \u2159 4.25 \u2159 5 \u2159 NA 6 \u2159 NA (we won't roll again if the 1st roll scores more than 4.25)
So with at most 3 rolls, the average score would be:
[We roll again if current score is less than 4.25]\n(4.25)*(1/6) + (4.25)*(1/6) + (4.25)*(1/6) + (4.25)*(1/6) \n+\n(5)*(1/6) + (6)*(1/6) [Decide not to roll again]\n=\n17/6 + 11/6 = 28/6 \u2248 4.67\n
The average score with at most 3 rolls is 28/6 \u2248 4.67
The average scores for at most 3 rolls and for 1 roll are not the same because, although the dice is fair, the rolls are no longer independent of our decisions: whether we roll again depends on what we got in the last roll. The scores would have been the same if we made the 2nd and 3rd rolls without considering the previous outcome, i.e. if each roll were taken independently.
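The back-tracking derivation can be sanity-checked with a quick Monte Carlo sketch (a hypothetical helper, not part of the original answer; the re-roll thresholds 4.25 and 3.5 come from the derivation above):

```python
import random

def simulate(trials=200_000, seed=42):
    """Play the optimal stopping strategy derived above and
    return the empirical average score over many games."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        score = rng.randint(1, 6)          # 1st roll
        if score < 4.25:                   # re-roll on 1-4
            score = rng.randint(1, 6)      # 2nd roll
            if score < 3.5:                # re-roll on 1-3
                score = rng.randint(1, 6)  # 3rd roll
        total += score
    return total / trials

print(round(simulate(), 2))  # close to 28/6 = 4.67
```

With enough trials the empirical average settles near 4.67, matching the closed-form answer.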
"},{"location":"Interview-Questions/System-design/","title":"System Design","text":""},{"location":"Interview-Questions/data-structures-algorithms/","title":"Data Structure and Algorithms (DSA)","text":"
"},{"location":"Interview-Questions/data-structures-algorithms/#easy","title":"\ud83d\ude01 Easy","text":""},{"location":"Interview-Questions/data-structures-algorithms/#two-number-sum","title":"Two Number Sum","text":"
Question
Write a function that takes in a non-empty array of distinct integers and an integer representing a target sum.
If any two numbers in the input array sum up to the target sum, the function should return them in an array, in any order.
If no two numbers sum up to the target sum, the function should return an empty array.
Try it!
LeetCode: https://leetcode.com/problems/two-sum/
Hint 1
No Hint
Answer
# O(n) time | O(n) space\ndef twoNumberSum(array, targetSum):\n avail = set()\n for v in array:\n if targetSum-v in avail:\n return [targetSum-v,v]\n else:\n avail.add(v)\n return []\n
# O(n log(n)) time | O(1) space\ndef twoNumberSum(array, targetSum):\n array.sort()\n n = len(array)\n left = 0\n right = n-1\n while left<right:\n currSum = array[left]+array[right]\n if currSum==targetSum: return [array[left], array[right]]\n elif currSum<targetSum: left+=1\n elif currSum>targetSum: right-=1\n return []\n
# O(n^2) time | O(1) space\ndef twoNumberSum(array, targetSum):\n n = len(array)\n for i in range(n-1):\n for j in range(i+1,n):\n if array[i]+array[j] == targetSum:\n return [array[i],array[j]]\n return []\n
Given two non-empty arrays of integers, write a function that determines whether the second array is a subsequence of the first one.
A subsequence of an array is a sequence of numbers that aren't necessarily adjacent in the array but that appear in the same order as they do in the array. For instance, the numbers [1, 3, 4] form a subsequence of the array [1, 2, 3, 4], and so do the numbers [2, 4].
Note that a single number in an array and the array itself are both valid subsequences of the array.
# O(n) time | O(1) space - where n is the length of the array\ndef isValidSubsequence(array, sequence):\n pArray = pSequence = 0\n while pArray < len(array) and pSequence < len(sequence):\n if array[pArray] == sequence[pSequence]:\n pArray+=1\n pSequence+=1\n else: pArray+=1\n return pSequence == len(sequence)\n
Write a function that takes in a \"special\" array and returns its product sum. A \"special\" array is a non-empty array that contains either integers or other \"special\" arrays. The product sum of a \"special\" array is the sum of its elements, where \"special\" arrays inside it are summed themselves and then multiplied by their level of depth.
For example, the product sum of [x, y] is x + y ; the product sum of [x, [y, z]] is x + 2y + 2z
# O(n) time | O(d) space - where n is the total number of elements in the array,\n# including sub-elements, and d is the greatest depth of \"special\" arrays in the array\ndef productSum(array, depth = 1):\n total = 0\n for v in array:\n if isinstance(v, list):\n total += productSum(v, depth + 1)\n else:\n total += v\n return total*depth\n
"},{"location":"Interview-Questions/data-structures-algorithms/#medium","title":"\ud83d\ude42 Medium","text":""},{"location":"Interview-Questions/data-structures-algorithms/#top-k-frequent-words","title":"Top K Frequent Words","text":"
Question
Given a non-empty list of words, return the\u00a0k\u00a0most frequent elements.
Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.
Example 1:
Input: [\"i\", \"love\", \"leetcode\", \"i\", \"love\", \"coding\"], k = 2\nOutput: [\"i\", \"love\"]\nExplanation: \"i\" and \"love\" are the two most frequent words.\n Note that \"i\" comes before \"love\" due to a lower alphabetical order.\n
Example 2:
Input: [\"the\", \"day\", \"is\", \"sunny\", \"the\", \"the\", \"the\", \"sunny\", \"is\", \"is\"], k = 4\nOutput: [\"the\", \"is\", \"sunny\", \"day\"]\nExplanation: \"the\", \"is\", \"sunny\" and \"day\" are the four most frequent words,\n with the number of occurrence being 4, 3, 2 and 1 respectively.\n
Note:
You may assume\u00a0k\u00a0is always valid, 1 \u2264\u00a0k\u00a0\u2264 number of unique elements.
Input words contain only lowercase letters.
Follow up:
Try to solve it in\u00a0O(n\u00a0log\u00a0k) time and\u00a0O(n) extra space.
# Count the frequency of each word, sort the words with a custom\n# ordering relation (frequency descending, then alphabetical),\n# and take the best k of them.\n\n# Time Complexity: O(N log N), where N is the length of words.\n# We count the frequency of each word in O(N) time,\n# then we sort the unique words in O(N log N) time.\n# Space Complexity: O(N), the space used to store uniqueWords.\nfrom collections import Counter\nfrom typing import List\n\ndef topKFrequentWords(words, k) -> List[str]:\n wordsFreq = Counter(words)\n uniqueWords = list(wordsFreq.keys())\n uniqueWords.sort(key = lambda x: (-wordsFreq[x], x))\n return uniqueWords[:k]\n
# Time Complexity: O(N log k), where N is the length of words.\n# We count the frequency of each word in O(N) time, then we add N words\n# to the heap, each in O(log k) time. Finally, we pop from the heap up to\n# k times. As k <= N, this is O(N log k) in total.\n\n# In Python, heapify can turn a list into a heap in linear time,\n# improving this to O(N + k log N): the heapify and counting operations\n# are O(N), and each of the k heappop operations is O(log N).\n\n# Space Complexity: O(N), the space used to store wordsFreq.\n\n# Count the frequency of each word, then push (negated frequency, word)\n# pairs onto a min-heap so that the most frequent (and, on ties,\n# alphabetically smallest) words are popped first. Pop off the heap k times.\nfrom heapq import heapify, heappop\nfrom collections import Counter\nfrom typing import List\n\ndef topKFrequentWords(words, k) -> List[str]:\n wordsFreq = Counter(words)\n heap = [(-freq, word) for word, freq in wordsFreq.items()]\n heapify(heap)\n return [heappop(heap)[1] for _ in range(k)]\n
"},{"location":"Interview-Questions/data-structures-algorithms/#hard","title":"\ud83e\udd28 Hard","text":""},{"location":"Interview-Questions/data-structures-algorithms/#very-hard","title":"\ud83d\ude32 Very Hard","text":""}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"
This is a completely open-source platform for maintaining a curated list of interview questions and answers for people preparing for data science opportunities.
Not only this, the platform also serves as a one-point destination for all your needs, like tutorials, online materials, etc.
This platform is maintained by you! \ud83e\udd17 You can help us by answering/improving existing questions as well as by sharing any new questions that you faced during your interviews.
Interview Questions
These are currently the most commonly asked interview questions.
Questions can be removed if they are no longer popular in interview circles, and new ones added as fresh question banks are released.
DSA (Data Structures & Algorithms)
System Design
Natural Language Processing (NLP)
Probability
Cheat Sheets
Distilled down important concepts for your quick reference
ML Algorithms
From scratch implementation and documentation of all ML algorithms
Online Resources
Most popular and commonly referred-to online resources
Current Platform Status Done Under Development To Do
This is a completely open-source platform for maintaining a curated list of interview questions and answers for people preparing for data science opportunities.
Not only this, the platform also serves as a one-point destination for all your needs, like tutorials, online materials, etc.
This platform is maintained by you! \ud83e\udd17 You can help us by answering/improving existing questions as well as by sharing any new questions that you faced during your interviews.
"},{"location":"Introduction/#contribute-to-the-platform","title":"Contribute to the platform","text":"
Contribution in any form will be deeply appreciated. \ud83d\ude4f
\u2753 Add your questions here. Please ensure to provide a detailed description to allow your fellow contributors to understand your questions and answer them to your satisfaction.
\ud83e\udd1d Please note that as of now, you cannot directly add a question via a pull request. This will help us to maintain the quality of the content for you.
\ud83d\ude0a If this platform helped you in any way, it would be great if you could share it with others.
Check out this \ud83d\udc47 platform \ud83d\udc47 for data science content:\n\ud83d\udc49 https://singhsidhukuldeep.github.io/data-science-interview-prep/ \ud83d\udc48\n
You can also star the repository on GitHub and watch out for any updates
\ud83c\udfa8 Beautiful: The design is built on top of the most popular libraries, like MkDocs and Material, which allow the platform to be responsive and to work on all sorts of devices \u2013 from mobile phones to wide-screens. The underlying fluid layout will always adapt perfectly to the available screen space.
\ud83e\uddd0 Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in search \u2013 serverless \u2013 is fast and accurate in response to any query.
\ud83d\ude4c Accessible:
Easy to use: \ud83d\udc4c The website is hosted on GitHub Pages and is free and open to use for over 40 million users of GitHub in 100+ countries.
Easy to contribute: \ud83e\udd1d The website embodies the concept of collaboration to the letter, allowing anyone to add/improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
mkdocs serve - Start the live-reloading docs server.
mkdocs build - Build the documentation site.
mkdocs -h - Print help message and exit.
mkdocs gh-deploy - Use\u00a0mkdocs gh-deploy --help\u00a0to get a full list of options available for the\u00a0gh-deploy\u00a0command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the\u00a0build\u00a0or\u00a0serve\u00a0commands and reviewing the built files locally.
mkdocs new [dir-name] - Create a new project. (Not needed here; the project already exists.)
Can I filter questions based on companies? \ud83e\udd2a
As much as this platform aims to help you with your interview preparation, it is not a short-cut to cracking one. Think of this platform as a practice field to help you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic. \ud83e\udd13
This doesn't mean that such a feature won't be added in the future. \"Never say never!\"
But as of now there is neither plan nor data to do so. \ud83d\ude22
Why is this platform free? \ud83e\udd17
Currently there is no major cost involved in maintaining this platform other than the time and effort that is put in by every contributor. If you want to help, you can contribute here.
If you still want to pay for something that is free, we would request you to donate it to a charity of your choice instead. \ud83d\ude07
\ud83d\ude0e The full list of all the contributors is available here
"},{"location":"Introduction/#current-status","title":"Current Status","text":""},{"location":"contact/","title":"Contact for https://singhsidhukuldeep.github.io","text":"
Welcome to https://singhsidhukuldeep.github.io/
For any information, request or official correspondence please email to: singhsidhukuldeep@gmail.com
Mailing Address:
Kuldeep Singh Sidhu
Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India
"},{"location":"contact/#follow-on-social-media","title":"Follow on Social Media","text":"Platform Link GitHub https://github.com/singhsidhukuldeep LinkedIn https://www.linkedin.com/in/singhsidhukuldeep/ Twitter (X) https://twitter.com/kuldeep_s_s HuggingFace https://huggingface.co/singhsidhukuldeep StackOverflow https://stackoverflow.com/users/7182350 Website http://kuldeepsinghsidhu.com/"},{"location":"privacy/","title":"Privacy Policy for https://singhsidhukuldeep.github.io","text":""},{"location":"privacy/#introduction","title":"Introduction","text":"
Welcome to https://singhsidhukuldeep.github.io/ (the \"Website\"). Your privacy is important to us, and we are committed to protecting the personal information you share with us. This Privacy Policy explains how we collect, use, and disclose your information, and our commitment to ensuring that your personal data is handled with care and security.
This policy complies with the General Data Protection Regulation (GDPR), ePrivacy Directive (EPD), California Privacy Rights Act (CPRA), Colorado Privacy Act (CPA), Virginia Consumer Data Protection Act (VCDPA), and Brazil's Lei Geral de Prote\u00e7\u00e3o de Dados (LGPD).
"},{"location":"privacy/#information-we-collect","title":"Information We Collect","text":""},{"location":"privacy/#personal-information","title":"Personal Information","text":"
We may collect personally identifiable information about you, such as:
Name
Email address
IP address
Other information you voluntarily provide through contact forms or interactions with the Website
We may also collect non-personal information such as:
Browser type
Language preference
Referring site
Date and time of each visitor request
Aggregated data on how visitors use the Website
"},{"location":"privacy/#cookies-and-web-beacons","title":"Cookies and Web Beacons","text":"
Our Website uses cookies to enhance your experience. A cookie is a small file that is placed on your device when you visit our Website. Cookies help us to:
Remember your preferences and settings
Understand how you interact with our Website
Track and analyze usage patterns
You can disable cookies through your browser settings; however, doing so may affect your ability to access certain features of the Website.
We use Google AdSense to display advertisements on our Website. Google AdSense may use cookies and web beacons to collect information about your interaction with the ads displayed on our Website. This information may include:
Your IP address
The type of browser you use
The pages you visit on our Website
Google may use this information to show you personalized ads based on your interests and browsing history. For more information on how Google uses your data, please visit the Google Privacy & Terms page.
"},{"location":"privacy/#legal-bases-for-processing-your-data-gdpr-compliance","title":"Legal Bases for Processing Your Data (GDPR Compliance)","text":"
We process your personal data under the following legal bases:
Consent: When you have given explicit consent for us to process your data for specific purposes.
Contract: When processing your data is necessary to fulfill a contract with you or to take steps at your request before entering into a contract.
Legitimate Interests: When the processing is necessary for our legitimate interests, such as improving our services, provided these are not overridden by your rights.
Compliance with Legal Obligations: When we need to process your data to comply with a legal obligation.
"},{"location":"privacy/#how-your-data-will-be-used-to-show-ads","title":"How Your Data Will Be Used to Show Ads","text":"
We work with third-party vendors, including Google, to serve ads on our Website. These vendors use cookies and similar technologies to collect and use data about your visits to this and other websites to show you ads that are more relevant to your interests.
"},{"location":"privacy/#types-of-data-used","title":"Types of Data Used","text":"
The data used to show you ads may include:
Demographic Information: Age, gender, and other demographic details
Location Data: Approximate geographical location based on your IP address
Behavioral Data: Your browsing behavior, such as pages visited, links clicked, and time spent on our Website
Interests and Preferences: Based on your browsing history, the types of ads you interact with, and your preferences across websites
"},{"location":"privacy/#purpose-of-data-usage","title":"Purpose of Data Usage","text":"
The primary purpose of collecting and using this data is to:
Serve ads that are relevant and tailored to your interests
Improve ad targeting and effectiveness
Analyze and optimize the performance of ads on our Website
"},{"location":"privacy/#opting-out-of-personalized-ads","title":"Opting Out of Personalized Ads","text":"
You can opt out of personalized ads by adjusting your ad settings with Google and other third-party vendors. For more information on how to opt out of personalized ads, please visit the Google Ads Settings page and review the options available to manage your preferences.
"},{"location":"privacy/#data-subject-rights-gdpr-cpra-cpa-vcdpa-lgpd-compliance","title":"Data Subject Rights (GDPR, CPRA, CPA, VCDPA, LGPD Compliance)","text":"
Depending on your jurisdiction, you have the following rights regarding your personal data:
"},{"location":"privacy/#right-to-access","title":"Right to Access","text":"
You have the right to request access to the personal data we hold about you and to receive a copy of this data.
"},{"location":"privacy/#right-to-rectification","title":"Right to Rectification","text":"
You have the right to request that we correct any inaccuracies in the personal data we hold about you.
"},{"location":"privacy/#right-to-erasure-right-to-be-forgotten","title":"Right to Erasure (Right to Be Forgotten)","text":"
You have the right to request that we delete your personal data, subject to certain conditions and legal obligations.
"},{"location":"privacy/#right-to-restriction-of-processing","title":"Right to Restriction of Processing","text":"
You have the right to request that we restrict the processing of your personal data in certain circumstances, such as when you contest the accuracy of the data.
"},{"location":"privacy/#right-to-data-portability","title":"Right to Data Portability","text":"
You have the right to receive your personal data in a structured, commonly used, and machine-readable format and to transmit this data to another controller.
"},{"location":"privacy/#right-to-object","title":"Right to Object","text":"
You have the right to object to the processing of your personal data based on legitimate interests or for direct marketing purposes.
"},{"location":"privacy/#right-to-withdraw-consent","title":"Right to Withdraw Consent","text":"
Where we rely on your consent to process your personal data, you have the right to withdraw your consent at any time.
"},{"location":"privacy/#right-to-non-discrimination-cpra-compliance","title":"Right to Non-Discrimination (CPRA Compliance)","text":"
We will not discriminate against you for exercising any of your privacy rights under CPRA or any other applicable laws.
"},{"location":"privacy/#exercising-your-rights","title":"Exercising Your Rights","text":"
To exercise any of these rights, please contact us at:
Email: singhsidhukuldeep@gmail.com
We will respond to your request within the timeframes required by applicable law.
"},{"location":"privacy/#how-we-use-your-information","title":"How We Use Your Information","text":"
We use the information collected from you to:
Improve the content and functionality of our Website
Display relevant advertisements through Google AdSense and other ad networks
Respond to your inquiries and provide customer support
Analyze usage patterns and improve our services
"},{"location":"privacy/#data-sharing-and-disclosure","title":"Data Sharing and Disclosure","text":""},{"location":"privacy/#third-party-service-providers","title":"Third-Party Service Providers","text":"
We may share your personal data with third-party service providers who assist us in operating our Website, conducting our business, or servicing you, as long as these parties agree to keep this information confidential.
We will retain your personal data only for as long as necessary to fulfill the purposes outlined in this Privacy Policy unless a longer retention period is required or permitted by law.
We take reasonable measures to protect your information from unauthorized access, alteration, disclosure, or destruction. However, no method of transmission over the internet or electronic storage is 100% secure, and we cannot guarantee absolute security.
"},{"location":"privacy/#cross-border-data-transfers","title":"Cross-Border Data Transfers","text":"
Your personal data may be transferred to, and processed in, countries other than the country in which you are resident. These countries may have data protection laws that are different from the laws of your country.
Where we transfer your personal data to other countries, we will take appropriate measures to ensure that your personal data remains protected in accordance with this Privacy Policy and applicable data protection laws.
By using our Website, you consent to our Privacy Policy and agree to its terms.
"},{"location":"privacy/#changes-to-this-privacy-policy","title":"Changes to This Privacy Policy","text":"
We may update this Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page. You are advised to review this Privacy Policy periodically for any changes.
If you have any questions about this Privacy Policy, or if you would like to exercise your rights under GDPR, CPRA, CPA, VCDPA, or LGPD, please contact us at:
Email: singhsidhukuldeep@gmail.com
Mailing Address:
Kuldeep Singh Sidhu
Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India
These are the projects that you can take inspiration from and try to improve on them. \u270d\ufe0f
"},{"location":"projects/#popular-sources","title":"Popular Sources","text":""},{"location":"projects/#list-of-projects","title":"List of projects","text":""},{"location":"projects/#natural-language-processing-nlp","title":"Natural Language processing (NLP)","text":"Title Description Source Author Text Classification with Facebook fasttext Building the User Review Model with fastText (Text Classification) with response time of less than one second Kuldeep Singh Sidhu Chat-bot using ChatterBot ChatterBot is a Python library that makes it easy to generate automated responses to a user\u2019s input. Kuldeep Singh Sidhu Text Summarizer Comparing state of the art models for text summary generation Kuldeep Singh Sidhu NLP with Spacy Building NLP pipeline using Spacy Kuldeep Singh Sidhu"},{"location":"projects/#recommendation-engine","title":"Recommendation Engine","text":"Title Description Source Author Recommendation Engine with Surprise Comparing different recommendation systems algorithms like SVD, SVDpp (Matrix Factorization), KNN Baseline, KNN Basic, KNN Means, KNN ZScore), Baseline, Co Clustering Kuldeep Singh Sidhu"},{"location":"projects/#image-processing","title":"Image Processing","text":"Title Description Source Author Facial Landmarks Using Dlib, a library capable of giving you 68 points (land marks) of the face. Kuldeep Singh Sidhu"},{"location":"projects/#reinforcement-learning","title":"Reinforcement Learning","text":"Title Description Source Author Google Dopamine Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. Kuldeep Singh Sidhu Tic Tac Toe Training a computer to play Tic Tac Toe using reinforcement learning algorithms. Kuldeep Singh Sidhu"},{"location":"projects/#others","title":"Others","text":"Title Description Source Author TensorFlow Eager Execution Eager Execution (EE) enables you to run operations immediately. 
Kuldeep Singh Sidhu"},{"location":"Cheat-Sheets/Hypothesis-Tests/","title":"Hypothesis Tests in Python","text":"
A\u00a0statistical hypothesis test\u00a0is a method of\u00a0statistical inference\u00a0used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters.
A few notes:
When it comes to assumptions such as the expected distribution of data or sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable if an assumption is violated.
Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.
In some cases, the data can be corrected to meet the assumptions, such as correcting a nearly normal distribution to be normal by removing outliers, or using a correction to the degrees of freedom in a statistical test when samples have differing variance, to name two examples.
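The degrees-of-freedom correction mentioned above can be sketched with SciPy's independent-samples t-test: passing equal_var=False switches to Welch's t-test, which adjusts the degrees of freedom when the two samples have differing variance. The data below are illustrative, not from a real study.

```python
# Example of Welch's t-test: a degrees-of-freedom correction used
# when the two samples have differing variance
from scipy.stats import ttest_ind

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]

# equal_var=False applies the Welch correction instead of pooling variances
stat, p = ttest_ind(data1, data2, equal_var=False)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably the same mean')
else:
    print('Probably different means')
```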
Tests whether a data sample has a Gaussian (normal) distribution.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Interpretation
H0: the sample has a Gaussian distribution.
H1: the sample does not have a Gaussian distribution.
Python Code
# Example of the Anderson-Darling Normality Test\nfrom scipy.stats import anderson\ndata = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\nresult = anderson(data)\nprint('stat=%.3f' % (result.statistic))\nfor i in range(len(result.critical_values)):\n sl, cv = result.significance_level[i], result.critical_values[i]\n if result.statistic < cv:\n print('Probably Gaussian at the %.1f%% level' % (sl))\n else:\n print('Probably not Gaussian at the %.1f%% level' % (sl))\n
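As a sketch of an alternative normality check (not part of the original list), SciPy's Shapiro-Wilk test uses the same H0/H1 but reports a single p-value instead of per-level critical values:

```python
# Example of the Shapiro-Wilk Normality Test
# (same hypotheses as above: H0 = the sample has a Gaussian distribution)
from scipy.stats import shapiro

data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, p = shapiro(data)
print('stat=%.3f, p=%.3f' % (stat, p))
if p > 0.05:
    print('Probably Gaussian')
else:
    print('Probably not Gaussian')
```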
Non-parametric tests make no assumptions about the parameters of the population being studied. Because these tests do not depend on the population's distribution, there is no fixed set of parameters and no assumed distribution (normal distribution, etc.).
"},{"location":"Cheat-Sheets/Hypothesis-Tests/#mann-whitney-u-test","title":"Mann-Whitney U Test","text":"
Tests whether the distributions of two independent samples are equal or not.
Assumptions
Observations in each sample are independent and identically distributed (iid).
Observations in each sample can be ranked.
Interpretation
H0: the distributions of both samples are equal.
H1: the distributions of both samples are not equal.
Python Code
# Example of the Mann-Whitney U Test\nfrom scipy.stats import mannwhitneyu\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = mannwhitneyu(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n print('Probably the same distribution')\nelse:\n print('Probably different distributions')\n
Levene\u2019s test is used to assess the equality of variance between two or more different samples.
Assumptions
The samples from the populations under consideration are independent.
The populations under consideration are approximately normally distributed.
Interpretation
H0: all the sample variances are equal.
H1: at least one variance is different from the rest.
Python Code
# Example of the Levene's test\nfrom scipy.stats import levene\na = [8.88, 9.12, 9.04, 8.98, 9.00, 9.08, 9.01, 8.85, 9.06, 8.99]\nb = [8.88, 8.95, 9.29, 9.44, 9.15, 9.58, 8.36, 9.18, 8.67, 9.05]\nc = [8.95, 9.12, 8.95, 8.85, 9.03, 8.84, 9.07, 8.98, 8.86, 8.98]\nstat, p = levene(a, b, c)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n print('Probably the same variances')\nelse:\n print('Probably at least one variance is different from the rest')\n
This is a completely open-source platform for maintaining a curated list of interview questions and answers for people looking for and preparing for data science opportunities.
Not only this, the platform will also serve as a one-stop destination for all your needs, like tutorials, online materials, etc.
This platform is maintained by you! \ud83e\udd17 You can help us by answering/ improving existing questions as well as by sharing any new questions that you faced during your interviews.
"},{"location":"Deploying-ML-models/deploying-ml-models/#contribute-to-the-platform","title":"Contribute to the platform","text":"
Contribution in any form will be deeply appreciated. \ud83d\ude4f
\u2753 Add your questions here. Please ensure to provide a detailed description to allow your fellow contributors to understand your questions and answer them to your satisfaction.
\ud83e\udd1d Please note that as of now, you cannot directly add a question via a pull request. This will help us to maintain the quality of the content for you.
\ud83d\ude0a If this platform helped you in any way, it would be great if you could share it with others.
Check out this \ud83d\udc47 platform \ud83d\udc47 for data science content:\n\ud83d\udc49 https://singhsidhukuldeep.github.io/data-science-interview-prep/ \ud83d\udc48\n\n#data-science #machine-learning #interview-preparation \n
You can also star the repository on GitHub and watch-out for any updates
\ud83c\udfa8 Beautiful: The design is built on top of the most popular libraries, like MkDocs and Material, which allow the platform to be responsive and to work on all sorts of devices \u2013 from mobile phones to wide-screens. The underlying fluid layout will always adapt perfectly to the available screen space.
\ud83e\uddd0 Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in search \u2013 server-less \u2013 is fast and accurate in response to any query.
\ud83d\ude4c Accessible:
Easy to use: \ud83d\udc4c The website is hosted on github-pages and is free and open to use for over 40 million users of GitHub in 100+ countries.
Easy to contribute: \ud83e\udd1d The website embodies the concept of collaboration to the letter, allowing anyone to add/improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
mkdocs serve - Start the live-reloading docs server.
mkdocs build - Build the documentation site.
mkdocs -h - Print help message and exit.
mkdocs gh-deploy - Use\u00a0mkdocs gh-deploy --help\u00a0to get a full list of options available for the\u00a0gh-deploy\u00a0command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the\u00a0build\u00a0or\u00a0serve\u00a0commands and reviewing the built files locally.
mkdocs new [dir-name] - Create a new project. (Not needed here, since the project already exists.)
Can I filter questions based on companies? \ud83e\udd2a
As much as this platform aims to help you with your interview preparation, it is not a short-cut to crack one. Think of this platform as a practicing field to help you sharpen your skills for your interview processes. However, for your convenience we have sorted all the questions by topics for you. \ud83e\udd13
This doesn't mean that such a feature won't be added in the future. "Never say Never"
But as of now there is neither plan nor data to do so. \ud83d\ude22
Why is this platform free? \ud83e\udd17
Currently there is no major cost involved in maintaining this platform other than time and effort that is put in by every contributor. If you want to help you can contribute here.
If you still want to pay for something that is free, we would request you to donate it to a charity of your choice instead. \ud83d\ude07
These are currently the most commonly asked questions. Questions may be removed if they are no longer popular in interview circles, and new ones are added as new question banks are released.
"},{"location":"Interview-Questions/Probability/#average-score-on-a-dice-role-of-at-most-3-times","title":"Average score on a dice roll of at most 3 times","text":"
Question
Consider a fair 6-sided dice. Your aim is to get the highest score you can, in at most 3 rolls.
A score is defined as the number that appears on the face of the dice facing up after the roll. You can roll at most 3 times, but after every roll it is up to you to decide whether you want to roll again.
The last score will be counted as your final score.
Find the average score if you rolled the dice only once.
Find the average score that you can get with at most 3 rolls.
If the dice is fair, why is the average score for at most 3 rolls and 1 roll not the same?
Hint 1
Find the expected score of a single roll.
You roll again only when the current score is less than the expected score of continuing.
E.g., if the expected score of a single roll came out to be 4.5, you would roll again only on 1, 2, 3, 4 and not on 5, 6.
The average score if you rolled the dice only once is 3.5
For at most 3 rolls, let's try backtracking. Say you just did your second roll and have to decide whether to do your 3rd roll!
We just found that if we roll the dice once, we can expect an average score of 3.5. So we will only roll the 3rd time if the score on the 2nd roll is less than 3.5, i.e. 1, 2 or 3.
Possibilities
| 2nd roll score | Probability | Expected 3rd roll score | Probability |
|---|---|---|---|
| 1 | \u2159 | 3.5 | \u2159 |
| 2 | \u2159 | 3.5 | \u2159 |
| 3 | \u2159 | 3.5 | \u2159 |
| 4 | \u2159 | NA | \u2013 |
| 5 | \u2159 | NA | \u2013 |
| 6 | \u2159 | NA | \u2013 |
(We won't roll a 3rd time if we get a score > 3 on the 2nd roll.)
So if we had 2 rolls, the average score would be:
[We roll again if current score is less than 3.5]\n(3.5)*(1/6) + (3.5)*(1/6) + (3.5)*(1/6) \n+\n(4)*(1/6) + (5)*(1/6) + (6)*(1/6) [Decide not to roll again]\n=\n1.75 + 2.5 = 4.25\n
The average score if you rolled the dice at most twice is 4.25
So now look from the perspective of the first roll: we will only roll again if our score is less than 4.25, i.e. 1, 2, 3 or 4.
Possibilities
| 1st roll score | Probability | Expected score of 2nd and 3rd rolls | Probability |
|---|---|---|---|
| 1 | \u2159 | 4.25 | \u2159 |
| 2 | \u2159 | 4.25 | \u2159 |
| 3 | \u2159 | 4.25 | \u2159 |
| 4 | \u2159 | 4.25 | \u2159 |
| 5 | \u2159 | NA | \u2013 |
| 6 | \u2159 | NA | \u2013 |
(We won't roll again if we get a score > 4.25 on the 1st roll.)
So if we had 3 rolls, the average score would be:
[We roll again if current score is less than 4.25]\n(4.25)*(1/6) + (4.25)*(1/6) + (4.25)*(1/6) + (4.25)*(1/6) \n+\n(5)*(1/6) + (6)*(1/6) [Decide not to roll again]\n=\n17/6 + 11/6 = 28/6 \u2248 4.67\n
The average score with at most 3 rolls is about 4.67.
The average score for at most 3 rolls and 1 roll is not the same because, although the dice is fair, the decision to roll again depends on the outcome of the previous roll. The scores would have been the same if we rolled the dice the 2nd and 3rd time without considering what we got in the last roll, i.e. if the rolls were treated independently.
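The backward-induction argument above can be checked with a short sketch (expected_score is a hypothetical helper, not part of the original answer):

```python
# Backward induction: with another roll in hand, keep face f only if it beats
# the expected value e of continuing; otherwise re-roll.
def expected_score(rolls):
    faces = range(1, 7)
    e = sum(faces) / 6.0  # expected score of a single roll = 3.5
    for _ in range(rolls - 1):
        # average over the first face: keep it if f > e, else take e by re-rolling
        e = sum(f if f > e else e for f in faces) / 6.0
    return e

print(expected_score(1))  # 3.5
print(expected_score(2))  # 4.25
print(expected_score(3))  # ~4.67 (i.e. 28/6)
```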
"},{"location":"Interview-Questions/System-design/","title":"System Design","text":""},{"location":"Interview-Questions/data-structures-algorithms/","title":"Data Structure and Algorithms (DSA)","text":"
"},{"location":"Interview-Questions/data-structures-algorithms/#easy","title":"\ud83d\ude01 Easy","text":""},{"location":"Interview-Questions/data-structures-algorithms/#two-number-sum","title":"Two Number Sum","text":"
Question
Write a function that takes in a non-empty array of distinct integers and an integer representing a target sum.
If any two numbers in the input array sum up to the target sum, the function should return them in an array, in any order.
If no two numbers sum up to the target sum, the function should return an empty array.
Try it!
LeetCode: https://leetcode.com/problems/two-sum/
Hint 1
No Hint
Answer
# O(n) time | O(n) space\ndef twoNumberSum(array, targetSum):\n avail = set()\n for v in array:\n if targetSum-v in avail:\n return [targetSum-v, v]\n avail.add(v)\n return []\n
# O(nlog(n)) time | O(1) space\ndef twoNumberSum(array, targetSum):\n array.sort()\n n = len(array)\n left = 0\n right = n-1\n while left<right:\n currSum = array[left]+array[right]\n if currSum==targetSum: return [array[left], array[right]]\n elif currSum<targetSum: left+=1\n else: right-=1\n return []\n
# O(n^2) time | O(1) space\ndef twoNumberSum(array, targetSum):\n n = len(array)\n for i in range(n-1):\n for j in range(i+1,n):\n if array[i]+array[j] == targetSum:\n return [array[i],array[j]]\n return []\n
Given two non-empty arrays of integers, write a function that determines whether the second array is a subsequence of the first one.
A subsequence of an array is a set of numbers that aren't necessarily adjacent in the array but that are in the same order as they appear in the array. For instance, the numbers [1, 3, 4] form a subsequence of the array [1, 2, 3, 4] , and so do the numbers [2, 4].
Note that a single number in an array and the array itself are both valid subsequences of the array.
# O(n) time | O(1) space - where n is the length of the array\ndef isValidSubsequence(array, sequence):\n pArray = pSequence = 0\n while pArray < len(array) and pSequence < len(sequence):\n if array[pArray] == sequence[pSequence]:\n pSequence+=1\n pArray+=1\n return pSequence == len(sequence)\n
Write a function that takes in a \"special\" array and returns its product sum. A \"special\" array is a non-empty array that contains either integers or other \"special\" arrays. The product sum of a \"special\" array is the sum of its elements, where \"special\" arrays inside it are summed themselves and then multiplied by their level of depth.
For example, the product sum of [x, y] is x + y ; the product sum of [x, [y, z]] is x + 2y + 2z
# O(n) time | O(d) space - where n is the total number of elements in the array,\n# including sub-elements, and d is the greatest depth of \"special\" arrays in the array\ndef productSum(array, depth = 1):\n total = 0\n for v in array:\n if type(v) is list:\n total += productSum(v, depth + 1)\n else:\n total += v\n return total*depth\n
"},{"location":"Interview-Questions/data-structures-algorithms/#medium","title":"\ud83d\ude42 Medium","text":""},{"location":"Interview-Questions/data-structures-algorithms/#top-k-frequent-words","title":"Top K Frequent Words","text":"
Question
Given a non-empty list of words, return the\u00a0k\u00a0most frequent elements.
Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.
Example 1:
Input: [\"i\", \"love\", \"leetcode\", \"i\", \"love\", \"coding\"], k = 2\nOutput: [\"i\", \"love\"]\nExplanation: \"i\" and \"love\" are the two most frequent words.\n Note that \"i\" comes before \"love\" due to a lower alphabetical order.\n
Example 2:
Input: [\"the\", \"day\", \"is\", \"sunny\", \"the\", \"the\", \"the\", \"sunny\", \"is\", \"is\"], k = 4\nOutput: [\"the\", \"is\", \"sunny\", \"day\"]\nExplanation: \"the\", \"is\", \"sunny\" and \"day\" are the four most frequent words,\n with the number of occurrence being 4, 3, 2 and 1 respectively.\n
Note:
You may assume\u00a0k\u00a0is always valid, 1 \u2264\u00a0k\u00a0\u2264 number of unique elements.
Input words contain only lowercase letters.
Follow up:
Try to solve it in\u00a0O(n\u00a0log\u00a0k) time and\u00a0O(n) extra space.
# Count the frequency of each word, and\n# sort the words with a custom ordering relation\n# that uses these frequencies. Then take the best k of them.\n\n# Time Complexity: O(N log N), where N is the length of words.\n# We count the frequency of each word in O(N) time,\n# then we sort the given words in O(N log N) time.\n# Space Complexity: O(N), the space used to store our uniqueWords.\nfrom typing import List\n\ndef topKFrequentWords(words, k) -> List[str]:\n from collections import Counter\n wordsFreq = Counter(words)\n uniqueWords = list(wordsFreq.keys())\n uniqueWords.sort(key = lambda x: (-wordsFreq[x], x))\n return uniqueWords[:k]\n
# Time Complexity: O(N \\log{k})O(Nlogk), where NN is the length of words. \n# We count the frequency of each word in O(N)O(N) time, then we add NN words to the heap, \n# each in O(\\log {k})O(logk) time. Finally, we pop from the heap up to kk times. \n# As k \\leq Nk\u2264N, this is O(N \\log{k})O(Nlogk) in total.\n\n# In Python, we improve this to O(N + k \\log {N})O(N+klogN): our heapq.heapify operation and \n# counting operations are O(N)O(N), and \n# each of kk heapq.heappop operations are O(\\log {N})O(logN).\n\n# Space Complexity: O(N)O(N), the space used to store our wordsFreq.\n\n# Count the frequency of each word, then add it to heap that stores the best k candidates. \n# Here, \"best\" is defined with our custom ordering relation, \n# which puts the worst candidates at the top of the heap. \n# At the end, we pop off the heap up to k times and reverse the result \n# so that the best candidates are first.\n\n# In Python, we instead use heapq.heapify, which can turn a list into a heap in linear time, \n# simplifying our work.\n\ndef topKFrequentWords(words, k)-> List[str]:\n from heapq import heapify, heappop#, heappush\n from collections import Counter\n wordsFreq = Counter(words)\n heap = [(-freq, word) for word, freq in wordsFreq.items()]\n heapq.heapify(heap)\n return [heapq.heappop(heap)[1] for _ in range(k)]\n