diff --git a/404.html b/404.html index c3afc82..176c86d 100644 --- a/404.html +++ b/404.html @@ -1 +1 @@ - Data Science Interview preparation

Alright, now that's a 404! 😱

What's 404 you might ask? πŸ€”

There are 2 possibilities: 🧐

  1. You are not supposed to be here🀨, which is awkward. Maybe you should head back!
  2. This is still in development.πŸ§‘β€πŸ’»
    And if you are here, that probably means you are interested in this. 😍
    So please, please, please contribute! 🤝
    🫡 You can SUBMIT simple text/markdown content, I will format it! πŸ™Œ
\ No newline at end of file + Data Science Interview preparation

Alright, now that's a 404! 😱

What's 404 you might ask? πŸ€”

There are 2 possibilities: 🧐

  1. You are not supposed to be here🀨, which is awkward. Maybe you should head back!
  2. This is still in development.πŸ§‘β€πŸ’»
    And if you are here, that probably means you are interested in this. 😍
    👀 This project is in the early stages of development.
    🤗 Please contribute content if possible! 🤝
    🫡 You can SUBMIT simple text/markdown content, I will format it! πŸ™Œ
\ No newline at end of file diff --git a/Cheat-Sheets/Django/index.html b/Cheat-Sheets/Django/index.html index 176077a..158c785 100644 --- a/Cheat-Sheets/Django/index.html +++ b/Cheat-Sheets/Django/index.html @@ -1 +1 @@ - Django - Data Science Interview preparation

Django

\ No newline at end of file + Django - Data Science Interview preparation
\ No newline at end of file diff --git a/Cheat-Sheets/Flask/index.html b/Cheat-Sheets/Flask/index.html index 864896b..753f79c 100644 --- a/Cheat-Sheets/Flask/index.html +++ b/Cheat-Sheets/Flask/index.html @@ -1 +1 @@ - Flask - Data Science Interview preparation

Flask

\ No newline at end of file + Flask - Data Science Interview preparation
\ No newline at end of file diff --git a/Cheat-Sheets/Hypothesis-Tests/index.html b/Cheat-Sheets/Hypothesis-Tests/index.html index 285ad15..164cbf3 100644 --- a/Cheat-Sheets/Hypothesis-Tests/index.html +++ b/Cheat-Sheets/Hypothesis-Tests/index.html @@ -1,4 +1,4 @@ - Hypothesis Tests in Python (Cheat Sheet) - Data Science Interview preparation

Hypothesis Tests in Python

A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters.

A few notes:

  • When it comes to assumptions such as the expected distribution of data or sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable if an assumption is violated.
  • Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.
  • In some cases, the data can be corrected to meet the assumptions, such as correcting a nearly normal distribution to be normal by removing outliers, or using a correction to the degrees of freedom in a statistical test when samples have differing variance, to name two examples.

Normality Tests

This section lists statistical tests that you can use to check if your data has a Gaussian distribution.

Gaussian distribution (also known as normal distribution) is a bell-shaped curve.

Shapiro-Wilk Test

Tests whether a data sample has a Gaussian (normal) distribution.

  • Assumptions

    • Observations in each sample are independent and identically distributed (iid).
  • Interpretation

    • H0: the sample has a Gaussian distribution.
    • H1: the sample does not have a Gaussian distribution.
  • Python Code

    # Example of the Shapiro-Wilk Normality Test
    + Hypothesis Tests in Python (Cheat Sheet) - Data Science Interview preparation      

    Hypothesis Tests in Python

    A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters.

    A few notes:

    • When it comes to assumptions such as the expected distribution of data or sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable if an assumption is violated.
    • Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.
    • In some cases, the data can be corrected to meet the assumptions, such as correcting a nearly normal distribution to be normal by removing outliers, or using a correction to the degrees of freedom in a statistical test when samples have differing variance, to name two examples.

    Normality Tests

    This section lists statistical tests that you can use to check if your data has a Gaussian distribution.

    Gaussian distribution (also known as normal distribution) is a bell-shaped curve.

    Shapiro-Wilk Test

    Tests whether a data sample has a Gaussian (normal) distribution.

    • Assumptions

      • Observations in each sample are independent and identically distributed (iid).
    • Interpretation

      • H0: the sample has a Gaussian distribution.
      • H1: the sample does not have a Gaussian distribution.
    • Python Code

      # Example of the Shapiro-Wilk Normality Test
      from scipy.stats import shapiro
      data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
      stat, p = shapiro(data)
      print('stat=%.3f, p=%.3f' % (stat, p))
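      To turn the test statistic and p-value into a decision, a common convention (an assumption here, not stated above) is a significance level of alpha = 0.05:

```python
# Interpreting the Shapiro-Wilk p-value at an assumed significance level alpha = 0.05.
from scipy.stats import shapiro

data = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
stat, p = shapiro(data)
alpha = 0.05
if p > alpha:
    # Not enough evidence against H0 at this level.
    print('Fail to reject H0: the sample looks Gaussian')
else:
    print('Reject H0: the sample does not look Gaussian')
```

      For this particular sample the p-value comes out well above 0.05, so the test does not reject normality.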
      diff --git a/Cheat-Sheets/Keras/index.html b/Cheat-Sheets/Keras/index.html
      index b86e122..a8644ec 100644
      --- a/Cheat-Sheets/Keras/index.html
      +++ b/Cheat-Sheets/Keras/index.html
      @@ -1 +1 @@
      - Keras - Data Science Interview preparation     

      Keras

      \ No newline at end of file + Keras - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/NumPy/index.html b/Cheat-Sheets/NumPy/index.html index cef5099..d99b41f 100644 --- a/Cheat-Sheets/NumPy/index.html +++ b/Cheat-Sheets/NumPy/index.html @@ -1 +1 @@ - NumPy - Data Science Interview preparation

      NumPy

      \ No newline at end of file + NumPy - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/Pandas/index.html b/Cheat-Sheets/Pandas/index.html index ffad173..274789b 100644 --- a/Cheat-Sheets/Pandas/index.html +++ b/Cheat-Sheets/Pandas/index.html @@ -1 +1 @@ - Pandas - Data Science Interview preparation

      Pandas

      \ No newline at end of file + Pandas - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/PySpark/index.html b/Cheat-Sheets/PySpark/index.html index c5d88f3..eb29b82 100644 --- a/Cheat-Sheets/PySpark/index.html +++ b/Cheat-Sheets/PySpark/index.html @@ -1 +1 @@ - PySpark - Data Science Interview preparation

      PySpark

      \ No newline at end of file + PySpark - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/PyTorch/index.html b/Cheat-Sheets/PyTorch/index.html index 9bce0db..9373aa1 100644 --- a/Cheat-Sheets/PyTorch/index.html +++ b/Cheat-Sheets/PyTorch/index.html @@ -1 +1 @@ - PyTorch - Data Science Interview preparation

      PyTorch

      \ No newline at end of file + PyTorch - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/Python/index.html b/Cheat-Sheets/Python/index.html index 492aabc..c0012ec 100644 --- a/Cheat-Sheets/Python/index.html +++ b/Cheat-Sheets/Python/index.html @@ -1 +1 @@ - Python - Data Science Interview preparation

      Python

      \ No newline at end of file + Python - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/RegEx/index.html b/Cheat-Sheets/RegEx/index.html index b0aa493..e5f5b71 100644 --- a/Cheat-Sheets/RegEx/index.html +++ b/Cheat-Sheets/RegEx/index.html @@ -1 +1 @@ - Regular Expressions (RegEx) - Data Science Interview preparation

      Regular Expressions (RegEx)

      \ No newline at end of file + Regular Expressions (RegEx) - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/SQL/index.html b/Cheat-Sheets/SQL/index.html index 99a95df..54c62c9 100644 --- a/Cheat-Sheets/SQL/index.html +++ b/Cheat-Sheets/SQL/index.html @@ -1 +1 @@ - SQL - Data Science Interview preparation

      SQL

      \ No newline at end of file + SQL - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/Sk-learn/index.html b/Cheat-Sheets/Sk-learn/index.html index f3095cf..684190e 100644 --- a/Cheat-Sheets/Sk-learn/index.html +++ b/Cheat-Sheets/Sk-learn/index.html @@ -1 +1 @@ - Scikit Learn - Data Science Interview preparation

      Scikit Learn

      \ No newline at end of file + Scikit Learn - Data Science Interview preparation
      \ No newline at end of file diff --git a/Cheat-Sheets/tensorflow/index.html b/Cheat-Sheets/tensorflow/index.html index 548eb8a..1e8dacc 100644 --- a/Cheat-Sheets/tensorflow/index.html +++ b/Cheat-Sheets/tensorflow/index.html @@ -1 +1 @@ - TensorFlow - Data Science Interview preparation

      TensorFlow

      \ No newline at end of file + TensorFlow - Data Science Interview preparation
      \ No newline at end of file diff --git a/Deploying-ML-models/deploying-ml-models/index.html b/Deploying-ML-models/deploying-ml-models/index.html index c398aa1..81654e4 100644 --- a/Deploying-ML-models/deploying-ml-models/index.html +++ b/Deploying-ML-models/deploying-ml-models/index.html @@ -1,4 +1,4 @@ - Home - Data Science Interview preparation

      Home

      Go to website

      Introduction

      This is a completely open-source platform for maintaining a curated list of interview questions and answers for people preparing for data science opportunities.

      Not only that: the platform also serves as a one-stop destination for all your needs, like tutorials, online materials, etc.

      This platform is maintained by you! 🤗 You can help us by answering/improving existing questions, as well as by sharing any new questions that you faced during your interviews.

      Contribute to the platform

      Contribution in any form will be deeply appreciated. πŸ™

      Add questions

      ❓ Add your questions here. Please provide a detailed description so that your fellow contributors can understand your questions and answer them to your satisfaction.

      Add New question

      🤝 Please note that, as of now, you cannot directly add a question via a pull request. This helps us maintain the quality of the content for you.

      Add answers/topics

      πŸ“ These are the answers/topics that need your help at the moment

      • Add documentation for the project
      • Online Material for Learning
      • Suggested Learning Paths
      • Cheat Sheets
        • Django
        • Flask
        • Numpy
        • Pandas
        • PySpark
        • Python
        • RegEx
        • SQL
      • NLP Interview Questions
      • Add python common DSA interview questions
      • Add Major ML topics
        • Linear Regression
        • Logistic Regression
        • SVM
        • Random Forest
        • Gradient boosting
        • PCA
        • Collaborative Filtering
        • K-means clustering
        • kNN
        • ARIMA
        • Neural Networks
        • Decision Trees
        • Overfitting, Underfitting
        • Unbalanced, Skewed data
        • Activation functions relu/ leaky relu
        • Normalization
        • DBSCAN
        • Normal Distribution
        • Precision, Recall
        • Loss Function MAE, RMSE
      • Add Pandas questions
      • Add NumPy questions
      • Add TensorFlow questions
      • Add PyTorch questions
      • Add list of learning resources

      Report/Solve Issues

      Issues

      πŸ”§ To report any issues find me on LinkedIn or raise an issue on GitHub.

      πŸ›  You can also solve existing issues on GitHub and create a pull request.

      Say Thanks

      😊 If this platform helped you in any way, it would be great if you could share it with others.

      Check out this πŸ‘‡ platform πŸ‘‡ for data science content:
      + Home - Data Science Interview preparation      

      Home

      Go to website

      Introduction

      This is a completely open-source platform for maintaining a curated list of interview questions and answers for people preparing for data science opportunities.

      Not only that: the platform also serves as a one-stop destination for all your needs, like tutorials, online materials, etc.

      This platform is maintained by you! 🤗 You can help us by answering/improving existing questions, as well as by sharing any new questions that you faced during your interviews.

      Contribute to the platform

      Contribution in any form will be deeply appreciated. πŸ™

      Add questions

      ❓ Add your questions here. Please provide a detailed description so that your fellow contributors can understand your questions and answer them to your satisfaction.

      Add New question

      🤝 Please note that, as of now, you cannot directly add a question via a pull request. This helps us maintain the quality of the content for you.

      Add answers/topics

      πŸ“ These are the answers/topics that need your help at the moment

      • Add documentation for the project
      • Online Material for Learning
      • Suggested Learning Paths
      • Cheat Sheets
        • Django
        • Flask
        • Numpy
        • Pandas
        • PySpark
        • Python
        • RegEx
        • SQL
      • NLP Interview Questions
      • Add python common DSA interview questions
      • Add Major ML topics
        • Linear Regression
        • Logistic Regression
        • SVM
        • Random Forest
        • Gradient boosting
        • PCA
        • Collaborative Filtering
        • K-means clustering
        • kNN
        • ARIMA
        • Neural Networks
        • Decision Trees
        • Overfitting, Underfitting
        • Unbalanced, Skewed data
        • Activation functions relu/ leaky relu
        • Normalization
        • DBSCAN
        • Normal Distribution
        • Precision, Recall
        • Loss Function MAE, RMSE
      • Add Pandas questions
      • Add NumPy questions
      • Add TensorFlow questions
      • Add PyTorch questions
      • Add list of learning resources

      Report/Solve Issues

      Issues

      πŸ”§ To report any issues find me on LinkedIn or raise an issue on GitHub.

      πŸ›  You can also solve existing issues on GitHub and create a pull request.

      Say Thanks

      😊 If this platform helped you in any way, it would be great if you could share it with others.

      Check out this πŸ‘‡ platform πŸ‘‡ for data science content:
       πŸ‘‰ https://singhsidhukuldeep.github.io/data-science-interview-prep/ πŸ‘ˆ
       
       #data-science #machine-learning #interview-preparation 
      diff --git a/Interview-Questions/Interview-Questions/index.html b/Interview-Questions/Interview-Questions/index.html
      index 801c667..33af3c1 100644
      --- a/Interview-Questions/Interview-Questions/index.html
      +++ b/Interview-Questions/Interview-Questions/index.html
      @@ -1 +1 @@
      - Interview Questions - Data Science Interview preparation      

      Interview Questions (Intro)

      These are the most commonly asked questions at present. Questions may be removed when they are no longer popular in interview circles, and new ones added as new question banks are released.

      \ No newline at end of file + Interview Questions - Data Science Interview preparation

      Interview Questions (Intro)

      These are the most commonly asked questions at present. Questions may be removed when they are no longer popular in interview circles, and new ones added as new question banks are released.

      \ No newline at end of file diff --git a/Interview-Questions/Natural-Language-Processing/index.html b/Interview-Questions/Natural-Language-Processing/index.html index 4875606..c3bb0b1 100644 --- a/Interview-Questions/Natural-Language-Processing/index.html +++ b/Interview-Questions/Natural-Language-Processing/index.html @@ -1 +1 @@ - NLP Questions - Data Science Interview preparation

      1. NLP Interview Questions

      Total Questions Unanswered Questions Answered Questions

      \ No newline at end of file + NLP Questions - Data Science Interview preparation
      \ No newline at end of file diff --git a/Interview-Questions/Probability/index.html b/Interview-Questions/Probability/index.html index 1bc224f..e9e4915 100644 --- a/Interview-Questions/Probability/index.html +++ b/Interview-Questions/Probability/index.html @@ -1,4 +1,4 @@ - Probability Questions - Data Science Interview preparation

      1. Probability Interview Questions

      Total Questions Unanswered Questions Answered Questions


      1.1 Average score when rolling a die at most 3 times

      Question

      Consider a fair 6-sided die. Your aim is to get the highest score you can, in at most 3 rolls.

      A score is defined as the number that appears on the upward face of the die after the roll. You can roll at most 3 times, and after every roll it is up to you to decide whether you want to roll again.

      The last score will be counted as your final score.

      • Find the average score if you roll the die only once.
      • Find the average score that you can get with at most 3 rolls.
      • If the die is fair, why are the average scores for at most 3 rolls and for 1 roll not the same?
      Hint 1

      Find the expected score of a single roll.

      You go for the next roll exactly when the current roll scores below the expected score of a single roll.

      E.g., if the expected score of a single roll came out to be 4.5, you would only roll again on 1, 2, 3, 4 and not on 5, 6.

      Answer

      If you roll a fair die once you can get:

      Score Probability
      1 β…™
      2 β…™
      3 β…™
      4 β…™
      5 β…™
      6 β…™

      So your average score with one roll is:

      sum of (score * score's probability) = (1+2+3+4+5+6)*(⅙) = (21/6) = 3.5

      The average score if you roll the die only once is 3.5

      For at most 3 rolls, let's try back-tracking. Let's say you just did your second roll and you have to decide whether to do your 3rd roll!

      We just found that if we roll the die once, on average we can expect a score of 3.5. So we will only roll the 3rd time if the score on the 2nd roll is less than 3.5, i.e. (1, 2, or 3)

      Possibilities

      2nd roll score Probability 3rd roll score Probability
      1 ⅙ 3.5 ⅙
      2 ⅙ 3.5 ⅙
      3 ⅙ 3.5 ⅙
      4 ⅙ NA We won't roll a
      5 ⅙ NA 3rd time if we get
      6 ⅙ NA a score > 3 on the 2nd

      So if we had 2 rolls, the average score would be:

      [We roll again if the current score is less than 3.5]
      + Probability Questions - Data Science Interview preparation      

      1. Probability Interview Questions

      Total Questions Unanswered Questions Answered Questions


      1.1 Average score when rolling a die at most 3 times

      Question

      Consider a fair 6-sided die. Your aim is to get the highest score you can, in at most 3 rolls.

      A score is defined as the number that appears on the upward face of the die after the roll. You can roll at most 3 times, and after every roll it is up to you to decide whether you want to roll again.

      The last score will be counted as your final score.

      • Find the average score if you roll the die only once.
      • Find the average score that you can get with at most 3 rolls.
      • If the die is fair, why are the average scores for at most 3 rolls and for 1 roll not the same?
      Hint 1

      Find the expected score of a single roll.

      You go for the next roll exactly when the current roll scores below the expected score of a single roll.

      E.g., if the expected score of a single roll came out to be 4.5, you would only roll again on 1, 2, 3, 4 and not on 5, 6.

      Answer

      If you roll a fair die once you can get:

      Score Probability
      1 β…™
      2 β…™
      3 β…™
      4 β…™
      5 β…™
      6 β…™

      So your average score with one roll is:

      sum of (score * score's probability) = (1+2+3+4+5+6)*(⅙) = (21/6) = 3.5

      The average score if you roll the die only once is 3.5

      For at most 3 rolls, let's try back-tracking. Let's say you just did your second roll and you have to decide whether to do your 3rd roll!

      We just found that if we roll the die once, on average we can expect a score of 3.5. So we will only roll the 3rd time if the score on the 2nd roll is less than 3.5, i.e. (1, 2, or 3)

      Possibilities

      2nd roll score Probability 3rd roll score Probability
      1 ⅙ 3.5 ⅙
      2 ⅙ 3.5 ⅙
      3 ⅙ 3.5 ⅙
      4 ⅙ NA We won't roll a
      5 ⅙ NA 3rd time if we get
      6 ⅙ NA a score > 3 on the 2nd

      So if we had 2 rolls, the average score would be:

      [We roll again if the current score is less than 3.5]
       (3.5)*(1/6) + (3.5)*(1/6) + (3.5)*(1/6)
       +
       (4)*(1/6) + (5)*(1/6) + (6)*(1/6) [Decide not to roll again]
       = 1.75 + 2.5 = 4.25
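      The back-tracking rule above (roll again whenever the current score is below the expected value of continuing) can be computed exactly for any number of rolls. A minimal sketch; the helper name `best_expected` is our own, illustrative choice:

```python
# Exact expected score for "roll a fair die up to n times, stop when you like",
# playing the optimal stopping rule derived above.
from fractions import Fraction

def best_expected(rolls_left):
    """Expected final score with `rolls_left` rolls remaining, playing optimally."""
    if rolls_left == 1:
        # One roll left: average of faces 1..6 = 21/6 = 3.5
        return Fraction(21, 6)
    continue_value = best_expected(rolls_left - 1)
    # Keep the current face only if it beats the expected value of rolling again.
    return sum(max(Fraction(face), continue_value) for face in range(1, 7)) / 6

print(best_expected(1))  # 7/2  -> 3.5 for a single roll
print(best_expected(2))  # 17/4 -> 4.25 for at most 2 rolls
print(best_expected(3))  # 14/3 -> about 4.67 for at most 3 rolls
```

      This reproduces the 3.5 and 4.25 averages above and gives 28/6 ≈ 4.67 for at most 3 rolls: the averages differ because the extra rolls, combined with the option to stop, let you discard low scores.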
      diff --git a/Interview-Questions/System-design/index.html b/Interview-Questions/System-design/index.html
      index d224a0d..70e9459 100644
      --- a/Interview-Questions/System-design/index.html
      +++ b/Interview-Questions/System-design/index.html
      @@ -1 +1 @@
      - System Design - Data Science Interview preparation      

      1. System Design

      Total Questions Unanswered Questions Answered Questions

      \ No newline at end of file + System Design - Data Science Interview preparation
      \ No newline at end of file diff --git a/Interview-Questions/data-structures-algorithms/index.html b/Interview-Questions/data-structures-algorithms/index.html index 769285b..f28fae2 100644 --- a/Interview-Questions/data-structures-algorithms/index.html +++ b/Interview-Questions/data-structures-algorithms/index.html @@ -1,4 +1,4 @@ - Data Structure and Algorithms - Data Science Interview preparation

      1. Data Structure and Algorithms (DSA)

      Total Questions Unanswered Questions Answered Questions

      1.1 To-do

      πŸ‘€ This project is in early stages of development. πŸ€— Please contibute content if possible! 🀝
      🫡 You can SUBMIT simple text/markdown content, I will format it! πŸ™Œ

      1. Data Structure and Algorithms (DSA)

      Total Questions Unanswered Questions Answered Questions

      1.1 To-do

      πŸ‘€ This project is in early stages of development. πŸ€— Please contibute content if possible! 🀝
      🫡 You can SUBMIT simple text/markdown content, I will format it! πŸ™Œ

      Home

      Go to website

      Introduction

      This is a completely open-source platform for maintaining a curated list of interview questions and answers for people preparing for data science opportunities.

      Not only that: the platform also serves as a one-stop destination for all your needs, like tutorials, online materials, etc.

      This platform is maintained by you! 🤗 You can help us by answering/improving existing questions, as well as by sharing any new questions that you faced during your interviews.

      Contribute to the platform

      Contribution in any form will be deeply appreciated. πŸ™

      Add questions

      ❓ Add your questions here. Please provide a detailed description so that your fellow contributors can understand your questions and answer them to your satisfaction.

      Add New question

      🤝 Please note that, as of now, you cannot directly add a question via a pull request. This helps us maintain the quality of the content for you.

      Add answers/topics

      πŸ“ These are the answers/topics that need your help at the moment

      • Add documentation for the project
      • Online Material for Learning
      • Suggested Learning Paths
      • Cheat Sheets
        • Django
        • Flask
        • Numpy
        • Pandas
        • PySpark
        • Python
        • RegEx
        • SQL
      • NLP Interview Questions
      • Add python common DSA interview questions
      • Add Major ML topics
        • Linear Regression
        • Logistic Regression
        • SVM
        • Random Forest
        • Gradient boosting
        • PCA
        • Collaborative Filtering
        • K-means clustering
        • kNN
        • ARIMA
        • Neural Networks
        • Decision Trees
        • Overfitting, Underfitting
        • Unbalanced, Skewed data
        • Activation functions relu/ leaky relu
        • Normalization
        • DBSCAN
        • Normal Distribution
        • Precision, Recall
        • Loss Function MAE, RMSE
      • Add Pandas questions
      • Add NumPy questions
      • Add TensorFlow questions
      • Add PyTorch questions
      • Add list of learning resources

      Report/Solve Issues

      Issues

      πŸ”§ To report any issues find me on LinkedIn or raise an issue on GitHub.

      πŸ›  You can also solve existing issues on GitHub and create a pull request.

      Say Thanks

      😊 If this platform helped you in any way, it would be great if you could share it with others.

      Check out this πŸ‘‡ platform πŸ‘‡ for data science content:
      +πŸ‘‰ https://singhsidhukuldeep.github.io/data-science-interview-prep/ πŸ‘ˆ
      +

      You can also star the repository on GitHub Stars and watch out for any updates Watchers

      Features

      • 🎨 Beautiful: The design is built on top of popular libraries like MkDocs and Material, which allow the platform to be responsive and to work on all sorts of devices, from mobile phones to wide screens. The underlying fluid layout will always adapt perfectly to the available screen space.

      • 🧐 Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in, serverless search is fast and accurate in responding to queries.

      • 🙌 Accessible:

        • Easy to use: 👌 The website is hosted on GitHub Pages and is free and open to use for over 40 million users of GitHub in 100+ countries.
        • Easy to contribute: 🤝 The website embodies the concept of collaboration to the letter, allowing anyone to add or improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.

      Setup

      No setup is required to use the platform.

      Important: It is strongly advised to use a virtual environment and not to change anything in gh-pages

      Linux Systems Linux

      python3 -m venv ./venv
      source venv/bin/activate
      pip3 install -r requirements.txt
      deactivate

      Windows Systems Windows

      python3 -m venv ./venv
      venv\Scripts\activate
      pip3 install -r requirements.txt
      venv\Scripts\deactivate

      To install the latest

      pip3 install mkdocs
      pip3 install mkdocs-material
      pip3 install mkdocs-minify-plugin
      pip3 install mkdocs-git-revision-date-localized-plugin

      Useful Commands

      • mkdocs serve - Start the live-reloading docs server.
      • mkdocs build - Build the documentation site.
      • mkdocs -h - Print help message and exit.
      • mkdocs gh-deploy - Use mkdocs gh-deploy --help to get a full list of options available for the gh-deploy command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the build or serve commands and reviewing the built files locally.
      • mkdocs new [dir-name] - Create a new project (not needed here, since the project already exists).

      Useful Documents

      FAQ

      • Can I filter questions based on companies? 🤪

        As much as this platform aims to help you with your interview preparation, it is not a shortcut to cracking one. Think of this platform as a practice field to help you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic for you. 🤓

        This doesn't mean that such a feature won't be added in the future. "Never say never."

        But as of now there is neither a plan nor the data to do so. 😢

      • Why is this platform free? 🤗

        Currently there is no major cost involved in maintaining this platform other than the time and effort put in by every contributor. If you want to help, you can contribute here.

        If you still want to pay for something that is free, we would request you to donate to a charity of your choice instead. 😇

      Credits

      Maintained by

      πŸ‘¨β€πŸŽ“ Kuldeep Singh Sidhu

      Github: github/singhsidhukuldeep https://github.com/singhsidhukuldeep

      Website: Kuldeep Singh Sidhu (Website) http://kuldeepsinghsidhu.com

      LinkedIn: Kuldeep Singh Sidhu (LinkedIn) https://www.linkedin.com/in/singhsidhukuldeep/

      Contributors

      😎 The full list of all the contributors is available here

      Current Status

      Maintenance Website shields.io GitHub pages status GitHub up-time BOT Commits Pull Requests

      Issues Total Commits Contributors Forks Stars Watchers Branches

      License: AGPL v3 made-with-python made-with-Markdown repo- size Followers

      \ No newline at end of file diff --git a/Machine-Learning/ARIMA/index.html b/Machine-Learning/ARIMA/index.html index aec2957..865dfab 100644 --- a/Machine-Learning/ARIMA/index.html +++ b/Machine-Learning/ARIMA/index.html @@ -1 +1 @@ - ARIMA - Data Science Interview preparation

      ARIMA

      \ No newline at end of file + ARIMA - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Activation functions/index.html b/Machine-Learning/Activation functions/index.html index cb91914..9e3e9a3 100644 --- a/Machine-Learning/Activation functions/index.html +++ b/Machine-Learning/Activation functions/index.html @@ -1 +1 @@ - Activation functions - Data Science Interview preparation

      Activation functions

      \ No newline at end of file + Activation functions - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Collaborative Filtering/index.html b/Machine-Learning/Collaborative Filtering/index.html index c32d03c..40ef74a 100644 --- a/Machine-Learning/Collaborative Filtering/index.html +++ b/Machine-Learning/Collaborative Filtering/index.html @@ -1 +1 @@ - Collaborative Filtering - Data Science Interview preparation

      Collaborative Filtering

      \ No newline at end of file + Collaborative Filtering - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Confusion Matrix/index.html b/Machine-Learning/Confusion Matrix/index.html index ebe94b7..30baed4 100644 --- a/Machine-Learning/Confusion Matrix/index.html +++ b/Machine-Learning/Confusion Matrix/index.html @@ -1 +1 @@ - Confusion Matrix - Data Science Interview preparation

      Confusion Matrix

      \ No newline at end of file + Confusion Matrix - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/DBSCAN/index.html b/Machine-Learning/DBSCAN/index.html index 50c973c..3c2ba11 100644 --- a/Machine-Learning/DBSCAN/index.html +++ b/Machine-Learning/DBSCAN/index.html @@ -1 +1 @@ - DBSCAN - Data Science Interview preparation

      DBSCAN

      \ No newline at end of file + DBSCAN - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Decision Trees/index.html b/Machine-Learning/Decision Trees/index.html index aee62a9..c667a71 100644 --- a/Machine-Learning/Decision Trees/index.html +++ b/Machine-Learning/Decision Trees/index.html @@ -1 +1 @@ - Decision Trees - Data Science Interview preparation

      Decision Trees

      \ No newline at end of file + Decision Trees - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Gradient Boosting/index.html b/Machine-Learning/Gradient Boosting/index.html index 7a1b629..f081559 100644 --- a/Machine-Learning/Gradient Boosting/index.html +++ b/Machine-Learning/Gradient Boosting/index.html @@ -1 +1 @@ - Gradient Boosting - Data Science Interview preparation

      Gradient Boosting

      \ No newline at end of file + Gradient Boosting - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/K-means clustering/index.html b/Machine-Learning/K-means clustering/index.html index 8e20866..5307ec9 100644 --- a/Machine-Learning/K-means clustering/index.html +++ b/Machine-Learning/K-means clustering/index.html @@ -1 +1 @@ - K means clustering - Data Science Interview preparation

      K means clustering

      \ No newline at end of file + K means clustering - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Linear Regression/index.html b/Machine-Learning/Linear Regression/index.html index a62eabc..99e0b82 100644 --- a/Machine-Learning/Linear Regression/index.html +++ b/Machine-Learning/Linear Regression/index.html @@ -1 +1 @@ - Linear Regression - Data Science Interview preparation

      Linear Regression

      \ No newline at end of file + Linear Regression - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Logistic Regression/index.html b/Machine-Learning/Logistic Regression/index.html index ddd175d..f9e9c15 100644 --- a/Machine-Learning/Logistic Regression/index.html +++ b/Machine-Learning/Logistic Regression/index.html @@ -1 +1 @@ - Logistic Regression - Data Science Interview preparation

      Logistic Regression

      \ No newline at end of file + Logistic Regression - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Loss Function MAE, RMSE/index.html b/Machine-Learning/Loss Function MAE, RMSE/index.html index 49a993a..ec45b3a 100644 --- a/Machine-Learning/Loss Function MAE, RMSE/index.html +++ b/Machine-Learning/Loss Function MAE, RMSE/index.html @@ -1 +1 @@ - Loss Function MAE, RMSE - Data Science Interview preparation

      Loss Function MAE, RMSE

      \ No newline at end of file + Loss Function MAE, RMSE - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Neural Networks/index.html b/Machine-Learning/Neural Networks/index.html index c3771ac..af83a67 100644 --- a/Machine-Learning/Neural Networks/index.html +++ b/Machine-Learning/Neural Networks/index.html @@ -1 +1 @@ - Neural Networks - Data Science Interview preparation

      Neural Networks

      \ No newline at end of file + Neural Networks - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Normal Distribution/index.html b/Machine-Learning/Normal Distribution/index.html index 27ce658..45c80c2 100644 --- a/Machine-Learning/Normal Distribution/index.html +++ b/Machine-Learning/Normal Distribution/index.html @@ -1 +1 @@ - Normal Distribution - Data Science Interview preparation

      Normal Distribution

      \ No newline at end of file + Normal Distribution - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Normalization Regularisation/index.html b/Machine-Learning/Normalization Regularisation/index.html index cdd7169..2080834 100644 --- a/Machine-Learning/Normalization Regularisation/index.html +++ b/Machine-Learning/Normalization Regularisation/index.html @@ -1 +1 @@ - Normalization Regularisation - Data Science Interview preparation

      Normalization Regularisation

      \ No newline at end of file + Normalization Regularisation - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Overfitting, Underfitting/index.html b/Machine-Learning/Overfitting, Underfitting/index.html index 49e29f8..7cce01e 100644 --- a/Machine-Learning/Overfitting, Underfitting/index.html +++ b/Machine-Learning/Overfitting, Underfitting/index.html @@ -1 +1 @@ - Overfitting, Underfitting - Data Science Interview preparation

      Overfitting, Underfitting

      \ No newline at end of file + Overfitting, Underfitting - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/PCA/index.html b/Machine-Learning/PCA/index.html index 7caf80d..c4d0097 100644 --- a/Machine-Learning/PCA/index.html +++ b/Machine-Learning/PCA/index.html @@ -1 +1 @@ - PCA - Data Science Interview preparation

      PCA

      \ No newline at end of file + PCA - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Random Forest/index.html b/Machine-Learning/Random Forest/index.html index 68d5356..d2e081e 100644 --- a/Machine-Learning/Random Forest/index.html +++ b/Machine-Learning/Random Forest/index.html @@ -1 +1 @@ - Random Forest - Data Science Interview preparation

      Random Forest

      \ No newline at end of file + Random Forest - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Support Vector Machines/index.html b/Machine-Learning/Support Vector Machines/index.html index 8b15cf4..81cbb81 100644 --- a/Machine-Learning/Support Vector Machines/index.html +++ b/Machine-Learning/Support Vector Machines/index.html @@ -1 +1 @@ - Support Vector Machines - Data Science Interview preparation

      Support Vector Machines

      \ No newline at end of file + Support Vector Machines - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/Unbalanced, Skewed data/index.html b/Machine-Learning/Unbalanced, Skewed data/index.html index 5de24e5..8e40f42 100644 --- a/Machine-Learning/Unbalanced, Skewed data/index.html +++ b/Machine-Learning/Unbalanced, Skewed data/index.html @@ -1 +1 @@ - Unbalanced, Skewed data - Data Science Interview preparation

      Unbalanced, Skewed data

      \ No newline at end of file + Unbalanced, Skewed data - Data Science Interview preparation
      \ No newline at end of file diff --git a/Machine-Learning/kNN/index.html b/Machine-Learning/kNN/index.html index bfdfd48..dcdba22 100644 --- a/Machine-Learning/kNN/index.html +++ b/Machine-Learning/kNN/index.html @@ -1 +1 @@ - kNN - Data Science Interview preparation

      kNN

      \ No newline at end of file + kNN - Data Science Interview preparation
      \ No newline at end of file diff --git a/Online-Material/Online-Material-for-Learning/index.html b/Online-Material/Online-Material-for-Learning/index.html index fac3230..8ee51bc 100644 --- a/Online-Material/Online-Material-for-Learning/index.html +++ b/Online-Material/Online-Material-for-Learning/index.html @@ -1 +1 @@ - Online Study Material - Data Science Interview preparation

      Online Study Material

      \ No newline at end of file + Online Study Material - Data Science Interview preparation
      \ No newline at end of file diff --git a/Online-Material/popular-resouces/index.html b/Online-Material/popular-resouces/index.html index 2e25cfe..422df4d 100644 --- a/Online-Material/popular-resouces/index.html +++ b/Online-Material/popular-resouces/index.html @@ -1 +1 @@ - Popular Blogs - Data Science Interview preparation

      Popular Blogs

      \ No newline at end of file + Popular Blogs - Data Science Interview preparation
      \ No newline at end of file diff --git a/Suggested-Learning-Paths/index.html b/Suggested-Learning-Paths/index.html index 2dfd262..d4c0129 100644 --- a/Suggested-Learning-Paths/index.html +++ b/Suggested-Learning-Paths/index.html @@ -1 +1 @@ - πŸ“… Suggested Learning Paths - Data Science Interview preparation

📅 Suggested Learning Paths

      \ No newline at end of file + Suggested Learning Paths - Data Science Interview preparation
      \ No newline at end of file diff --git a/as-fast-as-possible/Deep-CV/index.html b/as-fast-as-possible/Deep-CV/index.html index 36431d1..c6b45bc 100644 --- a/as-fast-as-possible/Deep-CV/index.html +++ b/as-fast-as-possible/Deep-CV/index.html @@ -1 +1 @@ - Deep Computer Vision - Data Science Interview preparation

      Deep Computer Vision

      \ No newline at end of file + Deep CV - Data Science Interview preparation
      \ No newline at end of file diff --git a/as-fast-as-possible/Deep-NLP/index.html b/as-fast-as-possible/Deep-NLP/index.html index eb2bfb9..57050f3 100644 --- a/as-fast-as-possible/Deep-NLP/index.html +++ b/as-fast-as-possible/Deep-NLP/index.html @@ -1 +1 @@ - Deep Natural Language Processing - Data Science Interview preparation

      Deep Natural Language Processing

      \ No newline at end of file + Deep NLP - Data Science Interview preparation
      \ No newline at end of file diff --git a/as-fast-as-possible/Neural-Networks/index.html b/as-fast-as-possible/Neural-Networks/index.html index 2369ce2..f2a8b83 100644 --- a/as-fast-as-possible/Neural-Networks/index.html +++ b/as-fast-as-possible/Neural-Networks/index.html @@ -1 +1 @@ - Neural Networks - Data Science Interview preparation

      Neural Networks

      \ No newline at end of file + Neural Networks - Data Science Interview preparation
      \ No newline at end of file diff --git a/as-fast-as-possible/TF2-Keras/index.html b/as-fast-as-possible/TF2-Keras/index.html index 49ef79e..2913ff0 100644 --- a/as-fast-as-possible/TF2-Keras/index.html +++ b/as-fast-as-possible/TF2-Keras/index.html @@ -1 +1 @@ - Tensorflow 2 with Keras - Data Science Interview preparation

      Tensorflow 2 with Keras

      \ No newline at end of file + TF2 Keras - Data Science Interview preparation
      \ No newline at end of file diff --git a/as-fast-as-possible/index.html b/as-fast-as-possible/index.html index cceacc8..27389e2 100644 --- a/as-fast-as-possible/index.html +++ b/as-fast-as-possible/index.html @@ -1 +1 @@ - Introduction - Data Science Interview preparation

      Introduction

      \ No newline at end of file + Index - Data Science Interview preparation
      \ No newline at end of file diff --git a/contact/index.html b/contact/index.html index 58571dc..e3109e8 100644 --- a/contact/index.html +++ b/contact/index.html @@ -1 +1 @@ - Contact - Data Science Interview preparation

      Contact for https://singhsidhukuldeep.github.io

      Welcome to https://singhsidhukuldeep.github.io/

For any information, requests, or official correspondence, please email: singhsidhukuldeep@gmail.com

      Mailing Address:

      Kuldeep Singh Sidhu

      Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India


      Follow on Social Media

      Platform Link
      GitHub https://github.com/singhsidhukuldeep
      LinkedIn https://www.linkedin.com/in/singhsidhukuldeep/
      Twitter (X) https://twitter.com/kuldeep_s_s
      HuggingFace https://huggingface.co/singhsidhukuldeep
      StackOverflow https://stackoverflow.com/users/7182350
      Website http://kuldeepsinghsidhu.com/

      \ No newline at end of file + Contact - Data Science Interview preparation

      \ No newline at end of file diff --git a/index.html b/index.html index 67950a8..6d0ec88 100644 --- a/index.html +++ b/index.html @@ -1,33 +1 @@ - Data Science - Data Science Interview preparation

      Home

      Go to website

      Introduction

This is a completely open-source platform for maintaining a curated list of interview questions and answers for people looking for and preparing for data science opportunities.

Not only this: the platform will also serve as a one-stop destination for all your needs, such as tutorials and online materials.

This platform is maintained by you! 🤗 You can help us by answering/improving existing questions, as well as by sharing any new questions that you faced during your interviews.

      Contribute to the platform

Contribution in any form will be deeply appreciated. 🙏

      Add questions

❓ Add your questions here. Please be sure to provide a detailed description, so that your fellow contributors can understand your question and answer it to your satisfaction.

      Add New question

🤝 Please note that, as of now, you cannot directly add a question via a pull request. This helps us maintain the quality of the content for you.

      Add answers/topics

      πŸ“ These are the answers/topics that need your help at the moment

      • Add documentation for the project
      • Online Material for Learning
      • Suggested Learning Paths
      • Cheat Sheets
        • Django
        • Flask
        • Numpy
        • Pandas
        • PySpark
        • Python
        • RegEx
        • SQL
      • NLP Interview Questions
      • Add python common DSA interview questions
      • Add Major ML topics
        • Linear Regression
        • Logistic Regression
        • SVM
        • Random Forest
        • Gradient boosting
        • PCA
        • Collaborative Filtering
        • K-means clustering
        • kNN
        • ARIMA
        • Neural Networks
        • Decision Trees
        • Overfitting, Underfitting
        • Unbalanced, Skewed data
  • Activation functions (ReLU, leaky ReLU)
        • Normalization
        • DBSCAN
        • Normal Distribution
        • Precision, Recall
        • Loss Function MAE, RMSE
      • Add Pandas questions
      • Add NumPy questions
      • Add TensorFlow questions
      • Add PyTorch questions
      • Add list of learning resources

      Report/Solve Issues

      Issues

🔧 To report any issues, find me on LinkedIn or raise an issue on GitHub.

🛠 You can also solve existing issues on GitHub and create a pull request.

      Say Thanks

😊 If this platform helped you in any way, it would be great if you could share it with others.

Check out this 👇 platform 👇 for data science content:
👉 https://singhsidhukuldeep.github.io/data-science-interview-prep/ 👈

You can also star the repository on GitHub and watch out for any updates.

      Features

• 🎨 Beautiful: The design is built on top of the most popular libraries, MkDocs and Material for MkDocs, which allow the platform to be responsive and to work on all sorts of devices – from mobile phones to wide screens. The underlying fluid layout will always adapt perfectly to the available screen space.

• 🧐 Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in search – entirely serverless – is fast and accurate in its responses to any query.

• 🙌 Accessible:

  • Easy to use: 👌 The website is hosted on GitHub Pages and is free and open to use for over 40 million GitHub users in 100+ countries.
  • Easy to contribute: 🤝 The website embodies the concept of collaboration to the letter, allowing anyone to add or improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.

      Setup

No setup is required to use the platform.

Important: It is strongly advised to use a virtual environment and not to change anything in the gh-pages branch.

      Linux Systems Linux

python3 -m venv ./venv
source venv/bin/activate
pip3 install -r requirements.txt
deactivate

      Windows Systems Windows

python3 -m venv ./venv
venv\Scripts\activate
pip3 install -r requirements.txt
venv\Scripts\deactivate

To install the latest versions of the dependencies:

pip3 install mkdocs
pip3 install mkdocs-material
pip3 install mkdocs-minify-plugin
pip3 install mkdocs-git-revision-date-localized-plugin

      Useful Commands

      • mkdocs serve - Start the live-reloading docs server.
      • mkdocs build - Build the documentation site.
      • mkdocs -h - Print help message and exit.
• mkdocs gh-deploy - Use mkdocs gh-deploy --help to get a full list of options available for the gh-deploy command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the build or serve commands and reviewing the built files locally.
• mkdocs new [dir-name] - Create a new project. (Not needed here, since the project already exists.)
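Tying the setup and the commands above together, a typical local preview session looks like the following sketch (Linux/macOS syntax; assumes a requirements.txt at the repository root, as in the setup section):

```shell
# Create an isolated environment so the docs tooling does not
# pollute the system Python installation
python3 -m venv ./venv
source ./venv/bin/activate

# The active interpreter now lives inside the venv
python -c "import sys; print(sys.prefix)"

# With the environment active, the usual cycle is:
#   pip3 install -r requirements.txt   # install the pinned dependencies
#   mkdocs serve                       # live-reloading preview server
#   mkdocs build                       # static site written to ./site

deactivate
```

When you are satisfied with the local preview, mkdocs gh-deploy pushes the built site to the gh-pages branch, which is why that branch should never be edited by hand.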

      Useful Documents

      FAQ

• Can I filter questions based on companies? 🤪

As much as this platform aims to help you with your interview preparation, it is not a shortcut to crack one. Think of this platform as a practice field that helps you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic for you. 🤓

This doesn't mean that such a feature won't be added in the future. "Never say never!"

But as of now there is neither a plan nor the data to do so. 😢

• Why is this platform free? 🤗

Currently there is no major cost involved in maintaining this platform other than the time and effort put in by every contributor. If you want to help, you can contribute here.

If you still want to pay for something that is free, we would request that you donate to a charity of your choice instead. 😇

      Credits

      Maintained by

      πŸ‘¨β€πŸŽ“ Kuldeep Singh Sidhu

GitHub: github/singhsidhukuldeep https://github.com/singhsidhukuldeep

      Website: Kuldeep Singh Sidhu (Website) http://kuldeepsinghsidhu.com

      LinkedIn: Kuldeep Singh Sidhu (LinkedIn) https://www.linkedin.com/in/singhsidhukuldeep/

      Contributors

😎 The full list of all the contributors is available here.

      Current Status

(Status badges: maintenance · website · GitHub Pages status · up-time bot · commits · pull requests · issues · total commits · contributors · forks · stars · watchers · branches · AGPL v3 license · made with Python · made with Markdown · repo size · followers)

      \ No newline at end of file + Data Science - Data Science Interview preparation

      Home

      Go to website


This is a completely open-source platform for maintaining a curated list of interview questions and answers for people looking for and preparing for data science opportunities.

      Not only this, the platform will also serve as one point destination for all your needs like tutorials, online materials, etc.

This platform is maintained by you! 🤗 You can help us by answering/improving existing questions, as well as by sharing any new questions that you faced during your interviews.

      • Interview Questions

These are currently the most commonly asked interview questions.

Questions may be removed when they are no longer popular in interview circles, and new ones are added as fresh question banks are released.

      • Cheat Sheets

Important concepts, distilled down for your quick reference.

      • ML Algorithms

From-scratch implementations and documentation of all ML algorithms.

      • Online Resources

The most popular and most commonly referred-to online resources.


      Current Platform Status
      1. I
      2. Will
      3. Update
      4. Soon

      \ No newline at end of file diff --git a/privacy/index.html b/privacy/index.html index cdba481..293a075 100644 --- a/privacy/index.html +++ b/privacy/index.html @@ -1 +1 @@ - Privacy Policy - Data Science Interview preparation

      Privacy Policy for https://singhsidhukuldeep.github.io

      Introduction

      Welcome to https://singhsidhukuldeep.github.io/ (the "Website"). Your privacy is important to us, and we are committed to protecting the personal information you share with us. This Privacy Policy explains how we collect, use, and disclose your information, and our commitment to ensuring that your personal data is handled with care and security.

      This policy complies with the General Data Protection Regulation (GDPR), ePrivacy Directive (EPD), California Privacy Rights Act (CPRA), Colorado Privacy Act (CPA), Virginia Consumer Data Protection Act (VCDPA), and Brazil's Lei Geral de Proteção de Dados (LGPD).

      Information We Collect

      Personal Information

      We may collect personally identifiable information about you, such as:

      • Name
      • Email address
      • IP address
      • Other information you voluntarily provide through contact forms or interactions with the Website

      Non-Personal Information

      We may also collect non-personal information such as:

      • Browser type
      • Language preference
      • Referring site
      • Date and time of each visitor request
      • Aggregated data on how visitors use the Website

      Cookies and Web Beacons

      Our Website uses cookies to enhance your experience. A cookie is a small file that is placed on your device when you visit our Website. Cookies help us to:

      • Remember your preferences and settings
      • Understand how you interact with our Website
      • Track and analyze usage patterns

      You can disable cookies through your browser settings; however, doing so may affect your ability to access certain features of the Website.

      Google AdSense

      We use Google AdSense to display advertisements on our Website. Google AdSense may use cookies and web beacons to collect information about your interaction with the ads displayed on our Website. This information may include:

      • Your IP address
      • The type of browser you use
      • The pages you visit on our Website

      Google may use this information to show you personalized ads based on your interests and browsing history. For more information on how Google uses your data, please visit the Google Privacy & Terms page.

      We process your personal data under the following legal bases:

      • Consent: When you have given explicit consent for us to process your data for specific purposes.
      • Contract: When processing your data is necessary to fulfill a contract with you or to take steps at your request before entering into a contract.
      • Legitimate Interests: When the processing is necessary for our legitimate interests, such as improving our services, provided these are not overridden by your rights.
      • Compliance with Legal Obligations: When we need to process your data to comply with a legal obligation.

      How Your Data Will Be Used to Show Ads

      We work with third-party vendors, including Google, to serve ads on our Website. These vendors use cookies and similar technologies to collect and use data about your visits to this and other websites to show you ads that are more relevant to your interests.

      Types of Data Used

      The data used to show you ads may include:

      • Demographic Information: Age, gender, and other demographic details
      • Location Data: Approximate geographical location based on your IP address
      • Behavioral Data: Your browsing behavior, such as pages visited, links clicked, and time spent on our Website
      • Interests and Preferences: Based on your browsing history, the types of ads you interact with, and your preferences across websites

      Purpose of Data Usage

      The primary purpose of collecting and using this data is to:

      • Serve ads that are relevant and tailored to your interests
      • Improve ad targeting and effectiveness
      • Analyze and optimize the performance of ads on our Website

      Opting Out of Personalized Ads

      You can opt out of personalized ads by adjusting your ad settings with Google and other third-party vendors. For more information on how to opt out of personalized ads, please visit the Google Ads Settings page and review the options available to manage your preferences.

      Data Subject Rights (GDPR, CPRA, CPA, VCDPA, LGPD Compliance)

      Depending on your jurisdiction, you have the following rights regarding your personal data:

      Right to Access

      You have the right to request access to the personal data we hold about you and to receive a copy of this data.

      Right to Rectification

      You have the right to request that we correct any inaccuracies in the personal data we hold about you.

      Right to Erasure (Right to Be Forgotten)

      You have the right to request that we delete your personal data, subject to certain conditions and legal obligations.

      Right to Restriction of Processing

      You have the right to request that we restrict the processing of your personal data in certain circumstances, such as when you contest the accuracy of the data.

      Right to Data Portability

      You have the right to receive your personal data in a structured, commonly used, and machine-readable format and to transmit this data to another controller.

      Right to Object

      You have the right to object to the processing of your personal data based on legitimate interests or for direct marketing purposes.

      Where we rely on your consent to process your personal data, you have the right to withdraw your consent at any time.

      Right to Non-Discrimination (CPRA Compliance)

      We will not discriminate against you for exercising any of your privacy rights under CPRA or any other applicable laws.

      Exercising Your Rights

      To exercise any of these rights, please contact us at:

      Email: singhsidhukuldeep@gmail.com

      We will respond to your request within the timeframes required by applicable law.

      How We Use Your Information

      We use the information collected from you to:

      • Improve the content and functionality of our Website
      • Display relevant advertisements through Google AdSense and other ad networks
      • Respond to your inquiries and provide customer support
      • Analyze usage patterns and improve our services

      Data Sharing and Disclosure

      Third-Party Service Providers

      We may share your personal data with third-party service providers who assist us in operating our Website, conducting our business, or servicing you, as long as these parties agree to keep this information confidential.

      We may disclose your personal data when required by law or to comply with legal processes, such as a court order or subpoena.

      Business Transfers

      In the event of a merger, acquisition, or sale of all or a portion of our assets, your personal data may be transferred to the acquiring entity.

      Data Retention

      We will retain your personal data only for as long as necessary to fulfill the purposes outlined in this Privacy Policy unless a longer retention period is required or permitted by law.

      Data Security

      We take reasonable measures to protect your information from unauthorized access, alteration, disclosure, or destruction. However, no method of transmission over the internet or electronic storage is 100% secure, and we cannot guarantee absolute security.

      Cross-Border Data Transfers

      Your personal data may be transferred to, and processed in, countries other than the country in which you are resident. These countries may have data protection laws that are different from the laws of your country.

      Where we transfer your personal data to other countries, we will take appropriate measures to ensure that your personal data remains protected in accordance with this Privacy Policy and applicable data protection laws.

      By using our Website, you consent to our Privacy Policy and agree to its terms.

      Changes to This Privacy Policy

      We may update this Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page. You are advised to review this Privacy Policy periodically for any changes.

      Contact Us

      If you have any questions about this Privacy Policy, or if you would like to exercise your rights under GDPR, CPRA, CPA, VCDPA, or LGPD, please contact us at:

      Email: singhsidhukuldeep@gmail.com

      Mailing Address:

      Kuldeep Singh Sidhu

      Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India


      \ No newline at end of file + Privacy Policy - Data Science Interview preparation


      Right to Restriction of Processing

      You have the right to request that we restrict the processing of your personal data in certain circumstances, such as when you contest the accuracy of the data.

      Right to Data Portability

      You have the right to receive your personal data in a structured, commonly used, and machine-readable format and to transmit this data to another controller.

      Right to Object

      You have the right to object to the processing of your personal data based on legitimate interests or for direct marketing purposes.

      Where we rely on your consent to process your personal data, you have the right to withdraw your consent at any time.

      Right to Non-Discrimination (CPRA Compliance)

      We will not discriminate against you for exercising any of your privacy rights under CPRA or any other applicable laws.

      Exercising Your Rights

      To exercise any of these rights, please contact us at:

      Email: singhsidhukuldeep@gmail.com

      We will respond to your request within the timeframes required by applicable law.

      How We Use Your Information

      We use the information collected from you to:

      • Improve the content and functionality of our Website
      • Display relevant advertisements through Google AdSense and other ad networks
      • Respond to your inquiries and provide customer support
      • Analyze usage patterns and improve our services

      Data Sharing and Disclosure

      Third-Party Service Providers

      We may share your personal data with third-party service providers who assist us in operating our Website, conducting our business, or servicing you, as long as these parties agree to keep this information confidential.

      We may disclose your personal data when required by law or to comply with legal processes, such as a court order or subpoena.

      Business Transfers

      In the event of a merger, acquisition, or sale of all or a portion of our assets, your personal data may be transferred to the acquiring entity.

      Data Retention

      We will retain your personal data only for as long as necessary to fulfill the purposes outlined in this Privacy Policy unless a longer retention period is required or permitted by law.

      Data Security

      We take reasonable measures to protect your information from unauthorized access, alteration, disclosure, or destruction. However, no method of transmission over the internet or electronic storage is 100% secure, and we cannot guarantee absolute security.

      Cross-Border Data Transfers

      Your personal data may be transferred to, and processed in, countries other than the country in which you are resident. These countries may have data protection laws that are different from the laws of your country.

      Where we transfer your personal data to other countries, we will take appropriate measures to ensure that your personal data remains protected in accordance with this Privacy Policy and applicable data protection laws.

      By using our Website, you consent to our Privacy Policy and agree to its terms.

      Changes to This Privacy Policy

      We may update this Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page. You are advised to review this Privacy Policy periodically for any changes.

      Contact Us

      If you have any questions about this Privacy Policy, or if you would like to exercise your rights under GDPR, CPRA, CPA, VCDPA, or LGPD, please contact us at:

      Email: singhsidhukuldeep@gmail.com

      Mailing Address:

      Kuldeep Singh Sidhu

      Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India


      \ No newline at end of file diff --git a/projects/index.html b/projects/index.html index d188539..25211f5 100644 --- a/projects/index.html +++ b/projects/index.html @@ -1 +1 @@ - Projects - Data Science Interview preparation

      Projects

      Introduction

These are projects you can take inspiration from and try to improve on. ✍️

      Number of Projects

GitHub Google Colab

      List of projects

Natural Language Processing (NLP)

      Title Description Source Author
Text Classification with Facebook fastText Building the User Review Model with fastText (Text Classification) with a response time of less than one second GitHub Kuldeep Singh Sidhu
Chat-bot using ChatterBot ChatterBot is a Python library that makes it easy to generate automated responses to a user’s input. GitHub Kuldeep Singh Sidhu
Text Summarizer Comparing state-of-the-art models for text summary generation GitHub Google Colab Kuldeep Singh Sidhu
NLP with spaCy Building an NLP pipeline using spaCy GitHub Kuldeep Singh Sidhu

      Recommendation Engine

      Title Description Source Author
Recommendation Engine with Surprise Comparing different recommendation system algorithms like SVD, SVDpp (Matrix Factorization), KNN Baseline, KNN Basic, KNN Means, KNN ZScore, Baseline, Co-Clustering GitHub Google Colab Kuldeep Singh Sidhu

      Image Processing

      Title Description Source Author
Facial Landmarks Using Dlib, a library capable of giving you 68 points (landmarks) of the face. GitHub Kuldeep Singh Sidhu

      Reinforcement Learning

      Title Description Source Author
Google Dopamine Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. GitHub Google Colab Kuldeep Singh Sidhu
Tic Tac Toe Training a computer to play Tic Tac Toe using reinforcement learning algorithms. GitHub Google Colab Kuldeep Singh Sidhu

      Others

      Title Description Source Author
TensorFlow Eager Execution Eager Execution (EE) enables you to run operations immediately. GitHub Google Colab Kuldeep Singh Sidhu
      \ No newline at end of file + Projects - Data Science Interview preparation

      Projects

      Introduction

These are projects you can take inspiration from and try to improve on. ✍️

      Number of Projects

GitHub Google Colab

      List of projects

Natural Language Processing (NLP)

      Title Description Source Author
Text Classification with Facebook fastText Building the User Review Model with fastText (Text Classification) with a response time of less than one second GitHub Kuldeep Singh Sidhu
Chat-bot using ChatterBot ChatterBot is a Python library that makes it easy to generate automated responses to a user’s input. GitHub Kuldeep Singh Sidhu
Text Summarizer Comparing state-of-the-art models for text summary generation GitHub Google Colab Kuldeep Singh Sidhu
NLP with spaCy Building an NLP pipeline using spaCy GitHub Kuldeep Singh Sidhu

      Recommendation Engine

      Title Description Source Author
Recommendation Engine with Surprise Comparing different recommendation system algorithms like SVD, SVDpp (Matrix Factorization), KNN Baseline, KNN Basic, KNN Means, KNN ZScore, Baseline, Co-Clustering GitHub Google Colab Kuldeep Singh Sidhu

      Image Processing

      Title Description Source Author
Facial Landmarks Using Dlib, a library capable of giving you 68 points (landmarks) of the face. GitHub Kuldeep Singh Sidhu

      Reinforcement Learning

      Title Description Source Author
Google Dopamine Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. GitHub Google Colab Kuldeep Singh Sidhu
Tic Tac Toe Training a computer to play Tic Tac Toe using reinforcement learning algorithms. GitHub Google Colab Kuldeep Singh Sidhu

      Others

      Title Description Source Author
TensorFlow Eager Execution Eager Execution (EE) enables you to run operations immediately. GitHub Google Colab Kuldeep Singh Sidhu
      \ No newline at end of file diff --git a/search/search_index.json b/search/search_index.json index 87c9b28..747a927 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":""},{"location":"#introduction","title":"Introduction","text":"

This is a completely open-source platform for maintaining a curated list of interview questions and answers for people seeking and preparing for data science opportunities.

Not only that, the platform will also serve as a one-stop destination for all your needs, such as tutorials and online materials.

This platform is maintained by you! \ud83e\udd17 You can help us by answering/improving existing questions as well as by sharing any new questions that you faced during your interviews.

      "},{"location":"#contribute-to-the-platform","title":"Contribute to the platform","text":"

      Contribution in any form will be deeply appreciated. \ud83d\ude4f

      "},{"location":"#add-questions","title":"Add questions","text":"

\u2753 Add your questions here. Please provide a detailed description to allow your fellow contributors to understand your question and answer it to your satisfaction.

      \ud83e\udd1d Please note that as of now, you cannot directly add a question via a pull request. This will help us to maintain the quality of the content for you.

      "},{"location":"#add-answerstopics","title":"Add answers/topics","text":"

      \ud83d\udcdd These are the answers/topics that need your help at the moment

      • Add documentation for the project
      • Online Material for Learning
      • Suggested Learning Paths
      • Cheat Sheets
        • Django
        • Flask
        • Numpy
        • Pandas
        • PySpark
        • Python
        • RegEx
        • SQL
      • NLP Interview Questions
• Add common Python DSA interview questions
      • Add Major ML topics
        • Linear Regression
        • Logistic Regression
        • SVM
        • Random Forest
        • Gradient boosting
        • PCA
        • Collaborative Filtering
        • K-means clustering
        • kNN
        • ARIMA
        • Neural Networks
        • Decision Trees
        • Overfitting, Underfitting
        • Unbalanced, Skewed data
  • Activation functions: ReLU / Leaky ReLU
        • Normalization
        • DBSCAN
        • Normal Distribution
        • Precision, Recall
  • Loss functions: MAE, RMSE
      • Add Pandas questions
      • Add NumPy questions
      • Add TensorFlow questions
      • Add PyTorch questions
      • Add list of learning resources
      "},{"location":"#reportsolve-issues","title":"Report/Solve Issues","text":"

      \ud83d\udd27 To report any issues find me on LinkedIn or raise an issue on GitHub.

      \ud83d\udee0 You can also solve existing issues on GitHub and create a pull request.

      "},{"location":"#say-thanks","title":"Say Thanks","text":"

      \ud83d\ude0a If this platform helped you in any way, it would be great if you could share it with others.

      Check out this \ud83d\udc47 platform \ud83d\udc47 for data science content:\n\ud83d\udc49 https://singhsidhukuldeep.github.io/data-science-interview-prep/ \ud83d\udc48\n

You can also star the repository on GitHub and watch out for any updates

      "},{"location":"#features","title":"Features","text":"
• \ud83c\udfa8 Beautiful: The design is built on top of popular libraries like MkDocs and Material, which allow the platform to be responsive and to work on all sorts of devices \u2013 from mobile phones to wide screens. The underlying fluid layout will always adapt perfectly to the available screen space.

• \ud83e\uddd0 Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in search \u2013 serverless \u2013 is fast and accurate in response to any query.

      • \ud83d\ude4c Accessible:

  • Easy to use: \ud83d\udc4c The website is hosted on GitHub Pages and is free and open to use for over 40 million GitHub users in 100+ countries.
  • Easy to contribute: \ud83e\udd1d The website embodies the concept of collaboration to the letter, allowing anyone to add or improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
      "},{"location":"#setup","title":"Setup","text":"

No setup is required to use the platform.

Important: It is strongly advised to use a virtual environment and not to change anything in the gh-pages branch.

      "},{"location":"#linux-systems","title":"Linux Systems","text":"
      python3 -m venv ./venv\n\nsource venv/bin/activate\n\npip3 install -r requirements.txt\n
      deactivate\n
      "},{"location":"#windows-systems","title":"Windows Systems","text":"
      python3 -m venv ./venv\n\nvenv\\Scripts\\activate\n\npip3 install -r requirements.txt\n
      venv\\Scripts\\deactivate\n
      "},{"location":"#to-install-the-latest","title":"To install the latest","text":"
      pip3 install mkdocs\npip3 install mkdocs-material\npip3 install mkdocs-minify-plugin\npip3 install mkdocs-git-revision-date-localized-plugin\n
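For reference, a requirements.txt covering the four packages above might look like the following (a sketch; the repository's actual file may pin specific versions):

```text
mkdocs
mkdocs-material
mkdocs-minify-plugin
mkdocs-git-revision-date-localized-plugin
```

This is the file that `pip3 install -r requirements.txt` in the setup steps above consumes.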
      "},{"location":"#useful-commands","title":"Useful Commands","text":"
      • mkdocs serve - Start the live-reloading docs server.
      • mkdocs build - Build the documentation site.
      • mkdocs -h - Print help message and exit.
      • mkdocs gh-deploy - Use\u00a0mkdocs gh-deploy --help\u00a0to get a full list of options available for the\u00a0gh-deploy\u00a0command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the\u00a0build\u00a0or\u00a0serve\u00a0commands and reviewing the built files locally.
• mkdocs new [dir-name] - Create a new project. (Not needed here, as this project already exists.)
      "},{"location":"#useful-documents","title":"Useful Documents","text":"
      • \ud83d\udcd1 MkDocs:

        • GitHub: https://github.com/mkdocs/mkdocs
        • Documentation: https://www.mkdocs.org/
      • \ud83c\udfa8 Theme:

        • GitHub: https://github.com/squidfunk/mkdocs-material
        • Documentation: https://squidfunk.github.io/mkdocs-material/getting-started/
      "},{"location":"#faq","title":"FAQ","text":"
      • Can I filter questions based on companies? \ud83e\udd2a

As much as this platform aims to help you with your interview preparation, it is not a shortcut to cracking one. Think of this platform as a practice ground to help you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic. \ud83e\udd13

This doesn't mean that such a feature won't be added in the future. \"Never say Never\"

But as of now, there is neither a plan nor the data to do so. \ud83d\ude22

      • Why is this platform free? \ud83e\udd17

Currently, there is no major cost involved in maintaining this platform other than the time and effort put in by every contributor. If you want to help, you can contribute here.

        If you still want to pay for something that is free, we would request you to donate it to a charity of your choice instead. \ud83d\ude07

      "},{"location":"#credits","title":"Credits","text":""},{"location":"#maintained-by","title":"Maintained by","text":"

      \ud83d\udc68\u200d\ud83c\udf93 Kuldeep Singh Sidhu

      Github: github/singhsidhukuldeep https://github.com/singhsidhukuldeep

      Website: Kuldeep Singh Sidhu (Website) http://kuldeepsinghsidhu.com

      LinkedIn: Kuldeep Singh Sidhu (LinkedIn) https://www.linkedin.com/in/singhsidhukuldeep/

      "},{"location":"#contributors","title":"Contributors","text":"

      \ud83d\ude0e The full list of all the contributors is available here

      "},{"location":"#current-status","title":"Current Status","text":""},{"location":"contact/","title":"Contact for https://singhsidhukuldeep.github.io","text":"

      Welcome to https://singhsidhukuldeep.github.io/

      For any information, request or official correspondence please email to: singhsidhukuldeep@gmail.com

      Mailing Address:

      Kuldeep Singh Sidhu

      Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India

      "},{"location":"contact/#follow-on-social-media","title":"Follow on Social Media","text":"Platform Link GitHub https://github.com/singhsidhukuldeep LinkedIn https://www.linkedin.com/in/singhsidhukuldeep/ Twitter (X) https://twitter.com/kuldeep_s_s HuggingFace https://huggingface.co/singhsidhukuldeep StackOverflow https://stackoverflow.com/users/7182350 Website http://kuldeepsinghsidhu.com/"},{"location":"privacy/","title":"Privacy Policy for https://singhsidhukuldeep.github.io","text":""},{"location":"privacy/#introduction","title":"Introduction","text":"

      Welcome to https://singhsidhukuldeep.github.io/ (the \"Website\"). Your privacy is important to us, and we are committed to protecting the personal information you share with us. This Privacy Policy explains how we collect, use, and disclose your information, and our commitment to ensuring that your personal data is handled with care and security.

      This policy complies with the General Data Protection Regulation (GDPR), ePrivacy Directive (EPD), California Privacy Rights Act (CPRA), Colorado Privacy Act (CPA), Virginia Consumer Data Protection Act (VCDPA), and Brazil's Lei Geral de Prote\u00e7\u00e3o de Dados (LGPD).

      "},{"location":"privacy/#information-we-collect","title":"Information We Collect","text":""},{"location":"privacy/#personal-information","title":"Personal Information","text":"

      We may collect personally identifiable information about you, such as:

      • Name
      • Email address
      • IP address
      • Other information you voluntarily provide through contact forms or interactions with the Website
      "},{"location":"privacy/#non-personal-information","title":"Non-Personal Information","text":"

      We may also collect non-personal information such as:

      • Browser type
      • Language preference
      • Referring site
      • Date and time of each visitor request
      • Aggregated data on how visitors use the Website
      "},{"location":"privacy/#cookies-and-web-beacons","title":"Cookies and Web Beacons","text":"

      Our Website uses cookies to enhance your experience. A cookie is a small file that is placed on your device when you visit our Website. Cookies help us to:

      • Remember your preferences and settings
      • Understand how you interact with our Website
      • Track and analyze usage patterns

      You can disable cookies through your browser settings; however, doing so may affect your ability to access certain features of the Website.

      "},{"location":"privacy/#google-adsense","title":"Google AdSense","text":"

      We use Google AdSense to display advertisements on our Website. Google AdSense may use cookies and web beacons to collect information about your interaction with the ads displayed on our Website. This information may include:

      • Your IP address
      • The type of browser you use
      • The pages you visit on our Website

      Google may use this information to show you personalized ads based on your interests and browsing history. For more information on how Google uses your data, please visit the Google Privacy & Terms page.

      "},{"location":"privacy/#legal-bases-for-processing-your-data-gdpr-compliance","title":"Legal Bases for Processing Your Data (GDPR Compliance)","text":"

      We process your personal data under the following legal bases:

      • Consent: When you have given explicit consent for us to process your data for specific purposes.
      • Contract: When processing your data is necessary to fulfill a contract with you or to take steps at your request before entering into a contract.
      • Legitimate Interests: When the processing is necessary for our legitimate interests, such as improving our services, provided these are not overridden by your rights.
      • Compliance with Legal Obligations: When we need to process your data to comply with a legal obligation.
      "},{"location":"privacy/#how-your-data-will-be-used-to-show-ads","title":"How Your Data Will Be Used to Show Ads","text":"

      We work with third-party vendors, including Google, to serve ads on our Website. These vendors use cookies and similar technologies to collect and use data about your visits to this and other websites to show you ads that are more relevant to your interests.

      "},{"location":"privacy/#types-of-data-used","title":"Types of Data Used","text":"

      The data used to show you ads may include:

      • Demographic Information: Age, gender, and other demographic details
      • Location Data: Approximate geographical location based on your IP address
      • Behavioral Data: Your browsing behavior, such as pages visited, links clicked, and time spent on our Website
      • Interests and Preferences: Based on your browsing history, the types of ads you interact with, and your preferences across websites
      "},{"location":"privacy/#purpose-of-data-usage","title":"Purpose of Data Usage","text":"

      The primary purpose of collecting and using this data is to:

      • Serve ads that are relevant and tailored to your interests
      • Improve ad targeting and effectiveness
      • Analyze and optimize the performance of ads on our Website
      "},{"location":"privacy/#opting-out-of-personalized-ads","title":"Opting Out of Personalized Ads","text":"

      You can opt out of personalized ads by adjusting your ad settings with Google and other third-party vendors. For more information on how to opt out of personalized ads, please visit the Google Ads Settings page and review the options available to manage your preferences.

      "},{"location":"privacy/#data-subject-rights-gdpr-cpra-cpa-vcdpa-lgpd-compliance","title":"Data Subject Rights (GDPR, CPRA, CPA, VCDPA, LGPD Compliance)","text":"

      Depending on your jurisdiction, you have the following rights regarding your personal data:

      "},{"location":"privacy/#right-to-access","title":"Right to Access","text":"

      You have the right to request access to the personal data we hold about you and to receive a copy of this data.

      "},{"location":"privacy/#right-to-rectification","title":"Right to Rectification","text":"

      You have the right to request that we correct any inaccuracies in the personal data we hold about you.

      "},{"location":"privacy/#right-to-erasure-right-to-be-forgotten","title":"Right to Erasure (Right to Be Forgotten)","text":"

      You have the right to request that we delete your personal data, subject to certain conditions and legal obligations.

      "},{"location":"privacy/#right-to-restriction-of-processing","title":"Right to Restriction of Processing","text":"

      You have the right to request that we restrict the processing of your personal data in certain circumstances, such as when you contest the accuracy of the data.

      "},{"location":"privacy/#right-to-data-portability","title":"Right to Data Portability","text":"

      You have the right to receive your personal data in a structured, commonly used, and machine-readable format and to transmit this data to another controller.

      "},{"location":"privacy/#right-to-object","title":"Right to Object","text":"

      You have the right to object to the processing of your personal data based on legitimate interests or for direct marketing purposes.

      "},{"location":"privacy/#right-to-withdraw-consent","title":"Right to Withdraw Consent","text":"

      Where we rely on your consent to process your personal data, you have the right to withdraw your consent at any time.

      "},{"location":"privacy/#right-to-non-discrimination-cpra-compliance","title":"Right to Non-Discrimination (CPRA Compliance)","text":"

      We will not discriminate against you for exercising any of your privacy rights under CPRA or any other applicable laws.

      "},{"location":"privacy/#exercising-your-rights","title":"Exercising Your Rights","text":"

      To exercise any of these rights, please contact us at:

      Email: singhsidhukuldeep@gmail.com

      We will respond to your request within the timeframes required by applicable law.

      "},{"location":"privacy/#how-we-use-your-information","title":"How We Use Your Information","text":"

      We use the information collected from you to:

      • Improve the content and functionality of our Website
      • Display relevant advertisements through Google AdSense and other ad networks
      • Respond to your inquiries and provide customer support
      • Analyze usage patterns and improve our services
      "},{"location":"privacy/#data-sharing-and-disclosure","title":"Data Sharing and Disclosure","text":""},{"location":"privacy/#third-party-service-providers","title":"Third-Party Service Providers","text":"

      We may share your personal data with third-party service providers who assist us in operating our Website, conducting our business, or servicing you, as long as these parties agree to keep this information confidential.

      "},{"location":"privacy/#legal-obligations","title":"Legal Obligations","text":"

      We may disclose your personal data when required by law or to comply with legal processes, such as a court order or subpoena.

      "},{"location":"privacy/#business-transfers","title":"Business Transfers","text":"

      In the event of a merger, acquisition, or sale of all or a portion of our assets, your personal data may be transferred to the acquiring entity.

      "},{"location":"privacy/#data-retention","title":"Data Retention","text":"

      We will retain your personal data only for as long as necessary to fulfill the purposes outlined in this Privacy Policy unless a longer retention period is required or permitted by law.

      "},{"location":"privacy/#data-security","title":"Data Security","text":"

      We take reasonable measures to protect your information from unauthorized access, alteration, disclosure, or destruction. However, no method of transmission over the internet or electronic storage is 100% secure, and we cannot guarantee absolute security.

      "},{"location":"privacy/#cross-border-data-transfers","title":"Cross-Border Data Transfers","text":"

      Your personal data may be transferred to, and processed in, countries other than the country in which you are resident. These countries may have data protection laws that are different from the laws of your country.

      Where we transfer your personal data to other countries, we will take appropriate measures to ensure that your personal data remains protected in accordance with this Privacy Policy and applicable data protection laws.

      "},{"location":"privacy/#your-consent","title":"Your Consent","text":"

      By using our Website, you consent to our Privacy Policy and agree to its terms.

      "},{"location":"privacy/#changes-to-this-privacy-policy","title":"Changes to This Privacy Policy","text":"

      We may update this Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page. You are advised to review this Privacy Policy periodically for any changes.

      "},{"location":"privacy/#contact-us","title":"Contact Us","text":"

      If you have any questions about this Privacy Policy, or if you would like to exercise your rights under GDPR, CPRA, CPA, VCDPA, or LGPD, please contact us at:

      Email: singhsidhukuldeep@gmail.com

      Mailing Address:

      Kuldeep Singh Sidhu

      Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India

      "},{"location":"projects/","title":"Projects","text":""},{"location":"projects/#introduction","title":"Introduction","text":"

These are projects you can take inspiration from and try to improve on. \u270d\ufe0f

"},{"location":"projects/#popular-sources","title":"Popular Sources","text":""},{"location":"projects/#list-of-projects","title":"List of projects","text":""},{"location":"projects/#natural-language-processing-nlp","title":"Natural Language Processing (NLP)","text":"Title Description Source Author Text Classification with Facebook fastText Building the User Review Model with fastText (Text Classification) with a response time of less than one second Kuldeep Singh Sidhu Chat-bot using ChatterBot ChatterBot is a Python library that makes it easy to generate automated responses to a user\u2019s input. Kuldeep Singh Sidhu Text Summarizer Comparing state-of-the-art models for text summary generation Kuldeep Singh Sidhu NLP with spaCy Building an NLP pipeline using spaCy Kuldeep Singh Sidhu"},{"location":"projects/#recommendation-engine","title":"Recommendation Engine","text":"Title Description Source Author Recommendation Engine with Surprise Comparing different recommendation system algorithms like SVD, SVDpp (Matrix Factorization), KNN Baseline, KNN Basic, KNN Means, KNN ZScore, Baseline, Co-Clustering Kuldeep Singh Sidhu"},{"location":"projects/#image-processing","title":"Image Processing","text":"Title Description Source Author Facial Landmarks Using Dlib, a library capable of giving you 68 points (landmarks) of the face. Kuldeep Singh Sidhu"},{"location":"projects/#reinforcement-learning","title":"Reinforcement Learning","text":"Title Description Source Author Google Dopamine Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. Kuldeep Singh Sidhu Tic Tac Toe Training a computer to play Tic Tac Toe using reinforcement learning algorithms. Kuldeep Singh Sidhu"},{"location":"projects/#others","title":"Others","text":"Title Description Source Author TensorFlow Eager Execution Eager Execution (EE) enables you to run operations immediately. 
Kuldeep Singh Sidhu"},{"location":"Cheat-Sheets/Hypothesis-Tests/","title":"Hypothesis Tests in Python","text":"

      A\u00a0statistical hypothesis test\u00a0is a method of\u00a0statistical inference\u00a0used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters.

      A few notes:

      • When it comes to assumptions such as the expected distribution of data or sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable if an assumption is violated.
      • Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.
      • In some cases, the data or the test can be corrected to meet an assumption: for example, removing outliers to make a nearly normal distribution normal, or applying a correction to the degrees of freedom when samples have differing variance, to name two examples.
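      The degrees-of-freedom correction mentioned above is what Welch's t-test applies; in SciPy it is available via the `equal_var=False` flag of `ttest_ind`. A minimal sketch with made-up data (the second sample has a deliberately larger spread):

```python
# Welch's t-test: Student's t-test with a degrees-of-freedom
# correction for two samples with differing variance
from scipy.stats import ttest_ind

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [5.1, -4.2, 6.3, -5.5, 4.8, -6.1, 5.9, -4.7, 6.6, -5.3]  # much larger spread
stat, p = ttest_ind(data1, data2, equal_var=False)  # Welch's correction
print('stat=%.3f, p=%.3f' % (stat, p))
```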
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#normality-tests","title":"Normality Tests","text":"

      This section lists statistical tests that you can use to check if your data has a Gaussian distribution.

      Gaussian distribution (also known as normal distribution) is a bell-shaped curve.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#shapiro-wilk-test","title":"Shapiro-Wilk Test","text":"

      Tests whether a data sample has a Gaussian distribution/Normal distribution.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
      • Interpretation

        • H0: the sample has a Gaussian distribution.
        • H1: the sample does not have a Gaussian distribution.
      • Python Code

        # Example of the Shapiro-Wilk Normality Test\nfrom scipy.stats import shapiro\ndata = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\nstat, p = shapiro(data)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably Gaussian')\nelse:\n    print('Probably not Gaussian')\n
      • Sources

        • scipy.stats.shapiro
        • Shapiro-Wilk test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#dagostinos-k2-test","title":"D\u2019Agostino\u2019s K^2 Test","text":"

      Tests whether a data sample has a Gaussian distribution/Normal distribution.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
      • Interpretation

        • H0: the sample has a Gaussian distribution.
        • H1: the sample does not have a Gaussian distribution.
      • Python Code

        # Example of the D'Agostino's K^2 Normality Test\nfrom scipy.stats import normaltest\ndata = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\nstat, p = normaltest(data)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably Gaussian')\nelse:\n    print('Probably not Gaussian')\n
      • Sources

        • scipy.stats.normaltest
        • D'Agostino's K-squared test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#anderson-darling-test","title":"Anderson-Darling Test","text":"

      Tests whether a data sample has a Gaussian distribution/Normal distribution.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
      • Interpretation

        • H0: the sample has a Gaussian distribution.
        • H1: the sample does not have a Gaussian distribution.
      • Python Code

        # Example of the Anderson-Darling Normality Test\nfrom scipy.stats import anderson\ndata = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\nresult = anderson(data)\nprint('stat=%.3f' % (result.statistic))\nfor i in range(len(result.critical_values)):\n    sl, cv = result.significance_level[i], result.critical_values[i]\n    if result.statistic < cv:\n        print('Probably Gaussian at the %.1f%% level' % (sl))\n    else:\n        print('Probably not Gaussian at the %.1f%% level' % (sl))\n
      • Sources

        • scipy.stats.anderson
        • Anderson-Darling test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#correlation-tests","title":"Correlation Tests","text":"

      This section lists statistical tests that you can use to check if two samples are related.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#pearsons-correlation-coefficient","title":"Pearson\u2019s Correlation Coefficient","text":"

      Tests whether two samples have a linear relationship.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
      • Interpretation

        • H0: the two samples are independent.
        • H1: there is a dependency between the samples.
      • Python Code

        # Example of the Pearson's Correlation test\nfrom scipy.stats import pearsonr\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]\nstat, p = pearsonr(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably independent')\nelse:\n    print('Probably dependent')\n
      • Sources

        • scipy.stats.pearsonr
        • Pearson's correlation coefficient on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#spearmans-rank-correlation","title":"Spearman\u2019s Rank Correlation","text":"

      Tests whether two samples have a monotonic relationship.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
      • Interpretation

        • H0: the two samples are independent.
        • H1: there is a dependency between the samples.
      • Python Code

        # Example of the Spearman's Rank Correlation Test\nfrom scipy.stats import spearmanr\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]\nstat, p = spearmanr(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably independent')\nelse:\n    print('Probably dependent')\n
      • Sources

        • scipy.stats.spearmanr
        • Spearman's rank correlation coefficient on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#kendalls-rank-correlation","title":"Kendall\u2019s Rank Correlation","text":"

      Tests whether two samples have a monotonic relationship.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
      • Interpretation

        • H0: the two samples are independent.
        • H1: there is a dependency between the samples.
      • Python Code

        # Example of the Kendall's Rank Correlation Test\nfrom scipy.stats import kendalltau\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]\nstat, p = kendalltau(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably independent')\nelse:\n    print('Probably dependent')\n
      • Sources

        • scipy.stats.kendalltau
        • Kendall rank correlation coefficient on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#chi-squared-test","title":"Chi-Squared Test","text":"

      Tests whether two categorical variables are related or independent.

      • Assumptions

        • Observations used in the calculation of the contingency table are independent.
        • The expected count in each cell of the contingency table is 5 or more (a common rule of thumb).
      • Interpretation

        • H0: the two samples are independent.
        • H1: there is a dependency between the samples.
      • Python Code

        # Example of the Chi-Squared Test\nfrom scipy.stats import chi2_contingency\ntable = [[10, 20, 30],[6,  9,  17]]\nstat, p, dof, expected = chi2_contingency(table)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably independent')\nelse:\n    print('Probably dependent')\n
      • Sources

        • scipy.stats.chi2_contingency
        • Chi-Squared test on Wikipedia
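      The cell-count assumption can be checked directly, since `chi2_contingency` also returns the table of expected counts under independence. A small sketch using the same table as above:

```python
# Checking the expected-count assumption of the Chi-Squared Test
from scipy.stats import chi2_contingency

table = [[10, 20, 30], [6, 9, 17]]
stat, p, dof, expected = chi2_contingency(table)
print(expected)  # expected counts under H0 (independence)
# common rule of thumb: every expected cell count should be at least 5
if (expected >= 5).all():
    print('Cell-count assumption probably met')
else:
    print('Cell-count assumption probably violated')
```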
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#stationary-tests","title":"Stationary Tests","text":"

      This section lists statistical tests that you can use to check if a time series is stationary or not.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#augmented-dickey-fuller-unit-root-test","title":"Augmented Dickey-Fuller Unit Root Test","text":"

      Tests whether a time series has a unit root, e.g. has a trend or more generally is autoregressive.

      • Assumptions

        • Observations are temporally ordered.
      • Interpretation

        • H0: a unit root is present (series is non-stationary).
        • H1: a unit root is not present (series is stationary).
      • Python Code

        # Example of the Augmented Dickey-Fuller unit root test\nfrom statsmodels.tsa.stattools import adfuller\ndata = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\nstat, p, lags, obs, crit, t = adfuller(data)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably not Stationary')\nelse:\n    print('Probably Stationary')\n
      • Sources

        • statsmodels.tsa.stattools.adfuller API.
        • Augmented Dickey--Fuller test, Wikipedia.
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#kwiatkowski-phillips-schmidt-shin","title":"Kwiatkowski-Phillips-Schmidt-Shin","text":"

      Tests whether a time series is trend stationary or not.

      • Assumptions

        • Observations are temporally ordered.
      • Interpretation

        • H0: the time series is trend-stationary.
        • H1: the time series is not trend-stationary.
      • Python Code

        # Example of the Kwiatkowski-Phillips-Schmidt-Shin test\nfrom statsmodels.tsa.stattools import kpss\ndata = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\nstat, p, lags, crit = kpss(data)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably Stationary')\nelse:\n    print('Probably not Stationary')\n
      • Sources

        • statsmodels.tsa.stattools.kpss API.
        • KPSS test, Wikipedia.
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#parametric-statistical-hypothesis-tests","title":"Parametric Statistical Hypothesis Tests","text":"

      This section lists statistical tests that you can use to compare data samples.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#students-t-test","title":"Student\u2019s t-test","text":"

      Tests whether the means of two independent samples are significantly different.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
      • Interpretation

        • H0: the means of the samples are equal.
        • H1: the means of the samples are unequal.
      • Python Code

        # Example of the Student's t-test\nfrom scipy.stats import ttest_ind\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = ttest_ind(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.ttest_ind
        • Student's t-test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#paired-students-t-test","title":"Paired Student\u2019s t-test","text":"

      Tests whether the means of two paired samples are significantly different.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
        • Observations across each sample are paired.
      • Interpretation

        • H0: the means of the samples are equal.
        • H1: the means of the samples are unequal.
      • Python Code

        # Example of the Paired Student's t-test\nfrom scipy.stats import ttest_rel\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = ttest_rel(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.ttest_rel
        • Student's t-test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#analysis-of-variance-test-anova","title":"Analysis of Variance Test (ANOVA)","text":"

      Tests whether the means of two or more independent samples are significantly different.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
      • Interpretation

        • H0: the means of the samples are equal.
        • H1: one or more of the means of the samples are unequal.
      • Python Code

        # Example of the Analysis of Variance Test\nfrom scipy.stats import f_oneway\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\ndata3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]\nstat, p = f_oneway(data1, data2, data3)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.f_oneway
        • Analysis of variance on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#repeated-measures-anova-test","title":"Repeated Measures ANOVA Test","text":"

      Tests whether the means of two or more paired samples are significantly different.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
        • Observations across each sample are paired.
      • Interpretation

        • H0: the means of the samples are equal.
        • H1: one or more of the means of the samples are unequal.
      • Python Code

        # Not supported in SciPy; statsmodels provides statsmodels.stats.anova.AnovaRM for this test.\n
      • Sources

        • Analysis of variance on Wikipedia
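      A repeated measures ANOVA is not available in SciPy, but statsmodels provides one as `AnovaRM`. A minimal sketch with made-up long-format data (5 subjects, 3 conditions; assumes statsmodels and pandas are installed):

```python
# Example of the Repeated Measures ANOVA Test via statsmodels' AnovaRM
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# 5 subjects, each measured under 3 conditions (long format)
df = pd.DataFrame({
    'subject':   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
    'condition': ['a', 'b', 'c'] * 5,
    'score':     [0.87, 1.14, -0.21, 2.82, -0.43, 0.70,
                  0.12, -0.94, 0.93, -0.95, -0.73, -1.15,
                  -0.06, -0.85, -0.21],
})
res = AnovaRM(df, depvar='score', subject='subject', within=['condition']).fit()
p = res.anova_table['Pr > F'].iloc[0]
print(res.anova_table)
if p > 0.05:
    print('Probably the means are equal')
else:
    print('Probably one or more means differ')
```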
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#nonparametric-statistical-hypothesis-tests","title":"Nonparametric Statistical Hypothesis Tests","text":"

      Nonparametric tests make no assumptions about the parameters of the population under study: they do not require the data to follow any fixed parameterized distribution (such as the normal distribution).

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#mann-whitney-u-test","title":"Mann-Whitney U Test","text":"

      Tests whether the distributions of two independent samples are equal or not.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
      • Interpretation

        • H0: the distributions of both samples are equal.
        • H1: the distributions of both samples are not equal.
      • Python Code

        # Example of the Mann-Whitney U Test\nfrom scipy.stats import mannwhitneyu\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = mannwhitneyu(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.mannwhitneyu
        • Mann-Whitney U test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#wilcoxon-signed-rank-test","title":"Wilcoxon Signed-Rank Test","text":"

      Tests whether the distributions of two paired samples are equal or not.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
        • Observations across each sample are paired.
      • Interpretation

        • H0: the distributions of both samples are equal.
        • H1: the distributions of both samples are not equal.
      • Python Code

        # Example of the Wilcoxon Signed-Rank Test\nfrom scipy.stats import wilcoxon\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = wilcoxon(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.wilcoxon
        • Wilcoxon signed-rank test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#kruskal-wallis-h-test","title":"Kruskal-Wallis H Test","text":"

      Tests whether the distributions of two or more independent samples are equal or not.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
      • Interpretation

        • H0: the distributions of all samples are equal.
        • H1: the distributions of one or more samples are not equal.
      • Python Code

        # Example of the Kruskal-Wallis H Test\nfrom scipy.stats import kruskal\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = kruskal(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.kruskal
        • Kruskal-Wallis one-way analysis of variance on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#friedman-test","title":"Friedman Test","text":"

      Tests whether the distributions of two or more paired samples are equal or not.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
        • Observations across each sample are paired.
      • Interpretation

        • H0: the distributions of all samples are equal.
        • H1: the distributions of one or more samples are not equal.
      • Python Code

        # Example of the Friedman Test\nfrom scipy.stats import friedmanchisquare\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\ndata3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]\nstat, p = friedmanchisquare(data1, data2, data3)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.friedmanchisquare
        • Friedman test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#equality-of-variance-test","title":"Equality of variance test","text":"

      These tests are used to assess whether two or more samples have equal variance.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#levenes-test","title":"Levene's test","text":"

      Levene\u2019s test is used to assess the equality of variance between two or more different samples.

      • Assumptions

        • The samples from the populations under consideration are independent.
        • The populations under consideration are approximately normally distributed.
      • Interpretation

        • H0: All the sample variances are equal
        • H1: At least one variance is different from the rest
      • Python Code

        # Example of the Levene's test\nfrom scipy.stats import levene\na = [8.88, 9.12, 9.04, 8.98, 9.00, 9.08, 9.01, 8.85, 9.06, 8.99]\nb = [8.88, 8.95, 9.29, 9.44, 9.15, 9.58, 8.36, 9.18, 8.67, 9.05]\nc = [8.95, 9.12, 8.95, 8.85, 9.03, 8.84, 9.07, 8.98, 8.86, 8.98]\nstat, p = levene(a, b, c)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same variances')\nelse:\n    print('Probably at least one variance is different from the rest')\n
      • Sources

        • scipy.stats.levene
        • Levene's test on Wikipedia

      Source: https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/

      "},{"location":"Deploying-ML-models/deploying-ml-models/","title":"Home","text":""},{"location":"Deploying-ML-models/deploying-ml-models/#introduction","title":"Introduction","text":"

      This is a completely open-source platform for maintaining curated list of interview questions and answers for people looking and preparing for data science opportunities.

      Not only this, the platform will also serve as one point destination for all your needs like tutorials, online materials, etc.

      This platform is maintained by you! \ud83e\udd17 You can help us by answering/ improving existing questions as well as by sharing any new questions that you faced during your interviews.

      "},{"location":"Deploying-ML-models/deploying-ml-models/#contribute-to-the-platform","title":"Contribute to the platform","text":"

      Contribution in any form will be deeply appreciated. \ud83d\ude4f

      "},{"location":"Deploying-ML-models/deploying-ml-models/#add-questions","title":"Add questions","text":"

      \u2753 Add your questions here. Please ensure to provide a detailed description to allow your fellow contributors to understand your questions and answer them to your satisfaction.

      \ud83e\udd1d Please note that as of now, you cannot directly add a question via a pull request. This will help us to maintain the quality of the content for you.

      "},{"location":"Deploying-ML-models/deploying-ml-models/#add-answerstopics","title":"Add answers/topics","text":"

      \ud83d\udcdd These are the answers/topics that need your help at the moment

      • Add documentation for the project
      • Online Material for Learning
      • Suggested Learning Paths
      • Cheat Sheets
        • Django
        • Flask
        • Numpy
        • Pandas
        • PySpark
        • Python
        • RegEx
        • SQL
      • NLP Interview Questions
      • Add python common DSA interview questions
      • Add Major ML topics
        • Linear Regression
        • Logistic Regression
        • SVM
        • Random Forest
        • Gradient boosting
        • PCA
        • Collaborative Filtering
        • K-means clustering
        • kNN
        • ARIMA
        • Neural Networks
        • Decision Trees
        • Overfitting, Underfitting
        • Unbalanced, Skewed data
        • Activation functions relu/ leaky relu
        • Normalization
        • DBSCAN
        • Normal Distribution
        • Precision, Recall
        • Loss Function MAE, RMSE
      • Add Pandas questions
      • Add NumPy questions
      • Add TensorFlow questions
      • Add PyTorch questions
      • Add list of learning resources
      "},{"location":"Deploying-ML-models/deploying-ml-models/#reportsolve-issues","title":"Report/Solve Issues","text":"

      \ud83d\udd27 To report any issues find me on LinkedIn or raise an issue on GitHub.

      \ud83d\udee0 You can also solve existing issues on GitHub and create a pull request.

      "},{"location":"Deploying-ML-models/deploying-ml-models/#say-thanks","title":"Say Thanks","text":"

      \ud83d\ude0a If this platform helped you in any way, it would be great if you could share it with others.

      Check out this \ud83d\udc47 platform \ud83d\udc47 for data science content:\n\ud83d\udc49 https://singhsidhukuldeep.github.io/data-science-interview-prep/ \ud83d\udc48\n\n#data-science #machine-learning #interview-preparation \n

      You can also star the repository on GitHub and watch out for updates

      "},{"location":"Deploying-ML-models/deploying-ml-models/#features","title":"Features","text":"
      • \ud83c\udfa8 Beautiful: The design is built on top of most popular libraries like MkDocs and material which allows the platform to be responsive and to work on all sorts of devices \u2013 from mobile phones to wide-screens. The underlying fluid layout will always adapt perfectly to the available screen space.

      • \ud83e\uddd0 Searchable: Almost magically, all the content on the website is searchable without any further ado. The built-in search \u2013 server-less \u2013 is fast and returns accurate results for any query.

      • \ud83d\ude4c Accessible:

        • Easy to use: \ud83d\udc4c The website is hosted on github-pages and is free and open to use for over 40 million users of GitHub in 100+ countries.
        • Easy to contribute: \ud83e\udd1d The website embodies the concept of collaboration to the letter, allowing anyone to add or improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
      "},{"location":"Deploying-ML-models/deploying-ml-models/#setup","title":"Setup","text":"

      No setup is required to use the platform

      Important: It is strongly advised to use a virtual environment and not to change anything in the gh-pages branch

      "},{"location":"Deploying-ML-models/deploying-ml-models/#linux-systems","title":"Linux Systems","text":"
      python3 -m venv ./venv\n\nsource venv/bin/activate\n\npip3 install -r requirements.txt\n
      deactivate\n
      "},{"location":"Deploying-ML-models/deploying-ml-models/#windows-systems","title":"Windows Systems","text":"
      python3 -m venv ./venv\n\nvenv\\Scripts\\activate\n\npip3 install -r requirements.txt\n
      venv\\Scripts\\deactivate\n
      "},{"location":"Deploying-ML-models/deploying-ml-models/#to-install-the-latest","title":"To install the latest","text":"
      pip3 install mkdocs\npip3 install mkdocs-material\n
      "},{"location":"Deploying-ML-models/deploying-ml-models/#useful-commands","title":"Useful Commands","text":"
      • mkdocs serve - Start the live-reloading docs server.
      • mkdocs build - Build the documentation site.
      • mkdocs -h - Print help message and exit.
      • mkdocs gh-deploy - Use\u00a0mkdocs gh-deploy --help\u00a0to get a full list of options available for the\u00a0gh-deploy\u00a0command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the\u00a0build\u00a0or\u00a0serve\u00a0commands and reviewing the built files locally.
      • mkdocs new [dir-name] - Create a new project (not needed for this repository).
      "},{"location":"Deploying-ML-models/deploying-ml-models/#useful-documents","title":"Useful Documents","text":"
      • \ud83d\udcd1 MkDocs: https://github.com/mkdocs/mkdocs

      • \ud83c\udfa8 Theme: https://github.com/squidfunk/mkdocs-material

      "},{"location":"Deploying-ML-models/deploying-ml-models/#faq","title":"FAQ","text":"
      • Can I filter questions based on companies? \ud83e\udd2a

        As much as this platform aims to help you with your interview preparation, it is not a short-cut to crack one. Think of this platform as a practice field that helps you sharpen your skills for your interview processes. However, for your convenience, all the questions are sorted by topic. \ud83e\udd13

        This doesn't mean that such a feature won't be added in the future. \"Never say never\"

        But as of now there is neither plan nor data to do so. \ud83d\ude22

      • Why is this platform free? \ud83e\udd17

        Currently there is no major cost involved in maintaining this platform other than time and effort that is put in by every contributor. If you want to help you can contribute here.

        If you still want to pay for something that is free, we would request you to donate it to a charity of your choice instead. \ud83d\ude07

      "},{"location":"Deploying-ML-models/deploying-ml-models/#credits","title":"Credits","text":""},{"location":"Deploying-ML-models/deploying-ml-models/#maintained-by","title":"Maintained by","text":"

      \ud83d\udc68\u200d\ud83c\udf93 Kuldeep Singh Sidhu

      Github: github/singhsidhukuldeep https://github.com/singhsidhukuldeep

      Website: Kuldeep Singh Sidhu (Website) http://kuldeepsinghsidhu.com

      LinkedIn: Kuldeep Singh Sidhu (LinkedIn) https://www.linkedin.com/in/singhsidhukuldeep/

      "},{"location":"Deploying-ML-models/deploying-ml-models/#contributors","title":"Contributors","text":"

      \ud83d\ude0e The full list of all the contributors is available here

      "},{"location":"Deploying-ML-models/deploying-ml-models/#current-status","title":"Current Status","text":""},{"location":"Interview-Questions/Interview-Questions/","title":"Interview Questions (Intro)","text":"

      These are currently the most commonly asked questions. Questions may be removed when they are no longer popular in interview circles, and new ones added as fresh question banks are released.

      "},{"location":"Interview-Questions/Natural-Language-Processing/","title":"NLP Interview Questions","text":""},{"location":"Interview-Questions/Probability/","title":"Probability Interview Questions","text":"
      • Probability Interview Questions
        • Average score on a dice roll of at most 3 times
      "},{"location":"Interview-Questions/Probability/#average-score-on-a-dice-role-of-at-most-3-times","title":"Average score on a dice roll of at most 3 times","text":"

      Question

      Consider a fair 6-sided die. Your aim is to get the highest score you can, in at most 3 rolls.

      A score is defined as the number that appears on the face of the die facing up after the roll. You can roll at most 3 times, but after every roll it is up to you to decide whether you want to roll again.

      The last score will be counted as your final score.

      • Find the average score if you rolled the die only once.
      • Find the average score that you can get with at most 3 rolls.
      • If the die is fair, why is the average score for at most 3 rolls and 1 roll not the same?
      Hint 1

      Find the expected score of a single roll.

      You roll again only when the score of the current roll is less than the expected score of a single roll.

      E.g.: if the expected score of a single roll comes out to be 4.5, you would only roll again after a 1, 2, 3 or 4, and not after a 5 or 6.

      Answer

      If you roll a fair die once you can get:

      Each of the scores 1, 2, 3, 4, 5 and 6 has probability \u2159.

      So your average score with one roll is:

      sum of (score * score's probability) = (1+2+3+4+5+6)*(\u2159) = 21/6 = 3.5

      The average score if you rolled the die only once is 3.5

      For at most 3 rolls, let's try back-tracking. Say you have just made your second roll and have to decide whether to make your 3rd roll.

      We just found out if we role dice once on average we can expect score of 3.5. So we will only role the 3rd time if score on 2nd role is less than 3.5 i.e (1,2 or 3)

      Possibilities

      2nd roll score Probability 3rd roll score Probability 1 \u2159 3.5 \u2159 2 \u2159 3.5 \u2159 3 \u2159 3.5 \u2159 4 \u2159 NA We won't roll 5 \u2159 NA 3rd time if we 6 \u2159 NA get score >3 on 2nd

      So if we had 2 rolls, the average score would be:

      [We roll again if current score is less than 3.5]\n(3.5)*(1/6) + (3.5)*(1/6) + (3.5)*(1/6) \n+\n(4)*(1/6) + (5)*(1/6) + (6)*(1/6) [Decide not to roll again]\n=\n1.75 + 2.5 = 4.25\n

      The average score if you rolled the dice at most twice is 4.25

      So now if we look from the perspective of the first roll: we will only roll again if our score is less than 4.25 i.e. 1, 2, 3 or 4

      Possibilities

      1st roll score Probability 2nd and 3rd roll score Probability 1 \u2159 4.25 \u2159 2 \u2159 4.25 \u2159 3 \u2159 4.25 \u2159 4 \u2159 4.25 \u2159 5 \u2159 NA We won't roll again if we 6 \u2159 NA get score >4.25 on 1st

      So if we had 3 rolls, the average score would be:

      [We roll again if current score is less than 4.25]\n(4.25)*(1/6) + (4.25)*(1/6) + (4.25)*(1/6) + (4.25)*(1/6) \n+\n(5)*(1/6) + (6)*(1/6) [Decide not to roll again]\n=\n17/6 + 11/6 = 28/6 \u2248 4.67\n
      The average score if you rolled the dice at most 3 times is approximately 4.67

      The average score for at most 3 rolls and 1 roll is not the same because, although the dice is fair, the decision to roll again depends on the previous outcome. The averages would have been the same if we rolled the dice the 2nd and 3rd time without considering what we got on the last roll, i.e. if each roll's decision were independent of the outcomes.
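      The back-tracking argument above can be sketched as a short backward-induction routine (a minimal illustration; the function name `expected_score` is ours, not from the original answer):

```python
from fractions import Fraction

def expected_score(rolls_left: int) -> Fraction:
    # With one roll left you must keep whatever you get: E = (1+2+...+6)/6 = 3.5
    if rolls_left == 1:
        return Fraction(21, 6)
    # Otherwise, roll once; keep the face if it beats the expected value
    # of playing on with the remaining rolls, else roll again.
    future = expected_score(rolls_left - 1)
    total = sum(max(Fraction(face), future) for face in range(1, 7))
    return total / 6

print(float(expected_score(1)))  # 3.5
print(float(expected_score(2)))  # 4.25
print(float(expected_score(3)))  # 4.666...
```

      Using exact fractions avoids any floating-point doubt: the three values come out to 7/2, 17/4, and 28/6, matching the derivation above.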

      "},{"location":"Interview-Questions/System-design/","title":"System Design","text":""},{"location":"Interview-Questions/data-structures-algorithms/","title":"Data Structure and Algorithms (DSA)","text":"
      • Data Structure and Algorithms (DSA)
        • To-do
        • \ud83d\ude01 Easy
          • Two Number Sum
          • Validate Subsequence
          • Nth Fibonacci
          • Product Sum
        • \ud83d\ude42 Medium
          • Top K Frequent Words
        • \ud83e\udd28 Hard
        • \ud83d\ude32 Very Hard
      "},{"location":"Interview-Questions/data-structures-algorithms/#to-do","title":"To-do","text":"
      • Add https://leetcode.com/discuss/interview-question/344650/Amazon-Online-Assessment-Questions
      "},{"location":"Interview-Questions/data-structures-algorithms/#easy","title":"\ud83d\ude01 Easy","text":""},{"location":"Interview-Questions/data-structures-algorithms/#two-number-sum","title":"Two Number Sum","text":"

      Question

      Write a function that takes in a non-empty array of distinct integers and an integer representing a target sum.

      If any two numbers in the input array sum up to the target sum, the function should return them in an array, in any order.

      If no two numbers sum up to the target sum, the function should return an empty array.

      Try it!
      • LeetCode: https://leetcode.com/problems/two-sum/
      Hint 1

      No Hint

      Answer

      # O(n) time | O(n) space\ndef twoNumberSum(array, targetSum):\n    avail = set()\n    for v in array:\n        if targetSum-v in avail:\n            return [targetSum-v,v]\n        else:\n            avail.add(v)\n    return []\n
      # O(nlog(n)) time | O(1) space\ndef twoNumberSum(array, targetSum):\n    array.sort()\n    n = len(array)\n    left = 0\n    right = n-1\n    while left<right:\n        currSum = array[left]+array[right]\n        if currSum==targetSum: return [array[left], array[right]]\n        elif currSum<targetSum: left+=1\n        elif currSum>targetSum: right-=1\n    return []\n
      # O(n^2) time | O(1) space\ndef twoNumberSum(array, targetSum):\n    n = len(array)\n    for i in range(n-1):\n        for j in range(i+1,n):\n            if array[i]+array[j] == targetSum:\n                return [array[i],array[j]]\n    return []\n

      "},{"location":"Interview-Questions/data-structures-algorithms/#validate-subsequence","title":"Validate Subsequence","text":"

      Question

      Given two non-empty arrays of integers, write a function that determines whether the second array is a subsequence of the first one.

      A subsequence of an array is a set of numbers that aren't necessarily adjacent in the array but that are in the same order as they appear in the array. For instance, the numbers [1, 3, 4] form a subsequence of the array [1, 2, 3, 4] , and so do the numbers [2, 4].

      Note that a single number in an array and the array itself are both valid subsequences of the array.

      Try it!
      • GeeksforGeeks: https://www.geeksforgeeks.org/problems/array-subset-of-another-array2317/1
      Hint 1

      No Hint

      Answer
      # O(n) time | O(1) space - where n is the length of the array\ndef isValidSubsequence(array, sequence):\n    pArray = pSequence = 0\n    while pArray < len(array) and pSequence < len(sequence):\n        if array[pArray] == sequence[pSequence]:\n            pArray+=1\n            pSequence+=1\n        else: pArray+=1\n    return pSequence == len(sequence)\n
      "},{"location":"Interview-Questions/data-structures-algorithms/#nth-fibonacci","title":"Nth Fibonacci","text":"

      Question

      The Fibonacci sequence is defined as follows: any number in the sequence is the sum of the previous 2:

      fib[n] = fib[n-1] + fib[n-2]

      The 1st and 2nd numbers are fixed at 0 and 1.

      Find the Nth Fibonacci number.

      Try it!
      • LeetCode: https://leetcode.com/problems/fibonacci-number/
      • GeeksforGeeks: https://www.geeksforgeeks.org/problems/nth-fibonacci-number1335/1
      Hint 1

      No Hint

      Answer
      # O(n) time | O(n) space\ndef getNthFib(n):\n    dp = [0,1]\n    while len(dp)<n:\n        dp.append(dp[-1]+dp[-2])\n    return dp[n-1]\n
      # O(n) time | O(1) space\ndef getNthFib(n):\n    last_two = [0,1]\n    count = 2\n    while count < n:\n        currFib = last_two[0] + last_two[1]\n        last_two[0] = last_two[1]\n        last_two[1] = currFib\n        count += 1\n    return last_two[1] if n>1 else last_two[0]\n
      "},{"location":"Interview-Questions/data-structures-algorithms/#product-sum","title":"Product Sum","text":"

      Question

      Write a function that takes in a \"special\" array and returns its product sum. A \"special\" array is a non-empty array that contains either integers or other \"special\" arrays. The product sum of a \"special\" array is the sum of its elements, where \"special\" arrays inside it are summed themselves and then multiplied by their level of depth.

      For example, the product sum of [x, y] is x + y ; the product sum of [x, [y, z]] is x + 2y + 2z

      Eg: Input: [5, 2, [7, -1], 3, [6, [-13, 8], 4]] Output: 12 # calculated as: 5 + 2 + 2 * (7 - 1) + 3 + 2 * (6 + 3 * (-13 + 8) + 4)

      Hint 1

      No Hint

      Answer
      # O(n) time | O(d) space - where n is the total number of elements in the array,\n# including sub-elements, and d is the greatest depth of \"special\" arrays in the array\ndef productSum(array, depth = 1):\n    total = 0\n    for v in array:\n        if type(v) is list:\n            total += productSum(v, depth + 1)\n        else:\n            total += v\n    return total*depth\n
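      As a quick sanity check of the worked example in the question, the same recursion can be exercised standalone (a minimal sketch; `product_sum` is our illustrative name):

```python
def product_sum(array, depth=1):
    # Sum the elements at this level; nested "special" arrays are summed
    # recursively at depth + 1, and the level's sum is scaled by its depth.
    total = 0
    for v in array:
        total += product_sum(v, depth + 1) if isinstance(v, list) else v
    return total * depth

print(product_sum([5, 2, [7, -1], 3, [6, [-13, 8], 4]]))  # -> 12
```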
      "},{"location":"Interview-Questions/data-structures-algorithms/#medium","title":"\ud83d\ude42 Medium","text":""},{"location":"Interview-Questions/data-structures-algorithms/#top-k-frequent-words","title":"Top K Frequent Words","text":"

      Question

      Given a non-empty list of words, return the\u00a0k\u00a0most frequent elements.

      Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

      Example 1:

      Input: [\"i\", \"love\", \"leetcode\", \"i\", \"love\", \"coding\"], k = 2\nOutput: [\"i\", \"love\"]\nExplanation: \"i\" and \"love\" are the two most frequent words.\n    Note that \"i\" comes before \"love\" due to a lower alphabetical order.\n

      Example 2:

      Input: [\"the\", \"day\", \"is\", \"sunny\", \"the\", \"the\", \"the\", \"sunny\", \"is\", \"is\"], k = 4\nOutput: [\"the\", \"is\", \"sunny\", \"day\"]\nExplanation: \"the\", \"is\", \"sunny\" and \"day\" are the four most frequent words,\n    with the number of occurrence being 4, 3, 2 and 1 respectively.\n
      Note:

      1. You may assume\u00a0k\u00a0is always valid, 1 \u2264\u00a0k\u00a0\u2264 number of unique elements.
      2. Input words contain only lowercase letters.

      Follow up:

      1. Try to solve it in\u00a0O(n\u00a0log\u00a0k) time and\u00a0O(n) extra space.
      Try it!
      • LeetCode: https://leetcode.com/problems/top-k-frequent-words/
      Hint 1

      No Hint

      Answer
      # Count the frequency of each word, and\n# sort the words with a custom ordering relation\n# that uses these frequencies. Then take the best k of them.\n\n# Time Complexity: O(N log N), where N is the length of words.\n# We count the frequency of each word in O(N) time,\n# then we sort the given words in O(N log N) time.\n# Space Complexity: O(N), the space used to store our uniqueWords.\nfrom collections import Counter\n\ndef topKFrequentWords(words, k):\n    wordsFreq = Counter(words)\n    uniqueWords = list(wordsFreq.keys())\n    uniqueWords.sort(key = lambda x: (-wordsFreq[x], x))\n    return uniqueWords[:k]\n
      # Time Complexity: O(N \\log{k})O(Nlogk), where NN is the length of words. \n# We count the frequency of each word in O(N)O(N) time, then we add NN words to the heap, \n# each in O(\\log {k})O(logk) time. Finally, we pop from the heap up to kk times. \n# As k \\leq Nk\u2264N, this is O(N \\log{k})O(Nlogk) in total.\n\n# In Python, we improve this to O(N + k \\log {N})O(N+klogN): our heapq.heapify operation and \n# counting operations are O(N)O(N), and \n# each of kk heapq.heappop operations are O(\\log {N})O(logN).\n\n# Space Complexity: O(N)O(N), the space used to store our wordsFreq.\n\n# Count the frequency of each word, then add it to heap that stores the best k candidates. \n# Here, \"best\" is defined with our custom ordering relation, \n# which puts the worst candidates at the top of the heap. \n# At the end, we pop off the heap up to k times and reverse the result \n# so that the best candidates are first.\n\n# In Python, we instead use heapq.heapify, which can turn a list into a heap in linear time, \n# simplifying our work.\n\ndef topKFrequentWords(words, k)-> List[str]:\n    from heapq import heapify, heappop#, heappush\n    from collections import Counter\n    wordsFreq = Counter(words)\n    heap = [(-freq, word) for word, freq in wordsFreq.items()]\n    heapq.heapify(heap)\n    return [heapq.heappop(heap)[1] for _ in range(k)]\n
      "},{"location":"Interview-Questions/data-structures-algorithms/#hard","title":"\ud83e\udd28 Hard","text":""},{"location":"Interview-Questions/data-structures-algorithms/#very-hard","title":"\ud83d\ude32 Very Hard","text":""}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"

      This is a completely open-source platform for maintaining a curated list of interview questions and answers for people looking for and preparing for data science opportunities.

      Not only that, the platform also serves as a one-stop destination for all your needs, like tutorials, online materials, etc.

      This platform is maintained by you! \ud83e\udd17 You can help us by answering/improving existing questions, as well as by sharing any new questions that you faced during your interviews.

      • Interview Questions

        These are currently the most commonly asked interview questions.

        Questions may be removed if they are no longer popular in interview circles, and new ones added as new question banks are released.

        • DSA (Data Structures & Algorithms)
        • System Design
        • Natural Language Processing (NLP)
        • Probability
      • Cheat Sheets

        Distilled down important concepts for your quick reference

      • ML Algorithms

        From-scratch implementations and documentation of all ML algorithms

      • Online Resources

        The most popular and commonly referred-to online resources

      Current Platform Status Done Under Development To Do
      • Cheat-Sheets/Hypothesis-Tests/
      1. I
      2. Will
      3. Update
      4. Soon
      1. I
      2. Will
      3. Update
      4. Soon
      • Project Maintainer
      • All Contributors list
      • AGPL-3.0 license
      • Reach Out
      "},{"location":"Introduction/","title":"Home","text":""},{"location":"Introduction/#introduction","title":"Introduction","text":"

      This is a completely open-source platform for maintaining a curated list of interview questions and answers for people looking for and preparing for data science opportunities.

      Not only that, the platform also serves as a one-stop destination for all your needs, like tutorials, online materials, etc.

      This platform is maintained by you! \ud83e\udd17 You can help us by answering/improving existing questions, as well as by sharing any new questions that you faced during your interviews.

      "},{"location":"Introduction/#contribute-to-the-platform","title":"Contribute to the platform","text":"

      Contribution in any form will be deeply appreciated. \ud83d\ude4f

      "},{"location":"Introduction/#add-questions","title":"Add questions","text":"

      \u2753 Add your questions here. Please provide a detailed description to allow your fellow contributors to understand your question and answer it to your satisfaction.

      \ud83e\udd1d Please note that as of now, you cannot directly add a question via a pull request. This will help us to maintain the quality of the content for you.

      "},{"location":"Introduction/#add-answerstopics","title":"Add answers/topics","text":"

      \ud83d\udcdd These are the answers/topics that need your help at the moment

      • Add documentation for the project
      • Online Material for Learning
      • Suggested Learning Paths
      • Cheat Sheets
        • Django
        • Flask
        • Numpy
        • Pandas
        • PySpark
        • Python
        • RegEx
        • SQL
      • NLP Interview Questions
      • Add python common DSA interview questions
      • Add Major ML topics
        • Linear Regression
        • Logistic Regression
        • SVM
        • Random Forest
        • Gradient boosting
        • PCA
        • Collaborative Filtering
        • K-means clustering
        • kNN
        • ARIMA
        • Neural Networks
        • Decision Trees
        • Overfitting, Underfitting
        • Unbalanced, Skewed data
        • Activation functions relu/ leaky relu
        • Normalization
        • DBSCAN
        • Normal Distribution
        • Precision, Recall
        • Loss Function MAE, RMSE
      • Add Pandas questions
      • Add NumPy questions
      • Add TensorFlow questions
      • Add PyTorch questions
      • Add list of learning resources
      "},{"location":"Introduction/#reportsolve-issues","title":"Report/Solve Issues","text":"

      \ud83d\udd27 To report any issues find me on LinkedIn or raise an issue on GitHub.

      \ud83d\udee0 You can also solve existing issues on GitHub and create a pull request.

      "},{"location":"Introduction/#say-thanks","title":"Say Thanks","text":"

      \ud83d\ude0a If this platform helped you in any way, it would be great if you could share it with others.

      Check out this \ud83d\udc47 platform \ud83d\udc47 for data science content:\n\ud83d\udc49 https://singhsidhukuldeep.github.io/data-science-interview-prep/ \ud83d\udc48\n

      You can also star the repository on GitHub and watch-out for any updates

      "},{"location":"Introduction/#features","title":"Features","text":"
      • \ud83c\udfa8 Beautiful: The design is built on top of most popular libraries like MkDocs and material which allows the platform to be responsive and to work on all sorts of devices \u2013 from mobile phones to wide-screens. The underlying fluid layout will always adapt perfectly to the available screen space.

      • \ud83e\uddd0 Searchable: almost magically, all the content on the website is searchable without any further ado. The built-in search \u2013 server-less \u2013 is fast and accurate in response to any query.

      • \ud83d\ude4c Accessible:

        • Easy to use: \ud83d\udc4c The website is hosted on github-pages and is free and open to use for over 40 million users of GitHub in 100+ countries.
        • Easy to contribute: \ud83e\udd1d The website embodies the concept of collaboration to the letter, allowing anyone to add/improve the content. To make contributing easy, everything is written in Markdown and then compiled to beautiful HTML.
      "},{"location":"Introduction/#setup","title":"Setup","text":"

      No setup is required to use the platform

      Important: It is strongly advised to use a virtual environment and not to change anything in gh-pages

      "},{"location":"Introduction/#linux-systems","title":"Linux Systems","text":"
      python3 -m venv ./venv\n\nsource venv/bin/activate\n\npip3 install -r requirements.txt\n
      deactivate\n
      "},{"location":"Introduction/#windows-systems","title":"Windows Systems","text":"
      python3 -m venv ./venv\n\nvenv\\Scripts\\activate\n\npip3 install -r requirements.txt\n
      venv\\Scripts\\deactivate\n
      "},{"location":"Introduction/#to-install-the-latest","title":"To install the latest","text":"
      pip3 install mkdocs\npip3 install mkdocs-material\npip3 install mkdocs-minify-plugin\npip3 install mkdocs-git-revision-date-localized-plugin\n
      "},{"location":"Introduction/#useful-commands","title":"Useful Commands","text":"
      • mkdocs serve - Start the live-reloading docs server.
      • mkdocs build - Build the documentation site.
      • mkdocs -h - Print help message and exit.
      • mkdocs gh-deploy - Use\u00a0mkdocs gh-deploy --help\u00a0to get a full list of options available for the\u00a0gh-deploy\u00a0command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the\u00a0build\u00a0or\u00a0serve\u00a0commands and reviewing the built files locally.
      • mkdocs new [dir-name] - Create a new project. (Not needed here, as the project already exists.)
      "},{"location":"Introduction/#useful-documents","title":"Useful Documents","text":"
      • \ud83d\udcd1 MkDocs:

        • GitHub: https://github.com/mkdocs/mkdocs
        • Documentation: https://www.mkdocs.org/
      • \ud83c\udfa8 Theme:

        • GitHub: https://github.com/squidfunk/mkdocs-material
        • Documentation: https://squidfunk.github.io/mkdocs-material/getting-started/
      "},{"location":"Introduction/#faq","title":"FAQ","text":"
      • Can I filter questions based on companies? \ud83e\udd2a

        As much as this platform aims to help you with your interview preparation, it is not a shortcut to crack one. Think of this platform as a practice field to help you sharpen your skills for your interview processes. However, for your convenience, we have sorted all the questions by topic for you. \ud83e\udd13

        This doesn't mean that such a feature won't be added in the future. \"Never say never\"

        But as of now, there is neither a plan nor the data to do so. \ud83d\ude22

      • Why is this platform free? \ud83e\udd17

        Currently, there is no major cost involved in maintaining this platform other than the time and effort put in by every contributor. If you want to help, you can contribute here.

        If you still want to pay for something that is free, we would request you to donate it to a charity of your choice instead. \ud83d\ude07

      "},{"location":"Introduction/#credits","title":"Credits","text":""},{"location":"Introduction/#maintained-by","title":"Maintained by","text":"

      \ud83d\udc68\u200d\ud83c\udf93 Kuldeep Singh Sidhu

      Github: github/singhsidhukuldeep https://github.com/singhsidhukuldeep

      Website: Kuldeep Singh Sidhu (Website) http://kuldeepsinghsidhu.com

      LinkedIn: Kuldeep Singh Sidhu (LinkedIn) https://www.linkedin.com/in/singhsidhukuldeep/

      "},{"location":"Introduction/#contributors","title":"Contributors","text":"

      \ud83d\ude0e The full list of all the contributors is available here

      "},{"location":"Introduction/#current-status","title":"Current Status","text":""},{"location":"contact/","title":"Contact for https://singhsidhukuldeep.github.io","text":"

      Welcome to https://singhsidhukuldeep.github.io/

      For any information, request or official correspondence please email to: singhsidhukuldeep@gmail.com

      Mailing Address:

      Kuldeep Singh Sidhu

      Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India

      "},{"location":"contact/#follow-on-social-media","title":"Follow on Social Media","text":"Platform Link GitHub https://github.com/singhsidhukuldeep LinkedIn https://www.linkedin.com/in/singhsidhukuldeep/ Twitter (X) https://twitter.com/kuldeep_s_s HuggingFace https://huggingface.co/singhsidhukuldeep StackOverflow https://stackoverflow.com/users/7182350 Website http://kuldeepsinghsidhu.com/"},{"location":"privacy/","title":"Privacy Policy for https://singhsidhukuldeep.github.io","text":""},{"location":"privacy/#introduction","title":"Introduction","text":"

      Welcome to https://singhsidhukuldeep.github.io/ (the \"Website\"). Your privacy is important to us, and we are committed to protecting the personal information you share with us. This Privacy Policy explains how we collect, use, and disclose your information, and our commitment to ensuring that your personal data is handled with care and security.

      This policy complies with the General Data Protection Regulation (GDPR), ePrivacy Directive (EPD), California Privacy Rights Act (CPRA), Colorado Privacy Act (CPA), Virginia Consumer Data Protection Act (VCDPA), and Brazil's Lei Geral de Prote\u00e7\u00e3o de Dados (LGPD).

      "},{"location":"privacy/#information-we-collect","title":"Information We Collect","text":""},{"location":"privacy/#personal-information","title":"Personal Information","text":"

      We may collect personally identifiable information about you, such as:

      • Name
      • Email address
      • IP address
      • Other information you voluntarily provide through contact forms or interactions with the Website
      "},{"location":"privacy/#non-personal-information","title":"Non-Personal Information","text":"

      We may also collect non-personal information such as:

      • Browser type
      • Language preference
      • Referring site
      • Date and time of each visitor request
      • Aggregated data on how visitors use the Website
      "},{"location":"privacy/#cookies-and-web-beacons","title":"Cookies and Web Beacons","text":"

      Our Website uses cookies to enhance your experience. A cookie is a small file that is placed on your device when you visit our Website. Cookies help us to:

      • Remember your preferences and settings
      • Understand how you interact with our Website
      • Track and analyze usage patterns

      You can disable cookies through your browser settings; however, doing so may affect your ability to access certain features of the Website.

      "},{"location":"privacy/#google-adsense","title":"Google AdSense","text":"

      We use Google AdSense to display advertisements on our Website. Google AdSense may use cookies and web beacons to collect information about your interaction with the ads displayed on our Website. This information may include:

      • Your IP address
      • The type of browser you use
      • The pages you visit on our Website

      Google may use this information to show you personalized ads based on your interests and browsing history. For more information on how Google uses your data, please visit the Google Privacy & Terms page.

      "},{"location":"privacy/#legal-bases-for-processing-your-data-gdpr-compliance","title":"Legal Bases for Processing Your Data (GDPR Compliance)","text":"

      We process your personal data under the following legal bases:

      • Consent: When you have given explicit consent for us to process your data for specific purposes.
      • Contract: When processing your data is necessary to fulfill a contract with you or to take steps at your request before entering into a contract.
      • Legitimate Interests: When the processing is necessary for our legitimate interests, such as improving our services, provided these are not overridden by your rights.
      • Compliance with Legal Obligations: When we need to process your data to comply with a legal obligation.
      "},{"location":"privacy/#how-your-data-will-be-used-to-show-ads","title":"How Your Data Will Be Used to Show Ads","text":"

      We work with third-party vendors, including Google, to serve ads on our Website. These vendors use cookies and similar technologies to collect and use data about your visits to this and other websites to show you ads that are more relevant to your interests.

      "},{"location":"privacy/#types-of-data-used","title":"Types of Data Used","text":"

      The data used to show you ads may include:

      • Demographic Information: Age, gender, and other demographic details
      • Location Data: Approximate geographical location based on your IP address
      • Behavioral Data: Your browsing behavior, such as pages visited, links clicked, and time spent on our Website
      • Interests and Preferences: Based on your browsing history, the types of ads you interact with, and your preferences across websites
      "},{"location":"privacy/#purpose-of-data-usage","title":"Purpose of Data Usage","text":"

      The primary purpose of collecting and using this data is to:

      • Serve ads that are relevant and tailored to your interests
      • Improve ad targeting and effectiveness
      • Analyze and optimize the performance of ads on our Website
      "},{"location":"privacy/#opting-out-of-personalized-ads","title":"Opting Out of Personalized Ads","text":"

      You can opt out of personalized ads by adjusting your ad settings with Google and other third-party vendors. For more information on how to opt out of personalized ads, please visit the Google Ads Settings page and review the options available to manage your preferences.

      "},{"location":"privacy/#data-subject-rights-gdpr-cpra-cpa-vcdpa-lgpd-compliance","title":"Data Subject Rights (GDPR, CPRA, CPA, VCDPA, LGPD Compliance)","text":"

      Depending on your jurisdiction, you have the following rights regarding your personal data:

      "},{"location":"privacy/#right-to-access","title":"Right to Access","text":"

      You have the right to request access to the personal data we hold about you and to receive a copy of this data.

      "},{"location":"privacy/#right-to-rectification","title":"Right to Rectification","text":"

      You have the right to request that we correct any inaccuracies in the personal data we hold about you.

      "},{"location":"privacy/#right-to-erasure-right-to-be-forgotten","title":"Right to Erasure (Right to Be Forgotten)","text":"

      You have the right to request that we delete your personal data, subject to certain conditions and legal obligations.

      "},{"location":"privacy/#right-to-restriction-of-processing","title":"Right to Restriction of Processing","text":"

      You have the right to request that we restrict the processing of your personal data in certain circumstances, such as when you contest the accuracy of the data.

      "},{"location":"privacy/#right-to-data-portability","title":"Right to Data Portability","text":"

      You have the right to receive your personal data in a structured, commonly used, and machine-readable format and to transmit this data to another controller.

      "},{"location":"privacy/#right-to-object","title":"Right to Object","text":"

      You have the right to object to the processing of your personal data based on legitimate interests or for direct marketing purposes.

      "},{"location":"privacy/#right-to-withdraw-consent","title":"Right to Withdraw Consent","text":"

      Where we rely on your consent to process your personal data, you have the right to withdraw your consent at any time.

      "},{"location":"privacy/#right-to-non-discrimination-cpra-compliance","title":"Right to Non-Discrimination (CPRA Compliance)","text":"

      We will not discriminate against you for exercising any of your privacy rights under CPRA or any other applicable laws.

      "},{"location":"privacy/#exercising-your-rights","title":"Exercising Your Rights","text":"

      To exercise any of these rights, please contact us at:

      Email: singhsidhukuldeep@gmail.com

      We will respond to your request within the timeframes required by applicable law.

      "},{"location":"privacy/#how-we-use-your-information","title":"How We Use Your Information","text":"

      We use the information collected from you to:

      • Improve the content and functionality of our Website
      • Display relevant advertisements through Google AdSense and other ad networks
      • Respond to your inquiries and provide customer support
      • Analyze usage patterns and improve our services
      "},{"location":"privacy/#data-sharing-and-disclosure","title":"Data Sharing and Disclosure","text":""},{"location":"privacy/#third-party-service-providers","title":"Third-Party Service Providers","text":"

      We may share your personal data with third-party service providers who assist us in operating our Website, conducting our business, or servicing you, as long as these parties agree to keep this information confidential.

      "},{"location":"privacy/#legal-obligations","title":"Legal Obligations","text":"

      We may disclose your personal data when required by law or to comply with legal processes, such as a court order or subpoena.

      "},{"location":"privacy/#business-transfers","title":"Business Transfers","text":"

      In the event of a merger, acquisition, or sale of all or a portion of our assets, your personal data may be transferred to the acquiring entity.

      "},{"location":"privacy/#data-retention","title":"Data Retention","text":"

      We will retain your personal data only for as long as necessary to fulfill the purposes outlined in this Privacy Policy unless a longer retention period is required or permitted by law.

      "},{"location":"privacy/#data-security","title":"Data Security","text":"

      We take reasonable measures to protect your information from unauthorized access, alteration, disclosure, or destruction. However, no method of transmission over the internet or electronic storage is 100% secure, and we cannot guarantee absolute security.

      "},{"location":"privacy/#cross-border-data-transfers","title":"Cross-Border Data Transfers","text":"

      Your personal data may be transferred to, and processed in, countries other than the country in which you are resident. These countries may have data protection laws that are different from the laws of your country.

      Where we transfer your personal data to other countries, we will take appropriate measures to ensure that your personal data remains protected in accordance with this Privacy Policy and applicable data protection laws.

      "},{"location":"privacy/#your-consent","title":"Your Consent","text":"

      By using our Website, you consent to our Privacy Policy and agree to its terms.

      "},{"location":"privacy/#changes-to-this-privacy-policy","title":"Changes to This Privacy Policy","text":"

      We may update this Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page. You are advised to review this Privacy Policy periodically for any changes.

      "},{"location":"privacy/#contact-us","title":"Contact Us","text":"

      If you have any questions about this Privacy Policy, or if you would like to exercise your rights under GDPR, CPRA, CPA, VCDPA, or LGPD, please contact us at:

      Email: singhsidhukuldeep@gmail.com

      Mailing Address:

      Kuldeep Singh Sidhu

      Street No 4, Malviya Nagar Bathinda, Punjab, 151001 India

      "},{"location":"projects/","title":"Projects","text":""},{"location":"projects/#introduction","title":"Introduction","text":"

      These are projects that you can take inspiration from and try to improve upon. \u270d\ufe0f

      "},{"location":"projects/#popular-sources","title":"Popular Sources","text":""},{"location":"projects/#list-of-projects","title":"List of projects","text":""},{"location":"projects/#natural-language-processing-nlp","title":"Natural Language Processing (NLP)","text":"Title Description Source Author Text Classification with Facebook fastText Building a user-review model with fastText (text classification) with a response time of under one second Kuldeep Singh Sidhu Chat-bot using ChatterBot ChatterBot is a Python library that makes it easy to generate automated responses to a user\u2019s input. Kuldeep Singh Sidhu Text Summarizer Comparing state-of-the-art models for text summary generation Kuldeep Singh Sidhu NLP with spaCy Building an NLP pipeline using spaCy Kuldeep Singh Sidhu"},{"location":"projects/#recommendation-engine","title":"Recommendation Engine","text":"Title Description Source Author Recommendation Engine with Surprise Comparing different recommendation-system algorithms such as SVD, SVDpp (matrix factorization), KNN Baseline, KNN Basic, KNN with Means, KNN with Z-Score, Baseline, and Co-Clustering Kuldeep Singh Sidhu"},{"location":"projects/#image-processing","title":"Image Processing","text":"Title Description Source Author Facial Landmarks Using Dlib, a library capable of giving you 68 points (landmarks) of the face. Kuldeep Singh Sidhu"},{"location":"projects/#reinforcement-learning","title":"Reinforcement Learning","text":"Title Description Source Author Google Dopamine Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. Kuldeep Singh Sidhu Tic Tac Toe Training a computer to play Tic Tac Toe using reinforcement learning algorithms. Kuldeep Singh Sidhu"},{"location":"projects/#others","title":"Others","text":"Title Description Source Author TensorFlow Eager Execution Eager Execution (EE) enables you to run operations immediately. 
Kuldeep Singh Sidhu"},{"location":"Cheat-Sheets/Hypothesis-Tests/","title":"Hypothesis Tests in Python","text":"

      A\u00a0statistical hypothesis test\u00a0is a method of\u00a0statistical inference\u00a0used to decide whether the data at hand sufficiently support a particular hypothesis. Hypothesis testing allows us to make probabilistic statements about population parameters.

      Few Notes:

      • When it comes to assumptions such as the expected distribution of data or sample size, the results of a given test are likely to degrade gracefully rather than become immediately unusable if an assumption is violated.
      • Generally, data samples need to be representative of the domain and large enough to expose their distribution to analysis.
      • In some cases, the data can be corrected to meet the assumptions, such as correcting a nearly normal distribution to be normal by removing outliers, or using a correction to the degrees of freedom in a statistical test when samples have differing variance, to name two examples.
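Every test on this sheet is interpreted the same way: compute a test statistic and a p-value, then compare the p-value against a pre-chosen significance level (commonly 0.05). A minimal helper sketching that shared pattern (the function name `interpret_p` is illustrative, not from any library):

```python
# Shared interpretation pattern used by every test on this sheet.
def interpret_p(p, alpha=0.05):
    """Compare a p-value against the significance level alpha.

    Note: failing to reject H0 is weak evidence for H0, not proof of it.
    """
    if p > alpha:
        return 'Fail to reject H0'
    return 'Reject H0'

print(interpret_p(0.20))  # Fail to reject H0
print(interpret_p(0.01))  # Reject H0
```

The choice of alpha is a trade-off fixed before looking at the data; 0.05 is conventional, not special.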
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#normality-tests","title":"Normality Tests","text":"

      This section lists statistical tests that you can use to check if your data has a Gaussian distribution.

      The Gaussian distribution (also known as the normal distribution) has a bell-shaped curve.
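The examples below use small hard-coded samples. To experiment further, you can generate synthetic data with NumPy (a sketch; the seed is arbitrary):

```python
# Synthetic samples for trying the normality tests below (seed is arbitrary).
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(42)
gaussian = rng.normal(loc=0.0, scale=1.0, size=100)  # drawn from a normal
skewed = rng.exponential(scale=1.0, size=100)        # clearly non-normal

stat_g, p_g = shapiro(gaussian)
stat_s, p_s = shapiro(skewed)
print('normal sample:      p=%.3f' % p_g)
print('exponential sample: p=%.3f' % p_s)
```

The exponential sample should be rejected decisively; the normal sample will typically (though not always, given sampling noise) fail to be rejected.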

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#shapiro-wilk-test","title":"Shapiro-Wilk Test","text":"

      Tests whether a data sample has a Gaussian (normal) distribution.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
      • Interpretation

        • H0: the sample has a Gaussian distribution.
        • H1: the sample does not have a Gaussian distribution.
      • Python Code

        # Example of the Shapiro-Wilk Normality Test\nfrom scipy.stats import shapiro\ndata = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\nstat, p = shapiro(data)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably Gaussian')\nelse:\n    print('Probably not Gaussian')\n
      • Sources

        • scipy.stats.shapiro
        • Shapiro-Wilk test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#dagostinos-k2-test","title":"D\u2019Agostino\u2019s K^2 Test","text":"

      Tests whether a data sample has a Gaussian (normal) distribution.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
      • Interpretation

        • H0: the sample has a Gaussian distribution.
        • H1: the sample does not have a Gaussian distribution.
      • Python Code

        # Example of the D'Agostino's K^2 Normality Test\nfrom scipy.stats import normaltest\ndata = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\nstat, p = normaltest(data)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably Gaussian')\nelse:\n    print('Probably not Gaussian')\n
      • Sources

        • scipy.stats.normaltest
        • D'Agostino's K-squared test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#anderson-darling-test","title":"Anderson-Darling Test","text":"

      Tests whether a data sample has a Gaussian (normal) distribution.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
      • Interpretation

        • H0: the sample has a Gaussian distribution.
        • H1: the sample does not have a Gaussian distribution.
      • Python Code

        # Example of the Anderson-Darling Normality Test\nfrom scipy.stats import anderson\ndata = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\nresult = anderson(data)\nprint('stat=%.3f' % (result.statistic))\nfor i in range(len(result.critical_values)):\n    sl, cv = result.significance_level[i], result.critical_values[i]\n    if result.statistic < cv:\n        print('Probably Gaussian at the %.1f%% level' % (sl))\n    else:\n        print('Probably not Gaussian at the %.1f%% level' % (sl))\n
      • Sources

        • scipy.stats.anderson
        • Anderson-Darling test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#correlation-tests","title":"Correlation Tests","text":"

      This section lists statistical tests that you can use to check if two samples are related.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#pearsons-correlation-coefficient","title":"Pearson\u2019s Correlation Coefficient","text":"

      Tests whether two samples have a linear relationship.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
      • Interpretation

        • H0: the two samples are independent.
        • H1: there is a dependency between the samples.
      • Python Code

        # Example of the Pearson's Correlation test\nfrom scipy.stats import pearsonr\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]\nstat, p = pearsonr(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably independent')\nelse:\n    print('Probably dependent')\n
      • Sources

        • scipy.stats.pearsonr
        • Pearson's correlation coefficient on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#spearmans-rank-correlation","title":"Spearman\u2019s Rank Correlation","text":"

      Tests whether two samples have a monotonic relationship.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
      • Interpretation

        • H0: the two samples are independent.
        • H1: there is a dependency between the samples.
      • Python Code

        # Example of the Spearman's Rank Correlation Test\nfrom scipy.stats import spearmanr\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]\nstat, p = spearmanr(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably independent')\nelse:\n    print('Probably dependent')\n
      • Sources

        • scipy.stats.spearmanr
        • Spearman's rank correlation coefficient on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#kendalls-rank-correlation","title":"Kendall\u2019s Rank Correlation","text":"

      Tests whether two samples have a monotonic relationship.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
      • Interpretation

        • H0: the two samples are independent.
        • H1: there is a dependency between the samples.
      • Python Code

        # Example of the Kendall's Rank Correlation Test\nfrom scipy.stats import kendalltau\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [0.353, 3.517, 0.125, -7.545, -0.555, -1.536, 3.350, -1.578, -3.537, -1.579]\nstat, p = kendalltau(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably independent')\nelse:\n    print('Probably dependent')\n
      • Sources

        • scipy.stats.kendalltau
        • Kendall rank correlation coefficient on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#chi-squared-test","title":"Chi-Squared Test","text":"

      Tests whether two categorical variables are related or independent.

      • Assumptions

        • Observations used in the calculation of the contingency table are independent.
        • The expected count in each cell of the contingency table is 5 or more (a commonly cited rule of thumb).
      • Interpretation

        • H0: the two samples are independent.
        • H1: there is a dependency between the samples.
      • Python Code

        # Example of the Chi-Squared Test\nfrom scipy.stats import chi2_contingency\ntable = [[10, 20, 30],[6,  9,  17]]\nstat, p, dof, expected = chi2_contingency(table)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably independent')\nelse:\n    print('Probably dependent')\n
      • Sources

        • scipy.stats.chi2_contingency
        • Chi-Squared test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#stationary-tests","title":"Stationary Tests","text":"

      This section lists statistical tests that you can use to check if a time series is stationary or not.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#augmented-dickey-fuller-unit-root-test","title":"Augmented Dickey-Fuller Unit Root Test","text":"

      Tests whether a time series has a unit root, i.e. whether it is non-stationary (for example, because it has a stochastic trend).

      • Assumptions

        • Observations are temporally ordered.
      • Interpretation

        • H0: a unit root is present (series is non-stationary).
        • H1: a unit root is not present (series is stationary).
      • Python Code

        # Example of the Augmented Dickey-Fuller unit root test\nfrom statsmodels.tsa.stattools import adfuller\ndata = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\nstat, p, lags, obs, crit, t = adfuller(data)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably not Stationary')\nelse:\n    print('Probably Stationary')\n
      • Sources

        • statsmodels.tsa.stattools.adfuller API.
        • Augmented Dickey--Fuller test, Wikipedia.
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#kwiatkowski-phillips-schmidt-shin","title":"Kwiatkowski-Phillips-Schmidt-Shin","text":"

      Tests whether a time series is trend-stationary or not.

      • Assumptions

        • Observations are temporally ordered.
      • Interpretation

        • H0: the time series is trend-stationary.
        • H1: the time series is not trend-stationary.
      • Python Code

        # Example of the Kwiatkowski-Phillips-Schmidt-Shin test\nfrom statsmodels.tsa.stattools import kpss\ndata = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\nstat, p, lags, crit = kpss(data)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably Stationary')\nelse:\n    print('Probably not Stationary')\n
      • Sources

        • statsmodels.tsa.stattools.kpss API.
        • KPSS test, Wikipedia.
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#parametric-statistical-hypothesis-tests","title":"Parametric Statistical Hypothesis Tests","text":"

      This section lists statistical tests that you can use to compare data samples.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#students-t-test","title":"Student\u2019s t-test","text":"

      Tests whether the means of two independent samples are significantly different.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
      • Interpretation

        • H0: the means of the samples are equal.
        • H1: the means of the samples are unequal.
      • Python Code

        # Example of the Student's t-test\nfrom scipy.stats import ttest_ind\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = ttest_ind(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.ttest_ind
        • Student's t-test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#paired-students-t-test","title":"Paired Student\u2019s t-test","text":"

      Tests whether the means of two paired samples are significantly different.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
        • Observations across each sample are paired.
      • Interpretation

        • H0: the means of the samples are equal.
        • H1: the means of the samples are unequal.
      • Python Code

        # Example of the Paired Student's t-test\nfrom scipy.stats import ttest_rel\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = ttest_rel(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.ttest_rel
        • Student's t-test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#analysis-of-variance-test-anova","title":"Analysis of Variance Test (ANOVA)","text":"

      Tests whether the means of two or more independent samples are significantly different.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
      • Interpretation

        • H0: the means of the samples are equal.
        • H1: the means of the samples are unequal.
      • Python Code

        # Example of the Analysis of Variance Test\nfrom scipy.stats import f_oneway\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\ndata3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]\nstat, p = f_oneway(data1, data2, data3)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.f_oneway
        • Analysis of variance on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#repeated-measures-anova-test","title":"Repeated Measures ANOVA Test","text":"

      Tests whether the means of two or more paired samples are significantly different.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample are normally distributed.
        • Observations in each sample have the same variance.
        • Observations across each sample are paired.
      • Interpretation

        • H0: the means of the samples are equal.
        • H1: one or more of the means of the samples are unequal.
      • Python Code

        # Supported in Python via statsmodels.stats.anova.AnovaRM, e.g.:\n# AnovaRM(df, depvar='score', subject='subject', within=['condition']).fit()\n
      • Sources

        • statsmodels.stats.anova.AnovaRM
        • Analysis of variance on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#nonparametric-statistical-hypothesis-tests","title":"Nonparametric Statistical Hypothesis Tests","text":"

      Nonparametric tests make no assumptions about the parameters of the population distribution from which the samples are drawn. In particular, they do not require the data to follow any fixed distribution (such as the normal distribution).
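A useful intuition for the tests in this section: they depend only on the ranks of the observations, not on their raw values. As a quick check, applying any strictly monotonic transform (here `np.exp`) to both samples leaves the Mann-Whitney U result unchanged; a sketch with arbitrary synthetic data:

```python
# Rank-based tests are invariant to strictly monotonic transforms.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
data1 = rng.normal(0.0, 1.0, 30)
data2 = rng.normal(0.5, 1.0, 30)

stat1, p1 = mannwhitneyu(data1, data2)
stat2, p2 = mannwhitneyu(np.exp(data1), np.exp(data2))  # same ranks

print('raw: p=%.4f  transformed: p=%.4f' % (p1, p2))  # identical p-values
```

This rank invariance is exactly why these tests are robust to outliers and to non-normal data.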

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#mann-whitney-u-test","title":"Mann-Whitney U Test","text":"

      Tests whether the distributions of two independent samples are equal or not.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
      • Interpretation

        • H0: the distributions of both samples are equal.
        • H1: the distributions of both samples are not equal.
      • Python Code

        # Example of the Mann-Whitney U Test\nfrom scipy.stats import mannwhitneyu\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = mannwhitneyu(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.mannwhitneyu
        • Mann-Whitney U test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#wilcoxon-signed-rank-test","title":"Wilcoxon Signed-Rank Test","text":"

      Tests whether the distributions of two paired samples are equal or not.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
        • Observations across each sample are paired.
      • Interpretation

        • H0: the distributions of both samples are equal.
        • H1: the distributions of both samples are not equal.
      • Python Code

        # Example of the Wilcoxon Signed-Rank Test\nfrom scipy.stats import wilcoxon\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = wilcoxon(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.wilcoxon
        • Wilcoxon signed-rank test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#kruskal-wallis-h-test","title":"Kruskal-Wallis H Test","text":"

      Tests whether the distributions of two or more independent samples are equal or not.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
      • Interpretation

        • H0: the distributions of all samples are equal.
        • H1: the distributions of one or more samples are not equal.
      • Python Code

        # Example of the Kruskal-Wallis H Test\nfrom scipy.stats import kruskal\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\nstat, p = kruskal(data1, data2)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.kruskal
        • Kruskal-Wallis one-way analysis of variance on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#friedman-test","title":"Friedman Test","text":"

      Tests whether the distributions of two or more paired samples are equal or not.

      • Assumptions

        • Observations in each sample are independent and identically distributed (iid).
        • Observations in each sample can be ranked.
        • Observations across each sample are paired.
      • Interpretation

        • H0: the distributions of all samples are equal.
        • H1: the distributions of one or more samples are not equal.
      • Python Code

        # Example of the Friedman Test\nfrom scipy.stats import friedmanchisquare\ndata1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]\ndata2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]\ndata3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]\nstat, p = friedmanchisquare(data1, data2, data3)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same distribution')\nelse:\n    print('Probably different distributions')\n
      • Sources

        • scipy.stats.friedmanchisquare
        • Friedman test on Wikipedia
      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#equality-of-variance-test","title":"Equality of variance test","text":"

      These tests are used to assess whether two or more samples have equal variances.

      "},{"location":"Cheat-Sheets/Hypothesis-Tests/#levenes-test","title":"Levene's test","text":"

      Levene\u2019s test is used to assess the equality of variance between two or more different samples.

      • Assumptions

        • The samples from the populations under consideration are independent.
        • The populations under consideration are approximately normally distributed.
      • Interpretation

        • H0: all the sample variances are equal.
        • H1: at least one variance is different from the rest.
      • Python Code

        # Example of the Levene's test\nfrom scipy.stats import levene\na = [8.88, 9.12, 9.04, 8.98, 9.00, 9.08, 9.01, 8.85, 9.06, 8.99]\nb = [8.88, 8.95, 9.29, 9.44, 9.15, 9.58, 8.36, 9.18, 8.67, 9.05]\nc = [8.95, 9.12, 8.95, 8.85, 9.03, 8.84, 9.07, 8.98, 8.86, 8.98]\nstat, p = levene(a, b, c)\nprint('stat=%.3f, p=%.3f' % (stat, p))\nif p > 0.05:\n    print('Probably the same variances')\nelse:\n    print('Probably at least one variance is different from the rest')\n
      • Sources

        • scipy.stats.levene
        • Levene's test on Wikipedia

      Source: https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/

      "},{"location":"Deploying-ML-models/deploying-ml-models/","title":"Home","text":""},{"location":"Deploying-ML-models/deploying-ml-models/#introduction","title":"Introduction","text":"

      This is a completely open-source platform for maintaining a curated list of interview questions and answers for people seeking and preparing for data science opportunities.

      Beyond that, the platform also serves as a one-stop destination for all your needs, such as tutorials, online materials, etc.

      This platform is maintained by you! \ud83e\udd17 You can help us by answering/ improving existing questions as well as by sharing any new questions that you faced during your interviews.

      "},{"location":"Deploying-ML-models/deploying-ml-models/#contribute-to-the-platform","title":"Contribute to the platform","text":"

      Contribution in any form will be deeply appreciated. \ud83d\ude4f

      "},{"location":"Deploying-ML-models/deploying-ml-models/#add-questions","title":"Add questions","text":"

      \u2753 Add your questions here. Please ensure to provide a detailed description to allow your fellow contributors to understand your questions and answer them to your satisfaction.

      \ud83e\udd1d Please note that as of now, you cannot directly add a question via a pull request. This will help us to maintain the quality of the content for you.

      "},{"location":"Deploying-ML-models/deploying-ml-models/#add-answerstopics","title":"Add answers/topics","text":"

      \ud83d\udcdd These are the answers/topics that need your help at the moment

      • Add documentation for the project
      • Online Material for Learning
      • Suggested Learning Paths
      • Cheat Sheets
        • Django
        • Flask
        • Numpy
        • Pandas
        • PySpark
        • Python
        • RegEx
        • SQL
      • NLP Interview Questions
      • Add python common DSA interview questions
      • Add Major ML topics
        • Linear Regression
        • Logistic Regression
        • SVM
        • Random Forest
        • Gradient boosting
        • PCA
        • Collaborative Filtering
        • K-means clustering
        • kNN
        • ARIMA
        • Neural Networks
        • Decision Trees
        • Overfitting, Underfitting
        • Unbalanced, Skewed data
        • Activation functions relu/ leaky relu
        • Normalization
        • DBSCAN
        • Normal Distribution
        • Precision, Recall
        • Loss Function MAE, RMSE
      • Add Pandas questions
      • Add NumPy questions
      • Add TensorFlow questions
      • Add PyTorch questions
      • Add list of learning resources
      "},{"location":"Deploying-ML-models/deploying-ml-models/#reportsolve-issues","title":"Report/Solve Issues","text":"

      \ud83d\udd27 To report any issues find me on LinkedIn or raise an issue on GitHub.

      \ud83d\udee0 You can also solve existing issues on GitHub and create a pull request.

      "},{"location":"Deploying-ML-models/deploying-ml-models/#say-thanks","title":"Say Thanks","text":"

      \ud83d\ude0a If this platform helped you in any way, it would be great if you could share it with others.

      Check out this \ud83d\udc47 platform \ud83d\udc47 for data science content:\n\ud83d\udc49 https://singhsidhukuldeep.github.io/data-science-interview-prep/ \ud83d\udc48\n\n#data-science #machine-learning #interview-preparation \n

      You can also star the repository on GitHub and watch out for any updates

      "},{"location":"Deploying-ML-models/deploying-ml-models/#features","title":"Features","text":"
      • \ud83c\udfa8 Beautiful: The design is built on top of popular libraries like MkDocs and Material for MkDocs, which allows the platform to be responsive and to work on all sorts of devices \u2013 from mobile phones to wide screens. The underlying fluid layout always adapts to the available screen space.

      • \ud83e\uddd0 Searchable: Almost magically, all the content on the website is searchable out of the box. The built-in serverless search is fast and returns accurate results for any query.

      • \ud83d\ude4c Accessible:

        • Easy to use: \ud83d\udc4c The website is hosted on github-pages and is free and open to use for over 40 million GitHub users in 100+ countries.
        • Easy to contribute: \ud83e\udd1d The website embodies the concept of collaboration to the letter, allowing anyone to add to or improve the content. To make contributing easy, everything is written in Markdown and then compiled into beautiful HTML.
      "},{"location":"Deploying-ML-models/deploying-ml-models/#setup","title":"Setup","text":"

      No setup is required to use the platform

      Important: It is strongly advised to use a virtual environment and not to change anything in the gh-pages branch

      "},{"location":"Deploying-ML-models/deploying-ml-models/#linux-systems","title":"Linux Systems","text":"
      python3 -m venv ./venv\n\nsource venv/bin/activate\n\npip3 install -r requirements.txt\n
      deactivate\n
      "},{"location":"Deploying-ML-models/deploying-ml-models/#windows-systems","title":"Windows Systems","text":"
      python3 -m venv ./venv\n\nvenv\\Scripts\\activate\n\npip3 install -r requirements.txt\n
      venv\\Scripts\\deactivate\n
      "},{"location":"Deploying-ML-models/deploying-ml-models/#to-install-the-latest","title":"To install the latest","text":"
      pip3 install mkdocs\npip3 install mkdocs-material\n
      "},{"location":"Deploying-ML-models/deploying-ml-models/#useful-commands","title":"Useful Commands","text":"
      • mkdocs serve - Start the live-reloading docs server.
      • mkdocs build - Build the documentation site.
      • mkdocs -h - Print help message and exit.
      • mkdocs gh-deploy - Use\u00a0mkdocs gh-deploy --help\u00a0to get a full list of options available for the\u00a0gh-deploy\u00a0command. Be aware that you will not be able to review the built site before it is pushed to GitHub. Therefore, you may want to verify any changes you make to the docs beforehand by using the\u00a0build\u00a0or\u00a0serve\u00a0commands and reviewing the built files locally.
      • mkdocs new [dir-name] - Create a new project. (Not needed here; this project already exists.)
      "},{"location":"Deploying-ML-models/deploying-ml-models/#useful-documents","title":"Useful Documents","text":"
      • \ud83d\udcd1 MkDocs: https://github.com/mkdocs/mkdocs

      • \ud83c\udfa8 Theme: https://github.com/squidfunk/mkdocs-material

      "},{"location":"Deploying-ML-models/deploying-ml-models/#faq","title":"FAQ","text":"
      • Can I filter questions based on companies? \ud83e\udd2a

        As much as this platform aims to help you with your interview preparation, it is not a shortcut to crack one. Think of it as a practice field to help you sharpen your skills for your interviews. However, for your convenience, we have sorted all the questions by topic. \ud83e\udd13

        This doesn't mean that such a feature won't be added in the future. "Never say Never"

        But as of now there is neither a plan nor the data to do so. \ud83d\ude22

      • Why is this platform free? \ud83e\udd17

        Currently there is no major cost involved in maintaining this platform other than the time and effort put in by every contributor. If you want to help, you can contribute here.

        If you still want to pay for something that is free, we would request you to donate it to a charity of your choice instead. \ud83d\ude07

      "},{"location":"Deploying-ML-models/deploying-ml-models/#credits","title":"Credits","text":""},{"location":"Deploying-ML-models/deploying-ml-models/#maintained-by","title":"Maintained by","text":"

      \ud83d\udc68\u200d\ud83c\udf93 Kuldeep Singh Sidhu

      Github: github/singhsidhukuldeep https://github.com/singhsidhukuldeep

      Website: Kuldeep Singh Sidhu (Website) http://kuldeepsinghsidhu.com

      LinkedIn: Kuldeep Singh Sidhu (LinkedIn) https://www.linkedin.com/in/singhsidhukuldeep/

      "},{"location":"Deploying-ML-models/deploying-ml-models/#contributors","title":"Contributors","text":"

      \ud83d\ude0e The full list of all the contributors is available here

      "},{"location":"Deploying-ML-models/deploying-ml-models/#current-status","title":"Current Status","text":""},{"location":"Interview-Questions/Interview-Questions/","title":"Interview Questions (Intro)","text":"

      These are currently the most commonly asked questions. Questions may be removed if they are no longer popular in interview circles, and new ones added as new question banks are released.

      "},{"location":"Interview-Questions/Natural-Language-Processing/","title":"NLP Interview Questions","text":""},{"location":"Interview-Questions/Probability/","title":"Probability Interview Questions","text":"
      • Probability Interview Questions
        • Average score on a dice roll of at most 3 times
      "},{"location":"Interview-Questions/Probability/#average-score-on-a-dice-role-of-at-most-3-times","title":"Average score on a dice roll of at most 3 times","text":"

      Question

      Consider a fair 6-sided dice. Your aim is to get the highest score you can, in at most 3 rolls.

      A score is defined as the number on the face of the dice facing up after a roll. You can roll at most 3 times, and after every roll it is up to you to decide whether you want to roll again.

      The last score counts as your final score.

      • What is the average score if you roll the dice only once?
      • What is the average score you can get with at most 3 rolls?
      • If the dice is fair, why are the average scores for at most 3 rolls and for 1 roll not the same?
      Hint 1

      Find the expected score of a single roll.

      You roll again only when the current score is below the expected score of a single roll.

      Eg: if the expected score of a single roll comes out to be 4.5, you would roll again only on 1, 2, 3, 4 and not on 5, 6.

      Answer

      If you roll a fair dice once you can get:

      | Score | Probability |
      | ----- | ----------- |
      | 1 | \u2159 |
      | 2 | \u2159 |
      | 3 | \u2159 |
      | 4 | \u2159 |
      | 5 | \u2159 |
      | 6 | \u2159 |

      So your average score with one roll is:

      sum of (score * score's probability) = (1+2+3+4+5+6) * (\u2159) = 21/6 = 3.5

      The average score if you rolled the dice only once is 3.5

      For at most 3 rolls, let's try backtracking. Say you have just made your second roll and have to decide whether to make a 3rd roll.

      We just found out that a single roll has an expected score of 3.5. So we roll the 3rd time only if the score on the 2nd roll is less than 3.5, i.e. 1, 2 or 3.

      Possibilities

      | 2nd roll score | Probability | Expected 3rd roll score | Probability |
      | --- | --- | --- | --- |
      | 1 | \u2159 | 3.5 | \u2159 |
      | 2 | \u2159 | 3.5 | \u2159 |
      | 3 | \u2159 | 3.5 | \u2159 |
      | 4 | \u2159 | NA (no 3rd roll if 2nd score > 3) | |
      | 5 | \u2159 | NA | |
      | 6 | \u2159 | NA | |

      So with at most 2 rolls, the average score would be:

      [We roll again if current score is less than 3.5]\n(3.5)*(1/6) + (3.5)*(1/6) + (3.5)*(1/6)\n+\n(4)*(1/6) + (5)*(1/6) + (6)*(1/6) [Decide not to roll again]\n=\n1.75 + 2.5 = 4.25\n

      The average score if you rolled the dice at most twice is 4.25

      Now look from the perspective of the first roll: we roll again only if our score is less than 4.25, i.e. 1, 2, 3 or 4.

      Possibilities

      | 1st roll score | Probability | Expected score of 2nd and 3rd rolls | Probability |
      | --- | --- | --- | --- |
      | 1 | \u2159 | 4.25 | \u2159 |
      | 2 | \u2159 | 4.25 | \u2159 |
      | 3 | \u2159 | 4.25 | \u2159 |
      | 4 | \u2159 | 4.25 | \u2159 |
      | 5 | \u2159 | NA (no re-roll if 1st score > 4.25) | |
      | 6 | \u2159 | NA | |

      So with at most 3 rolls, the average score would be:

      [We roll again if current score is less than 4.25]\n(4.25)*(1/6) + (4.25)*(1/6) + (4.25)*(1/6) + (4.25)*(1/6)\n+\n(5)*(1/6) + (6)*(1/6) [Decide not to roll again]\n=\n17/6 + 11/6 = 28/6 ≈ 4.67\n
      The average score with at most 3 rolls is 28/6 ≈ 4.67

      The average score for at most 3 rolls and for 1 roll is not the same because, although the dice is fair, the decision to roll again depends on the previous outcome: we keep high scores and re-roll low ones. The averages would have been the same if we made the 2nd and 3rd rolls regardless of what we got on the previous roll, i.e. if each roll were taken unconditionally.
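The backtracking above can be checked with a short exact computation (a sketch; `expected_score` is a name of my choosing, not from the question):

```python
from fractions import Fraction

def expected_score(max_rolls):
    # Expected final score under the optimal stop rule with at most `max_rolls` rolls.
    # With 1 roll left you must keep whatever you get: E = 21/6 = 3.5.
    exp = Fraction(21, 6)
    for _ in range(max_rolls - 1):
        # With another roll in hand, re-roll only faces below the
        # continuation value `exp`; keep faces at or above it.
        exp = sum(max(Fraction(face), exp) for face in range(1, 7)) / 6
    return exp

print(float(expected_score(1)))  # 3.5
print(float(expected_score(2)))  # 4.25
print(float(expected_score(3)))  # ≈ 4.67
```

This reproduces the three averages derived above: 3.5, 4.25, and 28/6.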

      "},{"location":"Interview-Questions/System-design/","title":"System Design","text":""},{"location":"Interview-Questions/data-structures-algorithms/","title":"Data Structure and Algorithms (DSA)","text":"
      • Data Structure and Algorithms (DSA)
        • To-do
        • \ud83d\ude01 Easy
          • Two Number Sum
          • Validate Subsequence
          • Nth Fibonacci
          • Product Sum
        • \ud83d\ude42 Medium
          • Top K Frequent Words
        • \ud83e\udd28 Hard
        • \ud83d\ude32 Very Hard
      "},{"location":"Interview-Questions/data-structures-algorithms/#to-do","title":"To-do","text":"
      • Add https://leetcode.com/discuss/interview-question/344650/Amazon-Online-Assessment-Questions
      "},{"location":"Interview-Questions/data-structures-algorithms/#easy","title":"\ud83d\ude01 Easy","text":""},{"location":"Interview-Questions/data-structures-algorithms/#two-number-sum","title":"Two Number Sum","text":"

      Question

      Write a function that takes in a non-empty array of distinct integers and an integer representing a target sum.

      If any two numbers in the input array sum up to the target sum, the function should return them in an array, in any order.

      If no two numbers sum up to the target sum, the function should return an empty array.

      Try it!
      • LeetCode: https://leetcode.com/problems/two-sum/
      Hint 1

      No Hint

      Answer

      # O(n) time | O(n) space\ndef twoNumberSum(array, targetSum):\n    seen = set()\n    for v in array:\n        if targetSum - v in seen:\n            return [targetSum - v, v]\n        seen.add(v)\n    return []\n
      # O(nlog(n)) time | O(1) space\ndef twoNumberSum(array, targetSum):\n    array.sort()\n    left, right = 0, len(array) - 1\n    while left < right:\n        currSum = array[left] + array[right]\n        if currSum == targetSum:\n            return [array[left], array[right]]\n        elif currSum < targetSum:\n            left += 1\n        else:\n            right -= 1\n    return []\n
      # O(n^2) time | O(1) space\ndef twoNumberSum(array, targetSum):\n    n = len(array)\n    for i in range(n - 1):\n        for j in range(i + 1, n):\n            if array[i] + array[j] == targetSum:\n                return [array[i], array[j]]\n    return []\n
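A quick self-contained sanity check of the hash-set approach (the inputs below are my own examples, not from the question):

```python
def twoNumberSum(array, targetSum):
    # O(n) time | O(n) space: remember the values seen so far in a set
    seen = set()
    for v in array:
        if targetSum - v in seen:
            return [targetSum - v, v]
        seen.add(v)
    return []

print(twoNumberSum([3, 5, -4, 8, 11, 1, -1, 6], 10))  # [11, -1]
print(twoNumberSum([1, 2, 3], 100))                   # []
```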

      "},{"location":"Interview-Questions/data-structures-algorithms/#validate-subsequence","title":"Validate Subsequence","text":"

      Question

      Given two non-empty arrays of integers, write a function that determines whether the second array is a subsequence of the first one.

      A subsequence of an array is a sequence of numbers that aren't necessarily adjacent in the array but that appear in the same order as they do in the array. For instance, the numbers [1, 3, 4] form a subsequence of the array [1, 2, 3, 4], and so do the numbers [2, 4].

      Note that a single number in an array and the array itself are both valid subsequences of the array.

      Try it!
      • GeeksforGeeks: https://www.geeksforgeeks.org/problems/array-subset-of-another-array2317/1
      Hint 1

      No Hint

      Answer
      # O(n) time | O(1) space - where n is the length of the array\ndef isValidSubsequence(array, sequence):\n    pArray = pSequence = 0\n    while pArray < len(array) and pSequence < len(sequence):\n        if array[pArray] == sequence[pSequence]:\n            pSequence += 1\n        pArray += 1\n    return pSequence == len(sequence)\n
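A runnable check of the two-pointer idea (example inputs are my own):

```python
def isValidSubsequence(array, sequence):
    # Walk through `array` once, advancing a pointer into `sequence`
    # whenever the next expected value is found in order.
    p_seq = 0
    for value in array:
        if p_seq == len(sequence):
            break
        if value == sequence[p_seq]:
            p_seq += 1
    return p_seq == len(sequence)

print(isValidSubsequence([5, 1, 22, 25, 6, -1, 8, 10], [1, 6, -1, 10]))  # True
print(isValidSubsequence([1, 2, 3, 4], [4, 2]))                          # False
```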
      "},{"location":"Interview-Questions/data-structures-algorithms/#nth-fibonacci","title":"Nth Fibonacci","text":"

      Question

      The Fibonacci sequence is defined as follows: any number in the sequence is the sum of the previous 2:

      fib[n] = fib[n-1] + fib[n-2]

      The 1st and 2nd terms are fixed at 0 and 1.

      Find the Nth Fibonacci number.

      Try it!
      • LeetCode: https://leetcode.com/problems/fibonacci-number/
      • GeeksforGeeks: https://www.geeksforgeeks.org/problems/nth-fibonacci-number1335/1
      Hint 1

      No Hint

      Answer
      # O(n) time | O(n) space\ndef getNthFib(n):\n    dp = [0, 1]\n    while len(dp) < n:\n        dp.append(dp[-1] + dp[-2])\n    return dp[n - 1]\n
      # O(n) time | O(1) space\ndef getNthFib(n):\n    last_two = [0, 1]\n    count = 2\n    while count < n:\n        last_two = [last_two[1], last_two[0] + last_two[1]]\n        count += 1\n    return last_two[1] if n > 1 else last_two[0]\n
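A compact variant of the O(1)-space solution using tuple unpacking, with the first few terms printed as a check (a sketch, equivalent to the answers above):

```python
def getNthFib(n):
    # Iterative O(n) time, O(1) space; 1st and 2nd terms fixed at 0 and 1
    a, b = 0, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a

print([getNthFib(n) for n in range(1, 9)])  # [0, 1, 1, 2, 3, 5, 8, 13]
```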
      "},{"location":"Interview-Questions/data-structures-algorithms/#product-sum","title":"Product Sum","text":"

      Question

      Write a function that takes in a \"special\" array and returns its product sum. A \"special\" array is a non-empty array that contains either integers or other \"special\" arrays. The product sum of a \"special\" array is the sum of its elements, where \"special\" arrays inside it are summed themselves and then multiplied by their level of depth.

      For example, the product sum of [x, y] is x + y ; the product sum of [x, [y, z]] is x + 2y + 2z

      Eg: Input: [5, 2, [7, -1], 3, [6, [-13, 8], 4]] Output: 12 # calculated as: 5 + 2 + 2 * (7 - 1) + 3 + 2 * (6 + 3 * (-13 + 8) + 4)

      Hint 1

      No Hint

      Answer
      # O(n) time | O(d) space - where n is the total number of elements in the array,\n# including sub-elements, and d is the greatest depth of \"special\" arrays in the array\ndef productSum(array, depth=1):\n    total = 0\n    for v in array:\n        if type(v) is list:\n            total += productSum(v, depth + 1)\n        else:\n            total += v\n    return total * depth\n
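The recursive solution can be verified against the example from the question:

```python
def productSum(array, depth=1):
    # Sum the elements; recurse into sub-arrays with depth + 1,
    # then multiply the whole level's sum by its depth.
    total = 0
    for v in array:
        if isinstance(v, list):
            total += productSum(v, depth + 1)
        else:
            total += v
    return total * depth

print(productSum([5, 2, [7, -1], 3, [6, [-13, 8], 4]]))  # 12
```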
      "},{"location":"Interview-Questions/data-structures-algorithms/#medium","title":"\ud83d\ude42 Medium","text":""},{"location":"Interview-Questions/data-structures-algorithms/#top-k-frequent-words","title":"Top K Frequent Words","text":"

      Question

      Given a non-empty list of words, return the\u00a0k\u00a0most frequent elements.

      Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

      Example 1:

      Input: [\"i\", \"love\", \"leetcode\", \"i\", \"love\", \"coding\"], k = 2\nOutput: [\"i\", \"love\"]\nExplanation: \"i\" and \"love\" are the two most frequent words.\n    Note that \"i\" comes before \"love\" due to a lower alphabetical order.\n

      Example 2:

      Input: [\"the\", \"day\", \"is\", \"sunny\", \"the\", \"the\", \"the\", \"sunny\", \"is\", \"is\"], k = 4\nOutput: [\"the\", \"is\", \"sunny\", \"day\"]\nExplanation: \"the\", \"is\", \"sunny\" and \"day\" are the four most frequent words,\n    with the number of occurrence being 4, 3, 2 and 1 respectively.\n
      Note:

      1. You may assume\u00a0k\u00a0is always valid, 1 \u2264\u00a0k\u00a0\u2264 number of unique elements.
      2. Input words contain only lowercase letters.

      Follow up:

      1. Try to solve it in\u00a0O(n\u00a0log\u00a0k) time and\u00a0O(n) extra space.
      Try it!
      • LeetCode: https://leetcode.com/problems/top-k-frequent-words/
      Hint 1

      No Hint

      Answer
      # Count the frequency of each word, and sort the words with a custom\n# ordering relation that uses these frequencies. Then take the best k of them.\n\n# Time Complexity: O(N log N), where N is the length of words.\n# We count the frequency of each word in O(N) time,\n# then we sort the unique words in O(N log N) time.\n# Space Complexity: O(N), the space used to store uniqueWords.\nfrom collections import Counter\n\ndef topKFrequentWords(words, k):\n    wordsFreq = Counter(words)\n    uniqueWords = list(wordsFreq.keys())\n    uniqueWords.sort(key=lambda x: (-wordsFreq[x], x))\n    return uniqueWords[:k]\n
      # Time Complexity: O(N \\log{k})O(Nlogk), where NN is the length of words. \n# We count the frequency of each word in O(N)O(N) time, then we add NN words to the heap, \n# each in O(\\log {k})O(logk) time. Finally, we pop from the heap up to kk times. \n# As k \\leq Nk\u2264N, this is O(N \\log{k})O(Nlogk) in total.\n\n# In Python, we improve this to O(N + k \\log {N})O(N+klogN): our heapq.heapify operation and \n# counting operations are O(N)O(N), and \n# each of kk heapq.heappop operations are O(\\log {N})O(logN).\n\n# Space Complexity: O(N)O(N), the space used to store our wordsFreq.\n\n# Count the frequency of each word, then add it to heap that stores the best k candidates. \n# Here, \"best\" is defined with our custom ordering relation, \n# which puts the worst candidates at the top of the heap. \n# At the end, we pop off the heap up to k times and reverse the result \n# so that the best candidates are first.\n\n# In Python, we instead use heapq.heapify, which can turn a list into a heap in linear time, \n# simplifying our work.\n\ndef topKFrequentWords(words, k)-> List[str]:\n    from heapq import heapify, heappop#, heappush\n    from collections import Counter\n    wordsFreq = Counter(words)\n    heap = [(-freq, word) for word, freq in wordsFreq.items()]\n    heapq.heapify(heap)\n    return [heapq.heappop(heap)[1] for _ in range(k)]\n
      "},{"location":"Interview-Questions/data-structures-algorithms/#hard","title":"\ud83e\udd28 Hard","text":""},{"location":"Interview-Questions/data-structures-algorithms/#very-hard","title":"\ud83d\ude32 Very Hard","text":""}]} \ No newline at end of file diff --git a/stylesheets/extra.css b/stylesheets/extra.css new file mode 100644 index 0000000..9711f7b --- /dev/null +++ b/stylesheets/extra.css @@ -0,0 +1,72 @@ +:root { + --md-admonition-icon--interview-questions: url('data:image/svg+xml;charset=utf-8,') + } + .md-typeset .admonition.interview-questions, + .md-typeset details.interview-questions { + border-color: rgb(43, 155, 70); + } + .md-typeset .interview-questions > .admonition-title, + .md-typeset .interview-questions > summary { + background-color: rgba(43, 155, 70, 0.1); + } + .md-typeset .interview-questions > .admonition-title::before, + .md-typeset .interview-questions > summary::before { + background-color: rgb(43, 155, 70); + -webkit-mask-image: var(--md-admonition-icon--interview-questions); + mask-image: var(--md-admonition-icon--interview-questions); + } + +:root { + --md-admonition-icon--cheat-sheet: url('data:image/svg+xml;charset=utf-8,') + } + .md-typeset .admonition.cheat-sheet, + .md-typeset details.cheat-sheet { + border-color: rgb(43, 155, 70); + } + .md-typeset .cheat-sheet > .admonition-title, + .md-typeset .cheat-sheet > summary { + background-color: rgba(43, 155, 70, 0.1); + } + .md-typeset .cheat-sheet > .admonition-title::before, + .md-typeset .cheat-sheet > summary::before { + background-color: rgb(43, 155, 70); + -webkit-mask-image: var(--md-admonition-icon--cheat-sheet); + mask-image: var(--md-admonition-icon--cheat-sheet); + } + + :root { + --md-admonition-icon--ml-algo: url('data:image/svg+xml;charset=utf-8,') + } + .md-typeset .admonition.ml-algo, + .md-typeset details.ml-algo { + border-color: rgb(43, 155, 70); + } + .md-typeset .ml-algo > .admonition-title, + .md-typeset .ml-algo > summary { + background-color: rgba(43, 
155, 70, 0.1); + } + .md-typeset .ml-algo > .admonition-title::before, + .md-typeset .ml-algo > summary::before { + background-color: rgb(43, 155, 70); + -webkit-mask-image: var(--md-admonition-icon--ml-algo); + mask-image: var(--md-admonition-icon--ml-algo); + } + +:root { + --md-admonition-icon--online-resources: url('data:image/svg+xml;charset=utf-8,') + } + .md-typeset .admonition.online-resources, + .md-typeset details.online-resources { + border-color: rgb(43, 155, 70); + } + .md-typeset .online-resources > .admonition-title, + .md-typeset .online-resources > summary { + background-color: rgba(43, 155, 70, 0.1); + } + .md-typeset .online-resources > .admonition-title::before, + .md-typeset .online-resources > summary::before { + background-color: rgb(43, 155, 70); + -webkit-mask-image: var(--md-admonition-icon--online-resources); + mask-image: var(--md-admonition-icon--online-resources); + } +