ddd
jlgmp committed Apr 24, 2024
1 parent c653f31 commit d1f9ebd
Showing 2 changed files with 49 additions and 36 deletions.
@@ -44,42 +44,7 @@
"id": "f6ca5fc4",
"metadata": {},
"source": [
"# Bagging\n",
"\n",
"In previous sections, we explored different classification algorithms as well as techniques that can be used to properly validate and evaluate the quality of your models.\n",
"\n",
"Now, suppose that we have chosen the best possible model for a particular problem and are struggling to further improve its accuracy. In this case, we would need to apply some more advanced machine learning techniques that are collectively referred to as *ensembles*.\n",
"\n",
"An *ensemble* is a set of elements that collectively contribute to a whole. A familiar example is a musical ensemble, which blends the sounds of several musical instruments to create harmony, or architectural ensembles, which are a set of buildings designed as a unit. In ensembles, the (whole) harmonious outcome is more important than the performance of any individual part."
]
},
{
"cell_type": "markdown",
"id": "4ea30c2c",
"metadata": {},
"source": [
"## Ensembles\n",
"\n",
"[Condorcet's jury theorem](https://en.wikipedia.org/wiki/Condorcet%27s_jury_theorem) (1784) can be read as a statement about ensembles. It states that if each member of a jury makes an independent judgment and each juror is correct with probability greater than 0.5, then the probability that the whole jury reaches the correct decision increases with the number of jurors and tends to one. Conversely, if each juror is correct with probability less than 0.5, then the probability that the whole jury is correct decreases with the number of jurors and tends to zero.\n",
"\n",
"Let's write an analytic expression for this theorem:\n",
"\n",
"- $\\large N$ is the total number of jurors;\n",
"- $\\large m$ is the minimal number of jurors that would make a majority, that is $\\large m = \\lfloor N/2 \\rfloor + 1$;\n",
"- $\\large {N \\choose i}$ is the number of $\\large i$-combinations from a set with $\\large N$ elements;\n",
"- $\\large p$ is the probability of the correct decision by a juror;\n",
"- $\\large \\mu$ is the probability of the correct decision by the whole jury.\n",
"\n",
"Then:\n",
"\n",
"$$ \\large \\mu = \\sum_{i=m}^{N}{N\\choose i}p^i(1-p)^{N-i} $$\n",
"\n",
"For odd $\\large N \\geq 3$ and $\\large 0.5 < p < 1$, it can be shown that $\\large \\mu > p$. In addition, if $\\large p > 0.5$ and $\\large N \\rightarrow \\infty $, then $\\large \\mu \\rightarrow 1$.\n",
"\n",
"Let's look at another example of ensembles: an observation known as [Wisdom of the crowd](https://en.wikipedia.org/wiki/Wisdom_of_the_crowd). <img src=\"https://habrastorage.org/webt/zg/hw/b7/zghwb7oztkmv840odqkjpink1vw.png\" align=\"right\" width=15% height=15%> In 1906, [Francis Galton](https://en.wikipedia.org/wiki/Francis_Galton) visited a country fair in Plymouth where he saw a contest being held for farmers. 800 participants tried to estimate the weight of a slaughtered bull. The real weight of the bull was 1198 pounds. Although none of the farmers could guess the exact weight of the animal, the average of their predictions was 1197 pounds.\n",
"\n",
"\n",
"A similar idea for error reduction was adopted in the field of Machine Learning."
"# Bagging"
]
},
{
@@ -47,6 +47,54 @@
"# Getting started with ensemble learning"
]
},
{
"cell_type": "markdown",
"id": "239f071d",
"metadata": {},
"source": [
"In previous sections, we explored different classification algorithms as well as techniques that can be used to properly validate and evaluate the quality of your models.\n",
"\n",
"Now, suppose that we have chosen the best possible model for a particular problem and are struggling to further improve its accuracy. In this case, we would need to apply some more advanced machine learning techniques that are collectively referred to as *ensembles*.\n",
"\n",
"An *ensemble* is a set of elements that collectively contribute to a whole. A familiar example is a musical ensemble, which blends the sounds of several musical instruments to create harmony, or architectural ensembles, which are a set of buildings designed as a unit. In ensembles, the (whole) harmonious outcome is more important than the performance of any individual part."
]
},
{
"cell_type": "markdown",
"id": "2eff740a",
"metadata": {},
"source": [
"## Ensembles\n",
"\n",
"[Condorcet's jury theorem](https://en.wikipedia.org/wiki/Condorcet%27s_jury_theorem) (1784) can be read as a statement about ensembles. It states that if each member of a jury makes an independent judgment and each juror is correct with probability greater than 0.5, then the probability that the whole jury reaches the correct decision increases with the number of jurors and tends to one. Conversely, if each juror is correct with probability less than 0.5, then the probability that the whole jury is correct decreases with the number of jurors and tends to zero.\n",
"\n",
"Let's write an analytic expression for this theorem:\n",
"\n",
"- $\\large N$ is the total number of jurors;\n",
"- $\\large m$ is the minimal number of jurors that would make a majority, that is $\\large m = \\lfloor N/2 \\rfloor + 1$;\n",
"- $\\large {N \\choose i}$ is the number of $\\large i$-combinations from a set with $\\large N$ elements;\n",
"- $\\large p$ is the probability of the correct decision by a juror;\n",
"- $\\large \\mu$ is the probability of the correct decision by the whole jury.\n",
"\n",
"Then:\n",
"\n",
"$$ \\large \\mu = \\sum_{i=m}^{N}{N\\choose i}p^i(1-p)^{N-i} $$\n",
"\n",
"For odd $\\large N \\geq 3$ and $\\large 0.5 < p < 1$, it can be shown that $\\large \\mu > p$. In addition, if $\\large p > 0.5$ and $\\large N \\rightarrow \\infty $, then $\\large \\mu \\rightarrow 1$.\n",
"\n",
"> ... whenever we are faced with making a decision that has some important consequence, we often seek the opinions of different “experts” to help us make that decision ...\n",
">\n",
"> Page 2, *Ensemble Machine Learning*, 2012.\n",
"\n",
"Let's look at another example of ensembles: an observation known as [Wisdom of the crowd](https://en.wikipedia.org/wiki/Wisdom_of_the_crowd). <img src=\"https://habrastorage.org/webt/zg/hw/b7/zghwb7oztkmv840odqkjpink1vw.png\" align=\"right\" width=15% height=15%> In 1906, [Francis Galton](https://en.wikipedia.org/wiki/Francis_Galton) visited a country fair in Plymouth where he saw a contest being held for farmers. 800 participants tried to estimate the weight of a slaughtered bull. The real weight of the bull was 1198 pounds. Although none of the farmers could guess the exact weight of the animal, the average of their predictions was 1197 pounds.\n",
"\n",
"\n",
"A similar idea for error reduction was adopted in the field of Machine Learning."
]
},
{
"cell_type": "code",
"execution_count": 2,
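The jury-theorem expression in the moved "Ensembles" cell, $\mu = \sum_{i=m}^{N}{N\choose i}p^i(1-p)^{N-i}$ with $m = \lfloor N/2 \rfloor + 1$, can be checked numerically. The snippet below is a minimal sketch and is not part of the commit; the function name `jury_accuracy` and the chosen values of `N` and `p` are illustrative assumptions.

```python
from math import comb


def jury_accuracy(n_jurors: int, p: float) -> float:
    """Probability that a strict majority of independent jurors is correct.

    Hypothetical helper: n_jurors plays the role of N and p is the
    per-juror probability of a correct decision, as in the notebook text.
    """
    m = n_jurors // 2 + 1  # minimal strict majority, floor(N/2) + 1
    return sum(
        comb(n_jurors, i) * p**i * (1 - p) ** (n_jurors - i)
        for i in range(m, n_jurors + 1)
    )


if __name__ == "__main__":
    # Odd jury sizes avoid ties; p = 0.6 is an arbitrary illustrative value.
    for n in (1, 3, 11, 101):
        print(f"N={n:>3}  mu={jury_accuracy(n, 0.6):.4f}")
```

With `p = 0.6`, a jury of three gives `mu = 0.648`, already above the single-juror value, and `mu` keeps growing for larger odd `N`. An even jury can do worse than a single juror under this definition because ties count as incorrect: `jury_accuracy(2, 0.6)` is `0.36`.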
