
My question is actually related to the addition of probabilities. I am reading on computational learning theory from Tom Mitchell's machine learning book.

In chapter 7, when proving the upper bound on the probability that the version space is not $\epsilon$-exhausted (theorem 7.1), it says that the probability that at least one of the $k$ hypotheses in the hypothesis space $H$ with true error greater than $\epsilon$ is consistent with the *m* training examples is at most $k(1- \epsilon)^m$.

I understand that the probability that a single such hypothesis $h$ is consistent with *m* training examples is at most $(1-\epsilon)^m$. However, why is it valid to simply add the probabilities across the $k$ hypotheses? And couldn't the resulting bound be greater than 1 in that case?
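A small numerical sketch may make the union bound concrete. The values of $k$, $\epsilon$, and $m$ below are arbitrary illustrations, not from the book. For comparison, the script also computes the exact probability *under the extra assumption* that the $k$ hypotheses are consistent independently of one another; the union bound itself requires no such assumption.

```python
# Union bound k*(1-eps)^m vs. the exact probability that at least one
# of k "bad" hypotheses (true error eps) is consistent with m examples.
# Illustrative parameters (not from Mitchell's book):
k, eps, m = 10, 0.1, 5

# P(one bad hypothesis is consistent with all m i.i.d. examples)
p_one = (1 - eps) ** m

# Mitchell's union bound: may exceed 1, in which case it is vacuous
# but still a valid upper bound on a probability.
union_bound = k * p_one

# Exact value IF the k events were independent (an assumption made
# here only for comparison; in general they are not independent).
exact_if_independent = 1 - (1 - p_one) ** k

print(f"union bound          = {union_bound:.4f}")   # exceeds 1 here
print(f"exact (independent)  = {exact_if_independent:.4f}")
```

The union bound $P(A_1 \cup \dots \cup A_k) \le \sum_i P(A_i)$ holds for *any* events, overlapping or not, which is exactly why the proof can add the $k$ probabilities; the price is that the sum can exceed 1, in which case the bound is trivially true but uninformative.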

What is the opposite of "at least 1"? – nbro – 2020-04-29T04:08:19.670

That's equal to 1 - P(none are consistent)? And P(none of the $k$ hypotheses is consistent) = $\epsilon^{km}$? So that's $1 - \epsilon^{km}$? – calveeen – 2020-04-29T04:14:09.983