## Naive Bayes

*Definition*: The **naive Bayes assumption** on events \(A_1, A_2, \ldots, A_n\) says that these events are mutually independent.

This assumption makes the computation of the classifier much easier. The details of this computation will be shown shortly.

## Naive Bayes classifier

*Definition*: A **classifier** is a function \(M : Pow(F) \to C\) where \(F = \{f_1, f_2, \ldots, f_n\}\) is a set of features and \(C = \{c_1, c_2, \ldots, c_m\}\) is a set of categories. The job of the classifier is to take some subset \(A \subseteq F\) and predict which category \(c \in C\) it belongs to.

*Example*: Let \(D\) be a text document. We choose the features of \(D\) to be the words of \(D\). That is to say, if \(\text{“proof”} \in D\), then \(f_{\text{“proof”}}\) is a feature of \(D\).
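The word-features idea in this example can be sketched in Python. This is only an illustration: the function name `word_features` and the tokenization rule (lowercase, runs of letters and apostrophes) are assumptions, not something fixed by the definition above.

```python
import re


def word_features(document):
    """Return the set of distinct lowercase words in a document.

    The result plays the role of the subset A of F handed to the classifier.
    """
    return set(re.findall(r"[a-z']+", document.lower()))


features = word_features("The proof of the theorem is short.")
print("proof" in features)  # membership test: is f_"proof" a feature of D?
```

Because the result is a set, a word that appears several times in the document still counts as a single feature.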

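*Definition*: Let \(c \in C\) and \(f_i \in F\). The **Bayesian classifier** \(M\) is defined as

\(M(f_1, f_2, \ldots, f_n) = \operatorname{argmax}_c Pr(c | f_1, \ldots, f_n) = \operatorname{argmax}_c \frac{Pr(c) Pr(f_1, \ldots, f_n | c)}{Pr(f_1, \ldots, f_n)}\)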

Notice that the denominator \(Pr(f_1, \ldots, f_n)\) is the same for every class \(c \in C\). Dropping this term therefore does not change which class attains the maximum. In other words, we achieve the same result with:

\(M(f_1, f_2, \ldots, f_n) = \operatorname{argmax}_c Pr(c) Pr(f_1, \ldots, f_n | c)\)

However, even with this simplification, a problem remains. This definition requires estimates of the joint likelihoods \(Pr(f_1, \ldots, f_n | c)\). In general, any particular combination of features may be rare or entirely absent from our dataset, so these likelihoods cannot be estimated reliably. Because of this, we relax the requirement in the next definition, under the assumption that each feature acts independently of the others when forming a hypothesis.

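*Definition*: Let \(c \in C\) and \(f_i \in F\), and assume that the events \(f_1, \ldots, f_n\) are all independent events (more precisely, independent given the class \(c\)). The **naive Bayesian classifier** is defined as

\(M(f_1, f_2, \ldots, f_n) = \operatorname{argmax}_c Pr(c) \prod_{i=1}^{n} Pr(f_i | c)\)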
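As a rough illustration of the naive Bayesian decision rule, the sketch below scores each class by its prior times the product of the per-feature likelihoods and returns the class with the largest score. The categories, words, and every probability in `priors` and `likelihoods` are hypothetical values chosen only for the example, and the function name `naive_bayes` is likewise an assumption.

```python
from math import prod

# Hypothetical probability tables, made up purely for illustration.
priors = {"math": 0.5, "sports": 0.5}  # Pr(c)
likelihoods = {  # Pr(f | c)
    "math": {"proof": 0.30, "goal": 0.05},
    "sports": {"proof": 0.02, "goal": 0.40},
}


def naive_bayes(features, priors, likelihoods):
    """Return the class c maximizing Pr(c) * product of Pr(f | c) over the features."""

    def score(c):
        return priors[c] * prod(likelihoods[c][f] for f in features)

    return max(priors, key=score)


print(naive_bayes({"proof"}, priors, likelihoods))
```

Only one small table per class is needed here, one entry per feature, rather than one entry per combination of features.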

### Python implementation

To create a working version of a naive Bayes classifier, we need to start with some data. The following dataset is quite common in machine learning texts. It records the weather conditions on 14 days and whether those conditions were suitable for playing tennis.

| Day | Outlook | Temperature | Humidity | Wind | Play Tennis? |
|---|---|---|---|---|---|
| 1 | Sunny | Hot | High | Weak | No |
| 2 | Sunny | Hot | High | Strong | No |
| 3 | Overcast | Hot | High | Weak | Yes |
| 4 | Rain | Mild | High | Weak | Yes |
| 5 | Rain | Cool | Normal | Weak | Yes |
| 6 | Rain | Cool | Normal | Strong | No |
| 7 | Overcast | Cool | Normal | Strong | Yes |
| 8 | Sunny | Mild | High | Weak | No |
| 9 | Sunny | Cool | Normal | Weak | Yes |
| 10 | Rain | Mild | Normal | Weak | Yes |
| 11 | Sunny | Mild | Normal | Strong | Yes |
| 12 | Overcast | Mild | High | Strong | Yes |
| 13 | Overcast | Hot | Normal | Weak | Yes |
| 14 | Rain | Mild | High | Strong | No |
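
One way to turn this table into a classifier is sketched below. It is a minimal sketch rather than a definitive implementation: the names `DATA`, `train`, and `predict` are illustrative, probabilities are estimated by plain relative frequency with no smoothing (so a value never seen with a class, such as Overcast with No, gets probability zero), and the query day at the end is made up for the example.

```python
from collections import Counter, defaultdict

# The 14 days from the table: (Outlook, Temperature, Humidity, Wind, Play Tennis?)
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def train(rows):
    """Count how often each class and each (column, value) pair per class occurs."""
    class_counts = Counter(row[-1] for row in rows)
    feature_counts = defaultdict(Counter)  # class -> Counter of (column, value)
    for *features, label in rows:
        for column, value in enumerate(features):
            feature_counts[label][(column, value)] += 1
    return class_counts, feature_counts


def predict(features, class_counts, feature_counts):
    """Return the class maximizing Pr(c) * product of Pr(f_i | c), plus all scores."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior Pr(c)
        for column, value in enumerate(features):
            # Relative-frequency estimate of Pr(f_i | c); zero if never observed.
            score *= feature_counts[label][(column, value)] / count
        scores[label] = score
    return max(scores, key=scores.get), scores


class_counts, feature_counts = train(DATA)
label, scores = predict(("Sunny", "Cool", "High", "Strong"), class_counts, feature_counts)
print(label, scores)
```

For this sunny, cool, humid, windy day the sketch picks No: the score for No is about 0.0206, against about 0.0053 for Yes. These scores are not probabilities, since the common denominator was dropped, but dividing each by their sum would recover the posterior.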