## Sunday, March 3, 2019

### Udacity machine learning nanodegree mega review part 5

Lesson 5: Decision Trees
For more in this series, visit the main course outline page.

A decision tree can be seen as a way to solve a classification problem.
Example 1, dating: which attribute to split on.
Intuition: 20 questions. Ask broad questions first to narrow down the domain space, for example animal vs. person.

Decision tree learning
1. First, find the best attribute to split on
For example, one that splits the data in half (like binary search)
4. Go back to step 1 until an answer is found
The above describes an ALGORITHM
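
The loop above can be sketched in a few lines of Python. This is only a minimal illustration, not the course's implementation: the helper names (`split`, `build_tree`, `classify`) and the toy data are made up for this example, and "best attribute" is assumed to mean the attribute that leaves the least entropy after the split.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def split(rows, labels, attribute):
    """Group rows and labels by the value each row has for `attribute`."""
    groups = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = groups.setdefault(row[attribute], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    return groups

def build_tree(rows, labels, attributes):
    # An answer is found: every label agrees, or there is nothing left to ask.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]

    def remaining_entropy(attribute):
        groups = split(rows, labels, attribute)
        return sum(len(ls) / len(labels) * entropy(ls) for _, ls in groups.values())

    # Step 1: the best attribute leaves the least entropy after the split.
    best = min(attributes, key=remaining_entropy)
    rest = [a for a in attributes if a != best]
    # Go back to step 1 on every branch of the split.
    return {best: {value: build_tree(sub_rows, sub_labels, rest)
                   for value, (sub_rows, sub_labels) in split(rows, labels, best).items()}}

def classify(tree, row):
    """Follow the branches until a leaf (an answer) is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[row[attribute]]
    return tree

# Hypothetical toy data: the label is "+" only when both A and B are true.
rows = [{"A": a, "B": b} for a in (False, True) for b in (False, True)]
labels = ["+" if row["A"] and row["B"] else "-" for row in rows]
tree = build_tree(rows, labels, ["A", "B"])
```

The tree is stored as nested dictionaries, which keeps the sketch short; a real implementation would more likely use a node class and handle attribute values that were never seen during training.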

Best attribute to split on
Depends on entropy
Ask: does the split improve information gain, i.e. does it reduce entropy?
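
To make the idea concrete, here is a small sketch of entropy and information gain. Entropy is the usual Shannon entropy of the class labels, and information gain is the entropy before the split minus the weighted entropy of the groups after it. The function names and the toy data (`likes_hiking`, `owns_cat`) are invented for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on `attribute`."""
    total = len(labels)
    # Group the labels by the value each row has for the attribute.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data, invented for illustration only.
rows = [
    {"likes_hiking": True, "owns_cat": True},
    {"likes_hiking": True, "owns_cat": False},
    {"likes_hiking": False, "owns_cat": True},
    {"likes_hiking": False, "owns_cat": False},
]
labels = ["+", "+", "-", "-"]

gain_hiking = information_gain(rows, labels, "likes_hiking")  # separates the labels perfectly
gain_cat = information_gain(rows, labels, "owns_cat")         # tells us nothing about the label
```

In this toy data `likes_hiking` gains a full bit while `owns_cat` gains nothing, so `likes_hiking` would be the attribute to split on.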

Decision nodes can be used to express boolean operations or logic gates.

AND(A,B)

        A
     f / \ t
      -   B
       f / \ t
        -   +

If an operation is commutative, switching the operands still gives the same result: AND(A,B) is the same as AND(B,A), so the tree can test either A or B first.
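
As a quick check, the AND tree can be written as nested decisions, once with A at the root and once with B at the root. The function names are made up for this sketch; the loop confirms that both orderings agree with Python's own `and` on every input.

```python
from itertools import product

def and_tree_a_first(a, b):
    """Decision tree for AND(A, B) that tests A at the root."""
    if not a:      # A is false: the answer is "-" without looking at B
        return False
    return b       # A is true: the answer is decided by B

def and_tree_b_first(a, b):
    """The same function with the operands switched: B at the root."""
    if not b:
        return False
    return a

# Check every row of the truth table.
for a, b in product([False, True], repeat=2):
    assert and_tree_a_first(a, b) == and_tree_b_first(a, b) == (a and b)
```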

Extra:
A random forest is a collection of randomized decision trees whose predictions are combined (for example by majority vote) to get a better result than a single tree.
For a continuous attribute, a branch can split on a threshold, which may be a fractional value.
For a discrete attribute, the branches are the attribute's values, for example true and false.
The information gain formula automatically rules out some redundancy.
Pruning: collapse parts of the tree and check the error. If the error stays acceptable, keep the smaller tree to avoid overfitting.
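
For the continuous case, one common approach (assumed here, not taken from the lesson) is to try the midpoint between each pair of neighbouring values as a threshold and keep the one with the highest information gain. The function `best_threshold` and the sample values are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, gain) for the split `value <= threshold` with the highest gain."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no boundary between equal values
        threshold = (low + high) / 2  # candidate: midpoint between neighbours
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Hypothetical data: a continuous attribute between 0 and 1.
values = [0.1, 0.2, 0.4, 0.7, 0.8, 0.9]
labels = ["-", "-", "-", "+", "+", "+"]
threshold, gain = best_threshold(values, labels)
```

Here the chosen threshold falls between 0.4 and 0.7, where the labels change from "-" to "+".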