(1st-January-2021)
Original purpose of machine learning
Classifying unknown input correctly
The goal is not to classify the training examples correctly (!)
Overfitting (overlearning)
Training examples can be classified with high accuracy, but unseen examples that do not appear in the training data are classified poorly
It is dangerous to blindly increase the dimensionality of the feature space
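A minimal sketch of the gap between training accuracy and accuracy on unseen input. The toy task (label 1 for even numbers) and the names fit_memorizer, train, and unseen are made up for illustration; the "model" simply memorizes the training pairs, which is an extreme case of overfitting.

```python
from collections import Counter

def accuracy(predict, examples):
    """Fraction of (x, label) pairs for which predict(x) equals label."""
    return sum(predict(x) == y for x, y in examples) / len(examples)

def fit_memorizer(train):
    """Return a classifier that only memorizes the training pairs.

    Inputs it has never seen get the most frequent training label.
    """
    table = dict(train)
    fallback = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: table.get(x, fallback)

# Hypothetical toy task: the label is 1 when x is even, otherwise 0.
train = [(x, int(x % 2 == 0)) for x in (2, 4, 6, 7, 8)]
unseen = [(x, int(x % 2 == 0)) for x in (1, 3, 5, 10)]

memorizer = fit_memorizer(train)
train_acc = accuracy(memorizer, train)    # every training case is reproduced
unseen_acc = accuracy(memorizer, unseen)  # nothing was actually learned

print(f"training accuracy: {train_acc:.2f}")
print(f"unseen accuracy:   {unseen_acc:.2f}")
```

The training accuracy is perfect while the accuracy on unseen input is low, which is why training accuracy alone says little about the original purpose.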
Reduced Error Pruning
Prevent overfitting by pruning
Which nodes should be pruned?
Prune the nodes whose removal does not reduce classification accuracy
Estimate classification accuracy using a validation set
Prune nodes greedily and stop just before the estimated accuracy gets worse
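The steps above can be sketched as follows. This is one possible reading of the procedure, not a reference implementation: the Node layout, the tie rule (pruning is allowed when accuracy stays the same), and the tiny hand-built tree and validation set are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    majority: int                  # majority training label reaching this node
    feature: Optional[int] = None  # index of the tested feature; None for a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None  # branch taken when x[feature] <= threshold
    right: Optional["Node"] = None

    @property
    def is_leaf(self):
        return self.feature is None

def predict(node, x):
    while not node.is_leaf:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.majority

def accuracy(root, data):
    return sum(predict(root, x) == y for x, y in data) / len(data)

def internal_nodes(node):
    if node.is_leaf:
        return
    yield node
    yield from internal_nodes(node.left)
    yield from internal_nodes(node.right)

def reduced_error_prune(root, validation):
    """Greedily turn internal nodes into leaves while validation accuracy
    does not get worse; stop just before it would."""
    best_acc = accuracy(root, validation)
    while True:
        candidate, candidate_acc = None, -1.0
        for node in list(internal_nodes(root)):
            saved = node.feature
            node.feature = None            # temporarily treat the node as a leaf
            acc = accuracy(root, validation)
            node.feature = saved           # undo the trial prune
            if acc > candidate_acc:
                candidate, candidate_acc = node, acc
        if candidate is None or candidate_acc < best_acc:
            return root                    # any further pruning would hurt
        candidate.feature = None           # make the best prune permanent
        candidate.left = candidate.right = None
        best_acc = candidate_acc

# Hypothetical tree: the right subtree splits on feature 1 and fits noise.
tree = Node(majority=1, feature=0, threshold=5,
            left=Node(majority=0),
            right=Node(majority=1, feature=1, threshold=2,
                       left=Node(majority=1), right=Node(majority=0)))
validation = [((1, 0), 0), ((2, 3), 0), ((7, 1), 1), ((8, 3), 1), ((9, 4), 1)]

before = accuracy(tree, validation)
pruned = reduced_error_prune(tree, validation)
after = accuracy(pruned, validation)
print(before, after)
```

In this toy case the noisy right subtree is collapsed into a leaf, and the root split is kept because removing it would lower the validation accuracy.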
About the validation (development) set
Set aside part of the training data for performance evaluation.
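A minimal sketch of holding out part of the training data. The function name split_train_validation, the 20% fraction, and the fixed seed are illustrative assumptions; the notes do not specify how large the held-out portion should be.

```python
import random

def split_train_validation(examples, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical labeled data: (input, label) pairs.
data = [(x, x % 2) for x in range(10)]
train, validation = split_train_validation(data)
print(len(train), len(validation))
```

The held-out examples are never used for fitting, so accuracy measured on them approximates accuracy on unknown input.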