1.
The packages "class" and "kknn" both provide k-nearest-neighbor methods.
The one in kknn, however, scales the data to unit standard deviation
by default before knn is applied. That is, each column of the data is
divided by its (bias-corrected) standard deviation. For example:
   (1,10)                        (1,0.5)
   (2,30)   will be scaled to    (2,1.5)
   (3,50)                        (3,2.5)
since sd( c(1,2,3) ) is 1 and sd( c(10,30,50) ) is 20.
(For details, please type help(kknn) and see the section 'References'.)
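
The scaling in the example above can be reproduced in base R. This is only
a sketch of the arithmetic, not the code kknn runs internally; the names
x, col_sd and x_scaled are made up for illustration.

```r
# The three example points, one row per instance.
x <- matrix(c(1, 10,
              2, 30,
              3, 50), ncol = 2, byrow = TRUE)

# Bias-corrected (divide by n - 1) standard deviation of each column.
col_sd <- apply(x, 2, sd)

# Divide every column by its own standard deviation.
x_scaled <- sweep(x, 2, col_sd, "/")

print(col_sd)     # 1 20
print(x_scaled)   # rows: (1, 0.5), (2, 1.5), (3, 2.5)
```

The first column is unchanged because its standard deviation is already 1.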
2.
Some of you mentioned that in the iris data set, "petal length"
and "petal width" are the two most important features.
(One way to find out is to try every non-empty subset of the four
features. After all, there are only 2^4 - 1 = 15 of them.)
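
As a rough sketch, the 15 feature subsets can be listed with combn() from
base R. The variable names features and subsets are made up here; each
subset would then be evaluated with whatever validation procedure you use.

```r
# The four measurement columns of iris (the fifth column is Species).
features <- names(iris)[1:4]

# All non-empty subsets: choose m features out of 4, for m = 1, ..., 4.
subsets <- unlist(
  lapply(seq_along(features),
         function(m) combn(features, m, simplify = FALSE)),
  recursive = FALSE
)

length(subsets)   # 4 + 6 + 4 + 1 = 15

# Print one subset per line.
for (s in subsets) cat(paste(s, collapse = ", "), "\n")
```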
The following are validation results using 100 instances for training
and the other 50 instances for validation (averaged over 1000 trials;
each entry is the number of correctly classified instances out of 50).
Using all the features:
   knn(class)    kknn(kknn)
   47.813/50     47.004/50    <- Notice that there is a slight difference
Using the 3rd and 4th features (petal):
   knn(class)    kknn(kknn)
   47.861/50     47.999/50
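
An experiment of this kind could be set up roughly as follows. This is a
sketch under assumptions, not the script that produced the numbers above:
the function name avg_correct, the value k = 5, the rectangular
(unweighted) kernel for kknn and the default of 100 trials are all choices
made here for illustration, so the output will not match the table exactly.

```r
library(class)  # provides knn()
library(kknn)   # provides kknn()

# Average number of correctly classified validation instances (out of 50)
# over n_trials random 100/50 splits of iris, using the feature columns
# named in cols. k is an assumption; the notes do not say which k was used.
avg_correct <- function(cols, n_trials = 100, k = 5) {
  hits <- replicate(n_trials, {
    idx   <- sample(nrow(iris), 100)          # 100 training rows
    train <- iris[idx,  c(cols, "Species")]
    test  <- iris[-idx, c(cols, "Species")]   # remaining 50 rows

    # class::knn works on the raw (unscaled) feature values.
    pred_knn <- knn(train[, cols, drop = FALSE],
                    test[, cols, drop = FALSE],
                    cl = train$Species, k = k)

    # kknn scales each column to unit sd by default (scale = TRUE).
    # kernel = "rectangular" asks for a plain unweighted vote.
    fit <- kknn(Species ~ ., train = train, test = test,
                k = k, kernel = "rectangular")
    pred_kknn <- fitted(fit)

    truth <- as.character(test$Species)
    c(knn  = sum(as.character(pred_knn)  == truth),
      kknn = sum(as.character(pred_kknn) == truth))
  })
  rowMeans(hits)
}

set.seed(1)
avg_correct(names(iris)[1:4])   # all four features
avg_correct(names(iris)[3:4])   # petal length and petal width only
```

Raising n_trials to 1000 matches the number of trials used for the table,
at the cost of a longer run time.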