Use K-Nearest Neighbors (KNN) Classifier in Java

Case Study

Image you have a blog which contains a lot of nice articles. You put ads at the top of each article and hope to gain some revenue. After a while and from your report, you see that some posts generate revenue and some do not. Assuming that whether an article generates revenue or not depends on how many pictures and text paragraphs in it.

Given the dataset:

knn-dataset

we can plot them in the figure below.

knn

How K-Nearest Neighbors (KNN) algorithm works?

When a new article is written, we don’t have its data from report. If we want to know whether the new article can generate revenue, we can 1) computer the distances between the new article and each of the 6 existing articles, 2) sort the distances in descending order, 3) take the majority vote of k. This is the basic idea of KNN.

Now let’s guess a new article, which contains 13 pictures and 1 paragraph, can make revenue or not. By visualizing this point in the figure, we can guess it will make profit. But we will do it in Java.

Java Solution

kNN is also provided by Weka as a class “IBk”. IBk implements kNN. It uses normalized distances for all attributes so that attributes on different scales have the same impact on the distance function. It may return more than k neighbors if there are ties in the distance. Neighbors are voted to form the final classification.

First prepare your data by creating a txt file “ads.txt”:

@relation ads

@attribute pictures numeric
@attribute paragraphs numeric
@attribute profit {Y, N}

@data
10,2,Y
12,3,Y
9,2,Y
0,10,N
1,9,N
3,11,N

Java Code:

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
 
import weka.classifiers.Classifier;
import weka.classifiers.lazy.IBk;
import weka.core.Instance;
import weka.core.Instances;
 
public class KNN {
	public static BufferedReader readDataFile(String filename) {
		BufferedReader inputReader = null;
 
		try {
			inputReader = new BufferedReader(new FileReader(filename));
		} catch (FileNotFoundException ex) {
			System.err.println("File not found: " + filename);
		}
 
		return inputReader;
	}
 
	public static void main(String[] args) throws Exception {
		BufferedReader datafile = readDataFile("ads.txt");
 
		Instances data = new Instances(datafile);
		data.setClassIndex(data.numAttributes() - 1);
 
		//do not use first and second
		Instance first = data.instance(0);
		Instance second = data.instance(1);
		data.delete(0);
		data.delete(1);
 
		Classifier ibk = new IBk();		
		ibk.buildClassifier(data);
 
		double class1 = ibk.classifyInstance(first);
		double class2 = ibk.classifyInstance(second);
 
		System.out.println("first: " + class1 + "\nsecond: " + class2);
	}
}

Output:

first: 0.0
second: 1.0

References:
1. http://weka.sourceforge.net/doc.dev/weka/classifiers/lazy/IBk.html
2. http://www.cc.uah.es/drg/courses/datamining/IntroWeka.pdf

7 thoughts on “Use K-Nearest Neighbors (KNN) Classifier in Java”

Jaroslav LEHEÄŒKA

May 29, 2021 at 4:36 am

I think that you get result 0.0 for Y=>YES and 1.0 for N=>NO.
I think it is because this definition in the file ads.txt:
@attribute profit {Y, N}
Y is on the 0 index and N is on the 1 index.
John Tolentino

February 13, 2018 at 9:04 pm

what if i got input and get its nearest neighbor what should i do ???
Surendran Duraisamy

February 21, 2015 at 12:08 pm

first: 0.0
second: 0.0

is thec correct output.
When you try to classify 4,5 or 6th instance you get 1.0 as output
Shahil varshney

January 15, 2015 at 12:22 am

actualy please tell me what is the actual process behind this . and how we get this output?
Thorulf

May 16, 2014 at 3:13 am

When running this example I get output 0.0 for both first and second, what’s the deal with that? Also, like Juan, it would be interesting to know exactly what the output means.
Duke

May 4, 2014 at 9:14 am

THIS IS NOT WHAT PEOPLE LOOK FOR!
Juan

February 7, 2014 at 5:58 am

What does the output exactly mean? Is it related to the revenue of the new article?

7 thoughts on “Use K-Nearest Neighbors (KNN) Classifier in Java”

Leave a Comment