Calculate Word Similarity Using WordNet in Java

These are my notes on using WS4J (WordNet Similarity for Java) to calculate word similarity in Java.

Step 1: Download Jars

Download the following two jars and add them to your project library path.

jawjaw-1.0.2.jar - https://code.google.com/p/jawjaw/downloads/list
ws4j-1.0.1.jar - https://code.google.com/p/ws4j/downloads/list
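
Before running the demo, it can help to confirm that the jars are actually visible to the JVM. The following is a minimal sketch, not part of WS4J: the class `ClasspathCheck` and its helper `isOnClasspath` are hypothetical names, and the class names being probed are taken from the demo's import list and from the `org.sqlite.JDBC` lookup that appears in the stack trace reported in the comments.

```java
public class ClasspathCheck {

	// Returns true if the named class can be located by this class's
	// class loader. The class is not initialized, only looked up.
	static boolean isOnClasspath(String className) {
		try {
			Class.forName(className, false, ClasspathCheck.class.getClassLoader());
			return true;
		} catch (ClassNotFoundException | LinkageError e) {
			return false;
		}
	}

	public static void main(String[] args) {
		String[] required = {
			"edu.cmu.lti.lexical_db.NictWordNet", // imported by the demo
			"edu.cmu.lti.ws4j.impl.WuPalmer",     // imported by the demo
			"org.sqlite.JDBC"                     // looked up at runtime per the stack trace
		};
		for (String name : required) {
			System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
		}
	}
}
```

If any entry prints MISSING, the corresponding jar is probably not on the project library path, which would be consistent with the "package does not exist" and ClassNotFoundException errors readers report.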

Step 2: Play With the Demo Program

Code:

package NLP;
 
import edu.cmu.lti.lexical_db.ILexicalDatabase;
import edu.cmu.lti.lexical_db.NictWordNet;
import edu.cmu.lti.ws4j.RelatednessCalculator;
import edu.cmu.lti.ws4j.impl.HirstStOnge;
import edu.cmu.lti.ws4j.impl.JiangConrath;
import edu.cmu.lti.ws4j.impl.LeacockChodorow;
import edu.cmu.lti.ws4j.impl.Lesk;
import edu.cmu.lti.ws4j.impl.Lin;
import edu.cmu.lti.ws4j.impl.Path;
import edu.cmu.lti.ws4j.impl.Resnik;
import edu.cmu.lti.ws4j.impl.WuPalmer;
import edu.cmu.lti.ws4j.util.WS4JConfiguration;
 
public class SimilarityCalculationDemo {
 
	private static ILexicalDatabase db = new NictWordNet();
	/*
	// Available metrics; any of these can be used in place of WuPalmer in compute()
	private static RelatednessCalculator[] rcs = { new HirstStOnge(db),
			new LeacockChodorow(db), new Lesk(db), new WuPalmer(db),
			new Resnik(db), new JiangConrath(db), new Lin(db), new Path(db) };
	*/
	private static double compute(String word1, String word2) {
		WS4JConfiguration.getInstance().setMFS(true); // MFS = most frequent sense
		double s = new WuPalmer(db).calcRelatednessOfWords(word1, word2); // Wu-Palmer score
		return s;
	}
 
	public static void main(String[] args) {
		String[] words = {"add", "get", "filter", "remove", "check", "find", "collect", "create"};
 
		for(int i=0; i<words.length-1; i++){
			for(int j=i+1; j<words.length; j++){
				double similarity = compute(words[i], words[j]); // a similarity score, not a distance
				System.out.println(words[i] +" -  " +  words[j] + " = " + similarity);
			}
		}
	}
}

Output:

add -  get = 0.3333333333333333
add -  filter = 0.4
add -  remove = 0.3157894736842105
add -  check = 0.2857142857142857
add -  find = 0.47619047619047616
add -  collect = 0.4
add -  create = 0.2857142857142857
get -  filter = 0.2857142857142857
get -  remove = 0.5
get -  check = 0.4
get -  find = 0.5
get -  collect = 0.5
get -  create = 0.5
filter -  remove = 0.2857142857142857
filter -  check = 0.25
filter -  find = 0.2857142857142857
filter -  collect = 0.21052631578947367
filter -  create = 0.2857142857142857
remove -  check = 0.4
remove -  find = 0.5
remove -  collect = 0.3157894736842105
remove -  create = 0.5
check -  find = 0.4
check -  collect = 0.2857142857142857
check -  create = 0.4
find -  collect = 0.38095238095238093
find -  create = 0.5
collect -  create = 0.2857142857142857
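
To make these numbers easier to read: the Wu-Palmer measure is commonly defined as 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that subsumes both a and b. The sketch below applies that formula to a tiny hand-made taxonomy. Everything in it is an assumption for illustration: the class `WuPalmerSketch`, the toy hierarchy, and the convention that the root has depth 1. It is not WS4J's implementation and does not use WordNet data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WuPalmerSketch {

	// Hypothetical toy taxonomy (child -> parent); "entity" is the root.
	private static final Map<String, String> PARENT = new HashMap<>();
	static {
		PARENT.put("animal", "entity");
		PARENT.put("dog", "animal");
		PARENT.put("cat", "animal");
		PARENT.put("artifact", "entity");
		PARENT.put("car", "artifact");
	}

	// Path from a node up to the root, starting with the node itself.
	static List<String> pathToRoot(String node) {
		List<String> path = new ArrayList<>();
		for (String n = node; n != null; n = PARENT.get(n)) {
			path.add(n);
		}
		return path;
	}

	// Depth counted in nodes, so the root has depth 1 (an assumed convention).
	static int depth(String node) {
		return pathToRoot(node).size();
	}

	// Deepest node that is an ancestor of both (a node counts as its own ancestor).
	static String lowestCommonSubsumer(String a, String b) {
		List<String> ancestorsOfB = pathToRoot(b);
		for (String n : pathToRoot(a)) {
			if (ancestorsOfB.contains(n)) {
				return n;
			}
		}
		return null;
	}

	// Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b)).
	static double similarity(String a, String b) {
		String lcs = lowestCommonSubsumer(a, b);
		if (lcs == null) {
			return 0.0; // no shared ancestor in the taxonomy
		}
		return 2.0 * depth(lcs) / (depth(a) + depth(b));
	}

	public static void main(String[] args) {
		System.out.println("dog - cat = " + similarity("dog", "cat"));
		System.out.println("dog - car = " + similarity("dog", "car"));
	}
}
```

In this toy hierarchy, dog and cat share the parent animal and score 2*2/(3+3), while dog and car only share the root and score 2*1/(3+3). Words with no common ancestor score 0.0.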
Category >> Natural Language Processing  
  • Cyrine Maazouli

    When I run this code
    I get this: Caused by: java.lang.RuntimeException: Uncompilable source code – package edu.cmu.lti.lexical_db does not exist.
    Help, please!

  • Younes MEZAOUR

    When I execute the program I get the output below. I think the problem is that calcRelatednessOfWords(word1, word2) doesn't work. Please help me, it's very urgent :'(

    java.lang.ClassNotFoundException: org.sqlite.JDBC
        at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:259)
        at edu.cmu.lti.jawjaw.db.SQL.createSQLConnection(SQL.java:77)
        at edu.cmu.lti.jawjaw.db.SQL.<init>(SQL.java:55)
        at edu.cmu.lti.jawjaw.db.SQL.<init>(SQL.java:45)
        at edu.cmu.lti.jawjaw.db.WordDAO.findWordsByLemmaAndPos(WordDAO.java:124)
        at edu.cmu.lti.jawjaw.util.WordNetUtil.wordToSynsets(WordNetUtil.java:38)
        at edu.cmu.lti.lexical_db.NictWordNet.getAllConcepts(NictWordNet.java:38)
        at edu.cmu.lti.ws4j.util.WordSimilarityCalculator.calcRelatednessOfWords(WordSimilarityCalculator.java:79)
        at edu.cmu.lti.ws4j.RelatednessCalculator.calcRelatednessOfWords(RelatednessCalculator.java:61)
        at NLP.SimilarityCalculationDemo.compute(SimilarityCalculationDemo.java:27)
        at NLP.SimilarityCalculationDemo.main(SimilarityCalculationDemo.java:36)
    Exception in thread "main" java.lang.NullPointerException
        at edu.cmu.lti.jawjaw.db.SQL.getPreparedStatement(SQL.java:217)
        at edu.cmu.lti.jawjaw.db.WordDAO.findWordsByLemmaAndPos(WordDAO.java:124)
        at edu.cmu.lti.jawjaw.util.WordNetUtil.wordToSynsets(WordNetUtil.java:38)
        at edu.cmu.lti.lexical_db.NictWordNet.getAllConcepts(NictWordNet.java:38)
        at edu.cmu.lti.ws4j.util.WordSimilarityCalculator.calcRelatednessOfWords(WordSimilarityCalculator.java:79)
        at edu.cmu.lti.ws4j.RelatednessCalculator.calcRelatednessOfWords(RelatednessCalculator.java:61)
        at NLP.SimilarityCalculationDemo.compute(SimilarityCalculationDemo.java:27)
        at NLP.SimilarityCalculationDemo.main(SimilarityCalculationDemo.java:36)
    Java Result: 1

  • smishra

    This shows the similarity between ‘bad’ and ‘excellent’ as 0.0.