Getting WordNet working on Android

23/02/2014 08:38

So you want to use the Princeton WordNet in an Android project?

 
The answer to this question is yes and no. There is a plethora of Java APIs for accessing WordNet (listed at wordnet.princeton.edu/wordnet/related-projects/), but all of them suffer from problems that make using them on Android quite difficult and tedious.

The main problems are:
 
  • versioning issues
  • incompatibility with the Dalvik format
  • other misc problems

Some people have reported success with JWI, but how they did it eludes me. I tried RitaWN and ran into a persistent "NoClassDefFoundError" in the stack trace. A small caveat: this could be because I'm a complete n00b when it comes to developing Android apps. Bundling the database files that an API like extJWNL or JWI needs in the assets folder didn't work either. So today I'm going to show you a small web-based workaround for finding the synonyms and antonyms of a word. When I compared the results of the API method and the web-based method, about 99% of the results fetched were the same.
 

THIS TUTORIAL IS FOR THE ADT BUNDLE ONLY. IF YOU USE ANDROID STUDIO, OR ARE ONE OF THOSE DINOSAURS WHO USE THE ECLIPSE + ADT PLUGIN, FIGURE IT OUT YOURSELVES (THOUGH I SUSPECT THE SOLUTION IS NOT TOO DIFFERENT).

 
Step 1: Download the JSOUP API
 
Available at jsoup.org/download.
JSOUP is an HTML parsing library written in Java that lets you connect to web pages and extract their HTML content.
 
Step 2: Add the downloaded jar to the libs folder of your Android project
 
This is all you need to do. ADT adds any jar in the libs folder to the build path by itself, so there is no need to touch the "Order and Export" tab.

 
Step 3: Add the internet permission in your Android Manifest file
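The permission itself is a single `uses-permission` element that sits directly inside `<manifest>`, outside `<application>`. A minimal sketch of where it goes (the package name is a made-up placeholder, and your `<application>` element will carry whatever attributes the project wizard generated):

```xml
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.wordnetdemo">
    <!-- "com.example.wordnetdemo" is a hypothetical package name -->

    <!-- Lets the app open network connections -->
    <uses-permission android:name="android.permission.INTERNET" />

    <application>
        <!-- activities as generated by the project wizard -->
    </application>
</manifest>
```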
 
If adding XML code confuses you, just go to the graphical view of the manifest (the "Permissions" tab), add a "Uses Permission" and select "android.permission.INTERNET" from the drop-down list.
 
 
Step 4: Performing Blasphemy and enabling Network Activity on the main thread
 
For apps that target API level 11 (Honeycomb) or higher, Android prohibits network activity on the main thread by throwing a NetworkOnMainThreadException, and with good reason: a slow request freezes the UI. The correct way to use a network connection is from a background thread, for example with an AsyncTask. However, since this tutorial is purely introductory and AsyncTask is not the topic, I will use a quicker workaround which SHOULD NEVER EVER EVER EVER BE USED IN A SERIOUS APP.
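A minimal sketch of one such workaround, assuming it is the usual StrictMode route: replace the main thread's policy with one that permits everything, network access included. The class name `MainActivity` is an assumption (it is what the ADT wizard generates by default). This fragment needs the Android SDK to compile.

```java
import android.app.Activity;
import android.os.Bundle;
import android.os.StrictMode;

// "MainActivity" is an assumed name; use whatever your project calls its main Activity.
public class MainActivity extends Activity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        // DEMO ONLY: permit everything StrictMode would otherwise block on this
        // thread, including network access, so that a network call made here
        // does not throw NetworkOnMainThreadException.
        StrictMode.ThreadPolicy policy =
                new StrictMode.ThreadPolicy.Builder().permitAll().build();
        StrictMode.setThreadPolicy(policy);
    }
}
```

The policy only hides the problem: a slow connection will still freeze the UI for as long as the request takes.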

 
Step 5: Adding the required imports
 
 
Eclipse will do this for you anyway once you use the relevant code (Ctrl+Shift+O organizes imports), but for the sake of completeness I'm adding it here anyway.
 
Step 6: The actual code and some introductory material
 
 
The website we will be using is words.bighugelabs.com/ . They have a permissive "robots.txt", so I guess what we are doing is pretty normal. However, they also offer a paid API for their data. So if this tutorial stops working at any point after 23/2/2014, consider using another free online thesaurus for your app.

Click on the link and search for a word. For example, let's search for "bike". You will get a whole bunch of synonyms, antonyms and other related search terms.

Now examine the site's URL. It should read "https://words.bighugelabs.com/bike". The way this site works is that the word you want synonyms or antonyms for is appended to the base URL, "https://words.bighugelabs.com/". So if you wanted to find words for "car" instead of "bike", you would append "car" to the base URL to get "https://words.bighugelabs.com/car". Now navigate back to "https://words.bighugelabs.com/bike".
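The URL rule above can be sketched as a tiny helper in plain Java. The class and method names (`ThesaurusUrl`, `forWord`) are made up for illustration, and percent-encoding multi-word input such as "ice cream" is my assumption; I have not checked how the site handles it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the lookup URL by appending the word to the base URL.
final class ThesaurusUrl {
    static final String BASE_URL = "https://words.bighugelabs.com/";

    static String forWord(String word) {
        String trimmed = word.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("word must not be empty");
        }
        try {
            // URLEncoder does form encoding (space becomes "+"), so turn "+"
            // into "%20" to make the result safe inside a URL path.
            // Whether the site accepts encoded phrases is an assumption.
            return BASE_URL + URLEncoder.encode(trimmed, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(forWord("bike")); // https://words.bighugelabs.com/bike
        System.out.println(forWord("car"));  // https://words.bighugelabs.com/car
    }
}
```

Keeping the base URL in one constant also makes it easy to swap in another thesaurus site later, although the parsing steps that follow would have to change with it.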


 
Right-click on the first synonym and select "Inspect Element". This lets you look at the HTML source code that corresponds to that element on the page.
 
THIS WILL WORK FOR THIS SITE ONLY. IF YOU CHANGE THE WEBSITE, THE FOLLOWING PROCEDURE WILL NOT WORK.
 
If you are familiar with HTML, you can see that the first 3 links (anchor tags) on the page are not relevant. Links number 4 through 'n' (n depends on the number of synonyms retrieved) correspond to synonyms.
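In code, this boils down to skipping the first three link texts and taking a limited number after that. Here is a plain-Java sketch that works on a list of link texts, however you obtained them. The names (`SynonymLinks`, `pick`) and the sample words are made up, and the cap exists because I'm assuming some links after the synonyms are not synonyms either.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: picks the synonym entries out of all link texts on the page.
final class SynonymLinks {
    // Links 1 to 3 (indexes 0 to 2) are not synonyms, per the observation above.
    static final int FIRST_SYNONYM_INDEX = 3;

    static List<String> pick(List<String> linkTexts, int maxSynonyms) {
        int available = linkTexts.size() - FIRST_SYNONYM_INDEX;
        if (available <= 0 || maxSynonyms <= 0) {
            return new ArrayList<String>();
        }
        int count = Math.min(available, maxSynonyms);
        // Copy the slice so the result does not depend on the original list.
        return new ArrayList<String>(
                linkTexts.subList(FIRST_SYNONYM_INDEX, FIRST_SYNONYM_INDEX + count));
    }

    public static void main(String[] args) {
        // Made-up link texts standing in for what the page might contain.
        List<String> links = Arrays.asList(
                "Home", "About", "API", "bicycle", "motorcycle", "cycle", "Contact");
        System.out.println(pick(links, 2)); // [bicycle, motorcycle]
    }
}
```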

Step 7: Add the following code to your Android main Activity
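The listing below is a sketch reconstructed from the steps above, not necessarily the exact code this post was written with. It assumes jsoup's `Jsoup.connect(...).get()` to fetch the page, a plain `"a"` selector to collect every link, and links 4 to 8 as the first five synonyms. The names `MainActivity`, `TAG` and `logSynonyms` are made up, and the word "bike" is hard-coded. It needs the Android SDK and the jsoup jar to compile.

```java
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

import android.app.Activity;
import android.os.Bundle;
import android.os.StrictMode;
import android.util.Log;

public class MainActivity extends Activity { // assumed class name
    private static final String TAG = "Synonyms"; // LogCat tag, made up
    private static final String BASE_URL = "https://words.bighugelabs.com/";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // No layout is set here; the results only go to LogCat.

        // Step 4 workaround (demo only): allow network access on the main thread.
        StrictMode.setThreadPolicy(
                new StrictMode.ThreadPolicy.Builder().permitAll().build());

        logSynonyms("bike");
    }

    // Fetches the page for the word and logs up to five synonyms.
    private void logSynonyms(String word) {
        try {
            Document page = Jsoup.connect(BASE_URL + word).get();
            Elements links = page.select("a"); // every link on the page

            // Indexes 0 to 2 are the irrelevant links; 3 to 7 are the first
            // five synonyms. The size check only guards against a short page.
            for (int i = 3; i < 8 && i < links.size(); i++) {
                Log.d(TAG, links.get(i).text());
            }
        } catch (IOException e) {
            Log.e(TAG, "Could not fetch synonyms for " + word, e);
        }
    }
}
```

Filtering LogCat by the tag "Synonyms" makes the output easier to spot.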
 
 

 
Check the LogCat view for the synonyms.

Assumption: the word fetched has at least 5 synonyms. Read up on Jsoup, its selector syntax in particular, for a more efficient way to get the synonyms.

Thanks.