UMBC Semantic Similarity Service

Computing semantic similarity between words and phrases has important applications in natural language processing, information retrieval, and artificial intelligence. There are two prevailing approaches to computing word similarity: using a thesaurus (e.g., WordNet) or using statistics from a large corpus. We provide a hybrid approach that combines the two. This website currently offers three online demonstrations.

Our statistical method is based on distributional similarity and Latent Semantic Analysis (LSA). We further complement it with semantic relations extracted from WordNet. The whole process is automatic and can be trained on different corpora. We assume that the meaning of a phrase is composed from the meanings of its component words, and we apply an algorithm that computes the similarity between two phrases from the similarities of their words.
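As an illustration of the compositional assumption, the sketch below derives a phrase score from word-level scores. It is not the service's actual algorithm, which is not described on this page: the greedy best-match alignment, the names word_sim, directed_sim and phrase_sim, and the toy similarity table are all assumptions made for the example. In a real system the word scores would come from the LSA and WordNet model.

```python
from typing import Callable, Sequence

# Hypothetical toy word-similarity table (made-up numbers, for illustration
# only). A real system would compute these scores from a trained model.
TOY_SIM = {
    frozenset(["car", "automobile"]): 0.9,
    frozenset(["red", "crimson"]): 0.8,
    frozenset(["car", "crimson"]): 0.1,
    frozenset(["red", "automobile"]): 0.1,
}


def word_sim(a: str, b: str) -> float:
    """Return a similarity in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset([a, b]), 0.0)


def directed_sim(src: Sequence[str], dst: Sequence[str],
                 sim: Callable[[str, str], float]) -> float:
    """Average, over words in src, of the best match found in dst."""
    if not src or not dst:
        return 0.0
    return sum(max(sim(w, v) for v in dst) for w in src) / len(src)


def phrase_sim(p1: str, p2: str,
               sim: Callable[[str, str], float] = word_sim) -> float:
    """Symmetric phrase score: mean of the two directed scores."""
    t1, t2 = p1.lower().split(), p2.lower().split()
    return 0.5 * (directed_sim(t1, t2, sim) + directed_sim(t2, t1, sim))


if __name__ == "__main__":
    print(phrase_sim("red car", "crimson automobile"))
    print(phrase_sim("red car", "red car"))
```

Averaging the two directions keeps the score symmetric, so swapping the two phrases does not change the result.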

Top-N Similarity -- Return the N words most similar to an input word.

Phrase Similarity -- Compute semantic similarity between two short noun or verb phrases.

STS Similarity -- Compute Semantic Textual Similarity between two sentences or phrases.
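The Top-N demonstration listed above can be pictured as a nearest-neighbor lookup over word vectors. The sketch below is a minimal illustration under stated assumptions, not the service's implementation: the three-dimensional vectors are invented, and the names TOY_VECTORS, cosine and top_n are hypothetical. An LSA-based system would use vectors learned from a corpus.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical toy word vectors (made-up values, for illustration only).
TOY_VECTORS: Dict[str, List[float]] = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "kitten": [0.85, 0.05, 0.0],
    "car": [0.0, 0.9, 0.4],
    "truck": [0.1, 0.8, 0.5],
}


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_n(word: str, n: int,
          vectors: Dict[str, List[float]] = TOY_VECTORS) -> List[Tuple[str, float]]:
    """Return up to n (word, score) pairs, most similar first."""
    if word not in vectors:
        return []
    target = vectors[word]
    scored = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


if __name__ == "__main__":
    for neighbor, score in top_n("cat", 2):
        print(neighbor, round(score, 3))
```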


A simple API is available here.




Developed by Lushan Han for the project Graph of Relations.
Contact us - umbcsim [AT] cs.umbc.edu
2013 Ebiquity Lab, UMBC.