Swoogle Manual


Introduction

Three decades past, the relational empire conquered the hierarchical hegemony. Today, an upstart challenges the relational empire's dominance, threatening the return of hierarchy. XML is Lisp's bastard nephew, with uglier syntax and no semantics. Yet XML is poised to enable the creation of a Web of data that dwarfs anything since the Library at Alexandria. -- Philip Wadler. VLDB Keynote, Rome, September 2001.

Swoogle is a crawler-based indexing and retrieval system for Semantic Web Documents (SWDs), that is, web documents written in RDF or OWL. Swoogle discovers, digests, analyzes and indexes online SWDs and provides query and report services through a web interface.

Motivation

The Semantic Web is known for being a web of Semantic Web documents; however, little is known about the structure or growth of such a web. Search engines such as Google have transformed the way people access and use the web and have become a critical technology for finding and delivering information. Most existing search engines, however, provide poor support for accessing the web of SWDs and make no attempt to take advantage of the structural and semantic information encoded in SWDs. Swoogle, therefore, has been developed to index SWDs and to serve both semantic web developers and software agents.

Swoogle is designed to serve research activities in the Semantic Web community, especially the following:

  • to study the growth and evolution of the semantic web by efficiently querying a comprehensive database of SWD metadata;
  • to collect, index and search the definition and usage of Semantic Web Terms (SWTs) (i.e. Classes and Properties) as well as the corresponding Semantic Web Ontologies (SWOs);
  • to enable "distributed" knowledge sharing by making knowledge visible and easy to access;
  • to support semantic web tools, such as MindSwap Lab's SWOOP ontology editor in finding relevant ontologies and KSL's Inference Web infrastructure in finding distributed proofs.

System Features

The core feature provided by Swoogle is the semantic web search engine. Swoogle is very much a work in progress, and its features have grown from version to version:

  • version 1 was developed during the second quarter of 2004. It provided basic search service on about 14,500 SWDs (including 4,880 ontologies).
  • version 2 was developed over the summer of 2004. It additionally indexed SWTs and the relations between SWTs and SWDs. It indexed 327,000 SWDs (including 11,279 ontologies). It also provides statistical measures of the collection of SWDs.
  • version 3 was developed in 2005. It additionally indexed some documents that embed RDF statements (though JPG and PDF are still on the TODO list). It also supports the novel semantic web search and navigation model, which links SWDs and SWTs semantically in addition to hyperlinks.

We first list some highlights of Swoogle's web interface:

  • Swoogle Search (has been replaced by Semantic Web Search and Navigation in version 3)
    • Search and digest documents (since version 1)
    • Search and digest terms (since version 2)
    • Navigating relations (since version 3)
    • Web Service (since version 2). We also provide an equivalent web-service interface so that machine agents can avoid HTML scraping.
  • Swoogle Statistics - it provides statistics about the indexed SWDs and SWTs
    • 'swoogle today' shows the number of indexed SWDs and the number of visited URLs. (since version 2)
    • more ...
  • Semantic Web archive and cache services - it provides a retrospective partial view of the growth of the semantic web.
    • Swoogle Cache: (since version 1)
    • Semantic Web Archive: (since version 3). It lists all cached snapshots of a URL.
  • Ontology Dictionary - it helps users find and browse SWTs
    • Search and digest terms (see Swoogle Search)
    • Browse terms alphabetically (since version 2)

The Swoogle system, however, is not simply a web interface - it is supported by the following functional components:

  • Swoogle crawlers - Swoogle uses various approaches to automatically discover ontologies on the web.
    • SwoogleBot (since version 1). It parses and validates SWDs (including embedded ones) and discovers new links.
    • Google Crawler (since version 2). It runs meta-queries on Google using the Google API.
    • Bounded Site Crawler (since version 2). It acts like a traditional web crawler but is confined to one host at a time.
  • Swoogle analyzers - Swoogle analyzes and indexes the discovered documents and terms by
    • ranking documents and terms
    • digesting document and term metadata
    • creating an efficient IR index for documents and terms
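
To make the idea behind the Bounded Site Crawler more concrete, here is a minimal sketch of a breadth-first crawl that never leaves the host of its seed URL. It is not Swoogle's actual implementation: the names bounded_crawl, LinkExtractor and PAGES, the max_pages limit, and the example URLs are all assumptions made for illustration, and the caller-supplied fetch function stands in for real HTTP access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def bounded_crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl that stays on the seed URL's host.

    `fetch` maps a URL to its HTML text, or None if unavailable.
    It is an assumption of this sketch, not part of Swoogle.
    """
    host = urlparse(seed).netloc
    queue = deque([seed])
    seen = {seed}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any fragment part.
            target = urljoin(url, href).split("#")[0]
            # Only follow links that stay on the seed's host.
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


# Tiny in-memory "web" for demonstration; these URLs are made up.
PAGES = {
    "http://example.org/": '<a href="/a.rdf">a</a> <a href="http://other.example/x">x</a>',
    "http://example.org/a.rdf": '<a href="b.owl">b</a>',
    "http://example.org/b.owl": "",
}

print(bounded_crawl("http://example.org/", PAGES.get))
```

In this sketch the link to other.example is discovered but never queued, which is the "bounded" part. A real crawler would additionally need politeness delays, robots.txt handling, and a check of whether each fetched document actually parses as RDF or OWL.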


Swoogle © 2004-2007, ebiquity group at UMBC
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.