Swoogle 2006 released
February 1, 2006
Swoogle 2006 is a major new version of Swoogle, a search engine for the Semantic Web. Swoogle helps knowledge engineers and software agents find knowledge on the web encoded in the Semantic Web languages RDF and OWL. It crawls the Web looking for documents that consist of RDF or have RDF embedded within them. Using Swoogle, people and agents can discover Semantic Web ontologies, terms and data.
Swoogle 2006 is a nearly complete rewrite of Swoogle Classic, which now answers to Swoogle 2005. While Swoogle 2006 is currently missing some of Swoogle 2005's features, it enjoys a cleaner and simpler model and foundation. We will be adding some of these features, as well as new ones, over the next few months. Here are some of Swoogle 2006's highlights:
- New hardware. Swoogle 2006 runs on a set of three machines: EB2 is a two-processor Sun v20z with 4 GB of memory that runs the crawler, DBMS and development web interfaces; LOGOS is an IBM eserver that runs the production web interfaces; and NATRAJ is the file server for the SW cache and archive.
- More data. Swoogle 2006 has over 850K documents in its index compared to Swoogle 2005's 340K. The documents include about 700K RDF documents and 140K HTML documents with embedded RDF.
- Better ranking. Swoogle 2006 uses the improved ranking algorithms reported in our ISWC 2005 paper.
- Better crawling. Swoogle 2006 now does a better job of crawling new URLs, including those submitted by people.
- Web services. Swoogle 2006 exposes a set of 17 web services, currently with simple CGI interfaces that return their results as RDF graphs. Using the web services requires a key, so we can track usage and possible abuse.
- RDF output. All query results, whether via a web service call or through the browser interface, are available in RDF. For browser-based queries, look for the RDF VERSION link in the upper left corner of the page.
- Simpler interface. The human web interface is simpler and cleaner.
- Cache and archive. Swoogle 2006 maintains a cache of the SW documents it finds and also keeps copies of older versions in its Semantic Web Archive.
- Registered user services. Swoogle 2006 has a better system for user accounts that includes a CAPTCHA to keep out spambots. Anonymous users see only a limited number of query results, whereas registered users can see them all.
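
As a rough illustration of the web services and RDF output items above, the sketch below shows how a client might build a keyed, CGI-style request and pull resource URIs out of an RDF/XML result. The endpoint URL, the parameter names (`queryType`, `searchString`, `key`) and the sample result are assumptions made up for this example; the announcement does not specify the actual service interface.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical endpoint: the announcement does not give the real service URL.
BASE_URL = "http://swoogle.example.org/q"


def build_query_url(query_type, search_string, key):
    """Build a CGI-style GET URL. All parameter names here are illustrative."""
    params = {"queryType": query_type, "searchString": search_string, "key": key}
    return BASE_URL + "?" + urlencode(params)


def described_resources(rdf_xml):
    """Return the rdf:about URIs of every element in an RDF/XML document."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % RDF_NS
    return [el.attrib[about] for el in root.iter() if about in el.attrib]


# Made-up result document standing in for what a service might return.
SAMPLE_RESULT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <ex:Result rdf:about="http://example.org/onto/a.owl"/>
  <ex:Result rdf:about="http://example.org/onto/b.owl"/>
</rdf:RDF>"""

if __name__ == "__main__":
    print(build_query_url("search_swd", "wine", "demo-key"))
    for uri in described_resources(SAMPLE_RESULT):
        print(uri)
```

The standard-library XML parser is used only to keep the sketch self-contained; a full RDF parser would be a better fit for results that use other RDF/XML constructs such as `rdf:nodeID` or nested descriptions.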
Some of the Swoogle 2005 features currently missing from Swoogle 2006 are the shopping cart and triple shop, the ontology dictionary, Swoogle statistics, and Swoogle's top ten. We plan to add these back into Swoogle 2006 over the next few months. Send any comments to swoogle-developers at ebiquity.umbc.edu.
For more information, please contact UMBC ebiquity.