Swoogle 2006 released
BALTIMORE, Wednesday, February 01, 2006
Swoogle 2006 is a major
new version of Swoogle, a search engine for the Semantic Web.
Swoogle helps knowledge engineers and software agents find
knowledge on the web encoded in the semantic web languages RDF
and OWL. It crawls the Web looking for documents that consist
of RDF or have embedded RDF within them. Using Swoogle, people
and agents can discover Semantic Web ontologies, terms and data.
Swoogle 2006 is a nearly complete rewrite of Swoogle Classic, which
now answers to Swoogle 2005. While Swoogle 2006 is currently missing
some of Swoogle 2005's features, it enjoys a cleaner and simpler
model and foundation. We will be adding in some of these
features as well as new ones over the next few months. Here are
some of Swoogle 2006's highlights:
- New hardware. Swoogle 2006 is running on a set of
three machines: EB2 is a two-processor Sun v20z with 4 GB of memory
that runs the crawler, DBMS and development web interfaces; LOGOS
is an IBM eServer that runs the production web interfaces; and NATRAJ
is the file server for the SW cache and archive.
- More data. Swoogle 2006 has over 850K documents in its
index compared to Swoogle 2005's 340K. The documents include
about 700K RDF documents and 140K HTML documents with embedded RDF.
- Better ranking. Swoogle 2006 uses the improved ranking
algorithms reported on in our ISWC 2005 paper.
- Better crawling. Swoogle 2006 now does a better job of
crawling new URLs, including those submitted by people.
- Web services. Swoogle 2006 exposes a set of 17 web
services, currently with simple CGI interfaces that return their
results as RDF graphs. Using the web services requires
a key, so we can track usage and possible abuse.
- RDF output. All query results, whether via a web
service call or through the browser interface, are available in
RDF. For browser-based queries, look for the RDF VERSION link in
the upper left corner of the page.
- Simpler interface. The human web interface is simpler.
- Cache and archive. Swoogle 2006 maintains a cache of
the SW documents it finds and also keeps copies of older versions
in its Semantic Web Archive.
- Registered user services. Swoogle 2006 has a better
system for user accounts that includes a CAPTCHA to keep out
spambots. Anonymous users see only a limited number of query
results, whereas registered users can see them all.
Some of the Swoogle 2005 features currently missing from Swoogle
2006 are the shopping cart and triple shop; the ontology
dictionary; Swoogle statistics; and Swoogle's top ten. We plan to
add these back into Swoogle 2006 over the next few months. Send
any comments to swoogle-developers at ebiquity.umbc.edu.
Web Site: http://swoogle.umbc.edu/