UMBC ebiquity research group Building intelligent systems in open, heterogeneous, dynamic, distributed environments
16 May 2008, 20:39:25 EDT  
Stress test your RDF triple store

By Tim Finin on Thursday, June 16th, 2005 at 12:30 pm.

A colleague has been testing the scalability of a triple store using synthetic triples. He asked if we could package up a large collection of real triples caught in the wild by Swoogle. After talking it over, we decided that a simple SQL database dump would be the most convenient form.

10M Triples is an SQL database dump containing a table of about 10.4M RDF triples extracted from the Swoogle cache on June 15, 2005. The compressed file is 162 MB; uncompressed, it is 1.7 GB.
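
To give a feel for what a table of triples lets you do, here is a minimal sketch in Python. It does not use the actual dump: the post does not describe the table layout, so the `triples` table, its `subject`/`predicate`/`object` columns, the sample rows, and the helper function names are all assumptions made up for illustration, and an in-memory SQLite database stands in for whatever database you load the dump into.

```python
import sqlite3


def make_demo_store():
    """Build a tiny in-memory stand-in for a triples table.

    The schema and rows are hypothetical; the real dump's layout may differ.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)"
    )
    rows = [
        ("ex:alice", "rdf:type", "foaf:Person"),
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:bob", "rdf:type", "foaf:Person"),
        ("ex:bob", "foaf:name", "Bob"),
        ("ex:doc1", "dc:creator", "ex:alice"),
    ]
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def predicate_counts(conn):
    """Count triples per predicate, most frequent first (ties by name)."""
    return conn.execute(
        "SELECT predicate, COUNT(*) AS n FROM triples "
        "GROUP BY predicate ORDER BY n DESC, predicate"
    ).fetchall()


def subset_by_predicate(conn, predicate):
    """Select the (subject, object) pairs for a single predicate."""
    return conn.execute(
        "SELECT subject, object FROM triples "
        "WHERE predicate = ? ORDER BY subject",
        (predicate,),
    ).fetchall()


def iter_chunks(conn, chunk_size):
    """Yield the whole table in fixed-size chunks, in insertion order."""
    cursor = conn.execute(
        "SELECT subject, predicate, object FROM triples ORDER BY rowid"
    )
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield chunk


if __name__ == "__main__":
    store = make_demo_store()
    print(predicate_counts(store))
    print(subset_by_predicate(store, "rdf:type"))
    for chunk in iter_chunks(store, 2):
        print(len(chunk), "triples in this chunk")
```

With the real dump loaded, the same three patterns (grouping, filtering, and chunked reads) would apply once the table and column names are adjusted to match it.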

3 Responses to “Stress test your RDF triple store”

  1. Morten Frederiksen Says:

    Hi,

    I get a 403 when trying to download the compressed file at the linked URI.

  2. tim finin Says:

    Thanks for pointing this out. We’ve fixed it.

  3. Tim Finin Says:

    crschmidt asked “Why would someone dump something as SQL, and not simply serialize to RDF?”, which is a good question. Shouldn’t we be serving up dog food? As the post mentions, we made it available as an SQL dump by request. But it also made sense to me, since an SQL dump makes it somewhat more convenient to select subsets of the triples, break up the dataset into chunks, devise queries to explore the structure, etc.
