On the Semantic Web, universities do ontologies, companies do data
By Tim Finin on Monday, April 24th, 2006 at 1:00 pm.
Here’s an interesting figure from Li Ding’s dissertation on Semantic Web Search. It shows the distribution across various Internet top-level domains of (1) the sites that Swoogle has crawled, (2) the ontology documents that Swoogle has discovered, and (3) all the Semantic Web documents it has discovered.
The “pure SWDs” are RDF documents in some form (e.g., XML, N3), excluding XHTML documents with embedded RDF. Swoogle considers a Semantic Web document to be an ontology (a SWO in Swoogle-speak) if a significant fraction of its triples are involved in defining terms as opposed to making assertions about individuals. What counts as a “significant fraction” has changed over time, and I’m not sure what the current value is. Either way, Swoogle considers only about 1% of the Semantic Web documents it has found to be ontologies.
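To make the "significant fraction" idea concrete, here is a minimal sketch of how such a ratio could be computed. It is not Swoogle's actual code: the function names, the short list of "defining" RDFS/OWL terms, and the representation of triples as plain string tuples are all assumptions made for illustration. The threshold is left as a required argument because the value Swoogle uses is not given here.

```python
# Illustrative sketch only; not Swoogle's implementation.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Assumed, partial list of predicates that relate one term definition to another.
DEFINING_PREDICATES = {
    RDFS + "subClassOf",
    RDFS + "subPropertyOf",
    RDFS + "domain",
    RDFS + "range",
}

# Assumed, partial list of classes whose instances are terms (classes, properties).
DEFINING_CLASSES = {
    RDFS + "Class",
    OWL + "Class",
    RDF_PROPERTY,
    OWL + "ObjectProperty",
    OWL + "DatatypeProperty",
}


def is_defining(triple):
    """Return True if a (subject, predicate, object) triple helps define a term."""
    _subject, predicate, obj = triple
    if predicate in DEFINING_PREDICATES:
        return True
    return predicate == RDF_TYPE and obj in DEFINING_CLASSES


def ontology_ratio(triples):
    """Fraction of triples that define terms; 0.0 for an empty document."""
    triples = list(triples)
    if not triples:
        return 0.0
    return sum(1 for t in triples if is_defining(t)) / len(triples)


def looks_like_ontology(triples, threshold):
    """Apply a caller-supplied threshold (a hypothetical parameter) to the ratio."""
    return ontology_ratio(triples) >= threshold
```

Under these assumptions, a document with two class definitions and one statement about an individual scores 2/3, while a FOAF-style file that only describes people scores 0.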
Note that .edu sites publish 40% of the ontologies, .org sites 20% and .com sites 10%. Of course, many of those .edu ontologies are probably from student projects of one kind or another. When we look at all Semantic Web documents (pure SWDs), the .com sites dominate, publishing over 40% of the files.
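For readers who want to reproduce this kind of breakdown on their own crawl, the tally can be sketched as follows. This is a rough illustration, not how the figure was produced: the function names are hypothetical, and the input is assumed to be a list of absolute document URLs.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_level_domain(url):
    """Return the last label of the URL's host name, or None if there is none."""
    host = urlsplit(url).hostname
    if not host or "." not in host:
        return None
    # Note: a numeric IP address would yield its last octet; filter those out
    # beforehand if the input may contain them.
    return host.rsplit(".", 1)[-1]


def tld_shares(urls):
    """Map each top-level domain to its share of the given document URLs."""
    counts = Counter(
        tld for tld in map(top_level_domain, urls) if tld is not None
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tld: n / total for tld, n in counts.items()}
```

Running `tld_shares` once over the ontology URLs and once over all Semantic Web document URLs would give the two distributions being compared.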

