Solvent helps extract data on web pages and materialize it as RDF
Tim Finin, 1:00pm 1 March 2007
Wingerz has written a tutorial post on Solvent, a tool developed by the MIT Simile group. It’s a Firefox extension that helps generate JavaScript code to ‘screen scrape’ data from websites. The scraped data is then encoded as RDF.
“There is a lot of structured data in web pages. While this data is usually backed by structured storage of some sort, a lot of the semantics of the data are lost by the time the page is rendered in the web browser. Simile Solvent allows you to capture data from web pages as RDF, a common data representation, allowing data consumers to explore the data on their own terms. But even if you have no interest in RDF (or the Semantic Web) you can still use these tools to generate something you’re more familiar with (like a spreadsheet). Of course, if you are interested in RDF (or learning more about RDF) this can be a great way to get yourself some data to play with. More on that in a later post. … ”
Solvent is designed to work with Piggy Bank, another MIT Firefox extension, which manages the RDF data found on or extracted from web pages as you browse. You can download the software from the MIT Solvent page.
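To make the scrape-then-encode idea concrete, here is a minimal Python sketch of the general pattern. It is not the JavaScript that Solvent generates, and it does not use any Solvent or Piggy Bank API. The sample HTML, the example.org URLs, the LinkScraper class, the helper function names and the choice of the Dublin Core title property are all assumptions made for illustration. It pulls link targets and link text out of a page fragment and writes each pair as an N-Triples statement.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would fetch this from a site.
HTML = """
<ul>
  <li class="book"><a href="http://example.org/books/1">Weaving the Web</a></li>
  <li class="book"><a href="http://example.org/books/2">A Semantic Web Primer</a></li>
</ul>
"""

# Dublin Core "title" property, chosen here only as a familiar predicate.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


class LinkScraper(HTMLParser):
    """Collect (href, link text) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(links):
    """Turn (href, title) pairs into N-Triples lines."""
    return [
        f'<{href}> <{DC_TITLE}> "{escape_literal(title)}" .'
        for href, title in links
    ]


def scrape_to_ntriples(html):
    """Scrape the links out of an HTML string and encode them as N-Triples."""
    scraper = LinkScraper()
    scraper.feed(html)
    scraper.close()
    return to_ntriples(scraper.links)


for line in scrape_to_ntriples(HTML):
    print(line)
```

Running it prints one triple per link. The same (href, title) pairs could just as easily be written out as CSV rows if a spreadsheet is more useful to you than RDF.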

