UMBC ebiquity

Web/Data Mining and Personalization

Status: Past project

Project Description:

The evolution of the Internet into the Global Information Infrastructure, coupled with the immense popularity of the Web, has enabled the ordinary citizen to become not just a consumer of information but also its disseminator. The Web, then, is becoming the proverbial Vox Populi. Given this vast and ever-growing amount of information, how does the average user quickly find what s/he is looking for? Present-day search engines do not seem to help much with this task.

One possible approach is to personalize the web space: create a system that responds to user queries by potentially aggregating information from several sources in a manner that depends on who the user is.

Existing commercial systems seek to do some minimal personalization based on declarative information directly provided by the user, such as their zip code, keywords describing their interests, specific URLs, or even particular pieces of information they are interested in (e.g., the price of a particular stock). Our research aims at creating systems that (semi-)automatically tailor the content delivered to the user from a web site. We do so by mining the web: both its contents and the users' interactions with it.

Web Mining and Personalization requires modeling an unknown number of overlapping sets in the presence of significant noise and outliers (i.e., bad exemplars). Moreover, the data sets in Web Mining are extremely large. The aim of our research is to develop scalable, robust fuzzy techniques to model noisy data sets containing an unknown number of overlapping categories. Specifically, in this work we are:

  1. Developing new scalable robust fuzzy clustering techniques for modeling data
  2. Exploring new techniques to handle linguistic and textual features
  3. Validating our techniques by creating prototype web mining and personalization systems
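
To make the idea of robust fuzzy clustering concrete, here is a minimal sketch of one well-known approach: fuzzy c-means extended with a "noise cluster" that sits at a constant distance from every point. It is a stand-in chosen for illustration, not the algorithms developed in this project. The function names, the `delta` noise distance, the fuzzifier `m`, and the toy data are all assumptions made for this example.

```python
def memberships(points, centers, delta, m):
    """Fuzzy memberships of each point in each real cluster.

    A hypothetical noise cluster lies at constant distance `delta` from
    every point, so far-away outliers get low membership everywhere.
    """
    exp = 1.0 / (m - 1.0)
    noise = (1.0 / delta ** 2) ** exp
    rows = []
    for p in points:
        inv = []
        for c in centers:
            # Squared Euclidean distance; the small constant avoids division by zero.
            d2 = sum((a - b) ** 2 for a, b in zip(p, c)) + 1e-12
            inv.append((1.0 / d2) ** exp)
        total = sum(inv) + noise
        rows.append([w / total for w in inv])
    return rows


def noise_fuzzy_cmeans(points, centers, delta=3.0, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    dim = len(points[0])
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        u = memberships(points, centers, delta, m)
        updated = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]
            total = sum(w)
            updated.append(tuple(
                sum(wk * p[d] for wk, p in zip(w, points)) / total
                for d in range(dim)
            ))
        centers = updated
    return centers, memberships(points, centers, delta, m)


# Toy data (assumed): two tight groups plus one gross outlier at the end.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, -40)]
centers, u = noise_fuzzy_cmeans(points, [(1.0, 1.0), (9.0, 9.0)])
print("centers:", centers)
print("outlier memberships:", u[-1])
```

In this sketch each point's memberships in the real clusters need not sum to one; the remainder belongs to the noise cluster. The outlier therefore has little influence on the cluster centers, which is the sense in which the method is robust. Note that the number of clusters is fixed in advance here, whereas the project targets an unknown number of categories.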

Start Date: September 1999

End Date: May 2001

Anupam Joshi

Tapan Kamdar


There are 4 associated publications:

1 Refereed Publication


1. Olfa Nasraoui et al., "Relational Clustering Based on a New Robust Estimator with Application to Web Mining", InProceedings, North American Fuzzy Info. Proc. Society (NAFIPS 99), October 1999, 2739 downloads.

3 Non-Refereed Publications


1. Tapan Kamdar et al., "On Creating Adaptive Web Servers Using Weblog Mining", TechReport, University of Maryland, Baltimore County, November 2000, 3284 downloads.

2. Zhihua Jiang et al., "Retriever: Improving Web Search Engine Results Using Clustering", TechReport, University of Maryland, Baltimore County, October 2000, 2946 downloads.


3. Anupam Joshi et al., "On Mining Web Access Logs", TechReport, University of Maryland, Baltimore County, October 1999, 3293 downloads.


There are 0 associated resources.


Research Areas:
 Data Mining
 Web based information systems