
Project: Product Profiler

Data mining techniques typically do not take into account the semantic features inherent in the data being “mined”. In most data mining applications, a large amount of transactional data is analyzed without a systematic method for “understanding” the items in the transactions or what they say about the customers who purchased them. Most algorithms used to analyze transaction records from retail stores treat the items in a market basket as opaque objects and represent them as categorical values with no associated semantics. The semantics of a particular domain are injected into the data mining process during feature engineering and when interpreting the results, both of which are costly and require a lot of human effort. In many domains, however, semantic information is implicitly available and can be extracted automatically. In this project, we develop a system that extracts semantic features for apparel products and populates a knowledge base with these products and features. We show that semantic features of these items can be successfully extracted by applying text learning techniques to the product names and descriptions obtained from retailers' websites. We also build several applications on top of this knowledge base of product semantics, including recommender systems and competitive intelligence.
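To make the extraction step concrete, the following is a minimal sketch of inferring one semantic attribute from product descriptions and storing it in a knowledge base. It assumes a multinomial naive Bayes classifier as the text learning technique, which the description above does not specify. The `formality` attribute, its values, and all example descriptions are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled examples: (product description, value of an
# assumed "formality" attribute). Not taken from the project's data.
TRAINING = [
    ("classic wool blazer with notch lapel", "formal"),
    ("silk tie for business wear", "formal"),
    ("tailored dress shirt with french cuffs", "formal"),
    ("relaxed cotton t-shirt with graphic print", "casual"),
    ("faded denim jeans with relaxed fit", "casual"),
    ("fleece hoodie for weekend wear", "casual"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count classes and per-class word occurrences for naive Bayes."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the most probable attribute value for a description."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior plus log likelihoods with add-one (Laplace) smoothing.
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)

# Populate a toy knowledge base: product name -> inferred attributes.
knowledge_base = {
    name: {"formality": predict(model, name)}
    for name in [
        "wool blazer with silk lining",
        "cotton hoodie with graphic print",
    ]
}
```

With attributes like this attached to each item, a market basket can be described by what its products mean rather than only by opaque item identifiers.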

People:

Andrew Fano
Rayid Ghani

Papers:

Text Mining for Product Attribute Extraction.
Rayid Ghani, Katharina Probst, Yan Liu, Marko Krema, Andrew Fano.
SIGKDD Explorations. Vol 8. Issue 1. 2006

Mining the Web to Add Semantics to Retail Data Mining
R. Ghani
Invited Paper. Web Mining: From Web to Semantic Web.
Springer Lecture Notes in Artificial Intelligence, Vol. 3209. Berendt, B.; Hotho, A.; Mladenic, D.; van Someren, M.; Spiliopoulou, M.; Stumme, G. (Eds.)
2004

Using Text Mining to Infer Semantic Attributes for Retail Data Mining
Rayid Ghani and Andrew E. Fano
IEEE International Conference on Data Mining
December 9-12, 2002, Maebashi, Japan

Building Recommender Systems Using a Knowledge Base of Product Semantics
Rayid Ghani and Andrew Fano
Workshop on Recommendation and Personalization in ECommerce (RPEC 2002)
Malaga, Spain