Exploring user-centric clickstream data for personalization

Speaker: Jiye Li

Clickstream data collected across multiple websites (user-centric data) captures users' browsing behaviors, interests and preferences more fully than clickstream data collected from an individual website (site-centric data). For example, we would expect to model and predict a user's intentions better if we know that the user both searched on Google and visited certain shopping websites than if we know only one of these facts. Current research on clickstream data analysis is mostly centered on site-centric data, and traditional techniques such as web log analysis are limited in how effectively they can model user-centric data. In this talk, we discuss the differences between user-centric data and site-centric data. We present and compare the performance of two classifiers, one built from each type of clickstream data. Our initial results show that the user-centric classifier achieves higher classification precision and recall. Such classifiers could potentially be used in personalization applications.
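
As an illustration only, the contrast between the two kinds of data can be sketched in a few lines of Python. Everything below is an assumption made for the example and is not taken from the talk: the click log, the site names, and the helper functions site_centric_view, user_centric_view and precision_recall are all hypothetical. The sketch shows how two users who look identical to a single website become distinguishable once their activity across websites is combined, and how precision and recall would be computed for a binary classifier.

```python
from collections import defaultdict

# Hypothetical click log: (user_id, site, page_category). Invented for illustration.
CLICKS = [
    ("u1", "search.example", "search"),
    ("u1", "shop.example", "product"),
    ("u2", "search.example", "search"),
    ("u2", "news.example", "article"),
    ("u3", "shop.example", "product"),
]


def site_centric_view(clicks, site):
    """Per-user category counts as seen by a single website."""
    view = defaultdict(lambda: defaultdict(int))
    for user, click_site, category in clicks:
        if click_site == site:
            view[user][category] += 1
    return {user: dict(counts) for user, counts in view.items()}


def user_centric_view(clicks):
    """Per-user counts of (site, category) pairs across all websites."""
    view = defaultdict(lambda: defaultdict(int))
    for user, click_site, category in clicks:
        view[user][(click_site, category)] += 1
    return {user: dict(counts) for user, counts in view.items()}


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (truthy = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # The search site alone cannot tell u1 and u2 apart.
    print(site_centric_view(CLICKS, "search.example"))
    # Across sites, u1 also visited a shop while u2 read news.
    print(user_centric_view(CLICKS))
```

In this toy log, the search site sees the same single search click from u1 and u2, whereas the combined view records that only u1 went on to a shopping site. Feature vectors built from the combined view are the kind of input a user-centric classifier would work from.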