An Information-Theoretic Approach to Interest Matching
The Internet has given new meaning to the term "community". Geography is no longer a barrier to international communication. However, the paradigm for meeting new and interesting people remains entrenched in traditional means: on the Internet, it still relies on chance and existing contacts. This thesis explores a new approach to matching users in online communities effectively. Instead of using the conventional feature-vector scheme to profile users, each user is represented by a personalized concept hierarchy (or ontology) that is learnt from the user's behavior in the system. Each concept hierarchy is then interpreted within the framework of Information Theory as a probabilistic decision tree. The matching algorithm uses the Kullback-Leibler distance as a measure of deviation between two probabilistic decision trees. Thus, in an online community where each user is represented by a personalized concept hierarchy, the Kullback-Leibler distance imposes a full-order rank on the similarity of all other users with respect to a particular user in question. The validity and utility of the proposed matching scheme are then evaluated in a set of simulations, using the feature-vector-overlap measure as a baseline. The results of the simulations show that the Kullback-Leibler distance, when used in conjunction with the concept hierarchy, is more robust to noise and classifies users into similar groups more strongly and distinctively than the conventional keyword-overlap scheme. A graphical agent system that relies upon the ontology-based interest matching algorithm, called the Collaborative Sanctioning Network, is also described in this thesis.
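The matching idea summarized above can be illustrated with a minimal sketch. It is not the implementation described in the thesis. It assumes a hypothetical representation in which each concept hierarchy is a nested dictionary of the form {concept: (weight, children)}, that branch weights are normalized into probabilities at each node, and that additive smoothing (the `epsilon` parameter, an assumption introduced here) keeps the Kullback-Leibler distance finite when two users do not share a leaf. The names `leaf_distribution`, `kl_distance`, and `rank_users` are illustrative only.

```python
import math


def leaf_distribution(tree, prefix=(), mass=1.0):
    """Flatten a weighted concept hierarchy into {leaf_path: probability}.

    Assumed format: {concept: (weight, children)}, where children is another
    dictionary of the same shape (empty for a leaf).
    """
    total = sum(weight for weight, _ in tree.values())
    leaves = {}
    for concept, (weight, children) in tree.items():
        path = prefix + (concept,)
        share = mass * weight / total  # probability of reaching this node
        if children:
            leaves.update(leaf_distribution(children, path, share))
        else:
            leaves[path] = share
    return leaves


def kl_distance(p, q, epsilon=1e-6):
    """Kullback-Leibler distance D(p || q) in bits.

    Both distributions are smoothed over the union of their leaves so that
    a leaf missing from q does not produce a division by zero.
    """
    support = set(p) | set(q)
    p_total = sum(p.get(k, 0.0) + epsilon for k in support)
    q_total = sum(q.get(k, 0.0) + epsilon for k in support)
    distance = 0.0
    for k in support:
        pk = (p.get(k, 0.0) + epsilon) / p_total
        qk = (q.get(k, 0.0) + epsilon) / q_total
        distance += pk * math.log2(pk / qk)
    return distance


def rank_users(target, others):
    """Order the other users from most to least similar to the target."""
    target_dist = leaf_distribution(target)
    scored = [
        (kl_distance(target_dist, leaf_distribution(tree)), name)
        for name, tree in others.items()
    ]
    return [name for _, name in sorted(scored)]


# Hypothetical users; the weights stand in for observed interest counts.
alice = {"music": (3, {"jazz": (2, {}), "rock": (1, {})}), "sports": (1, {})}
bob = {"music": (3, {"jazz": (1, {}), "rock": (2, {})}), "sports": (1, {})}
carol = {"sports": (4, {"tennis": (1, {})}), "music": (1, {})}

ranking = rank_users(alice, {"bob": bob, "carol": carol})
print(ranking)
```

In this sketch the distance is asymmetric, so ranking everyone relative to one user (as in the abstract) is the natural use. Treating each root-to-leaf path as an outcome is a simplification: two users who share a parent concept but differ in how deeply they refine it are counted as disjoint at that branch, which a fuller treatment of the hierarchy would likely handle differently.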