Theses and Dissertations from UMD

Permanent URI for this communityhttp://hdl.handle.net/1903/2

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4 month delay in the appearance of a give thesis/dissertation in DRUM

More information is available at Theses and Dissertations at University of Maryland Libraries.

Browse

Search Results

Now showing 1 - 2 of 2
  • Thumbnail Image
    Item
    COMPUTING APPROXIMATE CUSTOMIZED RANKING
    (2009) Wu, Yao; Raschid, Louiqa; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    As the amount of information grows and as users become more sophisticated, ranking techniques become important building blocks to meet user needs when answering queries. PageRank is one of the most successful link-based ranking methods, which iteratively computes the importance scores for web pages based on the importance scores of incoming pages. Due to its success, PageRank has been applied in a number of applications that require customization. We address the scalability challenges for two types of customized ranking. The first challenge is to compute the ranking of a subgraph. Various Web applications focus on identifying a subgraph, such as focused crawlers and localized search engines. The second challenge is to compute online personalized ranking. Personalized search improves the quality of search results for each user. The user needs are represented by a personalized set of pages or personalized link importance in an entity relationship graph. This requires an efficient online computation. To solve the subgraph ranking problem efficiently, we estimate the ranking scores for a subgraph. We propose a framework of an exact solution (IdealRank) and an approximate solution (ApproxRank) for computing ranking on a subgraph. Both IdealRank and ApproxRank represent the set of external pages with an external node $\Lambda$ and modify the PageRank-style transition matrix with respect to $\Lambda$. The IdealRank algorithm assumes that the scores of external pages are known. We prove that the IdealRank scores for pages in the subgraph converge to the true PageRank scores. Since the PageRank-style scores of external pages may not typically be available, we propose the ApproxRank algorithm to estimate scores for the subgraph. We analyze the $L_1$ distance between IdealRank scores and ApproxRank scores of the subgraph and show that it is within a constant factor of the $L_1$ distance of the external pages. We demonstrate with real and synthetic data that ApproxRank provides a good approximation to PageRank for a variety of subgraphs. We consider online personalization using ObjectRank; it is an authority flow based ranking for entity relationship graphs. We formalize the concept of an aggregate surfer on a data graph; the surfer's behavior is controlled by multiple personalized rankings. We prove a linearity theorem over these rankings which can be used as a tool to scale this type of personalization. DataApprox uses a repository of precomputed rankings for a given set of link weights assignments. We define DataApprox as an optimization problem; it selects a subset of the precomputed rankings from the repository and produce a weighted combination of these rankings. We analyze the $L_1$ distance between the DataApprox scores and the real authority flow ranking scores and show that DataApprox has a theoretical bound. Our experiments on the DBLP data graph show that DataApprox performs well in practice and allows fast and accurate personalized authority flow ranking.
  • Thumbnail Image
    Item
    My Mobile Music: An Adaptive Personalization System For Digital Audio Players
    (2007-07-30) Chung, Tuck Siong; Rust, Roland T.; Wedel, Michel; Business and Management: Marketing; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This paper develops a music recommendation system that automates the downloading of songs into a mobile digital audio device. The system tailors the composition of the songs to the preferences of individuals based on past behaviors. By assuming that an individual will listen longer to a song that provides a higher utility, we describe and predict individual listening behavior using a lognormal hazard function. Our recommendation system is the first to accomplish this and there is no viable alternative. Yet, our proposed approach provides an improvement over naïve methods that could be used for product recommendations. Our system has a number of distinct features. First, we use of a Sequential Monte Carlo algorithm that enables the system to deal with massive historic datasets on listening behavior of individuals. Second, we apply a variable selection procedure that helps to reduce the dimensionality of the problem, because in many applications the collection of songs need to be described by a very large number of explanatory variables (in particular music genres variables). Third, our system recommends a batch of products rather than a single product, taking into account the predicted utility and the uncertainty in the parameter estimates, and applying experimental design methods. The simulation section of this paper demonstrated that our model does achieve it objectives in handling massive data and improving predictions through model averaging. By using simulated data in the simulation, and thus knowing the true parameters, the Sequential Monte Carlo and variable selection procedures were shown to provide good estimates of an individual's preferences. Experimental results show that variable selection does simplify estimation and prediction as different individuals differ in the number of variables need to definite their listening behaviors. The results also show that for some individuals, model averaging does in fact help to improve predictions. The results of the experiment show that our model provides 23 - 35% improvement in recommendations. This improvement is achieved in a single wave and in a natural experimental setting in which the subjects have a choice or when, where and how they want to listen to the songs.