Thumbnail Image

Publication or External Link





Rapid growth and popularity of internet based services such as online markets and online advertisement systems provide a lot of new algorithmic challenges. One of the main challenges is the limited access to the input. There are two main reasons that algorithms have limited data accessibility.

  1. The input is extremely large, and hence having access to the whole data at once is not practical.

  2. The nature of the system forces us to make decisions before observing the whole input.

Internet-enabled marketplaces such as Amazon and eBay deal with huge datasets registering transaction of merchandises between lots of buyers and sellers. It is important that algorithms become more time and space efficient as the size of datasets increase. An algorithm that runs in polynomial time may not have a reasonable running time for such large datasets. In the first part of this dissertation, we study the development of allocation algorithms that are appropriate for use with massive datasets. We especially focus on the streaming setting which is a common model for big data analysis. In the graph streaming, the algorithm has access to a sequence of edges, called a stream. The algorithm reads edges in the order in which they appear in the stream. The goal is to design an algorithm that maintains a large allocation, using as little space as possible. We achieve our results by developing powerful sampling techniques. Indeed, one can implement our sampling techniques in the streaming setting as well as other distributed settings such as MapReduce.

Giant online advertisement markets such as Google, Bing and Facebook raised up several interesting allocation problems. Usually, in these applications, we need to make the decision before obtaining the full information of the input graph. This enforces an uncertainty on our belief about the input, and thus makes the classical algorithms inapplicable. To address this shortcoming online algorithms have been developed. In online algorithms again the input is a sequence of items. Here the algorithm needs to make an irrevocable decision upon arrival of each item. In the second part of this dissertation, we aim to achieve two main goals for each allocation problem in the market. Our first goal is to design models to capture the uncertainty of the input based on the properties of problems and the accessible data in real applications. Our second goal is to design algorithms and develop new techniques for these market allocation problems.