CSAIL Digital Archive
http://hdl.handle.net/1721.1/29806
Fri, 10 Jun 2016 12:42:15 GMT
http://hdl.handle.net/1721.1/102796
Towards Practical Theory: Bayesian Optimization and Optimal Exploration
Kawaguchi, Kenji
This thesis discusses novel principles to improve the theoretical analyses of a class of methods, aiming to provide theoretically driven yet practically useful methods. The thesis focuses on a class of methods, called bound-based search, which includes several planning algorithms (e.g., the A* algorithm and the UCT algorithm), several optimization methods (e.g., Bayesian optimization and Lipschitz optimization), and some learning algorithms (e.g., PAC-MDP algorithms). For Bayesian optimization, this work solves an open problem and achieves an exponential convergence rate. For learning algorithms, this thesis proposes a new analysis framework, called PAC-RMDP, and improves the previous theoretical bounds. The PAC-RMDP framework also provides a unifying view of some previous near-Bayes optimal and PAC-MDP algorithms. All proposed algorithms derived on the basis of the new principles produced competitive results in our numerical experiments with standard benchmark tests.
SM thesis
Thu, 26 May 2016 00:00:00 GMT
http://hdl.handle.net/1721.1/102665
Deep Learning without Poor Local Minima
Kawaguchi, Kenji
In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015. For an expected loss function of a deep nonlinear neural network, we prove the following statements under the independence assumption adopted from recent work: 1) the function is non-convex and non-concave, 2) every local minimum is a global minimum, 3) every critical point that is not a global minimum is a saddle point, and 4) the property of saddle points differs for shallow networks (with three layers) and deeper networks (with more than three layers). Moreover, we prove that the same four statements hold for deep linear neural networks with any depth, any widths and no unrealistic assumptions. As a result, we present an instance for which we can answer the following question: how difficult is it to directly train a deep model in theory? It is more difficult than training classical machine learning models (because of the non-convexity), but not too difficult (because of the nonexistence of poor local minima and the property of the saddle points). We note that even though we have advanced the theoretical foundations of deep learning, there is still a gap between theory and practice.
Mon, 23 May 2016 00:00:00 GMT
http://hdl.handle.net/1721.1/101636
Delphi: A Software Controller for Mobile Network Selection
Deng, Shuo; Sivaraman, Anirudh; Balakrishnan, Hari
This paper presents Delphi, a mobile software controller that helps applications select the best network among available choices for their data transfers. Delphi optimizes a specified objective such as transfer completion time, or energy per byte transferred, or the monetary cost of a transfer. It has four components: a performance predictor that uses features gathered by a network monitor, and a traffic profiler to estimate transfer sizes near the start of a transfer, all fed into a network selector that uses the prediction and transfer size estimate to optimize an objective. For each transfer, Delphi either recommends the best single network to use, or recommends Multi-Path TCP (MPTCP), but crucially selects the network for MPTCP's primary subflow. The choice of primary subflow has a strong impact on the transfer completion time, especially for short transfers. We designed and implemented Delphi in Linux. It requires no application modifications. Our evaluation shows that Delphi reduces application network transfer time by 46% for Web browsing and by 49% for video streaming, compared with Android's default policy of always using Wi-Fi when it is available. Delphi can also be configured to achieve high throughput while being battery-efficient: in this configuration, it achieves 1.9x the throughput of Android's default policy while only consuming 6% more energy.
Thu, 25 Feb 2016 00:00:00 GMT
http://hdl.handle.net/1721.1/101211
An Analysis of the Search Spaces for Generate and Validate Patch Generation Systems
Long, Fan; Rinard, Martin
We present the first systematic analysis of the characteristics of patch search spaces for automatic patch generation systems. We analyze the search spaces of two current state-of-the-art systems, SPR and Prophet, with 16 different search space configurations. Our results are derived from an analysis of 1104 different search spaces and 768 patch generation executions. Together these experiments consumed over 9000 hours of CPU time on Amazon EC2. The analysis shows that 1) correct patches are sparse in the search spaces (typically at most one correct patch per search space per defect), 2) incorrect patches that nevertheless pass all of the test cases in the validation test suite are typically orders of magnitude more abundant, and 3) leveraging information other than the test suite is therefore critical for enabling the system to successfully isolate correct patches. We also characterize a key tradeoff in the structure of the search spaces. Larger and richer search spaces that contain correct patches for more defects can actually cause systems to find fewer, not more, correct patches. We identify two reasons for this phenomenon: 1) increased validation times because of the presence of more candidate patches and 2) more incorrect patches that pass the test suite and block the discovery of correct patches. These fundamental properties, which are all characterized for the first time in this paper, help explain why past systems often fail to generate correct patches and help identify challenges, opportunities, and productive future directions for the field.
Thu, 18 Feb 2016 00:00:00 GMT