Center for Brains, Minds & Machines
https://hdl.handle.net/1721.1/88529
2019-10-14T03:17:05Z

Technical Report: Building a Neural Ensemble Decoder by Extracting Features Shared Across Multiple Populations
https://hdl.handle.net/1721.1/122041
2019-09-05T17:13:40Z
2019-09-05T00:00:00Z
Chang, Chia-Jung
To understand whether and how a given population of neurons represents behaviorally relevant variables, researchers build neural ensemble decoders to extract information from the recorded activity. Among the different ways to decode neural ensemble activity, the parametric approach requires assumptions about the spiking distribution and an underlying encoding model, which poses challenges for neurons with nonlinear, multi-modal, and complex receptive fields. Alternatively, the non-parametric framework assumes no explicit probability distribution and discovers patterns from the data in an unbiased way, and thus training a machine learning model as a decoder has gained popularity in the field. However, machine learning models require a sufficiently large dataset, yet the data size is often small due to limitations in recording techniques. Although increasing the number of subjects helps increase the size of the overall training set, how to concatenate recorded ensemble activity across subjects while preserving their spatial-temporal structures is not trivial. In this technical report, a novel way to extract features shared across populations from multiple subjects to train a machine learning model is described. With this feature extraction framework, one can easily test different hypotheses about the underlying coding strategies. In addition, several common issues in applying a machine learning model to decode neural activity are discussed. Overall, this report provides a rigorous protocol for applying machine learning models to decode a relatively small dataset: neural ensemble activity collected across multiple populations.

Hippocampal Remapping as Hidden State Inference
https://hdl.handle.net/1721.1/122040
2019-09-06T03:00:51Z
2019-08-22T00:00:00Z
Sanders, Honi; Wilson, Matthew A.; Gershman, Samuel J.
Cells in the hippocampus tuned to spatial location (place cells) typically change their tuning when an animal changes context, a phenomenon known as remapping. A fundamental challenge to understanding remapping is the fact that what counts as a “context change” has never been precisely defined. Furthermore, different remapping phenomena have been classified on the basis of how much the tuning changes after different types and degrees of context change, but the relationship between these variables is not clear. We address these ambiguities by formalizing remapping in terms of hidden state inference. According to this view, remapping does not directly reflect objective, observable properties of the environment, but rather subjective beliefs about the hidden state of the environment. We show how the hidden state framework can resolve a number of puzzles about the nature of remapping.
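As a rough illustration of what "hidden state inference" means here, the sketch below applies Bayes' rule to a single sensory observation. It is not the authors' model: the two contexts "A" and "B", their Gaussian cue statistics, the flat prior, and the function names are all assumptions made for the example. It only shows how a belief over hidden states can stay put under a small cue change and flip under a large one.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior_over_states(observation, states, prior):
    """Return P(state | observation) using Bayes' rule.

    states maps a state name to the (mean, std) of the cue it generates;
    prior maps a state name to its prior probability.
    """
    unnormalized = {
        name: prior[name] * gaussian_pdf(observation, mean, std)
        for name, (mean, std) in states.items()
    }
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}

# Hypothetical contexts: each generates a one-dimensional sensory cue.
states = {"A": (0.0, 1.0), "B": (4.0, 1.0)}
prior = {"A": 0.5, "B": 0.5}

# A small deviation from context A barely moves the belief ...
small_change = posterior_over_states(0.5, states, prior)
# ... while a large deviation shifts the belief to context B.
large_change = posterior_over_states(3.5, states, prior)

print(small_change)
print(large_change)
```

In this toy setting, the belief (not the raw cue) is the quantity that would drive a change in tuning, which is the distinction the abstract draws between objective properties of the environment and subjective beliefs about its hidden state.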

Brain Signals Localization by Alternating Projections
https://hdl.handle.net/1721.1/122034
2019-09-03T17:26:46Z
2019-08-29T00:00:00Z
Adler, Amir; Wax, Mati; Pantazis, Dimitrios
We present a novel solution to the problem of localization of brain signals. The solution is sequential and iterative, and is based on minimizing the least-squares (LS) criterion by the alternating projection (AP) algorithm, well known in the context of array signal processing. Unlike existing solutions belonging to the linearly constrained minimum variance (LCMV) and to the multiple-signal classification (MUSIC) families, the algorithm is applicable even in the case of a single sample and in the case of synchronous sources. The performance of the solution is demonstrated via simulations.
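The sequential, iterative structure described above can be sketched for the single-sample case as follows. This is a minimal illustration of alternating projection on a least-squares criterion, not the authors' implementation: the candidate topographies, the noise-free measurement, and all function names are assumptions. Each step re-estimates one source while the others are held fixed, by projecting out the fixed sources and picking the candidate that best explains the residual.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project_out(v, basis):
    """Remove from v its components along an orthonormal basis."""
    out = list(v)
    for b in basis:
        c = dot(out, b)
        out = [o - c * bi for o, bi in zip(out, b)]
    return out

def orthonormalize(vectors):
    """Gram-Schmidt; numerically dependent vectors are skipped."""
    basis = []
    for v in vectors:
        r = project_out(v, basis)
        norm = dot(r, r) ** 0.5
        if norm > 1e-12:
            basis.append([x / norm for x in r])
    return basis

def best_candidate(y, topographies, fixed):
    """Index of the candidate that explains most of y once the
    fixed sources have been projected out."""
    basis = orthonormalize([topographies[i] for i in fixed])
    y_res = project_out(y, basis)
    best, best_score = None, -1.0
    for i, a in enumerate(topographies):
        if i in fixed:
            continue
        a_res = project_out(a, basis)
        norm2 = dot(a_res, a_res)
        if norm2 < 1e-12:
            continue
        score = dot(a_res, y_res) ** 2 / norm2
        if score > best_score:
            best, best_score = i, score
    return best

def alternating_projection(y, topographies, n_sources, n_sweeps=10):
    # Sequential initialization: add one source at a time.
    sources = []
    for _ in range(n_sources):
        sources.append(best_candidate(y, topographies, sources))
    # Iterative refinement: update each source with the others fixed.
    for _ in range(n_sweeps):
        previous = list(sources)
        for k in range(n_sources):
            others = sources[:k] + sources[k + 1:]
            sources[k] = best_candidate(y, topographies, others)
        if sources == previous:
            break
    return sorted(sources)

# Hypothetical candidate topographies (5 locations, 4 sensors).
topographies = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
# One noise-free sample generated by sources 1 and 3.
y = [2.0 * a - 1.0 * b for a, b in zip(topographies[1], topographies[3])]

estimated = alternating_projection(y, topographies, n_sources=2)
print(estimated)
```

The search over two sources is reduced to a sequence of one-dimensional searches, which is what makes the approach usable with a single sample. With multiple samples, the same loop would typically operate on a sample covariance rather than on one measurement vector.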

Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization
https://hdl.handle.net/1721.1/122014
2019-08-26T14:24:34Z
2019-08-17T00:00:00Z
Poggio, Tomaso; Banburski, Andrzej; Liao, Qianli
While deep learning is successful in a number of applications, it is not yet well understood theoretically. A satisfactory theoretical characterization of deep learning, however, is beginning to emerge. It covers the following questions: (1) the representation power of deep networks, (2) optimization of the empirical risk, and (3) generalization properties of gradient descent techniques: why does the expected error not suffer, despite the absence of explicit regularization, when the networks are overparametrized? In this review we discuss recent advances in the three areas. In approximation theory, both shallow and deep networks have been shown to approximate any continuous function on a bounded domain at the expense of an exponential number of parameters (exponential in the dimensionality of the function). However, for a subset of compositional functions, deep networks of the convolutional type (even without weight sharing) can have a linear dependence on dimensionality, unlike shallow networks. In optimization, we discuss the loss landscape for the exponential loss function. It turns out that global minima at infinity are completely degenerate. The other critical points of the gradient are less degenerate, with at least one, and typically more, nonzero eigenvalues. This suggests that stochastic gradient descent will, with high probability, find the global minima. To address the question of generalization for classification tasks, we use classical uniform convergence results to justify minimizing a surrogate exponential-type loss function under a unit norm constraint on the weight matrix at each layer. It is an interesting side remark that such minimization for (homogeneous) ReLU deep networks implies maximization of the margin. The resulting constrained gradient system turns out to be identical to the well-known weight normalization technique, originally motivated in a rather different way.
We also show that standard gradient descent contains an implicit L2 unit norm constraint, in the sense that it solves the same constrained minimization problem with the same critical points (but different dynamics). Our approach, which is supported by several independent new results, offers a solution to the puzzle about the generalization performance of deep overparametrized ReLU networks, uncovering the origin of the underlying hidden complexity control in the case of deep networks.
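The homogeneity property that underlies the unit norm constraint can be checked numerically. The sketch below is only an illustration under stated assumptions: a bias-free two-layer ReLU network, hypothetical weights and input, and the Frobenius norm as the per-layer norm. It verifies that the network output equals the product of the layer norms times the output of the network with unit-norm weight matrices, so rescaling the weights changes the size of the output but not its sign.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def frobenius_norm(W):
    return math.sqrt(sum(w * w for row in W for w in row))

def normalize(W):
    """Return the unit-norm version of W together with its norm."""
    rho = frobenius_norm(W)
    return [[w / rho for w in row] for row in W], rho

def network(weights, x):
    """Bias-free ReLU network: ReLU after every layer except the last."""
    h = x
    for W in weights[:-1]:
        h = relu(matvec(W, h))
    return matvec(weights[-1], h)[0]

# Hypothetical weights and input, chosen only for the example.
W1 = [[1.0, -2.0], [0.5, 3.0], [-1.5, 0.5]]
W2 = [[2.0, -1.0, 0.5]]
x = [1.0, 2.0]

V1, rho1 = normalize(W1)
V2, rho2 = normalize(W2)

full_output = network([W1, W2], x)
unit_output = network([V1, V2], x)

# Positive homogeneity: f(W; x) = rho1 * rho2 * f(V; x).
print(full_output, rho1 * rho2 * unit_output)
```

Because the classification decision depends only on the sign of the output, the layer norms act as a pure scale factor on the margin. This is why it is natural to study the normalized network and constrain each layer to unit norm.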