Expectation backpropagation: Parameter-free training of multilayer neural networks with continuous or discrete weights. (2014)

by D Soudry, I Hubara, R Meir
Venue: In NIPS'2014
Results 1 - 2 of 2

Binarized Neural Networks

by Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio
"... Abstract We introduce a method to train Binarized Neural Networks (BNNs) -neural networks with binary weights and activations at run-time. At train-time the binary weights and activations are used for computing the parameter gradients. During the forward pass, BNNs drastically reduce memory size an ..."
Abstract - Add to MetaCart
Abstract: We introduce a method to train Binarized Neural Networks (BNNs), neural networks with binary weights and activations at run-time. At train-time the binary weights and activations are used for computing the parameter gradients. During the forward pass, BNNs drastically reduce memory size and accesses, and replace most arithmetic operations with bit-wise operations, which is expected to substantially improve power-efficiency. To validate the effectiveness of BNNs, we conducted two sets of experiments on the Torch7 and Theano frameworks. On both, BNNs achieved nearly state-of-the-art results over the MNIST, CIFAR-10 and SVHN datasets. We also report our preliminary results on the challenging ImageNet dataset. Last but not least, we wrote a binary matrix multiplication GPU kernel with which it is possible to run our MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy. The code for training and running our BNNs is available on-line.
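
The speed-up described above comes from replacing ±1 multiply-accumulate operations with XNOR and population count. A minimal Python sketch of that equivalence (not the authors' GPU kernel; the bit-packing helper and the vectors used here are illustrative only):

    def pack_bits(values):
        """Pack a +1/-1 vector into an integer bit mask (+1 -> 1, -1 -> 0)."""
        mask = 0
        for i, v in enumerate(values):
            if v == 1:
                mask |= 1 << i
        return mask

    def binary_dot(a, b):
        """Dot product of two +1/-1 vectors via XNOR + popcount.

        For +1/-1 values, a_i * b_i = +1 exactly when the bits agree, so
        dot = (#agreements) - (#disagreements) = 2 * popcount(XNOR) - n.
        """
        n = len(a)
        agree = bin(~(pack_bits(a) ^ pack_bits(b)) & ((1 << n) - 1)).count("1")
        return 2 * agree - n

    # The bitwise result matches ordinary arithmetic on +1/-1 vectors.
    a = [1, -1, 1, 1]
    b = [-1, -1, 1, -1]
    assert binary_dot(a, b) == sum(x * y for x, y in zip(a, b))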

Citation Context

...ntical outputs when their inputs are constrained to −1 or +1 (but not otherwise). The XNOR kernel is about 23 times faster than the baseline kernel and 3.4 times faster than cuBLAS, as shown in Figure 2. Last but not least, the MLP from Section 2 runs 7 times faster with the XNOR kernel than with the baseline kernel, without suffering any loss in classification accuracy (see Figure 2). 5 Discussion and Related Work Until recently, the use of extremely low-precision networks (binary in the extreme case) was believed to be highly destructive to the network performance (Courbariaux et al., 2014). Soudry et al. (2014) and Cheng et al. (2015) proved the contrary by showing that good performance could be achieved even if all neurons and weights are binarized to ±1. This was done using Expectation BackPropagation (EBP), a variational Bayesian approach, which infers networks with binary weights and neurons by updating the posterior distributions over the weights. These distributions are updated by differentiating their parameters (e.g., mean values) via the back propagation (BP) algorithm. Esser et al. (2015) implemented a fully binary network at run time using a very similar approach to EBP, showing significa...
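
As a rough, non-authoritative illustration of the EBP idea summarized in this excerpt: keep a real-valued posterior mean for every binary weight, update the means by gradient steps (in EBP those gradients come from backpropagating through the distribution parameters), and read the run-time binary network off as the sign of the means. The gradient function below is a hypothetical placeholder, not the EBP update equations:

    import numpy as np

    rng = np.random.default_rng(0)
    means = rng.uniform(-0.1, 0.1, size=(4, 3))   # posterior means of binary weights

    def placeholder_mean_gradient(m):
        # Stand-in for the gradient of the loss w.r.t. the means; in the
        # real algorithm this comes from the backpropagation-style pass.
        return 0.01 * m - 0.005

    for step in range(100):
        means -= 0.1 * placeholder_mean_gradient(means)
        means = np.clip(means, -1.0, 1.0)         # keep the means in a valid range

    binary_weights = np.sign(means)               # binary network used at run time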

Bitwise Neural Networks

by Minje Kim
"... Based on the assumption that there exists a neu-ral network that efficiently represents a set of Boolean functions between all binary inputs and outputs, we propose a process for developing and deploying neural networks whose weight param-eters, bias terms, input, and intermediate hid-den layer outp ..."
Abstract - Add to MetaCart
Abstract: Based on the assumption that there exists a neural network that efficiently represents a set of Boolean functions between all binary inputs and outputs, we propose a process for developing and deploying neural networks whose weight parameters, bias terms, input, and intermediate hidden layer output signals are all binary-valued, and require only basic bit logic for the feedforward pass. The proposed Bitwise Neural Network (BNN) is especially suitable for resource-constrained environments, since it replaces either floating or fixed-point arithmetic with significantly more efficient bitwise operations. Hence, the BNN requires less spatial complexity, less memory bandwidth, and less power consumption in hardware. In order to design such networks, we propose to add a few training schemes, such as weight compression and noisy backpropagation, which result in a bitwise network that performs almost as well as its corresponding real-valued network. We test the proposed network on the MNIST dataset, represented using binary features, and show that BNNs result in competitive performance while offering dramatic computational savings.
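
A toy sketch of the kind of bitwise feedforward pass this abstract describes, with XNOR standing in for multiplication and a sign (majority) function as the activation; the layer size and the ±1 weight and bias values below are illustrative, not taken from the paper:

    def xnor(a, b):
        """+1/-1 'multiplication': +1 if the two signs agree, -1 otherwise."""
        return 1 if a == b else -1

    def bitwise_neuron(inputs, weights, bias):
        """One hidden unit: sum of XNORs plus a bias, then a sign activation."""
        total = bias + sum(xnor(x, w) for x, w in zip(inputs, weights))
        return 1 if total >= 0 else -1

    def bitwise_layer(inputs, weight_rows, biases):
        return [bitwise_neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

    # Example: a 3-input, 2-unit layer with illustrative +1/-1 parameters.
    x = [1, -1, 1]
    W = [[1, 1, -1], [-1, 1, 1]]
    b = [0, 1]
    print(bitwise_layer(x, W, b))   # -> [-1, 1]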

Citation Context

...rks where an input node is allowed to be connected to one and only one hidden node and its final layer is a union of those hidden nodes (Golea et al., 1992). A more practical network was proposed in (Soudry et al., 2014) recently, where the posterior probabilities of the binary weights were sought using the Expectation Back Propagation (EBP) scheme, which is similar to backpropagation in its form, but has some advan...
