Kerneltron: support vector “machine” in silicon
IEEE Transactions on Neural Networks, 2003
Cited by 19 (5 self)
Abstract. Detection of complex objects in streaming video poses two fundamental challenges: training from sparse data with proper generalization across variations in the object class and the environment; and the computational power required of the trained classifier running real-time. The Kerneltron supports the generalization performance of a Support Vector Machine (SVM) and offers the bandwidth and efficiency of a massively parallel architecture. The mixed-signal VLSI processor is dedicated to the most intensive of SVM operations: evaluating a kernel over large numbers of vectors in high dimensions. At the core of the Kerneltron is an internally analog, fine-grain computational array performing externally digital inner-products between an incoming vector and each of the stored support vectors. The three-transistor unit cell in the array combines single-bit dynamic storage, binary multiplication, and zero-latency analog accumulation. Precise digital outputs are obtained through oversampled quantization of the analog array outputs combined with bit-serial unary encoding of the digital inputs. The 256-input, 128-vector Kerneltron measures 3 mm × 3 mm in 0.5 µm CMOS, delivers 6.5 GMACS throughput at 5.9 mW power, and attains 8-bit output resolution.
Focal-plane spatially oversampling CMOS image compression sensor
IEEE Trans. Circuits Syst. I, 2007
Cited by 5 (3 self)
Abstract: Image compression algorithms employ computationally expensive spatial convolutional transforms. The CMOS image sensor performs spatially compressing image quantization on the focal plane yielding digital output at a rate proportional to the mere information rate of the video. A bank of column-parallel first-order incremental ΔΣ-modulated analog-to-digital converters (ADCs) performs column-wise distributed focal-plane oversampling of up to eight adjacent pixels and concurrent weighted average quantization. Number of samples per pixel and switched-capacitor sampling sequence order set the amplitude and sign of the pixel coefficient, respectively. A simple digital delay and adder loop performs spatial accumulation over up to eight adjacent ADC outputs during readout. This amounts to computing a two-dimensional block matrix transform with up to 8 × 8-pixel programmable kernel in parallel for all columns. Noise shaping reduces power dissipation below that of a conventional digital imager while the need for a peripheral DSP is eliminated. A 128 × 128 active pixel array integrated with a bank of 128 ΔΣ-modulated ADCs was fabricated in a 0.35 µm CMOS technology. The 3.1 mm × 1.9 mm prototype captures 8-bit digital video at 30 frames/s and yields 4 GMACS projected computational throughput when scaled to HDTV 1080i resolution in discrete cosine transform (DCT) compression. Index Terms: ΔΣ-modulated analog-to-digital converter (ADC), block matrix transform, CMOS imager, focal-plane image compression.
Silicon Support Vector Machine with On-Line Learning
Int. J. Pattern Recognition and Artificial Intelligence, 2003
Cited by 3 (2 self)
Training of support vector machines (SVMs) amounts to solving a quadratic programming problem over the training data. We present a simple on-line SVM training algorithm of complexity approximately linear in the number of training vectors, and linear in the number of support vectors. The algorithm implements an on-line variant of sequential minimum optimization (SMO) that avoids the need for adjusting select pairs of training coefficients by adjusting the bias term along with the coefficient of the currently presented training vector. The coefficient assignment is a function of the margin returned by the SVM classifier prior to assignment, subject to inequality constraints. The training scheme lends efficiently to dedicated SVM hardware for real-time pattern recognition, implemented using resources already provided for run-time operation. Performance gains are illustrated using the Kerneltron, a massively parallel mixed-signal VLSI processor for kernel-based real-time video recognition.
Theory and Applications of Incremental ΔΣ Converters
Cited by 2 (0 self)
© 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Minimal Activity Mixed-Signal VLSI Architecture for Real-Time Linear Transforms in Video
Abstract: The mixed-signal processor performs digital vector-matrix multiplication using internally analog fine-grain parallel computing. The three-transistor CID/DRAM unit cell combines single-bit dynamic storage, binary multiplication, and zero-latency analog accumulation. Matrix coefficients are stored in a bit-parallel form. Delta-sigma analog-to-digital conversion of the analog array outputs is combined with oversampled unary coding of the digital inputs. Sorting of unary inputs results in at most a single input line transition for arbitrary multi-bit inputs. This amounts to a linear gain in energy efficiency of the computational array in the number of bits of the input vector. The 256 × 128 CID/DRAM processor with integrated 128 delta-sigma ADCs measures 3 mm × 3 mm in 0.5 µm CMOS and delivers 6.5 GMACS dissipating 5.9 mW of power. CID/DRAM array dynamic power dissipation is reduced by a factor of four through sorting 8-bit inputs.
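The sorted unary coding mentioned in this abstract can be illustrated with a small Python sketch. It thermometer-codes each multi-bit input over 2^n - 1 time slots, accumulates one binary partial product per slot, and counts the transitions on each input line. The function names, the single-bit weights, and the sample values are assumptions made for the illustration; they are not taken from the paper, and the analog array and delta-sigma conversion are not modeled.

```python
def unary_sorted(x, bits):
    """Thermometer code: x ones followed by zeros over 2**bits - 1 slots."""
    slots = 2**bits - 1
    return [1] * x + [0] * (slots - x)

def transitions(line):
    """Count how often one input line changes value between adjacent slots."""
    return sum(a != b for a, b in zip(line, line[1:]))

def unary_inner_product(weights, inputs, bits):
    """Sum the binary-binary partial products over all unary time slots."""
    lines = [unary_sorted(x, bits) for x in inputs]
    total = 0
    for t in range(2**bits - 1):
        # In slot t every input line carries one bit; add up weight * bit.
        total += sum(w * line[t] for w, line in zip(weights, lines))
    return total

weights = [1, 0, 1, 1]        # hypothetical single-bit stored coefficients
inputs = [5, 200, 17, 255]    # hypothetical 8-bit input values
result = unary_inner_product(weights, inputs, bits=8)
max_transitions = max(transitions(unary_sorted(x, 8)) for x in inputs)
```

Because the ones in each sorted code word are contiguous, every input line switches at most once per multi-bit input, which is the property the abstract credits for the reduced dynamic power.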
CMOS Image Sensor With Per-Column ΣΔ ADC and Programmable Compressed Sensing
Abstract: A CMOS image sensor architecture with built-in single-shot compressed sensing is described. The image sensor employs a conventional 4T pixel and per-column ΣΔ ADCs. The compressed sensing measurements are obtained via a column multiplexer that sequentially applies randomly selected pixel values to the input of each ΣΔ modulator. At the end of readout, each ADC outputs a quantized value of the average of the pixel values applied to its input. The image is recovered from the random linear measurements off-chip using numerical optimization algorithms. To demonstrate this architecture, a 256 × 256 pixel CMOS image sensor is fabricated in 0.15 µm CIS process. The sensor can operate in compressed sensing mode with compression ratio 1/4, 1/8, or 1/16 at 480, 960, or 1920 fps, respectively, or in normal capture mode with no compressed sensing at a maximum frame rate of 120 fps. Measurement results demonstrate capture in compressed sensing mode at roughly the same readout noise of 351 µV and power consumption of 96.2 mW of normal capture at 120 fps. This performance is achieved with only 1.8% die area overhead. Image reconstruction shows modest quality loss relative to normal capture and significantly higher image quality than downsampling. Index Terms: ADC, CMOS image sensor, compressed/compressive sensing.
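A minimal Python sketch of the measurement step described in this abstract, in which each output is the average of a randomly selected set of pixel values. The selection rule (half of the pixels per measurement), the function name, the seed, and the sample data are all assumptions for illustration; the quantization in the ΣΔ ADC and the off-chip reconstruction are not modeled.

```python
import random

def cs_measurements(pixels, ratio, seed=0):
    """Return int(len(pixels) * ratio) averages of random pixel subsets."""
    rng = random.Random(seed)      # fixed seed keeps the selection reproducible
    n = len(pixels)
    m = int(n * ratio)             # number of measurements kept
    measurements = []
    for _ in range(m):
        # Hypothetical selection rule: choose half of the pixels at random.
        chosen = rng.sample(range(n), n // 2)
        measurements.append(sum(pixels[i] for i in chosen) / len(chosen))
    return measurements

# Hypothetical 16-pixel column read out at compression ratio 1/4.
column = [10, 12, 200, 198, 50, 52, 90, 91, 15, 14, 13, 220, 221, 60, 61, 62]
y = cs_measurements(column, ratio=1 / 4)
```

Each entry of `y` is one random linear measurement of the column, so 16 pixels are reduced to 4 values at ratio 1/4.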
ALGORITHMIC PARTIAL ANALOG-TO-DIGITAL CONVERSION IN MIXED-SIGNAL ARRAY PROCESSORS
We present an algorithmic analog-to-digital converter (ADC) architecture for large-scale parallel quantization of internally analog variables in externally digital array processors. The converter quantizes and accumulates a binary weighted sequence of partial binary-binary matrix-vector products computed on the analog array, under presentation of bit-serial inputs in descending binary order. The architecture combines algorithmic conversion of the residue, as in a standard algorithmic ADC, with synchronous accumulation of the partial products from the array. In conjunction with row-parallel digital storage of matrix elements in the array, two pipelined architectures are presented to accumulate partial products with common binary weight across rows: row-parallel ADC with digital post-accumulation, and row-cumulative ADC with analog pre-accumulation. Simulation results are presented to quantify the tradeoff in precision and area for full-parallel flash, and row-parallel and row-cumulative partial algorithmic, analog-to-digital conversion on the array.
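The arithmetic behind accumulating binary-weighted partial products can be sketched in Python as follows. Inputs are presented one bit plane at a time, most significant bit first, and a doubling accumulator (Horner's rule) combines the partial products. This is a purely digital stand-in with hypothetical names and unsigned operands; it is not the converter circuit described in the abstract.

```python
def bit(value, k):
    """Return bit k of a non-negative integer."""
    return (value >> k) & 1

def bit_serial_dot(weights, inputs, w_bits, x_bits):
    """Inner product built from binary-binary partial products."""
    acc = 0
    for b in reversed(range(x_bits)):          # input bit planes, MSB first
        partial = 0
        for a in range(w_bits):                # stored weight bit planes
            # Binary-binary partial product: count of coinciding one bits.
            count = sum(bit(w, a) & bit(x, b) for w, x in zip(weights, inputs))
            partial += count << a              # apply the weight-bit significance
        acc = 2 * acc + partial                # double, then add the new partial
    return acc

weights = [3, 5, 7]    # hypothetical unsigned 4-bit matrix row
inputs = [2, 4, 6]     # hypothetical unsigned 4-bit input vector
y = bit_serial_dot(weights, inputs, w_bits=4, x_bits=4)
```

Doubling the accumulator before each addition gives every earlier partial product one more factor of two, so after the last bit plane each partial product carries the binary weight of its input bit.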
Focal-Plane CMOS Wavelet Feature Extraction for Real-Time Pattern Recognition
Kernel-based pattern recognition paradigms such as support vector machines (SVM) require computationally intensive feature extraction methods for high-performance real-time object detection in video. The CMOS sensory parallel processor architecture presented here computes delta-sigma (ΔΣ)-modulated Haar wavelet transform on the focal plane in real time. The active pixel array is integrated with a bank of column-parallel first-order incremental oversampling analog-to-digital converters (ADCs). Each ADC performs distributed spatial focal-plane sampling and concurrent weighted average quantization. The architecture is benchmarked in SVM face detection on the MIT CBCL data set. At 90% detection rate, first-level Haar wavelet feature extraction yields a 7.9% reduction in the number of false positives when compared to classification with no feature extraction. The architecture yields 1.4 GMACS simulated computational throughput at SVGA imager resolution at 8-bit output depth.
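For readers unfamiliar with the transform named in this abstract, a first-level two-dimensional Haar wavelet transform can be sketched in Python as weighted averages over 2 × 2 pixel blocks. The 1/4 normalization, the sub-band names, and the function name are assumptions chosen for the illustration; the sketch is a software reference only and does not model the focal-plane ADC implementation.

```python
def haar_level1(image):
    """First-level 2-D Haar transform over non-overlapping 2x2 blocks."""
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image) - 1, 2):
        row_a, row_h, row_v, row_d = [], [], [], []
        for c in range(0, len(image[0]) - 1, 2):
            p00, p01 = image[r][c], image[r][c + 1]
            p10, p11 = image[r + 1][c], image[r + 1][c + 1]
            row_a.append((p00 + p01 + p10 + p11) / 4)   # block average
            row_h.append((p00 - p01 + p10 - p11) / 4)   # left minus right
            row_v.append((p00 + p01 - p10 - p11) / 4)   # top minus bottom
            row_d.append((p00 - p01 - p10 + p11) / 4)   # diagonal difference
        approx.append(row_a)
        horiz.append(row_h)
        vert.append(row_v)
        diag.append(row_d)
    return approx, horiz, vert, diag

# Hypothetical 2x2 test image.
approx, horiz, vert, diag = haar_level1([[1, 2], [3, 4]])
```

Each coefficient is a signed weighted average of four neighbouring pixels, the kind of operation the abstract describes as concurrent weighted average quantization.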
CMOS Focal-Plane Spatially-Oversampling Computational Image Sensor
Many video processing applications employ spatial image transforms such as block-matrix transforms and convolutional transforms. For example, block-matrix transforms such as discrete cosine transform (DCT) or discrete wavelet transform (DWT) are widely used in various image and video compression algorithm standards. Convolutional transforms are often