Shimon Edelman and Daphna Weinshall. "A self-organizing multiple-view representation of 3D objects." Biological Cybernetics, vol. 64: 209-219, 1991.
....building block of many brain functions, and using such an approach for 3D object recognition allows 3D object recognition to fit into the same framework as these other modalities. Temporal continuity as a way to acquire multiple prototype views for objects has been proposed by a number of authors [44, 13, 27], and there is some psychophysical evidence for such effects as well [43, 42]. However, that mechanism is distinct from the use of temporal continuity for learning distributions like S) and would continue to operate in parallel. That is, we use temporally associated views for improving our ....
D. Weinshall, S. Edelman, and H. H. Bülthoff. A self-organizing multiple-view representation of 3D objects. In D. Touretzky, editor, Neural Information Processing Systems, volume 2, pages 274--281. Morgan Kaufmann, San Mateo, CA, 1990.
....than object centered representations. 3.3.3 Computational considerations Recent computational studies provide further support for the hypothesis that a recognition strategy based on multiple view representation can in principle account for much of the human performance in object recognition ([40], 111, 119]. In particular, if the recognition problem is formulated in terms of the approximation of a mapping that associates a standard view of an object with any of its other views, powerful mathematical tools from function approximation theory are available that can construct such a ....
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3d objects. A.I. Memo No. 1146, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, August 1989.
....objects need to be modeled by composing parts, which mixes a segmentation problem with the alignment problem. Here, we advocate the view based approach, where 3D objects are modeled by a collection of 2D views of the object, each view being fairly close in geometry and topology to a sensed image [3, 10, 12, 18]. The advantages are that learning can be used in a straightforward manner to build models and that matching can be done with a representation that is close to the sensed data. The disadvantages are that the representation is verbose and full of seams and that the indexing scheme must not only ....
S. Edelman and D. Weinshall, A Self-organizing Multiple View Representation of 3D Objects, Biological Cybernetics, Vol.64, pp.209-219, 1991.
.... priming, such as that outlined above, predicts that no explicit knowledge of orientation blocking is required in order to obtain facilitation, activation of visually similar shape representations being a natural consequence of recognition within image-based distributed representation models (Weinshall, Edelman, & Bülthoff, 1990; Edelman & Weinshall, 1991; Edelman, 1995b, 1995a). Experiment 4 tests this by specifically asking whether orientation priming occurs in a context where subjects are unlikely to be aware of orientation blocking. A second prediction of an image-based generalization account is that the degree of ....
.... image-based models naturally predict that as evidence accumulates within orientation-specific shape representations (due to the repetition of homogeneous target shapes) there will be an increase in orientation priming for the recognition of visually similar shapes at the same orientation (Weinshall et al., 1990; Edelman, 1995a). Consistent with this interpretation, based on the results of our experiments in which orientation priming was not obtained (Experiments 1 and 2) and those in which it was obtained (Experiments 3 and 4), it appears that a recurring factor in the occurrence of orientation priming ....
[Article contains additional citation context not shown here]
Weinshall, D., Edelman, S., & Bülthoff, H. H. (1990). A self-organizing multiple-view representation of 3D objects. In D. S.
....or Michael J. Tarr, Department of Cognitive and Linguistic Sciences, Brown University, Box 1978, Providence, RI 02912, tel: (401) 863 1148, fax (401) 863 2255, email: Michael Tarr brown.edu der the term image-based theories (Bülthoff & Edelman, 1992; Bülthoff, Edelman, & Tarr, 1995; Edelman & Weinshall, 1991; Poggio & Edelman, 1990; Tarr, 1995). Such theories posit that orientation, as well as many other properties present in the original image, are encoded in shape representations. With regard to orientation, these theories predict that recognition performance, as measured by response time and/or ....
.... predicts that no explicit knowledge of orientation blocking is required in order to obtain facilitation, activation of visually similar shape representations being a natural consequence of recognition within image-based distributed representation models (Weinshall, Edelman, & Bülthoff, 1990; Edelman & Weinshall, 1991; Edelman, 1995b, 1995a). Experiment 4 tests this by specifically asking whether orientation priming occurs in a context where subjects are unlikely to be aware of orientation blocking. A second prediction of an image-based generalization account is that the degree of facilitation should be ....
[Article contains additional citation context not shown here]
Edelman, S. & Weinshall, D. (1991). A self-organizing multiple-view representation of 3D objects. Biological Cybernetics, 64, 209--219.
....Recent computational models have explored how to learn to recognize 3D objects from their projected views (Poggio & Edelman, 1990). Most existing models are, however, based on supervised learning, i.e. during training the teacher tells which object each view belongs to. The model proposed by Weinshall et al. (1990) also requires a signal that segregates different objects during training. This paper, on the other hand, discusses unsupervised aspects of 3D object recognition where the system discovers categories by itself. This paper presents an unsupervised classification scheme for categorizing 3D objects ....
Weinshall, D., Edelman, S. and Bülthoff, H. H. (1990). A self-organizing multiple-view representation of 3D objects. In Touretzky, D. S., (ed.), Advances in Neural Information Processing Systems 2. Morgan Kaufmann Publishers, San Mateo, CA. 274-281.
....indicate that the human vision system represents objects with a set of two dimensional views rather than one three dimensional view, that is the human visual system learns three dimensional objects from their two dimensional images. There currently exist a handful of computer vision systems [4] [5] [6] [7] that have this ability. Poggio and Edelman [4] have used a three layer network, a regularization network, to learn three dimensional stick figures from images of these figures taken at multiple viewpoints. Once the network learns the figures, the network will recognize the figure in an ....
S. Edelman and D. Weinshall, "A self-organizing multiple-view representation of 3D objects," Biological Cybernetics, Vol. 64, pp. 209-219, 1991.
....This sort of structure is ubiquitous in sensory signals, from vision as well as other senses, and can be used by a neural network to derive temporally coherent classifications. This idea has been used, for example, in temporal versions of the Hebbian learning rule to associate items over time (Weinshall, Edelman and Bülthoff, 1990; Földiák, 1991). To capitalize on temporal coherence for higher order feature extraction and classification, we need a more powerful learning principle. A promising approach is to maximize some measure of agreement between the outputs of two groups of units which receive inputs physically ....
....between translated versions of the same object; they simply learned to associate different views together. In this respect, the representation learned at the hidden layer is similar to that predicted by the privileged views theory of viewpoint invariant object recognition advocated by Weinshall et al. (1990) (and others). Their algorithm learns a similar representation in a single layer of competing units with temporal Hebbian learning applied to the lateral connections between these units. However, the algorithm proposed here goes further in that it can be applied to subsequent stages of learning to ....
Weinshall, D., Edelman, S., and Bülthoff, H. H. (1990). A self-organizing multiple-view representation of 3D objects. In Advances in Neural Information Processing Systems 2, pages 274--282. Morgan Kaufmann.
....to its familiar orientations. Recognition could be achieved by transforming an input representation of a face to the orientation of the nearest stored representation [11] or, if the faces are represented by a series of views that are close enough to each other, by interpolating among those views [12, 13, 14]. In the series of simulations we report here, we trained an autoassociative memory by using multiple views of a set of faces, and we tested its ability to generalize to new views of the learned faces. Autoassociative memories are a powerful tool for storing, recognizing, and categorizing faces ....
S. Edelman and D. Weinshall, "A self-organizing multiple-view representation of 3-D objects", Biol. Cybern., Vol. 64, pp. 209--219, 1991.
....two-dimensional representations. More about these theories and about the implemented computational models of recognition used in our simulations can be found in (Lowe, 1986; Biederman, 1987; Ullman, 1989; Ullman and Basri, 1991; Poggio and Edelman, 1990; Bülthoff and Edelman, 1992; Edelman and Weinshall, 1991). 2 Computational theories of object recognition Explicit computational theories of recognition serve as good starting points for inquiry into the nature of object representation, by providing concrete hypotheses that may be refuted or refined through appropriately designed experiments. More ....
....1989) MVPT postulates that objects are represented as linked collections of viewpoint specific images ("views") and that recognition is achieved when the input image activates the view (or set of views) that corresponds to a familiar object transformed to the appropriate pose. There is evidence (Edelman and Weinshall, 1991; Tarr, 1989; Tarr and Pinker, 1989) indicating that this process can result in the same dependence of the response time on the pose of the stimulus object as obtained in the mental rotation experiments (Shepard and Cooper, 1982). We consider MVPT as a psychological model of human performance that ....
[Article contains additional citation context not shown here]
Edelman, S. and Weinshall, D. (1991). A self-organizing multiple-view representation of 3D objects. Biological Cybernetics, 64:209--219.
....stay active as the input moved to a new location. Thus units signaling horizontal at multiple locations would strengthen their connections to the same output unit. This mechanism can learn viewpoint tolerant representations when different views of an object are presented in temporal continuity [72, 205, 167, 150, 204]. Földiák achieved translation invariance in a single layer by having orientation tuned filters in the first layer that produced linearly separable patterns. More generally, approximate viewpoint invariance may be achieved by the superposition of several Földiák-like networks [171]. O'Reilly ....
....tested this hypothesis with natural images, and found that although natural images contain sharp depth boundaries at object edges, depth varies slowly the vast majority of the time, and his learning algorithm was able to learn depth estimation from natural graylevel images. Weinshall and Edelman [205] applied the assumption of temporal persistence of objects to learn object representations that were invariant to rotations in depth. They trained a 2 layer network to store individual views of wire framed objects, and then updated lateral connections in the output layer with Hebbian learning as ....
[Article contains additional citation context not shown here]
D. Weinshall and S. Edelman. A self-organizing multiple view representation of 3d objects. Biological Cybernetics, 1991.
.... generalization to novel views was severely limited, with performance dropping to chance levels at a misorientation of about 40° relative to familiar views (Edelman and Bülthoff 1992) [3]. In this human visual model, as in certain computational models, e.g. Edelman and Weinshall (1991) [4], views that belong together are more closely associated with each other. Computationally, this method of recognition is analogous to an attempt to express the input as an interpolation of the stored views, and it can also be viewed as a perceptual organization at a higher level. In this case, ....
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3d objects. Biological Cybernetics, 64:209--219, 1991.
.... generalization to novel views was severely limited, with performance dropping to chance levels at a misorientation of about 40° relative to familiar views (Edelman & Bülthoff 1992) [7]. Also, in this human visual model, as in certain computational models, e.g. Edelman & Weinshall (1991) [8], views that belong together are more closely associated with each other. Computationally, this method of recognition is analogous to an attempt to express the input as an interpolation of the stored views. In this case, recognition normally requires neither 3D reconstruction of the stimulus, ....
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3d objects. Biological Cybernetics, 64:209--219, 1991.
.... nition is viewpoint dependent are illustrated schematically in figures 1 through 3 (detailed accounts of the relevant experiments can be found in the references cited below). These are the phenomena of canonical views IT,5] mental rotation (analogous to the classical mental rotation of [ see [4,9]) and limited generalization [3,10,6,11]. Following is a brief account of the relevant psychophysical findings. 2.1 Canonical Views Three-dimensional objects are more easily recognized when seen from certain viewpoints, called canonical, than from other, random, viewpoints (Figure 1). The ....
.... familiar test views yield essentially constant response times (this is consistent with a changeover from time consuming rotation based strategy to a faster memory intensive approach that saves time by storing all frequently occurring views). A similar effect has been reported by Edelman et al. [5,9], who show how both the initial manifestation of mental rotation and its disappearance with exposure can be replicated by a model that does not rely on 3D object centered representations and, a fortiori, has no means for rotating such representations (see section 3). 2.3 Limited Anisotropic ....
[Article contains additional citation context not shown here]
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3D objects. Biological Cybernetics, 64:209--219, 1991.
No context found.
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3D objects. Biological Cybernetics, 64:209--219, 1991.
....The aim of this section is to provide minimal theoretical background for understanding the predictions of the various theories relevant to our recognition experiments. More about these theories and about the implemented computational models of recognition used in our simulations can be found in [30,1,14,31,20,8,9]. As a representative of this class of theories we have considered recognition by viewpoint normalization, of which Ullman's recognition by alignment is an instance [30]. In the alignment approach the 2D input image is compared with the projection of a stored model, much like in template ....
....ones and progressively worse on views that are far from familiar. 2.2.3 Blurred template matching The third scheme we mention is also based on nonlinear interpolation among 2D views and, in addition, is suitable for modeling the time course of recognition, including long term learning effects [9]. The scheme is implemented as a two-layer network of thresholded summation units. The input layer of the network is a retinotopic feature map (thus the model's name: CLF, or conjunction of localized features). The distribution of the connections from the first layer to the second, or ....
[Article contains additional citation context not shown here]
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3D objects. 1990. in press.
....span the object space. Each persistent prototype may be represented by a set of detectors, implemented by receptive-field-like mechanisms tuned to a number of the object's views (Poggio and Edelman, 1990) and may be constructed in a self-organizing fashion following mere exposure to the object (Edelman and Weinshall, 1991). In distinction to the persistent entities, rare or ephemeral patterns of primitive features are represented implicitly, by the distributed activity they induce in the prototype detectors (see Figure 1). The power of ephemeral implicit representations stems from the same principle that makes ....
Edelman, S. and Weinshall, D. (1991). A self-organizing multiple-view representation of 3D objects. Biological Cybernetics, 64:209--219.
....and of the predictions they were designed to test. More about these theories and about the implemented computational models of recognition used in our simulations can be found in (Lowe, 1986; Biederman, 1987; Ullman, 1989; Ullman and Basri, 1990; Poggio and Edelman, 1990; Edelman et al. 1990; Edelman and Weinshall, 1991). 2 Computational theories of object recognition Explicit computational theories of recognition serve as good starting points for inquiry into the nature of object representation, by providing concrete hypotheses that may be refuted or refined through appropriately designed experiments. Two ....
....image and a threshold situated between 0 and 1. 2.2.3 Conjunction of localized features (CLF) The third scheme we mention is also based on interpolation among 2D views and, in addition, is particularly suitable for modeling the time course of recognition, including long term learning effects (Edelman and Weinshall, 1991; Edelman, 1991b). The scheme is implemented as a two-layer network of thresholded summation units. The input layer of the network is a retinotopic feature map (thus the model's name). The distribution of the connections from the first layer to the second, or representation, layer is such that the ....
[Article contains additional citation context not shown here]
Edelman, S. and Weinshall, D. (1991). A self-organizing multiple-view representation of 3D objects.
....data. In particular, it is becoming increasingly clear that human performance in recognizing objects under changes of viewpoint can be accounted for to a significant degree by a theory based on the notion of interpolation among view specific representations (Poggio and Edelman, 1990; Edelman and Weinshall, 1991; Bülthoff and Edelman, 1992; Edelman, 1995a). Detailed computational modeling showed that this approach to representation can be implemented by putting together simple building blocks such as the nonlinear graded profile overlapping receptive fields found in biological visual systems ....
Edelman, S. and Weinshall, D. (1991). A self-organizing multiple-view representation of 3D objects.
....with a process that compares the input image against those representations without recourse to 3D operations. Examples of this class include recognition by linear combination of views (Ullman and Basri, 1991) and the various multiple view interpolation models (Poggio and Edelman, 1990; Edelman and Weinshall, 1991). For such models, the choice of stored views is also of great importance: a view is likely to be misrecognized if the differences between it and the stored views are too large for the interpolation process to cope with. Unlike in the multiple views plus (3D) transformation model, under the 2D ....
....the multiple views plus (3D) transformation model, under the 2D view based approach the response time may vary either with the shortest angular distance to the stored views, or with some image based (2D) measure of distance between the presented and the stored views. For example, in the model of (Edelman and Weinshall, 1991), representations of various views of objects to which the system has been exposed are associated with each other by links, determined by the presentation order of those views. Exposure to a series of views in the natural order corresponding to a rotation of the object produces a memory trace ....
Edelman, S. and Weinshall, D. (1991). A self-organizing multiple-view representation of 3D objects.
....represents the shape of the receptive field. A recognition system based on this method is expected to perform well when the novel view is close to the stored ones (that is, when most of the features of the input image fall close to their counterparts at least in some of the stored views; cf. [14]). The performance should become progressively worse on views that are far from the familiar ones. Methods To distinguish between the theories outlined above, we have developed an experimental paradigm based on a two alternative forced choice (2AFC) task. Our experiments consist of two phases: ....
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3D objects. Biological Cybernetics, 64:209--219, 1991.
No context found.
Shimon Edelman and Daphna Weinshall. "A self-organizing multiple-view representation of 3D objects." Biological Cybernetics, vol. 64: 209-219, 1991.
No context found.
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3D objects. Biological Cybernetics, 64:209--219, 1991.
No context found.
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3D objects. Biological Cybernetics, 64:209--219, 1991.
No context found.
S. Edelman and D. Weinshall. A self-organizing multiple-view representation of 3d objects. Biol. Cybern., 64:209--219, 1991.