| A. Yemini and S. Kliger. High speed and robust event correlation. IEEE Communication Magazine, 34(5):82--90, May 1996. |
....the observable malfunctioning of the managed system. In this section, we define the vocabulary related to fault localization and describe the most common problems it is associated with. An event is an exceptional condition occurring in the operation of the hardware or software of the managed network [22, 41]. Faults (also referred to as root problems) constitute a class of network events that can be handled directly [22, 41]. Faults may be classified according to their duration as (1) permanent, (2) intermittent, and (3) transient [40]. A permanent fault exists in a network until a repair action ....
....and describe the most common problems it is associated with. An event is an exceptional condition occurring in the operation of the hardware or software of the managed network [22, 41]. Faults (also referred to as root problems) constitute a class of network events that can be handled directly [22, 41]. Faults may be classified according to their duration as (1) permanent, (2) intermittent, and (3) transient [40]. A permanent fault exists in a network until a repair action is taken. Intermittent faults occur on a discontinuous and periodic basis, causing degradation of service for short ....
[Article contains additional citation context not shown here]
S. A. Yemini, S. Kliger, E. Mozes, Y. Yemini, and D. Ohsie. High speed and robust event correlation. IEEE Communications Magazine, 34(5):82-- 90, 1996.
.... lost events. 6 Related work. In the past, various event correlation techniques were proposed, including rule-based systems [18, 26], model-based reasoning systems [13, 21], model traversing techniques [14, 15], case-based systems [17], fault propagation models [9, 16], and the code book approach [27]. Rule-based systems are composed of rules (productions) of the form conclusion if condition. The condition part is a logical combination of propositions about the current set of received alarms and the system state [18, 26]; the conclusion determines the state of the correlation process. The ....
....or from the configuration database. In addition, Yemanja's internal event publishers need not be aware which components consume the events that they forward; therefore, a change to a higher-level scenario does not require changes to any of the lower-level scenarios. The code book technique [27] uses a network model to derive a code (a set of possible symptom observations) for every problem that may occur in the network. This process, called code book generation, is performed in advance based on the installation's particular network topology. Code book generation eliminates the runtime ....
[Article contains additional citation context not shown here]
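The code book idea quoted in the context above (each problem is associated with a code, i.e. a set of expected symptoms) can be illustrated with a minimal sketch. The problem and symptom names below are hypothetical, and the minimum-Hamming-distance decoding step is an assumption made for illustration; the quoted context does not specify how observed symptoms are matched against codes.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical code book: each problem maps to the set of symptoms
# it is expected to cause. In the quoted description this table is
# derived in advance from a network model; here it is hard-coded.
CODEBOOK: Dict[str, FrozenSet[str]] = {
    "link_down": frozenset({"ping_timeout", "if_oper_down", "route_flap"}),
    "router_overload": frozenset({"high_latency", "packet_loss"}),
    "dns_failure": frozenset({"name_lookup_error", "http_error"}),
}


def hamming_distance(code: FrozenSet[str], observed: FrozenSet[str]) -> int:
    """Number of symptoms present in exactly one of the two sets."""
    return len(code ^ observed)


def decode(observed: Iterable[str],
           codebook: Dict[str, FrozenSet[str]] = CODEBOOK) -> str:
    """Return the problem whose code is closest to the observed symptoms.

    Ties are broken alphabetically by problem name (an arbitrary choice).
    """
    obs = frozenset(observed)
    return min(sorted(codebook),
               key=lambda problem: hamming_distance(codebook[problem], obs))


if __name__ == "__main__":
    # One expected symptom of link_down is missing (a lost event).
    print(decode({"ping_timeout", "if_oper_down"}))
    # One unrelated symptom is present (a spurious event).
    print(decode({"high_latency", "packet_loss", "http_error"}))
```

Because matching is by distance rather than exact equality, a single lost or spurious symptom still decodes to the nearest problem in this toy table.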
S. A. Yemini, S. Kliger, E. Mozes, Y. Yemini, and D. Ohsie. High speed and robust event correlation. IEEE Communications Magazine, 34(5):82--90, 1996.
....network, and then the approximations based on Pearl's algorithms and the exact bucket tree elimination algorithm are designed and evaluated through an extensive simulation study. Keywords: Fault localization, probabilistic inference, belief networks, event correlation. 1 Introduction. Fault localization [15, 18, 35], a central aspect of network fault management, isolates the most probable set of root problems based on their external manifestations, called Prepared through collaborative participation in the Communications and Networks Consortium sponsored by the U.S. Army Research Laboratory under the ....
....for end-to-end service failures in a given layer. The proposed solutions allow the management system to perform fault localization iteratively in real time. In the past, fault diagnosis efforts concentrated mostly on detecting, isolating, and correcting faults related to network connectivity [9, 18, 33, 35]. The diagnosis focused on lower layers of the protocol stack (physical and data link layers) [24, 35] and its major goal was to isolate faults related to the availability of network resources, such as a broken cable, an inactive interface, etc. Modern enterprise environments increasingly demand ....
[Article contains additional citation context not shown here]
S. A. Yemini, S. Kliger, E. Mozes, Y. Yemini, and D. Ohsie. High speed and robust event correlation. IEEE Communications Magazine, 34(5):82--90, 1996.
....to the alarm in that column; alarms can require more than one event to occur. The weakness of this approach is that expert knowledge is needed to develop the correlation matrix, which gets quite involved with constructing causality graphs that model cause and effect of events in the network [47]. This approach has been pioneered by System Arts Management, Inc. with their InCharge product, which they claim has zero maintenance costs, near perfect fault isolation and very fast identification [24]. InCharge can be integrated with either HP OpenView or IBM NetView; research examples of ....
....event correlation as networks increase in complexity and importance. Achieving standard representation of correlation events is a recognized industry goal. Past research efforts that invented their own scripting language to implement a solution posed a severely self-limiting implementation factor [26, 9, 30, 47]. Although some private sector efforts claim success using case-based, model-based and codebook-based approaches, these appear to be based on highly specific examples that have not been subjected to open evaluation from outside sources [47], [27]. Published reviews that attempt to compare event ....
[Article contains additional citation context not shown here]
S.A. Yemini, S. Kliger, E. Mozes, Y. Yemini, and D. Ohsie. High speed and robust event correlation. IEEE Communications, May 1996.
....utilized for the networks operating in an unreliable environment such as wireless and/or military networks. 1 INTRODUCTION. To improve the network's ability to provide reliable services to end systems, a management system needs to efficiently and accurately identify the occurring network failures [13, 25]. A common procedure is to correlate network or service layer symptoms; however, this process is usually impaired by the large number of a system's layers and parameters [1, 8, 22], their interactions, and the uncertainty about their state. This paper presents a preliminary study of applying ....
....and improving the availability and performance of network services. The problems typically addressed in the literature are (1) incomplete knowledge about the existence of causal relationships between network events [13], (2) possibility of incomplete symptom observations or spurious symptoms [9, 25], (3) system adaptability to configuration changes [1], (4) ability to learn event correlation patterns [14], and (5) temporal event correlation [15]. Most of these techniques rely on the assumption that the existence of multiple simultaneous faults is negligible. Relationships between network ....
S. A. Yemini, S. Kliger, E. Mozes, Y. Yemini, and D. Ohsie. High speed and robust event correlation. IEEE Communications, 34(5):82--90, 1996.
....techniques, including event correlation systems, are based on static dependency models describing the relationships among the hardware and software components in the system. These dependency models are used to determine which components might be responsible for the symptoms of a given problem [5, 25, 6, 13]. The first major limitation of traditional dependency models is the difficulty of generating and maintaining an accurate model of a constantly evolving Internet service. Their second major limitation is that they typically only model a logical system, and ....
....These systems mainly use two approaches. The first approach uses expert systems with rules (or filters) input by humans or obtained through machine learning techniques. The second approach uses dependency models [25, 6, 13]. However, these systems do not consider how the required dependency models are obtained. More recent research has focused on automatically generating dependency models. Brown et al. [5] use active perturbation of the system to identify dependencies and use statistical modeling of the system to ....
A. Yemini and S. Kliger. High Speed and Robust Event Correlation. IEEE Communication Magazine, 34(5):82--90, May 1996.
....such dependency models. 3. Related Work. There has been significant interest in the literature in using dependency models for problem diagnosis and root cause analysis. Two main approaches stand out. The first is in the context of event correlation systems, such as those described by Yemini et al. [12], Choi et al. [2], and Gruschke [4]. In these systems, incoming alarms or events are first mapped onto corresponding nodes of the dependency graph, then the dependencies from those nodes are examined to identify the set of nodes upon which the most alarm/event nodes depend. These nodes are likely ....
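The dependency-graph procedure described in the context above (map alarms onto graph nodes, follow their dependencies, and pick the nodes on which the most alarm nodes depend) can be sketched as follows. This is only an illustration under stated assumptions: the component names and the `DEPENDS_ON` graph are hypothetical, dependencies are followed transitively, and an alarming node is not counted as its own dependency; none of these details are given in the quoted text.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set

# Hypothetical dependency graph: component -> components it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "web_frontend": ["app_server", "dns"],
    "app_server": ["database", "switch_a"],
    "database": ["switch_a"],
    "dns": ["switch_b"],
    "switch_a": [],
    "switch_b": [],
}


def transitive_dependencies(node: str, graph: Dict[str, List[str]]) -> Set[str]:
    """All components that `node` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, []))
    return seen


def likely_root_causes(alarms: Iterable[str],
                       graph: Dict[str, List[str]] = DEPENDS_ON) -> List[str]:
    """Components on which the largest number of alarming nodes depend."""
    counts: Counter = Counter()
    for alarm_node in set(alarms):
        counts.update(transitive_dependencies(alarm_node, graph))
    if not counts:
        return []
    top = max(counts.values())
    return sorted(node for node, count in counts.items() if count == top)


if __name__ == "__main__":
    # Both alarming components depend on switch_a.
    print(likely_root_causes(["app_server", "database"]))
```

With alarms on `app_server` and `database`, the only component both depend on in this toy graph is `switch_a`, so it is returned as the candidate root cause.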
S. Yemini, S. Kliger et al., "High Speed and Robust Event Correlation," IEEE Communications Magazine, vol. 34, no. (5), pp. 82-90, May 1996.
....is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. state is frequently disturbed by the presence of lost and/or spurious symptoms (usually referred to as observation noise). Although many researchers have suggested [6], [8] that fault localization should be resilient to the existence of spurious and/or lost alarms, we are aware of only one technique [8] that incorporates lost and spurious symptoms into deterministic fault localization. In this paper, we propose a technique that allows lost and spurious symptoms to ....
.... is frequently disturbed by the presence of lost and/or spurious symptoms (usually referred to as observation noise). Although many researchers have suggested [6], [8] that fault localization should be resilient to the existence of spurious and/or lost alarms, we are aware of only one technique [8] that incorporates lost and spurious symptoms into deterministic fault localization. In this paper, we propose a technique that allows lost and spurious symptoms to be incorporated in the nondeterministic analysis (Section VI). We prove that reasoning with positive and noisy observations does not ....
[Article contains additional citation context not shown here]
S. A. Yemini, S. Kliger, E. Mozes, Y. Yemini, and D. Ohsie, "High speed and robust event correlation," IEEE Communications Magazine, vol. 34, no. 5, pp. 82--90, 1996.
....this approach is system specific, requiring custom code and expert protocol knowledge. A more generic approach is dependency analysis, which uses traversal-based techniques to subset potential root causes from an overall dependency graph based on observed symptoms and alarms [10], [24], [32], [59]. Dependency analysis requires a good dependency model, and unfortunately there has been little work on obtaining the needed dependency models from complex systems. The authors of the works referenced above simply assume the existence of dependency models, and the existing work on computing ....
S. Yemini, S. Kliger et al. High Speed and Robust Event Correlation. IEEE Communications Magazine, 34(5):82--90, May 1996.
....with the capability for online fault injection (notably the IBM 3090 and ES 9000 mainframes [Merenda92]). Problem diagnosis: There are several standard approaches to problem diagnosis. One is to use models and dependency graphs to perform diagnosis [Choi99, Gruschke98, Katker97, Lee00, Yemini96]. When models are not available, they can either be discovered [Kar00, Brown01, Miller95, Zave98] or alternate techniques can be used, such as Banga's system-specific combination of monitoring, protocol augmentation, and cross-layer correlation [Banga00]. Our Pinpoint example demonstrates ....
S. Yemini, S. Kliger et al. High Speed and Robust Event Correlation. IEEE Communications Magazine, 34(5):82--90, May 1996.
....seen from the customer's viewpoint and are the basis of SLAs. A prerequisite for guaranteeing a property of a service is to assure the properties of technical and organisational components. To provide evidence that SLAs are fulfilled, properties of resources have to be measured and correlated to a service [YKMY96]. Resource properties that cannot be measured and assigned to a service and to an individual customer cannot be assured as a quality of service attribute specified in SLAs. Main services can cover a range from central data centre operations to desktop services. These services can be ....
Yemini, S.; Kliger, S.; Mozes, E.; Yemini, Y., 1996, "High Speed and Robust Event Correlation", IEEE Communication Magazine.
....their root causes. Current root cause analysis techniques use approaches that do not sufficiently capture the dynamic complexity of large systems, and they require people to input extensive knowledge about the systems [22, 4]. Recent event correlation techniques based on dependency models [5, 23, 6, 12] use statically generated dependencies of components to determine which components are responsible for the symptoms of a given problem. Two limitations of using dependency models are that they are difficult to generate accurately and they are difficult to keep consistent with an evolving system. ....
....OpenView [8], IBM's Tivoli [14], and Altaworks Panorama [3]. These systems mainly use two approaches. The first approach is expert systems that use rules (or filters) input by humans or obtained through machine learning techniques. The second approach uses dependency models, such as Yemini et al. [23], Choi et al. [6], and Gruschke [12]. However, these systems do not consider how the required dependency models are obtained. More recent research has focused on automatically generating dependency models. Brown et al. [5] use active perturbation of the system to identify dependencies and use ....
A. Yemini and S. Kliger. High speed and robust event correlation. IEEE Communication Magazine, 34(5):82--90, May 1996.
....a; s 0 ) are completely known. There are various methods to determine the optimal policies for an MDP, e.g., value iteration [2] and policy iteration [12]. When the system dynamics are unknown, reinforcement learning can be applied [25], e.g., the pioneering one-step off-policy Q-learning algorithm [27], which iterates the Q value based on the resulting next state and immediate reward: $Q_{t+1}(s_t, a_t) = (1 - \alpha)\, Q_t(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a_{t+1}} Q_t(s_{t+1}, a_{t+1}) \right]$ (8), where $0 < \alpha \le 1$ is a learning parameter. It can be proved that $Q_t$, under minimal technical conditions, converges to $Q^*$ ....
....Q value based on the resulting next state and immediate reward: $Q_{t+1}(s_t, a_t) = (1 - \alpha)\, Q_t(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a_{t+1}} Q_t(s_{t+1}, a_{t+1}) \right]$ (8), where $0 < \alpha \le 1$ is a learning parameter. It can be proved that $Q_t$, under minimal technical conditions, converges to $Q^*$ asymptotically as $t \to \infty$ [27]. The optimal policy can then be determined from $Q^*$. Solution algorithms for MDPs can handle problems with thousands of states. Unfortunately, in many real-life applications, states cannot be observed completely. In a communication network, if we denote the up/down states of NE, the states ....
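The one-step Q-learning update quoted in the context above can be sketched in tabular form. This is a minimal illustration rather than the cited authors' implementation: the function name `q_update`, the symbols `alpha` (learning parameter) and `gamma` (discount factor), and their default values are assumptions, since the symbols did not survive in the extracted text.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Tuple

# Tabular Q function: (state, action) -> estimated value, defaulting to 0.0.
QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable,
             state: Hashable,
             action: Hashable,
             reward: float,
             next_state: Hashable,
             actions: Iterable[Hashable],
             alpha: float = 0.1,
             gamma: float = 0.9) -> float:
    """Apply one Q-learning step and return the updated Q(state, action).

    Q(s, a) <- (1 - alpha) * Q(s, a) + alpha * (r + gamma * max_a' Q(s', a'))
    """
    # Best value reachable from the next state under the current estimates.
    best_next = max((q[(next_state, a)] for a in actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] = (1.0 - alpha) * q[(state, action)] + alpha * target
    return q[(state, action)]


if __name__ == "__main__":
    q: QTable = defaultdict(float)
    # With an all-zero table, the first update moves Q(0, 0) toward the reward.
    print(q_update(q, state=0, action=0, reward=1.0, next_state=1,
                   actions=[0, 1], alpha=0.5, gamma=0.9))
```

The update is off-policy in the sense that the target uses the maximum over next actions regardless of which action is actually taken next.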
Yemini, A., et al. (1996) High Speed and Robust Event Correlation, IEEE Communication Magazine, p82-90,May 1996.
.... [HON97]. At the same time, many researchers are investigating the benefits of intelligent mobile agent technology as a more generic framework for performing distributed management tasks [MAG96, BAL97]. The need for event filtering and correlation has been identified in a number of previous works [YEM96, XEM97, GRU97, KAL96]. However, to the best of our knowledge, little or no work has been performed for consuming and visualizing events in web-based management environments. [YEM96] describes a distributed event management architecture, InCharge, for the root cause analysis procedure. This ....
....[MAG96, BAL97]. The need for event filtering and correlation has been identified in a number of previous works [YEM96, XEM97, GRU97, KAL96]. However, to the best of our knowledge, little or no work has been performed for consuming and visualizing events in web-based management environments. [YEM96] describes a distributed event management architecture, InCharge, for the root cause analysis procedure. This procedure is based on an event model which serves as the knowledge base for event abstractions and on a computation model which is a reasoning algorithm based on Codebook ....
S. Yemini, E. Moses, Y. Yemini and D. Ohsie, "High Speed and Robust Event Correlation ", IEEE Communications Magazine, May 1996.
....of automatically grouping related events based on their underlying common, thereby compressing the event stream and identifying underlying hidden problems. NetFACT [Houck et al. 1995], SINERGIA [Brugnoni et al. 1993], IMPACT [Jakobson et al. 1995], ECXpert [Nygate 1995], and the authors' own DECS [IEEE96] are all examples of such systems. An event correlation system consists of two basic components: an event definition and propagation model (or simply event model) and a reasoning algorithm. The event model describes the underlying system, while the reasoning algorithm processes incoming events and ....
....reasoning algorithm would infer the presence of the congestion problem based on the poor video and audio quality and the event model illustrated. In previous work [ISINM95], we described the coding approach to event correlation, which is the reasoning algorithm of our DECS event correlation system [IEEE96]. There we showed how the symptoms of each problem in a modeled system could be treated as a code for that problem, and that elementary techniques from coding theory could be profitably applied to event correlation. That work presupposes that there is a causality graph which maps each problem to ....
S. A. Yemini, S. Kliger, E. Mozes, Y. Yemini, and D. Ohsie, High Speed and Robust Event Correlation. IEEE Communications Magazine, May 1996.
No context found.
A. Yemini and S. Kliger. High speed and robust event correlation. IEEE Communication Magazine, 34(5):82--90, May 1996.
No context found.
S. A. Yemini, S. Kliger, E. Mozes, Y. Yemini, and D. Ohsie. High speed and robust event correlation. IEEE Communications, 34(5):82--90, May 1996.
No context found.
S. Yemini et al.- High Speed and Robust Event Correlation. IEEE Communications Magazine, May 1996.
No context found.
S. Yemini and S. K. et al. High speed and robust event correlation. IEEE Communications Magazine, May 1996.
No context found.
A. Yemeni and S. Kliger. High speed and robust event correlation. IEEE Communications Magazine, 34(5), May 1996.
No context found.
A. Yemini and S. Kliger. High speed and robust event correlation. IEEE Communication Magazine, 34(5):82--90, May 1996.
No context found.
S. A. Yemini, S. Kliger, E. Mozes, Y Yemini, and D. Ohsie. High speed and robust event correlation. IEEE Communications Magazine, pages 82--90, May 1996.
No context found.
YEMINI, A., AND KLIGER, S. High Speed and Robust Event Correlation. IEEE Communication Magazine 34, 5 (May 1996), 82--90.
No context found.
S. Yemini, S. Kliger et al. High Speed and Robust Event Correlation. IEEE Communications Magazine, 34(5):82--90, May 1996.
No context found.
S. Yemini et al., "High speed and robust event correlation," IEEE Communications Magazine, May 1996. 13