Discovering and Quantifying Mean Streets: A Summary of Results
, 2007
Cited by 3 (1 self)
Mean streets represent those connected subsets of a spatial network whose attribute values are significantly higher than expected. Discovering and quantifying mean streets is an important problem with many applications such as detecting high-crime-density streets and high crash roads (or areas) for public safety, detecting urban cancer disease clusters for public health, detecting human activity patterns in asymmetric warfare scenarios, and detecting urban activity centers for consumer applications. However, discovering and quantifying mean streets in large spatial networks is computationally very expensive due to the difficulty of characterizing and enumerating the population of streets to define a norm or expected activity level. Previous work either focuses on statistical rigor at the cost of computational exorbitance, or
Geographic Crime Linkage Analysis: A Spatio-Temporal Data Mining Approach
Geographic crime linkage analysis focuses on identifying spatially grouped serial crimes and criminals from a given set of crime reports and other related information provided by state and local law enforcement agencies. Discovering and tracking spatial relationships in crime data is an important problem in crime analysis, for example identifying relationships among crimes committed by the same offender (e.g. a serial killer) or the same group of offenders (e.g. organized crime). However, geographic crime linkage analysis is challenging for several reasons: (i) the existence of disparate sources of data in the form of incident reports, dispatch records and modus operandi information; (ii) crime data might have spatially skewed distributions; (iii) the size, volume and complexity of the data available to law enforcement agencies are growing; (iv) a high risk of generating spurious patterns; and (v) the presence of large amounts of missing or imprecise data. Existing work in crime analysis assumes a normal distribution of crime datasets and does not take micro-environmental factors into account. It also focuses on manual discovery of spatially grouped crimes, whose results might be analyst-oriented and dependent on the underlying distribution of the data. Existing tools in crime analysis, classical data mining and geographic profiling make use of a variety of techniques which have these limitations.
The focus of this proposal is to create and explore a novel spatiotemporal data mining platform (STDMP) for geographic crime series linkage analysis (GCSLA). This includes developing a spatiotemporal data mining framework that can address the spatiotemporal nature of crime data, consider micro-environmental factors while generating hypotheses, avoid assuming any specific distribution of the data, and scale to large crime datasets. We would validate our proposed approach with real datasets from law enforcement agencies and also deploy our framework as components of existing crime analysis tools.
Detection of Genetic Effects of Environmental Agents
, 1981
The fundamental problems in population monitoring for genetic effects are twofold: the binomialized nature of the data and the lower power due to small risk of finding positive results. The binomial character is artificial, even forced, and can with advantage be replaced by more refined analysis, and by a focus on all mutations, not merely harmful ones. Moreover, a binomial treatment ignores accessory information (birth order, clustering, etc.). But this objective requires that an explicit model be used instead of nonparametric methods; a cancer may represent multiple independent hits that should be separately scored; sequencing of a codon or its product may show multiple distinct changes.
A sage... made the discovery that the flesh of swine... might be cooked... without the necessity of consuming the whole house to dress it. Charles Lamb: Dissertation on Roast Pig
There is a pompous humility that I deplore in others, which I am about to represent as a virtue in myself. One who is not well up in a field is apt to harp on the very elementary aspects; and such is our conceit, that we hope that such topics are not merely elementary, but also elemental. Just so, the demi monde of mathematics uses florid formulas, but the pure mathematician wonders about obvious things: what "greater than" may mean, or whether a sheet of paper has two surfaces.
The Problem
The fundamental problem in the whole system of monitoring for genetic effects in populations seems to me to reside in two features: first, the data are more or less binomial in type, that is, they deal with individuals that either have, or have not, some particular characteristic; and secondly, the probability of a successful search, that is, of finding an aberrant case, is very small, perhaps one chance in a million or some small multiple thereof. If we find that, in the face of some environmental exposure, there are authentic

