Least absolute deviations estimation for the censored regression model
 Journal of Econometrics
, 1984
Cited by 285 (6 self)
This paper proposes an alternative to maximum likelihood estimation of the parameters of the censored regression (or censored ‘Tobit’) model. The proposed estimator is a generalization of least absolute deviations estimation for the standard linear model, and, unlike estimation methods based on the assumption of normally distributed error terms, the estimator is consistent and asymptotically normal for a wide class of error distributions, and is also robust to heteroscedasticity. The paper gives the regularity conditions and proofs of these large-sample results, and proposes classes of consistent estimators of the asymptotic covariance matrix for both homoscedastic and heteroscedastic disturbances.
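As a reading aid, the estimator described in this abstract can be written out as follows. The notation (y_t, x_t, u_t, β, T) is an assumption made here for illustration; the abstract itself gives no formulas. With observations censored at zero, the least absolute deviations criterion is applied to the censored regression function:

```latex
y_t = \max\{0,\; x_t'\beta_0 + u_t\}, \qquad t = 1, \dots, T,
\qquad
\hat{\beta}_T = \arg\min_{\beta} \; \frac{1}{T} \sum_{t=1}^{T}
  \bigl|\, y_t - \max\{0,\; x_t'\beta\} \,\bigr| .
```

The idea is that if u_t has median zero, the conditional median of y_t given x_t is max{0, x_t'β_0}, so no distributional assumption such as normality is needed.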
Polynomial Splines and Their Tensor Products in Extended Linear Modeling
 Ann. Statist.
, 1997
Cited by 221 (16 self)
ANOVA-type models are considered for a regression function or for the logarithm of a probability function, conditional probability function, density function, conditional density function, hazard function, conditional hazard function, or spectral density function. Polynomial splines are used to model the main effects, and their tensor products are used to model any interaction components that are included. In the special context of survival analysis, the baseline hazard function is modeled and nonproportionality is allowed. In general, the theory involves the L2 rate of convergence for the fitted model and its components. The methodology involves least squares and maximum likelihood estimation, stepwise addition of basis functions using Rao statistics, stepwise deletion using Wald statistics, and model selection using BIC, cross-validation or an independent test set. Publicly available software, written in C and interfaced to S/S-PLUS, is used to apply this methodology to...
Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 2000;56:337
Cited by 140 (5 self)
SUMMARY. ROC curves are a popular method for displaying sensitivity and specificity of a continuous diagnostic marker, X, for a binary disease variable, D. However, many disease outcomes are time dependent, D(t), and ROC curves that vary as a function of time may be more appropriate. A common example of a time-dependent variable is vital status, where D(t) = 1 if a patient has died prior to time t and zero otherwise. We propose summarizing the discrimination potential of a marker X, measured at baseline (t = 0), by calculating ROC curves for cumulative disease or death incidence by time t, which we denote as ROC(t). A typical complexity with survival data is that observations may be censored. Two ROC curve estimators are proposed that can accommodate censored data. A simple estimator is based on using the Kaplan-Meier estimator for each possible subset X > c. However, this estimator does not guarantee the necessary condition that sensitivity and specificity are monotone in X. An alternative estimator that does guarantee monotonicity is based on a nearest neighbor estimator for the bivariate distribution function of (X, T), where T represents survival time (Akritas, M. J., 1994, Annals of Statistics 22, 1299–1327). We present an example where ROC(t) is used to compare a standard and a modified flow cytometry measurement for predicting survival after detection of breast cancer and an example where the ROC(t) curve displays the impact of modifying eligibility criteria for sample size and power in HIV prevention trials.
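The simple Kaplan-Meier-based estimator mentioned in this abstract can be sketched with Bayes' theorem. The symbols S(t) = P(T > t), S(t | X > c) and F_X (the distribution function of X) are notation assumed here; the abstract does not spell the formulas out, so this is a reconstruction rather than a quotation:

```latex
\text{sensitivity}(c, t) = P\{X > c \mid D(t) = 1\}
  = \frac{\{1 - S(t \mid X > c)\}\,\{1 - F_X(c)\}}{1 - S(t)},
\qquad
\text{specificity}(c, t) = P\{X \le c \mid D(t) = 0\}
  = 1 - \frac{S(t \mid X > c)\,\{1 - F_X(c)\}}{S(t)} .
```

Plugging in Kaplan-Meier estimates for S(t) and S(t | X > c) and the empirical distribution for F_X gives an estimate at each cutoff c; ROC(t) then traces sensitivity against 1 − specificity as c varies. Because S(t | X > c) is estimated separately for every subset X > c, nothing forces the resulting curve to be monotone in c, which is the shortcoming the abstract notes.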
C.L.: Game Traffic Analysis: An MMORPG Perspective. Computer Networks 51(3
, 2007
Cited by 92 (14 self)
Online gaming is one of the most profitable businesses over the Internet. Among all genres of the online games, the popularity of the MMORPG (Massive Multiplayer Online Role Playing Games) is especially prominent in Asia. Opting for a better understanding of the game traffic and the economic well-being of the Internet, we analyze a 1,356-million-packet trace from a sizeable MMORPG, ShenZhou Online. This work is, as far as we know, the first formal analysis of MMORPG server traces. We find that MMORPG and FPS (First-Person Shooting) games are similar in that they both generate small packets and require low bandwidths. In particular, the bandwidth requirement of MMORPG is even lower due to the less real-time game play. More distinctive are the strong periodicity, temporal locality, and irregularity observed in the MMORPG traffic. The periodicity is due to a common practice in game implementation, where the game state updates are accumulated within a fixed time window before transmission. The temporal locality in the game traffic is largely due to the game nature where one action leads to another. The irregularity, particularly unique to MMORPG traffic, is due to the diversity of game design where the user behavior can be drastically different depending on the quest at hand.
Quality and duration of bank relationships
 Global Cash Management in Europe
, 1998
Cited by 90 (23 self)
Governors or its staff. For comments, we thank Mitch Berlin, Erik Berglöf, Øyvind Bøhren, Yehning Chen,
Study of a bus-based disruption-tolerant network: mobility modeling and impact on routing
 ACM MOBICOM
, 2007
Cited by 88 (0 self)
We study traces taken from UMass DieselNet, a Disruption-Tolerant Network consisting of WiFi nodes attached to buses. As buses travel their routes, they encounter other buses and in some cases are able to establish pairwise connections and transfer data between them. We analyze the bus-to-bus contact traces to characterize the contact process between buses and its impact on DTN routing performance. We find that the all-bus-pairs aggregated inter-contact times show no discernible pattern. However, the inter-contact times aggregated at a route level exhibit periodic behavior. Based on analysis of the deterministic inter-meeting times for bus pairs running on route pairs, and consideration of the variability in bus movement and the random failures to establish connections, we construct generative route-level models that capture the above behavior. Through trace-driven simulations of epidemic routing, we find that the epidemic performance predicted by traces generated with this finer-grained route-level model is much closer to the actual performance that would be realized in the operational system than traces generated using the coarse-grained all-bus-pairs aggregated model. This suggests the importance of choosing the right level of model granularity when modeling mobility-related measures such as inter-contact times in DTNs.
A Gompertzian model of human breast cancer growth. Cancer Res
 48: 7067–7071. PMID: 3191483
, 1988
Cited by 80 (4 self)
The pattern of growth of human breast cancer is important theoretically and clinically. Speer et al. (Cancer Res., 44: 4124–4130, 1984) have recently proposed that all individual tumors initially grow with identical Gompertzian parameters, but subsequently develop kinetic heterogeneity by a random time-dependent process. This concept has elicited interest because it fits clinical data for the survival of untreated patients, for the progression of shadows on serial paired mammograms, and for time-to-relapse following mastectomy. The success of these curve fits is compelling, and the model has been applied to clinical trials. However, the assumption of uniform nascent growth is not supported by theory or data, and individual cancers have not been shown to follow the complex growth curves predicted by the Speer model. As an alternative, if kinetic heterogeneity is understood to be an intrinsic property of neoplasia, the same three historical data sets are fit well by an unadorned Gompertzian model which is parsimonious and has many other intuitive and empirical advantages. The two models differ significantly in such clinical projections as the estimated duration of silent growth prior to diagnosis and the anticipated optimal chemotherapy schedule postsurgery.
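For readers unfamiliar with the term, a Gompertzian growth curve can be sketched in a few lines. The parameterization N(t) = N(0) exp(k(1 − exp(−bt))) is the standard textbook form and is an assumption here, since the abstract gives no formula; the function names, parameter names and numeric values are hypothetical and not taken from the paper.

```python
import math


def gompertz_size(t, n0, k, b):
    """Size at time t under N(t) = n0 * exp(k * (1 - exp(-b * t))).

    n0: size at t = 0; k: sets the plateau; b: rate at which growth decelerates.
    All three names are illustrative assumptions, not the paper's notation.
    """
    return n0 * math.exp(k * (1.0 - math.exp(-b * t)))


def plateau_size(n0, k):
    """Limiting size as t grows without bound: n0 * exp(k)."""
    return n0 * math.exp(k)


if __name__ == "__main__":
    # Arbitrary illustrative values; growth is fast early and slows near the plateau.
    for t in (0, 10, 50, 200, 1000):
        print(t, gompertz_size(t, n0=1.0, k=20.0, b=0.01))
```

The single curve has only two shape parameters (k and b), which is the sense in which such a model is parsimonious; heterogeneity across tumors would then be expressed as variation in those parameters.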
k-Plane Clustering
 Journal of Global Optimization
, 2000
Cited by 77 (3 self)
A finite new algorithm is proposed for clustering m given points in n-dimensional real space into k clusters by generating k planes that constitute a local solution to the nonconvex problem of minimizing the sum of squares of the 2-norm distances between each point and a nearest plane. The key to the algorithm lies in a formulation that generates a plane in n-dimensional space that minimizes the sum of the squares of the 2-norm distances to each of m1 given points in the space. The plane is generated by an eigenvector corresponding to a smallest eigenvalue of an n × n simple matrix derived from the m1 points. The algorithm was tested on the publicly available Wisconsin Breast Prognosis Cancer database to generate well-separated patient survival curves. In contrast, the k-mean algorithm did not generate such well-separated survival curves. 1 Introduction There are many approaches to clustering such as statistical [2, 9, 6], machine learning [7, 8] and mathematical programming [15...
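The two steps this abstract describes (fit a plane via a smallest-eigenvalue eigenvector, then reassign points to their nearest plane) can be sketched as below. This is a minimal illustration under assumptions, not the authors' implementation: the function names, the random initial assignment, the empty-cluster fallback and the stopping rule are all hypothetical choices, and the n × n matrix is taken to be the scatter matrix of the mean-centered points.

```python
import numpy as np


def fit_plane(points):
    """Return (w, gamma) for the plane {x : x @ w = gamma}, with ||w|| = 1,
    minimizing the sum of squared 2-norm distances to the given points."""
    A = np.asarray(points, dtype=float)
    mean = A.mean(axis=0)
    centered = A - mean
    scatter = centered.T @ centered          # n x n symmetric matrix
    _, eigvecs = np.linalg.eigh(scatter)     # eigenvalues in ascending order
    w = eigvecs[:, 0]                        # eigenvector of a smallest eigenvalue
    return w, float(mean @ w)


def k_plane(points, k, n_iter=100, seed=0):
    """Alternate plane fitting and nearest-plane assignment until labels stop changing.

    The returned planes are the ones fitted in the last iteration performed.
    """
    A = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(A))    # assumed: random initial clusters
    planes = []
    for _ in range(n_iter):
        planes = []
        for j in range(k):
            members = A[labels == j]
            if len(members) == 0:            # assumed fallback: reseed with one random point
                members = A[rng.integers(len(A), size=1)]
            planes.append(fit_plane(members))
        W = np.stack([w for w, _ in planes])             # k x n normals
        g = np.array([gamma for _, gamma in planes])     # k offsets
        dist = np.abs(A @ W.T - g)                       # m x k point-to-plane distances
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, planes
```

As the abstract says, the result is only a local solution, so in practice the outcome depends on the starting assignment; running from several seeds and keeping the lowest total squared distance is one common way to handle that.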