23 citations found.
A. Shoshani. Statistical databases: Characteristics, problems, and some solutions. In Eighth International Conference on Very Large Data Bases, September 8-10, 1982.


This paper is cited in the following contexts:
Multi-Scale Partitions: Application to Spatial and.. - Rigaux, Scholl (1995)   (13 citations)  (Correct)

....the level of abstraction at which data is represented (see [RS94] for a general overview of the multi-scale issues within GIS databases). In this paper we propose a model for multi-scale representation. We show that this model not only applies to spatial databases but also to statistical databases [Sho82, Gho86, RS90]. We restrict our attention to a simple but very common situation where data is defined as an hierarchy of partitions of a single space. The model we define allows to represent and query databases containing such hierarchies. We discuss different ways of implementing such hierarchies which use ....

A. Shoshani. Statistical Databases: Characteristics, Problems, and Some Solutions. In Int. Conference on Very Large Databases (VLDB), Mexico, 1982.


Information Sharing Across Private Databases - Agrawal, Evfimievski, Srikant (2003)   (16 citations)  (Correct)

....our techniques do not address the question of what the parties might learn by combining the results of multiple queries. The first line of defence against this problem is the scrutiny of the queries by the parties. In addition, query restriction techniques from the statistical database literature [1, 44] can also help. These techniques include restricting the size of query results [17, 23], controlling the overlap among successive queries [19], and keeping audit trails of all answered queries to detect possible compromises [13]. Schema Discovery and Heterogeneity. We do not address the question ....

A. Shoshani. Statistical databases: Characteristics, problems and some solutions. In Proc. of the Eighth Int'l Conference on Very Large Databases, pages 208--213, Mexico City, Mexico, September 1982.


Privacy Preserving Mining of Association Rules - Evfimievski, Srikant.. (2002)   (48 citations)  (Correct)

....1.2 Related Work There has been extensive research in the area of statistical databases motivated by the desire to provide statistical information (sum, count, average, maximum, minimum, pth percentile, etc.) without compromising sensitive information about individuals (see surveys in [1, 22]). The proposed techniques can be broadly classified into query restriction and data perturbation. The query restriction family includes restricting the size of query result, controlling the overlap amongst successive queries, keeping audit trail of all answered queries and constantly checking for ....

A. Shoshani. Statistical databases: Characteristics, problems and some solutions. In VLDB, pages 208--213, Mexico City, Mexico, September 1982.


Hippocratic Databases - Agrawal, Kiernan, Srikant, Xu (2002)   (12 citations)  (Correct)

....of current database systems. 2.1 Statistical Databases The research in statistical databases was motivated by the desire to be able to provide statistical information (sum, count, average, maximum, minimum, pth percentile, etc.) without compromising sensitive information about individuals [1] [47]. The proposed techniques can be broadly classified into query restriction and data perturbation. The query restriction family includes restricting the size of query results [13, 18], controlling the overlap among successive queries [14], keeping audit trails of all answered queries and ....

A. Shoshani. Statistical databases: Characteristics, problems and some solutions. In Proc. of the Eighth Int'l Conference on Very Large Databases, pages 208--213, Mexico City, Mexico, September 1982.


Deriving Initial Data Warehouse Structures from the.. - Boehnlein, Ende (1999)   (2 citations)  (Correct)

....for data warehouse structures in this paper. The basic building blocks of multidimensional data structures as a central basis of data warehouses are briefly discussed in this section. The basic idea of multidimensional data structures is the separation of quantitative and qualitative data [20] (figure 1). Quantitative measurable facts, called measures or measured facts, are analyzed from various viewpoints based on the qualitative content of the data [12]. For example the turnover of a company could be analyzed by examining the product structure, sales structure and time. The ....

Shoshani, A.: Statistical Databases: Characteristics, Problems and some Solutions, in: Proceedings of the 8th International Conference on Very Large Data Bases (VLDB'82, Mexico City, Mexico, sept. 8-10), 1982, p. 208-222.


Automatic Aggregation using Explicit Metadata - Grumbach, Tininini (2000)   (Correct)

....present framework. Finally, the proposed formalism is shown to be useful for statistical database schema design. 1. Introduction The manipulation of aggregate data raises important and specific issues, that have been studied with different foci both in the field of statistical databases (SDBs) [Sho82, Su83, SW85, Gho86, OOM87, CDL95, RBT96], and in the field of on-line analytical processing (OLAP) [GBLP96, HRU96, LS97, Sho97, CD97]. The real challenge of this sort of data is caused by the rather intricate semantics of summary values, that is not handled by classical database systems. The relationships between OLAP and data mining ....

A. Shoshani. Statistical databases: Characteristics, problems, and some solutions. In Proc 8th Int Conf on Very Large Data Bases, Mexico City, Mexico, September 1982.


Privacy-Preserving Data Mining - Agrawal, Srikant (2000)   (98 citations)  (Correct)

....There has been extensive research in the area of statistical databases motivated by the desire to be able to provide statistical information (sum, count, average, maximum, minimum, pth percentile, etc.) without compromising sensitive information about individuals (see excellent surveys in [AW89, Sho82]). The proposed techniques can be broadly classified into query restriction and data perturbation. The query restriction family includes restricting the size of query result (e.g. [Fel72, DDS79]), controlling the overlap amongst successive queries (e.g. [DJL79]), keeping audit trail of all ....

A. Shoshani. Statistical databases: Characteristics, problems and some solutions. In Proceedings of the Eighth International Conference on Very Large Databases (VLDB), pages 208--213, Mexico City, Mexico, September 1982.


Intelligent Support for Multidimensional Data Analysis in.. - Kamp, Wietek (1997)   (Correct)

....are aggregate (summary) data, i.e. multidimensional data about subpopulations of a study population classified by different attributes, e.g. counts of incidence cases or standardized rates by age, sex, time, and district. Each data set is represented by a so-called statistical object [RF92, Sho82]. Formally, a statistical object is defined as a triple (S, C, m) of one summary attribute S, a set of category attributes C, and a multidimensional data array m. The summary attribute provides the metadata associated with the data set. It describes, for instance, type, domain, and statistical ....

....networks have to be developed, which group the application of a number of methods. Furthermore, the system might be extended to support interactive statistical graphics. • Statistical objects will be stored persistently in a statistical database offering efficient access to the data [Mic86, Sho82]. • Up to now, the user is restricted to implementing new C classes if he wants to expand the pool of methods, data types and tests for suitability. We do not want to design a new language for statistical analysis, which would be a very complex task, but an important objective is to provide ....

A. Shoshani. Statistical databases - characteristics, problems and some solutions. In Proceedings of the 8th International Conference on Very Large Data Bases, Morgan Kaufman, 1982.


Spatial Data Analysis Support for Cancer Epidemiology in CARESS - Wietek, Kamp   (Correct)

....is linked together with population data and additional spatial data into a unified view onto the database. This view is implemented by a MDD (multidimensional discrete data) mapping layer, which is based on concepts developed for spatio-temporal, multidimensional and statistical database systems [Bau94, CCS93, Gue94, Sho82]. Aggregated data sets (e.g. case or population counts by region, age, and sex) are implemented as data cubes, providing efficient access to groups of single data values. By defining categorisation hierarchies on classifying attributes (space, time, age, sex, or type of disease) flexible ....

A. Shoshani. Statistical Databases -- Characteristics, Problems and Some Solutions. In: 8th International Conference on Very Large Data Bases (VLDB), pages 208-222, Mexico City, Mexico, 1982. Morgan Kaufmann.


Intelligent Support for Multidimensional Data Analysis in.. - Kamp, Wietek (1997)   (Correct)

....context extended access means that the layer provides extended query facilities like spatial and statistical operators. Additionally, it provides an object puffer for caching determined results for reuse. Epidemiologic analysis data is represented by so-called statistical objects ([12], [13], [17]). All data sets processed by the system are aggregate (summary) data, i.e. multidimensional data about subpopulations of a study population classified by different attributes, e.g. counts of incidence cases or standardized rates by age, sex, time, and district. Formally, a statistical object is ....

A. Shoshani. Statistical Databases - Characteristics, Problems and Some Solutions. In Proceedings of the 8th International Conference on Very Large Data Bases, Morgan Kaufman, 1982.


Multi-Scale Partitions: Application to Spatial and.. - Rigaux, Scholl (1995)   (13 citations)  (Correct)

....the level of abstraction at which data is represented (see [RS94] for a general overview of the multi-scale issues within GIS databases). In this paper we propose a model for multi-scale representation. We show that this model not only applies to spatial databases but also to statistical databases [Sho82, Gho86, RS90]. We restrict our attention to a simple but very common situation where data is defined as an hierarchy of partitions of a single space. The model we define allows to represent and query databases containing such hierarchies. Work partially supported by the French CNRS GDR Cassini. We ....

A. Shoshani. Statistical Databases: Characteristics, Problems, and Some Solutions. In Int. Conference on Very Large Databases (VLDB), Mexico, 1982.


Modeling Large Scale OLAP Scenarios - Lehner (1998)   (24 citations)  (Correct)

....i.e. a single article in the ongoing example. From an implementation point of view, this approach leads to a high dimensionality and an extremely sparse data cube. Neither an extended multidimensional model in the modern OLAP community nor the stream of statistical and scientific databases ([12], [19]) has addressed the problem of representing dimensional attributes (or features, properties, etc.) appropriately. Proposals on multidimensional models were made to transform cells to dimensions and vice versa ([1]), add complex statistical functions ([10]), or define a sophisticated mapping to the ....

Shoshani, A.: Statistical Databases: Characteristics, Problems, and Some Solutions, in: 8th International Conference on Very Large Data Bases (VLDB82, Mexico City, Mexico, Sept. 8-10), 1982, pp. 208-222


Modeling Multidimensional Databases - Agrawal, Gupta, Sarawagi (1995)   (81 citations)  (Correct)

....estimating the size of multidimensional aggregates [SDNR96] and for indexing pre-computed summaries [SR96, JS96]. The research in multidimensional indexing structures (see, for example, [Gut94] for an overview) is relevant as well. Lastly, research in statistical databases (see, for example, [Sho82] for an overview) also addressed some of the same concerns. This paper presents a framework for research in multidimensional databases. We first review concepts and terminologies in vogue in multidimensional database products in Section 2. We also point out some of the deficiencies in the current ....

....Determining attributes like product, date, supplier are referred to as dimensions while the determined attributes like sales are referred to as measures. Dimensions are called categorical attributes and measures are called numerical or summary attributes in the statistical database literature [Sho82]. There is no formal way of deciding which attributes should be made dimensions and which attributes should be made measures. It is left as a database design decision. Dimensions usually have associated with them hierarchies that specify aggregation levels and hence granularity of viewing data. ....

[Article contains additional citation context not shown here]

A. Shoshani. Statistical databases: Characteristics, problems and some solutions. In Proceedings of the Eighth International Conference on Very Large Databases (VLDB), pages 208--213, Mexico City, Mexico, September 1982.


On the Computation of Multidimensional Aggregates - Agarwal, Agrawal.. (1996)   (167 citations)  (Correct)

....cube in directions complementary to ours: [HRU96, GHRU96] presents algorithms for deciding what group-bys to pre-compute and index; [SR96] and [JS96] discuss methods for indexing pre-computed summaries to allow efficient querying. Aggregate pre-computation is quite common in statistical databases [Sho82]. Research in this area has considered various aspects of the problem starting from developing a model for aggregate computation [CM89], indexing pre-computed aggregates [STL89], and incrementally maintaining them [Mic92]. However, to the best of our knowledge, there is no published work in the ....

A. Shoshani. Statistical databases: Characteristics, problems and some solutions. In Proceedings of the Eighth International Conference on Very Large Databases (VLDB), pages 208--213, Mexico City, Mexico, September 1982.


Storm: A Statistical Object Representation Model - Rafanelli, Shoshani (1990)   (20 citations)  Self-citation (Shoshani)   (Correct)

....aggregate type data [1st SDBM], [2nd SDBM], [3rd SSDBM], [Rafanelli 89]. Since aggregate data is often derived by applying statistical aggregation (e.g. SUM, COUNT) and statistical analysis functions over micro data [Wong 84], the aggregate data bases are also called statistical databases (SDBs) [Shoshani 82], [Shoshani 85]. This paper will consider only aggregate type data, a choice which is justified by the widespread use of aggregate data only i.e. without the corresponding micro data. The reason is that it is too difficult to use the micro data directly (both in terms of storage space and ....

Shoshani A. "Statistical Databases: Characteristics, Problems and Solutions" Proc. of the 7th Intern. Confer. on Very Large Data Bases (VLDB), Mexico city, Mexico, 1982.


A Model For Representing Statistical Objects - Shoshani, Rafanelli (1991)   (3 citations)  Self-citation (Shoshani)   (Correct)

....are identified and some solutions are proposed. 1. Introduction Aggregate data are often derived by applying statistical aggregation (e.g. sum, count) and statistical analysis functions over micro data [Won84]. For this reason these databases are often called statistical databases (SDBs) [Sho82], [SW85], [Raf89]. In this paper only aggregate type data will be considered; this is a choice which is justified by the widespread use of aggregate data only, i.e. without the corresponding microdata. The reason is that often it is too difficult to use the micro data directly (both in terms of ....

Shoshani A. "Statistical Databases: Characteristics, Problems and Solutions" Proc. of the 7th Intern. Confer. on Very Large Data Bases (VLDB), Mexico city, 1982.


Summarizability in OLAP and Statistical Data Bases - Lenz, Shoshani (1997)   (39 citations)  Self-citation (Shoshani)   (Correct)

....two different department for the same year. Given this framework, a statistical object is defined by specifying the elements described in step 2 below. 3.2 Step 2: specifying the elements of the statistical object We use below a terminology commonly used in the Statistical Database literature [5]. We refer to the measured attribute as the summary attribute, as it is this attribute that summarization is applied to. We refer to the attributes that make up the dimensions as category attributes. This is because typically they have a finite set of disjoint discrete values (also called ....

Shoshani, A., Statistical Databases: Characteristics, Problems, and Some Solutions, Proceedings of the International Conference on Very Large Data Bases (VLDB) 1982, pp. 208-222.


OLAP and Statistical Databases: Similarities and Differences - Shoshani (1997)   (34 citations)  Self-citation (Shoshani)   (Correct)

....aggregation Given that the semantics of the data structures of a statistical object are well defined, it is possible to express a minimum number of conditions in a query and infer the rest. This permit the use of very concise query languages. We refer to the capability as automatic aggregation [S82], because of the ability to automatically infer the conditions for applying the aggregation operation. We illustrate this with an example, shown in Figure 13. It shows a graph model of a statistical object average income of professionals by sex by year by profession. ....

....of a dimension. For example, one may start with disease types, then see an interesting phenomenon in the disease type cancer, and drill down to the breakdown of the various cancer diseases. This was also recognized as a useful operation in the SDB area and is referred to as disaggregation [S82]. Statisticians have also used the concept of disaggregation by proxy to estimate lower level breakdown of a classification. For example, if the population is only known at the state level, but the area of each county is known, one can use the area of the counties as a proxy to estimate the ....

[Article contains additional citation context not shown here]

Arie Shoshani, Statistical Databases: Characteristics, Problems, and some Solutions, VLDB 1982, pp. 208-222.


General Purpose Database Summarization - Saint-Paul, Raschia, Mouaddib (2005)   (Correct)

No context found.

A. Shoshani. Statistical databases: Characteristics, problems, and some solutions. In Eighth International Conference on Very Large Data Bases, September 8-10, 1982.


A Privacy-Preserving Index for Range Queries - Hore, Mehrotra, Tsudik (2004)   (5 citations)  (Correct)

No context found.

Shoshani, A. Statistical Databases: Characteristics, Problems, and some Solutions. VLDB 1982, pp.208-222.


On the Design and Implementation of the Multidimensional - Cubestore Storage Manager (1998)   (Correct)

No context found.

Shoshani, A.: Statistical Databases: Characteristics, Problems, and Some Solutions, in: 8th International Conference on Very Large Data Bases (VLDB'82, Mexico City, Mexico, Sept. 8-10, 1982), pp. 208-222


Information Sharing across Private Databases - Agrawal, Evfimievski, Srikant (2003)   (16 citations)  (Correct)

No context found.

A. Shoshani. Statistical databases: Characteristics, problems and some solutions. In Proc. of the Eighth Int'l Conference on Very Large Databases, pages 208--213, Mexico City, Mexico, September 1982.


Spatial Statistics for Cancer Epidemiology - The Cancer.. - Wietek   (Correct)

No context found.

Shoshani, A. (1982): Statistical databases - characteristics, problems and some solutions. In: 8th International Conference on Very Large Data Bases (VLDB), Mexico City. Morgan Kaufmann, London, UK, 208-222.
