Deterministic algorithms for rank aggregation and other ranking and clustering problems
- In Proceedings of the Fifth International Workshop on Approximation and Online Algorithms, 2007
Cited by 28 (2 self)
Abstract. We consider ranking and clustering problems related to the aggregation of inconsistent information. Ailon, Charikar, and Newman [1] proposed randomized constant factor approximation algorithms for these problems. Together with Hegde and Jain, we recently proposed deterministic versions of some of these randomized algorithms [2]. With one exception, these algorithms required the solution of a linear programming relaxation. In this paper, we introduce a purely combinatorial deterministic pivoting algorithm for weighted ranking problems with weights that satisfy the triangle inequality; our analysis is quite simple. We then show how to use this algorithm to get the first deterministic combinatorial approximation algorithm for the partial rank aggregation problem with performance guarantee better than 2. In addition, we extend our approach to the linear programming based algorithms in Ailon et al. [1] and Ailon [3]. Finally, we show that constrained rank aggregation is not harder than unconstrained rank aggregation.
Correlation Clustering with Noisy Input
Cited by 23 (1 self)
Correlation clustering is a type of clustering that uses a basic form of input data: For every pair of data items, the input specifies whether they are similar (belonging to the same cluster) or dissimilar (belonging to different clusters). This information may be inconsistent, and the goal is to find a clustering (partition of the vertices) that disagrees with as few pieces of information as possible. Correlation clustering is APX-hard for worst-case inputs. We study the following semi-random noisy model to generate the input: start from an arbitrary partition of the vertices into clusters. Then, for each pair of vertices, the similarity information is corrupted (noisy) independently with probability p. Finally, an adversary generates the input by choosing similarity/dissimilarity information arbitrarily for each corrupted pair of vertices. In this model, our algorithm produces a clustering with cost at most $1 + O(n^{-1/6})$ times the cost of the optimal clustering, as long as $p \le 1/2 - n^{-1/3}$. Moreover, if all clusters have size at least $c_1 \sqrt{n}$ then we can exactly reconstruct the planted clustering. If the noise p is small, that is, $p \le n^{-\delta}/60$, then we can exactly reconstruct all clusters of the planted clustering that have size at least $3150/\delta$, and provide a certificate (witness) proving that those clusters are in any optimal clustering. Among other techniques, we use the natural semidefinite programming relaxation followed by an interesting rounding phase. The analysis uses SDP duality and spectral properties of random matrices.
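The semi-random noisy model in this abstract can be illustrated with a small generator. This is a minimal sketch, not the authors' code: the names `noisy_input` and `disagreements` are hypothetical, and the default adversary (flip every corrupted label) is an assumption made only so the example runs, since the model lets the adversary choose corrupted labels arbitrarily.

```python
import itertools
import random


def noisy_input(clusters, p, adversary=None, seed=0):
    """Generate pairwise labels under the semi-random noisy model.

    clusters[v] is the planted cluster id of vertex v.
    Returns {(u, v): True if the pair is labelled similar} for u < v.
    """
    rng = random.Random(seed)
    if adversary is None:
        # Assumption for illustration: the adversary flips the true label.
        adversary = lambda u, v, truth: not truth
    labels = {}
    for u, v in itertools.combinations(range(len(clusters)), 2):
        truth = clusters[u] == clusters[v]
        # Each pair is corrupted independently with probability p.
        corrupted = rng.random() < p
        labels[(u, v)] = adversary(u, v, truth) if corrupted else truth
    return labels


def disagreements(clusters, labels):
    """Cost of a clustering: number of pairs it labels differently from the input."""
    return sum(
        (clusters[u] == clusters[v]) != similar
        for (u, v), similar in labels.items()
    )
```

With `p = 0` the planted clustering has cost 0; with `p = 1` and the flipping adversary it disagrees with every pair.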
Fast FAST
Cited by 16 (7 self)
We present a randomized subexponential time, polynomial space parameterized algorithm for the k-Weighted Feedback Arc Set in Tournaments (k-FAST) problem. We also show that our algorithm can be derandomized by slightly increasing the running time. To derandomize our algorithm we construct a new kind of universal hash functions, that we coin universal coloring families. For integers m, k and r, a family F of functions from [m] to [r] is called a universal (m, k, r)-coloring family if for any graph G on the set of vertices [m] with at most k edges, there exists an f ∈ F which is a proper vertex coloring of G. Our algorithm is the first non-trivial subexponential time parameterized algorithm outside the framework of bidimensionality.
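The definition of a universal (m, k, r)-coloring family can be checked by brute force on very small parameters. The sketch below is illustrative only: the function names are hypothetical, vertices are assumed to be numbered 0 to m-1, each function f is represented as a tuple with f[v] the colour of vertex v, and the exhaustive enumeration of graphs is exponential, so it says nothing about how the paper constructs such families.

```python
from itertools import combinations


def is_proper(f, edges):
    """f is a proper colouring if every edge gets two different colours."""
    return all(f[u] != f[v] for u, v in edges)


def is_universal_coloring_family(family, m, k):
    """Check the definition directly: every graph on m vertices with at
    most k edges must be properly coloured by some f in the family."""
    pairs = list(combinations(range(m), 2))
    for size in range(k + 1):
        for edges in combinations(pairs, size):
            if not any(is_proper(f, edges) for f in family):
                return False
    return True
```

For example, with m = 3, k = 1 and two colours, the family `[(0, 1, 0), (0, 0, 1)]` passes the check, while `[(0, 1, 0)]` alone fails on the single edge between vertices 0 and 2.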
Rank aggregation: Together we’re strong
- In Proc. of 11th ALENEX, 1998
Cited by 16 (0 self)
We consider the problem of finding a ranking of a set of elements that is “closest to” a given set of input rankings of the elements; more precisely, we want to find a permutation that minimizes the Kendall-tau distance to the input rankings, where the Kendall-tau distance is defined as the sum over all input rankings of the number of pairs of elements that are in a different order in the input ranking than in the output ranking. If the input rankings are permutations, this problem is known as the Kemeny rank aggregation problem. This problem arises for example in building meta-search engines for Web search, aggregating viewers’ rankings of movies, or giving recommendations to a user based on several different criteria, where we can think of having one ranking of the
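The Kendall-tau objective defined in this abstract can be written out for the case where every input ranking is a full permutation of the same elements (the Kemeny case). This is a minimal sketch under that assumption; the names `kendall_tau` and `aggregation_cost` are illustrative and do not come from the paper.

```python
from itertools import combinations


def kendall_tau(ranking_a, ranking_b):
    """Number of element pairs ordered differently by the two rankings."""
    pos_a = {x: i for i, x in enumerate(ranking_a)}
    pos_b = {x: i for i, x in enumerate(ranking_b)}
    return sum(
        (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y])
        for x, y in combinations(ranking_a, 2)
    )


def aggregation_cost(candidate, input_rankings):
    """Sum of Kendall-tau distances from the candidate to every input ranking."""
    return sum(kendall_tau(candidate, r) for r in input_rankings)
```

A ranking and its reverse over three elements differ on all three pairs, so their distance is 3.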
Average Parameterization and Partial Kernelization for Computing Medians
- PROC. 9TH LATIN, 2010
Cited by 13 (9 self)
We propose an effective polynomial-time preprocessing strategy for intractable median problems. Developing a new methodological framework, we show that if the input instances of generally intractable problems exhibit a sufficiently high degree of similarity between each other on average, then there are efficient exact solving algorithms. In other words, we show that the median problems Swap Median Permutation, Consensus Clustering, Kemeny Score, and Kemeny Tie Score all are fixed-parameter tractable with respect to the parameter “average distance between input objects”. To this end, we develop the new concept of “partial kernelization” and identify interesting polynomial-time solvable special cases for the considered problems.
How to rank with few errors -- A PTAS for Weighted Feedback Arc Set on Tournaments
- ELECTRONIC COLLOQUIUM ON COMPUTATIONAL COMPLEXITY, REPORT NO. 144 (2006)
Cited by 7 (3 self)
Suppose you ran a chess tournament, everybody played everybody, and you wanted to use the results to rank everybody. Unless you were really lucky, the results would not be acyclic, so you could not just sort the players by who beat whom. A natural objective is to find a ranking that minimizes the number of upsets, where an upset is a pair of players where the player ranked lower on the ranking beats the player ranked higher. This is the NP-hard minimum feedback arc set (FAS) problem on tournaments. Our main result is a polynomial time approximation scheme (PTAS) for this problem. A simple weighted generalization gives a PTAS for Kemeny-Young rank aggregation.
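The upset objective from this abstract is easy to state in code. The sketch below assumes the results are given as (winner, loser) pairs and that a ranking lists players from best to worst; both function names are hypothetical. The exhaustive search is included only to make the objective concrete on tiny inputs and is not the approximation scheme the abstract refers to.

```python
from itertools import permutations


def count_upsets(ranking, beats):
    """Count games in which the lower-ranked player beat the higher-ranked one.

    ranking lists players from best to worst; beats holds (winner, loser) pairs.
    """
    pos = {player: i for i, player in enumerate(ranking)}
    return sum(1 for winner, loser in beats if pos[winner] > pos[loser])


def min_upsets_bruteforce(players, beats):
    """Try every ranking; feasible only for a handful of players."""
    return min(count_upsets(order, beats) for order in permutations(players))
```

For three players whose results form a cycle (a beats b, b beats c, c beats a), every ranking has at least one upset.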
Kernels for Feedback Arc Set In Tournaments
- 2009
Cited by 6 (1 self)
A tournament T = (V, A) is a directed graph in which there is exactly one arc between every pair of distinct vertices. Given a digraph on n vertices and an integer parameter k, the Feedback Arc Set problem asks whether the given digraph has a set of k arcs whose removal results in an acyclic digraph. The Feedback Arc Set problem restricted to tournaments is known as the k-Feedback Arc Set in Tournaments (k-FAST) problem. In this paper we obtain a linear vertex kernel for k-FAST. That is, we give a polynomial time algorithm which given an input instance T to k-FAST obtains an equivalent instance T′ on O(k) vertices. In fact, given any fixed ɛ > 0, the kernelized instance has at most (2 + ɛ)k vertices. Our result improves the previously known bound of O(k²) on the kernel size for k-FAST. Our kernelization algorithm solves the problem on a subclass of tournaments in polynomial time and uses a known polynomial time approximation scheme for k-FAST.
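The two definitions that open this abstract, a tournament and a feedback arc set, can be checked mechanically. This is a sketch for illustration only, with hypothetical function names and arcs assumed to be (tail, head) pairs; it verifies a proposed arc set and has nothing to do with the kernelization algorithm itself.

```python
from itertools import combinations


def is_tournament(vertices, arcs):
    """Exactly one arc between every pair of distinct vertices, and nothing else."""
    arcs = set(arcs)
    n = len(vertices)
    if len(arcs) != n * (n - 1) // 2:
        return False
    return all(((u, v) in arcs) != ((v, u) in arcs)
               for u, v in combinations(vertices, 2))


def is_acyclic(vertices, arcs):
    """Kahn's algorithm: repeatedly remove vertices with no incoming arcs."""
    indegree = {v: 0 for v in vertices}
    successors = {v: [] for v in vertices}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    ready = [v for v in vertices if indegree[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(vertices)


def is_feedback_arc_set(vertices, arcs, candidate):
    """True if deleting the candidate arcs leaves an acyclic digraph."""
    candidate = set(candidate)
    return is_acyclic(vertices, [a for a in arcs if a not in candidate])
```

On the directed triangle 0 → 1 → 2 → 0, which is a tournament, removing the single arc (2, 0) already gives an acyclic digraph.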