
## The BOSARIS Toolkit: Theory, Algorithms and Code for Surviving the New DCF (2011)

Venue: NIST SRE’11 Analysis Workshop

Citations: 9 (0 self)

### Citations

3298 | Numerical Optimization
- Nocedal, Wright
- 2006
Citation Context: ...ugate gradient optimizer which was used in its predecessor, the FoCal Toolkit. The new optimizer uses the trust region Newton conjugate gradient algorithm for large-scale unconstrained minimization [20, 21]. 4 Code This section gives a high-level overview of some of the salient features of the implementation of the algorithms. More detail is available in the user manual which is distributed with the too...
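
To make the optimizer concrete: below is a deliberately simplified, pure-Python sketch of the trust-region idea in one dimension. This is a step-capping toy of our own devising, not the Newton-CG algorithm of the cited works; the function names and the example objective are invented for illustration.

```python
# Simplified 1-D sketch of the trust-region Newton idea: take Newton
# steps, but cap each step at a trust radius that shrinks when a step
# fails to reduce the objective and grows when it succeeds.
def trust_region_newton_1d(f, g, h, x0, radius=1.0, iters=60):
    """Minimize f given gradient g and Hessian h, starting from x0."""
    x = x0
    for _ in range(iters):
        step = -g(x) / h(x)                      # full Newton step
        step = max(-radius, min(radius, step))   # trust-region cap
        if f(x + step) < f(x):
            x += step
            radius *= 2.0        # successful step: expand the region
        else:
            radius *= 0.25       # rejected step: shrink the region
    return x

# Example: minimize the convex objective (x - 3)^2.
x_min = trust_region_newton_1d(lambda x: (x - 3.0) ** 2,
                               lambda x: 2.0 * (x - 3.0),
                               lambda x: 2.0,
                               x0=0.0)
```

The full algorithm instead solves a constrained quadratic subproblem with conjugate gradients at each iteration, which is what makes it practical at large scale.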

1061 | An introduction to ROC analysis
- Fawcett
- 2006
Citation Context: ...cores. This is useful for the earlier stages of algorithm development, when calibration is not of immediate interest. We assume the reader is familiar with the ROC (receiver operating characteristic) [13]. In this section we concentrate on perhaps unfamiliar relationships that exist between the ROC, minDCF and EER. In summary: the ROC spans operating points by plotting error-rates as a function of ...
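
The threshold-sweep relationship between scores, the empirical ROC and minDCF can be sketched in a few lines of pure Python. The scores and helper names below are invented for illustration, not taken from the toolkit.

```python
# Sweep a threshold over the pooled scores to trace the empirical ROC,
# then read minDCF off the resulting operating points. Toy data only.
tar = [2.1, 1.3, 0.4, 3.0, 1.8]              # target-trial scores
non = [-1.0, 0.2, -0.5, 1.1, -2.0, 0.1]      # non-target-trial scores

def roc_points(tar, non):
    """(Pmiss, Pfa) at each distinct threshold, plus the top endpoint."""
    pts = []
    for t in sorted(set(tar + non)) + [float('inf')]:
        pmiss = sum(s < t for s in tar) / len(tar)
        pfa = sum(s >= t for s in non) / len(non)
        pts.append((pmiss, pfa))
    return pts

def min_dcf(tar, non, p_tar, c_miss=1.0, c_fa=1.0):
    return min(p_tar * c_miss * pm + (1 - p_tar) * c_fa * pf
               for pm, pf in roc_points(tar, non))
```

Every distinct threshold between adjacent sorted scores gives one ROC operating point, so the empirical ROC has at most one point per trial plus the endpoints (0, 1) and (1, 0).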

913 | Probability Theory: The Logic of Science
- Jaynes
- 2003
Citation Context: .... It can be interpreted as: • A proper scoring rule, which encourages both good discrimination (i.e. a good DET-curve) as well as good probabilistic calibration (in the sense of [9]). See for example [10], Chapter 13, the section entitled ‘The honest weatherman’, for an insightful explanation. • Generalized cross-entropy [11] between the evaluator’s perfect empirical posterior given by the labels and ...

362 | The DET Curve in Assessment of Detection Task Performance
- Martin, Doddington, et al.
- 1997
Citation Context: ...al to this analysis and also provides the key to efficient minDCF and EER calculation. In our discussion below, we use the term ROC, but (unless otherwise noted) everything applies also to DET-curves [14]. For ROC, we assume the speaker-recognition convention where x = Pfa is on the horizontal axis and y = Pmiss on the vertical axis. The DET-curve differs from the ROC by axis warping: x = probit(Pfa)...
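
The axis warping mentioned here passes each ROC coordinate through probit, the inverse of the standard normal CDF. A minimal stdlib-only sketch (our own construction; the standard library has no inverse normal CDF, so this inverts an `math.erf`-based CDF by bisection, which is adequate for plotting but not a production implementation):

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def probit(p, lo=-10.0, hi=10.0):
    """Invert norm_cdf by bisection on [lo, hi]."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Warp one ROC operating point (Pfa, Pmiss) into DET coordinates:
det_x, det_y = probit(0.01), probit(0.10)
```

Under this warping, Gaussian-distributed scores produce straight-line DET-curves, which is why the DET representation is preferred for speaker-recognition error rates.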

341 | Robust classification for imprecise environments
- Provost, Fawcett
- 2001
Citation Context: ...$\sum_{i=1}^{n} \alpha_i = 1$. We already know that minDCF can be expressed either as a continuous minimization over the threshold ($\gamma$), or as a discrete minimization over the ROC points. But it can also be expressed [15, 1] as a continuous minimization over the convex hull, or as a discrete minimization over the set of vertices, $V_{ch}$, of the convex hull: $\mathrm{minDCF}(\pi, C_{miss}, C_{fa}) = \min_\gamma \, \pi C_{miss} P_{miss}(\gamma) + (1-\pi) C_{fa} P_{fa}(\gamma) = \dots$
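
The discrete form can be sketched directly: compute the lower convex boundary of the ROC points (the ROCCH) with a monotone-chain scan, then take the minimum of the DCF over its vertices. The points and function names below are invented for illustration.

```python
# Sketch: minDCF as a discrete minimum over the vertices of the ROC
# convex hull. Points are (Pfa, Pmiss) pairs; the ROCCH is their lower
# convex boundary.
def rocch_vertices(points):
    pts = sorted(set(points))        # sort by Pfa, then Pmiss
    hull = []
    for px, py in pts:
        while len(hull) >= 2:
            (ox, oy), (ax, ay) = hull[-2], hull[-1]
            # pop the middle point unless the turn is strictly convex
            if (ax - ox) * (py - oy) - (ay - oy) * (px - ox) <= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull

def min_dcf_over_hull(points, p_tar, c_miss=1.0, c_fa=1.0):
    return min(p_tar * c_miss * pm + (1 - p_tar) * c_fa * pf
               for pf, pm in rocch_vertices(points))

# Toy ROC: the point (0.2, 0.6) lies above the hull and is discarded.
roc = [(0.0, 1.0), (0.1, 0.5), (0.2, 0.6), (0.5, 0.1), (1.0, 0.0)]
```

Because every operating point dominated by the hull is discarded, the minimization only has to visit the (typically very small) vertex set rather than every empirical ROC point.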

189 | Interval estimation for a binomial proportion
- Brown, Cai, et al.
- 2001
Citation Context: ...l) methods to theoretically quantify the accuracy of such estimates; see for example [3] and references therein. (See also http://speech.fit.vutbr.cz/workshops/bosaris2010; toolkit available at: http://sites.google.com/site/bosaristoolkit/.) The results of any such analysis will depend on various modelling assumptions. For the speaker recognition problem, one such analysis, Doddington’s Rule of 30 [4], is rendered...
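
One of the intervals compared by Brown, Cai, et al. is the Wilson score interval; a stdlib-only sketch for an error-rate estimated from k errors in n independent Bernoulli trials (our own illustration, not toolkit code):

```python
import math

def wilson_interval(k, n, z=1.96):
    """Wilson score interval (z=1.96 gives ~95%) for a proportion k/n."""
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

lo, hi = wilson_interval(30, 1000)   # e.g. 30 errors in 1000 trials
```

Unlike the naive Wald interval, the Wilson interval behaves sensibly when k is small, which matters for the low error-counts typical of well-tuned detectors.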

118 | Transforming classifier scores into accurate multiclass probability estimates
- Zadrozny, Elkan
- 2002
Citation Context: ...AV: Non-parametric calibration The convention that the larger the score, the more it favours the target hypothesis, suggests that the calibration mapping, ℓ, should be monotonically rising (isotonic) [17]. Since we have a finite number of training scores, each of which must be mapped to a log-likelihood-ratio, this can be done in a non-parametric way. We can independently choose the value for each poi...
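
The isotonic fit described here is computed with the pool-adjacent-violators (PAV) algorithm. A minimal sketch with invented toy values (not toolkit code):

```python
# Pool-adjacent-violators: given values ordered by score, return the
# closest non-decreasing (isotonic) fit by repeatedly pooling adjacent
# violating blocks into their weighted mean.
def pav(values, weights=None):
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []                      # each block: [mean, weight, count]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        while len(blocks) >= 2 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    return [m for m, _, c in blocks for _ in range(c)]

# e.g. 0/1 trial labels sorted by score; the fit is a rising step function
fit = pav([0.0, 1.0, 1.0, 0.0, 1.0])
```

The fitted step values are posterior-like probabilities, which can then be converted to log-likelihood-ratios; the whole procedure is invariant to any monotonic warping of the input scores.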

97 | Trust region Newton method for large-scale logistic regression
- Lin, Weng, et al.
- 2008
Citation Context: ...ugate gradient optimizer which was used in its predecessor, the FoCal Toolkit. The new optimizer uses the trust region Newton conjugate gradient algorithm for large-scale unconstrained minimization [20, 21]. 4 Code This section gives a high-level overview of some of the salient features of the implementation of the algorithms. More detail is available in the user manual which is distributed with the too...

79 | Application-Independent Evaluation of Speaker Detection
- Brümmer, du Preez
- 2006
Citation Context: ...ipe requires the operating point to be fixed and known to the evaluee. Below we show how to relax this requirement. 2.4 Bayes Risk: criterion for goodness of log-likelihood-ratios A small modification [6] to the DCF evaluation recipe makes it applicable to calibrated log-likelihood-ratios, rather than hard decisions: The evaluee submits log-likelihood-ratios (rather than decisions) and the evaluator m...
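
The modified recipe can be sketched as follows: the evaluator thresholds the submitted log-likelihood-ratios at the Bayes threshold, −logit of the effective prior, and scores the induced decisions with the DCF. The LLR values and function names below are toy illustrations, not toolkit code.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def bayes_dcf(tar_llrs, non_llrs, p_tar, c_miss=1.0, c_fa=1.0):
    """DCF of the Bayes decisions induced by submitted LLRs."""
    # effective prior folds the costs into a single parameter
    p_eff = p_tar * c_miss / (p_tar * c_miss + (1 - p_tar) * c_fa)
    t = -logit(p_eff)                       # Bayes decision threshold
    p_miss = sum(l < t for l in tar_llrs) / len(tar_llrs)
    p_fa = sum(l >= t for l in non_llrs) / len(non_llrs)
    return p_tar * c_miss * p_miss + (1 - p_tar) * c_fa * p_fa

risk = bayes_dcf([2.0, -1.0, 3.0], [-2.0, 1.0, -0.5, -3.0], p_tar=0.5)
```

If the LLRs are well calibrated, this empirical Bayes risk approaches minDCF at every operating point, which is exactly what makes it a useful calibration-sensitive criterion.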

63 | Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation
- Brümmer, Burget, et al.
- 2006
Citation Context: ...rion for a supervised calibration database, which must be provided by the user. Since the objective function is calibration sensitive, optimizing it causes the fused output to be well calibrated. See [19], or [1, Chapter 8] for more details. 3 Algorithms This section describes the key algorithms that help the toolkit to efficiently process very large sets of scores. 3.1 Efficient DCF and minDCF This s...

55 | The comparison and evaluation of forecasters
- DeGroot, Fienberg
- 1982
Citation Context: ...in detail in [7, 8, 1]. It can be interpreted as: • A proper scoring rule, which encourages both good discrimination (i.e. a good DET-curve) as well as good probabilistic calibration (in the sense of [9]). See for example [10], Chapter 13, the section entitled ‘The honest weatherman’, for an insightful explanation. • Generalized cross-entropy [11] between the evaluator’s perfect empirical posterior g...

16 | Measuring, refining and calibrating speaker and language information extracted from speech
- Brümmer
- 2010
Citation Context: ...ide, to explain theory and algorithms and is complementary to the user manual. (arXiv:1304.2865v1 [stat.AP], 10 Apr 2013) The theory behind the toolkit is based on the Ph.D. dissertation [1], which can be consulted for further details. The core implementation (code) was written by the authors of this document, as part of the ABC: AGNITIO, BUT, CRIM submission for the 2010 NIST Speaker Re...

12 | Speaker recognition evaluation methodology: an overview and perspective
- Doddington
- 1998
Citation Context: ... for example [3] and references therein. The results of any such analysis will depend on various modelling assumptions. For the speaker recognition problem, one such analysis, Doddington’s Rule of 30 [4], is rendered tractable via the assumption of independent Bernoulli trials. This rule suggests one needs at least 30 errors to get a probably approximately correct error-rate estimate. In practice, w...
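
The arithmetic behind the rule is a two-line back-of-envelope calculation (our illustration, under the stated Bernoulli assumption): for a small error probability, k observed errors give a relative standard error of roughly 1/√k, so 30 errors pin the true rate to within about ±30% relative at roughly 90% confidence.

```python
import math

# Rule-of-30 arithmetic under independent Bernoulli trials with small
# error probability: relative standard error of the rate estimate is
# about 1/sqrt(k) for k observed errors.
k = 30
rel_stderr = 1.0 / math.sqrt(k)   # ~0.18 relative standard error
rel_90 = 1.645 * rel_stderr       # z for a two-sided 90% interval
```

With k = 30 this gives a ±30% relative bound, which is where the rule's threshold of 30 errors comes from.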

5 | Forensic Evaluation of the Evidence Using Automatic Speaker Recognition Systems
- Ramos-Castro
- 2007
Citation Context: ... $(1 + e^{-\ell_t}) + \frac{0.5}{|N|} \sum_{t \in N} \log_2(1 + e^{\ell_t})$ (18), where $k > 0$ is an unimportant scale factor and $\mathrm{logit}^{-1}(x) = (1 + e^{-x})^{-1}$ is the inverse of the logit function. This criterion is further discussed in [1, 7, 8, 12]. It can be interpreted as a strictly proper scoring rule, empirical cross-entropy, negative log-likelihood and as optimization objective for logistic regression. 2.5.4 Normalized Bayes-error-rate plo...
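
The criterion in the quoted fragment (without the scale factor k) averages log2(1 + e^(−ℓ)) over target trials and log2(1 + e^(+ℓ)) over non-target trials; this is the Cllr measure. A minimal sketch with invented LLRs:

```python
import math

def cllr(tar_llrs, non_llrs):
    """Average target and non-target logarithmic losses, in bits."""
    c_tar = sum(math.log2(1.0 + math.exp(-l)) for l in tar_llrs) / len(tar_llrs)
    c_non = sum(math.log2(1.0 + math.exp(l)) for l in non_llrs) / len(non_llrs)
    return 0.5 * (c_tar + c_non)
```

A system that outputs ℓ = 0 everywhere (no information) gets Cllr = 1 bit, while perfectly confident, correct LLRs drive Cllr toward 0; miscalibrated but discriminative scores land in between, which is why the measure separates discrimination from calibration.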

4 | EVALITA 2009 speaker identity verification “application” track organizer’s report
- Aversano, Brümmer, et al.
- 2009
Citation Context: ...pute the ROCCH curve, as well as the associated DET-curve obtained by applying the non-linear (probit) mapping to the axes. Figure 4 shows two examples. For further examples, see [1, Chapter 7], or [16], or try to plot some of your own, using the toolkit. The ROCCH vertex set, Vch, is typically much, much smaller than the empirical ROC. Since the convex hull can be computed efficiently (see the PAV ...

3 | An introduction to application-independent evaluation of speaker recognition systems
- van Leeuwen, Brümmer
- 2007
Citation Context: ... matter. (After taking care of a few more details below, we will demonstrate this experimentally.) The empirical Bayes risk as evaluation criterion for log-likelihood-ratios is discussed in detail in [7, 8, 1]. It can be interpreted as: • A proper scoring rule, which encourages both good discrimination (i.e. a good DET-curve) as well as good probabilistic calibration (in the sense of [9]). See for example ...

2 | PAV and the ROC convex hull
- Fawcett, Niculescu-Mizil
- 2007
Citation Context: ... set is optimized with PAV, and then evaluated on the same data set with E (DCF), then DCF = minDCF. • It also corresponds exactly to using the slope of the ROCCH curve as calibrated likelihood-ratio [18]. • The type of the score distribution is unimportant. In fact, the procedure is invariant to any monotonic warping of the scores. In contrast, the parametric logistic regression calibration solution ...

1 | Coherent measures of discrepancy, uncertainty and dependence, with applications to Bayesian predictive experimental design
- Dawid
- 1998
Citation Context: ...ell as good probabilistic calibration (in the sense of [9]). See for example [10], Chapter 13, the section entitled ‘The honest weatherman’, for an insightful explanation. • Generalized cross-entropy [11] between the evaluator’s perfect empirical posterior given by the labels and the posterior P(target | s, π) of the evaluee. This information-theoretical analysis provides useful inequalities to unders...