Top-Level Domain Crawling for Producing Comprehensive Monolingual Corpora from the Web

by Dirk Goldhahn, Steffen Remus, Uwe Quasthoff, Chris Biemann