@EvpokPadding No, we only annotate text inline and treat the HTML as a different inline annotation in a different file. See Section 3.1 of our LRE article: https://t.co/ZAY1CvokHq / pre-print: https://t.co/se9yT8EWvL Here is the data: https://t.co/w6gzS9
The Spoken #Wikipedia #Corpus collection: Harvesting, alignment and an application to hyperlistening #onlinefirst in Language Resources and Evaluation. Download here: https://t.co/gMtroQQOWv #free reading access here: https://t.co/FRBXLXSqYz
RT @ArneKoehn: Spoken Wikipedia Corpora: Our Journal article is (finally) online, describing the SWC in much more detail than the original…
@therealprotonk @WikiResearch @BayesForDays OA is really important to me; I'm funded by the public and they have a right to access my results. Springer also provides a read-only version: https://t.co/lgeY4K0OYr Correct layout etc but not a PDF.
Spoken Wikipedia Corpora: Our journal article is (finally) online, describing the SWC in much more detail than the original LREC publication: https://t.co/ZAY1CvokHq No access? Accepted manuscript w. very minor differences: https://t.co/7KEiNBh0PQ #NLProc