Overview of the CLEF-2006 Cross-Language Speech Retrieval Track

Douglas W. Oard; Jianqiang Wang; Gareth J.F. Jones; Ryen W. White; Pavel Pecina; Dagobert Soergel; Xiaoli Huang; Izhak Shafran

Cross-Language Evaluation Forum (CLEF 2006), Alicante, Spain | September 2006

The CLEF-2006 Cross-Language Speech Retrieval (CL-SR) track included two tasks: to identify topically coherent segments of English interviews in a known-boundary condition, and to identify time stamps marking the beginning of topically relevant passages in Czech interviews in an unknown-boundary condition. Five teams participated in the English evaluation, performing both monolingual and cross-language searches of ASR transcripts, automatically generated metadata, and manually generated metadata. Results indicate that the 2006 evaluation topics are more challenging than those used in 2005, but that cross-language searching continued to pose no unusual challenges when compared with collections of character-coded text. Three teams participated in the Czech evaluation, but no team achieved results comparable to those obtained with English interviews. The reasons for this outcome are not yet clear.