A System for Unrestricted Topic Retrieval from Radio News Broadcasts
The "topic classification" systems described in the speech literature typically partition a collection of spoken messages into a small number of pre-defined topics. As such, they are only useful if the set of message topics does not vary over time. However, the techniques of textual information retrieval (IR) have long allowed for retrieval by arbitrary subject from a document collection. This paper describes experiments in unrestricted retrieval from a collection of radio news broadcasts. A hybrid message indexing strategy, with conventional word recognition and a fast lattice-based wordspotter, allows for the retrieval of news reports concerning any subject. The results show that retrieval can be carried out extremely quickly and that high accuracy is possible, even with errorful recognition output.
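The hybrid indexing idea can be illustrated with a minimal sketch. It is not the system described in the paper. It assumes that query terms inside the recognizer's vocabulary are looked up in an inverted index built from the word transcripts, and that out-of-vocabulary terms are searched for as phone sequences in per-message lattices. The function names, the representation of a lattice as a list of phone paths, the pronunciation dictionary, and the term-count scoring are all hypothetical simplifications.

```python
from collections import defaultdict


def build_word_index(transcripts):
    """Map each recognized word to the set of message ids whose transcript contains it."""
    index = defaultdict(set)
    for msg_id, text in transcripts.items():
        for word in text.lower().split():
            index[word].add(msg_id)
    return index


def spot_in_lattice(phones, lattice_paths):
    """Toy wordspotter: True if the phone sequence occurs contiguously in any path.

    A real lattice is a scored graph; a plain list of phone paths is assumed here.
    """
    n = len(phones)
    return any(
        path[i:i + n] == phones
        for path in lattice_paths
        for i in range(len(path) - n + 1)
    )


def retrieve(query, word_index, vocabulary, lattices, pronunciations):
    """Rank messages by the number of query terms found (assumed scoring)."""
    scores = defaultdict(int)
    for term in query.lower().split():
        if term in vocabulary:
            # In-vocabulary term: use the index built from word recognition output.
            for msg_id in word_index.get(term, ()):
                scores[msg_id] += 1
        elif term in pronunciations:
            # Out-of-vocabulary term: fall back to searching the phone lattices.
            phones = pronunciations[term]
            for msg_id, paths in lattices.items():
                if spot_in_lattice(phones, paths):
                    scores[msg_id] += 1
    # Highest score first; ties broken by message id for a stable order.
    return sorted(scores, key=lambda m: (-scores[m], m))


# Hypothetical data: two messages, with "kobe" outside the recognizer vocabulary.
transcripts = {
    "m1": "the president visited moscow",
    "m2": "floods hit the coast",
}
vocabulary = {"the", "president", "visited", "moscow", "floods", "hit", "coast"}
lattices = {
    "m1": [["m", "aa", "s", "k", "ow"]],
    "m2": [["f", "l", "ah", "d", "z"], ["k", "ow", "b", "iy"]],
}
pronunciations = {"kobe": ["k", "ow", "b", "iy"]}

word_index = build_word_index(transcripts)
ranked = retrieve("kobe floods", word_index, vocabulary, lattices, pronunciations)
```

In this sketch the word index is built once per collection, so only out-of-vocabulary query terms require a lattice search at query time.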
[Jam96] James, D. A System for Unrestricted Topic Retrieval from Radio News Broadcasts. In Proc. Int. Conf. Acoust., Speech and Sig. Proc. (ICASSP), pages 279-282, Atlanta, GA, USA, May 1996.