Zillah Watson and Tristan Fearne from BBC Research & Development told us about this project.
What is the aim of this project?
We wanted to find a way to make the BBC's radio and television archives more searchable.
How big is the archive?
Our starting point was a massive audio archive of World Service broadcasts in English, dating back six decades and covering over 36,000 radio programmes. That's more than 3 years' continuous listening!
How did you do it?
We used speech-to-text software to create transcripts of the radio programmes. Algorithms then extracted topic tags from the transcripts, and those tags let you search the programmes.
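To make the idea concrete, here is a minimal sketch of the tagging-and-search step, assuming the transcripts already exist as plain text. It is not the project's actual algorithm: simple word frequency stands in for the topic extraction, and the names extract_tags, build_index and search, the stop-word list and the sample transcripts are all made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a far larger one.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "on", "for", "with", "was"}


def extract_tags(transcript, max_tags=3):
    """Return the most frequent non-stop-words as stand-in topic tags."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(max_tags)]


def build_index(transcripts):
    """Map each tag to the set of programme ids whose transcript produced it."""
    index = defaultdict(set)
    for programme_id, text in transcripts.items():
        for tag in extract_tags(text):
            index[tag].add(programme_id)
    return index


def search(index, tag):
    """Return the ids of programmes carrying the given tag, sorted."""
    return sorted(index.get(tag.lower(), set()))


# Made-up transcripts standing in for speech-to-text output.
transcripts = {
    "prog-001": "Cricket scores from the cricket ground: cricket fans cheered.",
    "prog-002": "The election results: election officials counted election ballots.",
}
index = build_index(transcripts)
print(search(index, "cricket"))
```

Because the tags come from automatic transcripts, some of them will be wrong, which is why the index is best treated as a first draft rather than a finished catalogue.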
Great idea, so what came next?
The next stage was to add speaker recognition - each programme has been segmented to show where different people are talking. Users can help tell us who each voice belongs to. And once a speaker's voice has been identified, you can find other programmes in the archive featuring the same person. So you can search for all the programmes which contain, say, Nelson Mandela's voice.
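The speaker search described above can be sketched roughly as follows, assuming each segment has already been assigned to a group of similar-sounding voices. The Segment fields, the voice ids and the programmes_featuring helper are hypothetical and only illustrate how one user-supplied name can unlock a search across the whole archive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    programme_id: str
    start_s: float   # segment start, in seconds
    end_s: float     # segment end, in seconds
    voice_id: str    # hypothetical id for a group of similar-sounding segments


def programmes_featuring(segments, voice_names, name):
    """Return sorted ids of programmes with a segment spoken by the named person."""
    matching_voices = {v for v, n in voice_names.items() if n == name}
    return sorted({s.programme_id for s in segments if s.voice_id in matching_voices})


# Made-up segments; voice ids would come from the speaker recognition step.
segments = [
    Segment("prog-001", 0.0, 42.5, "voice-17"),
    Segment("prog-001", 42.5, 90.0, "voice-03"),
    Segment("prog-002", 10.0, 65.0, "voice-17"),
    Segment("prog-003", 0.0, 30.0, "voice-08"),
]

# A user names one voice once; every segment sharing that voice id inherits the name.
voice_names = {"voice-17": "Nelson Mandela"}

print(programmes_featuring(segments, voice_names, "Nelson Mandela"))
```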
Sounds smart, so how do you want to make it better?
The tags weren't perfect, and so we decided to see if users of the archive could help improve them by voting tags up or down. Users can also help select better photos, which are automatically pulled in by the tags. So far in our experiment over 67,000 tags have been improved.
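One simple way to picture the voting is a net score per tag, with poorly rated tags hidden from search. This is only a sketch under that assumption; the TagVotes class, its method names and the threshold of zero are invented for illustration and are not taken from the project.

```python
from collections import defaultdict


class TagVotes:
    """Keeps a net up/down score for each (programme, tag) pair."""

    def __init__(self):
        self._scores = defaultdict(int)

    def vote(self, programme_id, tag, up):
        """Record one vote: +1 when up is True, otherwise -1."""
        self._scores[(programme_id, tag)] += 1 if up else -1

    def score(self, programme_id, tag):
        """Return the net score, or 0 for a tag nobody has voted on."""
        return self._scores.get((programme_id, tag), 0)

    def visible_tags(self, programme_id, candidate_tags, threshold=0):
        """Drop tags scoring below the threshold and list the best-rated first."""
        kept = [t for t in candidate_tags if self.score(programme_id, t) >= threshold]
        return sorted(kept, key=lambda t: -self.score(programme_id, t))


votes = TagVotes()
votes.vote("prog-001", "cricket", up=True)
votes.vote("prog-001", "cricket", up=True)
votes.vote("prog-001", "from", up=False)

# "scores" has no votes yet, so it keeps the neutral score of 0 and stays visible.
print(votes.visible_tags("prog-001", ["cricket", "scores", "from"]))
```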
I like the sound of that.
That's what we're hoping. We've written in greater detail about how we and the site.