HTML 5 and timed media
HTML 5 work
HTML 5 is the next version of HTML, the markup language used on the web. Not all the details have been agreed, but many of the proposed changes and new features have already been implemented in existing browsers. As part of our work on the , we built a prototype that works in current versions of Firefox, Safari and Chrome: a sample of RAD's R&D TV with subtitles and chapter navigation. This will not work in current versions of Internet Explorer, nor in earlier versions of Firefox etc.
What we built
This prototype plays video and audio without plugins, and allows jumping to chapters and 'scrubbing' within the content. It uses a simple JavaScript framework to enable web page elements to be changed via individual HTML or CSS 'events', and for loosely-coupled control of page components such as carousels. In particular, our JavaScript enables synchronised changes to HTML and CSS relative to a 'time parent', such as an audio or video clip, or even clock time. In addition, our solution needs to work with live events, whereby pages would be propagated in real time.
Flash and other Rich Internet Applications (RIAs) already provide something like this via timeline scripting, but RIAs are 'black boxes' that use compilers and obfuscators to hide code and data. That is great if you want to protect intellectual property, but we needed a mechanism whereby the data and the code acting on it were open and accessible. HTML 5 and the jQuery JavaScript framework gave us the tools we needed without requiring extra plugins or proprietary software.
Below we give more detail of the coding work done so far.
Synchronised media
On a web page it is sometimes desirable to make something happen relative to the current time of a video or audio track.
For example, subtitles change as a video plays.
adds something more dynamic: speech bubbles, notes and linked areas that can be set to appear in specific locations on a video, at specific times. This enables quite complex interactivity and navigation - in YouTube's example, a video about World War I leads to other videos about individual subjects such as tanks and aircraft, with extra notes that pop up as appropriate.
There are several open technologies available, or in development, for adding (subtitles), but not more complex page changes and interaction. and parts of the project go further in developing ways to integrate timed markup for and indexing with streaming media. Annodex uses the CMML markup language, which can be either incorporated with, or referenced from, a stream.
The SMIL presentation markup language provides rich features for timed interactive media, and has been in use for more than 10 years, but it still has some limitations:
- SMIL can be played in several media players, but the future of SMIL browser support is unclear
- SMIL uses a presentation paradigm, and each presentation needs to be complete in itself, whereas our 'engine' needs to be event-driven, able to cope with individual live page changes, with viewers either joining a presentation part way through 'broadcast' or viewing the entire presentation after it has been published
- existing SMIL implementations are oriented to elapsed (clock) time and user input events, rather than synchronising events with a 'time parent' such as the current time of a video.
Having considered the alternatives, we decided that our 'engine' would be most flexible and accessible if we used JavaScript alongside functionality developed as part of the .
The JavaScript we wrote does a simple job:
- build a list of page-change event objects, parsed from incoming JSON
- activate or deactivate each event, given its start and end time, relative to the current time of the event's time parent (which in our demonstrations is a video).
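The two steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the prototype's actual code: the JSON field names (`id`, `start`, `end`) and the function names `parseEvents` and `updateEvents` are invented for the example, and times are assumed to be seconds relative to the time parent.

```javascript
// Build a list of page-change event objects from incoming JSON.
// Assumed (hypothetical) format: [{ "id": "subtitle-1", "start": 5, "end": 12 }, ...]
function parseEvents(json) {
  return JSON.parse(json).map(function (raw) {
    return {
      id: raw.id,
      start: Number(raw.start),
      end: Number(raw.end),
      active: false
    };
  });
}

// Activate or deactivate each event for the given position of the time parent.
// Returns the ids whose state changed, so the caller can update HTML or CSS.
function updateEvents(events, currentTime) {
  var changes = { activated: [], deactivated: [] };
  events.forEach(function (ev) {
    var shouldBeActive = currentTime >= ev.start && currentTime < ev.end;
    if (shouldBeActive !== ev.active) {
      ev.active = shouldBeActive;
      (shouldBeActive ? changes.activated : changes.deactivated).push(ev.id);
    }
  });
  return changes;
}
```

Because each event's state is derived from the current time alone, rather than from the order in which events fired, the same function gives the right result after a jump to a chapter or when a viewer joins part way through.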
The HTML 5 audio and video elements remove the need for player plugins, work like any other HTML element in terms of styling and positioning, and standardise the programming interface for playback control. Less well known is that these elements emit a timeupdate event (at a frequency adjusted to fit available processing and memory) which removes the need to poll a player for the current time position. This makes media scripting far more efficient, since there is no need to run a loop or use setTimeout. In tests run on several machines we found that timeupdate events are emitted regularly and frequently (particularly in Firefox), whereas polling a media player for current video time is unreliable.
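As a rough sketch of the event-driven approach described above, the function below listens for `timeupdate` on a media element and hands the current position to a callback. The function name `followTimeParent` and the `onTime` callback are assumptions made for illustration; `addEventListener`, `removeEventListener`, `timeupdate` and `currentTime` are the standard media element interface.

```javascript
// `media` is assumed to be an HTML 5 <video> or <audio> element (the time parent).
// `onTime` is a hypothetical callback that receives the position in seconds.
function followTimeParent(media, onTime) {
  function handler() {
    // Read the position only when the browser reports that it has changed.
    onTime(media.currentTime);
  }
  media.addEventListener('timeupdate', handler, false);

  // Return a function that stops following the time parent.
  return function stop() {
    media.removeEventListener('timeupdate', handler, false);
  };
}

// Possible use in a browser (not run here):
//   var video = document.getElementsByTagName('video')[0];
//   var stop = followTimeParent(video, function (t) { /* update the page */ });
```

No loop or `setTimeout` is involved: the browser decides how often to call the handler.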
The HTML 5 method will provide native support for callbacks (or events) at the start and end of a 'cue range'. However the specification is and does not appear close to implementation.
HTML 5 video in the field
HTML 5 media elements are now supported by current versions of Firefox, Safari and Chrome.
However, the implementation of the video element has been dominated by the need to standardise codec support, in the same way that browsers currently support the JPEG, PNG and GIF image formats.
Google demonstrated their commitment to both the MP4 (H.264/AAC) and Ogg (Theora/Vorbis) formats with the HTML 5 media element at the . Firefox 3.5 and Safari both support the video element, though for different codecs. Dailymotion has created a with several hundred thousand videos encoded using Ogg Theora.
The biggest and least predictable change may come from technologies such as Comet or HTML 5 . These enable data to be 'pushed' to browsers from servers rather than vice versa. Push makes sense, in that it enables updates without polling, but it challenges the HTTP request/response model used on the web and raises a number of security and editorial questions. In terms of our application, events could be pushed to the client browser as they became available. For example, a web page showing a broadcast from a live event could include live updated information about the festival or the band and song currently playing.
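To illustrate how pushed events might be handled on the client, here is a minimal sketch that is independent of the push transport. It assumes each pushed message is a JSON string describing one event; the field names (`id`, `start`, `end`) and the function name `addPushedEvent` are hypothetical.

```javascript
// Add one pushed event to the client's list, keeping the list sorted by start time.
// `message` is assumed to look like '{"id":"song-2","start":190,"end":420}'.
function addPushedEvent(events, message) {
  var raw = JSON.parse(message);
  var ev = {
    id: raw.id,
    start: Number(raw.start),
    end: Number(raw.end),
    active: false
  };

  // Ignore duplicates in case the server sends the same event twice.
  for (var i = 0; i < events.length; i++) {
    if (events[i].id === ev.id) {
      return events;
    }
  }

  // Insert after any existing events that start at or before this one.
  var index = 0;
  while (index < events.length && events[index].start <= ev.start) {
    index++;
  }
  events.splice(index, 0, ev);
  return events;
}
```

Whichever push mechanism delivers the message, its handler would only need to call this function with the message text.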
We've done this work in conjunction with the . We're using HTML5 in the project to sync video and display extra information about the content - it's early days for us on this and there are a number of serious challenges before this becomes anything near mainstream - if ever. We hope it's a useful demo and look forward to feedback.