Copyright is held by the author/owner(s)
WWW2002, May 7-11, 2002, Honolulu, Hawaii, USA.
ACM 1-58113-449-5/02/0005.
Multimedia scheduling models provide a rich variety of tools for managing the synchronization of media like video and audio, but generally have an inflexible model for time itself. In contrast, modern animation models in the computer graphics community generally lack tools for synchronization and structural time, but allow for a flexible concept of time, including variable pacing, acceleration and deceleration, and other tools useful for controlling and adapting animation behaviors. Multimedia authors have been forced to choose one set of features over the other, limiting the range of presentations they can create. Some programming models have addressed some of these problems, but provide no declarative means for authors and authoring tools to leverage the functionality. This paper describes a new model, incorporated into SMIL 2.0, that combines the strengths of scheduling models with the flexible time manipulations of animation models. The implications of this integration are discussed with respect to scheduling and structured time, drawing upon experience with SMIL 2.0 timing and synchronization and the integration with XHTML.
H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems — Animations, Video; I.3.6 [Computer Graphics]: Methodology and Techniques — Languages, standards.
General Terms: Design, Theory, Languages, Standardization.
Keywords: multimedia, timing, synchronization, animation, graphics.
Timing and synchronization are at the heart of multimedia, and have inspired considerable research and development. Timing models have evolved in various directions, reflecting the different domains of the researchers. However, most researchers (and developers of commercial products) have viewed the problem from one of two separate domains, and tend to be unaware of, or unconcerned with, the models in use outside their chosen domain. As a result, two general classes of models exist for timing and synchronization, each with its respective strengths and weaknesses, and neither of which covers the broader domain of both worlds. One class of models centers on the scheduling of continuous (and generally streamed) media like video and audio, and the other is directed at the needs of animation — especially within the computer graphics community.
The video-centric models take different approaches, but generally concentrate on support for specifying the synchronization relationships among media elements. The models must allow a range of relationships among the media elements, and must accommodate the issues associated with delivering media over unreliable channels (like the Internet). In most of these models, time is essentially immutable — it is a dimension (or dimensions) along which to organize media content. While some models support user control over the frame rate of individual media, and/or control of the overall rate of presentation playback, they generally do not provide a means to control the pace of time within the document.
In the computer graphics community, the timing models are generally quite simple, with few tools for synchronization relationships, structured timing, or the accommodation of network resource constraints. However, time within the model is essentially arbitrary and entirely mutable. Time as a property can be transformed to advance faster or slower than normal, to run backwards, and to support acceleration and deceleration. When combined with simple animation tools (e.g., motion, rotation and generic interpolation), time transformations make it much easier for authors to describe common mechanical behaviors such as elastic bouncing and pendulum motion, and to ‘tune’ animations for a more pleasing or realistic effect.
This dichotomy is understandable to the extent that it mirrors an historical split between “video” and “graphics” communities within research and development. Nevertheless, the result is that neither class of models covers the complete domain. Multimedia authors have generally been forced to choose one model or the other, limiting the range of presentations that can be authored.
As more multimedia moves to Internet distribution, common models and languages become that much more important. In addition to the importance of declarative models for authoring, a common model and language for timing and synchronization is a prerequisite for document reuse, sharing, collaboration and annotation — the building blocks of the next generation of Web content (both on the World Wide Web as well as on corporate and organizational Intranets).
This paper describes the new model in SMIL 2.0 that combines the strengths of video-centric scheduling models with the flexibility of time transformations used in the computer graphics and animation community. The first section provides some background on the scheduling and animation perspectives, describes the key aspects of timing models related to the integrated model, presents motivating examples for a unified model, and reviews related work. The next section describes the new model, presenting a simple authoring syntax and the underlying timegraph semantics. Finally, we describe experience with the model, including the integration of SMIL 2.0 and XHTML, and potential applications to other languages.
The video-centric camp traditionally focuses on scheduling the delivery and presentation of continuous (time-based) media ([4, 11, 15, 16, 19, 25, 32]). It is assumed that continuous media behave like video and audio, with some intrinsic notion of time in the form of frames or samples per second. The commonly used media types (especially streamed media) have ballistics or linear behavior that constrain how quickly the media can be started and stopped, and to what extent the media can support variable-rate playback (i.e., at other than normal forward play speed). This, taken together with a lack of support for (and as such a lack of experience with) animation tools, resulted in many simple and strict definitions for the behavior of time within scheduling models. While there are a few exceptions that provide a more abstract definition of time, e.g., [18], these models only support low-level control of time; no authoring abstractions are defined for common use cases, and the implications for hierarchically scheduled timing are not discussed.
[Footnote: The term “ballistics” came into use in the development of commercial video editing systems. These edit-controllers had sophisticated machine control software to synchronize video and audio tape devices in the performance of an editing scenario. The behavior of the media playback and recording devices was quite literally ballistic, due to the mechanical nature of the devices, the physical bulk of the recording tape, etc. The term came to be used more generally to describe the analogous behavior of media elements within a multimedia presentation, including the network and other resource delays to start media, stop it, etc.]
The graphics/animation-centric camp generally models time as an abstract notion used for purely rendered (i.e., mathematically or functionally defined) animations that have no intrinsic rate or duration ([2, 6, 8, 9, 12, 20, 22]). Animation in this sense is the manipulation of some property or properties as a function of time, and should not be confused with, for example, the simple sequencing of a set of images as in “cel animation”. Since the animations have no delivery costs (i.e., there is no associated media except the animation description itself), and since animations can be sampled (mathematically evaluated) at any point in time, graphics/animation-centric presentations can be rendered at arbitrary (and variable) “frame” rates. These animation models generally have little or no support for time-based media like video. Without the need for runtime synchronization management or predictive scheduling, many graphics/animation-centric models (e.g., [6, 10, 20]) adopted event-based models for timing. While some models (e.g., [13]) support some scheduling, the tools are simple, and usually define a “flat” time space (i.e., with no hierarchical timing support). Several notable exceptions combine hierarchic timing and a flexible notion of time ([8, 9, 14]). While these programming models provide no authoring abstractions and are somewhat primitive from a synchronization perspective, they help demonstrate the potential of a merged model.
As a result of the differing perspectives of the synchronization and graphics/animation communities, presentation models and engines that do a good job of scheduling and synchronization lack the tools and flexibility common to the graphics/animation models. By the same token, the graphics-centric models tend to provide few, if any, tools to support synchronization and scheduling of continuous media. Authors wanting to combine animated computer graphics with video and audio, particularly for use on the web, are thus faced with a poor set of alternatives. At best, graphic animations are created separately, converted to a video, and then synchronized to other video and audio; this greatly increases the delivery cost of the animation while yielding much poorer rendering quality, and makes it much more difficult to adjust the animation in the context of the video and audio presentation. The integrated model proposed in this paper addresses this need, providing simple but flexible authoring abstractions for time transformation integrated with a rich set of timing and synchronization tools.
A broad range of multimedia presentations require the use of synchronization tools for media like video and audio, as well as tools for animation. Two simple cases are described here to illustrate the value of the integration presented in this paper.
A typical function of presentation authoring systems (e.g., Microsoft PowerPoint) supports simple transitions for bullet points on presentation slides. One commonly used transition is a motion path that ‘slides’ the bullet point into place from a position offscreen. If a simple line motion path is used to describe this transition, the effect is somewhat harsh, as the point abruptly stops moving at the end of the path; many people will perceive that the moving element bounces slightly, because of the instantaneous end to the motion. PowerPoint and other tools make the result visually more pleasing by slowing down the motion at the very end, as though the element were braking to a stop. This is easily accomplished in a generic manner for motion animations using a deceleration time transform.
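In SMIL 2.0 terms, such a decelerated slide-in can be sketched as follows (the target ID, coordinates and duration are invented for illustration); decelerate="0.5" applies the braking over the last half of the simple duration:

    <animateMotion targetElement="bullet1" from="-500,0" to="0,0"
                   dur="1s" decelerate="0.5" fill="freeze"/>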
When a presentation with slides and such motion transitions must be synchronized to video or audio (e.g., a recording of the person delivering the original slide presentation), the slides can generally be converted to XHTML, SVG or some other medium suitable for web presentation. But in order to synchronize the recorded audio or video with the slide presentation, including the animation transitions with decelerated motion, a unified model must provide both media synchronization and time transformations for animation. [Footnote: Lacking such a tool, the slides are often recorded as video. This process greatly increases the resource costs of the presentation (video being much larger than the declarative text). It also reduces the visual fidelity of the slide content, and destroys the text functionality in the slides including the ability to copy/paste, to traverse hyperlinks, etc.]
This example presents a clockwork mechanism with a mechanical feel to the animation. The clockwork is represented by a series of “gears” of different sizes. The gears are rendered as vector graphics (or they could be images, if image rotation were supported by the runtime). Each gear has a simple rotation animation applied. The direction of rotation is set so that interlocking gears appear to rotate together as a geared mechanism. The rate of rotation is a function of the number of teeth on each individual gear. The graphic elements and the basic rotations are illustrated in Figure 1.
In order to make the animation appear to work as a clockwork, a number of changes are made. First, the mechanism should run normally for about 3 seconds, and then it should reverse, repeating this cycle forever. Second, in order to give the mechanism more realistic mechanics, acceleration and deceleration are applied to each phase of the animation; this makes the gears speed up from a standstill and then slow down to a standstill as the mechanism changes direction. Audio will be synchronized to emphasize the rhythmic clockwork action.
If time transforms are not supported with hierarchic timing structures, this animation is very difficult to create and modify. Each gear rotation must be defined as a partial rotation, calculating the rotation angle from the size of the associated gear. Each rotation must be adjusted using a set of keyframes (or equivalent) to accelerate and decelerate at the beginning and end of the animation, and finally the modified rotations must be adjusted to reverse (copying and reversing the rotation keyframes). This is difficult enough, but there is another, more serious problem with this approach. Most animations (like most media in general) are not authored in a single, perfectly completed step, but rather must be created, adjusted, presented to a client or producer, further adjusted, and so on in an iterative editing process. If the gears animation had to be adjusted to vary the pacing or the amount of rotation in each step, or to synchronize to an updated audio track, the carefully created rotation animations would each have to be completely reworked with each editorial change. This becomes quite burdensome to any author, and greatly increases the cost of authoring.
In marked contrast, the same animation is almost trivially easy with the time transform support. The original rotations are defined with a simple rate, and repeat indefinitely. A simple time container is used to group the four rotation animations. The desired overall duration for one clockwork ‘step’ is set as a duration on the time container — this can be easily edited to adjust the amount of rotation in each step. Acceleration and deceleration are then added as properties of the time container to create the basic mechanical action, and then a simple reversing transform is enabled to make the clockwork reverse. The changes are easy to author and easy to adjust, and the result is a sophisticated animation in a fraction of the time it would take to create without time transforms.
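In SMIL 2.0 syntax, the entire clockwork step can be sketched as follows (the gear IDs, rotation rates and step duration are illustrative, not taken from Figure 1):

    <par dur="3s" repeatCount="indefinite"
         accelerate="0.5" decelerate="0.5" autoReverse="true">
      <animateTransform targetElement="gear1" attributeName="transform"
                        type="rotate" by="360" dur="4s" repeatCount="indefinite"/>
      <animateTransform targetElement="gear2" attributeName="transform"
                        type="rotate" by="-360" dur="6s" repeatCount="indefinite"/>
      <!-- remaining gears follow the same pattern -->
    </par>

Editing the pacing or the feel of the mechanism then means changing one or two attribute values on the container, rather than reworking keyframes on every gear.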
This example underscores the power of transforming time, rather than simply adjusting individual animations. Combining time transforms with hierarchic time containment provides an important tool for many types of animation. This example also requires that the animation and associated audio be presented in tight synchronization, or the overall effect is lost. If the authoring and presentation engine does not support both time transformation and synchronization tools, the author must separate animation editing and synchronization editing into two separate steps, using two tools. The editing process, and especially the task of coordinating and synchronizing the audio and animations, becomes more difficult. In addition, the presentation performance is generally less reliable. A single model that unifies synchronization tools and time transforms solves these problems and enables this class of presentations with greatly simplified authoring.
A timing model for the web must integrate traditional synchronization support with time transformations, in a manner appropriate for web authors. To provide a solution for a broad set of web applications, a model must meet the following specific requirements for timing and synchronization functionality, as well as more general requirements for authoring. Most of the implications of integrating time transformation with traditional time models relate to the way times are computed in the model, and in particular, how times are translated from an abstract representation to simple presentation time. The key aspects of time models that are required are:
- hierarchic timing, in which timelines are structured by composing groups of elements (time containers);
- relative timing, in which begin and end times are specified relative to other elements in the timegraph;
- time transformations that control the pace and direction of time for an element or an entire subtree; and
- well-defined fallback semantics for media elements that cannot perform as the transformed timegraph specifies.
There are other aspects of timing and synchronization that should be included in any integrated model, but that are largely orthogonal to the discussion at hand. These include support (i.e., authoring abstractions) for repeat functionality, multiple begin and end times, minimum and maximum duration constraints, wall-clock timing, interrupt semantics, etc. Of particular note is support for interactive timing. This may be supported via events, hyperlinks, or both [17]. When modeled as an indeterminate or unresolved time for an element, interactive timing can be cleanly integrated with both hierarchic and relative time models, even in the presence of time transforms.
Some time models define all time in terms of events. While this does allow for dynamism, these models cannot abstract the semantics of interval relations, structured time, etc. Some concepts of time transformations (simple speed transforms) can be applied to a pure event-graph model, but the time transforms can only be applied to individual elements (e.g., the clockwork animation described above would not be possible).
In addition to the above requirements, a model for web timing must address the needs of web authors and the document processing models used on the web. This dictates in particular that a model must support:
- declarative, easy-to-use authoring abstractions that do not require a programming background; and
- an XML-based syntax and semantics that integrate cleanly with other web languages and document processing models (including DOM access and modular language integration).
Taken together, all these requirements pose a significant challenge. The next section describes some of the related models and tools that address at least some of the same issues.
Several models for timing and synchronization incorporate some flexibility in the definition of time, and several models for computer graphics animation include some basic concepts from synchronization (hierarchic time in particular). Among these, HyTime [18] defines an abstract model of time that includes a low-level speed transform. In addition, HyTime provides tools for mapping from the model to the presentation that might be leveraged in building an integrated model. However, HyTime does not define authoring abstractions for hierarchic time, or for more complex time transforms (such as acceleration and deceleration), nor does it define fallback semantics for media renderers that cannot perform as required. Other programming interfaces that include some support for time transformation include [1, 4, 11]. The IBAL model [22] provides some tools as well, but is more interesting for the discussion of how objects, behavior and timing (“who”, “what” and “when”) should be separated in a model; this follows the general trend to separate content, presentation/style and timing/animation in document presentation models [29, 30].
A number of models in the literature are oriented towards a posteriori analysis of a presentation timegraph, e.g., [3, 25]. While these may be useful analytical tools, they do not generally provide usable authoring abstractions, and so do not solve the problem at hand.
Several authoring tools have explored some of these concepts as well, including Maya [2] from Alias|Wavefront and Liquid Motion [26] from Dimension X. Maya includes powerful animation tools, but only limited synchronization and time transformation tools (animations can be grouped as a clip, and sped up or slowed down). Liquid Motion is a Java-based authoring tool and runtime for 2-D web multimedia that includes authoring abstractions for hierarchic time, relative and interactive timing, and time transforms. The model is based upon a scene-graph paradigm not unlike MHEG [19], although Liquid Motion supports hierarchic timing and simple synchronization explicitly, where MHEG uses timers and events. However, Liquid Motion has only primitive scheduling support for continuous media (it was constrained by the lack of support for video in early versions of the Java Virtual Machine), and does not define any fallback semantics for media.
While many of these models presage aspects of the model described in this paper, none integrates a rich set of synchronization tools with a model for time transformation to provide a solution for authors. In terms of an authoring solution, Liquid Motion comes the closest, and experience with that tool informed the development of the model we describe. The next section presents this integrated model, and describes experience with the model in several authoring languages.
The proposed timing model for the web satisfies all of the requirements described above. It combines the traditional timing and synchronization tools of hierarchic timing, relative and interactive timing, with time transformation for support of animation. The authoring abstractions are designed to balance power and flexibility on the one hand, with ease of authoring on the other. As an XML-based language, SMIL 2.0 can be easily used in Internet document processing models. The modular structure and language integration semantics facilitate re-use of the SMIL 2.0 timing and animation support in other languages.
Hierarchic timing is provided by time containers, with local time defined as an extension of the simple cascade described in [14]. Relative and interactive timing are integrated directly, along with a number of other more advanced tools. The framework defines how time transforms are incorporated into the local time cascade, and a simple set of time transforms is defined for authoring convenience (the model can be extended to support other transformations). Fallback semantics are defined for cases in which an element cannot perform as the timegraph specifies. Care is taken with the definition of the time transforms to minimize timegraph side-effects, both to simplify the authoring model and to make the fallback semantics cleaner.
Support for hierarchic time is provided by three time containment primitives: parallel, sequence and exclusive grouping. The details of the time container semantics are beyond the scope of this paper, but are available in [7] and [28]. In brief, the functionality includes:
- par (parallel) time containers, in which all children share a common timebase and can play concurrently; par is often used simply as a grouping construct for temporal elements;
- seq (sequence) time containers, in which the children play one after another; and
- excl (exclusive) time containers, in which at most one child plays at any given time.
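As a structural sketch (the media URLs are invented), scenes play one after another while the media within each scene plays in parallel:

    <seq>
      <par>
        <video src="scene1.mpg"/>
        <audio src="narration1.au"/>
      </par>
      <par>
        <video src="scene2.mpg"/>
        <audio src="narration2.au"/>
      </par>
    </seq>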
Each timed element has a model for local time which is layered to account for various modifiers and transforms. The first layer is simple time, which represents the simplest form of the actual media, an animation function, or the contained timeline (for time containers). Simple time is modified by the defined time transformations to yield segment time, which is in turn modified by repeat functionality and min/max constraints to yield active time. The local time cascade is a recursive function that derives a child's active time from the parent time container's simple time. From the local active time, the segment and simple times are derived (the model is logically inverted to calculate the active duration from the simple duration). The equations are detailed in [7].
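Schematically (our paraphrase, with min/max constraints omitted; the normative equations are in [7]), for a child with begin time $b$ expressed in its parent's simple time:

$$ t_{active} = t_{parent} - b, \qquad t_{segment} = t_{active} \bmod d_{segment}, \qquad t_{simple} = M^{-1}(t_{segment}) $$

where $M$ is the composition of the element's time transforms mapping simple time to segment time, and the modulo term stands in for the repeat behavior.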
One of the distinguishing characteristics of the Web as a document medium is the high degree of user interaction. In addition to a rich set of timing controls for narrative authoring structures, SMIL 2.0 includes flexible interactive (event-based) timing. This allows authors to create both traditional storyline presentations as well as user driven hypermedia, and to mix the two freely in a cohesive model. Recognizing the tradition of scripting support in web presentation agents, DOM access to basic timing controls is also defined, giving authors extensible support for a wide range of interactive applications.
All timed elements (including time containers) support definition of begin and end times. Relative timing is supported both in the implicit sense of hierarchic timing, and through sync-arcs, wall-clock timing, and timing relative to a particular repeat iteration of another element. Interactive timing is supported via event-based timing, DOM activation and hyperlink activation [17]. Event timing supports an extensible set of event specifiers, including user-interface events (e.g., mouse clicks), element timing events (such as the begin, end or repeat of another element), and media or implementation-specific events.
The integration of events and scheduled timing is detailed in [27] and is similar to the mechanism described in [15]. DOM activation supports procedural control to begin and end elements, and closely follows the model for event timing. Hyperlink interaction supports element activation as well as context control (seeking the presentation timeline).
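As a sketch of the resulting syntax (the element IDs and offsets are invented), a single element can combine scheduled, relative and interactive timing in one begin list:

    <video src="detail.mpg"
           begin="0s; intro.end+2s; moreButton.click"
           end="stopButton.click"/>

The element can begin at document start, two seconds after another element ends, or in response to user interaction, and can also be ended interactively.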
As described in the requirements and as demonstrated by the clockwork example in particular, a web timing model must integrate the kind of time transformation support commonly used in computer graphics animation. The SMIL 2.0 timing model defines a set of four simple abstractions for time transformation to control the pace of element simple time. These are abstracted as the attributes speed, accelerate, decelerate and autoReverse:
- speed specifies a multiplier on the rate at which local time advances (negative values run time backwards);
- accelerate and decelerate each specify a fraction (from 0 to 1) of the simple duration over which the pace of time respectively speeds up from, or slows down to, a standstill, while preserving the simple duration;
- autoReverse specifies that the simple duration plays once forward and then once in reverse.
[Figure: pacing curves for accelerate=0.3, accelerate=0 and accelerate=1]
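The duration-preserving pacing of accelerate and decelerate can be derived from the stated constraints (this is our paraphrase; the normative equations appear in [7]). With accelerate $a$ and decelerate $b$ ($a + b \le 1$) and simple duration $d$, the rate of time ramps linearly from $0$ up to a peak $r_{max}$ over the first $ad$ of the interval, and back down to $0$ over the last $bd$. Requiring the transformed time to still cover exactly $d$ gives

$$ r_{max} = \frac{1}{1 - a/2 - b/2}, \qquad f(t) = \begin{cases} r_{max}\, t^{2}/(2ad) & 0 \le t < ad \\ r_{max}\,(t - ad/2) & ad \le t \le (1-b)d \\ d - r_{max}\,(d-t)^{2}/(2bd) & (1-b)d < t \le d \end{cases} $$

For example, accelerate="0.3" alone yields a peak rate of roughly 1.18 times normal, and the element still ends exactly at its simple duration.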
When applied to a time container, the time transformations affect the entire subtree because of the local time cascade model. This is defined primarily to support animation cases such as the clockwork example cited earlier, but can be well applied to any timing subtree that includes sampled animation behaviors and non-linear (a.k.a. random access) media elements. Some linear media renderers may not perform well with the time manipulations (e.g., renderers that can only play the associated media at normal play speed). A fallback mechanism is described in which the timegraph and syncbase-value times are calculated using the pure mathematics of the time manipulations model, but individual media elements simply play at the normal speed or display a still frame. That is, the semantic model for time transformation of a subtree includes both a “pure” mathematical definition of the resulting timegraph, as well as semantics for graceful degradation of presentations when media elements cannot perform as specified.
The fallback semantics depend upon the capabilities of a given media renderer. Some media renderers can play at any forward speed; others can play forwards and backwards, but only at the normal rate of play. If the computed element speed (the cascaded product of the speed manipulations on the element and all of its ancestor time containers) is not supported by the media renderer, the renderer plays at the closest supported speed (“best effort”).
The effect of the fallback semantics is to allow a presentation to degrade gracefully on platforms with less capability. The synchronization model for the presentation as a whole is preserved, but some individual media elements play at a different speed than was desired (i.e., authored). The fallback semantics and best-effort media playback ensure a reasonable, if not ideal, presentation. Moreover, the explicit fallback semantics assure the author of a consistent and predictable model for degraded rendering.
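For example (a hypothetical fragment), the computed speed is the product of the speed values down the cascade:

    <par speed="2">
      <par speed="3">
        <!-- computed speed = 2 x 3 x 0.5 = 3 -->
        <video src="clip.mpg" speed="0.5"/>
      </par>
    </par>

A renderer that supports only normal forward play would render this video at speed 1 as its best effort, while the surrounding timegraph still advances at the computed speed.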
An important aspect of the simplified definition of accelerate and decelerate is the associated simplification it affords the fallback mechanism. Because the model preserves the simple duration for an element, the fallback semantics for time transformations applied to linear media elements have minimal impact. As such, for linear media elements, the accelerate and decelerate transforms can almost be considered hints to the implementation.
Although the arithmetic remains fairly simple, the model is conceptually more complex when accelerate and decelerate are applied to time containers. Here the fallback semantics are needed to allow the realities of renderers to be respected without compromising the semantics of the timegraph. While the model does support timegraphs with a mix of linear and non-linear behavior, and defines specific semantics for media elements that cannot support the ideal non-linear model, it is not a goal to provide an ideal alternative presentation for all possible timegraphs with such a mix. It is left to authors and authoring tools to apply the time manipulations in appropriate situations.
With a rich toolset for timing, synchronization and time transformations, the SMIL 2.0 model can address a very broad range of media and animation applications. The easy-to-use authoring abstractions ensure that document authors do not need a programming background to apply the model. The next section describes experience with the model up to this point, including integration with the lingua franca of the web, XHTML.
SMIL 2.0 defines syntax and semantics for multimedia synchronization and presentation. A modular approach to language definition allows for the definition of a range of language profiles. The self-contained SMIL 2.0 Language profile combines modules of SMIL 2.0 for the description of multimedia presentations. The integration with XHTML [24] combines many of the SMIL 2.0 modules with modules of XHTML to support multimedia, timing and animation integration with HTML and CSS. An implementation of this language is available in Microsoft Internet Explorer versions 5.5 and later.
The integration with XHTML and CSS uses an approach that may be applied to other language integrations as well. SMIL media and animation elements are added to the language, making it very easy to integrate media like audio and video. SMIL timing markup is applied directly to the elements of XHTML as well, providing a single timing model for the entire page. The integration allows authors to easily describe presentations that synchronize audio, video and animated XHTML/CSS. General issues and other approaches to integrating multimedia languages with other document languages are discussed in [29, 30, 31].
However, the application of timing to the XHTML elements themselves raises a question: what do the SMIL begin and end attributes mean for <div> or <strong>?
Phrasal and presentational elements like <strong>, <b> and <i> have a defined semantic effect (which often translates to some presentation effect); timing can be used to control this intrinsic behavior. However, for elements like <div> and <p>, authors generally want to time the presentation of the associated content; given the flow layout model for HTML/CSS, authors must choose whether or not element timing should affect document layout, in addition to hiding and showing the element. In practice, authors requested support for other kinds of timed control as well, such as the timed application of an inline style attribute.
The timeAction attribute specifies the desired semantic. The XHTML+SMIL language profile defines a set of timeAction attribute values for HTML and CSS semantics; other languages can extend the set of values as appropriate. Two language-independent actions apply to all XML languages:
- A generic class action adds a specified value to the xml:class property of the timed element, when the element is active in time. The side-effects of setting the class value can be used by an author to apply style rules (using a class selector) or other behavior based upon class membership.
- A generic style action can be used with all XML languages that define an inline style attribute (or an equivalent mechanism for local style application). The styling language can be CSS, XSL-FO or any styling language with support for dynamically controlled presentation styling.
Two style-language specific actions apply to CSS presentation control; however, an integrating language could map these timeActions to isomorphic properties in another style language:
- A visibility action controls the CSS visibility property in XHTML+SMIL; the element always occupies space in the layout, but is only visible while active.
- A display action controls the CSS display property in XHTML+SMIL; the element participates in document layout only while active.
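A hypothetical fragment illustrates the flavor of timed XHTML under this profile (the times are invented; attribute placement follows the XHTML+SMIL draft [24]):

    <p begin="5s" dur="10s" timeAction="visibility">
      Shown from 5 to 15 seconds; always occupies space in the layout.
    </p>
    <p begin="5s" dur="10s" timeAction="display">
      Participates in the flow layout only while active.
    </p>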
The time transforms have proven very useful in practice with the XHTML+SMIL profile, especially with the SMIL animation elements. Several common applications of the transforms include:
- Using the accelerate/decelerate and autoReverse time transforms, as well as the animation composition semantics in SMIL 2.0, authors can combine two simple line motion animations to create animations of an element bouncing across the page, or elliptical “orbit” motion (a sketch of the bouncing case follows below).
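A rough sketch of the bouncing case (targets, distances and timing invented): a steady horizontal glide is summed with a short vertical motion that decelerates to its apex and auto-reverses back down, once per bounce:

    <animateMotion targetElement="ball" from="0,200" by="400,0" dur="4s"/>
    <animateMotion targetElement="ball" by="0,-100" dur="0.5s"
                   decelerate="1" autoReverse="true" repeatCount="4"
                   additive="sum"/>

Because decelerate="1" brings the upward motion to a standstill at the apex and autoReverse plays it back down, the composed path mimics ballistic bounces without any keyframe arithmetic.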
The integration of SMIL 2.0 timing and synchronization support, and especially the time transform support, with XHTML and CSS has proven to be a flexible, powerful and easy-to-author time model for a variety of document and presentation types. XHTML+SMIL provides a demonstration of the viability and utility of our model as a time model for the web.
XMT [21] is an XML language for MPEG-4 scene description. Although still in development, recent drafts integrate the SMIL 2.0 timing modules, including time transform support. There is little practical experience to date, but this should prove to be an interesting application of the model.
SVG 1.0 [13] includes a minimal integration of SMIL functionality (due largely to scheduling constraints — SMIL 2.0 was not complete in time for full integration with SVG 1.0). Basic animation support is included, based upon a restricted set of SMIL timing functionality; however, it includes neither hierarchic timing support nor time transforms (making content such as the clockwork example difficult or impossible to author). In addition, there is no direct integration of timing functionality with the main SVG elements (i.e., other than the animation elements). A deeper integration of timing with SVG, together with support for the full SMIL 2.0 timing and synchronization model including time transforms, should be considered as part of the work for a next version of SVG.
We are currently exploring the use of timing with NewsML, and in particular the combination of some timing declared in the NewsML with the application of timing via an XSLT stylesheet, generating XHTML+SMIL as the final presentation format [29, 30]. Additional areas of exploration include fragmentation of timed documents (based upon XML Fragment Interchange) and the timing model for compound documents.
We have presented a new model for timing and synchronization in web documents. This model, formalized in SMIL 2.0 modules, combines a rich set of tools from the ‘video-centric’ world of synchronization with time transformations supported in computer graphics animation models. Unlike previous models for graphics/animation, the SMIL 2.0 model addresses real-world constraints of linear media renderers. Our novel abstraction of acceleration and deceleration facilitates simpler integration with the timing model, simplifies the authoring model, and minimizes the impact of fallback semantics for media. The integrated model makes possible multimedia presentations that combine traditional continuous media like video and audio with animations that must transform time.
This timing model for the web supports the creation of multimedia presentations that synchronize video and audio with sophisticated animated graphics, using an author-friendly syntax. SMIL modules were designed specifically for integration with other XML languages, facilitating wider adoption of a common language and semantics among web languages. In addition to an integration with XHTML, ongoing work should advance specific integrations like SVG, and explore the use of SMIL timing with currently emerging XML tools and document processing models.
Much of the work for the paper was completed while the author was visiting CWI, Amsterdam. I would like to thank Lynda Hardman, Lloyd Rutledge and Jacco van Ossenbruggen for their patience and insight in reviewing earlier versions of this paper. I would also like to acknowledge the SYMM working group of the W3C for the diligent reviews and commitment to quality that characterizes the work on SMIL 2.0.
[1] P. Ackermann. Direct Manipulation of Temporal Structures in a Multimedia Application Framework. In Proceedings of ACM Multimedia '94, San Francisco, pp. 51-58, October 1994.
[2] Alias|Wavefront. Learning Maya 3. Product tutorial, 2001.
[3] J.F. Allen. Maintaining Knowledge about Temporal Intervals. Communications of the ACM, 26(11):832-843, November 1983.
[4] J. Bailey, A. Konstan, R. Cooley, and M. Dejong. Nsync — A Toolkit for Building Interactive Multimedia Presentations. In Proceedings of ACM Multimedia '98, Bristol, England, pp. 257-266, 1998.
[5] R.H. Campbell and A.N. Habermann. The Specification of Process Synchronization by Path Expressions. Volume 16 of Lecture Notes in Computer Science. Springer Verlag, 1974.
[6] R. Carey, G. Bell, and C. Marrin. The Virtual Reality Modeling Language. ISO/IEC DIS 14772-1, April 1997.
[7] A. Cohen et al. (eds). Synchronized Multimedia Integration Language (SMIL 2.0) Specification. W3C Recommendation, 7 August 2001. http://www.w3.org/TR/smil20/.
[8] L. Dami, E. Fiume, O. Nierstrasz, and D. Tsichritzis. Temporal Scripting using TEMPO. In Active Object Environments (ed. D. Tsichritzis), Centre Universitaire d'Informatique, Université de Genève, 1988.
[9] J. Döllner and K. Hinrichs. Interactive, animated widgets. In Computer Graphics International, June 22-26, Hannover, Germany, 1998.
[10] S. Donikian and E. Rutten. Reactivity, concurrency, data-flow and hierarchical preemption for behavioral animation. In Eurographics Workshop on Programming Paradigms in Graphics, Maastricht, The Netherlands, September 1995.
[11] D.J. Duke, D.A. Duce, I. Herman, and G. Faconti. Specifying the PREMO synchronization objects. Technical report 02/97-R048, European Research Consortium for Informatics and Mathematics (ERCIM), 1997.
[12] C. Elliott, G. Schechter, R. Young, and S. Abi-Ezzi. TBAG: A high level framework for interactive, animated 3D graphics applications. In Proceedings of the ACM SIGGRAPH Conference, 1994.
[13] J. Ferraiolo (ed). Scalable Vector Graphics (SVG) 1.0 Specification. W3C Recommendation, 4 September 2001. http://www.w3.org/TR/SVG/.
[14] S. Gibbs. Composite Multimedia and Active Objects. In Proceedings of OOPSLA '91, pp. 97-112, 1991.
[15] M. Haindl. A new multimedia synchronization model. IEEE Journal on Selected Areas in Communications, 14(1):73-83, January 1996.
[16] L. Hardman, D.C.A. Bulterman, and G. van Rossum. The Amsterdam Hypermedia Model: Adding Time and Context to the Dexter Model. Communications of the ACM, 37(2):50-62, February 1994.
[17] L. Hardman, P. Schmitz, J. van Ossenbruggen, W. ten Kate, and L. Rutledge. The link vs. the event: activating and deactivating elements in time-based hypermedia. The New Review of Hypermedia and Multimedia, 6, 2000.
[18] ISO/IEC 10744: Information technology — Hypermedia/Time-based Structuring Language (HyTime). Second edition, 1997-08-01.
[19] ISO/IEC. MHEG-5: Coding of multimedia and hypermedia information — Part 5: Support for base-level interactive applications. International Standard ISO/IEC 13522-5:1997 (MHEG-5), 1997.
[20] D. Kalra and A.H. Barr. Modeling with Time and Events in Computer Animation. Computer Graphics Forum (Proceedings of Eurographics '92), 11(3):45-58, 1992.
[21] M. Kim et al. (eds). Study of ISO/IEC 14496-1:2001 / PDAM2. Working Draft, March 2001.
[22] G. Lee. A general specification for scene animation. In International Symposium on Computer Graphics and Image Processing (SIBGRAPI), Rio de Janeiro, Brazil, October 1998.
[23] T.D.C. Little and A. Ghafoor. Interval-Based Conceptual Models for Time-Dependent Multimedia Data. IEEE Transactions on Knowledge and Data Engineering (Special Issue: Multimedia Information Systems), 5(4):551-563, August 1993.
[24] D. Newman, P. Schmitz, and A. Patterson (eds). XHTML+SMIL Language Profile. W3C Working Draft, 7 August 2001. http://www.w3.org/TR/XHTMLplusSMIL/.
[25] K. Rothermel and T. Helbig. Clock Hierarchies: An Abstraction for Grouping and Controlling Media Streams. IEEE Journal on Selected Areas in Communications, 14(1):174-184, January 1996.
[26] G. Schmitz. Microsoft Liquid Motion by Design. Microsoft Press, October 1998.
[27] P. Schmitz. Unifying Scheduled Time Models with Interactive Event-based Timing. Microsoft Research Tech. Report MSR-TR-2000-114, November 2000. http://www.research.microsoft.com/scripts/pubs/view.asp?TR_ID=MSR-TR-2000-114.
[28] P. Schmitz. The SMIL 2.0 Timing and Synchronization Model: Using Time in Documents. Microsoft Research Tech. Report MSR-TR-2001-01, January 2001. http://www.research.microsoft.com/scripts/pubs/view.asp?TR_ID=MSR-TR-2001-01.
[29] P. Schmitz. A Unified Model for Representing Timing in XML Documents. WWW9 position paper, 15 May 2000. http://www.cwi.nl/~lynda/www9/TimingIntegrationPositionPaper.html.
[30] W. ten Kate, P. Deunhouwer, and R. Clout. Timesheets — Integrating Timing in XML. WWW9 position paper, 15 May 2000. http://www.cwi.nl/~lynda/www9/Timesheets.www9.html.
[31] J.R. van Ossenbruggen, H.L. Hardman, and L.W. Rutledge. Integrating multimedia characteristics in web-based document languages. CWI Technical Report INS-R0024, December 2000. http://www.cwi.nl/ftp/CWIreports/INS/INS-R0024.ps.Z.
[32] T. Wahl and K. Rothermel. Representing time in multimedia systems. In Proceedings of the IEEE International Conference on Multimedia Computing and Systems, Boston, MA, May 1994.