Current streaming applications use proprietary mechanisms to establish and control the streams, as well as to transport the data. The framework developed for this project is designed to allow the creation of interoperable streams. This design facilitates the independent development of client and server components.
In this document, we describe the various interfaces and interactions in the CORBA streaming framework. In addition, we discuss the design and implementation of a prototype audio/video streaming application that uses this framework. This prototype is based on TAO, which is a real-time CORBA ORB.

MMDevice: The MMDevice component abstracts the notion of a multimedia
device. This device might be physical (such as a video camera) or
logical (such as a program that reads video clips from a file).
The MMDevice is responsible for creating new endpoints in response to
requests to create new streams. An endpoint typically consists of a
pair of objects: a Virtual Device, or VDev, which encapsulates the
device-specific parameters of the connection, and the StreamEndpoint,
which encapsulates the transport-specific parameters of the
connection. The roles of VDev and StreamEndpoint are described in
more detail later in this document.
The MMDevice is also the component where various policies
governing the creation of the VDev and
StreamEndpoint objects can be
implemented. Currently,
the implementation of MMDevice provides for two
concurrency strategies:
Reactive strategy: Endpoint objects for each new connection are created in the same process as the factory. This means that the same process handles all the concurrent connections reactively.
Process-based strategy: New virtual devices and stream endpoints are created in a new process.
VDev: The Virtual Device (VDev) encapsulates the device-specific parameters of a connection. The stream establishment phase provides for negotiation of these parameters. If the negotiation fails, the stream can be torn down immediately.
The current prototype (explained below) supports MPEG-1 video streams, and ULAW audio.
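The negotiation step can be illustrated with a minimal sketch. The function `negotiate`, the dictionary layout, and the "MJPEG" entry are hypothetical and chosen only for illustration; they are not part of the framework's API. The sketch assumes each side advertises a set of acceptable values per device parameter and that negotiation succeeds only if every parameter has at least one value in common.

```python
from typing import Dict, Optional, Set


def negotiate(supplier: Dict[str, Set[str]],
              consumer: Dict[str, Set[str]]) -> Optional[Dict[str, str]]:
    """Pick one mutually supported value per parameter, or None on failure."""
    agreed = {}
    for name, offered in supplier.items():
        common = offered & consumer.get(name, set())
        if not common:
            # Negotiation failed: the caller would tear the stream down.
            return None
        # Deterministic choice, purely for the purposes of this sketch.
        agreed[name] = sorted(common)[0]
    return agreed


# Hypothetical device parameters for a server and a client.
server_dev = {"video": {"MPEG-1"}, "audio": {"ULAW"}}
client_dev = {"video": {"MPEG-1", "MJPEG"}, "audio": {"ULAW"}}

result = negotiate(server_dev, client_dev)
```

A client that offered no video format in common with the server would get `None` back, which corresponds to the immediate teardown described above.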
StreamCtrl: The StreamCtrl abstraction is used to bind the supplier
and consumer of a stream (such as a video camera and a display).
This abstraction also provides a standard mechanism to access
stream controls, such as stop, pause, and play. In addition,
it allows the specification of the Quality of Service (QoS)
parameters (such as frames/sec) associated with the stream.
The current implementation of the StreamCtrl interface allows a
server to associate IDL interfaces with a multimedia
device. Once associated, a client can use the stream controller
to gain access to the control interface, and use it to control
the stream.
StreamEndpoint: The StreamEndpoint is an abstraction that encapsulates
the transport-specific parameters of a stream. For instance, a
stream that uses UDP as the transport protocol will use the
host name and the port number to identify its endpoints.
The current implementation of the StreamEndpoint provides a
flexible way for applications to define and exchange such
transport-level parameters.
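As a rough illustration of what exchanging transport-level parameters can look like, the sketch below packs a UDP host name and port into a string and parses it back. The class `UdpTransportParams` and the `UDP=host:port` text form are assumptions made for this example; they are not the representation the framework itself uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UdpTransportParams:
    """Hypothetical container for the UDP parameters of one endpoint."""
    host: str
    port: int

    def to_wire(self) -> str:
        # Illustrative text form only, e.g. "UDP=client.example.org:5000".
        return f"UDP={self.host}:{self.port}"

    @classmethod
    def from_wire(cls, text: str) -> "UdpTransportParams":
        protocol, _, address = text.partition("=")
        if protocol != "UDP":
            raise ValueError(f"unsupported protocol: {protocol!r}")
        host, _, port = address.rpartition(":")
        return cls(host, int(port))


params = UdpTransportParams("client.example.org", 5000)
round_trip = UdpTransportParams.from_wire(params.to_wire())
```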
The prototype is a client/server application. The client contacts the server and requests that the server establish streams for a particular movie (i.e. a video stream and an audio stream). The server is responsible for sending audio/video packets to the client. The client side then decodes and plays these packets.
The following describes the various components of the client and server in more detail.

The server hosts the MMDevice endpoint factory
(described earlier), which creates connection handlers in
response to client connections. Each connection causes the
creation of one audio process and one video process. The video and audio processes are responsible for the following tasks:
The control component of the process is a CORBA servant that responds to CORBA requests from the client. The interface exported by this servant represents the operations supported by the server, such as play, rewind, and forward.
At any given time, the server is in one of several states, such as "Playing", "Rewinding", or "Stopped". Its response to client requests depends on its state. For instance, the server should ignore a client "Play" request when the server is already in the "Playing" state. On the other hand, a client request of "Rewind" when the server is in the "Stopped" state should cause the server to enter the "Rewinding" state. To provide a flexible and extensible way to implement such requirements, the control component is implemented using the State pattern.
The control component is responsible for instructing the data component (discussed next) to modify the stream flow in response to CORBA requests. For instance, when the client makes the "rewind" call on the controller, the controller instructs the data component to start sending frames in reverse chronological order.
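The State pattern described for the control component can be sketched as follows. This is a simplified illustration, not the prototype's code: the class names and the `set_mode` call are hypothetical, and only two transitions come from the text ("Play" is ignored while playing, "Rewind" while stopped enters "Rewinding"). The remaining transitions are assumptions added so the example is complete.

```python
class DataComponent:
    """Stand-in for the data component; records the requested flow mode."""

    def __init__(self):
        self.mode = "idle"

    def set_mode(self, mode):
        self.mode = mode


class State:
    """Base state: every request is ignored unless a subclass overrides it."""
    name = "?"

    def play(self, ctrl):
        return self

    def rewind(self, ctrl):
        return self

    def stop(self, ctrl):
        return self


class Stopped(State):
    name = "Stopped"

    def play(self, ctrl):
        ctrl.data.set_mode("forward")
        return Playing()

    def rewind(self, ctrl):
        # From the text: "Rewind" while stopped enters the rewinding state.
        ctrl.data.set_mode("reverse")
        return Rewinding()


class Playing(State):
    name = "Playing"
    # play() is inherited, so "Play" while already playing is ignored.

    def stop(self, ctrl):  # assumed transition
        ctrl.data.set_mode("idle")
        return Stopped()


class Rewinding(State):
    name = "Rewinding"

    def stop(self, ctrl):  # assumed transition
        ctrl.data.set_mode("idle")
        return Stopped()


class Controller:
    """Delegates each request to the current state object."""

    def __init__(self, data):
        self.data = data
        self.state = Stopped()

    def play(self):
        self.state = self.state.play(self)

    def rewind(self):
        self.state = self.state.rewind(self)

    def stop(self):
        self.state = self.state.stop(self)
```

Adding a new state or a new request then means adding one class or one method, rather than extending a conditional that inspects the current state in every operation.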
The data component is responsible for streaming the data to the client. The current implementation of the data component uses the UDP protocol to send audio/video frames. Future implementations of this component will use the SFP encoding mechanisms to support multiple protocols.

The video buffering process is responsible for reading UDP packets off the network and putting them in a shared memory queue, from which the video decoding process picks them up.
Similarly, the audio buffering process is responsible for reading UDP packets off the network and putting them in a shared memory queue, from which the audio playback process picks them up.
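The buffering loop can be sketched in a few lines. This is an illustration under simplifying assumptions: it runs in a single process, uses `queue.Queue` as a stand-in for the shared memory queue, and sends a few dummy datagrams to itself over the loopback interface. The function names are hypothetical.

```python
import queue
import socket


def buffer_packets(sock, out_queue, max_packets, bufsize=65536):
    """Read up to max_packets datagrams from sock and enqueue them."""
    received = 0
    while received < max_packets:
        try:
            packet, _sender = sock.recvfrom(bufsize)
        except socket.timeout:
            break  # nothing more arrived within the socket timeout
        out_queue.put(packet)
        received += 1
    return received


def demo_loopback(count=3):
    """Send count dummy frames over loopback UDP and buffer them."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    receiver.settimeout(1.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for i in range(count):
            sender.sendto(b"frame-%d" % i, receiver.getsockname())
        frames = queue.Queue()
        buffer_packets(receiver, frames, count)
        return [frames.get_nowait() for _ in range(frames.qsize())]
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(demo_loopback())
```

In the prototype the producer and consumer are separate processes, so the in-process queue here would be replaced by a queue in shared memory.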
The video decoding process reads the "raw" packets sent to it by the video buffering process, and decodes them according to the MPEG-1 video specification. The decoded frames are then sent to the GUI/Video process, which displays them.
The GUI/Video process is responsible for the following two tasks:
It provides a GUI to the user, where the user can select commands like "Play", "Stop", "Rewind" etc. These commands are then sent to the Control/Audio process via local inter-process pipes.
This component is responsible for displaying video frames to the user. The decoded video frames are available in a shared memory queue.
The Control/Audio process is responsible for the following tasks:
This component receives control messages from the GUI process, and sends the appropriate CORBA requests to the server.
The audio playback component is responsible for dequeuing audio packets from the audio buffering process, and playing them back using the multimedia sound hardware. Note that since the server uses the ULAW format, decoding is not necessary, and the data received can be written directly to the sound port, which is /dev/audio on Solaris machines.
Each server will export an offer containing server properties and a
reference to its MMDevice to a Trading Service
instance. The server will advertise three types of properties: device
configuration -- audio/video formats; server performance -- load, disk
usage, bandwidth available; and movie selection -- the contents of the
movie directory. Properties that change frequently, such as server
performance and movie availability, will be registered with the
Trading Service as dynamic properties. The server will implement
dynamic property callback interfaces to handle their evaluation.
The client will discover and bind to a server whose properties best match its device configuration, that has a movie the user finds worth viewing, and that offers the best possible performance. A GUI will prompt the user for the client's configuration, and allow the user to choose from movies available on the servers that match the client's configuration. After the user selects a movie, the GUI will allow the user to select the server that has the best performance from among those that offer the movie. Hence best-matched server selection occurs in three phases:
1) The client first issues a query asking for the movie listings of all servers that deliver video and audio in a format the client understands and that can accept more connections.
2) The user then selects a movie from a list compiled from the results of the first query. In a second query, the client requests the performance characteristics of those servers that both match the client's configuration and are showing the selected movie.
3) The trader will store the performance characteristics (load,
CPU usage, disk access, network traffic, context switches, and
so on) as callback interfaces to the server (dynamic
properties). The client can attach these interfaces to DOVE as
suppliers to be polled periodically, so that the client can
display charts of each server's performance, updated
periodically by callbacks to those interfaces. Based on
the charts, the user then selects the server with the most suitable
performance, and the client opens an audio/video
connection to that server using the MMDevice
object reference it retrieved from the trader.
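The three phases can be sketched with plain data structures. Everything below is hypothetical: the offer dictionaries, server names, movie names, and load figures are invented for illustration, and no Trading Service API is used. Dynamic properties are modelled as callables that are evaluated on demand, and `min` stands in for the user's choice from the performance charts.

```python
# Invented offers; "load" is a callable to mimic a dynamic property.
offers = [
    {"server": "A", "video": "MPEG-1", "audio": "ULAW", "accepting": True,
     "movies": ["intro", "lecture"], "load": lambda: 0.7},
    {"server": "B", "video": "MPEG-1", "audio": "ULAW", "accepting": True,
     "movies": ["lecture"], "load": lambda: 0.2},
    {"server": "C", "video": "MPEG-1", "audio": "ULAW", "accepting": False,
     "movies": ["lecture", "demo"], "load": lambda: 0.1},
    {"server": "D", "video": "H.261", "audio": "ULAW", "accepting": True,
     "movies": ["demo"], "load": lambda: 0.3},
]


def movie_listings(all_offers, video, audio):
    """Phase 1: movies on servers that match the formats and accept clients."""
    matching = [o for o in all_offers
                if o["video"] == video and o["audio"] == audio
                and o["accepting"]]
    movies = sorted({m for o in matching for m in o["movies"]})
    return movies, matching


def servers_showing(matching, movie):
    """Phase 2: narrow the matching servers to those offering the movie."""
    return [o for o in matching if movie in o["movies"]]


def least_loaded(candidates):
    """Phase 3: evaluate the dynamic property and pick the lowest load."""
    return min(candidates, key=lambda o: o["load"]())


movies, matching = movie_listings(offers, "MPEG-1", "ULAW")
chosen = least_loaded(servers_showing(matching, "lecture"))
```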
The A/V server exports a sequence of movie information structures to the Trading Service, each containing, for example, a movie name, associated file name, frame rate, duration, and frame size. Each structure will also contain a URL pointing to a web page about the movie. The server selector, written in Java and communicating with the Trading Service using JavaIDL, compiles the movie names for each server into a list in the GUI. When the user selects an item in the list, the Java client generates an HTML page for the movie and instructs a running WWW browser process to open the file. The generated page lists in a sidebar frame all the technical information about the movie (duration, frame rate, frame size) as advertised by each compliant server that offers the movie, and displays the movie web page in the main frame.
Once the user has decided on a server and movie, the Java server selector passes the server IOR and the movie name to the A/V client controller process.
Support for flows will allow multiple flows within a single stream. An
instruction to start or stop a stream can be applied to
the stream as a whole or to any subset of its flows. A
StreamEndpoint will contain a
FlowEndPoint object for each flow within the
stream.
For example, a videophone will have an audio flow and a video
flow in the phone stream.
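A minimal sketch of the intended behaviour is shown below, using the videophone example. The Python classes only mirror the names StreamEndpoint and FlowEndPoint; their methods and the convention that `None` means "all flows" are assumptions made for this illustration, not the planned interface.

```python
class FlowEndPoint:
    """One flow (for example audio or video) within a stream."""

    def __init__(self, name):
        self.name = name
        self.running = False


class StreamEndpoint:
    """Holds one FlowEndPoint per flow and applies start/stop to a subset."""

    def __init__(self, flow_names):
        self.flows = {name: FlowEndPoint(name) for name in flow_names}

    def _select(self, flow_names):
        # None selects every flow; an unknown name raises KeyError.
        names = list(self.flows) if flow_names is None else flow_names
        return [self.flows[name] for name in names]

    def start(self, flow_names=None):
        for flow in self._select(flow_names):
            flow.running = True

    def stop(self, flow_names=None):
        for flow in self._select(flow_names):
            flow.running = False

    def running_flows(self):
        return sorted(n for n, f in self.flows.items() if f.running)


phone = StreamEndpoint(["audio", "video"])
phone.start()            # start the stream as a whole
phone.stop(["video"])    # stop only the video flow
```

After these calls only the audio flow is still running, which is the "voice only" case of the videophone.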
Stream data can be transported using a number of protocols. These protocols include reliable connection-oriented transports (like IIOP/TCP), unreliable datagram-oriented Internet transports (like UDP), and real-time inter-ORB protocols (e.g., RIOP) over high-speed interconnects such as ATM and Fiber Channel. To allow a stream to support multiple protocols, the OMG spec specifies the Simple Frame Protocol (SFP).
We have developed the basic encoding classes required to encode and decode SFP frames. Future versions of the framework will provide the ability to send and receive frames using SFP.
Add support for interpreting QoS specifications in the A/V service. This will involve methods to translate application-level QoS to network-level QoS. For example, if an application wants to stream an MPEG-1 file, the request can be translated to a network-level bandwidth requirement of at least 1.5 Mbps.
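One possible shape for such a translation is sketched below. The table and function names are hypothetical; the only figure taken from the text is the 1.5 Mbps requirement for MPEG-1. The second function assumes the application states its QoS as a frame rate and an average frame size, which is one plausible application-level form.

```python
import math

# Application-level format -> minimum network bandwidth in bits per second.
# Only the MPEG-1 entry comes from the text; other formats are not modelled.
KNOWN_FORMAT_BANDWIDTH_BPS = {"MPEG-1": 1_500_000}


def bandwidth_for_format(media_format: str) -> int:
    """Look up the minimum bandwidth for a known media format."""
    try:
        return KNOWN_FORMAT_BANDWIDTH_BPS[media_format]
    except KeyError:
        raise ValueError(f"no QoS mapping for {media_format!r}") from None


def bandwidth_for_frames(frames_per_sec: float, avg_frame_bytes: int) -> int:
    """Derive bits per second from a frame rate and an average frame size."""
    return math.ceil(frames_per_sec * avg_frame_bytes * 8)


mpeg1_bps = bandwidth_for_format("MPEG-1")
derived_bps = bandwidth_for_frames(30, 6250)  # 30 * 6250 * 8 bits per second
```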