Crowne Plaza Hotel, Seattle, USA
The development of MBone videoconferencing tools has led to a large number of seminars being broadcast over the Internet. Most MBone seminars have little remote participation. One reason might be the absence of good floor control mechanisms. This paper describes a tool we developed, called the Questionboard, that implements floor control to facilitate question asking in large-scale MBone seminars. The qb protocol is described including the mechanisms used for reliable transmission of data and the floor control and crash recovery protocols.
Floor control, reliable multicast, large-scale, MBone
The MBone is a virtual network deployed over the Internet that supports routing of IP multicast packets [D91]. The deployment of the MBone and the development of MBone conferencing tools, such as vic [MJ95], vat [JM], ivs [TT94], sd [J], and wb [M92], have led to a wide variety of events like seminars and conferences being multicast over the Internet. Since early 1995, we have multicast the Berkeley Multimedia and Graphics Seminar, a regularly scheduled weekly seminar, over the open Internet [R95]. A speaker is invited to talk about her area of research. The presentation is multicast over the MBone. We produce audio and video streams and use wb to display slides. We run a second wb session, called the control whiteboard, as a back channel to fix transmission errors and allow MBone participants to ask questions. The presentation is attended by Berkeley faculty and students, giving us a local audience in addition to the remote MBone audience. Local attendance ranges between 15 and 50 people. Remote attendance ranges between 10 and 200 people.
Our experience with this seminar series is that while many questions are asked by the local audience, few remote participants ask questions. One possible reason is the burden of setting up microphones and cameras to allow the remote participant to transmit his question. Another possible reason is the absence of a floor control mechanism. This problem is especially important for seminars with several hundred remote participants. Currently, the only way for a member of the MBone audience to ask a question is to speak into his microphone, which often interrupts the speaker, or to type the question on the second whiteboard. Neither method is natural. With the first method, a remote participant does not have enough feedback to determine when to ask a question, which is inhibiting. With the second method, our experience is that the control whiteboard is mainly used for technical feedback, such as reports of problems receiving MBone packets or audio of acceptable quality.
The difference in the frequency of interaction between local and remote audiences motivated our attempt to replicate the local environment for remote audiences. Local audience members express their desire to ask a question by raising their hand. The moderator "manages the floor" by nodding or pointing at members to "grant them the floor". We developed a distributed version of this social protocol in the Questionboard by allowing remote participants to indicate their desire to ask a question and enabling the moderator to control the floor.
The Questionboard tool (qb) encourages question asking by returning control to the moderator to recognize the person who wants to ask a question. Remote participants indicate a desire to ask a question by entering a text message into an interface during a seminar. The message can be a request for the floor to ask the question or the text of the question itself. Questions are displayed at all participating sites. The moderator can either read the question and respond verbally or grant the floor to a remote participant by selecting a question from the displayed list. The latter action pops up an invitation to speak at the corresponding participant's site. All sites are updated to display the current floor.
This paper describes the design and implementation of qb. The rest of the paper is organized as follows. Section 2 describes related work. Section 3 discusses the design considerations and describes the functionality of qb from a user's perspective. Section 4 describes the qb protocol including the factors affecting its design. Section 5 describes our experience using this tool, Section 6 discusses our work and suggests future work, and Section 7 concludes the paper.
Many research groups are working on conference control protocols. However, we are unaware of anyone addressing the specific issue of remote question asking in large-scale loosely-coupled MBone sessions. The conference control channel protocol (CCCP) is designed to control conferences ranging from small tightly-coupled meetings to large loosely-coupled ones [HWC95]. The protocol addresses many aspects of conference control including application control, floor control, and membership control. The protocol supports a wide range of policies and reliability semantics. The reliability semantics provided are none, one, n-out-of-m and all, where none implies the message is just sent and forgotten and all requires an explicitly named list of participants. The protocol requires each tool to register for all relevant messages, thus requiring changes in existing media tools. Our goals were less ambitious. We wanted to develop a floor control tool that could work without changing the way existing tools worked and could be ignored if a participant did not want floor control. Also, since we were targeting large-scale seminars, we wanted all participants to get the data without requiring tight coupling. Using the CCCP model, we wanted a best-effort-all service, which did not fit into any of the above scenarios.
The simple conference control protocol (SCCP) [BO96] is designed for tightly-coupled conferences with a designated moderator who must give permission for new members to join. In addition to floor control, the protocol addresses media management, management of the set of members, and assignment of a special moderator role to one participant. The protocol assumes consistent state between participants but does not address the issue of how this state is kept consistent in the face of unreliable transport. It assumes that the underlying multicast transport provides reliable consistent delivery of data with globally ordered messages. We needed a protocol that would allow us to do floor control in the presence of lost packets in the underlying multicast transport. Thus it had to address the issues raised by unreliable multicast. In addition, we believe the scalability requirements of large-scale seminars require a protocol for loosely-coupled conferences rather than tightly-coupled ones, where the exact group membership may not be known and the state at each participant may not be consistent.
The ITU T.120 family of protocols addresses the problem of a conference control architecture [ITU-T]. The design of these protocols requires reliable communication, and they do not use multicast. Thus, they are centralized protocols that do not scale; consequently they are inappropriate for the Internet MBone.
Intel's Proshare Presenter has a facility somewhat similar to the Questionboard, but the design is proprietary and it will not work with other MBone tools [Mc96].
This section describes the design principles underlying qb and the functionality that qb provides to a user.
Our goal was to encourage remote participation by providing floor control and mechanisms for asking questions in large-scale loosely-coupled MBone seminars. The MBone currently uses a lightweight session model where sites join and leave conferences independently and each site sends data whenever it wishes to. We wanted to implement floor control for lightweight sessions, which led to the following design principles:
· The design must be scalable and must work well for loosely-coupled sessions, where the exact group membership is not known. Hence each receiver must be responsible for its own reception of messages.
· The design must be independent of and work with existing MBone media tools (e.g., vat and vic).
· The tool should separate policy and mechanism to allow implementation of different floor control policies.
· Each participant should have independent control over the local floor control policy at his site and should be able to disable floor control for his machine if so desired.
We implemented an independent floor control tool that runs at each participating site. These floor control tools use IP multicast to exchange floor control messages. Locally, at each site, we multicast the floor control message on the conference bus as explained later, allowing all the media tools to listen in and independently act on any relevant floor control message. Using global multicast between all floor control processes and local multicast between tools at the local site, we developed qb to implement a specific floor control policy which is an explicitly chaired conference where a chair controls the granting of the floor between participating members.
This section describes the moderator and participant interfaces to qb. The moderator interface allows the speaker or another person acting on behalf of the speaker (i.e. the moderator) to view requests to ask questions and either answer the question or grant the floor to a remote participant to ask the question in real time. The participant interface allows a participant to view questions and enter a question or a request to ask a question. The two interfaces are very similar. The primary differences are that some information displayed to a moderator may not be visible to other participants and the moderator can control the floor.
Fig. 1: Participant Interface to qb
Fig. 2: A question
Qb supports different floors for different media to allow the moderator more fine-grained control over the session. Hence, the audio and video floors need not be the same. The name of the person who holds the current audio floor is displayed near the bottom of the main panel. Clicking on the name pops up a window that shows more information about the current audio and video floor holders.
When a participant wants to ask a question, he pushes the "Enter Qs" button, which brings up the dialog box shown in Figure 3, where a question can be entered. The participant can specify the media (audio, video) he can support for asking questions and whether the question is to be public, private or anonymous. Public questions are broadcast to all participating sites. Private questions are sent only to the moderator. Anonymous questions do not display the participant's name. They are sent to the moderator who can then decide whether to ignore them, display them locally, or send them out to the whole group.
Fig. 3: The panel for entering a question
Figure 4 shows the moderator interface. The main difference from the participant interface is the inclusion of a panel to display follow-up requests. After a speaker responds to a question, participants may press the "Req To Speak" button to request the floor to ask follow-up questions. This action causes the request to appear on the moderator's front panel in the form of a button with the participant's name on it. The moderator may now grant the floor to the participant, if desired, by clicking on the corresponding button.
Fig. 4: The Moderator Interface
Fig. 5: The Moderator's Control Panel
Fig. 6: Operation with/without a video production switcher
The moderator control panel also displays current information about the speaker. This information can be changed by clicking the "Change Speaker Info" button, which brings up a menu allowing the moderator to enter information about the speaker. This feature is useful when the speaker is different from the moderator. One scenario where that might be the case is if the speaker gives a presentation remotely, and the moderator needs to moderate both local and remote participants. When a participant is granted the floor, an invitation pops up on his screen inviting him to speak, and the current floor holder is reflected to all participant qbs. A participant may configure his environment to set the desired local floor control policy to be followed at his site (Fig. 7).
Fig. 7. User Configuration Panel
This section describes the qb protocol. There are two major differences between qb and other MBone tools that affected the design of the qb protocol:
· Unlike other MBone tools where all sites are equal, qb has an inherent inequality between sites. The moderator has more authority than all other participants.
· Unlike the audio and video tools where data is delivered continuously, data transmitted between qb processes is discrete. Thus, while missing audio and video data packets are just dropped, qb data packets cannot be arbitrarily dropped.
Floor control messages have the additional requirements of time criticality and authenticity. A message granting the floor must be received on time to be useful. Contrast this requirement with wb, where late and out-of-order messages still are applicable. Moreover, qb must ensure that floor grant messages are authentic to prevent malicious users from sending spurious floor grant messages and disrupting the session.
The remainder of this section describes the basic protocol, the mechanisms used for reliable transmission of data, and the crash recovery protocols. The protocol is described in 3 subsections: The first addresses the transmission of questions, the second deals with floor control and the third describes the data representation.
The qb tool uses two ports: the moderator port to which all requests to the moderator are addressed and the participant port at which all participants listen for data. The moderator initially grants the floor to the speaker. When a participant wants to ask a question, he enters it in the interface provided. This question is then multicast to the moderator. The moderator qb assigns this question a global id, and multicasts it to the group. When the participant receives the question as part of this multicast, this message acts as an acknowledgment that the moderator received the question. If the participant qb does not receive this acknowledgment, the question is resent. Private and anonymous questions that are not retransmitted to the group are explicitly acknowledged. All participants periodically send out a session report containing their name, CNAME and other information. This report is used to estimate group membership.
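The exchange described above can be illustrated with a small Python sketch. It is not the qb implementation; the class names, the per-participant local id, and the rule that a resent question keeps the global id it was first given are assumptions made for the example. Network transport is omitted, so "multicast" is modelled as passing return values between objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Moderator:
    """Assigns global ids to public questions (hypothetical structure)."""
    next_id: int = 1
    seen: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def receive(self, cname: str, local_id: int, text: str) -> dict:
        # Assumption: a resent question keeps the global id it first received.
        key = (cname, local_id)
        if key not in self.seen:
            self.seen[key] = self.next_id
            self.next_id += 1
        # The returned message stands for the multicast to the whole group.
        return {"global_id": self.seen[key], "cname": cname,
                "local_id": local_id, "text": text}


@dataclass
class Participant:
    """Keeps questions pending until they come back in the group multicast."""
    cname: str
    next_local_id: int = 1
    pending: Dict[int, str] = field(default_factory=dict)

    def ask(self, text: str) -> Tuple[str, int, str]:
        local_id = self.next_local_id
        self.next_local_id += 1
        self.pending[local_id] = text
        return (self.cname, local_id, text)

    def on_group_message(self, msg: dict) -> None:
        # Seeing our own question in the group multicast is the implicit ack.
        if msg["cname"] == self.cname:
            self.pending.pop(msg["local_id"], None)

    def to_resend(self) -> List[Tuple[str, int, str]]:
        # Questions still unacknowledged when the resend timer fires.
        return [(self.cname, i, t) for i, t in sorted(self.pending.items())]
```

In this sketch a question stays in `pending` until the participant sees it echoed by the moderator, so a lost request or a lost echo both lead to a resend.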
Fig. 8: An MBone conference using vat and qb
Having decided on the importance of reliable communication between the moderator and a questioner, another alternative considered was to use a unicast protocol between them. We decided against mixing protocols because several sites on the MBone are behind firewalls which have been configured to pass only class D addresses [4]. Also, using multicast makes the protocol more flexible, allowing future extensions to multiple moderators as well as dynamically changing moderators. This capability might be needed for other floor control policies. For example, one way to implement meeting room semantics is to implement a token passing scheme with dynamically changing moderators. The speaker at any instant is also the moderator. The speaker chooses the next person to speak and hands off control of the floor to the new speaker/moderator. All requests to speak are now sent to this new speaker. Sending requests to a fixed multicast address makes this simpler to implement than having to establish a new unicast connection each time the speaker changes.
Private questions have their own namespace and are not multicast to the group. Anonymous questions may or may not be multicast depending on the moderator's set preference. If they are multicast to the group, they are part of the global sequence space.
The moderator periodically multicasts the sequence number of the last question seen to facilitate detection of missing last questions, as explained below in Section 4.2.1.
A latecomer to the conference waits till it receives a heartbeat packet, as explained in section 4.2.1, and multicasts a send_all request to the group, if required. Any member can respond with the requested data, using the above random time-out scheme to prevent multiple requests and replies.
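As a rough illustration of how a receiver might use the sequence number carried in a heartbeat, the following Python sketch decides what repair to request. It assumes that global question ids start at 1 and are contiguous, which the text does not state; the function name and the "resend" label are hypothetical, and only send_all is taken from the protocol description.

```python
from typing import Iterable, List, Optional, Tuple


def plan_repair(received_ids: Iterable[int],
                last_seq: int) -> Optional[Tuple[str, List[int]]]:
    """Decide what to request after a heartbeat announcing last_seq.

    Assumption: global ids are the contiguous integers 1..last_seq.
    """
    have = set(received_ids)
    missing = [i for i in range(1, last_seq + 1) if i not in have]
    if not missing:
        return None                # nothing to repair
    if not have:
        return ("send_all", [])    # latecomer: ask for everything
    return ("resend", missing)     # hypothetical request for specific ids
```

A receiver that has questions 1, 2 and 4 and hears a heartbeat with last sequence number 5 would, under these assumptions, ask for questions 3 and 5.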
There are several levels of protocols that could have been used to implement moderator recovery. We chose a simple one: When a moderator qb restarts, it too issues a send-all request. However, the updating host may not have seen the last question and would have no way of knowing that. Hence the moderator qb stores the sequence number of the last question seen temporarily to disk each time it sends out a question and reads that information on restart.
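One way to store the last sequence number so that a crash during the write does not corrupt it is sketched below in Python. The file layout, the function names, and the use of an atomic rename are assumptions for illustration; the text only states that the number is written to disk when a question is sent and read back on restart.

```python
import os
import tempfile


def save_last_seq(path: str, seq: int) -> None:
    """Write seq to path via a temporary file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(f"{seq}\n")
        f.flush()
        os.fsync(f.fileno())   # make sure the bytes reach the disk
    os.replace(tmp, path)      # replaces any previous file in one step


def load_last_seq(path: str) -> int:
    """Return the stored sequence number, or 0 if none is available."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return 0
```

On restart, a moderator built this way would call `load_last_seq` before issuing its send-all request, so that it knows whether the replies include the last question it sent.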
Participant qbs detect a moderator restart based on timestamps and sequence numbers. On detecting a moderator restart, participant qbs resend private questions. Anonymous questions that have not been multicast to the group are lost. They are not resent as they were intentionally not multicast to the group by the moderator.
We enhance this facility by explicitly enforcing the floor control directives using the conference bus mechanism developed at LBL [MJ95]. The conference bus is a multicast IPC mechanism used to communicate between all tools at one site participating in a session. Each site has one conference bus per session. Tools communicate with each other by multicasting messages on the local bus. On receiving the grant-floor directive from the moderator, the local qb multicasts this information on the conference bus and the other tools in the session take action accordingly. For example, vat can mute all participants except the one granted the floor, and vic can open a window with the video stream from the person granted the floor if it exists. We have extended the conference bus mechanism to differentiate between audio and video floors. This change to the conference bus protocol makes the mechanism flexible, allowing, for instance, a video production switcher to be specified as the video floor instead of a participant. Consequently, instead of "focus(CNAME)" messages, we send "audio_floor(CNAME)" and "video_floor(CNAME)" messages. In addition, we also introduce the "all" option (audio_floor(all), video_floor(all)) to disable explicit floor control.
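The message forms named above can be illustrated with a short Python sketch that formats and parses them and decides whether a media tool should play a given source. The helper names and the exact parsing rules are assumptions; the actual conference bus wire format is not specified here beyond the message names.

```python
import re
from typing import Optional, Tuple

# Matches "audio_floor(X)" or "video_floor(X)"; other messages are ignored.
_FLOOR_RE = re.compile(r"^(audio_floor|video_floor)\((.+)\)$")


def format_floor(medium: str, holder: str) -> str:
    """Build a floor message; holder is a CNAME or the literal 'all'."""
    if medium not in ("audio", "video"):
        raise ValueError("medium must be 'audio' or 'video'")
    return f"{medium}_floor({holder})"


def parse_floor(message: str) -> Optional[Tuple[str, str]]:
    """Return (medium, holder), or None if this is not a floor message."""
    m = _FLOOR_RE.match(message.strip())
    if m is None:
        return None
    return m.group(1)[:-len("_floor")], m.group(2)


def should_play(holder: str, source_cname: str) -> bool:
    """'all' disables explicit floor control; otherwise only the holder plays."""
    return holder == "all" or holder == source_cname
```

A tool such as an audio receiver could call `should_play` for each incoming source after parsing the most recent audio floor message on the local bus.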
This section describes the mechanisms used to ensure smooth operation of the seminar in the face of lost, out-of-order, and late directives. Because the timing for floor control directives is critical, we cannot use a request/response method, similar to the method used to propagate questions, to handle lost grant-floor directives. Also, recall that participants must ensure that grant-floor directives are authentic. In the absence of encryption, participants must receive the floor grant information only from the moderator and not from a peer. Since the only way to detect a missing directive is by getting a newer message, the relevant floor control state is included in each message. Thus, we handle floor grant information as follows.
A heartbeat packet is sent out periodically containing critical state information including the floor control information and the sequence number of the last question seen. To facilitate quick repair of lost messages, we use a variable heartbeat scheme [HSC95], in which heartbeat transmissions are clustered in the interval immediately following a data transmission rather than spreading them out evenly across the idle period between transmissions. The interval between two heartbeat packets varies from imin to imax. After sending a data packet, the interval is reset to imin. Thereafter, each time a heartbeat packet is sent, the interval is doubled till it reaches imax. The clustering of heartbeat packets around a data packet should lead to faster detection of lost data packets while keeping the average bandwidth usage low. Putting the data itself in the heartbeat packet leads to faster recovery from lost messages.
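The interval schedule can be sketched in a few lines of Python. The class and method names are hypothetical, and the values of imin and imax used in any real deployment are not given in the text; only the reset-and-double rule is taken from the description above.

```python
class HeartbeatTimer:
    """Variable heartbeat: reset to imin after data, double up to imax."""

    def __init__(self, imin: float, imax: float) -> None:
        if not 0 < imin <= imax:
            raise ValueError("require 0 < imin <= imax")
        self.imin = imin
        self.imax = imax
        self.interval = imin

    def on_data_sent(self) -> None:
        # A data packet restarts the cluster of closely spaced heartbeats.
        self.interval = self.imin

    def next_wait(self) -> float:
        """Return the delay before the next heartbeat, then double it."""
        wait = self.interval
        self.interval = min(self.interval * 2, self.imax)
        return wait
```

With imin = 0.25 and imax = 2.0 (illustrative values in seconds), successive calls to `next_wait` give 0.25, 0.5, 1.0, 2.0, 2.0, and so on until the next data packet resets the interval.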
If, in spite of the heartbeat protocol, a participant still loses floor control information, he has the option of manually entering the audio and video floors, or turning off floor control.
The variable heartbeat scheme also enables the participant granted the floor to reliably receive the directive. A time-out/retransmission scheme to reliably transmit the grant-floor directive was rejected because this approach involves waiting for at least round trip time (RTT) before retransmitting. Instead, putting the last grant-floor directive in the heartbeat packet enables a faster response.
Timestamps are used to enable out-of-order directives to be ignored.
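A receiver-side check along these lines might look like the following Python sketch. The field names are hypothetical, and comparing the sender's name against the known moderator is only a stand-in for the rule that floor information is accepted from the moderator and not from a peer; it is not an authentication mechanism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FloorState:
    """Latest floor information accepted at one participant."""
    moderator: str
    timestamp: float = float("-inf")
    audio_floor: Optional[str] = None
    video_floor: Optional[str] = None

    def apply(self, sender: str, timestamp: float,
              audio_floor: str, video_floor: str) -> bool:
        """Apply a directive; return False if it is ignored."""
        if sender != self.moderator:
            return False           # only the moderator may grant the floor
        if timestamp <= self.timestamp:
            return False           # late or duplicate directive
        self.timestamp = timestamp
        self.audio_floor = audio_floor
        self.video_floor = video_floor
        return True
```

Because every heartbeat carries the full floor state, applying only the newest directive is enough in this sketch; an older directive that arrives late is simply dropped.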
Fig. 9: The qb Message Format
With each session the number of remote questions we received increased. In the last session we got as many remote questions as local. Most remote questions received were from outside Berkeley from places ranging from Los Angeles to Norway. We received both text and audio questions. Some participants preferred using text even though they had a microphone connected because they wanted to ensure the question was phrased correctly.
Even with the tool, we found some remote participants hesitant to ask questions. This points to the need to create a social environment more conducive to asking questions. The speaker needs to explicitly invite questions from remote participants. One problem is the lack of visual feedback from the MBone participants to the speaker. With a local audience, the speaker can see when a person internally debates whether to ask a question (e.g. starts to raise his hand or has a quizzical look on his face) and can encourage him to ask it. This visual feedback is very subtle and is not available from the MBone participants.
Another problem is that some participants seemed to be inhibited by the fact that there were several unknown persons listening in over the MBone even though they had no problem asking questions when physically present in the room. Since those same unknown persons were still listening, it seemed like being in the room de-emphasized the MBone audience for those participants.
Several participants seemed to prefer having the audio cue of the moderator speaking out their name in addition to the invitation popping up on their screen. The audio cue seemed to help participants synchronize the exact moment for starting to speak.
In terms of qb itself, the main feedback we got was that users wanted the ability to retract questions. The exact details of the tool and its features seemed less important than the fact that it existed and allowed participants to ask questions, whether textual or audible, in a fairly intuitive way.
The qb tool is implemented using Tcl/Tk for the user interface and C++ objects for the low-level communication and timer functionality. The session reporting protocol has not yet been implemented. Hence, the menus for explicitly setting the floors require the CNAME to be entered rather than allowing the user to select a name from a displayed list of participants.
One disadvantage of using CNAMEs to identify the floor is that it does not differentiate between multiple streams being generated by the same source. For example a speaker may have 4 video cameras generating 4 different streams, each associated with the same audio stream. A mechanism is required to identify each stream to enable the speaker to associate at a particular time any one of the 4 video streams with the audio stream so that granting the floor to the speaker would automatically cause the chosen stream to be displayed. Since the RTP ssrc is not persistent, some other persistent means of identification is required. One possible approach might be to generate a "view" identifier with each video stream and use that in conjunction with the CNAME to identify the stream. Another approach might be to have different CNAMEs for each video stream but then we have the problem of associating the 5 CNAMEs (4 video and 1 audio) together.
Currently the qb tools use ASCII messages to communicate with each other for human readability and easy extensibility. It might be better to use an RTP-like data format since this would allow an RTP recording tool to record the floor control data along with the other media streams.
The floor control policy implemented is that of an explicitly chaired session. However, the mechanism used can easily be extended to other floor control policies such as a meeting room.
Early experimentation suggests that the tool does encourage remote participation. However some remote participants are still reluctant to ask questions without specific invitation from the speaker. This points to a need to develop a culture for asking questions over the MBone. We intend to continue using qb in future MBone seminars and studying its effect on remote participation.
We would like to thank Steve McCanne and the anonymous reviewers for their helpful comments on improving the quality of the paper. Thanks to Todd and our colleagues at Berkeley, specifically Andrew Swan and Dave Simpson, who helped design and test qb. We would also like to thank early testers on the Internet.
1. This material is based upon work supported under a National Science Foundation Graduate Research Fellowship and grants from the State of California MICRO Program with matching support from Fujitsu Laboratories, Intel Corporation, Mitsubishi Electric Research Laboratories and Philips Research.
2. This author's current address is radhika@hpl.hp.com
3. Private questions are not displayed at participant qbs. However, they cannot be guaranteed privacy, just as anonymous questions are not guaranteed anonymity, since anyone can snoop on an unencrypted multicast packet. Having 2 ports just reduces unnecessary processing at participant qbs.
4. Class D addresses are addresses in the range 224.0.0.0 to 239.255.255.255 which are set aside for IP multicast.
[D91] Deering,S., "Multicast Routing in a Datagram Network", Ph.D. thesis, Stanford University, Palo Alto, California, Dec.1991.
[DG97] Dommel,H.-P., and Garcia-Luna-Aceves,J.J., "Floor Control for Multimedia Conferencing and Collaboration", Multimedia Systems (ACM/Springer), Vol. 5, No. 1, January 1997.
[FJLMZ95] Floyd,S., Jacobson,V., Liu,C., McCanne,S., and Zhang,L., "A reliable multicast framework for light-weight sessions and application-level framing", Proceedings of SIGCOMM 1995, Boston, MA, August 1995.
[HSC95] Holbrook,H., Singhal,S. and Cheriton,D., "Log-Based Receiver-Reliable Multicast for Distributed Interactive Simulation", Proceedings of ACM SIGCOMM 95, August 1995.
[HWC95] Handley,M., Wakeman,I., and Crowcroft,J., "The Conference Control Channel Protocol (CCCP): A Scalable Base for Building Conference Control Applications", ACM SIGCOMM 95, New York, August 1995.
[HCB96] Handley,M., Crowcroft,J., Bormann,C., "The Internet Multimedia Conferencing Architecture", Internet Draft draft-ietf-mmusic-confarch-00.txt, Work in Progress, Feb 1996.
[ITU-T] T.120 Standards for Audiographic Teleconferencing
[J] Jacobson,V. sd, UNIX Manual Pages, Lawrence Berkeley Laboratory, Berkeley, Ca.
[JM] Jacobson,V. and McCanne,S. vat, UNIX Manual Pages, Lawrence Berkeley Laboratory, Berkeley, Ca.
[J94] Jacobson,V., "Multimedia Conferencing on the Internet", SIGCOMM 1994 Tutorial Notes, August 1994.
[KHSC95] Kirstein,P., Handley,M., Sasse,A., Clayman,S., "Recent Activities in the MICE Conferencing Project", Proceedings INET 95.
[M92] McCanne,S., "A Distributed Whiteboard for Network Conferencing", May 1992, UC Berkeley CS 268 Computer Networks term project.
[Mc96] McGeady,S. personal communication 1996.
[MJ95] McCanne,S., Jacobson,V., "vic: A Flexible Framework for Packet Video", ACM Multimedia 1995.
[R95] Rowe,L., Berkeley Multimedia and Graphics Seminar, url: http://www.bmrc.berkeley.edu/298.
[SBST94] Sasse,M., Bilting,U., Schulz,C., Turletti,T., "Remote Seminars through Multimedia Conferencing: Experiences from the MICE project", Proc. INET 1994.
[SCFJ96] Schulzrinne,H., Casner,S., Frederick,R., and Jacobson,V., "RTP: A Transport Protocol for Real-Time Applications", RFC 1889.
[SWS] Shenker,S., Weinrib,A., Schooler,E. "Managing Shared Ephemeral Teleconferencing State: Policy and Mechanism".
[TT94] Turletti,T., "The INRIA Videoconferencing System (IVS)", ConneXions - The Interoperability Report Journal, Vol. 8, No. 10, Oct. 1994, pp. 20-24. Also see http://www.inria.fr/rodeo/ivs.html