The Portal is a high-quality audiovisual communication laboratory used for the international Master's programme in Music, Communication and Technology.
The main goal of the MCT Portal is to explore, challenge and develop technological solutions for the transmission of audio and video signals over regular network connections, with (a) low latency, (b) high speed, and (c) high quality. The lab will have three main use areas: (1) to be connected with similar labs and be a cornerstone of the international Master’s programme in Music, Communication and Technology; (2) to be central in the research collaboration between the music technology groups at UiO and beyond, as well as with national and international partners; and (3) to be a common UiO test lab for the most advanced audiovisual technologies available.
Most of today’s audiovisual communication solutions focus on simplicity and flexibility. This is necessary in a mass market in which the most important goal is to connect people using a large number of different devices (mobile phones, tablets, laptops, etc.). Such solutions may work well for basic speech communication, but they do not provide the type of nuances experienced in communication between people who are close together in the same space. They are also far from satisfactory for music performance and perception.
Upcoming solutions in virtual and augmented reality (VR/AR) focus on expanding the traditional two-dimensional communication experience into three dimensions. There are certainly many exciting possibilities in such systems, but we also think that “traditional” screen/speaker-based solutions have greater potential than current commercial technologies allow for. We are particularly thinking of the limitations posed by signal speed and latency, two factors that are critical for human cognition.
Another limitation in most of the current commercial communication solutions is the prioritization of the visual domain, with little attention to the other modalities. As music researchers, we particularly see that the auditory domain is not used to its full potential in terms of speed, resolution or spatial attributes. The main research topics to be addressed in the MCT Portal are therefore:
- How can audio be used more effectively in network-based audiovisual communication?
- How can lower latencies and faster speeds improve the experience of the communication?
- How can other modalities be used to create more multimodal communication?
- How can better network-based communication solutions be used in education, both on-campus and distance-based teaching and learning?
While all of these questions are of a general nature, our particular entry point is that of musical communication. There are several reasons why music is a good research object for advanced communication studies. The performance and perception of music utilize the extremes of the human cognitive system when it comes to perceptual thresholds and action responses. Think of a jazz quartet, in which a saxophone player, a pianist, a bassist and a drummer are in constant interplay, trading phrases and aligning to a common pulse. The response times of the musicians are in the range of only a few milliseconds, and the audiovisual communication between the musicians is central to the success of the performance. Now imagine that two of the musicians are in Oslo and the other two are in Trondheim. Are they able to play together over a regular network connection? What types of latencies can they tolerate, and how do these latencies impact the performance? And how is it possible to create good performer-audience communication in such a setup? These are among the things we are going to work on in the lab.
About the Setup
The MCT Portal is permanently connected to a similar lab in Trondheim. Such a permanent network-based communication setup and close collaboration is quite unique, and gives us the opportunity to test 24/7 connections in a large variety of real-life research and education activities. We currently have access to five parallel systems:
- LOLA: LOw LAtency audio visual streaming system
- TICO: TIny COdec
- NDI: Network Device Interface
- Dante: Digital Audio Network Through Ethernet
- Jacktrip: High-Quality Audio Network Performance over the Internet
The first three systems (LOLA, TICO and NDI) are video solutions with integrated audio transmission, while the last two (Dante and Jacktrip) are specialized in audio transmission. They all have different hardware requirements, different licensing solutions, and varying degrees of source-code access. Some are developed by research groups (LOLA, TICO and Jacktrip), while the two commercial solutions (NDI and Dante) are considered industry standards. Even though the systems have different hardware requirements, they all rely on normal network-based protocols. This allows for easy integration of various services, and quick routing through virtual mixers and servers.
The different systems have their pros and cons, but they all allow for much lower latencies and higher speeds than “normal” communication solutions (Skype, etc.). In our preliminary testing we have achieved audio communication between Oslo and Trondheim with latencies down to 20 milliseconds, and video communication at around 50 milliseconds. The rule of thumb when we develop interactive music systems is to aim for latencies of around 10 milliseconds, which is often considered the limit for discrimination of individual sonic events. In comparison, most of the “normal” communication solutions available operate with latencies well above 100 milliseconds. The latencies are often also uneven, causing jitter and glitches in the audio and/or video. Such high and uneven latencies do not work for musical communication, and we believe that they also hinder rich, nuanced communication in regular speech.
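One way to get an intuition for these numbers is to convert each latency into the distance sound would travel through air in the same time, that is, how far apart two musicians would have to stand in a room to experience the same delay. The sketch below is only an illustration: the function name is made up for this example, and the speed of sound is taken as an approximate 343 m/s (dry air at about 20 °C).

```python
# Approximate speed of sound in dry air at about 20 degrees Celsius (assumption).
SPEED_OF_SOUND_M_PER_S = 343.0


def equivalent_distance_m(latency_ms: float) -> float:
    """Distance in metres that sound travels through air during latency_ms."""
    return SPEED_OF_SOUND_M_PER_S * latency_ms / 1000.0


# Audio latency figures quoted in the text, in milliseconds.
latencies_ms = {
    "rule-of-thumb target": 10,
    "Oslo-Trondheim audio (preliminary)": 20,
    "lower bound for 'normal' solutions": 100,
}

for label, ms in latencies_ms.items():
    print(f"{label}: {ms} ms is roughly {equivalent_distance_m(ms):.1f} m of air")
```

Seen this way, the 10 millisecond target corresponds to musicians standing a few metres apart, while 100 milliseconds corresponds to a separation of tens of metres.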
The reason for equipping the MCT Portal with several different systems (LOLA, TICO, etc.) is to allow for systematic comparison of their qualities. On the technology side, this includes evaluations of the technical limitations, further development of the open-source systems, and work on integration between closed systems and various open standards. On the psychology side, we will carry out user testing on cross-modal synchronization (audio/video) as well as tolerance testing for different types of latencies. On the musicology side, we will explore how musicians are able to adapt to the latencies and projection setups (audio-only, video-only, audio-video), as well as the experience of performing and perceiving music through a network-based medium. Finally, since the lab will also be used in distance-based education, it will be important to develop pedagogical strategies and evaluate the technologies from an educational perspective.
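A systematic comparison of this kind needs some way of summarising latency measurements per system, including how uneven they are. The following is a minimal sketch, assuming that latency samples in milliseconds have already been collected by some measurement tool. The function name and the sample values are invented for illustration and are not measurements from the lab, and jitter is here simply taken as the standard deviation of the samples, which is only one of several possible definitions.

```python
from statistics import mean, pstdev


def summarise_latency(samples_ms):
    """Return mean latency, jitter (population standard deviation) and worst case."""
    return {
        "mean_ms": mean(samples_ms),
        "jitter_ms": pstdev(samples_ms),
        "max_ms": max(samples_ms),
    }


# Hypothetical samples in milliseconds, NOT real measurements.
steady_link = [20, 21, 20, 19, 20]
uneven_link = [20, 45, 15, 60, 10]

for name, samples in [("steady", steady_link), ("uneven", uneven_link)]:
    summary = summarise_latency(samples)
    print(
        f"{name}: mean {summary['mean_ms']:.1f} ms, "
        f"jitter {summary['jitter_ms']:.1f} ms, "
        f"max {summary['max_ms']} ms"
    )
```

Reporting jitter and worst-case latency alongside the mean matters because two connections with similar average latency can feel very different to performers if one of them is uneven.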