1. Introduction
ADM-OSC has been designed to solve real problems for live and broadcast sound producers. Since 2019, a growing workgroup of industry stakeholders from live music and broadcast domains has gathered to exchange needs and experiences from real-life production cases. These companies have already expressed interest or have implemented ADM-OSC:
1.1. Why ADM?
Immersive audio is gaining ground in many industries, from music streaming to gaming and from live sound to broadcast. ADM is becoming a popular standard metadata model in some of these industries, with Serial ADM (S-ADM) used in broadcast and ADM BWF or XML files used in the studio.
1.2. Why OSC?
- Lightweight network protocol
- Easy to implement
- Human readable
- Supports wildcards and bundles
- Specification: [OpenSoundControl.org]
- Available in a plethora of professional audio hardware and software devices
1.3. Motivation & goals
- To facilitate the sharing of audio object metadata between a live ecosystem and a broadcast or studio ecosystem.
- To define a basic layer of interoperability between Object Editors and Object Renderers, without aiming to replace more complete manufacturer-specific protocols or grammars.
- To define a direct translation of the most relevant ADM Object Properties onto OSC, a communication protocol widely used in the live industry.
- To keep the grammar scope aligned with the ADM properties.
- To share this proposal with the EBU so they can relay, publish and support this initiative.
- To extend this small grammar to more ADM properties (beds, etc.) in the future.
1.4. Project Originators
L-Acoustics, FLUX::SE, Radio-France
1.5. Project Contributors
Adamson, BBC, d&b audiotechnik, DiGiCo, Dolby, Lawo, Magix, Merging Technologies, Meyer Sound Laboratories, Steinberg
2. Current spec (v1.0)
2.1. Object position messages
Note: These messages take the form of /adm/obj/n..., where n signifies object number
address | type | units | min | max | default | description | example |
---|---|---|---|---|---|---|---|
/adm/obj/n/azim | f | degrees | -180.0 | 180.0 | | azimuth “theta - θ” of sound location § 4.4.1 Polar | /adm/obj/4/azim -22.5 |
/adm/obj/n/elev | f | degrees | -90.0 | 90.0 | | elevation “phi - ɸ” of sound location § 4.4.1 Polar | /adm/obj/4/elev 12.7 |
/adm/obj/n/dist | f | normalized | 0.0 | 1.0 | 1.0 | distance “r” from origin § 4.4.1 Polar | /adm/obj/4/dist 0.9 |
/adm/obj/n/aed | f f f | azimuth elevation distance | | | | synchronicity and reduced network traffic | /adm/obj/4/aed -22.5 12.7 0.9 |
/adm/obj/n/x | f | normalized | -1.0 | 1.0 | 0.0 | left/right § 4.4.2 Cartesian | /adm/obj/4/x -0.9 |
/adm/obj/n/y | f | normalized | -1.0 | 1.0 | 0.0 | front/back § 4.4.2 Cartesian | /adm/obj/4/y 0.15 |
/adm/obj/n/z | f | normalized | -1.0 | 1.0 | 0.0 | top/bottom § 4.4.2 Cartesian | /adm/obj/4/z 0.7 |
/adm/obj/n/xy | f f | see above | | | | synchronicity and reduced network traffic | /adm/obj/4/xy 0.62 -0.33 |
/adm/obj/n/xyz | f f f | see above | | | | synchronicity and reduced network traffic | /adm/obj/4/xyz -0.9 0.15 0.7 |
/adm/obj/n/w | f | normalized | 0.0 | 1.0 | 0.0 | horizontal extent in normalized units | /adm/obj/3/w 0.2 |
/adm/obj/n/gain | f | linear | 0.0 | 1.0 | | Apply a gain to the audio in the object. | /adm/obj/3/gain 0.707 |
/adm/obj/n/dRef | f | normalized | 0.0 | 1.0 | 1.0 | Distance where dimensionless rendering is replaced with physics-based rendering. § 4.4.4 Distance | /adm/obj/1/dRef 0.2 |
/adm/obj/n/dMax | f | meters | 0.0 | | | Distance signified by a normalized dRef value of 1 § 4.4.4 Distance | /adm/obj/1/dMax 21.3 |
/adm/obj/n/mute | i | integer | 0 | 1 | 0 | 1 means “true”, so muted | /adm/obj/2/mute 0 |
/adm/obj/n/name | s | string | 0 | 128 char | | object nice name | /adm/obj/1/name kickdrum |
Note: Type tags are defined as in the OSC 1.0 specification: i=int32, f=float32, s=OSC-string
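To make the type tags concrete, here is a minimal sketch of how one of the example messages could be encoded as an OSC 1.0 packet using only the Python standard library. The helper names `_pad` and `encode_osc` are illustrative and do not come from any ADM-OSC tool. The sketch assumes ASCII addresses and handles only the three type tags listed above.

```python
import struct


def _pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC 1.0 strings require."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def encode_osc(address: str, *args) -> bytes:
    """Encode one OSC message using only the i, f and s type tags (illustrative helper)."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("use the integers 0 or 1 for /mute, not bool")
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)   # big-endian int32
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)   # big-endian float32
        elif isinstance(arg, str):
            tags += "s"
            payload += _pad(arg.encode("ascii"))
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return _pad(address.encode("ascii")) + _pad(tags.encode("ascii")) + payload


# The /aed example from the table above: three float32 arguments in one message.
packet = encode_osc("/adm/obj/4/aed", -22.5, 12.7, 0.9)
```

Packing azimuth, elevation and distance into a single `/aed` message keeps the three values in one datagram, which is the synchronicity benefit mentioned in the table.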
2.2. Environment messages
These could be expanded to include program changes and other global data. They are not specific to any individual object.
address | type | units | min | max | default | description | example |
---|---|---|---|---|---|---|---|
/adm/env/change | s | string | 0 | 128 char | 128 char | program changes | /adm/env/change day |
2.3. Listener messages
These messages could be used by a binaural renderer [EBU-Tech-3396] for head tracking data and listener position in a 6DOF setting.
address | type | units | min | max | default | description | example |
---|---|---|---|---|---|---|---|
/adm/lis/ypr | f f f | degrees | -180.0 | 180.0 | 0.0 | orientation: yaw, pitch, roll | /adm/lis/ypr -45.0 30.0 5.0 |
/adm/lis/xyz | f f f | normalized | -1.0 | 1.0 | 0.0 | listener position | /adm/lis/xyz 0.0 0.5 -0.2 |
2.4. Queries and bi-directional communication
The OSC protocol is unidirectional, so the commands should be considered as SET from a sender to a receiver. A particular device might also be interested to GET the state of a particular parameter in another device. To do so, it should send a command without any arguments. The receiver should answer back to this IP with the data.
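The SET/GET convention described above can be sketched on the receiver side as follows. This is only an illustration under stated assumptions: `handle_message` and the `state` dictionary are hypothetical names, messages are assumed to be already decoded into an address and an argument list, and staying silent for an unknown parameter is a choice of this sketch, not something the specification mandates.

```python
def handle_message(address, args, state):
    """Apply a SET, or answer a GET (a message with an empty argument list).

    Returns the (address, args) reply to send back to the sender's IP,
    or None when no reply is needed.
    """
    if args:
        # SET: store the new value(s) for this address.
        state[address] = tuple(args)
        return None
    if address in state:
        # GET: answer with the current value(s).
        return (address, state[address])
    # Unknown parameter: stay silent (an assumption of this sketch).
    return None


state = {}
handle_message("/adm/obj/4/azim", [-22.5], state)      # SET
reply = handle_message("/adm/obj/4/azim", [], state)   # GET
```

Here `reply` carries the same address with the stored value, ready to be encoded and sent back.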
3. Implementation Matrix
Table of implementations.
What do they support?
What they would like to support?
✓ = transmit and receive tx = transmit only rx = receive only | Zactrack | (Merging Technologies) Ovation | (Merging Technologies) Pyramix | (Figure 53) QLab | (Flux::) Spat Revolution | (L-Acoustics) L-ISA Controller | (Lawo) mc2 consoles | (d&b Soundscape) En-Bridge | (Meyer Sound Laboratories) SpaceMap Go | (Steinberg) Nuendo | (Adamson) FletcherMachine | (New Audio Technology) Spatial Audio Designer | (Modulo Pi) Modulo Kinetic |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
/adm/obj/n/azim | tx | ✓ | ✓ | ✓ | rx | rx | ✓ | ✓ | ✓ | ||||
/adm/obj/n/elev | tx | ✓ | ✓ | ✓ | rx | rx | ✓ | ✓ | ✓ | ||||
/adm/obj/n/dist | tx | ✓ | ✓ | ✓ | rx | rx | ✓ | ✓ | ✓ | ||||
/adm/obj/n/aed | ✓ | tx | ✓ | ✓ | ✓ | rx | rx | ✓ | ✓ | ✓ | |||
/adm/obj/n/x | ✓ | tx | ✓ | ✓ | ✓ | ✓ | rx | ✓ | ✓ | ✓ | ✓ | ||
/adm/obj/n/y | ✓ | tx | ✓ | ✓ | ✓ | ✓ | rx | ✓ | ✓ | ✓ | ✓ | ||
/adm/obj/n/z | ✓ | tx | ✓ | ✓ | ✓ | ✓ | rx | ✓ | ✓ | ✓ | ✓ | ||
/adm/obj/n/xy | ✓ | rx | ✓ | ✓ | |||||||||
/adm/obj/n/xyz | tx | ✓ | tx | tx | ✓ | ✓ | ✓ | ✓ | rx | ✓ | ✓ | ✓ | ✓ |
/adm/obj/n/w | ✓ | tx | ✓ | ✓ | ✓ | rx | ✓ | ✓ | ✓ | ✓ | |||
/adm/obj/n/dref | ✓ | ||||||||||||
/adm/obj/n/dmax | ✓ * | ||||||||||||
/adm/obj/n/gain | ✓ | tx | ✓ | ✓ | ✓ | rx | ✓ | ✓ | ✓ | ||||
/adm/obj/n/mute | ✓ | ✓ | ✓ | rx | ✓ | ✓ | |||||||
/adm/obj/n/name | tx | ✓ | rx | ✓ | ✓ | ||||||||
/adm/env/change | ✓ | ✓ | ✓ | ||||||||||
/adm/lis/xyz | tx | rx | ✓ | ✓ | |||||||||
/adm/lis/ypr | rx | ✓ | ✓ |
Note: * FletcherMachine supports dmax as a global message only: /adm/obj/*/dmax
4. Basic ADM-OSC principles
4.1. Roles
4.1.1. Sender (client)
- Object Editor sending positioning data to one or more receivers.
- Cartesian position data is always normalized
4.1.2. Receiver (server)
- Handles the (optional) local scaling of normalized data: x, y, z, distance
- The receiver can be a DAW, an ADM renderer, an object editor, a bridge (ADM-OSC <-> sADM)
4.2. Ports
ADM-OSC typically uses UDP. It is recommended to use port 4001 [ADMix] for one-way communication (the default for both senders and receivers) and port 4002 for return messages (if used).
These ports should be user-editable.
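As a rough illustration of these recommendations, the sketch below sends one pre-encoded packet as a UDP datagram, with the recommended ports as overridable defaults. The names `DEFAULT_PORT`, `DEFAULT_RETURN_PORT`, `send_packet` and `MUTE_OFF` are assumptions made for this example only.

```python
import socket

DEFAULT_PORT = 4001         # recommended port for one-way communication
DEFAULT_RETURN_PORT = 4002  # recommended port for return messages, if used


def send_packet(packet: bytes, host: str, port: int = DEFAULT_PORT) -> int:
    """Send one already-encoded OSC packet as a single UDP datagram.

    The port stays a parameter so that users can override the default.
    Returns the number of bytes handed to the network stack.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(packet, (host, port))


# "/adm/obj/2/mute 0" encoded by hand: padded address, ",i" type tags, int32 zero.
MUTE_OFF = b"/adm/obj/2/mute\x00" + b",i\x00\x00" + b"\x00\x00\x00\x00"
```

A call such as `send_packet(MUTE_OFF, "127.0.0.1")` would then target the default port on the local machine.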
4.3. Message rate
Position data is typically sent at a high data rate, although care must be taken not to overload the capabilities of the receiver. S-ADM is usually sent at half of the video frame rate, or approximately one message every 20 ms (50 Hz). Similarly, the Dolby Atmos ADM Profile [Atmos-Profile] recommends that the sampling period be "less than 20 ms," although sampling is optional if the parameter does not change.
Interpolation messages have not been implemented in ADM-OSC 1.0.
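One way a sender might avoid overloading a receiver is to skip unchanged values and to enforce a minimum interval per address. The sketch below assumes a 20 ms interval, taken from the figures quoted above. The `PositionThrottle` class is hypothetical and is not part of ADM-OSC or of any tool mentioned in this document.

```python
import time


class PositionThrottle:
    """Allow a send only if the value changed and the minimum interval has elapsed."""

    def __init__(self, min_interval_s=0.020, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self.clock = clock
        self._last_sent = {}  # address -> (timestamp, value)

    def should_send(self, address, value):
        now = self.clock()
        previous = self._last_sent.get(address)
        if previous is not None:
            sent_at, sent_value = previous
            # Skip unchanged values and values arriving too soon after the last send.
            if value == sent_value or now - sent_at < self.min_interval_s:
                return False
        self._last_sent[address] = (now, value)
        return True
```

A value rejected because it arrived too early is not queued, so the caller is expected to offer the latest value again on its next tick. Otherwise the final position of a move could be lost.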
4.4. Coordinates
4.4.1. Polar
- 0° azimuth is straight ahead
- Positive azimuth is on the left, so a front-left speaker is +30°
- +90° elevation is straight up
4.4.2. Cartesian
    (-1, 1) --------- (1, 1)
       |                 |
       |                 |
       |                 |
       |                 |
    (-1, -1) -------- (1, -1)

- Values are normalized between -1.0 and 1.0
- +x is right
- +y is forward
- +z is up
4.4.3. Conversions
To convert between coordinate systems, Euler trigonometry can be used to represent the polar sphere in Cartesian coordinates. The equations are provided in ITU-R BS.2127 section 6.8 [EBU-BS-2127].
To help with conversions, code examples are available on GitHub in Swift [convert-swift], C++ [convert-cpp] and JavaScript [convert-js].
For full ADM compatibility, there is another recommended conversion approach in section 10.1 of [EBU-BS-2127].
We probably cannot settle on a single aligned approach for converting between Polar and Cartesian systems. However, can we at least propose something everybody can stand behind, so that the same results are achieved whenever ADM-OSC is used?
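As a starting point for that discussion, the plain trigonometric conversion can be sketched as below, using the axis conventions of sections 4.4.1 and 4.4.2 (0° azimuth straight ahead, positive azimuth to the left, x right, y forward, z up). This is an illustrative sketch only: it is not the section 10.1 approach and not a copy of the linked GitHub examples, and the function names are assumptions.

```python
import math


def polar_to_cartesian(azim_deg, elev_deg, dist):
    """Plain trigonometric conversion from (azimuth, elevation, distance) to (x, y, z)."""
    az = math.radians(azim_deg)
    el = math.radians(elev_deg)
    x = -dist * math.sin(az) * math.cos(el)  # positive azimuth is left, so x goes negative
    y = dist * math.cos(az) * math.cos(el)   # 0 degrees azimuth points forward (+y)
    z = dist * math.sin(el)                  # +90 degrees elevation points up (+z)
    return x, y, z


def cartesian_to_polar(x, y, z):
    """Inverse of polar_to_cartesian; azimuth is returned in [-180, 180] degrees."""
    dist = math.sqrt(x * x + y * y + z * z)
    azim = -math.degrees(math.atan2(x, y))
    elev = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azim, elev, dist
```

With this mapping, a front-left source at +30° azimuth lands at a negative x. A polar distance of 1.0 only reaches the unit sphere, so the corners of the Cartesian cube sit at a distance greater than 1.0, which is one reason implementations can disagree.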
4.4.4. Distance
The 3D paradigm chosen by the ADM standard is a normalised (dimensionless) reference volume, defined in Cartesian or spherical coordinates.
This paradigm is used by studio/broadcast mixing tools such as Dolby Atmos or MPEG-H.
On the other hand, some audio renderers represent a physics-based world, and the notion of source distance relates to a physical unit, such as meters. Aside from direct sound gain, the physics-based source distance dm relates to advanced audio object parameters such as propagation delay, air attenuation, and energy levels of early/cluster reflections and late reverberation, or sound field behaviors (plane vs spherical waves).
These audio renderers include L-Acoustics L-ISA, FLUX:: Spat and d&b Soundscape, but also, more generally in the AR/VR domain, game audio engines such as Unreal, Unity and Wwise, or XR audio engines such as Magic Leap Soundfield Audio. A common challenge for all these renderers based on physical distance is that if the gain follows physical attenuation laws (such as “-6dB per doubling of distance”), there are singularities when dm gets close to 0. Hence, most of these renderers include a “volume of reference” or “unit volume” where the rendering (and in particular the gain) no longer follows physically informed laws. This is true for Unreal and Spat, for example.
An object position in a physics-based world can be described as:
In ADM-OSC, /dMax corresponds to the absoluteDistance parameter in an ADM audioPackFormat element.
/dRef is a new parameter that can be defined as the radius in meters of a volume of reference, which serves two purposes: it coincides with the dimensionless volume used in the ADM standard, and it is used by physically informed renderers as the “volume of reference” where the laws of physics do not apply and the gain (dB) is constant regardless of distance.
By definition:
and the following cases arise:
: the world is a dimensionless reference volume, matching the ADM standard
: no reference volume within the physical world
Different rendering systems handle distance differently. ADM uses a "dimensionless" reference volume, the interior of a cube or sphere. There are also physics-based renderers (e.g. game engines) that try to represent distance acoustically, based on a simulation of distance in meters, for example.
The proposal is to define both a reference distance (dRef) and a maximum distance (dMax) message in ADM-OSC, so as to communicate the intended rendering approach. The default (and current behaviour of the ADM standard) would be dRef = dMax = 1m. If dRef < dMax, then distance-based transforms (such as gain changes) could be applied when an object's distance is > dRef. Inside of dRef, no transforms would be applied.
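The behaviour proposed above could look like the following sketch. It rests on several assumptions that the specification does not fix: dRef is treated as a normalized fraction of dMax (as in the table in section 2.1), the physical distance is taken to be the normalized distance multiplied by dMax, and the gain law outside the reference volume is the “-6dB per doubling of distance” rule quoted earlier. The function names are hypothetical.

```python
import math


def physical_distance_m(dist, d_max):
    """Physical distance in meters for a normalized distance (assumed to be dist * dMax)."""
    return dist * d_max


def distance_gain_db(dist, d_ref=1.0):
    """Gain offset in dB for a normalized distance and a normalized dRef.

    Inside the reference volume (dist <= d_ref) no transform is applied.
    Outside it, the gain drops by about 6 dB per doubling of distance.
    Because both distances scale with dMax, only their ratio matters here.
    """
    if d_ref <= 0.0:
        raise ValueError("this sketch needs a non-empty reference volume (d_ref > 0)")
    if dist <= d_ref:
        return 0.0
    return -20.0 * math.log10(dist / d_ref)
```

With the default dRef of 1.0, every normalized distance falls inside the reference volume and the gain offset is always 0 dB, which matches the dimensionless behaviour of the ADM standard.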
5. Development & Test tools
5.1. Chataigne module
(Mathieu Delquignies / d&b audiotechnik)
A module to retrieve parameters from, or control, ADM-OSC object-based audio software or hardware over the OSC protocol.
5.2. Tester Desktop application
(Jose Gaudin / Meyer Sound Laboratories)
Download from the resources directory.
5.3. Validator, Test and Stress Test Python Module
(Gael Martinet / FLUX::)
The adm_osc module can be installed through pip:

    pip install adm-osc

Quick examples:
Client/server with command monitoring:

    from adm_osc import OscClientServer

    # create a basic client/server that implements basic ADM-OSC communication with stable parameters
    # + command monitoring and analysis
    cs = OscClientServer(address='127.0.0.1', out_port=9000, in_port=9001)

    # send some individual parameters
    cs.send_object_position_azimuth(object_number=1, v=-30.0)
    cs.send_object_position_elevation(object_number=1, v=0.0)
    cs.send_object_position_distance(object_number=1, v=2.0)

    # or pack them
    cs.send_object_polar_position(object_number=1, pos=[-30.0, 0.0, 2.0])

    # in cartesian coordinates
    cs.send_object_cartesian_position(object_number=1, pos=[-5.0, 8.0, 0.0])

    # see documentation for full list of available functions

    # when receiving an ADM-OSC command, its analysis is printed on the command output window
    #
    # e.g.
    #
    # >> received valid adm message for obj :: 1 :: gain (0.7943282127380371)
    # >> received valid adm message for obj :: 1 :: position aed (20.33701515197754, 0.0, 0.8807612657546997)
    # >> received valid adm message for obj :: 1 :: position xyz (-0.2606865465641022, 0.8273822069168091, 0.0)
    # >>
    # >> ERROR: unrecognized ADM address : "/adm/obj/1/bril" ! unknown command "/bril/"
    # >> ERROR: arguments are malformed for "/adm/obj/1/gain :: (1.4791083335876465,)":
    # >>     argument 0 "1.4791083335876465" out of range ! it should be less or equal than "1.0"

Test client:

    from adm_osc import TestClient

    # create a test client, assume default address (local: '127.0.0.1')
    # the test client can be used to test how a receiver handles all kinds of parameters and parameter value ranges
    sender = TestClient(out_port=9000)

    # all stable parameters for a specific object
    sender.set_object_stable_parameters_to_minimum(object_number=1)
    sender.set_object_stable_parameters_to_maximum(object_number=1)
    sender.set_object_stable_parameters_to_default(object_number=1)
    sender.set_object_stable_parameters_to_random(object_number=1)

    # all stable parameters for a range of objects
    sender.set_objects_stable_parameters_minimum(objects_range=range(1, 64))
    sender.set_objects_stable_parameters_maximum(objects_range=range(1, 64))
    sender.set_objects_stable_parameters_default(objects_range=range(1, 64))
    sender.set_objects_stable_parameters_random(objects_range=range(1, 64))

    # all stable parameters for all objects
    sender.set_all_objects_stable_parameters_minimum()
    sender.set_all_objects_stable_parameters_maximum()
    sender.set_all_objects_stable_parameters_default()
    sender.set_all_objects_stable_parameters_random()

    # see documentation for full list of available functions

Stress client:

    from adm_osc import StressClient

    # create a stress client, assume default address (local: '127.0.0.1')
    # the stress client sends a huge amount of data to stress test the receivers
    sender = StressClient(out_port=9000)

    # do stress test in cartesian coordinates
    sender.stress_cartesian_position(number_of_objects=64, duration_in_second=60.0, interval_in_milliseconds=10.0)

    # do stress test in polar coordinates
    sender.stress_polar_position(number_of_objects=64, duration_in_second=60.0, interval_in_milliseconds=10.0)
6. Discussion
6.1. Draft 0.5
A draft for version 0.5 was proposed but not adopted. This draft contains messages for greater compatibility with ADM in broadcast use cases. It brings up a potential problem of sending critical configuration messages over UDP. Whereas losing a few high-rate
6.2. Relationship to ADM
ADM-OSC messages are designed to be translatable to (S-)ADM if needed. Messages that don’t translate into one (or more) ADM tags should not be in the /adm namespace.
We've added some head tracker messages to ADM-OSC 1.0. Specifically, yaw-pitch-roll like:
/adm/lis/ypr f f f
Quaternions are more useful in many situations. Should we also have something like /quat?
7. Definitions
Audio Definition Model
The Audio Definition Model (ADM) was first published by the European Broadcasting Union (EBU) in 2015 as a standard representation of audio metadata [1]. The goal of ADM is to support a broad range of use cases that include spatial and immersive audio, as well as interactive personalization and accessibility features [What-is-ADM]. ADM can be used to represent channel-based, scene-based, and object-based audio. It is defined by the EBU in ITU-R BS.2076 [EBU-BS-2076].
Object-Based Audio
Object-based representation encodes audio tracks along with positional and other data about how that audio should be reproduced, or rendered, during playback. Positional data is speaker-agnostic, allowing object-based mixes to be highly portable. A musician might audition a mix on headphones using a binaural renderer [EBU-Tech-3396] then perform at a venue with dozens of loudspeakers using a spatial renderer. That mix might then be rendered for streaming with a third renderer. [Tsingos-2017]
Open Sound Control
OpenSoundControl (OSC) is a data transport specification (an encoding) for realtime message communication among applications and hardware. OSC was developed by researchers Matt Wright and Adrian Freed during their time at the Center for New Music & Audio Technologies (CNMAT). OSC was originally designed as a highly accurate, low latency, lightweight, and flexible method of communication for use in realtime musical performance. They proposed OSC in 1997 as “a new protocol for communication among computers, sound synthesizers, and other multimedia devices that is optimized for modern networking technology”.
There are several open-source implementations that simplify adoption by developers. The OSC 1.0 specification was published in 2002 [8].
Six Degrees of Freedom
Translation (forward/backward, up/down, left/right) combined with orientation (yaw, pitch, and roll).