Monday, April 26, 2010

Apples and Oranges

KMid is now a multi-platform application for Linux, Windows, and Mac OSX. It may be the right time to make a comparison between the different operating systems with regarding to the development of KMid backends.

First, the functional components needed by a KMid backend

  • Read and parsing of SMF: MIDI and Karaoke files. This mechanism must offer not only timestamped MIDI events, but also the metadata (for instance, song lyrics) embedded into the SMF data.
  • Facility for sequencing MIDI events. Events read from a file are labeled with timestamps, to be delivered to MIDI synthesizers at the right times, handling also common player actions like play, pause and stop.
  • Internal and external, hardware and software MIDI synthesizers. The main goal of KMid is to support external musical instruments, but as many potential users do not have one, it is interesting to be able to use software synthesizers, in a transparent way without complicating the program design.
Organization of the backends

KMid::Backend is modelled after several Phonon interfaces. There is not a dependency on Phonon, only inspiration and Copy&Paste. There is a KMid::MIDIObject abstract class that resembles more or less a Phonon::MediaObject, encapsulating the sequencer functionality, and a KMid::MIDIOutput class representing a MIDI output port, similar to Phonon::AudioOutput. Main differences are:

  • KMid::MIDIObject time is measured in musical time (ticks) instead of milliseconds. Some additional properties: timeSkew, textEncoding, lyrics. Several signals, one for Each MIDI event type and metadata, feeding the program's graphic interface animations: lyrics highlighting, visual metronome, channel meters, and piano player keys.
  • KMid::MIDIOutput has pitchShift and midiMap properties. Volume and mute properties take a MIDI channel argument. There are also some real-time MIDI event slots, one for each MIDI event type, used by widgets like the channel instrument selectors or the piano player keys when they are triggered by the mouse or the computer keyboard.

ALSA provides a very advanced MIDI sequencer. The API is big and cumbersome to use in a KDE program, so I've developed a library layer, named "drumstick-alsa", providing a C++/Qt4 wrapper around it. ALSA does not provide a mechanism for reading and parsing MIDI files, so there is another library named "drumstick-file" taking care of it. In ALSA, MIDI hardware and software clients are all equivalent and there is a routing mechanism between MIDI OUT and MIDI IN ports, fully transparent and very powerful, where either of the two ends of the connection can be a program or a MIDI device driver. Of course, the data transmitted by the communications connection uses the standard MIDI protocol. KMid creates a MIDI output port externally visible, to be connected to a MIDI synthesizer, and also private loopback ports (input and output) which are used for visual feedback in the graphic interface (highlighted lyrics, metronome, piano keys...) The MidiPlayer class schedules MIDI events to a loopback port, which are received in time and propagated activating the signals emitted by the MIDIObject class and at the same time are sent through the MIDIOutput class to their final destination, which is any kind of MIDI synthesizer.

Any software synthesizer that is also an ALSA sequencer client can be connected to the MIDI output of KMid. Although not strictly necessary, for the users convenience it is possible to enable and launch FluidSynth or TiMidity++ on KMid's initialization. Unfortunately, a software synthesizer receives MIDI events and converts them into digital audio, and here  is where some users may find problems that go beyond the scope of KMid (ALSA/Jack/OSS/PulseAudio..., you know). In addition, a software synthesizer may require soundfonts. Enough to say that Linux distributions should package and integrate all the components that the users may need. The owners of external MIDI musical instruments are lucky, because they are quite well supported by ALSA.


The operating system provides the same MIDI support since the days of Windows 3.1 practically without any changes. Windows cannot connect the MIDI output from a program to the input of another program, only to a system driver, so the synthesizer must be external, or having a device driver, or using MidiYoke or any other similar solution. However, Windows provides a software synthesizer unconditionally installed and activated, so the Windows backend does not have or need a soft-synth configuration page.

The MMSystem API does not provide virtual MIDI ports, nor transparent MIDI routing, so the strategy of playing to a loopback port as in Linux is simply not possible. What is possible is to use "custom" midi events and define a callback function that is invoked whenever the time arrives to play a scheduled MIDI event. This is how all the visual feedback is implemented. The system libraries do not provide reading and parsing of MIDI files, so drumstick-file is used.


The operating system provides all kind of MIDI support services, starting with reading and processing of MIDI files. If we compare the AudioToolkit SMF processing API with drumstick-file, the latter provides a model similar to XML-SAX, while AudioToolkit would be similar to XML-DOM. After reading the MIDI files, the results are explored to extract the required metadata, and an additional track is added containing the "custom" MIDI events for visual feedback.

Virtual MIDI ports are fully supported, with routing and full transparency between applications and devices, so the loopback port strategy works exactly the same as in Linux/ALSA. The Apple software synthesizer is an object of type "AudioUnit", and requires activation before a program can use it, so the Mac OSX backend has a simple soft-synth configuration page. The DLS soundfont provided by Apple is made by Roland, like the Windows one.


The LOC numbers needed to implement each backend are very similar: mac = 2358, windows = 2435, alsa = 2715 (not counting drumstick), and its complexity is also similar. Despite the different features offered by each platform, it is possible to implement KMid backends presenting the same programming interface to the application. We can create new backends in the future using the existing ones as models, or duplicating the "dummy" template that exists in a directory with the same name in the source repository.