Sunday, September 14, 2014

Notes about VMPK 0.6.0 for Desktops

Dear Blog,

It has been a long time, but a week after the release it is time to discuss some aspects of the latest version of our Virtual MIDI Piano Keyboard for desktop computers.

Where to start? An interesting point is the change of architecture: RtMIDI has been replaced by Drumstick-RT. This library is new and homegrown, part of the Drumstick family, which also includes Drumstick-File and Drumstick-ALSA. The motivation for creating it was the difficulty of extending RtMIDI with drivers other than the ones chosen by its author. This was not a problem in the past, because the RtMIDI sources were always included in the application, so any customization was possible and easy. Now, thanks to the zealots of the Linux distributions forcing the dynamic linking of RtMIDI, this is simply not feasible. To hell with the freedom of Free Software! Throwing away the ipMIDI backend was not an option. With Drumstick-RT, on the other hand, not only is it possible to keep it, but new backends can be compiled separately and installed in the system without recompiling either VMPK or Drumstick, because they are in fact plugins. By chance, the change has also fixed the bug reported (in 2009!) in ticket #15: LinuxSampler did not appear among the ALSA connections, because the LinuxSampler MIDI port has no flag characterizing it properly, and RtMIDI (unlike Drumstick-RT) filters out that port.

Replacing the RtMIDI library with Drumstick-RT was a long-standing plan, not only for VMPK but for Drumstick as well, and it has finally taken place. I hope this will be a foundation for features like recording and playback in the future. The only thing some users may miss is the JACK MIDI interface, but on the other hand Unix users will enjoy native OSS support, and FluidSynth direct output is available on all operating systems, which also means configurable SoundFonts: a feature in high demand among Windows users.

Another long-standing request finally implemented is the ability to display any number of keys instead of full octaves, starting with any arbitrary white note (ticket #39). You can configure 88 keys, for instance, or 25 or 49, depending on your device (laptop or tablet) and screen size. Congratulations to all the requesters, and sorry for keeping you waiting so long for this feature.

Finally, the migration to Qt5 has happened. This also means replacing the Xlib dependency with XCB, which hopefully will bring future support for Wayland or whatever comes next. The victim has been the keyboard grabbing feature, which only worked on Linux thanks to a now lost X11/Qt4 feature. I hope to bring it back in the future with a multiplatform implementation.

There are now binary packages for 32 and 64 bit Linux users that should work on any modern distribution, in the form of installers built with the excellent BitRock InstallBuilder. This means that all the required dependencies are included inside the package, in the same way the libraries are included in the Windows and Mac OS X setup packages. To reduce the package size, superfluous things like JACK support were excluded, because the new FluidSynth backend is intended to provide instant audio out of the box, without requiring users to search, find, ask, learn, install, try and tweak. This is something that traditional Linux distributions have failed to do, despite their duty to integrate software and make life easier for their users. I am pretty sure that many Linux distros will fail to provide native VMPK packages for this release, as they did for the 0.5.x series (see the Debian, Ubuntu and Fedora repositories, for instance). Prove me wrong, and this kind of binary Linux package will be deprecated.

Friday, December 27, 2013

VMPK for Android

Hi, Blog!

It has been a long time since the last post. Let me announce a new port of VMPK, for Android (4.x) devices. It is available on Google Play.


There are two apps: a paid version (0.5€) and a free one (gratis) with a small advertisement. It is not based on Qt, and it is not open source. It is a Java port rewritten from scratch, using the native Android MIDI synthesizer and native Android themes. It is nevertheless quite similar to the old N9 port, with several additional features, as you can see in the following screenshots.



Although it is mostly a Java app, it includes some C code. The internal MIDI synthesizer is Android's "Sonivox EAS", which is part of AOSP. The library is included in all recent Android versions, but it is not a public API, so I have compiled it with a customized configuration and a different feature set, resulting in a smaller binary, and included it in the APK along with some other native code, mainly opensl_stream by Peter Brinkmann, which interfaces the synth with Android's OpenSL ES audio output. An interesting aspect of the synth is the embedded GM soundfont, which uses a very small amount of memory and needs no external data files.

Some of the features are: ipMIDI compatibility (MIDI OUT only) using UDP multicast over the wireless network; accelerometer-driven sliders for velocity, controllers and bender (like the N9 port); and a configurable number of keys and initial key, among other goodies.


Happy holidays!

Saturday, January 7, 2012

Choosing MIDI or Digital Audio by Analogy

Whenever I talk to someone about the relationship between MIDI and digital audio, one of my favorite analogies is that of computer images.

A digital raster image such as a JPG file contains a bitmap. It is the equivalent of an MP3 file containing digital audio. Both JPG and MP3 files contain lossy compressed data, although other formats such as BMP and WAV can store pictures and digital sound, respectively, without compression. In both cases the files store a set of digitized values. For images, the data are individual pixels or dots representing the colors of the cells in a matrix of rows and columns that divides the digitized image. For sound, the data are samples representing the moments in time into which the digitized sound is divided. In both cases, digitization consists of dividing the image or the sound into small fragments, the number of which depends on the resolution we want and on the size of the original.

Another type of image is the vector graphic. Vector graphics are not suitable for representing photographs, but they are for drawings. The SVG files used in many Wikipedia illustrations are of this type. Instead of image fragments, they contain symbolic descriptions using coordinates of points, distances, lines, colors and so on. They have the advantage of scaling without loss of quality, and any of their components and properties can be modified easily without affecting the rest. The equivalent of this technology in the world of sound is MIDI. A MIDI sequence contains timestamped messages such as notes, instrument changes, controls, etc. It is not a proper format for storing sounds recorded by a microphone, but a symbolic representation of music, similar to a score.
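To make "timestamped messages" concrete, here is a minimal sketch using the standard javax.sound.midi API. The class name TinySequence, the resolution of 480 ticks per quarter note and the chosen note are arbitrary assumptions for illustration only.

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.MidiEvent;
import javax.sound.midi.Sequence;
import javax.sound.midi.ShortMessage;
import javax.sound.midi.Track;

// Hypothetical example class: builds a tiny MIDI sequence in memory.
class TinySequence {
    /** Builds a one-track sequence: select an instrument, then play one note. */
    static Sequence build() {
        try {
            // 480 ticks per quarter note: an arbitrary resolution chosen for this sketch.
            Sequence sequence = new Sequence(Sequence.PPQ, 480);
            Track track = sequence.createTrack();
            // Instrument change at tick 0 (program byte 40 on channel 0).
            track.add(new MidiEvent(new ShortMessage(ShortMessage.PROGRAM_CHANGE, 0, 40, 0), 0));
            // Note on for middle C (key 60) with velocity 100, at tick 0.
            track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_ON, 0, 60, 100), 0));
            // Note off one quarter note later, at tick 480.
            track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_OFF, 0, 60, 0), 480));
            return sequence;
        } catch (InvalidMidiDataException e) {
            // Not expected with the constant values used above.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Track track = build().getTracks()[0];
        // Each event is just a timestamp plus a short symbolic message.
        for (int i = 0; i < track.size(); i++) {
            MidiEvent event = track.get(i);
            System.out.println("tick " + event.getTick() + ", status 0x"
                    + Integer.toHexString(event.getMessage().getStatus()));
        }
    }
}
```

Nothing in this sequence is sound: it only says what should happen and when, much as an SVG file says what should be drawn and where.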

Images are two-dimensional objects, so digitized images consist of rows and columns of elements (pixels), and the position of each element of a drawing is given by a pair of numbers, its Cartesian coordinates. Sound recordings, on the other hand, are one-dimensional: sound samples are taken at constant time intervals, and MIDI messages are likewise labeled by their position on the time line.

These similarities have implications that reveal further parallels. An uncompressed digitized image consisting of a single solid color takes the same amount of memory as an image of the same size showing a photograph or a complex composition of many colors. Similarly, a recording of silence (for example John Cage's 4'33") takes the same amount of memory as any symphonic piece of the same duration. On the other hand, a simple vector image takes much less memory than a complex picture of the same dimensions, and a MIDI sequence of a few notes occupies much less memory than a complex sequence of the same duration made up of many notes or other messages.
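The arithmetic behind this is short enough to sketch. The class and method names below are hypothetical, file headers are ignored, and the CD-quality and 1920x1080 figures are just example parameters.

```java
// Hypothetical example class: sizes of uncompressed audio and images.
class StorageEstimate {
    /** Bytes of uncompressed PCM audio; the content of the recording plays no role. */
    static long pcmBytes(int seconds, int sampleRate, int channels, int bytesPerSample) {
        return (long) seconds * sampleRate * channels * bytesPerSample;
    }

    /** Bytes of an uncompressed bitmap; the content of the picture plays no role. */
    static long bitmapBytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    public static void main(String[] args) {
        // 4'33" is 273 seconds; CD quality is 44100 Hz, stereo, 2 bytes per sample.
        System.out.println(pcmBytes(273, 44100, 2, 2));  // prints 48157200
        // A 1920x1080 image with 3 bytes per pixel.
        System.out.println(bitmapBytes(1920, 1080, 3));  // prints 6220800
    }
}
```

Neither formula has a parameter for what is actually recorded or pictured, which is exactly the point: silence and a symphony cost the same.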

The problems posed by stretching and shrinking digital images and sounds are also similar. In both cases artifacts are generated, the best known being 'aliasing', which can be offset to some extent by using 'antialiasing' filters. With vector graphics, as with MIDI sequences, you can easily stretch and shrink dimensions and durations without risking any artifacts or quality loss whatsoever.
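As an illustration of why the symbolic case is easy, stretching a MIDI sequence amounts to multiplying its timestamps by a factor. This is a minimal sketch with a hypothetical class name; in practice one would often just change the tempo instead.

```java
import java.util.Arrays;

// Hypothetical example class: time-stretching symbolic timestamps.
class StretchTicks {
    /** Returns a copy of the timestamps (in ticks) scaled by the given factor. */
    static long[] stretch(long[] ticks, double factor) {
        long[] result = new long[ticks.length];
        for (int i = 0; i < ticks.length; i++) {
            result[i] = Math.round(ticks[i] * factor);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] noteTimes = {0, 480, 960};
        // Doubling the duration only moves the events; no samples are resynthesized.
        System.out.println(Arrays.toString(stretch(noteTimes, 2.0)));  // prints [0, 960, 1920]
    }
}
```

Doing the same to digital audio requires resampling or a time-stretching algorithm, which is where the artifacts come from.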

Starting from a vector image, a rendering engine is needed to obtain a digital image that can be displayed on a screen or sent to a printer. In the case of MIDI, a sequencer and a MIDI synthesizer are required to produce digital audio that can be used by an audio interface.

The programs Inkscape and Gimp, used on Linux for creating and editing vector graphics and digital images respectively, are comparable to the Adobe programs Illustrator and Photoshop. They cover different needs and audiences, thriving in different niches. One example of such a niche is architects, who use vector graphics to design and represent buildings with AutoCAD or similar programs. These are not watertight compartments. Gimp can import vector graphic files, rendering them as bitmaps. Inkscape can also import a bitmap image as a drawing object. In each case, users may choose the best tool for each task.

While it has been easy to list some essential image processing programs for Linux and other systems, doing the same exercise in the field of audio and MIDI is much riskier. The problem is that the way musicians work with computers is not homogeneous: each musician works in a different way. For old school types the ideal workflow is to note down musical ideas, develop drafts and refine compositions using tools that work with symbolic elements, producing a paper copy of the score as the final result. Rosegarden could be appropriate at this stage. At the other extreme, there are those who have never in their lives read or written a score, and whose only tools of creation (other than musical instruments) are the mixer and the multi-track recorder. In this case, Ardour could be right.

The two applications mentioned above allow the use of digital audio and MIDI at the same time. As in the world of images, some applications are focused on the symbolic representation (MIDI) and others on the final product (digital audio). In each case, the use of the other technology is subordinate. For instance, in Ardour MIDI messages are aligned to the audio samples. An API (JACK MIDI) has even been developed to ensure the synchronization of MIDI events to digital audio samples, subordinating MIDI to the rules of digital audio. Obviously this strategy does not fit all the scenarios where MIDI is useful.
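To give an idea of what "aligning MIDI to audio samples" involves, the following sketch converts a MIDI timestamp in ticks into an audio frame number. It is only an illustration of the arithmetic, not code from Ardour or JACK; the class name, method names and parameter values are assumptions.

```java
// Hypothetical example class: mapping MIDI ticks onto audio sample frames.
class MidiToFrames {
    /**
     * Converts a tick position into an audio frame index.
     * ppq is ticks per quarter note, microsPerQuarter is the tempo
     * (500000 microseconds per quarter note equals 120 beats per minute).
     */
    static long tickToFrame(long tick, int ppq, int microsPerQuarter, int sampleRate) {
        double seconds = tick * (double) microsPerQuarter / ppq / 1_000_000.0;
        return Math.round(seconds * sampleRate);
    }

    /** Offset of a frame inside the audio period (buffer) that contains it. */
    static long offsetInPeriod(long frame, int framesPerPeriod) {
        return frame % framesPerPeriod;
    }

    public static void main(String[] args) {
        // One quarter note at 120 bpm lasts half a second: 24000 frames at 48 kHz.
        long frame = tickToFrame(480, 480, 500000, 48000);
        System.out.println(frame);                        // prints 24000
        System.out.println(offsetInPeriod(frame, 1024));  // prints 448
    }
}
```

Once events are expressed this way, the audio period becomes the unit of work and MIDI simply rides along with it.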

As in the imaging world, symbolic representation (MIDI) is probably better suited to design, drafting and composition. By contrast, digital audio is the dominant technology in the studio, at the mixing and production stages, to obtain a finished product.

Wednesday, October 12, 2011

VMPK & FluidSynth 0.1.0 released

VMPK & FluidSynth is a MeeGo Harmattan application for Nokia N9/N950 smartphones. It contains a QML-based VMPK user interface bundled with FluidSynth for sound generation.

You may download it from OVI Store right now.

Several enhancements have been included since the 0.0.1 beta announced in August.
  • Controllers, Bender, and Velocity values can be optionally controlled by the device's accelerometer.
  • Internationalization. This version includes translations into Spanish, Russian (thanks to Serguey Basalaev) and Czech (thanks to Pavel Fric).
  • Inverted color theme. This dark color combination consumes less power, enabling longer battery life.
  • Latest FluidSynth included.
Sources available at SourceForge.net, as usual.







Sunday, September 11, 2011

SoundFonts want to be free

A "SoundFont" file (suffix .SF2) is a definition of one or several musical instruments, which can be used with synthesizers (hardware or software) to render, or convert, musical notes (e.g. MIDI files, suffix .MID) into digital sound, which can be used by an audio interface and speakers to play music. Another file format for the same purpose is Downloadable Sounds (suffix .DLS). Both include sound samples that can be entirely synthetic or digitized from real instruments.

General MIDI is a very popular specification that, among other things, defines a palette of instruments. Instrument #1 is a piano, #41 a violin, #57 a trumpet, #74 a flute... GM SoundFonts offer 128 instruments arranged in that particular order. GS and XG are extensions of this specification.
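One practical detail: the instrument numbers above count from 1, while the program change message that selects an instrument carries a value from 0 to 127. The sketch below, with a hypothetical class name, shows the conversion using the standard javax.sound.midi API.

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.ShortMessage;

// Hypothetical example class: selecting GM instruments by their 1-based number.
class GmProgram {
    /** Converts a GM instrument number (1..128) into a program byte (0..127). */
    static int toProgramByte(int gmNumber) {
        if (gmNumber < 1 || gmNumber > 128) {
            throw new IllegalArgumentException("GM instrument numbers go from 1 to 128");
        }
        return gmNumber - 1;
    }

    /** Builds the program change message selecting a GM instrument on a channel (0..15). */
    static ShortMessage programChange(int channel, int gmNumber) {
        try {
            return new ShortMessage(ShortMessage.PROGRAM_CHANGE, channel, toProgramByte(gmNumber), 0);
        } catch (InvalidMidiDataException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Instrument #41 (violin) travels on the wire as program byte 40.
        System.out.println(programChange(0, 41).getData1());  // prints 40
    }
}
```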

Linux needs GM soundfonts that can be distributed together with GPL programs, just as text rendering applications need typographic fonts. Many Linux distributions include the FluidR3 soundfont in their repositories. It is free and produces good quality sound, but it is not small: more than 140 megabytes. MuseScore distributes the "TimGM6mb" soundfont, which weighs "only" 5.8 megabytes.

There are several software synthesizers that use SoundFonts. The best known are FluidSynth and TiMidity++, both with free licenses. A lesser known one, but no less interesting, is Gervill. It is part of OpenJDK, and therefore GPLv2 licensed. It is implemented in Java, of course. I am not very fond of Java, and I do not usually use it except for commercial projects when the customer requires it, but this time I will use it just for fun.

Gervill can use SoundFonts, DLS, or WAV files. A very interesting feature is the so-called "Emergency Soundbank", used when no external SoundFont is available. The instrument definitions in this SoundFont are fully synthetic and follow the GM standard.

This EmergencySoundbank is a Java class and does not reside in a file, so it cannot be used with another synthesizer. However, nothing prevents us from writing a Java program that instantiates the class and stores the instrument definitions in a file on disk. How complicated can this program be? Let's see:

$ cat MakeEmergencySoundfont.java
import com.sun.media.sound.*;
public class MakeEmergencySoundfont {
        public static void main(String[] args) throws Exception {
                SF2Soundbank sf2 = EmergencySoundbank.createSoundbank();
                sf2.save("GervillEmergencySoundbank.sf2");
        }
}


7 lines is not much after all. Let's compile it:

$ javac MakeEmergencySoundfont.java
MakeEmergencySoundfont.java:5: warning: com.sun.media.sound.SF2Soundbank is internal proprietary API and may be removed in a future release
                SF2Soundbank sf2 = EmergencySoundbank.createSoundbank();
                ^
MakeEmergencySoundfont.java:5: warning: com.sun.media.sound.EmergencySoundbank is internal proprietary API and may be removed in a future release
                SF2Soundbank sf2 = EmergencySoundbank.createSoundbank();
                                   ^
2 warnings


We have earned two warnings for being naughty boys, because it is ugly to use classes in the "com.sun.media.sound.*" namespace directly, and if the Oracle finds out about our prank, he could lock the pantry. Here, take another cookie, Neo...

To compile and use this program you only need OpenJDK6 (runtime and compiler). For older Java versions you can get a "gervill.jar" from the project website.

Running the program produces a SoundFont file:


$ java MakeEmergencySoundfont
$ ls -lh GervillEmergencySoundbank.sf2
-rw------- 1 pedro users 1.8M 2011-09-11 12:50 GervillEmergencySoundbank.sf2


The result weighs less than 2 megabytes. Of course the quality of the SoundFont is not high, but it is better than installing nothing at all by default, leaving users wondering what they need in order to hear something from their programs that depend on software synthesizers.

It is interesting that a similar technique, also very simple, can be used to produce other SoundFonts based on samples of arbitrary sounds. For more details, see the MakeSoundFont example in Gervill's repository.

Finally, a question remains about what license we can use to distribute the file "GervillEmergencySoundbank.sf2" generated by our program. As a general rule, the output of a GPL program has no restrictions. However, in this case the program output does not come from processing input data, but simply dumps the results of running the algorithms included in the EmergencySoundbank class. To play it safe, we should release it as GPL.