Friday, December 27, 2013

VMPK for Android

Hi, Blog!

It has been a long time since the last post. Let me announce a new port of VMPK, for Android (4.x) devices. It is available on Google Play.


There are two apps: a paid version (0.5€) and a free one (gratis) with a small advertisement. It is not based on Qt, and it is not open source. It is a Java port rewritten from scratch, using the native Android MIDI synthesizer and native Android themes. On the other hand, it is quite similar to the old N9 port, as you can see in the following screenshots, with several additional features.



Although it is mostly a Java app, it includes some C code. The internal MIDI synthesizer is Android's "Sonivox EAS", which is part of AOSP. The library is included in all recent Android versions, but it is not a public API, so I've compiled it with a customized configuration and a different feature set, resulting in a smaller binary, and included it in the APK along with some other native code, mainly opensl_stream by Peter Brinkmann, which interfaces the synth with Android's OpenSL ES audio output. An interesting aspect of the synth is the embedded GM soundfont, which uses a very small amount of memory and needs no external data files.

Some features: ipMIDI compatibility (MIDI OUT only) using UDP multicast over a wireless network; accelerometer-driven sliders for velocity, controllers and bender (like the N9 port); and a configurable number of keys and initial key, among other goodies.
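For the curious, here is a minimal sketch of what sending a note over UDP multicast can look like in plain Java. It is not the app's code. The multicast group 225.0.0.37 and port 21928 are my assumption of the usual ipMIDI defaults for the first port, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

class IpMidiSketch {
    // Assumed ipMIDI defaults, not taken from the post.
    static final String GROUP = "225.0.0.37";
    static final int PORT = 21928;

    /** Builds the three raw bytes of a MIDI note-on message. */
    static byte[] noteOn(int channel, int note, int velocity) {
        return new byte[] {
            (byte) (0x90 | (channel & 0x0F)), // status byte: note on + channel
            (byte) (note & 0x7F),             // key number, 0..127
            (byte) (velocity & 0x7F)          // velocity, 0..127
        };
    }

    /** Sends raw MIDI bytes as a single UDP datagram to the multicast group. */
    static void send(byte[] midi) throws IOException {
        try (MulticastSocket socket = new MulticastSocket()) {
            socket.setTimeToLive(1); // keep the packets on the local network
            InetAddress group = InetAddress.getByName(GROUP);
            socket.send(new DatagramPacket(midi, midi.length, group, PORT));
        }
    }

    public static void main(String[] args) throws IOException {
        send(noteOn(0, 60, 100)); // middle C on the first channel
    }
}
```

Any ipMIDI-compatible receiver listening on the same network segment should see the note, provided the assumed group and port match its configuration.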


Happy holidays!

Saturday, January 7, 2012

Choosing MIDI or Digital Audio by Analogy

Whenever I talk to someone about the relationship between MIDI and digital audio, one of my favorite analogies is that of computer images.

A digital raster image like a JPG file contains a bitmap. It is equivalent to an MP3 file containing digital audio. Both JPG and MP3 files contain lossy compressed data, although other formats such as BMP and WAV can hold pictures and digital sound, respectively, without compression. In both cases the files store a set of digitized values. In the case of images, the data are individual pixels or dots that represent the colors of the cells in a matrix of rows and columns that divides the digitized image. In the case of sound, the individual data are samples that represent the moments of time into which the digitized sound is divided. In both cases, digitization consists of dividing the image or sound into small fragments, the number of which depends on the resolution we want and the size of the original.

Another type of image is called vector graphics. These are not suitable for representing photographs, but they are well suited to drawings. The SVG files used in many Wikipedia illustrations are of this type. Instead of image fragments, they contain symbolic descriptions using coordinates of points, distances, lines, colors and so on. They have the advantage of scaling without loss of quality, and any component or property can easily be modified without affecting the rest. The equivalent of this technology in the world of sound is MIDI. A MIDI sequence contains timestamped messages such as notes, instrument changes, controls, etc. It is not a proper format for storing sounds recorded by a microphone, but a symbolic representation of music, similar to a score.
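To make "timestamped messages" concrete, here is a small sketch using the standard javax.sound.midi classes. It builds a sequence holding a single note: a note-on at tick 0 and a note-off one quarter note later. The class name and the resolution of 480 ticks per quarter note are arbitrary choices of mine.

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.MidiEvent;
import javax.sound.midi.Sequence;
import javax.sound.midi.ShortMessage;
import javax.sound.midi.Track;

class TinySequence {
    /** Builds a one-note sequence: middle C lasting one quarter note. */
    static Sequence build() {
        try {
            Sequence sequence = new Sequence(Sequence.PPQ, 480); // 480 ticks per quarter note
            Track track = sequence.createTrack();
            // Each event pairs a message with its position (tick) on the time line.
            track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_ON, 0, 60, 100), 0));
            track.add(new MidiEvent(new ShortMessage(ShortMessage.NOTE_OFF, 0, 60, 0), 480));
            return sequence;
        } catch (InvalidMidiDataException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Length in ticks: " + build().getTickLength());
    }
}
```

Nothing here describes how the note sounds. Like a drawing instruction in an SVG file, it only says what should happen and when.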

Images are two-dimensional objects, so digitized images consist of rows and columns of elements (pixels), and the position of each element of a drawing is given by a pair of numbers representing its Cartesian coordinates. Sound recordings, on the other hand, are one-dimensional: sound samples are taken at constant time intervals, and MIDI messages are likewise labeled by their position on the time line.

The above similarities have implications that reveal further parallels. An uncompressed digitized image consisting of a single solid color takes the same amount of memory as an image of the same size representing a photograph or a complex composition of multiple colors. Similarly, a recording of silence (for example John Cage's 4'33'') takes the same amount of memory as any symphonic piece of the same duration. On the other hand, a simple vector image takes much less memory than a complex picture of the same dimensions. And a MIDI sequence of a few notes occupies much less memory than a complex sequence of the same duration made up of many notes or other messages.
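The arithmetic behind this is simple enough to sketch. The size of uncompressed data depends only on dimensions and resolution, never on content. The figures below (CD-quality stereo audio, a 1920x1080 image at 24 bits per pixel) are example values I picked, not numbers from any particular file.

```java
class RawSizes {
    /** Bytes needed for an uncompressed bitmap, whatever it depicts. */
    static long imageBytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    /** Bytes needed for uncompressed PCM audio, whatever it contains. */
    static long audioBytes(int sampleRate, int channels, int bytesPerSample, int seconds) {
        return (long) sampleRate * channels * bytesPerSample * seconds;
    }

    public static void main(String[] args) {
        // 4 minutes 33 seconds = 273 seconds of 44.1 kHz, 16-bit stereo audio.
        System.out.println("4'33'' of audio: " + audioBytes(44100, 2, 2, 273) + " bytes");
        System.out.println("1920x1080 image: " + imageBytes(1920, 1080, 3) + " bytes");
    }
}
```

Silence or symphony, solid color or photograph: the inputs to these functions are the same, so the results are the same.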

The problems posed by stretching and shrinking digital images and sounds are also similar. In both cases artifacts are generated, an effect known as 'aliasing', which can be offset to some extent by using 'antialiasing' filters. On the other hand, with vector graphics, as with MIDI sequences, you can easily stretch and shrink dimensions and durations without risking artifacts or any loss of quality.
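As a rough illustration of why the symbolic side stretches so easily: changing the duration of a series of timestamped events is just a multiplication of the timestamps. The sketch below uses a plain array of tick positions, a simplification of mine rather than any real MIDI API.

```java
import java.util.Arrays;

class StretchTicks {
    /** Returns the tick positions scaled by the given factor, rounded to whole ticks. */
    static long[] stretch(long[] ticks, double factor) {
        long[] out = new long[ticks.length];
        for (int i = 0; i < ticks.length; i++) {
            out[i] = Math.round(ticks[i] * factor);
        }
        return out;
    }

    public static void main(String[] args) {
        long[] original = {0, 480, 960};
        System.out.println(Arrays.toString(stretch(original, 1.5)));
    }
}
```

The notes themselves are untouched, so nothing needs to be resampled or filtered. Doing the same to digital audio means inventing or discarding samples.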

Starting from a vector image, a rendering engine is needed to get a digital image that can be displayed on a screen or sent to a printer. In the case of MIDI, a sequencer and a MIDI synthesizer are required to produce digital audio that can be used by an audio interface.

The programs Inkscape and Gimp, used in Linux for creating and editing vector graphics and digital images respectively, are comparable to the Adobe programs Illustrator and Photoshop. They cover different needs and audiences, thriving in different niches. An example of such a niche is architects, who use vector graphics to design and represent buildings with Autocad or similar programs. These are not watertight compartments. Gimp can import vector graphic files, rendering them as bitmaps. Inkscape can also import a bitmap image as a drawing object. In each case, users may choose the best tool for each task.

While it has been easy to list some essential image processing programs for Linux and other systems, doing the same exercise in the field of audio and MIDI is much riskier. The problem is that the way musicians work with computers is not homogeneous: each musician works in a different way. For old-school types, the ideal workflow is to note down musical ideas, develop drafts and refine compositions using tools that work with symbolic elements, producing a paper copy of the score as the final result. Rosegarden could be appropriate at this stage. At the other extreme, there are those who have never in their lives read or written a score, and whose only tools of creation (other than musical instruments) are the mixer and the multi-track recorder. In this case, Ardour could be right.

The two applications mentioned above allow the use of digital audio and MIDI at the same time. As in the world of images, some applications focus on the symbolic representation (MIDI) and others on the final product (digital audio). In each case, the use of the other technology is subordinate. For instance, in Ardour MIDI messages are aligned to the audio samples. An API (Jack MIDI) has even been developed to ensure synchronization of MIDI events with digital audio samples, subordinating MIDI to the rules of digital audio. Obviously this strategy does not fit all scenarios where MIDI is useful.
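To give an idea of what aligning MIDI to audio samples means in practice, here is a sketch of the arithmetic that converts a MIDI tick position into a sample frame number. It is not the Jack MIDI API, only an illustration under my own assumptions: a fixed tempo, and parameter names I made up.

```java
class FrameAlign {
    /**
     * Converts a tick position to the nearest audio frame, assuming a constant tempo.
     * ppq is ticks per quarter note; microsPerQuarter is the tempo (500000 = 120 BPM).
     * The intermediate product can overflow a long for extremely large tick values.
     */
    static long tickToFrame(long tick, int ppq, long microsPerQuarter, int sampleRate) {
        long numerator = tick * microsPerQuarter * sampleRate;
        long denominator = (long) ppq * 1_000_000L;
        return (numerator + denominator / 2) / denominator; // round to nearest frame
    }

    public static void main(String[] args) {
        // One quarter note at 120 BPM is half a second of audio.
        System.out.println(tickToFrame(480, 480, 500_000, 44_100));
    }
}
```

Once every event is pinned to a frame number like this, the MIDI data follows the audio clock, which is exactly the subordination described above.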

As in the imaging world, symbolic representation (MIDI) is probably better suited for design, drafting and composition. By contrast, digital audio is the dominant technology in the studio, at the mixing and production stages, to obtain a finished product.

Wednesday, October 12, 2011

VMPK & FluidSynth 0.1.0 released

VMPK & FluidSynth is a MeeGo Harmattan application for Nokia N9/N950 smartphones. It contains a QML-based VMPK user interface bundled with FluidSynth for sound generation.

You may download it from the OVI Store right now.

Several enhancements have been included since the 0.0.1 beta announced in August.
  • Controllers, Bender, and Velocity values can be optionally controlled by the device's accelerometer.
  • Internationalization. This version includes translations to Spanish, Russian (thanks to Serguey Basalaev) and Czech (thanks to Pavel Fric).
  • Inverted color theme. This dark color combination consumes less power, enabling longer battery life.
  • Latest FluidSynth included.
Sources available at SourceForge.net, as usual.







Sunday, September 11, 2011

SoundFonts want to be free

A "SoundFont" file (suffix .SF2) is a definition of one or several musical instruments, which can be used with synthesizers (hardware or software) to render, or convert, musical notes (e.g. MIDI files, suffix .MID) into digital sound, which can then be used by an audio interface and speakers to play music. Another file format for the same purpose is DownLoadable Sounds (suffix .DLS). Both include sound samples that can be entirely synthetic or digitized from real instruments.

General MIDI is a very popular specification that, among other things, defines a palette of instruments. Instrument #1 is a piano, #41 a violin, #57 a trumpet, #74 a flute... GM SoundFonts offer 128 instruments arranged in that particular order. GS and XG are extensions of this specification.
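One detail worth a small sketch: the GM instrument list is numbered from 1 to 128, but the MIDI program change message carries a value from 0 to 127, so the violin (#41) travels as 40. The helper below uses the standard javax.sound.midi.ShortMessage class. The class and method names are my own.

```java
import javax.sound.midi.InvalidMidiDataException;
import javax.sound.midi.ShortMessage;

class GmProgram {
    /** Builds a program change selecting a GM instrument by its 1-based number. */
    static ShortMessage programChange(int channel, int gmNumber) {
        if (gmNumber < 1 || gmNumber > 128) {
            throw new IllegalArgumentException("GM instrument numbers run from 1 to 128");
        }
        try {
            // The wire value is zero-based; the second data byte is unused here.
            return new ShortMessage(ShortMessage.PROGRAM_CHANGE, channel, gmNumber - 1, 0);
        } catch (InvalidMidiDataException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        ShortMessage violin = programChange(0, 41);
        System.out.println("Violin is sent as program " + violin.getData1());
    }
}
```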

Linux needs GM soundfonts that can be distributed together with GPL programs, much as text rendering applications need typographic fonts. Many Linux distributions include the FluidR3 soundfont in their repositories. It is free and produces good quality sound, but it is not small: more than 140 megabytes. MuseScore distributes the "TimGM6mb" soundfont, which "only" weighs 5.8 megabytes.

There are several software synthesizers that use SoundFonts. The best known are FluidSynth and TiMidity++, both with free licenses. One lesser known, but no less interesting, is Gervill. It is part of OpenJDK, and therefore GPLv2 licensed. It is implemented in Java, of course. I am not very fond of Java, and I do not usually use it except for commercial projects when the customer requires it, but this time I will do it just for fun.

Gervill can use SoundFonts, DLS, or WAV files. A very interesting feature is the so-called "Emergency Soundbank", used when no external SoundFont is available. The instrument definitions in this SoundFont are fully synthetic and follow the GM standard.

This EmergencySoundbank is a Java class and does not reside in a file, so it cannot be used with another synthesizer. However, nothing prevents us from writing a Java program that instantiates the class and stores the instrument definitions in a disk file. How complicated can this program be? Let's see:

$ cat MakeEmergencySoundfont.java
import com.sun.media.sound.*;
public class MakeEmergencySoundfont {
        public static void main(String[] args) throws Exception {
                SF2Soundbank sf2 = EmergencySoundbank.createSoundbank();
                sf2.save("GervillEmergencySoundbank.sf2");
        }
}


7 lines is not much after all. Let's compile it:

$ javac MakeEmergencySoundfont.java
MakeEmergencySoundfont.java:5: warning: com.sun.media.sound.SF2Soundbank is internal proprietary API and may be removed in a future release
                SF2Soundbank sf2 = EmergencySoundbank.createSoundbank();
                ^
MakeEmergencySoundfont.java:5: warning: com.sun.media.sound.EmergencySoundbank is internal proprietary API and may be removed in a future release
                SF2Soundbank sf2 = EmergencySoundbank.createSoundbank();
                                   ^
2 warnings


We have earned two warnings for the naughty boys, because it is ugly to directly use classes in the namespace "com.sun.media.sound.*", and if the Oracle finds out our prank, he could lock the pantry. Here, take another cookie, Neo...

To compile and use this program you only need OpenJDK6 (runtime and compiler). For older Java versions you can get a "gervill.jar" from the project website.

Running the program produces a SoundFont file:


$ java MakeEmergencySoundfont
$ ls GervillEmergencySoundbank.sf2
-rw------- 1 pedro users 1.8M 2011-09-11 12:50 GervillEmergencySoundbank.sf2


The result weighs less than 2 megabytes. Of course the quality of the SoundFont is not high, but it is better than installing nothing at all by default, leaving users wondering what they need in order to hear something from programs that depend on software synthesizers.

It is interesting that a similar technique, also very simple, can be used to produce other SoundFonts based on samples of arbitrary sounds. For more details, see the MakeSoundFont example in Gervill's repository.

Finally, a question remains about what license we can use to distribute the file "GervillEmergencySoundbank.sf2" generated by our program. As a general rule, the output of a GPL program has no restrictions. However, in this case the program output does not come from processing input data, but simply dumps the results of running the algorithms included in the EmergencySoundbank class. To play it safe, we should release it as GPL.

Saturday, August 27, 2011

Presenting VMPK for Nokia N950

I've been playing with my new and sexy Nokia N950 developer device, and here is the fruit: a newborn VMPK. I've just released a beta for testing, usable but not yet optimized. Please, try it. Your feedback will be appreciated.

Download VMPK & FluidSynth for N950 from sourceforge.net

I've learned two lessons from the Symbian port of VMPK published at Nokia's OVI Store. First lesson: people expect that if a program looks like a piano, it should sound like a piano. It doesn't matter if the product description says that it doesn't produce any sound by itself. Dozens of comments on the OVI Store page confirm that there is no hope of users reading the description before downloading a program.

When I was doing some research for the Symbian port, I discovered that every method of producing sound used very large audio buffers, resulting in about one second of latency or more. This is unacceptable for a musical instrument emulation, so network MIDI was the only available option. On the other hand, the Nokia N9xx uses Linux, including ALSA and PulseAudio among other usual infrastructure, so latency is not a problem and FluidSynth is a perfectly sound complementary addition to VMPK.

Second lesson: a user interface that fits well in the desktop version of the program is barely usable in the mobile phone form factor. The solution is to create a new user interface using QML, the new declarative language for Qt user interfaces. The piano keyboard widget was already built around the Qt Graphics View Framework, so it only needed to be wrapped as a QDeclarativeItem subclass to become readily available as a QML object, to be combined with the Qt Quick Components for MeeGo library to build the new user interface. Here are some screenshots.

Main page, common controls are shown.

Main menu, note names option activated.

Preferences page.

About page.