©  Jan Newmarch 2017

Jan Newmarch, Linux Sound Programming, 10.1007/978-1-4842-2496-0_23

23. Karaoke User-Level Tools

Jan Newmarch

(1)Oakleigh, Victoria, Australia

Karaoke is an “audience participation” sound system, in which the soundtrack and usually the melody are played along with a moving display of the lyrics. Within this, there can be variations or different features.

  • The lyrics can be shown all at once, while the music plays in sequence.

  • The lyrics can be highlighted in synchronization with the melody line.

  • The melody line may always play or can be switched off.

  • Some players will also include a vocalist singing the song.

  • Some players with vocals will turn off the vocals when someone is singing.

  • Some players will give a graphical display of the notes of the melody.

  • Some players will give a graphical display of the melody and also show the notes the singers are singing.

  • Some players will produce scores based on some evaluation of the singer’s accuracy. The basis of such scoring is usually not known.

  • Some players allow you to change the playback speed and the playback pitch.

  • Most players will accept two microphones and can add reverb effects to the singers’ voices.

  • Many players will allow you to select songs in advance to build a dynamic playlist.

Karaoke is popular in Asia and has a following in European countries. Karaoke systems are believed to have originated in Asia, although the history according to Wikipedia ( http://en.wikipedia.org/wiki/Karaoke ) is a little muddy.

There are various file formats for karaoke described at www.karawin.fr/defenst.php . This chapter considers the features, formats, and user-level tools for playing karaoke.

Video CD Systems

Video CDs are an older form of video storage on an optical disc. The resolution is fairly low, typically 352×240 pixels at 29.97 frames per second (NTSC) or 352×288 pixels at 25 frames per second (PAL). Although they were used by a few movies, they have been supplanted for movies by DVDs. However, they were used extensively at one stage for karaoke discs.

The cheaper CD/DVD players from Asia often have microphone inputs and can be used as karaoke players with VCD discs. Typically files are simple movies in AVI or MPEG format, so you can just sing along. While the lyrics are usually displayed, highlighted in time to the melody, there are no advanced features such as scoring or a display of the melody.

If you have VCD discs, they can be mounted as ISO 9660 file systems on your computer, but on a Linux system you cannot directly extract the files. Players such as VLC, MPlayer, and Totem can play files from them.

You need to use something like vcdimager to extract files from a VCD disc. This may be in your package system, or you can download it from the GNU developer site ( www.gnu.org/software/vcdimager/ ) and build it from source. The video files can then be extracted as MPEG or AVI files with the following:

          vcdxrip --cdrom-device=/dev/cdrom --rip

(On my system I had to replace /dev/cdrom with /dev/sr1 as I could not extract from the default DVD player. I found out what device it was by running mount and then unmounted it with umount.)

CD+G Discs

According to Wikipedia’s “CD+G” page ( https://en.wikipedia.org/wiki/CD%2BG ), “CD+G (also known as CD+Graphics ) is an extension of the compact disc standard that can present low-resolution graphics alongside the audio data on the disc when played on a compatible device. CD+G discs are often used for karaoke machines, which utilize this functionality to present onscreen lyrics for the song contained on the disc.”

Each song is composed of two files: an audio file and a video file containing the lyrics (and maybe some background scenes).

There are many discs that you can buy using this format. You can’t play them directly on your computer. Rhythmbox will play the audio but not the video. VLC and Totem don’t like them.

Ripping the files onto your computer for storage on your hard disk is not so straightforward. The audio discs do not have a file system in the normal sense. For example, you cannot mount them using the Unix mount command; they are not even in ISO format. Instead, you need to use a program like cdrdao to rip the files to a binary file and then work on that.

 $ cdrdao read-cd --driver generic-mmc-raw --device /dev/cdroms/cdrom0 --read-subchan rw_raw mycd.toc

The previous code creates a data file and a table of contents file.

The format of the CDG files has apparently not been publicly released but is described by Jim Bumgardner (back in 1995!) at “CD+G Revealed: Playing back Karaoke tracks in Software” ( http://jbum.com/cdg_revealed.html ).
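As a rough illustration of what is inside a CDG file, the following Python sketch counts the drawing instructions in the raw data. It assumes the layout given in Bumgardner's description as I understand it: 24-byte packets, 300 packets per second of audio, a command byte whose low six bits equal 9 for CD+G packets, and the instruction codes listed in the table. The function name summarize_cdg is my own and does not come from any existing tool.

```python
from collections import Counter

PACKET_SIZE = 24          # bytes per subcode packet
PACKETS_PER_SECOND = 300  # 75 sectors per second x 4 packets per sector
CDG_COMMAND = 0x09        # low 6 bits of byte 0 mark a CD+G packet

# Instruction codes as listed in "CD+G Revealed" (assumed complete enough
# for a summary; anything else is reported as "unknown").
INSTRUCTIONS = {
    1: "memory preset",
    2: "border preset",
    6: "tile block",
    20: "scroll preset",
    24: "scroll copy",
    28: "define transparent colour",
    30: "load colour table (0-7)",
    31: "load colour table (8-15)",
    38: "tile block (XOR)",
}


def summarize_cdg(data):
    """Return (duration_in_seconds, Counter of instruction names)."""
    counts = Counter()
    n_packets = len(data) // PACKET_SIZE
    for i in range(n_packets):
        packet = data[i * PACKET_SIZE:(i + 1) * PACKET_SIZE]
        # Only the low 6 bits of the first two bytes are significant.
        command = packet[0] & 0x3F
        instruction = packet[1] & 0x3F
        if command != CDG_COMMAND:
            continue  # empty or non-graphics packet
        counts[INSTRUCTIONS.get(instruction, "unknown")] += 1
    return n_packets / PACKETS_PER_SECOND, counts
```

To try it, read a file in binary mode and pass the bytes in, for example summarize_cdg(open("Track1.cdg", "rb").read()). The duration it reports should be close to the length of the matching audio track.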

Programs such as Sound Juicer will extract the audio tracks but leave the video behind.

MP3+G Files

MP3+G files are CD+G files adapted for use on a normal PC. They consist of an MP3 file containing the audio and a CDG file containing the lyrics. Frequently they are zipped together.
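Since the two halves of a song are matched only by file name, it can be useful to check a zipped download before unpacking it. The following is a minimal sketch, assuming that a song is any pair of members sharing a base name with the extensions .mp3 and .cdg; the function names are hypothetical.

```python
import zipfile
from pathlib import PurePosixPath


def pair_names(names):
    """Return sorted base names that occur with both .mp3 and .cdg."""
    by_ext = {".mp3": set(), ".cdg": set()}
    for name in names:
        path = PurePosixPath(name)
        ext = path.suffix.lower()
        if ext in by_ext:
            # Strip the extension so that both halves compare equal.
            by_ext[ext].add(str(path.with_suffix("")))
    return sorted(by_ext[".mp3"] & by_ext[".cdg"])


def find_mp3g_pairs(zip_path):
    """List the complete MP3+G songs inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return pair_names(archive.namelist())
```

A member that has only one of the two extensions is silently ignored, so an empty result means the archive is not a usable MP3+G package under this assumption.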

Many sites selling CD+G files also sell MP3+G files. Various sites give instructions on how to create your own MP3+G files. There are not many free sites.

The program cdgrip.py from cdgtools-0.3.2 can rip CD+G files from an audio disc and convert them to a pair of MP3+G files. The instructions from the (Python) source code are as follows:

# To start using cdgrip immediately, try the following from the
# command-line (replacing the --device option by the path to your
# CD device):
#
#  $ cdrdao read-cd --driver generic-mmc-raw --device /dev/cdroms/cdrom0 --read-subchan rw_raw mycd.toc
#  $ python cdgrip.py --with-cddb --delete-bin-toc mycd.toc
#
# You may need to use a different --driver option or --read-subchan mode
# to cdrdao depending on your CD device. For more in depth details, see
# the usage instructions below.

Buying CD+G or MP3+G Files

There are many sites selling CD+G and MP3+G songs. Just do a Google search. However, the average price per song is about $3, and if you want to build up a large collection, that can become expensive. Some sites will give discounts for larger volume purchases, but even at $30 for 100 songs, the expense can be high.

Sites with very large collections come and go. At the time of writing, aceume.com offers 14,000 English songs for US$399. But you could buy their AK3C Android All-in-one Cloud Karaoke Player with 21,000 English songs and 35,000 Chinese songs included for US$600. That makes the economics of building your own karaoke player shakier. I will ignore that issue here; it’s your choice!

Converting MP3+G to Video Files

The tool ffmpeg can merge the audio and video to a single video file with the following, for example:

ffmpeg -i Track1.cdg -i Track1.mp3 -y Track1.avi

Alternatively, with avconv (from Libav, a fork of FFmpeg), use the following to create an AVI file containing both video and audio and then convert it to an MP4 file:

avconv -i Track1.cdg -i Track1.mp3 test.avi
avconv -i test.avi -c:v libx264 -c:a copy outputfile.mp4

This can be played by VLC, MPlayer, Rhythmbox, and so on.
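For a whole directory of MP3+G pairs, the ffmpeg command shown earlier can be generated for each song. The following Python sketch only builds the argument lists; it assumes each .cdg file sits next to an .mp3 file with the same base name, and the function names are hypothetical.

```python
from pathlib import Path


def ffmpeg_command(cdg_path, out_ext=".avi"):
    """Build the ffmpeg argument list for one .cdg/.mp3 pair."""
    cdg = Path(cdg_path)
    mp3 = cdg.with_suffix(".mp3")   # audio is assumed to share the base name
    out = cdg.with_suffix(out_ext)
    return ["ffmpeg", "-i", str(cdg), "-i", str(mp3), "-y", str(out)]


def commands_for_directory(directory):
    """Return one command per .cdg file that has a matching .mp3 file."""
    commands = []
    for cdg in sorted(Path(directory).glob("*.cdg")):
        if cdg.with_suffix(".mp3").exists():
            commands.append(ffmpeg_command(cdg))
    return commands
```

Each list can be passed to subprocess.run(command, check=True) to perform the conversion. Passing a list rather than a single string avoids shell quoting problems with song titles that contain spaces or apostrophes.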

There is a program called cdg2video. It was last updated in February 2011, and changes in the FFmpeg internals mean that it no longer compiles. Even if you fix the obvious changes, there are a huge number of complaints from the C compiler about the use of deprecated FFmpeg functions.

MPEG-4 Files

It is becoming common to have karaoke systems using MPEG-4 video players. These embed all of the information into a video. There is no scoring system with players of these files.

Some rate them as much higher sound quality; see http://boards.straightdope.com/sdmb/showthread.php?t=83441 , for example. I suggest it is more an issue with the synthesizer used than the format. Certainly high-end synthesizer manufacturers such as Yamaha would not agree!

MPEG-4 files are certainly larger than the corresponding MIDI files, and you will need a substantial disk to hold many of them.

There are many sites selling MP4 songs. Just do a Google search. However, the average price per song is about $3, and if you want to build up a large collection, that can become very expensive.

At the time of writing, there doesn’t seem to be a site selling large volumes of MPEG-4 songs. However, there have been in the past and may be in the future.

Karaoke Machines

There are many karaoke machines that come with a DVD. In most cases, the songs are stored as MIDI files, with the song track in one MIDI file and the lyrics in another. Some more recent systems will use WMA files for the soundtrack, and this allows one track to have a vocal supplied and the other without the vocal. Such systems will usually include a scoring mechanism, although the basis for the scoring is not made explicit. The most recent ones are hard-disk-based, usually with MP4 files. They do not seem to have a scoring system. The suppliers of these systems change regularly, even if the systems themselves are only re-badged. I own systems by Malata and Sonken, but they were purchased many years ago. I'm not convinced that more recent models are necessarily improvements.

The two systems I own show different characteristics. The Sonken MD-388 (see footnote 1) plays songs from multiple languages, such as Chinese, Korean, English, and so on. My wife is Chinese, but I cannot read Chinese characters. There is an Anglicized script called PinYin, and the Sonken shows both the Chinese characters and the PinYin, so I can sing along too. It looks like Figure 23-1.

Figure 23-1. Screen dump of Sonken player

The Malata MDVD-6619 (see footnote 2) does not show the PinYin when playing Chinese songs. But it does show the notes you are supposed to be singing and the notes you are actually singing. Figure 23-2 shows that I am way off-key.

Figure 23-2. Screen dump of Malata player

MIDI Players

Karaoke files in MIDI format, usually with the extension .kar, can be found on several sites. Any MIDI player such as TiMidity can play such files. However, such players do not always show the lyrics synchronized to the melody.

Finding MIDI Files

There are several sites on the Web offering files in MIDI format.

KAR File Format

There is no formal standard for karaoke MIDI files. There is, however, a widely accepted industry format called the MIDI Karaoke Type 1 file format.

The following is from MIDI karaoke FAQ ( http://gnese.free.fr/Projects/KaraokeTime/Fichiers/karfaq.html ):

  • What is the MIDI Karaoke Type 1 (.KAR) file format? A MIDI karaoke file is a standard MIDI file type 1 that contains a separate track with lyrics of the song entered as text events. Load one of the MIDI karaoke files into a sequencer to examine the contents of the tracks of the file. The first track contains text events that are used to make the file recognizable as the MIDI karaoke file. The @KMIDI KARAOKE FILE text event is used for that purpose. The optional text event @V0100 denotes the format version number. Anything starting with @I is any information you want to include in the file.

  • The second track contains the text meta events for the lyrics of the song. The first event is @LENGL. It identifies the language of the song, in this case, English. The next couple of events start with @T, which identifies the title of the songs. You can have up to three events like these. The first event should contain the title of the song. Some programs (such as Soft Karaoke) read this event to get the name of the song to be displayed in the File Open dialog box. The second event usually contains the performer or author of the song. The third event can contain any copyright information or anything else.

  • The rest of the second track contains the words of the song. Each event is the syllable that is supposed to be sung at the time of the event. If the text starts with \, it means to clear the screen and show the words at the top of the screen. If the text starts with /, it means to go to the next line.

  • Important note: There can be only three lines per screen in a .kar file for Soft Karaoke to play the file correctly. In other words, there can be only two forward slashes beginning each line in a line of lyrics. The next line has to start with a back slash.
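The conventions quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text events are taken as a list of already decoded strings in file order (extracting them from the MIDI file is left to a MIDI library), event timing is ignored, and the function name parse_kar_text_events is hypothetical.

```python
def parse_kar_text_events(events):
    """Split .kar text events into header fields and lyric screens.

    Returns (info, screens). info is a dict of header fields; screens is a
    list of screens, each of which is a list of lines of text.
    """
    info = {"is_kar": False, "version": None, "language": None,
            "titles": [], "notes": []}
    screens = []
    for text in events:
        if text.startswith("@"):
            tag, value = text[1:2], text[2:]
            if tag == "K":            # @KMIDI KARAOKE FILE
                info["is_kar"] = True
            elif tag == "V":          # @V0100: format version
                info["version"] = value
            elif tag == "L":          # @LENGL: language
                info["language"] = value
            elif tag == "T":          # @T: title, performer, copyright
                info["titles"].append(value)
            elif tag == "I":          # @I: free-form information
                info["notes"].append(value)
            continue
        if text.startswith("\\"):     # backslash: clear screen, start at top
            screens.append([""])
            text = text[1:]
        elif text.startswith("/"):    # slash: next line on the same screen
            if not screens:
                screens.append([])
            screens[-1].append("")
            text = text[1:]
        elif not screens:             # lyrics that begin without a marker
            screens.append([""])
        screens[-1][-1] += text       # append the syllable to the current line
    return info, screens
```

A real player would also keep the time of each event so that syllables can be highlighted as they are sung; that is left out here to keep the structure of the format visible.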

There are several weaknesses in this format, listed here:

  • Only the code for English is given; the list of other possible languages is not specified.

  • The encoding of text is not specified (for example, Unicode UTF-8).

  • There is no means of identifying the channel carrying the melody.

PyKaraoke

PyKaraoke is a dedicated karaoke player written in Python, using a variety of libraries such as Pygame and WxPython. It plays the song and shows where in the lyrics you are. A screen dump of “Smoke Gets in Your Eyes” ( www.midikaraoke.com/cgi-bin/songdir/jump.cgi?ID=1280 ) looks like Figure 23-3.

Figure 23-3. Screen dump of PyKaraoke

PyKaraoke plays the soundtrack and displays the lyrics. It does not act as a proper karaoke system by also playing the singer’s input. But PyKaraoke uses the PulseAudio system, so you can simultaneously play other programs. In particular, you can have PyKaraoke running in one window, while pa-mic-2-speaker is running in another. PulseAudio will mix the two output streams and play both sources together. Of course, there will be no scoring possible in such a system without extra work.

kmid

kmid is a KDE-based karaoke player (see footnote 3). It plays the song and shows where in the lyrics you are. A screen dump of “Smoke Gets in Your Eyes” looks like Figure 23-4.

Figure 23-4. Screen dump of kmid

kmid uses either TiMidity or FluidSynth as a MIDI back end.

kmid plays the soundtrack and displays the lyrics. It does not act as a proper karaoke system by also playing the singer’s input. But kmid can use the PulseAudio system, so you can simultaneously play other programs. In particular, you can have kmid running in one window, while pa-mic-2-speaker is running in another. PulseAudio will mix the two output streams and play both sources together. Of course, there will be no scoring possible in such a system without extra work.

Microphone Inputs and Reverb Effects

Nearly all PCs and laptops have a sound card to play audio. While nearly all of these also have a microphone input, some do not. For example, my Dell laptop does not, the Raspberry Pi does not, and many Android TV media boxes do not.

Those computers without microphone inputs often have USB ports. They will usually accept USB sound cards, and if the USB sound card has a microphone input, then that input is recognized.

If you want to support two or more microphones, then you will need the corresponding number of sound cards or a mixer device. I have seen the Behringer MX-400 MicroMix, a four-channel compact low-noise mixer, for $20, or you can find circuit diagrams on electronics sites (search for “circuit diagram for audio mixer”).

Reverb is an effect that gives a fuller “body” to the voice by adding (artificial) echoes with different delays. Behringer also makes the MIX800 MiniMix, which can mix two microphones with reverb effects and also has a pass-through for line input (so you can play the music and control the microphones). (I have no links to Behringer.) A similar unit is the UNIFY K9 Reverb Computer Karaoke Mixer.

DVD players from China often have dual microphone inputs with mix and reverb capabilities. Given that they can cost as little as $13 (admittedly, for 1,000 units), mixing and reverb should not be too costly. My guess is that they use something like the Mitsubishi M65845AFP ( www.datasheetcatalog.org/datasheet/MitsubishiElectricCorporation/mXuuvys.pdf ), “DIGITAL ECHO WITH MICROPHONE MIXING CIRCUIT.” The data sheet shows a number of possible configurations, for those who like to build their own.
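The idea of reverb as added echoes with different delays can be shown in software with a very small Python sketch. It mixes delayed, attenuated copies of a signal back into the original. This is a plain multi-tap echo on a list of samples, not the algorithm used by any of the devices mentioned, and the function name and the tap values in the usage note are illustrative assumptions only.

```python
def add_echoes(samples, sample_rate, taps):
    """Mix delayed, attenuated copies of `samples` into the original.

    taps is a list of (delay_in_seconds, gain) pairs. The result is longer
    than the input by the longest delay so that the echo tail is kept.
    """
    offsets = [int(round(delay * sample_rate)) for delay, _ in taps]
    out = list(samples) + [0.0] * max(offsets, default=0)
    for offset, (_, gain) in zip(offsets, taps):
        for i, sample in enumerate(samples):
            out[offset + i] += gain * sample
    return out
```

For a voice sampled at 44,100 Hz, a call such as add_echoes(voice, 44100, [(0.03, 0.5), (0.07, 0.3), (0.11, 0.2)]) would add three echoes of decreasing loudness. Real reverb units use many more echoes and feed them back on themselves, which is why dedicated hardware or a DSP library is normally used instead.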

Conclusion

There are a variety of karaoke systems, using VCD discs or dedicated systems. MIDI format karaoke files can be played using ordinary MIDI software, and there are a couple of Linux karaoke players.

Footnotes

1 This is no longer sold by Sonken. However, there are similar models sold under different brand names.

2 Newer models are sold in China but currently with a very limited English repertoire.

3 kmid seems to have disappeared from current KDE versions. This is a real shame since it was very good.
