©  Jan Newmarch 2017

Jan Newmarch, Linux Sound Programming, 10.1007/978-1-4842-2496-0_12

12. FFmpeg/Libav

Jan Newmarch

(1) Oakleigh, Victoria, Australia

According to “A FFmpeg Tutorial for Beginners” ( http://keycorner.org/pub/text/doc/ffmpegtutorial.htm ), FFmpeg is a complete, cross-platform command-line tool capable of recording, converting, and streaming digital audio and video in various formats. It can be used to do most multimedia tasks quickly and easily, such as audio compression, audio/video format conversion, extracting images from a video, and more.

FFmpeg consists of a set of command-line tools and a set of libraries that can be used for transforming audio (and video) files from one format to another. It can work both on containers and on codecs. It is not designed for playing or recording audio; it’s more a general-purpose conversion tool.

Resources

The FFmpeg/Libav Controversy

FFmpeg was started in 2000 to provide libraries and programs for handling multimedia data. However, over the years there were a number of disputes between the developers, leading to a fork in 2011 to the Libav project. The two projects have continued since then, pretty much in parallel and often borrowing from each other. However, the situation has remained acrimonious, and there appears little possibility of resolution.

This is unfortunate for developers. While programs are generally portable between the two systems, there are sometimes differences in the APIs and in behavior. There is also the issue of distro support. For many years Debian and derivatives supported only Libav and omitted FFmpeg. This has changed, and both are supported now. See “Why Debian returned to FFmpeg” ( https://lwn.net/Articles/650816/ ) for a discussion of some of the issues.

FFmpeg Command-Line Tools

The principal FFmpeg tool is ffmpeg itself. The simplest use is as a converter from one format to another, as follows:

        ffmpeg -i file.ogg file.mp3

This will convert an Ogg container of Vorbis codec data to an MP3 file, that is, an MPEG audio stream of MP3 (MPEG-1 Audio Layer III) codec data.

The Libav equivalent is avconv, which runs similarly.

      avconv -i file.ogg file.mp3

Internally, ffmpeg uses a pipeline of modules, as in Figure 12-1.

Figure 12-1. FFmpeg/Libav pipeline (Source: http://ffmpeg.org/ffmpeg.html )

The muxer/demuxer and decoder/encoder can all be set using options if the defaults are not appropriate.

The following are other commands:

  • ffprobe gives information about a file.

  • ffplay is a simple media player.

  • ffserver is a media server.

Programming

There are a number of libraries that can be used for FFmpeg/Libav programming. Libav builds the following libraries:

  • libavcodec

  • libavdevice

  • libavfilter

  • libavformat

  • libavresample

  • libavutil

FFmpeg builds the following:

  • libavcodec

  • libavdevice

  • libavfilter

  • libavformat

  • libavresample

  • libavutil

  • libpostproc

  • libswresample

  • libswscale

The extra libraries in FFmpeg are for video postprocessing (libpostproc), audio resampling (libswresample), and video scaling (libswscale).

Using either of these systems is not a straightforward process. The Libav site states, “Libav has always been a very experimental and developer-driven project. It is a key component in many multimedia projects and has new features added constantly. To provide a stable foundation, major releases are cut every four to six months and maintained across at least two years.”

The FFmpeg site states, “FFmpeg has always been a very experimental and developer-driven project. It is a key component in many multimedia projects and has new features added constantly. Development branch snapshots work really well 99 percent of the time, so people are not afraid to use them.”

My experience has been that this “experimental” nature of both projects has led to an unstable core API, in which key functions are regularly made obsolete and replaced. For example, the function avcodec_decode_audio has reached its fourth revision, avcodec_decode_audio4, in libavcodec version 56. Even that revision is now deprecated in the upstream versions of FFmpeg and Libav (version 57) in favor of functions such as avcodec_send_packet that do not exist in version 56. On top of this, there are two projects with the same goals whose APIs are generally, but not always, identical. For example, FFmpeg has swr_alloc_set_opts, while Libav uses av_opt_set_int. In addition, the audiovisual codecs and containers themselves are continually evolving.

The result of this is that many example programs on the Internet no longer compile, use deprecated APIs, or belong to the “other” system. This is not to detract from two systems with superb achievements, but I just wish it wasn’t so messy.

Decoding an MP3 File

The following program decodes an MP3 file into a raw PCM file. This is about as simple a task as one can do with FFmpeg/Libav, but it is unfortunately not straightforward. First, you have to note that you want to deal with a codec, not a file containing a codec. This is not an FFmpeg/Libav issue but a general one.

Files with the extension .mpg or .mp3 may contain a number of different formats. If I run the command file on a number of files that I have, I get different results.

BST.mp3: MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, Stereo
Beethoven_Fr_Elise.mp3: MPEG ADTS, layer III, v1, 128 kbps, 44.1 kHz, Stereo
Angel-no-vocal.mp3: Audio file with ID3 version 2.3.0
01DooWackaDoo.mp3: Audio file with ID3 version 2.3.0,
    contains: MPEG ADTS, layer III, v1, 224 kbps, 44.1 kHz, JntStereo

The first two files just contain a single codec and can be managed by the following program. The third and fourth files are container files, containing MPEG+ID3 data. These need to be managed using the avformat functions such as av_read_frame [1].
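The distinction that the file command draws can be reproduced by looking at the first few bytes of a file. The following is a minimal sketch in plain C, not part of FFmpeg/Libav: the function names id3v2_tag_size and starts_with_mpeg_sync are hypothetical, and the layout assumed is the standard ID3v2 header (the three bytes "ID3", two version bytes, a flags byte, and a four-byte "syncsafe" size) and the 11-bit MPEG audio frame sync.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the total size in bytes of an ID3v2 tag
 * at the start of buf, or 0 if buf does not begin with one. */
size_t id3v2_tag_size(const uint8_t *buf, size_t len)
{
    if (len < 10 || memcmp(buf, "ID3", 3) != 0)
        return 0;
    /* The four size bytes are "syncsafe": 7 useful bits each, top bit clear */
    if ((buf[6] | buf[7] | buf[8] | buf[9]) & 0x80)
        return 0;
    size_t body = ((size_t)buf[6] << 21) | ((size_t)buf[7] << 14) |
                  ((size_t)buf[8] << 7)  |  (size_t)buf[9];
    /* Flag bit 0x10 signals an optional 10-byte footer (ID3v2.4) */
    size_t footer = (buf[5] & 0x10) ? 10 : 0;
    return 10 + body + footer;   /* 10-byte header + body + footer */
}

/* Hypothetical helper: return 1 if buf starts with an MPEG audio
 * frame sync, that is, 11 consecutive set bits. */
int starts_with_mpeg_sync(const uint8_t *buf, size_t len)
{
    return len >= 2 && buf[0] == 0xFF && (buf[1] & 0xE0) == 0xE0;
}
```

A file for which id3v2_tag_size returns a nonzero value corresponds to the "Audio file with ID3" cases above, and the audio frames start only after that many bytes. A file that passes starts_with_mpeg_sync at offset zero corresponds to the bare "MPEG ADTS" cases.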

The program is basically a standard example in the FFmpeg/Libav source distributions. It is based on ffmpeg-3.2/doc/examples/decoding_encoding.c in the FFmpeg source and on libav-12/doc/examples/avcodec.c in the Libav source. Note in passing that both programs use avcodec_decode_audio4, which is deprecated in both of these upstream versions, and that neither has an example of the replacement function avcodec_send_packet.

A more serious issue is that, increasingly, MP3 files are decoded into a planar sample format, in which each channel is stored in a separate plane. The FFmpeg/Libav function avcodec_decode_audio4 handles this correctly by placing each plane in a separate data array, but when the audio is output as PCM data, the planes have to be interleaved. This is not done in the examples and may result in incorrect PCM data (lots of clicking noises, followed by half-speed audio).
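The interleaving step can be shown in isolation. The following is a minimal sketch in plain C that does not depend on FFmpeg/Libav; the function name interleave_planes and its parameters are hypothetical. It assumes that every plane holds the same number of samples and that all samples have the same size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: interleave nb_planes planar buffers, each holding
 * nb_samples samples of sample_size bytes, into out.
 * out must have room for nb_planes * nb_samples * sample_size bytes. */
void interleave_planes(uint8_t *out, const uint8_t *const *planes,
                       int nb_planes, int nb_samples, int sample_size)
{
    for (int n = 0; n < nb_samples; n++) {
        /* For sample n, emit one sample from each plane in turn */
        for (int m = 0; m < nb_planes; m++) {
            memcpy(out, planes[m] + (size_t)n * sample_size, sample_size);
            out += sample_size;
        }
    }
}
```

For stereo 16-bit data with planes L0 L1 L2 and R0 R1 R2, the output is L0 R0 L1 R1 L2 R2, which is the layout expected of a raw interleaved PCM file.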

The relevant FFmpeg functions are as follows:

  • av_register_all: Register all the possible muxers, demuxers, and protocols.

  • avformat_open_input: Open the input stream.

  • avformat_find_stream_info: Extract stream information.

  • av_init_packet: Set default values in a packet.

  • avcodec_find_decoder: Find a suitable decoder.

  • avcodec_alloc_context3: Allocate the codec context, the primary data structure, and set its default values.

  • avcodec_open2: Open the decoder.

  • fread: The FFmpeg processing loop reads a buffer at a time from the data stream.

  • avcodec_decode_audio4: This decodes audio frames into raw audio data.

The rest of the code interleaves the data streams to output to a PCM file. The resultant file can be played with the following:

      aplay -c 2 -r 44100 /tmp/test.sw -f S16_LE
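As a rough sanity check on these aplay options, the expected playing time of a raw PCM file can be worked out from its size. The sketch below is plain C with a hypothetical function name, pcm_duration_seconds, and assumes interleaved samples of fixed size (2 bytes for S16_LE).

```c
#include <stdint.h>

/* Hypothetical helper: playing time in seconds of nbytes of raw
 * interleaved PCM data, or 0.0 if any parameter is zero. */
double pcm_duration_seconds(uint64_t nbytes, int channels,
                            int bytes_per_sample, int sample_rate)
{
    uint64_t bytes_per_second =
        (uint64_t)channels * (uint64_t)bytes_per_sample * (uint64_t)sample_rate;
    if (bytes_per_second == 0)
        return 0.0;
    return (double)nbytes / (double)bytes_per_second;
}
```

With 2 channels, 2 bytes per sample, and 44,100 samples per second, each second of audio takes 176,400 bytes. If the duration computed this way differs markedly from that of the original MP3 file, the channel count, sample size, or interleaving is probably wrong.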

The program is as follows:

/*
 * copyright (c) 2001 Fabrice Bellard
 *
 * This file is part of Libav.
 *
 * Libav is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * Libav is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with Libav; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */


// From http://code.haskell.org/~thielema/audiovideo-example/cbits/
// Adapted to version 2.8.6-1ubuntu2 by Jan Newmarch


/**
 * @file
 * libavcodec API use example.
 *
 * @example libavcodec/api-example.c
 * Note that this library only handles codecs (mpeg, mpeg4, etc...),
 * not file formats (avi, vob, etc...). See library 'libavformat' for the
 * format handling
 */


#include <stdlib.h>
#include <stdio.h>
#include <string.h>


#ifdef HAVE_AV_CONFIG_H
#undef HAVE_AV_CONFIG_H
#endif


#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>


#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096


void die(char *s) {
    fputs(s, stderr);
    exit(1);
}


/*
 * Audio decoding.
 */
static void audio_decode_example(AVFormatContext* container,
                                 const char *outfilename, const char *filename)
{
    AVCodec *codec;
    AVCodecContext *context = NULL;
    int len;
    FILE *f, *outfile;
    uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];
    AVPacket avpkt;
    AVFrame *decoded_frame = NULL;
    int num_streams = 0;
    int sample_size = 0;


    av_init_packet(&avpkt);

    printf("Audio decoding\n");

    int stream_id = -1;

    // Find the first audio stream. This process may not be necessary
    // if you can guarantee that the container contains only the desired
    // audio stream
    unsigned int i;
    for (i = 0; i < container->nb_streams; i++) {
        if (container->streams[i]->codec->codec_type == AVMEDIA_TYPE_AUDIO) {
            stream_id = i;
            break;
        }
    }
    // Without this check, streams[-1] would be read below
    if (stream_id < 0) {
        fprintf(stderr, "no audio stream found\n");
        exit(1);
    }


    /* find the appropriate audio decoder */
    AVCodecContext* codec_context = container->streams[stream_id]->codec;
    codec = avcodec_find_decoder(codec_context->codec_id);
    if (!codec) {
        fprintf(stderr, "codec not found\n");
        exit(1);
    }


    context = avcodec_alloc_context3(codec);
    if (!context) {
        fprintf(stderr, "could not allocate codec context\n");
        exit(1);
    }

    /* open it */
    if (avcodec_open2(context, codec, NULL) < 0) {
        fprintf(stderr, "could not open codec\n");
        exit(1);
    }


    f = fopen(filename, "rb");
    if (!f) {
        fprintf(stderr, "could not open %s\n", filename);
        exit(1);
    }
    outfile = fopen(outfilename, "wb");
    if (!outfile) {
        fprintf(stderr, "could not open %s\n", outfilename);
        av_free(context);
        exit(1);
    }


    /* decode until eof */
    avpkt.data = inbuf;
    avpkt.size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);


    while (avpkt.size > 0) {
        int got_frame = 0;


        if (!decoded_frame) {
            if (!(decoded_frame = av_frame_alloc())) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
        } else {
            av_frame_unref(decoded_frame);
        }
        printf("Stream idx %d\n", avpkt.stream_index);


        len = avcodec_decode_audio4(context, decoded_frame, &got_frame, &avpkt);
        if (len < 0) {
            fprintf(stderr, "Error while decoding\n");
            exit(1);
        }
        if (got_frame) {
            printf("Decoded frame nb_samples %d, format %d\n",
                   decoded_frame->nb_samples,
                   decoded_frame->format);
            if (decoded_frame->data[1] != NULL)
                printf("Data[1] not null\n");
            else
                printf("Data[1] is null\n");
            /* if a frame has been decoded, output it */
            int data_size = av_samples_get_buffer_size(NULL, context->channels,
                                                       decoded_frame->nb_samples,
                                                       context->sample_fmt, 1);
            // first time: count the number of planar streams
            if (num_streams == 0) {
                while (num_streams < AV_NUM_DATA_POINTERS &&
                       decoded_frame->data[num_streams] != NULL)
                    num_streams++;
                printf("Number of streams %d\n", num_streams);
            }


            // first time: set sample_size from 0 to e.g. 2 for 16-bit data
            if (sample_size == 0) {
                sample_size =
                    data_size / (num_streams * decoded_frame->nb_samples);
            }


            int m, n;
            for (n = 0; n < decoded_frame->nb_samples; n++) {
                // interleave the samples from the planar streams
                for (m = 0; m < num_streams; m++) {
                    fwrite(&decoded_frame->data[m][n*sample_size],
                           1, sample_size, outfile);
                }
            }
        }
        avpkt.size -= len;
        avpkt.data += len;
        if (avpkt.size < AUDIO_REFILL_THRESH) {
            /* Refill the input buffer, to avoid trying to decode
             * incomplete frames. Instead of this, one could also use
             * a parser, or use a proper container format through
             * libavformat. */
            memmove(inbuf, avpkt.data, avpkt.size);
            avpkt.data = inbuf;
            len = fread(avpkt.data + avpkt.size, 1,
                        AUDIO_INBUF_SIZE - avpkt.size, f);
            if (len > 0)
                avpkt.size += len;
        }
    }


    fclose(outfile);
    fclose(f);


    avcodec_close(context);
    av_free(context);
    /* a frame from av_frame_alloc must be released with av_frame_free */
    av_frame_free(&decoded_frame);
}


int main(int argc, char **argv)
{
    const char *filename = "Beethoven_Fr_Elise.mp3";
    AVFormatContext *pFormatCtx = NULL;


    if (argc == 2) {
        filename = argv[1];
    }


    // Register all formats and codecs
    av_register_all();
    if (avformat_open_input(&pFormatCtx, filename, NULL, NULL) != 0) {
        fprintf(stderr, "Can't get format of file %s\n", filename);
        return -1; // Couldn't open file
    }
    // Retrieve stream information
    if (avformat_find_stream_info(pFormatCtx, NULL) < 0)
        return -1; // Couldn't find stream information
    av_dump_format(pFormatCtx, 0, filename, 0);
    printf("Num streams %u\n", pFormatCtx->nb_streams);
    // The type of bit_rate differs between library versions, so cast it
    printf("Bit rate %lld\n", (long long) pFormatCtx->bit_rate);
    audio_decode_example(pFormatCtx, "/tmp/test.sw", filename);
    avformat_close_input(&pFormatCtx);


    return 0;
}
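The buffer handling in the decoding loop (consume what the decoder used, then top up the buffer when it runs low) can be separated from the FFmpeg/Libav calls. The following is a minimal sketch of that sliding-window technique in plain C. The struct window and the functions window_consume and window_refill are hypothetical names, not part of either library, and the sketch assumes that the data pointer always lies inside the backing buffer.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sliding input window over a fixed backing buffer */
struct window {
    uint8_t *buf;   /* start of the backing buffer */
    size_t   cap;   /* capacity of buf in bytes */
    uint8_t *data;  /* first unconsumed byte, inside buf */
    size_t   size;  /* number of unconsumed bytes */
};

/* Drop 'used' bytes from the front, as the loop does with len */
void window_consume(struct window *w, size_t used)
{
    if (used > w->size)
        used = w->size;
    w->data += used;
    w->size -= used;
}

/* If fewer than 'thresh' bytes remain, move them to the start of the
 * buffer and read more from f. Returns the number of bytes read. */
size_t window_refill(struct window *w, size_t thresh, FILE *f)
{
    if (w->size >= thresh)
        return 0;
    /* memmove, not memcpy: source and destination may overlap */
    memmove(w->buf, w->data, w->size);
    w->data = w->buf;
    size_t got = fread(w->buf + w->size, 1, w->cap - w->size, f);
    w->size += got;
    return got;
}
```

In the program above, inbuf plays the role of buf, avpkt.data and avpkt.size play the roles of data and size, and AUDIO_REFILL_THRESH is the threshold. Keeping at least a threshold's worth of data in the buffer reduces the chance of handing the decoder an incomplete frame.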

Conclusion

This chapter briefly considered FFmpeg/Libav, looking at the libavcodec library. There is considerably more complexity to FFmpeg and Libav, and they can do far more complex transformations. In addition, they can do video processing, which is illustrated in Chapter 15.

Footnotes

1 Examples of av_read_frame are given in Chapters 15 and 21.
