© Jan Newmarch 2017

Jan Newmarch, Raspberry Pi GPU Audio Video Programming , 10.1007/978-1-4842-2472-4_14

14. OpenMAX Audio on the Raspberry Pi

Jan Newmarch

(1)Oakleigh, Victoria, Australia

This chapter looks at audio processing on the Raspberry Pi. The OpenMAX support for audio is weaker than for video: there is no satisfactory decode component, so you have to resort to FFmpeg to decode audio before rendering it.

Building Programs

Some programs in this chapter use libraries from the LibAV project, so development files from LibAV need to be installed.

sudo apt-get install libavcodec-dev
sudo apt-get install libavformat-dev

You can then build the programs in this chapter using the Makefile, which adds in the LibAV libraries.

DMX_INC =  -I/opt/vc/include/ -I /opt/vc/include/interface/vmcs_host/ -I/opt/vc/include/interface/vcos/pthreads -I/opt/vc/include/interface/vmcs_host/linux
EGL_INC =
OMX_INC =  -I /opt/vc/include/IL
OMX_ILCLIENT_INC = -I/opt/vc/src/hello_pi/libs/ilclient
INCLUDES = $(DMX_INC) $(EGL_INC) $(OMX_INC) $(OMX_ILCLIENT_INC)


CFLAGS=-g -DRASPBERRY_PI -DOMX_SKIP64BIT $(INCLUDES)
CPPFLAGS =


DMX_LIBS =  -L/opt/vc/lib/ -lbcm_host -lvcos -lvchiq_arm -lpthread
EGL_LIBS = -L/opt/vc/lib/ -lEGL -lGLESv2
OMX_LIBS = -lopenmaxil
OMX_ILCLIENT_LIBS = -L/opt/vc/src/hello_pi/libs/ilclient -lilclient
AV_LIBS =  $(shell pkg-config --libs libavcodec libavformat libavutil)


LDLIBS =  $(DMX_LIBS) $(EGL_LIBS) $(OMX_LIBS) $(OMX_ILCLIENT_LIBS) $(AV_LIBS)

all:  api-example il_render_audio \
      il_ffmpeg_render_audio il_test_audio_encodings \
      il_ffmpeg_render_resample_audio

Audio Components

The Raspberry Pi has a number of OpenMAX components specifically for audio processing.

  • audio_capture

  • audio_decode

  • audio_encode

  • audio_lowpower

  • audio_mixer

  • audio_processor

  • audio_render

  • audio_splitter

Audio Formats

OpenMAX has a number of data structures used to get and set information about components. You have seen some of these before.

  • OMX_PORT_PARAM_TYPE: This is used with the index parameter OMX_IndexParamAudioInit in a call to OMX_GetParameter. This gives the number of audio ports and the port number of the first port.

  • OMX_PARAM_PORTDEFINITIONTYPE: This is used with the index parameter OMX_IndexParamPortDefinition to give, for each port, the number of buffers, the size of each buffer, and the direction (input or output) of the port.

  • OMX_AUDIO_PORTDEFINITIONTYPE: This is the audio-specific field of the OMX_PARAM_PORTDEFINITIONTYPE for audio ports.

  • OMX_AUDIO_PARAM_PORTFORMATTYPE: This is used to get information about the different formats supported by each port.

Some of these were discussed in Chapter X.

You haven’t looked at the field OMX_AUDIO_PORTDEFINITIONTYPE, which is part of the port definition information. It contains the following relevant fields:

typedef struct OMX_AUDIO_PORTDEFINITIONTYPE {
    OMX_STRING cMIMEType;
    OMX_NATIVE_DEVICETYPE pNativeRender;
    OMX_BOOL bFlagErrorConcealment;
    OMX_AUDIO_CODINGTYPE eEncoding;
} OMX_AUDIO_PORTDEFINITIONTYPE;

The last two fields hold the current values set for the port. The possible values of the encoding are obtained from the next structure, OMX_AUDIO_PARAM_PORTFORMATTYPE, so I will discuss them there. The major field you get here is the audio encoding.

OMX_AUDIO_PARAM_PORTFORMATTYPE is defined (in section 4.1.16 in the 1.1.2 specification) as follows:

typedef struct OMX_AUDIO_PARAM_PORTFORMATTYPE {
    OMX_U32 nSize;
    OMX_VERSIONTYPE nVersion;
    OMX_U32 nPortIndex;
    OMX_U32 nIndex;
    OMX_AUDIO_CODINGTYPE eEncoding;
} OMX_AUDIO_PARAM_PORTFORMATTYPE;

The first two fields are common to all OpenMAX structures. The nPortIndex field is the port you are looking at. The nIndex field distinguishes between the different format types supported by this port; you enumerate them by increasing nIndex until OMX_GetParameter returns an error. The eEncoding field gives information about the format.

The values for OMX_AUDIO_CODINGTYPE are given in Table 4-66 of the 1.1.2 specification and on the RPi are given in the file /opt/vc/include/IL/OMX_Audio.h, as follows:

typedef enum OMX_AUDIO_CODINGTYPE {
    OMX_AUDIO_CodingUnused = 0,  /** Placeholder value when coding is N/A  */
    OMX_AUDIO_CodingAutoDetect,  /** auto detection of audio format */
    OMX_AUDIO_CodingPCM,         /** Any variant of PCM coding */
    OMX_AUDIO_CodingADPCM,       /** Any variant of ADPCM encoded data */
    OMX_AUDIO_CodingAMR,         /** Any variant of AMR encoded data */
    OMX_AUDIO_CodingGSMFR,       /** Any variant of GSM fullrate (i.e. GSM610) */
    OMX_AUDIO_CodingGSMEFR,      /** Any variant of GSM Enhanced Fullrate encoded data*/
    OMX_AUDIO_CodingGSMHR,       /** Any variant of GSM Halfrate encoded data */
    OMX_AUDIO_CodingPDCFR,       /** Any variant of PDC Fullrate encoded data */
    OMX_AUDIO_CodingPDCEFR,      /** Any variant of PDC Enhanced Fullrate encoded data */
    OMX_AUDIO_CodingPDCHR,       /** Any variant of PDC Halfrate encoded data */
    OMX_AUDIO_CodingTDMAFR,      /** Any variant of TDMA Fullrate encoded data (TIA/EIA-136-420) */
    OMX_AUDIO_CodingTDMAEFR,     /** Any variant of TDMA Enhanced Fullrate encoded data (TIA/EIA-136-410) */
    OMX_AUDIO_CodingQCELP8,      /** Any variant of QCELP 8kbps encoded data */
    OMX_AUDIO_CodingQCELP13,     /** Any variant of QCELP 13kbps encoded data */
    OMX_AUDIO_CodingEVRC,        /** Any variant of EVRC encoded data */
    OMX_AUDIO_CodingSMV,         /** Any variant of SMV encoded data */
    OMX_AUDIO_CodingG711,        /** Any variant of G.711 encoded data */
    OMX_AUDIO_CodingG723,        /** Any variant of G.723 dot 1 encoded data */
    OMX_AUDIO_CodingG726,        /** Any variant of G.726 encoded data */
    OMX_AUDIO_CodingG729,        /** Any variant of G.729 encoded data */
    OMX_AUDIO_CodingAAC,         /** Any variant of AAC encoded data */
    OMX_AUDIO_CodingMP3,         /** Any variant of MP3 encoded data */
    OMX_AUDIO_CodingSBC,         /** Any variant of SBC encoded data */
    OMX_AUDIO_CodingVORBIS,      /** Any variant of VORBIS encoded data */
    OMX_AUDIO_CodingWMA,         /** Any variant of WMA encoded data */
    OMX_AUDIO_CodingRA,          /** Any variant of RA encoded data */
    OMX_AUDIO_CodingMIDI,        /** Any variant of MIDI encoded data */
    OMX_AUDIO_CodingKhronosExtensions = 0x6F000000, /** Reserved region for introducing Khronos   Standard Extensions */
    OMX_AUDIO_CodingVendorStartUnused = 0x7F000000, /** Reserved region for introducing Vendor Extensions */


    OMX_AUDIO_CodingFLAC,        /** Any variant of FLAC */
    OMX_AUDIO_CodingDDP,         /** Any variant of Dolby Digital Plus */
    OMX_AUDIO_CodingDTS,         /** Any variant of DTS */
    OMX_AUDIO_CodingWMAPRO,      /** Any variant of WMA Professional */
    OMX_AUDIO_CodingATRAC3,      /** Sony ATRAC-3 variants */
    OMX_AUDIO_CodingATRACX,      /** Sony ATRAC-X variants */
    OMX_AUDIO_CodingATRACAAL,    /** Sony ATRAC advanced-lossless variants  */


    OMX_AUDIO_CodingMax = 0x7FFFFFFF
} OMX_AUDIO_CODINGTYPE;

Running the program info from Chapter X shows the following for the audio_decode component:

Audio ports:
  Ports start on 120
  There are 2 open ports
  Port 120 has 128 buffers of size 16384
  Direction is input
    Port 120 requires 4 buffers
    Port 120 has min buffer size 16384 bytes
    Port 120 is an input port
    Port 120 is an audio port
    Port mimetype (null)
    Port encoding is MP3
      Supported audio formats are:
      Supported encoding is MP3
          MP3 default sampling rate 0
          MP3 default bits per sample 0
          MP3 default number of channels 0
      Supported encoding is PCM
          PCM default sampling rate 0
          PCM default bits per sample 0
          PCM default number of channels 0
      Supported encoding is AAC
      Supported encoding is WMA
      Supported encoding is Ogg Vorbis
      Supported encoding is RA
      Supported encoding is AMR
      Supported encoding is EVRC
      Supported encoding is G726
      Supported encoding is FLAC
      Supported encoding is DDP
      Supported encoding is DTS
      Supported encoding is WMAPRO
      Supported encoding is ATRAC3
      Supported encoding is ATRACX
      Supported encoding is ATRACAAL
      Supported encoding is MIDI
      No more formats supported
  Port 121 has 1 buffers of size 32768
  Direction is output
    Port 121 requires 1 buffers
    Port 121 has min buffer size 32768 bytes
    Port 121 is an output port
    Port 121 is an audio port
    Port mimetype (null)
    Port encoding is PCM
      Supported audio formats are:
      Supported encoding is PCM
          PCM default sampling rate 44100
          PCM default bits per sample 16
          PCM default number of channels 2
      Supported encoding is DDP
      Supported encoding is DTS
      No more formats supported

This looks really impressive, but regrettably, none of this is actually supported except for PCM. The following is according to jamesh in “OMX_AllocateBuffer fails for audio decoder component”:

The way it works is that the component passes back success for all the codecs it can potentially support (i.e., all the codecs we’ve ever had going). That is then constrained by what codecs are actually installed. It would be better to run time detect which codecs are present, but that code has never been written since it’s never been required. It’s also unlikely ever to be done as Broadcom no longer supports audio codecs in this way—they have moved off the Videocore to the host CPU since they are now powerful enough to handle any audio decoding task.

That’s kind of sad, really.

So, how do you find out which encodings are really supported? You can get a partial answer by trying to allocate buffers for a particular encoding. For each port, you loop through the possible encodings, setting the encoding and trying to allocate the buffers. You can use ilclient_enable_port_buffers for this, as it will return -1 if the allocation fails.

The program to do this is il_test_audio_encodings.c.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>


#include <OMX_Core.h>
#include <OMX_Component.h>


#include <bcm_host.h>
#include <ilclient.h>


FILE *outfp;

void printState(OMX_HANDLETYPE handle) {
    // elided
}


char *err2str(int err) {
    return "elided";
}


void eos_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "Got eos event\n");
}


void error_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    //fprintf(stderr, "OMX error %s\n", err2str(data));
}


int get_file_size(char *fname) {
    struct stat st;


    if (stat(fname, &st) == -1) {
        perror("Stat'ing img file");
        return -1;
    }
    return(st.st_size);
}


static void set_audio_decoder_input_format(COMPONENT_T *component,
                                           int port, int format) {
    // set input audio format
    //printf("Setting audio decoder format ");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;
    //setHeader(&audioPortFormat,  sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;


    audioPortFormat.nPortIndex = port;
    //audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    audioPortFormat.eEncoding = format;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);
    //printf("Format set ok to %d ", format);
}


char *format2str(OMX_AUDIO_CODINGTYPE format) {
    switch(format) {
    case OMX_AUDIO_CodingUnused: return "OMX_AUDIO_CodingUnused";
    case OMX_AUDIO_CodingAutoDetect: return "OMX_AUDIO_CodingAutoDetect";
    case OMX_AUDIO_CodingPCM: return "OMX_AUDIO_CodingPCM";
    case OMX_AUDIO_CodingADPCM: return "OMX_AUDIO_CodingADPCM";
    case OMX_AUDIO_CodingAMR: return "OMX_AUDIO_CodingAMR";
    case OMX_AUDIO_CodingGSMFR: return "OMX_AUDIO_CodingGSMFR";
    case OMX_AUDIO_CodingGSMEFR: return "OMX_AUDIO_CodingGSMEFR" ;
    case OMX_AUDIO_CodingGSMHR: return "OMX_AUDIO_CodingGSMHR";
    case OMX_AUDIO_CodingPDCFR: return "OMX_AUDIO_CodingPDCFR";
    case OMX_AUDIO_CodingPDCEFR: return "OMX_AUDIO_CodingPDCEFR";
    case OMX_AUDIO_CodingPDCHR: return "OMX_AUDIO_CodingPDCHR";
    case OMX_AUDIO_CodingTDMAFR: return "OMX_AUDIO_CodingTDMAFR";
    case OMX_AUDIO_CodingTDMAEFR: return "OMX_AUDIO_CodingTDMAEFR";
    case OMX_AUDIO_CodingQCELP8: return "OMX_AUDIO_CodingQCELP8";
    case OMX_AUDIO_CodingQCELP13: return "OMX_AUDIO_CodingQCELP13";
    case OMX_AUDIO_CodingEVRC: return "OMX_AUDIO_CodingEVRC";
    case OMX_AUDIO_CodingSMV: return "OMX_AUDIO_CodingSMV";
    case OMX_AUDIO_CodingG711: return "OMX_AUDIO_CodingG711";
    case OMX_AUDIO_CodingG723: return "OMX_AUDIO_CodingG723";
    case OMX_AUDIO_CodingG726: return "OMX_AUDIO_CodingG726";
    case OMX_AUDIO_CodingG729: return "OMX_AUDIO_CodingG729";
    case OMX_AUDIO_CodingAAC: return "OMX_AUDIO_CodingAAC";
    case OMX_AUDIO_CodingMP3: return "OMX_AUDIO_CodingMP3";
    case OMX_AUDIO_CodingSBC: return "OMX_AUDIO_CodingSBC";
    case OMX_AUDIO_CodingVORBIS: return "OMX_AUDIO_CodingVORBIS";
    case OMX_AUDIO_CodingWMA: return "OMX_AUDIO_CodingWMA";
    case OMX_AUDIO_CodingRA: return "OMX_AUDIO_CodingRA";
    case OMX_AUDIO_CodingMIDI: return "OMX_AUDIO_CodingMIDI";
    case OMX_AUDIO_CodingFLAC: return "OMX_AUDIO_CodingFLAC";
    case OMX_AUDIO_CodingDDP: return "OMX_AUDIO_CodingDDP";
    case OMX_AUDIO_CodingDTS: return "OMX_AUDIO_CodingDTS";
    case OMX_AUDIO_CodingWMAPRO: return "OMX_AUDIO_CodingWMAPRO";
    case OMX_AUDIO_CodingATRAC3: return "OMX_AUDIO_CodingATRAC3";
    case OMX_AUDIO_CodingATRACX: return "OMX_AUDIO_CodingATRACX";
    case OMX_AUDIO_CodingATRACAAL: return "OMX_AUDIO_CodingATRACAAL" ;
    default: return "Unknown format";
    }
}


void test_audio_port_formats(COMPONENT_T *component, int port) {
    int n = 2;
    while (n <= OMX_AUDIO_CodingMIDI) {
        set_audio_decoder_input_format(component, port, n);


        // input port
        if (ilclient_enable_port_buffers(component, port,
                                         NULL, NULL, NULL) < 0) {
            printf("    Unsupported encoding is %s\n",
                   format2str(n));
        } else {
            printf("    Supported encoding is %s\n",
                  format2str(n));
            ilclient_disable_port_buffers(component, port,
                                          NULL, NULL, NULL);
        }
        n++;
    }
    n = OMX_AUDIO_CodingFLAC;
    while (n <= OMX_AUDIO_CodingATRACAAL) {
        set_audio_decoder_input_format(component, port, n);


        // input port
        if (ilclient_enable_port_buffers(component, port,
                                         NULL, NULL, NULL) < 0) {
            printf("    Unsupported encoding is %s\n",
                   format2str(n));
        } else {
            printf("    Supported encoding is %s\n",
                   format2str(n));
            ilclient_disable_port_buffers(component, port,
                                          NULL, NULL, NULL);
        }
        n++;
    }
}


void test_all_audio_ports(COMPONENT_T *component) {
    OMX_PORT_PARAM_TYPE param;
    OMX_PARAM_PORTDEFINITIONTYPE sPortDef;
    OMX_ERRORTYPE err;
    OMX_HANDLETYPE handle = ilclient_get_handle(component);


    int startPortNumber;
    int nPorts;
    int n;


    //setHeader(&param, sizeof(OMX_PORT_PARAM_TYPE));
    memset(&param, 0, sizeof(OMX_PORT_PARAM_TYPE));
    param.nSize = sizeof(OMX_PORT_PARAM_TYPE);
    param.nVersion.nVersion = OMX_VERSION;


    err = OMX_GetParameter(handle, OMX_IndexParamAudioInit, &param);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "Error in getting audio OMX_PORT_PARAM_TYPE parameter\n");
        return;
    }
    printf("Audio ports:\n");


    startPortNumber = param.nStartPortNumber;
    nPorts = param.nPorts;
    if (nPorts == 0) {
        printf("No ports of this type\n");
        return;
    }


    printf("Ports start on %d\n", startPortNumber);
    printf("There are %d open ports\n", nPorts);


    for (n = 0; n < nPorts; n++) {
        memset(&sPortDef, 0, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
        sPortDef.nSize = sizeof(OMX_PARAM_PORTDEFINITIONTYPE);
        sPortDef.nVersion.nVersion = OMX_VERSION;


        sPortDef.nPortIndex = startPortNumber + n;
        err = OMX_GetParameter(handle, OMX_IndexParamPortDefinition, &sPortDef);
        if(err != OMX_ErrorNone){
            fprintf(stderr, "Error in getting OMX_PARAM_PORTDEFINITIONTYPE parameter\n");
            exit(1);
        }
        printf("Port %d has %d buffers of size %d\n",
               sPortDef.nPortIndex,
               sPortDef.nBufferCountActual,
               sPortDef.nBufferSize);
        printf("Direction is %s\n",
               (sPortDef.eDir == OMX_DirInput ? "input" : "output"));
        test_audio_port_formats(component, sPortDef.nPortIndex);
    }
}


int main(int argc, char** argv) {
    char *componentName;
    int err;
    ILCLIENT_T  *handle;
    COMPONENT_T *component;


    componentName = "audio_decode";
    if (argc == 2) {
        componentName = argv[1];
    }


    bcm_host_init();

    handle = ilclient_init();
    if (handle == NULL) {
        fprintf(stderr, "IL client init failed\n");
        exit(1);
    }


    if (OMX_Init() != OMX_ErrorNone) {
        ilclient_destroy(handle);
        fprintf(stderr, "OMX init failed\n");
        exit(1);
    }


    ilclient_set_error_callback(handle,
                                error_callback,
                                NULL);
    ilclient_set_eos_callback(handle,
                              eos_callback,
                              NULL);


    err = ilclient_create_component(handle,
                                    &component,
                                    componentName,
                                    ILCLIENT_DISABLE_ALL_PORTS
                                    |
                                    ILCLIENT_ENABLE_INPUT_BUFFERS
                                    |
                                    ILCLIENT_ENABLE_OUTPUT_BUFFERS
                                    );
    if (err == -1) {
        fprintf(stderr, "Component create failed\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    err = ilclient_change_component_state(component,
                                          OMX_StateIdle);
    if (err < 0) {
        fprintf(stderr, "Couldn't change state to Idle\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    test_all_audio_ports(component);

    exit(0);
}

The program appears to be only partially successful. For the audio_decode component, it shows that only two input formats can be decoded, PCM and ADPCM, but it also claims that the output port supports MP3, Vorbis, and so on, which seems most unlikely.

Audio ports:
Ports start on 120
There are 2 open ports
Port 120 has 128 buffers of size 16384
Direction is input
    Supported encoding is OMX_AUDIO_CodingPCM
    Supported encoding is OMX_AUDIO_CodingADPCM
    Unsupported encoding is OMX_AUDIO_CodingAMR
    Unsupported encoding is OMX_AUDIO_CodingGSMFR
    Unsupported encoding is OMX_AUDIO_CodingGSMEFR
    Unsupported encoding is OMX_AUDIO_CodingGSMHR
    Unsupported encoding is OMX_AUDIO_CodingPDCFR
    Unsupported encoding is OMX_AUDIO_CodingPDCEFR
    Unsupported encoding is OMX_AUDIO_CodingPDCHR
    Unsupported encoding is OMX_AUDIO_CodingTDMAFR
    Unsupported encoding is OMX_AUDIO_CodingTDMAEFR
    Unsupported encoding is OMX_AUDIO_CodingQCELP8
    Unsupported encoding is OMX_AUDIO_CodingQCELP13
    ...
Port 121 has 1 buffers of size 32768
Direction is output
    Supported encoding is OMX_AUDIO_CodingPCM
    Supported encoding is OMX_AUDIO_CodingADPCM
    Supported encoding is OMX_AUDIO_CodingAMR
    Supported encoding is OMX_AUDIO_CodingGSMFR
    Supported encoding is OMX_AUDIO_CodingGSMEFR
    Supported encoding is OMX_AUDIO_CodingGSMHR
    Supported encoding is OMX_AUDIO_CodingPDCFR
    Supported encoding is OMX_AUDIO_CodingPDCEFR
    Supported encoding is OMX_AUDIO_CodingPDCHR
    Supported encoding is OMX_AUDIO_CodingTDMAFR
    Supported encoding is OMX_AUDIO_CodingTDMAEFR
    Supported encoding is OMX_AUDIO_CodingQCELP8
    Supported encoding is OMX_AUDIO_CodingQCELP13
    Supported encoding is OMX_AUDIO_CodingEVRC
    Supported encoding is OMX_AUDIO_CodingSMV
    Supported encoding is OMX_AUDIO_CodingG711
    Supported encoding is OMX_AUDIO_CodingG723
    Supported encoding is OMX_AUDIO_CodingG726
    Supported encoding is OMX_AUDIO_CodingG729
    ...

Decoding an Audio File Using audio_decode

The Broadcom audio_decode component will decode only Pulse Code Modulated (PCM) format data. It decodes it to…PCM format data. PCM is the binary format commonly used to represent unencoded audio data. In other words, unless Broadcom adds support for some of the audio codecs, this component is pretty useless.

Rendering PCM Data

Now you’ll learn how to render PCM data.

PCM Data

The following is according to Wikipedia:

[PCM] is a method used to digitally represent sampled analog signals . It is the standard form for digital audio in computers and various Blu-ray, DVD, and Compact Disc formats, as well as other uses such as digital telephone systems. A PCM stream is a digital representation of an analog signal, in which the magnitude of the analog signal is sampled regularly at uniform intervals, with each sample being quantized to the nearest value within a range of digital steps.

PCM streams have two basic properties that determine their fidelity to the original analog signal: the sampling rate, which is the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that each sample can take.

PCM data can be stored in files as “raw” data. In this case, there is no header information to say what the sampling rate and bit depth are. Many tools such as sox use the file extension to determine these properties. The following is from man soxformat:

f32 and f64 indicate files encoded as 32- and 64-bit (IEEE single and double precision) floating-point PCM, respectively; s8, s16, s24, and s32 indicate 8-, 16-, 24-, and 32-bit signed integer PCM, respectively; u8, u16, u24, and u32 indicate 8-, 16-, 24-, and 32-bit unsigned integer PCM, respectively.

Note, though, that the file extension is only an aid to understanding some of the PCM parameters and how the samples are stored; the file itself carries no such information.

Files can be converted into PCM by tools such as avconv. For example, to convert a WAV file to PCM, you use this:

avconv -i enigma.wav -f s16le enigma.s16

The output will give information not saved in the file, which you will need to give to a processing program later.

Input #0, wav, from 'enigma.wav':
  Duration: 00:06:26.38, bitrate: 1411 kb/s
    Stream #0.0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
Output #0, s16le, to 'enigma.s16':
  Metadata:
    encoder         : Lavf54.20.4
    Stream #0.0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s

From this you can see that the format is two channels, 44,100 Hz, and 16-bit little-endian. (The file I used was from a group called Enigma, who released an album as open content.)

To check that the encoding worked, you can use aplay as follows:

aplay -r 44100 -c 2 -f S16_LE enigma.s16

Choosing an Output Device

OpenMAX has a standard audio render component. But what device does it render to? The built-in sound card? A USB sound card? That is not part of OpenMAX IL—there isn’t even a way to list the audio devices, only the audio components.

OpenMAX has an extension mechanism that can be used by an OpenMAX implementor to answer questions like this. The Broadcom core implementation has extension types OMX_CONFIG_BRCMAUDIODESTINATIONTYPE and OMX_CONFIG_BRCMAUDIOSOURCETYPE, which can be used to set the audio destination (source) device. Use the following code to do this:

void setOutputDevice(OMX_HANDLETYPE handle, const char *name) {
   OMX_ERRORTYPE err;
   OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;


   if (name && strlen(name) < sizeof(arDest.sName)) {
       setHeader(&arDest, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
       strcpy((char *)arDest.sName, name);


       err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
       if (err != OMX_ErrorNone) {
           fprintf(stderr, "Error on setting audio destination\n");
           exit(1);
       }
   }
}

Here is where Broadcom becomes a bit obscure again. The header file IL/OMX_Broadcom.h states that the default value of sName is local but doesn’t give any other values. The Raspberry Pi forums say that this refers to the 3.5 mm analog audio out and that HDMI is chosen by using the value hdmi. No other values are documented, and it seems that the Broadcom OpenMAX IL does not support any other audio devices. In particular, USB audio devices are not supported by the current Broadcom OpenMAX IL components for either input or output. So, you can’t use OpenMAX IL for, say, audio capture on the Raspberry Pi since it has no Broadcom-supported audio input.

Setting PCM Format

You can use two functions to set the PCM format. The first contains nothing unusual.

void set_audio_render_input_format(COMPONENT_T *component) {
    // set input audio format
    printf("Setting audio render format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;


    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;


    audioPortFormat.nPortIndex = 100;

    OMX_GetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);


    audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM ;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);


    setPCMMode(ilclient_get_handle(component), 100);

}

The second gets the current PCM parameters and then sets the required PCM parameters (which you know independently).

void setPCMMode(OMX_HANDLETYPE handle, int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_ERRORTYPE err;


    memset(&sPCMMode, 0, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nSize = sizeof(OMX_AUDIO_PARAM_PCMMODETYPE);
    sPCMMode.nVersion.nVersion = OMX_VERSION;


    sPCMMode.nPortIndex = startPortNumber;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    printf("Sampling rate %d, channels %d\n",
           sPCMMode.nSamplingRate,
           sPCMMode.nChannels);


    sPCMMode.nSamplingRate = 44100;
    sPCMMode.nChannels = 2;


    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "PCM mode unsupported\n");
        return;
    } else {
        fprintf(stderr, "PCM mode supported\n");
        fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
        fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    }
}

The program is il_render_audio.c.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>


#include <OMX_Core.h>
#include <OMX_Component.h>


#include <bcm_host.h>
#include <ilclient.h>


#define AUDIO  "enigma.s16"

/* For the RPi name can be "hdmi" or "local" */
void setOutputDevice(OMX_HANDLETYPE handle, const char *name) {
    OMX_ERRORTYPE err;
    OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;


    if (name && strlen(name) < sizeof(arDest.sName)) {
        memset(&arDest, 0, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
        arDest.nSize = sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE);
        arDest.nVersion.nVersion = OMX_VERSION;


        strcpy((char *)arDest.sName, name);

        err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
        if (err != OMX_ErrorNone) {
            fprintf(stderr, "Error on setting audio destination\n");
            exit(1);
        }
    }
}


void setPCMMode(OMX_HANDLETYPE handle, int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_ERRORTYPE err;


    memset(&sPCMMode, 0, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nSize = sizeof(OMX_AUDIO_PARAM_PCMMODETYPE);
    sPCMMode.nVersion.nVersion = OMX_VERSION;


    sPCMMode.nPortIndex = startPortNumber;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    printf("Sampling rate %d, channels %d\n",
           sPCMMode.nSamplingRate,
           sPCMMode.nChannels);


    sPCMMode.nSamplingRate = 44100;
    sPCMMode.nChannels = 2;


    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "PCM mode unsupported\n");
        return;
    } else {
        fprintf(stderr, "PCM mode supported\n");
        fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
        fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    }
}


void printState(OMX_HANDLETYPE handle) {
    //elided
}


char *err2str(int err) {
    return "elided";
}


void eos_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "Got eos event\n");
}


void error_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "OMX error %s\n", err2str(data));
}


int get_file_size(char *fname) {
    struct stat st;


    if (stat(fname, &st) == -1) {
        perror("Stat'ing img file");
        return -1;
    }
    return(st.st_size);
}


static void set_audio_render_input_format(COMPONENT_T *component) {
    // set input audio format
    printf("Setting audio render format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;
    //setHeader(&audioPortFormat,  sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;


    audioPortFormat.nPortIndex = 100;

    OMX_GetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);


    audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    //audioPortFormat.eEncoding = OMX_AUDIO_CodingMP3;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);


    setPCMMode(ilclient_get_handle(component), 100);

}

OMX_ERRORTYPE read_into_buffer_and_empty(FILE *fp,
                                         COMPONENT_T *component,
                                         OMX_BUFFERHEADERTYPE *buff_header,
                                         int *toread) {
    OMX_ERRORTYPE r;


    int buff_size = buff_header->nAllocLen;
    int nread = fread(buff_header->pBuffer, 1, buff_size, fp);


    printf("Read %d\n", nread);

    buff_header->nFilledLen = nread;
    *toread -= nread;
    if (*toread <= 0) {
        printf("Setting EOS on input\n");
        buff_header->nFlags |= OMX_BUFFERFLAG_EOS;
    }
    r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
                            buff_header);
    if (r != OMX_ErrorNone) {
        fprintf(stderr, "Empty buffer error %s\n",
                err2str(r));
    }
    return r;
}


int main(int argc, char** argv) {

    int i;
    char *componentName;
    int err;
    ILCLIENT_T  *handle;
    COMPONENT_T *component;


    char *audio_file = AUDIO;
    if (argc == 2) {
        audio_file = argv[1];
    }


    FILE *fp = fopen(audio_file, "r");
    int toread = get_file_size(audio_file);


    OMX_BUFFERHEADERTYPE *buff_header;

    componentName = "audio_render";

    bcm_host_init();

    handle = ilclient_init();
    if (handle == NULL) {
        fprintf(stderr, "IL client init failed ");
        exit(1);
    }


    if (OMX_Init() != OMX_ErrorNone) {
        ilclient_destroy(handle);
        fprintf(stderr, "OMX init failed\n");
        exit(1);
    }


    ilclient_set_error_callback(handle,
                                error_callback ,
                                NULL);
    ilclient_set_eos_callback(handle,
                              eos_callback,
                              NULL);


    err = ilclient_create_component(handle,
                                    &component,
                                    componentName,
                                    ILCLIENT_DISABLE_ALL_PORTS
                                    |
                                    ILCLIENT_ENABLE_INPUT_BUFFERS
                                    );
    if (err == -1) {
        fprintf(stderr, "Component create failed\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    err = ilclient_change_component_state(component,
                                          OMX_StateIdle);
    if (err < 0) {
        fprintf(stderr, "Couldn't change state to Idle\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    // must be before we enable buffers
    set_audio_render_input_format(component);


    setOutputDevice(ilclient_get_handle(component), "local");

    // input port
    ilclient_enable_port_buffers(component, 100,
                                 NULL, NULL, NULL);
    ilclient_enable_port(component, 100);


    err = ilclient_change_component_state(component,
                                          OMX_StateExecuting);
    if (err < 0) {
        fprintf(stderr, "Couldn't change state to Executing\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    // now work through the file
    while (toread > 0) {
        OMX_ERRORTYPE r;


        // do we have an input buffer we can fill and empty?
        buff_header =
            ilclient_get_input_buffer(component,
                                      100,
                                      1 /* block */);
        if (buff_header != NULL) {
            read_into_buffer_and_empty(fp,
                                       component,
                                       buff_header,
                                       &toread);
        }
    }


    exit(0);
}

The program can be run with any file containing two-channel, 16-bit PCM sampled at 44,100 Hz as its command-line argument. This format is hard-coded into the program; it would be easy to add command-line parsing along the same lines as aplay.

Decoding an MP3 File Using FFmpeg or Avconv

If you want to play a compressed file such as MP3 or Ogg, it has to be decoded, and as noted earlier, the audio_decode component doesn’t do this. So, you have to turn to another system. The prominent audio-decoding systems are FFmpeg (and its fork, LibAV) and GStreamer. I will use LibAV, as that is the default install on the RPi.

FFmpeg was started in 2000; LibAV forked from it in 2011. Over time, both libraries and the file formats have evolved. Consequently, there are code examples on the Web that are no longer appropriate. FFmpeg and LibAV follow the same API and are generally interchangeable at the code level, but not always.

The current FFmpeg source distro includes a program called doc/examples/decoding_encoding.c, while the LibAV distro has a similar example, called avcodec.c, which can decode MP3 files to PCM format. These only almost work on the MP3 files I tried: the output format from the decoder has changed from interleaved to planar, and the examples have not been updated to reflect this.

The difference is easily illustrated with stereo: interleaved means LRLRLR. With planar, a set of consecutive Rs are given after a set of consecutive Ls, as in LLLLLL…RRRRR. Interleaved is a degenerate case of planar with a run length of 1.
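Converting from planar back to interleaved is just a nested loop over sample frames and channels. A minimal sketch (the interleave function name and the 16-bit sample type are illustrative assumptions):

```c
#include <stdint.h>

/* Interleave planar channel buffers into one LRLR... output buffer.
 * planes[ch] holds nb_samples samples for channel ch. */
void interleave(int16_t **planes, int nchannels, int nb_samples,
                int16_t *out) {
    int n, ch;
    for (n = 0; n < nb_samples; n++)        /* for each sample frame */
        for (ch = 0; ch < nchannels; ch++)  /* take one sample per channel */
            out[n * nchannels + ch] = planes[ch][n];
}
```

For stereo, this turns the two planes LLL and RRR into LRLRLR.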

A frame of video/audio that is decoded by FFmpeg/LibAV is built in a struct AVFrame. This includes the following fields:

typedef struct AVFrame {
    uint8_t *data[AV_NUM_DATA_POINTERS];
    int linesize[AV_NUM_DATA_POINTERS];
    int nb_samples;
    ...
} AVFrame;

In the interleaved case, all the samples are in data[0]. In the planar case, they are in data[0], data[1], …. There does not seem to be an explicit indicator of how many planar streams there are, but if a data element is non-null, it seems to contain a stream. So, by walking the data array until you find NULL, you can find the number of streams.

Many tools such as aplay will accept only interleaved samples. So, given multiple planar streams, you have to interleave them yourself. This isn’t hard once you know the sample size in bytes, the length of each stream, and the number of streams (a more robust way is given in a later section).

            int data_size = av_samples_get_buffer_size(NULL, c->channels,
                                                       decoded_frame->nb_samples,
                                                       c->sample_fmt, 1);
            // first time: count the number of  planar streams
            if (num_streams == 0) {
                while (num_streams < AV_NUM_DATA_POINTERS &&
                       decoded_frame->data[num_streams] != NULL)
                    num_streams++;
            }


            // first time: set sample_size (e.g., 2 for 16-bit data)
            if (sample_size == 0) {
                sample_size =
                    data_size / (num_streams * decoded_frame->nb_samples);
            }


            int m, n;
            for (n = 0; n < decoded_frame->nb_samples; n++) {
                // interleave the samples from the planar streams
                for (m = 0; m < num_streams; m++) {
                    fwrite(&decoded_frame->data[m][n*sample_size],
                           1, sample_size, outfile);
                }
            }

The revised program, which reads from an MP3 file and writes decoded data to /tmp/test.sw, is api-example.c.

/*
 * copyright (c) 2001 Fabrice Bellard
 *
 * This file is part of Libav.
 *
 * Libav is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * Libav is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with Libav; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */


// From http://code.haskell.org/~thielema/audiovideo-example/cbits/

/**
 * @file
 * libavcodec API use example.
 *
 * @example libavcodec/api-example.c
 * Note that this library only handles codecs (mpeg, mpeg4, etc...),
 * not file formats (avi, vob, etc...). See library 'libavformat' for the
 * format handling
 */


#include <stdlib.h>
#include <stdio.h>
#include <string.h>


#ifdef HAVE_AV_CONFIG_H
#undef HAVE_AV_CONFIG_H
#endif


#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>
#include "libavutil/mathematics.h"
#include "libavutil/samplefmt.h"


#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096


/*
 * Audio decoding.
 */
static void audio_decode_example(const char *outfilename, const char *filename)
{
    AVCodec *codec;
    AVCodecContext *c = NULL;
    int len;
    FILE *f, *outfile;
    uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];
    AVPacket avpkt;
    AVFrame *decoded_frame = NULL;
    int num_streams = 0;
    int sample_size = 0;


    av_init_packet(&avpkt);

    printf("Audio decoding\n");

    /* find the mpeg audio decoder */
    codec = avcodec_find_decoder(AV_CODEC_ID_MP3);
    if (!codec) {
        fprintf(stderr, "codec not found\n");
        exit(1);
    }


    c = avcodec_alloc_context3(codec);

    /* open it */
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "could not open codec\n");
        exit(1);
    }


    f = fopen(filename, "rb");
    if (!f) {
        fprintf(stderr, "could not open %s\n", filename);
        exit(1);
    }
    outfile = fopen(outfilename, "wb");
    if (!outfile) {
        av_free(c);
        exit(1);
    }


    /* decode until eof */
    avpkt.data = inbuf;
    avpkt.size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);


    while (avpkt.size > 0) {
        int got_frame = 0;


        if (!decoded_frame) {
            if (!(decoded_frame = avcodec_alloc_frame())) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
        } else
            avcodec_get_frame_defaults(decoded_frame);


        len = avcodec_decode_audio4(c, decoded_frame, &got_frame, &avpkt);
        if (len < 0) {
            fprintf(stderr, "Error while decoding\n");
            exit(1);
        }
        if (got_frame) {
            printf("Decoded frame nb_samples %d, format %d\n",
                   decoded_frame->nb_samples,
                   decoded_frame->format);
            if (decoded_frame->data[1] != NULL)
                printf("Data[1] not null\n");
            else
                printf("Data[1] is null\n");
            /* if a frame has been decoded, output it */
            int data_size = av_samples_get_buffer_size(NULL, c->channels,
                                                       decoded_frame->nb_samples,
                                                       c->sample_fmt, 1);
            // first time: count the number of  planar streams
            if (num_streams == 0) {
                while (num_streams < AV_NUM_DATA_POINTERS &&
                       decoded_frame->data[num_streams] != NULL)
                    num_streams++;
            }


            // first time: set sample_size (e.g., 2 for 16-bit data)
            if (sample_size == 0) {
                sample_size =
                    data_size / (num_streams * decoded_frame->nb_samples);
            }


            int m, n;
            for (n = 0; n < decoded_frame->nb_samples; n++) {
                // interleave the samples from the planar streams
                for (m = 0; m < num_streams; m++) {
                    fwrite(&decoded_frame->data[m][n*sample_size],
                           1, sample_size, outfile);
                }
            }
        }
        avpkt.size -= len;
        avpkt.data += len;
        if (avpkt.size < AUDIO_REFILL_THRESH) {
            /* Refill the input buffer, to avoid trying to decode
             * incomplete frames. Instead of this, one could also use
             * a parser, or use a proper container format through
             * libavformat. */
            memmove(inbuf, avpkt.data, avpkt.size);
            avpkt.data = inbuf;
            len = fread(avpkt.data + avpkt.size, 1,
                        AUDIO_INBUF_SIZE - avpkt.size, f);
            if (len > 0)
                avpkt.size += len;
        }
    }


    fclose(outfile);
    fclose(f);


    avcodec_close(c);
    av_free(c);
    av_free(decoded_frame);
}


int main(int argc, char **argv)
{
    const char *filename = "BST.mp3";
    AVFormatContext *pFormatCtx = NULL ;


    if (argc == 2) {
        filename = argv[1];
    }


    // Register all formats and codecs
    av_register_all();
    if(avformat_open_input(&pFormatCtx, filename, NULL, NULL)!=0) {
        fprintf(stderr, "Can't get format\n");
        return -1; // Couldn't open file
    }
    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx, NULL)<0)
        return -1; // Couldn't find stream information
    av_dump_format(pFormatCtx, 0, filename, 0);
    printf("Num streams %d\n", pFormatCtx->nb_streams);
    printf("Bit rate %d\n", pFormatCtx->bit_rate);
    audio_decode_example("/tmp/test.sw", filename);


    return 0;
}

You can build the program using make. As it does not depend on OpenMAX, it can also be built independently with the following compile commands:

cc -c -g api-example.c
cc api-example.o -lavutil -lavcodec -lavformat -o api-example -lm

It can be run with an MP3 file as the command-line parameter.

You can test the result as follows (you may need to change parameters):

aplay -r 44100 -c 2 -f S16_LE /tmp/test.sw

Rendering MP3 Using FFmpeg or LibAV and OpenMAX

Since the Broadcom audio_decode component is apparently of little use, if you actually want to play MP3, Ogg, or other encoded formats, you have to use FFmpeg/LibAV (or GStreamer) to decode the audio to PCM and then pass it to the Broadcom audio_render component.

Essentially this means taking the last two programs and mashing them together. It isn’t hard, just a bit messy. The only tricky point is that the buffers returned from FFmpeg/LibAV and the buffers used by audio_render are different sizes, and you don’t really know which will be bigger. If the audio_render input buffers are bigger, then you just copy (and interleave) the FFmpeg data across; if smaller, then you have to keep fetching new buffers as each one is filled.

The resultant program is il_ffmpeg_render_audio.c.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>


#include <OMX_Core.h>
#include <OMX_Component.h>


#include <bcm_host.h>
#include <ilclient.h>


#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>
#include "libavutil/mathematics.h"
#include "libavutil/samplefmt.h"


#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096


#define AUDIO  "BST.mp3"

/* For the RPi name can be "hdmi" or "local" */
void setOutputDevice(OMX_HANDLETYPE handle, const char *name) {
    OMX_ERRORTYPE err;
    OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;


    if (name && strlen(name) < sizeof(arDest.sName)) {
        memset(&arDest, 0, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
        arDest.nSize = sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE);
        arDest.nVersion.nVersion = OMX_VERSION;


        strcpy((char *)arDest.sName, name);

        err = OMX_SetParameter(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
        if (err != OMX_ErrorNone) {
            fprintf(stderr, "Error on setting audio destination\n");
            exit(1);
        }
    }
}


void setPCMMode(OMX_HANDLETYPE handle, int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_ERRORTYPE err;


    memset(&sPCMMode, 0, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nSize = sizeof(OMX_AUDIO_PARAM_PCMMODETYPE);
    sPCMMode.nVersion.nVersion = OMX_VERSION;


    sPCMMode.nPortIndex = startPortNumber;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    printf("Sampling rate %d, channels %d\n",
           sPCMMode.nSamplingRate,
           sPCMMode.nChannels);


    sPCMMode.nSamplingRate = 44100;
    sPCMMode.nChannels = 2; // assumed for now - should be checked


    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "PCM mode unsupported\n");
        return;
    } else {
        fprintf(stderr, "PCM mode supported\n");
        fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
        fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    }
}


void printState(OMX_HANDLETYPE handle) {
    // elided
}


char *err2str(int err) {
    return "error elided";
}


void eos_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "Got eos event\n");
}


void error_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "OMX error %s\n", err2str(data));
}


int get_file_size(char *fname) {
    struct stat st;


    if (stat(fname, &st) == -1) {
        perror("Stat'ing img file");
        return -1;
    }
    return(st.st_size);
}


AVPacket avpkt;
AVCodecContext *c = NULL;


/*
 * Audio decoding.
 */
static void audio_decode_example(const char *filename)
{
    AVCodec *codec;


    av_init_packet(&avpkt);

    printf("Audio decoding\n");

    /* find the mpeg audio decoder */
    codec = avcodec_find_decoder(AV_CODEC_ID_MP3);
    if (!codec) {
        fprintf(stderr, "codec not found\n");
        exit(1);
    }


    c = avcodec_alloc_context3(codec);

    /* open it */
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "could not open codec\n");
        exit(1);
    }
}


static void set_audio_render_input_format(COMPONENT_T *component) {
    // set input audio format
    printf("Setting audio render format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;
    //setHeader(&audioPortFormat,  sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;


    audioPortFormat.nPortIndex = 100;

    OMX_GetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);


    audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    //audioPortFormat.eEncoding = OMX_AUDIO_CodingMP3;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);


    setPCMMode(ilclient_get_handle(component), 100);

}

int num_streams = 0;
int sample_size = 0;


OMX_ERRORTYPE read_into_buffer_and_empty(AVFrame *decoded_frame,
                                         COMPONENT_T *component,
                                         // OMX_BUFFERHEADERTYPE *buff_header,
                                         int total_len) {
    OMX_ERRORTYPE r;
    OMX_BUFFERHEADERTYPE *buff_header = NULL;
    int k, m, n;


    if (total_len <= 4096) { //buff_header->nAllocLen) {
        // all decoded frame fits into one OpenMAX buffer
        buff_header =
            ilclient_get_input_buffer(component,
                                      100,
                                      1 /* block */);
        for (k = 0, n = 0; n < decoded_frame->nb_samples; n++) {
            for (m = 0; m < num_streams; m++) {
                memcpy(&buff_header->pBuffer[k],
                       &decoded_frame->data[m][n*sample_size],
                       sample_size);
                k += sample_size;
            }
        }


        buff_header->nFilledLen = k;
        r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
                                buff_header);
        if (r != OMX_ErrorNone) {
            fprintf(stderr, "Empty buffer error %s\n",
                    err2str(r));
        }
        return r;
    }


    // more than one OpenMAX buffer required
    for (k = 0, n = 0; n < decoded_frame->nb_samples; n++) {


        if (k == 0) {
             buff_header =
                ilclient_get_input_buffer(component,
                                          100,
                                          1 /* block */);
        }


        // interleave the samples from the planar streams
        for (m = 0; m < num_streams; m++) {
            memcpy(&buff_header->pBuffer[k],
                   &decoded_frame->data[m][n*sample_size],
                   sample_size);
            k += sample_size;
        }


        if (k >= buff_header->nAllocLen) {
            // this buffer is full
            buff_header->nFilledLen = k;
            r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
                                    buff_header);
            if (r != OMX_ErrorNone) {
                fprintf(stderr, "Empty buffer error %s\n",
                        err2str(r));
            }
            k = 0;
            buff_header = NULL;
        }
    }
    if (buff_header != NULL) {
            buff_header->nFilledLen = k;
            r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
                                    buff_header);
            if (r != OMX_ErrorNone) {
                fprintf(stderr, "Empty buffer error %s\n",
                        err2str(r));
            }
    }
    return r;
}


int main(int argc, char** argv) {

    int i;
    char *componentName;
    int err;
    ILCLIENT_T  *handle;
    COMPONENT_T *component;


    AVFormatContext *pFormatCtx = NULL;

    char *audio_file = AUDIO;
    if (argc == 2) {
        audio_file = argv[1];
    }


    FILE *fp = fopen(audio_file, "r");
    int toread = get_file_size(audio_file);


    OMX_BUFFERHEADERTYPE *buff_header;

    componentName = "audio_render";

    bcm_host_init();

    handle = ilclient_init();
    if (handle == NULL) {
        fprintf(stderr, "IL client init failed\n");
        exit(1);
    }


    if (OMX_Init() != OMX_ErrorNone) {
        ilclient_destroy(handle);
        fprintf(stderr, "OMX init failed\n");
        exit(1);
    }


    ilclient_set_error_callback(handle,
                                error_callback,
                                NULL);
    ilclient_set_eos_callback(handle,
                              eos_callback,
                              NULL);


    err = ilclient_create_component(handle,
                                    &component,
                                    componentName,
                                    ILCLIENT_DISABLE_ALL_PORTS
                                    |
                                    ILCLIENT_ENABLE_INPUT_BUFFERS
                                    );
    if (err == -1) {
        fprintf(stderr, "Component create failed\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    err = ilclient_change_component_state(component,
                                          OMX_StateIdle);
    if (err < 0) {
        fprintf(stderr, "Couldn't change state to Idle\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    // FFmpeg init
    av_register_all();
    if(avformat_open_input(&pFormatCtx, audio_file, NULL, NULL)!=0) {
        fprintf(stderr, "Can't get format\n");
        return -1; // Couldn't open file
    }
    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx, NULL)<0)
        return -1; // Couldn't find stream information
    av_dump_format(pFormatCtx, 0, audio_file, 0);


    audio_decode_example(audio_file);

    // must be before we enable buffers
    set_audio_render_input_format(component);


    setOutputDevice(ilclient_get_handle(component), "local");

    // input port
    ilclient_enable_port_buffers(component, 100,
                                 NULL, NULL, NULL);
    ilclient_enable_port(component, 100);


    err = ilclient_change_component_state(component,
                                          OMX_StateExecuting);
    if (err < 0) {
        fprintf(stderr, "Couldn't change state to Executing\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    // now work through the file

    int len;
    FILE *f;
    uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];


    AVFrame *decoded_frame = NULL;

    f = fopen(audio_file, "rb");
    if (!f) {
        fprintf(stderr, "could not open %s\n", audio_file);
        exit(1);
    }


    /* decode until eof */
    avpkt.data = inbuf;
    avpkt.size = fread(inbuf, 1, AUDIO_INBUF_SIZE, f);


    while (avpkt.size > 0) {
        int got_frame = 0;


        if (!decoded_frame) {
            if (!(decoded_frame = avcodec_alloc_frame())) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
        } else
            avcodec_get_frame_defaults(decoded_frame);


        len = avcodec_decode_audio4(c, decoded_frame, &got_frame, &avpkt);
        if (len < 0) {
            fprintf(stderr, "Error while decoding\n");
            exit(1);
        }
        if (got_frame) {
            /* if a frame has been decoded, we want to send it to OpenMAX */
            int data_size = av_samples_get_buffer_size(NULL, c->channels,
                                                       decoded_frame->nb_samples,
                                                       c->sample_fmt, 1);
            // first time: count the number of  planar streams
            if (num_streams == 0) {
                while (num_streams < AV_NUM_DATA_POINTERS &&
                       decoded_frame->data[num_streams] != NULL)
                    num_streams++;
            }


            // first time: set sample_size (e.g., 2 for 16-bit data)
            if (sample_size == 0) {
                sample_size =
                    data_size / (num_streams * decoded_frame->nb_samples);
            }


            // Empty into render_audio input buffers
            read_into_buffer_and_empty(decoded_frame,
                                       component,
                                       data_size
                                       );
        }


        avpkt.size -= len;
        avpkt.data += len;
        if (avpkt.size < AUDIO_REFILL_THRESH) {
            /* Refill the input buffer, to avoid trying to decode
             * incomplete frames. Instead of this, one could also use
             * a parser, or use a proper container format through
             * libavformat. */
            memmove(inbuf, avpkt.data, avpkt.size);
            avpkt.data = inbuf;
            len = fread(avpkt.data + avpkt.size, 1,
                        AUDIO_INBUF_SIZE - avpkt.size, f);
            if (len > 0)
                avpkt.size += len;
        }
    }


    printf("Finished decoding MP3\n");
    // clean up last empty buffer with EOS
    buff_header =
        ilclient_get_input_buffer(component,
                                  100,
                                  1 /* block */);
    buff_header->nFilledLen = 0;
    int r;
    buff_header->nFlags |= OMX_BUFFERFLAG_EOS;
    r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
                            buff_header);
    if (r != OMX_ErrorNone) {
        fprintf(stderr, "Empty buffer error %s\n",
                err2str(r));
    } else {
        printf("EOS sent\n");
    }


    fclose(f);

    avcodec_close(c);
    av_free(c);
    av_free(decoded_frame);


    sleep(10);
    exit(0);
}

The compiled program takes the name of an MP3 file as a command-line argument and plays it to the analog audio port.

Rendering MP3 with ID3 Extensions Using FFmpeg or LibAV and OpenMAX

MP3 was originally designed as a stereo format without metadata information (artist, date of recording, and so on). Now you have 5.1 and 6.1 formats with probably more coming. The MP3 Surround extension looks after this. Metadata is typically added as an ID3 extension. (I discovered this only after the previous program broke badly on some newer MP3 files I have: no MP3 header.)

The file command will identify such files with the following:

$ file Beethoven.mp3
Beethoven.mp3: Audio file with ID3 version 2.3.0

An MP3 file will usually have two channels for stereo. An MP3+ID3 file containing, say, an image will have two streams, one for the audio and one for the image. A call to av_dump_format will show all the ID3 metadata, plus the image stream.

Input #0, mp3, from 'Beethoven.mp3':
  Metadata:
    copyright       : 2013 Naxos Digital Services Ltd.
    album           : String Quartets - BEETHOVEN L van HAYDN FJ MOZART WA SCHUBERT F JANACEK L (Petersen Quar∼1
    TSRC            : US2TL0937001
    title           : String Quartet No 6 in B-Flat Major Op 18 No 6 I Allegro con brio
    TIT1            : String Quartets - BEETHOVEN L van HAYDN FJ MOZART WA SCHUBERT F JANACEK L (Petersen Quar∼1
    disc            : 1
    TLEN            : 522200
    track           : 1
    publisher       : Capriccio-(C51147)
    encoder         : LAME 32bits version 3.98.2 (http://www.mp3dev.org/)
    album_artist    : Petersen Quartet
    artist          : Petersen Quartet
    TEXT            : Beethoven, Ludwig van
    TOFN            : 729325.mp3
    genre           : Classical Music
    composer        : Beethoven, Ludwig van
    date            : 2009
  Duration: 00:08:42.24, start: 0.000000, bitrate: 323 kb/s
    Stream #0.0: Audio: mp3, 44100 Hz, 2 channels, s16p, 320 kb/s
    Stream #0.1: Video: mjpeg, yuvj444p, 500x509 [PAR 300:300 DAR 500:509], 90k tbn
    Metadata:
      title           :
      comment         : Cover (front)

Stream 0 is the audio stream, while an image is in stream 1.

You want to pass the MP3 stream to the FFmpeg/LibAV audio decoder but not the image stream. Each AVPacket contains the field stream_index, which can be used to distinguish between them. If a packet belongs to the audio stream, pass it to the OMX audio renderer; otherwise, skip it. You will see similar behavior when you look at rendering audio and video MPEG files. You find the audio stream index by asking av_find_best_stream for the AVMEDIA_TYPE_AUDIO stream.

In the previous sections, you read chunks in from the audio file and then relied on the decoder to break that into frames. That is now apparently on the way out. Instead, you should use the function av_read_frame, which reads only one frame at a time.
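The resulting read loop, sketched here as a fragment against the LibAV 53-era API (the play_audio_packets wrapper name is my own; error handling and the decode step are elided), looks like this:

```c
// Sketch only: assumes fmt_ctx was opened with avformat_open_input
// and avformat_find_stream_info as in the earlier listings.
void play_audio_packets(AVFormatContext *fmt_ctx) {
    // locate the audio stream once, up front
    int audio_stream_idx =
        av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);

    AVPacket pkt;
    while (av_read_frame(fmt_ctx, &pkt) >= 0) {
        if (pkt.stream_index == audio_stream_idx) {
            // decode with avcodec_decode_audio4 and hand the
            // decoded frame to the OpenMAX audio_render component
        }
        av_free_packet(&pkt);  // release the packet's buffer
    }
}
```

Packets from the image stream simply fall through the if test and are freed.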

Unfortunately, on the RPi distro I was using, libavcodec-extra was at package version 52, which has a broken implementation of av_read_frame (it gets the stream_index wrong). You will need to do a package upgrade of LibAV to version 53 or later in order for the following code to work.

So, by this stage, you read frames using av_read_frame, find the audio stream index of the frame, and use this to distinguish audio from other streams. You still get frames with the audio samples in the wrong format, such as AV_SAMPLE_FMT_S16P instead of AV_SAMPLE_FMT_S16. In the previous sections, you reformatted the stream by hand. But of course, the formats might change again, so your code would break.

The Audio Resample package gives a general-purpose way of managing this. It requires more setup, but once done will be more stable…

…except for a little glitch: this is one of the few areas in which FFmpeg and LibAV have different APIs. How to do it with FFmpeg is shown in “How to convert sample rate from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16?” (http://stackoverflow.com/questions/14989397/how-to-convert-sample-rate-from-av-sample-fmt-fltp-to-av-sample-fmt-s16). You are using LibAV, which is basically the same but with differently named types. The following sets up the conversion parameters:

    AVAudioResampleContext *swr = avresample_alloc_context();
    av_opt_set_int(swr, "in_channel_layout",  audio_dec_ctx->channel_layout, 0);
    av_opt_set_int(swr, "out_channel_layout", audio_dec_ctx->channel_layout,  0);
    av_opt_set_int(swr, "in_sample_rate",     audio_dec_ctx->sample_rate, 0);
    av_opt_set_int(swr, "out_sample_rate",    audio_dec_ctx->sample_rate, 0);
    av_opt_set_int(swr, "in_sample_fmt",  audio_dec_ctx->sample_fmt, 0);
    av_opt_set_int(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);
    avresample_open(swr);

The following performs the conversion:

    uint8_t *buffer;
    av_samples_alloc(&buffer, &out_linesize, 2, decoded_frame->nb_samples,
                     AV_SAMPLE_FMT_S16, 0);
    avresample_convert(swr, &buffer,
                       decoded_frame->linesize[0],
                       decoded_frame->nb_samples,
                       decoded_frame->data,
                       decoded_frame->linesize[0],
                       decoded_frame->nb_samples);

The revised program is il_ffmpeg_render_resample_audio.c.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>


#include <OMX_Core.h>
#include <OMX_Component.h>


#include <bcm_host.h>
#include <ilclient.h>


#include "libavcodec/avcodec.h"
#include <libavformat/avformat.h>
#include "libavutil/mathematics.h"
#include "libavutil/samplefmt.h"
#include "libavutil/opt.h"
#include "libavresample/avresample.h"


#define INBUF_SIZE 4096
#define AUDIO_INBUF_SIZE 20480
#define AUDIO_REFILL_THRESH 4096


#define AUDIO  "Beethoven.mp3"

AVCodecContext *audio_dec_ctx;
int audio_stream_idx;


/* For the RPi name can be "hdmi" or "local" */
void setOutputDevice(OMX_HANDLETYPE handle, const char *name) {
    OMX_ERRORTYPE err;
    OMX_CONFIG_BRCMAUDIODESTINATIONTYPE arDest;


    if (name && strlen(name) < sizeof(arDest.sName)) {
        memset(&arDest, 0, sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE));
        arDest.nSize = sizeof(OMX_CONFIG_BRCMAUDIODESTINATIONTYPE);
        arDest.nVersion.nVersion = OMX_VERSION;


        strcpy((char *)arDest.sName, name);

        err = OMX_SetConfig(handle, OMX_IndexConfigBrcmAudioDestination, &arDest);
        if (err != OMX_ErrorNone) {
            fprintf(stderr, "Error on setting audio destination\n");
            exit(1);
        }
    }
}


void setPCMMode(OMX_HANDLETYPE handle, int startPortNumber) {
    OMX_AUDIO_PARAM_PCMMODETYPE sPCMMode;
    OMX_ERRORTYPE err;


    memset(&sPCMMode, 0, sizeof(OMX_AUDIO_PARAM_PCMMODETYPE));
    sPCMMode.nSize = sizeof(OMX_AUDIO_PARAM_PCMMODETYPE);
    sPCMMode.nVersion.nVersion = OMX_VERSION;


    sPCMMode.nPortIndex = startPortNumber;

    err = OMX_GetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    printf("Sampling rate %d, channels %d\n",
           sPCMMode.nSamplingRate,
           sPCMMode.nChannels);


    sPCMMode.nSamplingRate = 44100;
    sPCMMode.nChannels = 2; // assumed for now - should be checked


    err = OMX_SetParameter(handle, OMX_IndexParamAudioPcm, &sPCMMode);
    if(err != OMX_ErrorNone){
        fprintf(stderr, "PCM mode unsupported\n");
        return;
    } else {
        fprintf(stderr, "PCM mode supported\n");
        fprintf(stderr, "PCM sampling rate %d\n", sPCMMode.nSamplingRate);
        fprintf(stderr, "PCM nChannels %d\n", sPCMMode.nChannels);
    }
}


void printState(OMX_HANDLETYPE handle) {
    // elided
}


char *err2str(int err) {
    return "error elided";
}


void eos_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "Got eos event\n");
}


void error_callback(void *userdata, COMPONENT_T *comp, OMX_U32 data) {
    fprintf(stderr, "OMX error %s\n", err2str(data));
}


int get_file_size(char *fname) {
    struct stat st;


    if (stat(fname, &st) == -1) {
        perror("Stat'ing img file");
        return -1;
    }
    return(st.st_size);
}


AVPacket avpkt;

static void set_audio_render_input_format(COMPONENT_T *component) {
    // set input audio format
    printf("Setting audio render format\n");
    OMX_AUDIO_PARAM_PORTFORMATTYPE audioPortFormat;
    //setHeader(&audioPortFormat,  sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    memset(&audioPortFormat, 0, sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE));
    audioPortFormat.nSize = sizeof(OMX_AUDIO_PARAM_PORTFORMATTYPE);
    audioPortFormat.nVersion.nVersion = OMX_VERSION;


    audioPortFormat.nPortIndex = 100;

    OMX_GetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);


    audioPortFormat.eEncoding = OMX_AUDIO_CodingPCM;
    //audioPortFormat.eEncoding = OMX_AUDIO_CodingMP3;
    OMX_SetParameter(ilclient_get_handle(component),
                     OMX_IndexParamAudioPortFormat, &audioPortFormat);


    setPCMMode(ilclient_get_handle(component), 100);

}

int num_streams = 0;
int sample_size = 0;


OMX_ERRORTYPE read_into_buffer_and_empty(AVFrame *decoded_frame,
                                         COMPONENT_T *component,
                                         // OMX_BUFFERHEADERTYPE *buff_header,
                                         int total_len) {
    OMX_ERRORTYPE r;
    OMX_BUFFERHEADERTYPE *buff_header = NULL;


    // NB: for simplicity the resample context is set up on every call;
    // it should really be allocated once and reused
    AVAudioResampleContext *swr = avresample_alloc_context();
    av_opt_set_int(swr, "in_channel_layout",  audio_dec_ctx->channel_layout, 0);
    av_opt_set_int(swr, "out_channel_layout", audio_dec_ctx->channel_layout,  0);
    av_opt_set_int(swr, "in_sample_rate",     audio_dec_ctx->sample_rate, 0);
    av_opt_set_int(swr, "out_sample_rate",    audio_dec_ctx->sample_rate, 0);
    av_opt_set_int(swr, "in_sample_fmt",  audio_dec_ctx->sample_fmt, 0);
    av_opt_set_int(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);
    avresample_open(swr);


    int required_decoded_size = 0;

    int out_linesize;
    required_decoded_size =
        av_samples_get_buffer_size(&out_linesize, 2,
                                   decoded_frame->nb_samples,
                                   AV_SAMPLE_FMT_S16, 0);
    uint8_t *buffer;
    av_samples_alloc(&buffer, &out_linesize, 2, decoded_frame->nb_samples,
                     AV_SAMPLE_FMT_S16, 0);
    avresample_convert(swr, &buffer,
                       out_linesize,
                       decoded_frame->nb_samples,
                       // decoded_frame->extended_data,
                       decoded_frame->data,
                       decoded_frame->linesize[0],
                       decoded_frame->nb_samples);


    // send the converted samples to the render component
    // in buffer-sized (4096-byte) pieces
    while (required_decoded_size >= 0) {
        buff_header =
            ilclient_get_input_buffer(component,
                                      100,
                                      1 /* block */);
        if (required_decoded_size > 4096) {
            memcpy(buff_header->pBuffer,
                   buffer, 4096);
            buff_header->nFilledLen = 4096;
            buffer += 4096;
        } else {
             memcpy(buff_header->pBuffer,
                   buffer, required_decoded_size);
            buff_header->nFilledLen = required_decoded_size;
        }
        required_decoded_size -= 4096;


        r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
                                buff_header);
        if (r != OMX_ErrorNone) {
            fprintf(stderr, "Empty buffer error %s\n",
                    err2str(r));
            return r;
        }
    }
    return r;
}


FILE *favpkt = NULL;

int main(int argc, char** argv) {

    int i;
    char *componentName;
    int err;
    ILCLIENT_T  *handle;
    COMPONENT_T *component;


    AVFormatContext *pFormatCtx = NULL;

    char *audio_file = AUDIO;
    if (argc == 2) {
        audio_file = argv[1];
    }


    OMX_BUFFERHEADERTYPE *buff_header;

    componentName = "audio_render";

    bcm_host_init();

    handle = ilclient_init();
    if (handle == NULL) {
        fprintf(stderr, "IL client init failed\n");
        exit(1);
    }


    if (OMX_Init() != OMX_ErrorNone) {
        ilclient_destroy(handle);
        fprintf(stderr, "OMX init failed\n");
        exit(1);
    }


    ilclient_set_error_callback(handle,
                                error_callback,
                                NULL);
    ilclient_set_eos_callback(handle,
                              eos_callback,
                              NULL);


    err = ilclient_create_component(handle,
                                    &component,
                                    componentName,
                                    ILCLIENT_DISABLE_ALL_PORTS
                                    |
                                    ILCLIENT_ENABLE_INPUT_BUFFERS
                                    );
    if (err == -1) {
        fprintf(stderr, "Component create failed\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    err = ilclient_change_component_state(component,
                                          OMX_StateIdle);
    if (err < 0) {
        fprintf(stderr, "Couldn't change state to Idle\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    // FFmpeg init
    av_register_all();
    if(avformat_open_input(&pFormatCtx, audio_file, NULL, NULL)!=0) {
        fprintf(stderr, "Can't get format\n");
        return -1; // Couldn't open file
    }
    // Retrieve stream information
    if(avformat_find_stream_info(pFormatCtx, NULL)<0)
        return -1; // Couldn't find stream information
    av_dump_format(pFormatCtx, 0, audio_file, 0);


    int ret;
    if ((ret = av_find_best_stream(pFormatCtx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0)) >= 0) {
        //AVCodecContext* codec_context ;
        AVStream *audio_stream;
        int sample_rate;


        audio_stream_idx = ret;
        fprintf(stderr, "Audio stream index is %d\n", ret);


        audio_stream = pFormatCtx->streams[audio_stream_idx];
        audio_dec_ctx = audio_stream->codec;


        sample_rate = audio_dec_ctx->sample_rate;
        printf("Sample rate is %d\n", sample_rate);
        printf("Sample format is %d\n", audio_dec_ctx->sample_fmt);
        printf("Num channels %d\n", audio_dec_ctx->channels);


        if (audio_dec_ctx->channel_layout == 0) {
            audio_dec_ctx->channel_layout =
                av_get_default_channel_layout(audio_dec_ctx->channels);
        }


        AVCodec *codec = avcodec_find_decoder(audio_stream->codec->codec_id);
        if (avcodec_open2(audio_dec_ctx, codec, NULL) < 0) {
            fprintf(stderr, "could not open codec\n");
            exit(1);
        }


        if (codec) {
            printf("Codec name %s\n", codec->name);
        }
    }


    av_init_packet(&avpkt);

    // must be before we enable buffers
    set_audio_render_input_format(component);


    setOutputDevice(ilclient_get_handle(component), "local");

    // input port
    ilclient_enable_port_buffers(component, 100,
                                 NULL, NULL, NULL);
    ilclient_enable_port(component, 100);


    err = ilclient_change_component_state(component,
                                          OMX_StateExecuting);
    if (err < 0) {
        fprintf(stderr, "Couldn't change state to Executing\n");
        exit(1);
    }
    printState(ilclient_get_handle(component));


    // now work through the file

    int len;
    uint8_t inbuf[AUDIO_INBUF_SIZE + FF_INPUT_BUFFER_PADDING_SIZE];


    AVFrame *decoded_frame = NULL;

    /* decode until eof */
    avpkt.data = inbuf;
    av_read_frame(pFormatCtx, &avpkt);


    while (avpkt.size > 0) {
        printf("Packet size %d\n", avpkt.size);
        printf("Stream idx is %d\n", avpkt.stream_index);
        printf("Codec type %d\n",
               pFormatCtx->streams[avpkt.stream_index]->codec->codec_type);


        if (avpkt.stream_index != audio_stream_idx) {
            // it's an image, subtitle, etc
            av_read_frame(pFormatCtx, &avpkt);
            continue;
        }


        int got_frame = 0;

        if (favpkt == NULL) {
            favpkt = fopen("tmp.mp3", "wb");
        }
        fwrite(avpkt.data, 1, avpkt.size, favpkt);


        if (!decoded_frame) {
            if (!(decoded_frame = avcodec_alloc_frame())) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
        }


        len = avcodec_decode_audio4(audio_dec_ctx,
                                     decoded_frame, &got_frame, &avpkt);
        if (len < 0) {
            fprintf(stderr, "Error while decoding\n");
            exit(1);
        }
        if (got_frame) {
            /* if a frame has been decoded, we want to send it to OpenMAX */
            int data_size =
                av_samples_get_buffer_size(NULL, audio_dec_ctx->channels,
                                           decoded_frame->nb_samples,
                                           audio_dec_ctx->sample_fmt, 1);


            // Empty into render_audio input buffers
            read_into_buffer_and_empty(decoded_frame,
                                       component,
                                       data_size
                                       );
        }
        av_read_frame(pFormatCtx, &avpkt);
        continue;
    }


    printf("Finished decoding MP3\n");
    // clean up last empty buffer with EOS
    buff_header =
        ilclient_get_input_buffer(component,
                                  100,
                                  1 /* block */);
    buff_header->nFilledLen = 0;
    int r;
    buff_header->nFlags |= OMX_BUFFERFLAG_EOS;
    r = OMX_EmptyThisBuffer(ilclient_get_handle(component),
                            buff_header);
    if (r != OMX_ErrorNone) {
        fprintf(stderr, "Empty buffer error %s\n",
                err2str(r));
    } else {
        printf("EOS sent\n");
    }


    avcodec_close(audio_dec_ctx);
    av_free(audio_dec_ctx);
    av_free(decoded_frame);


    sleep(10);
    exit(0);
}

The program takes an MP3 file as a command-line argument, defaulting to a file called Beethoven.mp3.

Conclusion

Audio support on the Raspberry Pi is comparatively weak: OpenMAX provides no satisfactory audio decode component, so you have to fall back on tools such as FFmpeg/LibAV or GStreamer to do the decoding, with OpenMAX used only for rendering.

Resources
