Standard Functions

Create a standard UDF in three steps, two of which are optional:

init

When an SQL query calls a UDF, the init routine for that function is called first, if it exists. This routine is responsible for allocating data and setting certain parameters for the main routine.

main

After the init routine is called, the main routine for the desired UDF is called. This is the only routine required to exist to define a UDF. This routine is responsible for processing the data and returning the result of the function.

deinit

Once the main routine is finished and the result is returned to the user, MySQL will call a deinit routine for the function, if it exists. The purpose of this routine is to deallocate any memory allocated by the init routine and clean up as necessary.

When an SQL query calls a standard function, these routines are called in the following manner:

init
   main
   main
   main
   ...
deinit

MySQL calls the main routine once for each row returned by the SQL query. The return values of the main routine comprise the results of the UDF as a whole.

When creating a UDF, all three routines must reside in the same library. You can, however, bundle multiple functions in the same library.

Tip

We use the terms “routine” and “function” in a specific manner in this section to reduce the possibility of confusion. The problem is that multiple native functions make up a single UDF. When we refer to a routine, we mean one of the native functions that make up the UDF. When we refer to a function, we mean the UDF itself.

The init routine

The init routine is the first routine called when a user attempts to execute a UDF within a SQL query. For a function with the name remove_html( ), the init routine must have the name remove_html_init( ). The init function must have the following signature:

my_bool func_name_init(UDF_INIT *initid, UDF_ARGS *args, char *message);

The three parameters passed to the init routine are all pointers to modifiable data that will be made available to the main function.

initid

This parameter is the main data structure used by the UDF engine. This structure will be made available to all three routines for a particular UDF. Any changes that need to be made to the default values of this structure should be made here in the init routine so they will be available for the main routine. The structure has the following members:

maybe_null

This Boolean value indicates whether the function can return an SQL NULL value. By default, this value is false unless any of the function arguments are nullable, in which case the default is true. To be safe, if you want the function to return an SQL NULL value, set the value of this attribute to true within the init routine.

decimals

This integer value indicates the maximum number of decimals in the result returned by this function. This attribute is used only if the function returns a numeric result. The default value of this attribute is determined by the arguments passed into the function. The number of decimals in the numeric argument with the most decimal places is used as the default. Since there is no way of knowing which values will be passed into the function, if you want to limit the number of decimals returned, set this attribute within the init routine.

max_length

This integer value indicates the maximum number of characters (or digits) in the result returned by this function. The default value depends on the type of function and the arguments passed to it. For a string function, the default value is the length of the longest string argument. For an integer function, the default is 21 digits (one of which will be a sign character, if necessary). For a decimal function, the default is 13 digits or other characters (including any sign character or decimal point) plus the value of the decimals attribute (see above). There is no performance penalty for setting this value to a large number (in case the function returns a BLOB or other large value).

ptr

This generic pointer can be used arbitrarily by the routines for a UDF. It is most useful to pass some data between the three routines of the function or to include a reference to some data from another part of the application. If this pointer will be used for new data (that is, not a reference to some existing data), the memory should be allocated here in the init routine. The main routine will then receive the allocated pointer to use as it will.

args

This array contains the arguments passed in from the SQL query. Because the actual processing of the arguments is performed in the main routine, this structure is rarely touched during the init routine. There are two attributes of this structure that are sometimes accessed within the init routine:

arg_type

This attribute is an array of values that correspond to the constants STRING_RESULT, INT_RESULT, and REAL_RESULT. Within the main routine, this array indicates the types of arguments the routine is receiving. These types may not always be what the routine is expecting. Instead of returning an error, it is possible to force the arguments passed by the user into the types that the main routine will need. This is done by setting the value within the arg_type array in the init routine. For example, if the first argument needs to be a STRING-type argument, set the first element of this array to STRING_RESULT. This will cause any value passed in by the user (even an integer or real value) to be coerced into a string value before the main routine is called.

lengths

This attribute is an array of values that correspond to the length of each argument passed in by the user. Within the main routine, the attribute describes the actual length values of the arguments being passed in, row by row. Within the init routine, this attribute describes maximum possible length values for string arguments.

message

This value is a character pointer used to store a user-level message about any error that occurs during the execution of the function. This preallocated character buffer is 200 characters wide by default. However, convention dictates keeping the message short enough to fit on one line. The message string must be terminated by a null byte.

The return value of the init routine is a my_bool (Boolean) type. If the initialization completes successfully, the routine should return 0 (false). If an error occurs, the routine should return 1 (true). In addition, in the case of an error, a user-level error message should be placed into the message argument.

The main routine

The main routine is the heart of the UDF. In fact, if no special initialization or cleanup is required, this routine is the only one needed to create a UDF. For a function with the name remove_html( ), the main routine should have the same name: remove_html( ).

The signature of the main routine depends on the return value of the UDF. There are three different SQL return values possible for UDFs: STRING , INTEGER , and REAL . The main routine for each of these types has the following signatures.

For a STRING:

char *func_name(UDF_INIT *initid, UDF_ARGS *args, 
 char *result, unsigned long *length,
 char *is_null, char *error)

For an INTEGER:

long long func_name(UDF_INIT *initid, UDF_ARGS *args,
 char *is_null, char *error)

For a REAL:

double func_name(UDF_INIT *initid, UDF_ARGS *args,
 char *is_null, char *error)

All the different routine signatures handle incoming arguments the same way. The differences lie in how each routine returns the result. We will consider the four parameters that are identical among the signatures first:

initid

This is the main data structure used by the UDF engine. It is a UDF_INIT structure, which is described in the previous section. Any of the attributes set during the init routine (including the generic data ptr attribute) are available during the main routine. Although there are no hard rules on the matter, the attributes of these structures should be accessed read-only from within the main routine. This is because all the initialization is done during the init routine, and all the deinitialization is done during the deinit routine. Therefore, in the middle, there is no need to modify these values.

args

These are the arguments passed in from the SQL query as a UDF_ARGS structure with the following attributes:

arg_count

This is the number of arguments passed into the UDF from the SQL query. If you want the UDF to have a fixed number of arguments, you can check the value here and return an error if the number is not correct. Otherwise, UDFs can support variable numbers of arguments.

arg_type

This is an array of type identifiers that indicate the SQL type of each argument. The number of elements in this array is equal to the arg_count attribute. Checking the values of the arg_type array in the main routine allows strict type checking of the incoming arguments. However, as mentioned in the previous section, it is generally more flexible to set the desired types in the init routine. This will transform them from whatever the user passed in to whatever types the main routine is expecting. The values of the arg_type attribute correspond to the constants STRING_RESULT, INT_RESULT, and REAL_RESULT, defined in the MySQL development headers.

args

This is an array of the actual arguments passed to the function. As mentioned above, the number of elements in this array is given by the arg_count property. The values of the arguments depend on two factors: the type of the argument and whether the argument was a constant value. Table 14-1 contains a mapping of types to C types. To accommodate these different possibilities, the elements of the args array are passed around as generic void pointers. You must therefore cast them to the appropriate C type.

Table 14-1. The mapping of MySQL result types to C types

Result type

C type

STRING_RESULT

char *

INT_RESULT

long long

REAL_RESULT

double

lengths

This value is an array of the lengths of the arguments passed to the UDF. As with the other array attributes (arg_type and args), the length of the array is given by the arg_count attribute. Each value of the array is the length of that argument (and hence the length of the corresponding element in the args array). For string arguments, the value is the accurate length of the string for that particular call to the function. Because the function will be called once for each row in the SQL query, this value will change to reflect the length of a string argument each time. Integer and decimal arguments have fixed lengths that are either defaults or set within the init routine.

is_null

This argument is a pointer to an integer value. If the UDF is returning a NULL value for a particular call to the function, set the value of this integer to 1. Otherwise, leave it at the default of 0.

error

This argument is a pointer to an integer value. If an error occurs during the main routine of the UDF, set the value of this integer to 1. This will return a NULL value for this particular call to the function. In addition, the main routine will not be called for any subsequent rows of data in the query that called the UDF, and NULL will be returned for each row.

There are two additional arguments specific to UDF that return STRING results:

result

This argument is a pointer to a character array. The return value of the UDF should be placed in the string referenced by this pointer. In addition, the pointer should be returned as the return value of the main routine (explained later). The MySQL UDF engine allocates 255 bytes as the buffer for a string return value. If the result fits within this limit, simply place the result into this buffer and return the pointer. If the result is bigger, you must use a string buffer that you have allocated. This allocation should be done by using the generic ptr attribute within the init routine. The result string should then be placed into this pointer, which should be returned as the return value of the function. If you take this route, the memory allocated for the pointer within the init routine should be deallocated in the deinit routine.

length

This argument is a pointer to an integer value. The value referenced by this pointer should be set to the length of the return value of the UDF. This does not include any trailing null characters. That is, if the return value of the function is the string make, the integer referenced by length should be set to 6.

The return value of the main routine depends upon the return value of the UDF as a whole. For INTEGER_RETURN functions, the main routine should return the return value of the function as a long long integer value. For REAL_RETURN functions, the main routine should return the return value of the function as a double value. For STRING_RETURN functions, the main routine should return the return value of the function as a pointer to a character array. As mentioned above, if the resulting value is less than 255 characters, this pointer should be the result pointer passed into the main routine. Otherwise, it should be a pointer allocated within the init routine.

The deinit routine

The deinit routine is responsible for freeing any memory allocated by the init routine and performing any other necessary cleanup. For a UDF with the name remove_html( ), the deinit routine must have the name remove_html_deinit( ). It has the following signature:

void func_name_deinit(UDF_INIT *initid)

The deinit routine takes one argument, the main UDF_INIT structure used by all three routines. This structure contains the generic ptr pointer that is available for generic use throughout the routines. If this pointer was allocated during the init routine, this deinit routine is responsible for freeing that memory. The deinit routine returns no value.

Example 14-1 shows a UDF that returns the contents of an image file, given the filename of that image. This function could be used to efficiently access image data without storing that data as a BLOB in the database.

Example 14-1. A UDF that returns the contents of an image file based on its filename
#include <mysql/my_global.h>
#include <mysql/my_sys.h>
#include <mysql/mysql.h>
#include <mysql/m_ctype.h>
#include <mysql/m_string.h>
   
#define MAX_IMAGE_SIZE 1048576
#define MAX_FILE_NAME 1024

/* Our 'init' routine is responsible for some simple set-up:
      - Check the argument count. This function must have exactly
             one argument, the filename, so we return an error 
             if there are any other number.
      - Coerce the types of the arguments. Theoretically, the 
              image names could be stored as integer IDs 
              (and thus stored in files with simple numbers 
              for names in the filesystem). Therefore, we want
              to allow any type of argument, and coerce it to a string.
     - Allocate the data buffer to hold the image data. This is a fairly large
              buffer to accommodate images of differing sizes.
*/
my_bool getImage_init(UDF_INIT *initid, UDF_ARGS *args, char *message) {
    if (args->arg_count != 1) {
        strmov(message,"Usage: getImage( fileName )");
        return 1;
    }
    args->arg_type[0] = STRING_RESULT;
    if ( !(initid->ptr = 
        (char *) malloc( sizeof(char) * MAX_IMAGE_SIZE ) ) ) {
        strmov(message, "Couldn't allocate memory!");
        return 1;
    }
    bzero( initid->ptr, sizeof(char) * MAX_IMAGE_SIZE );
    return 0;
}

/* This routine frees the memory allocated by the image buffer */
void getImage_deinit(UDF_INIT *initid) {
    if (initid->ptr)
        free(initid->ptr);
}

/* The main routine does most of the work for this function. The argument
   passed to the function is the filename of an image file. This function first
   uses 'chdir' to set the directory to the directory of the image file. This
   prevents misuse of this function by using it to read an arbitrary file on
   the filesystem. (We also remember the old working directory so that we can
   reset it at the end.)

   Once we've set the current directory, we open the image file using the
   Filename passed in as the argument. If the image opens successfully, we read
   the contents of the file into the empty data buffer. 

   Before returning the data, we set the length pointer to tell MySQL how
   Large our image buffer is. Finally, we return the pointer to the data 
   as the string result of this function.
*/
char *getImage(UDF_INIT *initid, UDF_ARGS *args, char *result,
               unsigned long *length, char *is_null, char *error) {
    FILE *img_file;
    size_t img_size;
    char filename[MAX_FILE_NAME];
    char old_dir[MAX_FILE_NAME];

    bzero(filename, MAX_FILE_NAME);
    bzero(old_dir, MAX_FILE_NAME);

    getcwd(old_dir, MAX_FILE_NAME);
    chdir("/path/to/my/pictures");
    strncpy( filename, args->args[0], args->lengths[0] );
    img_file = fopen( filename, "r" );
    if (! img_file ) {
        *is_null = 1;
        return 0;    
    }
    
    img_size =
        fread( (void *)initid->ptr, sizeof(char),
            MAX_IMAGE_SIZE, img_file );
    
    *length = (unsigned long)img_size;
    chdir(old_dir);
    fclose(img_file);
    return initid->ptr;
}
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset