Character Types

Managing character types is a little more complicated. Not only do you have to worry about the minimum and maximum string lengths, but you also have to worry about the average size and the amount of variation. For our current purposes, an index is a field or combination of fields on which you plan to search—basically, the fields in your WHERE clause. Indexing is, however, much more complicated, so we will provide further details later in the chapter. What’s important to note here is that indexing on character fields works best when the field is a fixed length. If there is little or, preferably, no variation in the length of your character-based fields, then a CHAR type is appropriate. An example of a good candidate for a CHAR field is a country code. The ISO provides a comprehensive list of standard two-character representations of country codes (US for the U.S., FR for France, etc.).[3] Because these codes are always exactly two characters, a CHAR(2) is the best way to maintain the country code based on the ISO representation

A value does not need to be constant length to use a CHAR field. It should, however, have very little variance. Phone numbers, for example, can be stored safely in a CHAR(13) field even though phone number lengths vary from nation to nation. The variance is little enough that there is no point in making a phone number field variable in length. Keep in mind that with a CHAR field, no matter how big the actual string being stored is, the field always takes up exactly the number of characters specified as the field’s size—no more, no less. Any difference between the length of the text being stored and the length of the field is made up by padding the value with spaces. While the few potential extra characters being wasted on a subset of the phone number data is not anything to worry about, you do not want to be wasting much more.

Variable-length text fields are appropriate for text fields with widely varying lengths. A good, common example of a field that demands a variable-length data type is a web URL. Most web addresses are relatively short (e.g., http://www.ora.com, http://www.imaginary.com, http://www.mysql.com) and consequentially do not pose problems. Occasionally, however, you will run into web addresses such as:

http://www.winespectator.com/Wine/Spectator/
_notes|5527293926834323221480431354?Xv11=&Xr5=&Xv1=&type-region-
search-code=&Xa14=flora+springs&Xv4=

If you construct a CHAR field large enough to hold this URL, you will be wasting a significant amount of space for almost every other URL being stored. A variable-length field lets you define a field length that can store the odd, long-length value while not wasting all that space for the common, short-length values.

Variable-length text fields in MySQL use only as much space as necessary to store an individual value into the field. A VARCHAR(255) field that holds the string “hello world,” for example, takes up only 12 bytes (1 byte for each character plus an extra byte to store the length).

Tip

MySQL varies from the ANSI standard by not padding VARCHAR fields. Any extra spaces are removed from a value before it is stored.

You cannot store strings with lengths greater than the field length you have specified. With a VARCHAR(4) field, you can store at most a string with four characters. If you attempt to store the string “happy birthday,” MySQL will truncate the string to “happ.” The downside is that there is no way to store the odd string that exceeds your designated field size. Table 3-2 shows the storage space required by the different text data types to store the 144-character Wine Spectator URL shown earlier, the space required to store an average-sized 30-character URL, and the maximum string size for that data type.

Table 3-2. The storage space required by the different MySQL character types

Data type

Storage for 144-char string

Storage for 30-char string

Maximum string size

CHAR (150)

150

150

255

VARCHAR (150)

145

31

255

TINYTEXT (150)

145

31

255

TEXT (150)

146

32

65535

MEDIUMTEXT (150)

147

33

16777215

LONGTEXT (150)

148

34

4294967295

In this table, note that storage requirements grow 1 byte at a time for the variable-length types of MEDIUMTEXT and LONGTEXT. This growth is due to the space required to store the size in variable-length fields. TEXT uses an extra byte to store the potentially greater length of the text it contains. Similarly, MEDIUMTEXT uses an extra 2 bytes over VARCHAR, and LONGTEXT uses an extra 3 bytes.

If after years of uptime with your database, you find that the world has changed and a field that once comfortably existed as a VARCHAR(25) must now hold strings as long as 30 characters, you are not out of luck. MySQL provides a command called ALTER TABLE that enables you to redefine a field type without losing any data:

ALTER TABLE mytable MODIFY mycolumn 
                  LONGTEXT


[3] States and provinces do not work the same way in internationalized applications. If you want to write an application that works in an international environment, make the columns for state and province codes CHAR(3), because Australia uses three-character state codes. Also note that there is a three-character ISO country-code standard.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset