Chapter 6. Text Processing

Text processing is, together with mathematics, the most important basic discipline in programming. Almost all data produced by humans will, at some point, be represented as text, or strings, in your system.

As you learned in Chapter 2, D provides basic string types, making the definition and usage of strings a breeze. An additional feature of strings is that they already are encoded as 8-, 16-, or 32-bit Unicode. This means that you can use D immediately in internationalized environments. You also can manipulate the encoding of strings directly.

Considering that Tango is a general-purpose library, text processing is given a prominent place via the tango.text package, and this chapter will cover most of the important functionality available there.

In this chapter, we'll begin by discussing the basic string-manipulation utilities present in Tango. Then we'll describe the Text class that Tango provides as a string wrapper; discuss conversions between strings, numeric types, and dates and times; and finally, explain how to do text formatting using Tango's Layout class.

String-Manipulation Utilities

The string-manipulation utilities in Tango are functions that perform common operations on strings. You will find all of the basic string operation utilities in tango.text.Util. Other modules contain more advanced functionality, such as regular-expression handling and encoding-specific operations. This section focuses on the operations in the Util module and the tango.text.stream package.

Tip

tango.core.Array contains generalized operations for arrays of all types, not only strings.

All of the functionality in tango.text.Util is presented through function templates that can be instantiated implicitly and called in the same manner as any other free functions. They can generally be divided into three main groups of functionality:

Functions that can be used to modify a string's content in various ways
Functions that help you search text
Functions that let you split a string into several components, especially for iteration

Let's explore each of these groups, further subdividing the third into splitting strings and splitting streams.

MEMORY ALLOCATION AND TEXT-PROCESSING OPERATIONS

Throughout Tango, an effort is made to minimize memory allocation when performing operations. This is particularly apparent in the text-processing APIs. The various functions never allocate memory unless it is strictly necessary, and then only as a fallback.

You can allocate memory yourself if required, and pass it along to the operations you are going to use. In the following example, src is a list of strings to append with a comma in between, and buffer is a block of user-allocated memory. If buffer is too small, the result will be put on the heap instead.

char[15] buffer;
auto result = join(src, ",", buffer);

If speed or memory usage is of no importance to you, some of the most common operations have allocating wrappers for more immediate usage.

String-Modifying Operations

The trim and strip functions are mainly used to clean up strings that may be padded at the start or the end, either with whitespace or other characters.

trim is the simplest operation, removing whitespace from the beginning and end of the input text, so that you're left with the content only. The returned string is a slice of your original, so you will need to duplicate it if you want to manipulate it further.

strip is a generalized version of trim, where you can pass it any character that should be removed from either end of the input string. As with trim, a clean slice of the original argument is returned.
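
The slice-returning behavior of trim and strip can be sketched as follows. This is a minimal example that assumes D 1.0 and the Tango release described in this chapter; the helper names trimmedCopy and stripped are hypothetical and exist only for illustration.

```d
import tango.text.Util;

// Hypothetical helper: returns an independent copy of the input
// with the surrounding whitespace removed.
char[] trimmedCopy (char[] padded)
{
    auto content = trim (padded);   // a slice of padded; nothing is allocated
    return content.dup;             // duplicate before modifying it further
}

// Hypothetical helper: removes any run of the character pad
// from both ends of the input.
char[] stripped (char[] padded, char pad)
{
    return strip (padded, pad);     // again a slice of the argument
}
```

Because both results alias the caller's array until they are duplicated, writing through them would also change the original text.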

Several functions let you edit the content itself. replace and substitute both replace occurrences of a specified pattern with a different one. replace replaces single characters with a new character. substitute replaces a substring of the input with a new string. Both of these operations do the replacement in place to avoid allocations. If you want to keep the original, duplicate it before passing it to either of these functions, as in the following example:

char[] original = "A string to have some letters replaced";
auto result = replace (original.dup, 'r', 'l');

The end result of this example is the string "A stling to have some lettels leplaced."

The utilities include two simple operations to build strings: join and repeat. join concatenates a list of strings into a single big one. An optional postfix can be passed to the function that will be appended to each of the joined strings except the last. If you want to control the output buffer, you can pass your own to the function. repeat builds a string by repeating a pattern n times. This function also accepts an optional output buffer.
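
A short sketch of join and repeat, assuming D 1.0 and the Tango release described in this chapter. The helper names commaList and ruler are hypothetical; the optional output buffers are left out, so both calls allocate their result.

```d
import tango.text.Util;

// Hypothetical helper: joins the items with ", " between them.
// The postfix is not appended after the last item.
char[] commaList (char[][] items)
{
    char[] separator = ", ";
    return join (items, separator);
}

// Hypothetical helper: builds a separator line by repeating "-=".
char[] ruler (uint count)
{
    char[] unit = "-=";
    return repeat (unit, count);
}
```

Passing a preallocated buffer as the last argument to either function would keep the result off the heap as long as the buffer is large enough.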

String-Searching Operations

The next set of operations provides querying capabilities to help you find the location of certain substrings or to determine whether they are present.

contains and containsPattern check an input string for an embedded character or string, respectively. They both will return true if a match is found.

locate, locatePrior, locatePattern, and locatePatternPrior operate in a similar manner, except that they return the position in the input string of the first match. The first two look for a character. locate searches from the start of the string or starts at the optional start parameter. locatePrior starts at the end going toward the beginning or starts at the optional starting position. locatePattern and locatePatternPrior do the same, but they look for substrings (patterns) instead. In the case that no match is found, the length of the source string is returned instead. Here is an example:

locatePatternPrior ("ababababaaaab", "aba", 8);

This example returns 6 as the first position where the pattern aba starts, going backward from index 8.

Note

Typically, most libraries return -1 when a pattern has not been found. That Tango does not do this means that the result in all cases can be used as a valid index into a slice.
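
The practical effect of returning the length on a failed search can be sketched like this, assuming D 1.0 and the Tango release described in this chapter. The helper name tailFrom is hypothetical.

```d
import tango.text.Util;

// Hypothetical helper: returns everything from the first occurrence
// of ch onward, or an empty slice when ch does not occur.
char[] tailFrom (char[] source, char ch)
{
    auto index = locate (source, ch);   // source.length when nothing matches
    return source[index .. $];          // therefore always a valid slice
}
```

No separate not-found branch is needed: a failed search simply yields a zero-length slice at the end of the source.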

If you need to check if a given character is whitespace, pass it to isSpace, which will return true if it is.

matching, indexOf, and mismatch provide highly efficient routines that you can use for testing certain aspects of your strings. matching returns true if both of the strings are equal up to the specified length. indexOf returns the index of the first match up to the specified length. In the case where no match is found, this length is returned. The returned index itself is zero-based. mismatch does the opposite, returning the first index where the strings no longer match. If the strings are actually matching, the length is returned instead.

String-Splitting Operations

In the cases where you need to separate strings on a given pattern, several approaches are available through Tango. The first approach uses the operation delimit, split, or splitLines, returning an array of strings. delimit takes an array of delimiting characters, each of which results in a new string being put into the result array where one of those delimiters is found in the source string. split does the same thing, except that it looks for one given pattern. splitLines returns an array of distinct lines (as given by the presence of \n or \r\n in the source string). All of these functions remove from the resulting arrays the characters or patterns used.
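
The three array-returning functions can be compared side by side in a small sketch, assuming D 1.0 and the Tango release described in this chapter. The helper names fields, tokens, and rows are hypothetical.

```d
import tango.text.Util;

// Hypothetical helper: splits on the single pattern ", ".
char[][] fields (char[] record)
{
    char[] pattern = ", ";
    return split (record, pattern);
}

// Hypothetical helper: splits wherever a space or a semicolon occurs.
char[][] tokens (char[] text)
{
    char[] delimiters = " ;";
    return delimit (text, delimiters);
}

// Hypothetical helper: splits into lines, dropping the line terminators.
char[][] rows (char[] text)
{
    return splitLines (text);
}
```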

If you prefer to iterate over the resulting components instead of receiving an array of them, tango.text.Util provides some efficient alternatives, using slices to make sure no allocations are made during the split operation itself. lines, delimiters, patterns, and quotes are all entities that can be used in foreach. Here is an example:

foreach (segment; patterns("one, two, three", ", ")) {
    . . .
}

This example will loop on each segment found in the string passed; in this case, one, two, and three.

If you would rather replace the delimiting pattern with a new one, the new pattern can be passed as an additional string to patterns. lines will let you iterate over the lines in your string, similar to splitLines. delimiters is the iterator version of delimit, whereas quotes will ignore delimiters found inside a pair of quotation marks, such as in the following example:

foreach (segment; quotes ("one two 'three four' five", " ")) {
    . . .
}

The results of this example are the segments one, two, three four, and five.

Stream-Splitting Operations

In Tango, you will also find proper support for splitting operations over streams, whether you stream data from a file, a socket, a pipe, or any other class implementing the InputStream interface (see Chapter 7). Four stream iterators are available, all based on the StreamIterator superclass, and you can further extend this number by creating other subclasses. The iterators can all be found in the tango.text.stream package.

The following example demonstrates iterating over the elements of a stream:

import tango.io.FileConduit,
       tango.io.Console;
import tango.text.stream.LineIterator;

foreach (line; new LineIterator!(char) ( new FileConduit ("filename") ))
    Cout (line).newline;

This example uses the LineIterator on a stream from a file. The iterators are templated for each of the three character types in D. In this example, char is used. Each line is simply output to the console.

The other iterators are SimpleIterator, which takes a delimiter parameter in addition to the stream; QuoteIterator, which ignores delimiters within quotation marks; and RegexIterator, which lets you specify a regular expression to use as the delimiting pattern.

Note

Regular expressions are patterns that usually are more complex than those that match a literal string. Regular expressions can be considered a language of their own.

Text Class

Tango's Text class is in tango.text.Text. This class wraps a string array for you, keeps a current selection of where you last interacted with the string, and ensures proper and efficient operation.

The Text class provides an object-oriented approach to the most common functionality in the basic text-processing routines. It abstracts away the potentially complex parts of string manipulation, especially where there may be a danger of slicing into the middle of a multi-unit Unicode character.

When instantiating Text, the char type is used if the string needs to be specified. Text also implements the interface TextView, which can be used as a read-only gateway to the string. TextView's superinterface is UniText, which enables conversion of the string to one of the other Unicode encodings.

The main means of operating on the string wrapped by Text is to select a portion of it and then perform an operation on that selection.

Read-Only View Methods

The methods in the read-only view of Text mainly let you query the text for various information and compare it to other instances, whether they are of the TextView type or a D string type.

The length of the text, a hash of it, and the encoding (represented as a TypeInfo instance) are available through the properties length, toHash, and encoding.

Both opCmp and opEquals are overloaded, and the same functionality is also available through the methods compare and equals. In addition, you can check whether your text starts with or ends with a given substring.

The other methods of the read-only view are slice, copy, and comparator. If you use slice, you get the underlying array as a slice. The array is not in itself safe, as there is no way in D 1.0 to restrict its use, and so you are expected to respect that it should not be changed through this interface.

With comparator, you can set the algorithms that are used in different comparison methods, and copy accepts a target array into which you can copy the text's content.

Modifying Methods

After instantiating Text, either with or without content, the content can be set/reset using two set methods. One takes a D string of the type that the instance was created with, and the other takes a TextView instance. Both have an optional parameter that can be set to false if the instance is not intended for modification of the contents.

Almost all of the methods in Text return an instance of the enclosing class, making it possible to chain calls, as in this example:

auto text = new Text!(char)("The couple danced the rumba");
text.select("couple");
text.replace("party").prepend("large ").select("rumba");
text.replace("tango");

This sequence results in the content "The large party danced the tango."

Selecting a Part of the Text

Since selecting a portion of the text to operate on is important, the Text class provides several ways to do this. To select a part of the text, you can use either an explicit or implicit approach.

select(int, int) allows you to perform an explicit selection, as you can set the start and end index of your selection.

To perform an implicit selection, you pass a character, a pattern (D string), or a TextView instance to select or selectPrior. These will search for the given argument, either toward the end or toward the front from the current selection, and set the selection to the match. If the argument wasn't found, the method you called will return false.

When you need to see what is selected, you can obtain the selected slice via the selection property. You can also get the starting index and length by calling selectionSpan.
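
Because an implicit selection reports whether it found a match, it can guard the operation that follows. The sketch below assumes D 1.0 and the Tango release described in this chapter; the helper name replaceWord is hypothetical.

```d
import tango.text.Text;

// Hypothetical helper: replaces the first occurrence of from with to,
// leaving the content untouched when from is absent.
char[] replaceWord (char[] content, char[] from, char[] to)
{
    auto text = new Text!(char) (content);

    if (text.select (from))     // false when nothing was found
        text.replace (to);      // operates on the current selection

    return text.slice ().dup;   // copy the content out of the wrapper
}
```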

Operating on a Selection

Most of the modifying methods of Text operate based on the current selection. By using append, you can append some text directly to the current selection. This text can be a character (or the same character multiple times), a D string, a TextView instance, or a number. The append methods that accept numeric arguments can take additional formatting hints. If you need to append text that is not in the same Unicode encoding as your text, append and transcode it by using the encode method.

Similar to append, prepend allows you to put some text immediately ahead of the current selection.

A selection can also be replaced by a string by using replace with the string you want to use as the replacement. To completely remove a selection, use remove. You can also truncate the text, in which case the default behavior is to truncate at the end of the current selection.

Performing Nonselection Operations

A few operations operate independently of the selection. clear empties the content completely, and reserve lets you reserve space for coming insertions and additions. trim and strip operate in the same way as the corresponding functions in tango.text.Util.

Text Combined with the Utilities

For additional power, Text can be combined with the utilities in tango.text.Util, as in the following example, where the line iterator is used to iterate over lines in a Text instance.

auto source = new Text!(char)("one
two
three");

foreach (line; Util.lines(source.slice)) {
    . . .
}

Similarly, all instances of a word in a block of text can be replaced with another:

auto dst = new Text!(char);
foreach (element; Util.patterns ("all cows eat grass", "eat", "chew")) {
    dst.append (element);
}

Numeric Conversion

An important facet of text processing is the conversion of text to and from different representations, such as numeric values. Tango has a full set of operations to efficiently convert from text into the various numeric types in the language, as well as support for parsing date and time information.

Tango provides this conversion through three modules (tango.text.convert.Integer, Float, and TimeStamp) and makes sure that the conversion can be done without any heap activity unless strictly necessary for other reasons.

Each of these modules has two common operations that function as the main workhorses of the converters: parse to convert from a text to some numeric type, and format to convert from a number into a textual representation. These give you full control of the process.

Formatting a Value into a String

Following the Tango convention, format requires that you pass along an output buffer for the result. Here is an example where the number 15 is formatted:

char[10] output;
auto result = format (output, 15);

In this example, the formatted number is placed into output, which, in this case, is allocated on the stack. The result returned is a slice of output containing the exact string. Many of the format functions provide additional parameters so you can better control the formatted output.

Converting Strings into Numeric Types

Creating a value in a numeric type based on a textual representation is the opposite operation to formatting. parse supports additional parameters to control the conversion, such as the radix to use.

Converting Integers

The Integer module lets you convert to or from the types short, ushort, int, uint, long, and ulong. In this module, parse can take three arguments:

The first argument is the string to be parsed.
The second argument can be a uint describing the radix. If omitted, 10 is the default.
As a third argument, you can pass a pointer to a uint, which will be set to a number representing how much of the string was processed to create the result.
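
The three arguments described above can be sketched as follows, assuming D 1.0 and the Tango release described in this chapter. The helper name leadingNumber and its out parameter are hypothetical; the renamed import provides the Integer namespace.

```d
import Integer = tango.text.convert.Integer;

// Hypothetical helper: parses the decimal number at the start of text
// and hands back whatever follows it.
long leadingNumber (char[] text, out char[] rest)
{
    uint consumed;                                      // set by parse
    auto value = Integer.parse (text, 10, &consumed);   // radix 10
    rest = text[consumed .. $];                         // the unparsed remainder
    return value;
}
```

The consumed count makes it possible to continue processing the input right where the number ended, without searching for that position again.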

In the same vein, format has two additional parameters with default values. The first two are always the output buffer and the number to be formatted. The third is a style identifier that says how the number is to be represented in the resulting text. You can choose from the values shown in Table 6-1.

Table 6.1. Style Identifiers Available for Integer.format

Style Identifier	Description
`Style.Unsigned`	Format as unsigned decimal
`Style.Signed`	Format as signed decimal (the default)
`Style.Octal`	Format as an octal number
`Style.Hex`	Format as a lowercase hexadecimal number
`Style.HexUpper`	Format as an uppercase hexadecimal number
`Style.Binary`	Format as a binary number

In addition to specifying how the number should be formatted, you can pass along a style flag, as shown in Table 6-2.

Table 6.2. Style Flags Available for Integer.format

Style Flag	Description
`Flags.None`	No modifiers are applied (the default)
`Flags.Prefix`	Prefix the conversion with a radix specifier
`Flags.Plus`	Prefix positive numbers with a plus sign (+)
`Flags.Space`	Prefix positive numbers with a space
`Flags.Zero`	Pad with zeros on the left

The following example shows how you can format a number into hexadecimals including a prefix:

auto text = Integer.format (new char[32], 12345L, Style.HexUpper, Flags.Prefix);

Note

In the example of formatting numbers as hexadecimals, tango.text.convert.Integer has been imported using renamed imports, so that the operations in the module can be used through the Integer namespace.

Along with parse and format, the Integer module contains some convenience functions. toString, toString16, and toString32 will format a number for you, allocating the necessary storage as it goes along. toInt and toLong will parse a string, assuming that it is fully parsable.
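
When allocation is acceptable, the convenience functions keep conversions short. The sketch below assumes D 1.0 and the Tango release described in this chapter; the helper names doubled and label are hypothetical.

```d
import Integer = tango.text.convert.Integer;

// Hypothetical helper: parses a fully numeric string and doubles it.
int doubled (char[] digits)
{
    return Integer.toInt (digits) * 2;
}

// Hypothetical helper: builds a label; toString allocates the digits.
char[] label (int count)
{
    return "count=" ~ Integer.toString (count);
}
```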

You can also use convert, which does not look for a radix in the input text. The trim function will extract optional signs and the radix, while removing extraneous space at the start, leaving the digits ready for parsing.

Converting Floating-Point Numbers

For the various floating-point types in D (float, double, and real), you should use tango.text.convert.Float to convert between them and their text representations. This module is somewhat simpler to use than Integer, as there are fewer variations in how the results can be formatted.

parse normally takes only the string representing the number, but can take an additional pointer to a uint, which will represent how much of the string was processed (or eaten) to create the resulting numeric type.

When formatting a floating-point number using format, you can customize it using two additional parameters. The first parameter after the number to be formatted is the number of decimals to be used. The default is 6. The second parameter is an int indicating how many exponent places should be emitted, effectively saying at which point a number should start being formatted in scientific notation. You can use 0 for always and 2 for numbers larger than 100 or smaller than 0.01. The default is 10. Here is an example of formatting a floating-point number:

auto text = Float.format (new char[64], 223.1456667, 5, 2);

This will convert the number into a string using five decimal places and scientific notation: 2.23145e+02. The result will be a slice from the buffer, which is created on the heap in this example. Also in this example, the renaming of imports is used to create the Float namespace.

As with Integer, Float has wrappers for the most common setup, negating the need to preallocate a buffer for the output. toString, toString16, and toString32 wrap format, whereas toDouble wraps the parsing, requiring that the full string is parsable as a number.

Converting Dates

Strings representing points in time can also be converted into a numeric value, a so-called timestamp, by importing tango.text.convert.TimeStamp. The resulting value is usually how many units of time have passed since a particular point, called an epoch. The most common type of timestamp in modern computing has been the number of milliseconds since 1970.

You can pass a string to parse in one of the formats specified in Table 6-3, getting a timestamp in return. The one additional optional parameter is a pointer to a uint saying how much of the string was parsed to create the timestamp. If the parsing fails, the predefined value InvalidEpoch will be returned instead.

Table 6.3. Some Timestamp Formats Handled by tango.text.convert.TimeStamp

Format	Example
RFC 1123	Sun, 06 Nov 1994 08:49:37 GMT
RFC 850	Sunday, 06-Nov-94 08:49:37 GMT
asctime	Sun Nov 6 08:49:37 1994
DOS time	12-31-06 08:49AM
ISO-8601	2006-01-31 14:49:30,001

Formatting a timestamp using format will yield a string in the RFC 1123 format. toString, toString16, and toString32 wrap this for the cases where you don't want to or don't need to preallocate a buffer for the output. The following example shows a string that is parsed before the resulting value is formatted.

auto date = "Sun, 06 Nov 1994 08:49:37 GMT";
auto msSinceJan1st0001 = TimeStamp.parse (date);
auto text = TimeStamp.format (new char[64], msSinceJan1st0001);

dostime and iso8601 can be used to convert from the DOS time and ISO-8601 formats, respectively. rfc1123, rfc850, and asctime do the work for parse, and can be used directly if you are sure of the format of the timestamp's textual representation.

Tip

See tango.time.ISO8601 for a more complete module for ISO-8601 parsing.

Layout and Formatting

As part of the Tango text-processing functionality, you will find a powerful text-formatting framework. It replaces what has typically been done by printf in C and related languages, and is similar to the formatting frameworks of .NET and ICU. Tango's formatter is more flexible in how it can format. Tango itself uses the flexibility of the formatting system to extend it for locale support.

The formatter is accessible through several levels in Tango, depending on your needs. You will find the core functionality in tango.text.convert.Layout, whereas tango.io.Stdout provides formatting to the console, similar to printf. Stdout again utilizes tango.io.Print, which wraps Layout for all cases where you need to send formatted output to a stream. In addition, you can also use tango.text.convert.Sprint, which wraps Layout for heapless formatting, reusing memory from the construction of the Sprint instance, whether this is allocated on the heap or the stack. Layout is also supported in the tango.util.log package and via tango.io.stream.FormatStream for generic stream output.
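
As a small illustration of the Sprint wrapper mentioned above, the following sketch formats into the memory owned by a Sprint instance. It assumes D 1.0 and the Tango release described in this chapter, and that a Sprint instance can be called directly with a format string; the helper name progress is hypothetical.

```d
import tango.text.convert.Sprint;

// Hypothetical helper: formats a progress message into the buffer
// owned by the given Sprint instance.
char[] progress (Sprint!(char) sprint, int done, int total)
{
    // The result is a slice of the Sprint buffer and is overwritten
    // by the next call, so duplicate it if it must survive.
    return sprint ("{} of {} done", done, total);
}
```

A single instance created with, for example, new Sprint!(char)(64) can be reused for any number of messages without further allocation, as long as each message fits in its buffer.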

In this section, first we will cover the format string and how it can be composed, then the Layout class, and finally, the Locale extension.

The Format String

If you are going to print a formatted string to the console, you will typically do so using Stdout, as in the following example:

import tango.io.Stdout;
Stdout.format ("Printing the value {} to the {}", 5, "console").newline;

Here, the first string passed to format is the format string. It describes how you want your output to look, with the braces ({}) as placeholders for dynamic content. In this example, the first pair of braces will be substituted with 5 and the second with console. Typically, you will want to specify your output in more detail and also say something about where in the template the arguments should be put. The newline at the end emits a newline and flushes Stdout so that the message is seen on the console immediately.

A number within the braces functions as a zero-based index into format's argument list. Thus, the previous format call is equivalent to this:

Stdout.format ("Printing the value {0} to the {1}", 5, "console").newline;

Since the order for the example already is implicit, there is not much point to using this technique here. The line can be rewritten to the following:

Stdout.format ("Printing the value {1} to the {0}", "console", 5).newline;

The output will be the same, but now the second argument after the format string is used first. This becomes particularly useful when internationalizing your application. By passing format strings from different languages, and having a fixed order of the arguments, you can use the indices to make the words of the different languages be printed in a sane and correct order. The following example shows how this is done with an English phrase, where the second form tends to be more common in poetry, "I can see Bill" versus "Bill I can see."

char[] s = "I";  // subject
char[] o = "Bill"; // object
Stdout.format ("{0} can see {1}.", s, o).newline;
Stdout.format ("{1} {0} can see.", s, o).newline;

An index can also be reused several times so that a particular value can be repeated, as follows:

Stdout.format ("Printing the value {0} and then the same again {0}", 5);

Note

If you need to print a brace, escape it with another one, like this: {{.

A pair of braces is called a format item, and can contain up to three components, all of which are optional. The first one is the index (as shown in the previous example), the second is an alignment component, and the third is a format string component. Let's take a closer look at the latter two components.

Alignment Component

You can use the alignment component of a format item to specify how much space a format item should take in the resulting string. The default is that the formatter uses as much space as is needed. If the alignment component specifies less than is actually needed, the specified value will be ignored. If more than needed is specified, the remaining space will be padded.

Alignment is specified by a number directly preceded by a comma, thus an index plus alignment component will become {index,alignment}. Here's an example:

char[] myFName = "Johnny";
Stdout.formatln("First Name = |{0,15}|", myFName);
Stdout.formatln("Last Name = |{0,15}|", "Foo de Bar");

Stdout.formatln("First Name = |{0,-15}|", myFName);
Stdout.formatln("Last Name = |{0,-15}|", "Foo de Bar");

Stdout.formatln("First name = |{0,5}|", myFName);

An additional aspect of the alignment component is shown in the third and fourth calls to formatln. (Stdout.formatln emits an additional newline at the end of the output, but is otherwise the same as format; this is an alternative to adding newline at the end.) By negating the alignment, the printed text will be left-adjusted instead of being put to the right (the default). In other words, negation will pad behind, while the default behavior is to pad in front of the value. The output of the preceding example will look like this:

First Name = |         Johnny|
Last Name = |     Foo de Bar|
First Name = |Johnny         |
Last Name = |Foo de Bar     |
First name = |Johnny|

On the last line of this output, you can see how the alignment is ignored, as the amount of space specified wasn't enough to output the value Johnny.

Format String Component

To apply additional cues to the formatter, you can use a format string component within the braces of the format item. When specifying it, prepend it with a colon, as in the following example:

Stdout.formatln ("I have {:G} birds on the roof", 100);

In the example, the letter G is used, which stands for General and is also the default used if nothing is specified.

The format string component does not need to be only one letter. It can also be a string to indicate a more complex specification. Subclasses of Layout, such as Locale, can add support for more format string components or reinterpret existing ones, such as for localization purposes.

Table 6-4 shows which format components are supported for Layout.

Table 6.4. Supported Format String Components in tango.text.convert.Layout

Format	Description
`d`	Decimal format (default)
`x`	Hexadecimal format
`X`	Uppercase hexadecimal format
`e`, `E`	Scientific notation

Appending a positive number immediately after one of the format components listed in Table 6-4 will give the minimum number of digits used to format the number. Here is an example:

Stdout.formatln ("A hexadecimal number follows: {0:X9}", 0xafe0000);

This will print an uppercase hexadecimal number using nine digits: 00AFE0000.
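
The same minimum-digit rule can be applied when the result is needed as a string rather than on the console. The sketch below assumes D 1.0 and the Tango release described in this chapter, and instantiates Layout directly; the helper name hexWord is hypothetical.

```d
import tango.text.convert.Layout;

// Hypothetical helper: formats value as at least four uppercase
// hexadecimal digits behind a literal "0x".
char[] hexWord (int value)
{
    auto layout = new Layout!(char);
    return layout ("0x{0:X4}", value);
}
```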

The Layout Class

When you are formatting using Tango, tango.text.convert.Layout does most of the work in its convert method, or sprint when you want to format to a preallocated buffer. For sprint, the first argument should be the output buffer, and the format string should be the second argument. For convert, the format string is first. The arguments following the format string are those that will be formatted and substituted into the template string.

If you need to format a string, but don't want to print it to the console, you can instantiate Layout directly. convert is aliased to Layout's opCall, thus you can format as in this example:

import tango.text.convert.Layout;
auto layout = new Layout!(char); // Need to specify encoding you are going to use
auto result = layout("A format {}", "string");
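
The difference in argument order between sprint and convert can be sketched as follows, assuming D 1.0 and the Tango release described in this chapter. The helper names greetInto and greet are hypothetical.

```d
import tango.text.convert.Layout;

// Hypothetical helper: formats into caller-supplied storage.
// The result is a slice of buffer.
char[] greetInto (char[] buffer, char[] name)
{
    auto layout = new Layout!(char);
    return layout.sprint (buffer, "Hello, {}!", name);   // output buffer first
}

// Hypothetical helper: formats into storage provided by Layout.
char[] greet (char[] name)
{
    auto layout = new Layout!(char);
    return layout.convert ("Hello, {}!", name);          // format string first
}
```

A caller of greetInto can pass a stack array, such as char[32], to keep the whole operation off the heap apart from the Layout instance itself.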

If you want to format text in a different encoding, you can use the fromUtf8, fromUtf16, or fromUtf32 method.

Tip

tango.text.convert.Format contains a Layout already instantiated for UTF-8.
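
Using the preinstantiated formatter avoids creating a Layout of your own. The sketch below assumes D 1.0 and the Tango release described in this chapter, and that Format can be called like a Layout instance; the helper name describe is hypothetical.

```d
import tango.text.convert.Format;

// Hypothetical helper: the indices place the count ahead of the name
// even though the name is passed first.
char[] describe (char[] what, int count)
{
    return Format ("{1} x {0}", what, count);
}
```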

Locale Support

The locale support works by subclassing Layout, hooking into the various format string components, and specifying a lot more functionality. To use Locale, you just need to instantiate it instead of Layout.

import tango.text.locale.Locale;
auto locale = new Locale;

Note that Locale has UTF-8 set as its default encoding. To enable localized output using Stdout, set it as the layout engine using Stdout's layout property, as follows:

Stdout.layout = new Locale;

If nothing more is specified, Locale will try to look up the locale settings of the user's computer and format according to what it finds. If you want to specify a specific locale setting, do this via Locale's constructor, as follows:

auto locale = new Locale(Culture.getCulture("fr_FR"));

The format string components that are understood by Locale when given a numeric value are shown in Table 6-5.

Table 6.5. Format String Components for tango.text.locale.Locale for Numeric Values

Format	Description
`g`, `G`	General (default)
`d`, `D`	Decimal
`x`, `X`	Lowercase and uppercase hexadecimal
`b`, `B`	Binary string
`c`, `C`	Currency
`f`, `F`	Fixed-point
`n`, `N`	Number with a delimiter (, or .) every three digits

All the format string components support an additional number to specify the precision of the formatted number.

The other kind of values localized by Locale are tango.time.Time.Time objects. Table 6-6 shows which format string components are available as shortcuts when a Time instance is passed to the formatter. These shortcuts are substituted with a pattern in the formatter.

Table 6.6. Format String Components for tango.text.locale.Locale for Time Values

Format^[a]	Description	Pattern
`d`	Short date	dd/MM/yyyy
`D`	Long date	dddd dd MMMM yyyy.
`f`	Long date and short time	dddd dd MMMM yyyy HH': 'mm
`F`	Full date and time	dddd dd MMMM yyyy HH': 'mm': 'ss
`g`	Short date and short time	dd/MM/yyyy HH': 'mm
`G`	Short date and long time	dd/MM/yyyy HH': 'mm': 'ss
`m`, `M`	Day in month	MM MMMM
`r`, `R`	RFC 1123	ddd, dd MMM yyyy HH': 'mm': 'ss 'GMT'
`s`	A sortable date time	yyyy'-'MM'-'dd'T'HH': 'mm': 'ss
`t`	Short time	HH': 'mm
`T`	Long time	HH': 'mm': 'ss
`y`, `Y`	Month in year	MMMM yyyy
^[a] This format is locale-independent. If a locale for a specific culture is chosen, the format may vary.

If the format string component for a Time instance is more than one character, it means that it is a custom format, which may include the components shown in Table 6-7. The patterns shown in Table 6-6 are predefined examples of such patterns. Elements in these patterns that are enclosed in single quotation marks are printed as is. This is mainly used to escape parts of the pattern that otherwise may be interpreted in some capacity by the formatter.

Table 6.7. Custom Formatting Options for Time Instances

Format	Description
`dddd`	Full name of the day in the week
`ddd`	The name of day in the week, three letters
`dd`	The day in the month, two digits
`MMMM`	Full name of month
`MMM`	Short name of month, three letters
`MM`	Month in the year, two digits
`yy`, `yyyy`	Year, two or four digits
`HH`	Hours
`mm`	Minutes
`ss`	Seconds

The following is an example of printing a Time instance to the console in a customized format, using direct console output via tango.io.Console.Cout.

import tango.io.Console;
import tango.time.WallClock;
import tango.text.locale.Locale;

auto layout = new Locale;
Cout (layout ("{:ddd, dd MMMM yyyy HH':'mm':'ss z}", WallClock.now)).newline;

You should by now have a good understanding of the basic and intermediate routines present in Tango's text-processing functionality. In the next chapter, you'll learn about Tango's input/output packages.
