Strings and bytes

Textual data in Python is handled with str objects, more commonly known as strings. They are immutable sequences of Unicode code points. Unicode code points can represent a character, but can also have other meanings, such as formatting data, for example. Python, unlike other languages, doesn't have a char type, so a single character is rendered simply by a string of length 1.

Unicode is an excellent way to handle data, and should be used for the internals of any application. When it comes to storing textual data though, or sending it on the network, you may want to encode it, using an appropriate encoding for the medium you're using. The result of an encoding produces a bytes object, whose syntax and behavior is similar to that of strings. String literals are written in Python using single, double, or triple quotes (both single or double). If built with triple quotes, a string can span on multiple lines. An example will clarify this:

>>> # 4 ways to make a string
>>> str1 = 'This is a string. We built it with single quotes.'
>>> str2 = "This is also a string, but built with double quotes."
>>> str3 = '''This is built using triple quotes,
... so it can span multiple lines.'''
>>> str4 = """This too
... is a multiline one
... built with triple double-quotes."""
>>> str4  #A
'This too
is a multiline one
built with triple double-quotes.'
>>> print(str4)  #B
This too
is a multiline one
built with triple double-quotes.

In #A and #B, we print str4, first implicitly, and then explicitly, using the print function. A nice exercise would be to find out why they are different. Are you up to the challenge? (hint: look up the str function.)

Strings, like any sequence, have a length. You can get this by calling the len function:

>>> len(str1)
49

Table of Contents for Strings and bytes

Create new playlist

Sign In

Sign Up

Table of Contents for
Strings and bytes