6.19. *Shallow and Deep Copies

Earlier in Section 3.5, we described how object assignments are simply object references. This means that when you create an object, then assign that object to another variable, Python does not copy the object. Instead, it copies only a reference to the object. For example:

>>> aList = [[78, 'pyramid'], [84, 'vulture'], [81, 'eye']]
>>> anotherList = aList
>>> aList
[[78, 'pyramid'], [84, 'vulture'], [81, 'eye']]
>>>
>>> anotherList
[[78, 'pyramid'], [84, 'vulture'], [81, 'eye']]

Above, a list of three elements is created and its reference is assigned to aList. When aList is assigned to anotherList, the contents of the list referenced by aList are not copied. Rather, anotherList “copies” the reference from aList, not the data. We can confirm this by looking at the identities of the objects that both references point to:

>>> id(aList)
1191872
>>> id(anotherList)
1191872
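
One practical consequence of sharing a reference is that a change made through either name is visible through the other. The following is a minimal sketch that reuses the names from the session above; the appended element is an illustrative value and does not come from the text.

```python
aList = [[78, 'pyramid'], [84, 'vulture'], [81, 'eye']]
anotherList = aList                  # binds a second name to the same list object

# Both names refer to one object, so the identity test succeeds.
same_object = anotherList is aList
print(same_object)                   # True

# Mutating the list through one name is visible through the other.
anotherList.append([90, 'scarab'])   # illustrative value, not from the text
print(len(aList))                    # 4
```

Comparing with the is operator is equivalent to comparing the two id() values, and it is usually the more readable way to ask whether two names refer to the same object.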

A shallow copy of an object is defined to be a newly-created object of the same type as the original object whose contents are references to the elements in the original object. In other words, the copied object itself is new, but the contents are not. Shallow copies of sequence objects may be taken in one of two ways: (1) taking a complete slice using the slice operator, or (2) using the copy() function of the copy module, as indicated in the example below:

>>> thirdList = aList[:]
>>> thirdList
[[78, 'pyramid'], [84, 'vulture'], [81, 'eye']]
>>> id(thirdList)
1192232
>>>
>>> import copy
>>> fourthList = copy.copy(aList)
>>> fourthList
[[78, 'pyramid'], [84, 'vulture'], [81, 'eye']]
>>> id(fourthList)
1192304

The thirdList list is created using the slice operator to take an entire slice (both starting and ending indices are absent). We also display the new object's identity to confirm that it is a different object from the original. The fourthList list is created in the same spirit, this time using the copy.copy() function, and its identity is likewise distinct. However, the elements of these lists are still only references to the original object's elements.

>>> id(aList[0]), id(aList[1]), id(aList[2])
(1064072, 1191920, 1191896)
>>> id(thirdList[0]), id(thirdList[1]), id(thirdList[2])
(1064072, 1191920, 1191896)
>>> id(fourthList[0]), id(fourthList[1]), id(fourthList[2])
(1064072, 1191920, 1191896)

We pull the identities of these objects to confirm our hypothesis. To obtain a full, or deep, copy of the object (a new container whose elements are themselves completely new copies of the elements in the original object), we need to use the copy.deepcopy() function.

>>> lastList = copy.deepcopy(aList)
>>> lastList
[[78, 'pyramid'], [84, 'vulture'], [81, 'eye']]
>>> id(lastList)
1193248
>>> id(lastList[0]), id(lastList[1]), id(lastList[2])
(1192280, 1193128, 1193104)
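
The identities tell us which objects are shared, but the difference is easiest to appreciate when something changes. The sketch below, which reuses the names from the sessions above with illustrative replacement values of our own, modifies the original list and then checks the shallow and deep copies.

```python
import copy

aList = [[78, 'pyramid'], [84, 'vulture'], [81, 'eye']]
thirdList = aList[:]                 # shallow copy: new outer list, shared inner lists
lastList = copy.deepcopy(aList)      # deep copy: new outer list and new inner lists

# Change an inner list in place through the original.
aList[0][1] = 'sphinx'               # illustrative value, not from the text
print(thirdList[0])                  # [78, 'sphinx']   (shared with aList)
print(lastList[0])                   # [78, 'pyramid']  (independent copy)

# Rebinding a slot of the outer list affects only that one container.
aList[1] = [0, 'placeholder']        # illustrative value, not from the text
print(thirdList[1])                  # [84, 'vulture']
```

The shallow copy protects the outer list from changes such as rebinding a slot, but not from in-place changes to the inner lists it shares with the original. Only the deep copy is isolated from both.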

There are a few notes and caveats to keep in mind when making copies. The first is that non-container types (i.e., numbers, strings, and other “atomic” objects like code, type, and xrange objects) are not copied; a request to copy one simply returns a reference to the original object. Shallow copies of sequences are all done using complete slices. Mapping types, which will be covered in Chapter 8, are copied using the dictionary copy method. Finally, deep copies of tuples are not made if they contain only atomic objects. If we changed each of the small lists in the larger list above to a tuple, the new outer list would still hold references to the original tuples, so we would have performed only a shallow copy, even though we requested a deep copy.
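
These caveats can be checked directly with identity tests. The following sketch reflects the behavior of the standard CPython copy module; the variable names and sample values are our own and are used only for illustration.

```python
import copy

# Atomic objects are not duplicated: the "copy" is the original object.
name = 'pyramid'
same_string = copy.copy(name) is name
same_deep_string = copy.deepcopy(name) is name
print(same_string, same_deep_string)

# A tuple holding only atomic objects is returned as-is by deepcopy().
pair = (78, 'pyramid')
same_tuple = copy.deepcopy(pair) is pair
print(same_tuple)

# A tuple holding a mutable object is duplicated, along with that object.
mixed = (78, ['pyramid'])
mixed_copy = copy.deepcopy(mixed)
new_tuple = mixed_copy is not mixed
new_inner = mixed_copy[1] is not mixed[1]
print(new_tuple, new_inner)

# The dictionary copy() method makes a shallow copy: new dictionary, shared values.
table = {'bird': [84, 'vulture']}
table_copy = table.copy()
new_dict = table_copy is not table
shared_value = table_copy['bird'] is table['bird']
print(new_dict, shared_value)
```

Every test in this sketch evaluates to true. Skipping the copy is safe in the first two cases because the objects involved cannot be modified in place, so sharing them can never be observed through a later change.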

CORE MODULE: copy

The shallow and deep copy operations that we just described are found in the copy module. There are really only two functions to use from this module: copy(), which creates a shallow copy, and deepcopy(), which creates a deep copy.


Sequence types provide various mechanisms for ordered storage of data. Strings are a general medium for carrying data, whether it is displayed to a user, stored on a disk, transmitted across the network, or used as a single container for multiple sources of information. Lists and tuples provide container storage that allows for simple manipulation and access of multiple objects, whether they be Python data types or user-defined objects. Individual elements or groups of elements may be accessed as slices via sequentially-ordered index offsets. Together, these data types provide flexible and easy-to-use storage tools in your Python development environment. We conclude this chapter with a summary of operators, built-in functions, and methods for sequence types, given as Table 6.12.

Table 6.12. Sequence Type Operators, Built-in Functions and Methods
Operator, built-in function or method    String    List    Tuple
[] (list creation)  
()  
''  
append()  
capitalize()  
center()  
chr()  
cmp()
count() 
encode()  
endswith()  
expandtabs()  
extend()  
find()  
hex()  
index() 
insert() 
isdecimal()  
isdigit()  
islower()  
isnumeric()  
isspace()  
istitle()  
isupper()  
join()  
len()
list()
ljust()  
lower()  
lstrip()  
max()
min()
oct()  
ord()  
pop()  
raw_input()  
remove()  
replace()  
repr() 
reverse()  
rfind()  
rindex()  
rjust()  
rstrip()  
sort()  
split()  
splitlines()  
startswith()  
str()
strip()  
swapcase()  
title()  
tuple()
type()
upper()  
zfill()  
. (attributes) 
[] (slice)
[:]
*
%  
+
in
not in
