4 Useful Collection Types In Python


In this Python tutorial, we’ll move on to collection types. Specifically, we’ll have a look at strings, bytes, lists, and dictionaries, which Python calls str, bytes, list, and dict. These four are among the most common and useful collection types. In addition, we’ll examine a few ways to iterate over these collections with looping constructs.

1. str

We begin our investigation with the Python str data type. In fact, we made use of str in the prior lesson, but we’ll look at it more closely here. The official definition of a string in Python is a sequence of Unicode code points. In simpler English, this is roughly equivalent to what we think of as characters. Python strings are immutable, or in plain terms, they cannot be changed. Once a string comes into being, its contents cannot be modified. Just like you might find in PHP or JavaScript, a string in Python is delimited by either single or double quotes. Which you choose is up to you, but the opening and closing quotes must match: you cannot start a string with a single quote and end it with a double quote, or vice versa. Using the REPL, we can examine a few instances of Python strings.

Literal Strings

>>> 'I can write strings like nobodys business'
'I can write strings like nobodys business'
>>> "If you mix quote types, you will get an error!'
File "<stdin>", line 1
"If you mix quote types, you will get an error!'
SyntaxError: EOL while scanning string literal
>>> "Can't touch this"
"Can't touch this"
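To make the immutability point concrete, here is a minimal sketch. The variable names and the 'hello' value are made up for illustration. It tries to change one character in place, which Python rejects, and then builds a new string instead.

```python
# Strings are immutable: assigning to an index raises TypeError.
greeting = 'hello'  # hypothetical example value
try:
    greeting[0] = 'J'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# The usual workaround is to build a brand new string.
new_greeting = 'J' + greeting[1:]

print(assignment_failed)  # True
print(new_greeting)       # Jello
print(greeting)           # hello (the original is unchanged)
```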


Adjacent Literal Strings

In Python, literal strings that are adjacent to each other are compiled down to a single string. It’s a little odd if you are coming from PHP or JavaScript, which do not do this (C and C++ do), but it is a feature of Python so we mention it here.
>>> "I'm a string." " I am also a string that is adjacent."
"I'm a string. I am also a string that is adjacent."
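One place this feature tends to be handy is splitting a long string across several source lines. The sketch below assumes a made-up message. It wraps the adjacent literals in parentheses so Python treats them as one expression.

```python
# Adjacent literals inside parentheses are joined into one string.
message = (
    'This sentence is long enough that '
    'it reads better when split '
    'across several lines.'
)

print(message)
```

Note that this only works for literals. Two variables placed side by side are a syntax error and need the + operator instead.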

Multi Line Strings in Python

When dealing with multiple lines of string data at once, we can make use of a special syntax. Just like PHP has heredoc syntax, Python makes use of triple quotes. As with single-line strings, you can use either single or double quotes. Let’s try it out.

>>> '''In Python, sometimes
... we might like to enter
... some text on multiple
... lines. See what I mean?'''
'In Python, sometimes\nwe might like to enter\nsome text on multiple\nlines. See what I mean?'

>>> """If you feel like
... using double quotes
... instead of single quotes
... you can do that friend."""
'If you feel like\nusing double quotes\ninstead of single quotes\nyou can do that friend.'

>>> mystring = 'Finally, you could\nwrite a multi line string\non your own\nlike this.'
>>> print(mystring)
Finally, you could
write a multi line string
on your own
like this.

Escape sequences work like you might expect and are highlighted in this table.

This Escape Sequence    Has This Meaning
\newline                Backslash and newline ignored
\\                      Backslash (\)
\'                      Single quote (')
\"                      Double quote (")
\a                      ASCII Bell (BEL)
\b                      ASCII Backspace (BS)
\f                      ASCII Formfeed (FF)
\n                      ASCII Linefeed (LF)
\r                      ASCII Carriage Return (CR)
\t                      ASCII Horizontal Tab (TAB)
\v                      ASCII Vertical Tab (VT)
\ooo                    Character with octal value ooo
\xhh                    Character with hex value hh

Escape sequences only recognized in string literals are listed below
\N{name}                Character named name in the Unicode database
\uxxxx                  Character with 16-bit hex value xxxx
\Uxxxxxxxx              Character with 32-bit hex value xxxxxxxx
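A few of these escape sequences can be tried out in a short script. This is only a sketch with made-up values. The point is that each escape sequence stands for a single character in the resulting string.

```python
# Each escape sequence below stands for a single character.
tabbed = 'name\tvalue'   # \t is one tab character
path = 'C:\\temp'        # \\ is one literal backslash
quoted = 'It\'s here'    # \' puts a single quote inside single quotes
snowman = '\u2603'       # 16-bit hex escape
named = '\N{SNOWMAN}'    # the same character, looked up by Unicode name

print(tabbed)
print(path)              # C:\temp
print(quoted)            # It's here
print(len('\n'))         # 1
print(snowman == named)  # True
```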

Making use of the str() constructor

In other languages we can do things like type cast a variable to a specific type. In Python, we can use the str constructor to convert other data types like ints or floats into their string representation. Let’s see what we mean in the REPL.

>>> bucket = str(1234)
>>> type(bucket)
<class 'str'>
>>> print(bucket)
1234
>>> sink = str(7.02e4)
>>> type(sink)
<class 'str'>
>>> print(sink)
70200.0

Here we make use of Python’s built-in type() function to tell us what the variable in each test holds. We can see that both the integer and the float we pass to the str constructor are converted to their string representation.
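A common reason to reach for str() is joining a number onto some text. The sketch below uses a made-up count and label. It shows that adding an int directly to a string fails, while converting it first works.

```python
count = 3  # hypothetical value

try:
    # Adding an int to a str is not allowed and raises TypeError.
    label = 'Items in cart: ' + count
except TypeError:
    # Converting the number with str() first makes the addition valid.
    label = 'Items in cart: ' + str(count)

print(label)  # Items in cart: 3
```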

Strings have methods… Lots of them!

There are a lot of built-in methods for operating on strings in Python. Here are some of the ones we found in the documentation: str.capitalize(), str.casefold(), str.center(), str.count(), str.encode(), str.endswith(), str.expandtabs(), str.find(), str.format(), str.index(), str.isalnum(), str.isalpha(), str.isdecimal(), str.isdigit(), str.isidentifier(), str.islower(), str.isnumeric(), str.isprintable(), str.isspace(), str.istitle(), str.isupper(), str.join(), str.ljust(), str.lower(), str.lstrip(), str.maketrans(), str.partition(), str.replace(), str.rfind(), str.rindex(), str.rjust(), str.rpartition(), str.rsplit(), str.rstrip(), str.split(), str.splitlines(), str.startswith(), str.strip(), str.swapcase(), str.title(), str.translate(), str.upper(), and str.zfill().

>>> haha = str.upper('check this out fool')
>>> print(haha)
CHECK THIS OUT FOOL

>>> uh_oh = str.lower('I AINT PLAYIN FOOL!')
>>> print(uh_oh)
i aint playin fool!

The snippets above call the methods on the str class itself and pass the string in as an argument. A more common way is to call the method directly on a string, as in the syntax below.

>>> haha = 'check this out fool'
>>> haha.upper()
'CHECK THIS OUT FOOL'
>>> uh_oh = 'I AINT PLAYIN FOOL!'
>>> uh_oh.lower()
'i aint playin fool!'

Of course, you’ll find much better ways to apply these handy methods to your programs than we did here. We just wanted to show a couple of them in action.
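As a slightly more practical sketch, several of these methods can be combined to clean up a piece of text. The input string here is made up for illustration.

```python
raw = '  python,java script,go  '  # hypothetical input

# strip() removes the surrounding whitespace, split(',') breaks on commas.
languages = raw.strip().split(',')

# Loop over the pieces and title-case each one.
titled = []
for name in languages:
    titled.append(name.title())

# join() glues the pieces back together with a separator.
joined = ' | '.join(titled)
fixed = joined.replace('Java Script', 'JavaScript')

print(languages)  # ['python', 'java script', 'go']
print(joined)     # Python | Java Script | Go
print(fixed)      # Python | JavaScript | Go
```

Because strings are immutable, every one of these methods returns a new value and leaves raw untouched.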

2. bytes

Bytes in Python act a bit like strings, but they are in fact different: a bytes object is an immutable sequence of integers in the range 0 to 255 rather than a sequence of Unicode code points. When dealing with single-byte character encodings like ASCII, or with raw binary data, you will be dealing with bytes. A bytes literal is written much like a string literal, with the difference being that you prefix the opening quote with a lowercase b character.

Testing out some bytes on the REPL

>>> digital = b'check out all these bytes'
>>> type(digital)
<class 'bytes'>
>>> digital.split()
[b'check', b'out', b'all', b'these', b'bytes']

Understanding the bytes type becomes important when dealing with files, network resources, and HTTP responses on the web since all of these are transmitted as byte streams.
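Moving between str and bytes is done with encode() and decode(). The sketch below assumes UTF-8 as the encoding and uses a made-up word containing one non-ASCII character, so the character count and the byte count differ.

```python
text = 'café'  # hypothetical value with one non-ASCII character

# encode() turns a str into bytes using the named encoding.
data = text.encode('utf-8')

# decode() turns the bytes back into a str.
restored = data.decode('utf-8')

print(data)            # b'caf\xc3\xa9'
print(len(text))       # 4 characters
print(len(data))       # 5 bytes, because the accented letter takes two
print(data[0])         # 99, indexing bytes gives an integer
print(restored == text)  # True
```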

3. list

In the example above we made use of the split() method. This method actually returns a list, which is a sequence of objects. Whereas you cannot change a string, a list can be updated and modified. Lists in Python look a lot like arrays from other languages like JavaScript and PHP. To construct a literal list, you use the same square bracket notation you would expect for an array. Items inside the square brackets are comma separated, and indexing starts at zero. Let’s have a quick play at the REPL to see how these data structures in Python work.

>>> [310, 311, 319, 321]
[310, 311, 319, 321]
>>> fruit = ['banana', 'blueberry', 'apple', 'orange']
>>> fruit[0] = 317
>>> fruit
[317, 'blueberry', 'apple', 'orange']

In the example above, we first create a list of numbers. Then we create a list of strings and store that list in a variable named fruit. We can update an item by its index, as we did when we assigned a number to index 0 of our fruit list. When we then inspect that list, we can see that both numbers and strings are coexisting peacefully inside the very same list. Lists also have many methods, such as list.append(), list.extend(), list.insert(), list.remove(), list.pop(), list.clear(), list.index(), list.count(), list.sort(), list.reverse(), and list.copy(). Let’s initialize an empty list and then append some items to it.

>>> mylist = []
>>> mylist.append(True)
>>> mylist.append(None)
>>> mylist.append(False)
>>> mylist.append(1)
>>> mylist.append('one')
>>> mylist.append('two')
>>> mylist.append('three')
>>> mylist.append('four, Tell me that you love me more')
>>> mylist.append(['nested', 'list'])
>>> mylist
[True, None, False, 1, 'one', 'two', 'three', 'four, Tell me that you love me more', ['nested', 'list']]

We can see that lists are very easy to work with, are mutable, and can nest inside each other. We have booleans, None, numbers, strings, and a nested list in this example. In that respect they feel a lot like JavaScript arrays.
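A few of the other list methods mentioned earlier can be sketched the same way. The numbers are borrowed from the first list example. Each step changes the list in place.

```python
numbers = [310, 311, 319, 321]

numbers.insert(1, 305)      # [310, 305, 311, 319, 321]
last = numbers.pop()        # removes and returns 321
numbers.extend([300, 330])  # [310, 305, 311, 319, 300, 330]
numbers.sort()              # [300, 305, 310, 311, 319, 330]
numbers.reverse()           # [330, 319, 311, 310, 305, 300]
position = numbers.index(311)

print(last)      # 321
print(numbers)   # [330, 319, 311, 310, 305, 300]
print(position)  # 2
```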

In addition to what we have seen so far, there is also a list constructor that turns any iterable, such as a string, into a list.

>>> list('mississippi')
['m', 'i', 's', 's', 'i', 's', 's', 'i', 'p', 'p', 'i']
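The list constructor is not limited to strings. As a rough sketch with made-up inputs, it accepts other iterables such as a range, a tuple, or bytes, and rejects values that cannot be iterated.

```python
from_range = list(range(5))    # numbers 0 through 4
from_tuple = list(('a', 'b'))  # items of a tuple
from_bytes = list(b'hi')       # bytes yield integers

try:
    list(42)                   # an int is not iterable
    int_rejected = False
except TypeError:
    int_rejected = True

print(from_range)    # [0, 1, 2, 3, 4]
print(from_tuple)    # ['a', 'b']
print(from_bytes)    # [104, 105]
print(int_rejected)  # True
```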

Lists are often considered the workhorse of Python data structures, so we’ll need to become very familiar with them as we move along.
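Since lists are so often processed one item at a time, here is a minimal sketch of iterating over one with a for loop. It reuses the fruit names from earlier. The built-in enumerate() is used in the second loop to get each position along with the item.

```python
fruit = ['banana', 'blueberry', 'apple', 'orange']

# A plain for loop visits each item in order.
lengths = []
for name in fruit:
    lengths.append(len(name))

# enumerate() yields (index, item) pairs, counting from zero.
numbered = []
for position, name in enumerate(fruit):
    numbered.append(str(position) + ': ' + name)

print(lengths)   # [6, 9, 5, 6]
print(numbered)  # ['0: banana', '1: blueberry', '2: apple', '3: orange']
```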

4. dict

Dictionaries in Python map keys to values, much like an associative array would. They are another fundamental building block of the Python language. We can easily create a new dictionary of URLs by using curly braces, similar to how we might create an object in JavaScript.
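A dictionary of URLs might look something like the sketch below. The variable name and the choice of keys are assumptions made for illustration.

```python
# Keys are short labels, values are the URLs they map to.
urls = {
    'Python': 'https://www.python.org',
    'Docs': 'https://docs.python.org/3/',
}

print(type(urls))  # <class 'dict'>
print(len(urls))   # 2
```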

We can access the value of a particular slot in the dictionary by referencing its key. Pretty simple stuff here. A few guidelines do apply:

dict literal

  • delimited by { and }
  • key value pairs are comma separated
  • a given key value pair is joined by a colon :
  • each key must be unique
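The guidelines above can be sketched with a small dictionary. The car data is hypothetical and chosen only for illustration. The second dictionary shows what uniqueness means in practice: if a key is repeated in a literal, the last value wins.

```python
# Hypothetical data: key-value pairs joined by colons, separated by commas.
car = {'Make': 'Honda', 'Model': 'Civic', 'Year': 2018}

# Look up a value by its key.
model = car['Model']

# A repeated key does not create two entries; the later value replaces the earlier one.
duplicated = {'a': 1, 'a': 2}

print(model)       # Civic
print(duplicated)  # {'a': 2}
```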

Attempting to access an item of the dictionary which is not actually included in the dictionary raises a KeyError.
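Here is a minimal sketch of that failure, using a hypothetical car dictionary. It also shows the dict.get() method, which returns a fallback value instead of raising when the key is absent.

```python
car = {'Make': 'Honda', 'Model': 'Civic'}  # hypothetical data

try:
    car['Color']               # this key was never added
    missing_key_error = False
except KeyError:
    missing_key_error = True

# get() returns the given default when the key is missing.
color = car.get('Color', 'unknown')

print(missing_key_error)  # True
print(color)              # unknown
```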

It is easy to update or add new items to the dictionary like so:
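Updating and adding use the same assignment syntax, sketched here with a hypothetical car dictionary. Assigning to an existing key replaces its value, and assigning to a new key creates the entry.

```python
car = {'Make': 'Honda', 'Model': 'Civic', 'Year': 2018}  # hypothetical data

car['Year'] = 2020     # existing key: the value is replaced
car['Color'] = 'blue'  # new key: the entry is added

print(car)  # {'Make': 'Honda', 'Model': 'Civic', 'Year': 2020, 'Color': 'blue'}
```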

To remove an item from the dictionary, use the del keyword. Once the ‘Model’ entry has been removed, for example, trying to access it again produces a KeyError.
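The removal of a 'Model' entry can be sketched as follows. The rest of the dictionary is made up, since only the 'Model' key is named in the text.

```python
car = {'Make': 'Honda', 'Model': 'Civic', 'Year': 2018}  # hypothetical data

del car['Model']  # remove the entry

still_there = 'Model' in car

try:
    car['Model']  # the key is gone, so this raises KeyError
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(car)            # {'Make': 'Honda', 'Year': 2018}
print(still_there)    # False
print(lookup_failed)  # True
```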

note: The keys of a dictionary must be hashable, which in practice means immutable. Strings, numbers, and tuples (as long as they contain only hashable items) can be used as keys, but a list is not allowed.
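The rule about keys can be checked with a short sketch. The keys and values are made up for illustration. A string, a number, and a tuple all work as keys, while a list is rejected with a TypeError.

```python
lookup = {
    'name': 'string key',
    42: 'integer key',
    (40.7, -74.0): 'tuple key',
}

try:
    bad = {['a', 'b']: 'list key'}  # lists are mutable, so they cannot be keys
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

print(lookup[(40.7, -74.0)])  # tuple key
print(list_key_rejected)      # True
```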

Dictionaries work with several built-in functions. You can use len() to find the number of items in the dictionary and str() to create a string representation of the dict. The cmp() function from Python 2 no longer exists in Python 3, so compare dictionaries with == instead. Check out the full list of things you can do with the dict data type.
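In Python 3, a short sketch of these operations might look like the following, using hypothetical data. Dictionaries are compared with ==, since cmp() is not available, and the items() method is one way to loop over keys and values together.

```python
car = {'Make': 'Honda', 'Model': 'Civic'}   # hypothetical data
same = {'Model': 'Civic', 'Make': 'Honda'}  # same pairs, different order

size = len(car)       # number of key-value pairs
text = str(car)       # string representation
equal = car == same   # equality ignores insertion order

# items() yields (key, value) pairs for looping.
pairs = []
for key, value in car.items():
    pairs.append(key + '=' + value)

print(size)   # 2
print(text)   # {'Make': 'Honda', 'Model': 'Civic'}
print(equal)  # True
print(pairs)  # ['Make=Honda', 'Model=Civic']
```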

4 Useful Collection Types In Python Summary

In this beginner-level Python tutorial, we had a quick look at the str, bytes, list, and dict data types in Python 3. These are enough to get you started writing some basic scripts and programs.