|

# Python Data Structures

In the last tutorial, we had a look at some common programming algorithms. A companion to these algorithms is a set of common data structures. Programming algorithms need to work with data and that data is often contained in specific formats or data structures. Now is a good time to learn more about these common data structures that are used when creating various algorithms. The purpose of data structures is to organize information in ways that make it easy to be operated on by algorithms. You might have a watchlist of stocks and you might want to be able to sort them by dividend yield or P/E Ratio. Another example would be a tree structure that represents a collection of folders and files where you want to find a specific file within all those folders. Each scenario has associated data with a different data structure. The most common data structures to be well versed in are arrays, linked lists, stacks, queues, trees, and hash tables. Different applications will need different types of data structures to hold the information that algorithms need to work on. In this tutorial, we’ll explore these topics further.

## Arrays

An array is a group of items where the position of each item is identified by an index or a key value. A one-dimensional array is the most basic type of array, and the diagram below shows what this might look like.

Element positions can be calculated using a mathematical expression which allows array elements to be accessed directly in a random access approach. What this means is that since the position of each element can be directly computed, there is no need to navigate or traverse the data structure in order to access an item. The first index element of an array is always at position 0. Here is an example of a simple one-dimensional array in Python.

```3
6
9
12
15```

Accessing every other element of the array

```3
9
15```

Accessing an element directly

`15`

Arrays can have multiple dimensions. To create a two-dimensional array, the first dimension can themselves contain arrays. Accessing an item in a two-dimensional array requires you to provide two indexes. Here is a diagram of a two-dimensional array with an index of 2,1 highlighted.

In Python, you’ll likely more commonly use the List data structure which is an array-like data type. Both the List and the Array in Python behave in similar ways in that you can iterate over them and store items at a specific index. The difference between the two is in the functions that you can perform on them. It is more work to use true arrays in Python because you have to import the array module and declare an array. Lists are simply a part of Python’s syntax, so they are used much more often and cover most of the use cases you’ll need. True arrays will be better for math functions as well as for working with large amounts of data. Most times, you can simply go with Lists. Here are a few examples of some lists in Python.

The linked list data structure is a linear collection of data elements that are often referred to as nodes. They are similar to arrays however each of the nodes has a field that points to the next element in the list, unlike an array. There are single linked lists and double linked lists. Here are a couple of diagrams that display this.

The first item of a linked list is called the head. Each element contains a field that points to the next item in the list. The last item in a linked list points to null, which means that it is the end of the list.

In a double linked list, each data item has a reference to both the previous and next items.

Here is an implementation of a linked list in Python. It uses two classes. One to represent the nodes of the list, and one to represent the linked list itself. The `Node` class implements the node type that will be stored in the linked list. It has a single `next` field, indicating that this is a singly linked list. The `LinkedList` class has fields for the `head` as well as a `count` field that keeps track of how many nodes are in the list.

#### Initialize a Linked List and store some values

```Node:  15
Node:  12
Node:  9
Node:  6
Node:  3```

#### Printing the count of the linked list

`Number of items in List:  5`

#### Find two Node objects in the Linked List

```Finding item:  <__main__.Node object at 0x03512FD0>
Finding item:  <__main__.Node object at 0x03538028>
```

#### Delete a node in a linked list

```Number of items in List:  4
Finding item:  <__main__.Node object at 0x031A8058>
Node:  15
Node:  12
Node:  9
Node:  3
```

## Stack Data Structure

The stack data structure is a collection of elements that has two basic operations, push and pop. Stacks are LIFO, or last-in-first-out, data structures. The last item pushed onto a stack is the first one popped. An example of a stack is when you’re using the back button in your browser. As you surf the internet, the browser adds each link to a stack in order to maintain the order in which they were visited. When you click on the back button, the most recently added URL is popped off the stack and then revisited.

### Stack Data Structure In Python

You can get the characteristics of working with a stack data structure in Python by making use of a list.

#### Print out the stack

`['Tom', 'Dick', 'Harry', 'Bosch']`

#### Pop an item off the stack

```Bosch
['Tom', 'Dick', 'Harry']```

### Stack as a Class

You might also do something like the following which uses a user-defined class to offer stack functionality. This is still just a wrapper around using the list type, but now you have an actual push() method you can use.

```Initial stack: [0, 1, 2, 3, 4]
pop(): 4
After pop(), the stack is now: [0, 1, 2, 3]
After push(7), the stack is now: [0, 1, 2, 3, 7]
The size is: 5```

## Queue Data Structure

The queue data structure also supports adding and removing items, but it uses the FIFO method. FIFO is a first-in, first-out approach. An empty queue that gets an item added to it, would be the first item in the list. Queuing in more items simply grows the length of the list. Queues are very common in programming since they mimick so much of what happens in real life. Have you ever been to the department of motor vehicles? Then you know very well what a Queue is. You walk up to the end of the line(queue), wait for some large amount of time(queue processing), and then finally get service once all others in front of you have been served. In Software, message processing is a common use of a queue.

### Queue Data Structure In Python

#### Print out the queue

`deque(['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'])`

#### Pop item off the queue

```Monday
deque(['Tuesday', 'Wednesday', 'Thursday', 'Friday'])```

## Hash Table Data Structure

A hash table is basically an associative array. Many other programming languages have associative arrays, and Python has its implementation of this data structure via dictionaries. This data structure maps keys to values, using a hash function. A hash function uses the key to calculate an index for the slots and maps the key to a value. The ability to uniquely map a given key to a specific value is a big benefit of hash tables. This makes working with counters and filters fast and easy. Hash tables are also quite fast, making them good for large datasets. Hash tables don’t order their items in any specific way, so you would need to add a sorting mechanism if this is required.

### Hash Table Data Structure In Python

#### Initialize a new hash table

`{'firstkey': 1, 'secondkey': 2, 'thirdkey': 'three'}`

#### Create a second hash table with iteration

`{'firstkey': 1, 'secondkey': 2, 'thirdkey': 3}`

#### Replace an item in a hash table

`{'firstkey': 1, 'secondkey': 'two', 'thirdkey': 3}`

#### Iterate over the hash table to print key-value pairs

```key:  firstkey  value:  1
key:  secondkey  value:  two
key:  thirdkey  value:  3```