Python os Module Tutorial

Python has a dedicated module for interacting with the operating system of the machine it runs on: the os module. It provides many helpful functions for working directly with the file system, the environment, and processes. Most of the os module works on Windows, Linux, and macOS, although a few functions and parameters are platform specific. The module is very extensive, so in this Python os Module Tutorial we will look at some of the most useful and common techniques.


What can the os module do for us?

The os module can do many things. Here is a list of some of the more common tasks.

  • Get the name of the operating system
  • Get the current working directory
  • Change directories
  • Get or set user and group information
  • Test for access to a path and see if a file or directory exists
  • Return a list of entities in a directory
  • Create a directory
  • Remove and rename files and directories
  • Get the stats for a file
  • Generate file and directory names by walking a directory tree
  • Kill a process
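As a quick taste of the list above, here is a minimal sketch that touches a few of these tasks. It relies only on os.name, os.getcwd, os.getpid, and os.access; the variable names are ours and purely illustrative.

```python
import os

# os.name identifies which family of OS-specific code Python loaded:
# 'posix' on Linux and macOS, 'nt' on Windows.
os_family = os.name
print("OS family:", os_family)

# The directory that relative paths are resolved against.
cwd = os.getcwd()
print("Working directory:", cwd)

# The ID of the running Python process.
pid = os.getpid()
print("Process ID:", pid)

# os.access tests whether the current user may read a path.
can_read_cwd = os.access(cwd, os.R_OK)
print("Can read working directory:", can_read_cwd)
```

The printed values depend on the machine, so no sample output is shown here.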

How to use the os module

The os module must be imported into your Python program before you can use it. A simple import statement will accomplish this for us.

import os

getcwd()

Return a Unicode string representing the current working directory.

import os

print(os.getcwd())
C:\python\osmodule

chdir(path)

Change the current working directory to the specified path. The path may always be specified as a string. On some platforms, path may also be specified as an open file descriptor. If this functionality is unavailable, using it raises an exception.

import os

os.chdir('c:/python')

print(os.getcwd())
c:\python
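Because chdir changes state for the whole process, it can be worth restoring the original directory afterwards. The sketch below is one way to do that, assuming a throwaway directory from the tempfile module stands in for a real target; the name of the missing directory is made up for the demonstration.

```python
import os
import tempfile

original_dir = os.getcwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    try:
        os.chdir(tmp_dir)
        # realpath resolves symlinks so the two paths compare reliably.
        changed = os.path.realpath(os.getcwd()) == os.path.realpath(tmp_dir)
    finally:
        # Always return to where we started, even if something failed.
        os.chdir(original_dir)

# A path that does not exist raises FileNotFoundError, a subclass of OSError.
missing_error = None
try:
    os.chdir(os.path.join(original_dir, "no-such-directory-for-this-demo"))
except FileNotFoundError as err:
    missing_error = err
    print("Could not change directory:", err)
```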

listdir(path=None)

Return a list containing the names of the entries in the directory. The path can be specified as either str, bytes, or a path-like object. If the path is bytes, the filenames returned will also be bytes; in all other circumstances the filenames returned will be str. If the path is None, the current directory ('.') is used. On some platforms, path may also be specified as an open file descriptor; the file descriptor must refer to a directory. If this functionality is unavailable, using it raises NotImplementedError. The list is in arbitrary order. It does not include the special entries ‘.’ and ‘..’ even if they are present in the directory.

import os

print(os.listdir())
['.idea', 'main.py']
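Two points from the description are easy to check for yourself: the order is arbitrary, and a bytes path gives bytes names back. Here is a small sketch, assuming a temporary directory with made-up file names so that it does not depend on your project folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create two empty files and one subdirectory to list.
    for name in ("b.txt", "a.txt"):
        open(os.path.join(tmp_dir, name), "w").close()
    os.mkdir(os.path.join(tmp_dir, "sub"))

    # listdir() promises no particular order, so sort when order matters.
    str_names = sorted(os.listdir(tmp_dir))
    print(str_names)

    # Passing the path as bytes returns the names as bytes.
    bytes_names = sorted(os.listdir(os.fsencode(tmp_dir)))
    print(bytes_names)
```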

mkdir(path, mode=511, *, dir_fd=None)

Create a directory. If dir_fd is not None, it should be a file descriptor open to a directory, and path should be relative; path will then be relative to that directory. dir_fd may not be implemented on your platform. If it is unavailable, using it will raise a NotImplementedError. The mode argument is ignored on Windows.

import os

os.mkdir('New Directory')

print(os.listdir())
['.idea', 'main.py', 'New Directory']

makedirs(name, mode=511, exist_ok=False)

Super-mkdir; create a leaf directory and all intermediate ones. Works like mkdir, except that any intermediate path segment (not just the rightmost) will be created if it does not exist. If the target directory already exists, raise an OSError if exist_ok is False. Otherwise no exception is raised. The default mode, shown as 511 in the signature above, is the decimal form of the octal value 0o777.

import os

os.makedirs('directory/with/many/levels')

print(os.listdir())
['.idea', 'directory', 'main.py', 'New Directory']
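The exist_ok flag is what makes makedirs safe to call more than once. The following sketch shows both behaviors, assuming a temporary directory as the parent so nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "directory", "with", "many", "levels")

    # First call creates every missing level.
    os.makedirs(target)
    created = os.path.isdir(target)

    # A second call without exist_ok raises FileExistsError (an OSError subclass).
    try:
        os.makedirs(target)
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True

    # With exist_ok=True an existing directory is accepted silently.
    os.makedirs(target, exist_ok=True)
    still_there = os.path.isdir(target)
```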

rmdir(path, *, dir_fd=None)

Remove a directory. If dir_fd is not None, it should be a file descriptor open to a directory, and path should be relative; path will then be relative to that directory. dir_fd may not be implemented on your platform. If it is unavailable, using it will raise a NotImplementedError.

import os

os.rmdir('New Directory')

print(os.listdir())
['.idea', 'directory', 'main.py']

Trying to remove a non-empty directory will produce an error.

import os

os.rmdir('directory')

print(os.listdir())
Traceback (most recent call last):
  File "C:\python\osmodule\main.py", line 3, in <module>
    os.rmdir('directory')
OSError: [WinError 145] The directory is not empty: 'directory'
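If you would rather handle that error than let the program stop, you can catch OSError. And if the goal really is to delete a directory together with its contents, one option is shutil.rmtree, which comes from the standard library's shutil module rather than from os. A hedged sketch, using a temporary directory so nothing important is at risk:

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
top = os.path.join(base, "directory")
os.makedirs(os.path.join(top, "with", "many", "levels"))

# os.rmdir only removes empty directories, so this attempt fails.
try:
    os.rmdir(top)
    rmdir_failed = False
except OSError as err:
    rmdir_failed = True
    print("rmdir refused:", err)

# shutil.rmtree deletes the directory and everything below it. Use with care.
shutil.rmtree(top)
removed = not os.path.exists(top)

# base is empty now, so plain rmdir succeeds.
os.rmdir(base)
```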

removedirs(name)

Super-rmdir; remove a leaf directory and all empty intermediate ones. Works like rmdir except that, if the leaf directory is successfully removed, directories corresponding to rightmost path segments will be pruned away until either the whole path is consumed or an error occurs. Errors during this latter phase are ignored — they generally mean that a directory was not empty.

import os

os.removedirs('directory/with/many/levels')

print(os.listdir())
['.idea', 'main.py']
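The pruning stops at the first directory that is not empty. The sketch below illustrates this by leaving a file (the hypothetical keep.txt) in the top-level directory, again inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    leaf = os.path.join(tmp_dir, "directory", "with", "many", "levels")
    os.makedirs(leaf)

    # keep.txt lives in 'directory', so that level is not empty.
    open(os.path.join(tmp_dir, "directory", "keep.txt"), "w").close()

    # Removes 'levels', 'many', and 'with', then stops quietly at 'directory'.
    os.removedirs(leaf)

    with_removed = not os.path.exists(os.path.join(tmp_dir, "directory", "with"))
    directory_kept = os.path.isdir(os.path.join(tmp_dir, "directory"))
```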

rename(src, dst, *, src_dir_fd=None, dst_dir_fd=None)

Rename a file or directory. If either src_dir_fd or dst_dir_fd is not None, it should be a file descriptor open to a directory, and the respective path string (src or dst) should be relative; the path will then be relative to that directory. src_dir_fd and dst_dir_fd may not be implemented on your platform. If they are unavailable, using them will raise a NotImplementedError.

import os

open('created_file.py', 'w').close()

os.rename('created_file.py', 'renamed_file.py')

print(os.listdir())
['.idea', 'main.py', 'renamed_file.py']

stat(path, *, dir_fd=None, follow_symlinks=True)

Perform a stat system call on the given path.

  • path: Path to be examined; can be a string, bytes, a path-like object, or an open file descriptor (int).
  • dir_fd: If not None, it should be a file descriptor open to a directory, and path should be a relative string; path will then be relative to that directory.
  • follow_symlinks: If False, and the last element of the path is a symbolic link, stat will examine the symbolic link itself instead of the file the link points to.

dir_fd and follow_symlinks may not be implemented on your platform. If they are unavailable, using them will raise a NotImplementedError. It’s an error to use dir_fd or follow_symlinks when specifying path as an open file descriptor.

import os

good_info = os.stat('renamed_file.py')

print(good_info)
os.stat_result(st_mode=33206, st_ino=71494644084647853, st_dev=4063410304, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1642185288, st_mtime=1642185288, st_ctime=1642185288)
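The raw numbers in a stat result are not very readable on their own. Here is a small sketch of how they might be interpreted, assuming the standard library's stat and datetime modules and a temporary file named example.txt that we create just for the demonstration.

```python
import os
import stat
import tempfile
from datetime import datetime

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")
    with open(path, "w") as f:
        f.write("hello")

    info = os.stat(path)

    # st_size is the file size in bytes.
    size_bytes = info.st_size

    # st_mode packs the file type and permission bits into one integer.
    is_regular_file = stat.S_ISREG(info.st_mode)
    permissions = stat.filemode(info.st_mode)

    # st_mtime is seconds since the epoch; convert it for display.
    modified = datetime.fromtimestamp(info.st_mtime)

    print(size_bytes, is_regular_file, permissions, modified)
```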

walk(top, topdown=True, onerror=None, followlinks=False)

Directory tree generator. For each directory in the directory tree rooted at top (including top itself, but excluding ‘.’ and ‘..’), yields a 3-tuple (dirpath, dirnames, filenames). dirpath is a string, the path to the directory. dirnames is a list of the names of the subdirectories in dirpath (excluding ‘.’ and ‘..’). filenames is a list of the names of the non-directory files in dirpath. Note that the names in the lists are just names, with no path components. To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).

If optional arg ‘topdown’ is true or not specified, the triple for a directory is generated before the triples for any of its subdirectories (directories are generated top down). If topdown is false, the triple for a directory is generated after the triples for all of its subdirectories (directories are generated bottom up).

When topdown is true, the caller can modify the dirnames list in-place (e.g., via del or slice assignment), and walk will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, or to impose a specific order of visiting. Modifying dirnames when topdown is false has no effect on the behavior of os.walk(), since the directories in dirnames have already been generated by the time dirnames itself is generated. No matter the value of topdown, the list of subdirectories is retrieved before the tuples for the directory and its subdirectories are generated.
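The in-place pruning described above can be sketched as follows. This example assumes we want to skip hidden directories such as .idea; the directory layout is invented and built inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Build a small tree: one hidden directory and one regular one.
    os.makedirs(os.path.join(tmp_dir, ".idea", "inspectionProfiles"))
    os.makedirs(os.path.join(tmp_dir, "src"))
    open(os.path.join(tmp_dir, ".idea", "misc.xml"), "w").close()
    open(os.path.join(tmp_dir, "src", "main.py"), "w").close()

    visited_files = []
    for root, dirs, files in os.walk(tmp_dir):
        # Slice assignment changes the list in place, which is what walk sees.
        # Rebinding with "dirs = [...]" would have no effect on the walk.
        dirs[:] = sorted(d for d in dirs if not d.startswith("."))
        for name in files:
            full_path = os.path.join(root, name)
            visited_files.append(os.path.relpath(full_path, tmp_dir))

    print(visited_files)
```

Sorting dirs in the same statement also fixes the visiting order, which is the other use the paragraph mentions.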

By default errors from the os.scandir() call are ignored. If optional arg ‘onerror’ is specified, it should be a function; it will be called with one argument, an OSError instance. It can report the error to continue with the walk, or raise the exception to abort the walk. Note that the filename is available as the filename attribute of the exception object.
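To see the onerror hook in action, one simple trigger is to walk a directory that does not exist. The sketch below assumes a made-up directory name and a helper, record_error, that is ours rather than part of the os module.

```python
import os

errors = []


def record_error(err):
    # err is an OSError; its filename attribute names the path that failed.
    errors.append(err.filename)


missing = os.path.join(os.getcwd(), "no-such-directory-for-this-demo")

# Without onerror this loop would simply produce nothing and stay silent.
walked = list(os.walk(missing, onerror=record_error))

print("Directories visited:", len(walked))
print("Errors reported:", errors)
```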

By default, os.walk does not follow symbolic links to subdirectories on systems that support them. In order to get this functionality, set the optional argument ‘followlinks’ to true.

import os

# A raw string keeps the backslashes in the Windows path from being read as escapes.
for root, dirs, files in os.walk(r'C:\python\osmodule'):
    for name in files:
        print('file: ' + os.path.join(root, name))
    for name in dirs:
        print('dir: ' + os.path.join(root, name))
file: C:\python\osmodule\main.py
file: C:\python\osmodule\renamed_file.py
dir: C:\python\osmodule\.idea
file: C:\python\osmodule\.idea\.gitignore
file: C:\python\osmodule\.idea\misc.xml
file: C:\python\osmodule\.idea\modules.xml
file: C:\python\osmodule\.idea\osmodule.iml
file: C:\python\osmodule\.idea\workspace.xml
dir: C:\python\osmodule\.idea\inspectionProfiles
file: C:\python\osmodule\.idea\inspectionProfiles\profiles_settings.xml
file: C:\python\osmodule\.idea\inspectionProfiles\Project_Default.xml

os.environ

os.environ is a mapping object that represents the environment variables of the current process. Each key is a variable name and each value is the value of that variable, both as strings. os.environ behaves like a Python dictionary, so common dictionary operations such as get and item assignment work as expected. We can also modify os.environ, but any change affects only the current process and the child processes it starts afterwards; it does not change the value permanently.

import os

good_vals = os.environ

for k, v in good_vals.items():
    print(f"{k} = {v}")
(prints every environment variable and its value)

Getting a single environ value.

import os

good_vals = os.environ.get('homedrive')

print(good_vals)
C:
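Setting and removing a variable works the same way as with a dictionary. A minimal sketch, assuming two hypothetical variable names (OS_TUTORIAL_DEMO and OS_TUTORIAL_MISSING) that are unlikely to exist on your system:

```python
import os

# Assigning a value sets the variable for this process only.
os.environ["OS_TUTORIAL_DEMO"] = "hello"
value = os.environ.get("OS_TUTORIAL_DEMO")

# get() accepts a default for variables that are not set.
fallback = os.environ.get("OS_TUTORIAL_MISSING", "default")

# Deleting the key removes the variable from this process again.
del os.environ["OS_TUTORIAL_DEMO"]
removed = "OS_TUTORIAL_DEMO" not in os.environ

print(value, fallback, removed)
```

Note that values must be strings; assigning an integer raises a TypeError.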

os.path.join()

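join(path, *paths) – Join two (or more) paths, inserting the separator for the current platform. If a later component is an absolute path, the components before it are discarded; on Windows the drive letter is kept, which is what the example here relies on.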

import os

good_vals = os.environ.get('homedrive')

joined = os.path.join(good_vals, '/index.html')

print(joined)
C:/index.html
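The result of join depends on the platform. To compare the two flavors on any machine, this sketch calls the platform-specific modules that sit behind os.path directly: posixpath (Linux and macOS rules) and ntpath (Windows rules). The example paths are made up.

```python
import ntpath
import posixpath

# Ordinary joins insert the platform's separator.
posix_joined = posixpath.join("path", "to", "file.html")
nt_joined = ntpath.join("path", "to", "file.html")
print(posix_joined)  # path/to/file.html
print(nt_joined)     # path\to\file.html

# An absolute component discards what came before it...
posix_reset = posixpath.join("/var/www", "/index.html")
print(posix_reset)   # /index.html

# ...but on Windows a bare drive letter is kept.
nt_drive = ntpath.join("C:", "/index.html")
print(nt_drive)      # C:/index.html
```

In everyday code you would keep using os.path.join, which picks the right flavor automatically.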

os.path.basename()

basename(p) Returns the final component of a pathname.

import os

basename = os.path.basename('path/to/file.html')

print(basename)
file.html

os.path.dirname()

dirname(p) Returns the directory component of a pathname.

import os

dirname = os.path.dirname('path/to/file.html')

print(dirname)
path/to

os.path.split(p)

Split a pathname. Return tuple (head, tail) where tail is everything after the final slash. Either part may be empty.

import os

split = os.path.split('path/to/file.html')

print(split)
('path/to', 'file.html')
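split ties together the two previous functions: its head is what dirname returns and its tail is what basename returns. The short sketch below checks that and shows the two cases where one part comes back empty.

```python
import os

p = "path/to/file.html"
head, tail = os.path.split(p)

# split() is equivalent to calling dirname() and basename() together.
same_as_parts = (head, tail) == (os.path.dirname(p), os.path.basename(p))
print(same_as_parts)  # True

# A trailing slash leaves the tail empty.
trailing = os.path.split("path/to/")
print(trailing)       # ('path/to', '')

# No slash at all leaves the head empty.
bare = os.path.split("file.html")
print(bare)           # ('', 'file.html')
```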

os.path.exists(path)

Test whether a path exists. Returns False for broken symbolic links.

import os

imaginary = os.path.exists('path/to/file.html')

real = os.path.exists('c:/python/osmodule/main.py')

print(imaginary)
print(real)
False
True

os.path.isfile() and os.path.isdir()

Checks to see if the path is a file or if the path is a directory.

import os

contents = os.listdir()

for item in contents:
    if os.path.isdir(item):
        print(item + " is a directory")
    elif os.path.isfile(item):
        print(item + " is a file")
.idea is a directory
main.py is a file
renamed_file.py is a file

os.path.splitext(p)

Split the extension from a pathname. Extension is everything from the last dot to the end, ignoring leading dots. Returns “(root, ext)”; ext may be empty.

import os

file_and_extension = os.path.splitext('renamed_file.py')

print(file_and_extension)
('renamed_file', '.py')
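The "last dot" and "ignoring leading dots" rules are easiest to see with a few edge cases. The file names below are invented for illustration.

```python
import os

# Only the last extension is split off.
double_ext = os.path.splitext("archive.tar.gz")
print(double_ext)   # ('archive.tar', '.gz')

# A leading dot marks a hidden file, not an extension.
hidden = os.path.splitext(".bashrc")
print(hidden)       # ('.bashrc', '')

# No dot means an empty extension.
no_ext = os.path.splitext("README")
print(no_ext)       # ('README', '')

# Directory components stay with the root.
with_dirs = os.path.splitext("path/to/file.html")
print(with_dirs)    # ('path/to/file', '.html')
```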

Putting It All Together

Each of the examples on its own is helpful, but a fully working program will show how these types of commands fit together. By combining a few of the os module functions we can create some neat utility programs. The program below bulk renames files in a folder and all of its subfolders. Doing this manually is quite a tedious process, which makes it a perfect example of how to use Python to automate the boring stuff, as they would say. Shout out to Al Sweigart! Here is the code; watch for the calls to os.path.exists, os.walk, and os.rename.

import sys
import os


def main():
    find, replace, root_dir = get_input()

    if not os.path.exists(root_dir):
        print("This path does not exist.")

    else:
        print("Doing replacement...")
        rename_all_files(find, replace, root_dir)
        print()


def get_input():
    print(" You entered", len(sys.argv) - 1, "arguments at the command line.")

    if len(sys.argv) != 4:
        raise Exception(
            " Error: Wrong number of arguments. Enter 3 arguments: 1. "
            "string to replace 2. replacement string 3. path for files ")

    find = sys.argv[1]
    replace = sys.argv[2]
    root_dir = sys.argv[3]

    print(' Find this string:\t', find)
    print(' Replace with this string:\t', replace)
    print(' Start in this directory:\t', root_dir)
    print()
    return find, replace, root_dir


def rename_all_files(find, replace, root_dir):
    files_changed_count = 0

    for (root, dirs, files) in os.walk(root_dir):

        for old_filename in files:
            if not os.path.exists(os.path.join(root, old_filename)):
                print("This file name does not exist.")
                # Skip only this entry; the other files in the folder are still processed.
                continue

            new_name = old_filename.replace(find, replace)

            if old_filename != new_name:
                print("Old filename is: " + str(old_filename))
                print('New filename is:', new_name, '\n')

                path_with_old_file = os.path.join(root, old_filename)
                path_with_new_file = os.path.join(root, new_name)

                os.rename(path_with_old_file, path_with_new_file)
                files_changed_count = files_changed_count + 1
    print()
    print('Renamed: ', files_changed_count, ' file(s)')


if __name__ == '__main__':
    main()
$ python rename_files.py 'old' 'new' 'c:/python/renametest'
 You entered 3 arguments at the command line.
 Find this string:       old
 Replace with this string:       new
 Start in this directory:        c:/python/renametest

Doing replacement...
Old filename is: anotheroldfile.html
New filename is: anothernewfile.html

Old filename is: oldfile.txt
New filename is: newfile.txt

Old filename is: someoldfile.txt
New filename is: somenewfile.txt

Old filename is: nestedanotheroldfile.html
New filename is: nestedanothernewfile.html

Old filename is: nestedoldfile.txt
New filename is: nestednewfile.txt

Old filename is: nestedsomeoldfile.txt
New filename is: nestedsomenewfile.txt

Old filename is: 3deepanotheroldfile.html
New filename is: 3deepanothernewfile.html

Old filename is: 3deepoldfile.txt
New filename is: 3deepnewfile.txt

Old filename is: 3deepsomeoldfile.txt
New filename is: 3deepsomenewfile.txt


Renamed:  9  file(s)

Running this file took literally one second. Renaming all of these files manually would take much longer.
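Bulk renames are hard to undo, so it can be reassuring to preview the changes first. The following is a sketch of a dry-run variant, not part of the program above: plan_renames is a hypothetical helper that uses the same os.walk loop but only collects the old and new paths. The sample files are created in a temporary directory.

```python
import os
import tempfile


def plan_renames(find, replace, root_dir):
    """Return (old_path, new_path) pairs without touching the file system."""
    planned = []
    for root, dirs, files in os.walk(root_dir):
        for old_filename in files:
            new_name = old_filename.replace(find, replace)
            if new_name != old_filename:
                planned.append((os.path.join(root, old_filename),
                                os.path.join(root, new_name)))
    return planned


with tempfile.TemporaryDirectory() as tmp_dir:
    # Build a tiny sample tree with two matching files and one that does not match.
    os.makedirs(os.path.join(tmp_dir, "nested"))
    for rel in ("oldfile.txt",
                os.path.join("nested", "nestedoldfile.txt"),
                "unrelated.txt"):
        open(os.path.join(tmp_dir, rel), "w").close()

    plan = plan_renames("old", "new", tmp_dir)
    for old_path, new_path in plan:
        print(old_path, "->", new_path)

    planned_names = sorted(os.path.basename(new) for _, new in plan)
    # Nothing was renamed: the original file is still in place.
    untouched = os.path.exists(os.path.join(tmp_dir, "oldfile.txt"))
```

Once the plan looks right, passing each pair to os.rename performs the actual rename.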

Python os Module Tutorial Summary

The os module in Python makes many attributes and functions available so you can interact with the underlying operating system in an easy and efficient way. Most of them behave the same on Windows, Linux, and macOS, although some options, such as dir_fd or the mode argument of mkdir, depend on the platform. We had a good look at many of the commonly used functions in the os module and then looked at a complete Python program that renames files in a directory and its subdirectories.