Python - List Files in a Directory

There are various ways to list the files and directories using Python. In this article we'll take a look at some functions that can help us with this. First, let's assume we have a directory called 'files' on our desktop with some simple files in it:

files
├── data.csv
├── image.jpg
├── notes.txt
└── subfolder
    └── subfolder_notes.txt

Now we'll take a look at the different ways we can access this information with Python.

All file and directory names

To list the names of all files and folders in a directory you can use os.listdir(). If you don't pass an argument to os.listdir(), it returns the file and directory names in the current working directory.

import os

files = os.listdir('/Users/user/Desktop/files')
for f in files:
    print(f)

# subfolder
# notes.txt
# data.csv
# image.jpg

There are three things to notice about this method. It is not recursive, so it does not list the contents of subfolder. Items are returned in an arbitrary order. Finally, there is no distinction between files and directories; they are all just strings in a list. You could try to differentiate between files and directories manually by parsing for file extensions, but this is unreliable because a directory name can contain a dot and a file can have no extension at all. Otherwise, read on to learn about other methods.
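One possible way to tell the two apart without relying on extensions is to check each name with os.path.isfile() and os.path.isdir(). The following is a minimal sketch rather than a definitive approach; the helper name split_entries and the temporary directory that stands in for the 'files' folder are assumptions made for illustration.

```python
import os
import tempfile


def split_entries(directory):
    """Return (file_names, dir_names) for the top level of directory."""
    file_names, dir_names = [], []
    for name in os.listdir(directory):
        # os.listdir() returns bare names, so rebuild the full path first.
        full_path = os.path.join(directory, name)
        if os.path.isdir(full_path):
            dir_names.append(name)
        elif os.path.isfile(full_path):
            file_names.append(name)
    # Sort because os.listdir() makes no promise about ordering.
    return sorted(file_names), sorted(dir_names)


# Build a throwaway copy of the example tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'subfolder'))
    for name in ('data.csv', 'image.jpg', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()

    file_names, dir_names = split_entries(tmp)
    print(file_names)  # ['data.csv', 'image.jpg', 'notes.txt']
    print(dir_names)   # ['subfolder']
```

Joining the directory and the name before testing matters, because a bare name would otherwise be resolved against the current working directory.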

File and directory names separately

os.walk() iterates through the provided directory first and then recursively through each sub-directory. On each iteration it yields a tuple containing three items: the path (string) of the directory currently being iterated upon (see root below), a list containing the names of all the directories in that directory (see dirs) and a list containing all of the file names in that directory (see files).

import os

for root, dirs, files in os.walk('/Users/user/Desktop/files'):
    print(root)
    print(dirs)
    print(files)
    print()  # blank line between directories

# /Users/user/Desktop/files
# ['subfolder']
# ['notes.txt', 'data.csv', 'image.jpg']
#
# /Users/user/Desktop/files/subfolder
# []
# ['subfolder_notes.txt']
#

Using walk() gives you a more compartmentalised structure of items in sub-directories which you can then further process or ignore as needed. walk() requires a path argument (top) to be passed to it when called, and it also takes three optional keyword arguments: topdown=True, onerror=None and followlinks=False.
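As an illustration of that further processing, the names in files can be joined with root to build the full path of every file in the tree. This is only a sketch under stated assumptions: the helper name all_file_paths is hypothetical, and a temporary directory stands in for the 'files' folder.

```python
import os
import tempfile


def all_file_paths(top):
    """Return the full path of every file under top, sorted."""
    paths = []
    for root, dirs, files in os.walk(top):
        for name in files:
            # files holds bare names; root is the directory they live in.
            paths.append(os.path.join(root, name))
    return sorted(paths)


# Build a small throwaway tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'subfolder'))
    open(os.path.join(tmp, 'notes.txt'), 'w').close()
    open(os.path.join(tmp, 'subfolder', 'subfolder_notes.txt'), 'w').close()

    # Show the paths relative to the temporary directory for readability.
    relative = [os.path.relpath(p, tmp) for p in all_file_paths(tmp)]
    print(relative)
```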

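topdown controls the order of iteration. With the default of True, walk() starts at the top directory provided and yields each directory before its sub-directories. Setting topdown to False reverses this, so each directory is yielded after all of its sub-directories.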

By default walk() ignores errors raised while listing directories. onerror can instead be set to a function that is called when such an error occurs. This function is passed an OSError instance with a filename attribute containing the name of the offending file. At this point you can choose to log the error and continue, or raise an exception to stop the walk.

If followlinks is set to True then walk() will follow symbolic links that lead to directories. Be careful when using it, because a link that points back to a parent directory creates a circular reference and can cause an infinite loop.
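The topdown and onerror arguments described above could be exercised roughly as follows. This is a hedged sketch: the handler name remember_error, the errors list and the missing path are all made up for illustration, and a temporary directory is used in place of the 'files' folder.

```python
import os
import tempfile

errors = []


def remember_error(error):
    # walk() passes an OSError instance; filename names the offending path.
    errors.append(error.filename)


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'subfolder'))

    # topdown=False yields sub-directories before the directory holding them.
    visited = [
        os.path.relpath(root, tmp)
        for root, dirs, files in os.walk(tmp, topdown=False,
                                         onerror=remember_error)
    ]
    print(visited)  # ['subfolder', '.']

    # Walking a path that does not exist yields nothing, but the handler
    # still receives the error instead of it being silently ignored.
    missing = os.path.join(tmp, 'does_not_exist')
    for _ in os.walk(missing, onerror=remember_error):
        pass
    print(errors == [missing])  # True
```

Collecting errors in a list keeps the walk going; raising inside the handler instead would stop it at the first failure.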

File paths matching a pattern

If you need to access files or folders that match a specific pattern then you can use glob.glob() to do this.

import glob

paths = glob.glob('/Users/user/Desktop/files/**', recursive=True)
for p in paths:
    print(p)

# /Users/user/Desktop/files/
# /Users/user/Desktop/files/subfolder
# /Users/user/Desktop/files/subfolder/subfolder_notes.txt
# /Users/user/Desktop/files/notes.txt
# /Users/user/Desktop/files/data.csv
# /Users/user/Desktop/files/image.jpg

glob() takes a string argument that can be a plain directory path, and it also supports Unix shell-style pattern matching. You can use this to search for things like /files/*.txt and so on. Note that this will return the full path for each object found, not just the filename, so you may need to do some further processing on each result if you just want to get the filename.
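A pattern search combined with that further processing might look like the sketch below. It assumes a temporary directory standing in for the 'files' folder, and it uses os.path.basename() as one possible way of stripping each result down to its filename.

```python
import glob
import os
import tempfile

# Build a throwaway copy of the example tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('data.csv', 'image.jpg', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    os.mkdir(os.path.join(tmp, 'subfolder'))
    open(os.path.join(tmp, 'subfolder', 'subfolder_notes.txt'), 'w').close()

    # *.txt matches text files in the top-level directory only.
    top_level = glob.glob(os.path.join(tmp, '*.txt'))

    # With recursive=True, ** matches zero or more nested directories.
    everywhere = glob.glob(os.path.join(tmp, '**', '*.txt'), recursive=True)

    # glob() returns full paths; basename() keeps only the final component.
    top_names = sorted(os.path.basename(p) for p in top_level)
    all_names = sorted(os.path.basename(p) for p in everywhere)
    print(top_names)  # ['notes.txt']
    print(all_names)  # ['notes.txt', 'subfolder_notes.txt']
```

The results are sorted here only to make the output predictable, since glob() does not guarantee any particular order.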