kuda.ai | code. guitar. life.

Python's `os.walk(path)`: Walk a path and yield file paths

created in February 2022

With the help of two modules from Python's standard library, the os module and the glob module, we will build a function that walks a directory tree and yields the path of every file that matches your criteria. You could, for example, recursively iterate a directory and yield all files that end in .json.
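Before building the function, it may help to see what os.walk actually hands back. The sketch below is only an illustration: the helper name list_walk_steps and the throwaway directory layout are made up for this example. It collects the (dirpath, dirnames, filenames) tuples that os.walk yields, one per directory visited.

```python
import os
import tempfile


def list_walk_steps(root: str) -> list:
    """Collect the (dirpath, dirnames, filenames) tuples that os.walk yields.

    Hypothetical helper for illustration; dirpath is reported relative to root.
    """
    steps = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place makes the traversal order deterministic.
        dirnames.sort()
        steps.append(
            (os.path.relpath(dirpath, root), list(dirnames), sorted(filenames))
        )
    return steps


# Build a small throwaway tree: a.json, sub/b.json and sub/c.txt
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.json", os.path.join("sub", "b.json"), os.path.join("sub", "c.txt")):
        open(os.path.join(root, rel), "w").close()
    for step in list_walk_steps(root):
        print(step)
```

Each tuple describes one directory: its path, the subdirectories it contains, and the files it contains. The function in the next section only needs the first element and lets glob do the matching.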

Walk directories and yield all json files

import os
import glob
from typing import Iterator

def get_json_file_paths(path: str, filename: str = "*.json") -> Iterator[str]:
    """Walk path and yield all file paths that match the filename.

    Args:
        path (str):
            The function will start from this directory and walk 
            all nested directories.

        filename (str, optional):
            Glob pattern for the file name. Whenever a file matches it, the
            function will yield the absolute path to this file. Pass for
            instance "*.json" to yield all files that have the .json extension.

    Yields:
        A string that is the absolute path to the file that was found.

    https://docs.python.org/3/library/os.html#os.walk
    https://docs.python.org/3/library/glob.html#glob.glob
    """
    for dirpath, dirnames, filenames in os.walk(path):
        # Match the pattern inside the directory that os.walk is visiting.
        file_paths = glob.glob(os.path.join(dirpath, filename))
        for file_path in file_paths:
            yield os.path.abspath(file_path)

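if __name__ == "__main__":
    # os.walk does not expand "~", so resolve the home directory first.
    paths_to_json_files = get_json_file_paths(
        os.path.expanduser("~/projects/analyse_economic_data/data"),
        "*.json"
    )
    # The function returns a generator; iterating it performs the walk.
    for json_file_path in paths_to_json_files:
        print(json_file_path)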
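As a possible variation, not the approach used above, the matching could also be done with fnmatch from the standard library. os.walk already lists the file names in each directory, so fnmatch.filter can test those names directly instead of asking glob to scan the directory a second time. The function name iter_matching_files and the temporary directory layout are assumptions made for this sketch.

```python
import fnmatch
import os
import tempfile
from typing import Iterator


def iter_matching_files(path: str, pattern: str = "*.json") -> Iterator[str]:
    """Yield absolute paths of files under path whose name matches pattern.

    Hypothetical alternative to get_json_file_paths, for illustration only.
    """
    # expanduser turns a leading "~" into the home directory; os.walk does not.
    for dirpath, _dirnames, filenames in os.walk(os.path.expanduser(path)):
        # fnmatch.filter tests the names os.walk already collected.
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.abspath(os.path.join(dirpath, name))


# Try it on a throwaway tree: a.json, sub/b.json and sub/c.txt
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.json", os.path.join("sub", "b.json"), os.path.join("sub", "c.txt")):
        open(os.path.join(root, rel), "w").close()
    for found in sorted(iter_matching_files(root)):
        print(os.path.relpath(found, root))
```

Two differences from the glob version are worth keeping in mind. Only entries that os.walk reports as files are considered, so a directory that happens to be named something.json is not yielded. Also, fnmatch does not treat names starting with a dot specially, so a hidden file such as .cache.json matches "*.json" here, while glob.glob would skip it.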