Python's `os.walk(path)`: Walk a path and yield file paths
created in February 2022
With the help of two modules from Python's standard library, os and glob, we will build a function that walks a directory and yields the path of every file that matches your criteria. You could, for example, recursively iterate over a directory and yield all files that end in .json.
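To make the walking part concrete, here is a minimal sketch of what os.walk produces on its own. The helper name list_walk_steps and the throwaway directory layout are assumptions made up for this illustration; only os.walk, os.path and tempfile come from the standard library.

```python
import os
import tempfile


def list_walk_steps(root: str) -> list:
    """Collect what os.walk yields, with each dirpath made relative to root."""
    steps = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place makes the traversal order deterministic.
        dirnames.sort()
        steps.append((os.path.relpath(dirpath, root), list(dirnames), sorted(filenames)))
    return steps


# Build a small temporary tree: a.json, sub/b.json and sub/notes.txt.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.json", os.path.join("sub", "b.json"), os.path.join("sub", "notes.txt")):
        open(os.path.join(root, rel), "w").close()
    for step in list_walk_steps(root):
        print(step)
```

Each step is a tuple of the current directory, its subdirectories, and the file names directly inside it. os.walk visits the starting directory first and then descends into every nested directory.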
Walk directories and yield all JSON files
import glob
import os
from typing import Iterator


def get_json_file_paths(path: str, filename: str = "*.json") -> Iterator[str]:
    """Walk path and yield all file paths that match the filename.

    Args:
        path (str):
            The function will start from this directory and walk
            all nested directories. A leading "~" is expanded to the
            user's home directory.
        filename (str, optional):
            Whenever a file matches this pattern, the function will yield
            the absolute path to this file. Pass for instance "*.json" to
            yield all files that have the .json extension.

    Yields:
        A string that is the absolute path to the file that was found.

    References:
        https://docs.python.org/3/library/os.html#os.walk
        https://docs.python.org/3/library/glob.html#glob.glob
    """
    # os.walk does not expand "~" itself, so do it explicitly.
    root = os.path.expanduser(path)
    for dirpath, dirnames, filenames in os.walk(root):
        # Escape the directory part so that characters like "[" in a
        # folder name are not treated as glob wildcards.
        file_paths = glob.glob(os.path.join(glob.escape(dirpath), filename))
        for file_path in file_paths:
            yield os.path.abspath(file_path)


if __name__ == "__main__":
    paths_to_json_files = get_json_file_paths(
        "~/projects/analyse_economic_data/data",
        "*.json"
    )
    # The function returns a generator, so iterate over it to run the walk.
    for json_file_path in paths_to_json_files:
        print(json_file_path)
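The function above reads every directory twice: once in os.walk and once more in glob.glob. As a possible alternative, and not the approach this post uses, the file names that os.walk already hands over can be filtered with fnmatch from the standard library. The name iter_matching_files is made up for this sketch, and expanding "~" with os.path.expanduser is an assumption about how the path is passed in.

```python
import fnmatch
import os
from typing import Iterator


def iter_matching_files(path: str, pattern: str = "*.json") -> Iterator[str]:
    """Yield absolute paths of files under path whose names match pattern."""
    # Assumption: callers may pass "~", which os.walk does not expand itself.
    root = os.path.expanduser(path)
    for dirpath, _dirnames, filenames in os.walk(root):
        # filenames holds only non-directory entries of dirpath, so a folder
        # that happens to be called "archive.json" is never yielded.
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.abspath(os.path.join(dirpath, name))
```

One difference in behavior: glob skips hidden files such as .cache.json when the pattern starts with "*", while fnmatch treats a leading dot like any other character, so this variant yields them too.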