Lesson 30. Looping Over Files In A Directory

Looping over files in a directory is a basic ETL task. In this tutorial, I’m going to introduce you to the syntax, using Python’s built-in os module. If you downloaded the code from GitHub, it includes small sample files to work with.

In later lessons, you will see how it is done with live files.

Examples

Example #1: Loop Over Everything In Folder

import os

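# os.getcwd() returns the current working directory,
# so run the script from the project folder.
script_dir = os.getcwd()
# os.path.join inserts the correct separator for the operating system,
# so the folder names need no trailing backslashes.
data_directory = 'data'
example_directory = 'FileLoopExample'
path = os.path.join(script_dir, data_directory, example_directory)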

# os.listdir returns the name of every entry in the folder,
# subfolders included.
for filename in os.listdir(path):
    print(filename)
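
Because os.listdir returns subfolder names along with file names, a loop that should only touch files may need a filter. The following is a minimal sketch of one way to do that with os.path.isfile. The helper name list_files_only and the sample file names are hypothetical and are not part of the lesson's GitHub code.

```python
import os
import tempfile


def list_files_only(path):
    """Return the sorted names of regular files in path, skipping subfolders."""
    return sorted(
        name for name in os.listdir(path)
        if os.path.isfile(os.path.join(path, name))
    )


# Demonstration on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, 'a.csv'), 'w').close()
    open(os.path.join(tmp, 'b.txt'), 'w').close()
    os.mkdir(os.path.join(tmp, 'archive'))
    print(list_files_only(tmp))  # ['a.csv', 'b.txt']
```

Note that os.path.isfile needs the full path, which is why each name is joined back onto the folder before the check.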

Example #2: Loop Over Files With A Specific File Extension

import os

# Build the path to the sample folder from the current working directory.
script_dir = os.getcwd()
data_directory = 'data'
example_directory = 'FileLoopExample'
path = os.path.join(script_dir, data_directory, example_directory)

# Print only the entries whose names end in .csv.
for filename in os.listdir(path):
    if filename.endswith('.csv'):
        print(filename)
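
The endswith check above is case-sensitive and tests a single extension. As a hedged sketch of a common variation, str.endswith also accepts a tuple of suffixes, and lowercasing the name first lets a file such as REPORT.CSV match too. The helper name files_with_extensions and the sample file names are hypothetical.

```python
import os
import tempfile


def files_with_extensions(path, extensions=('.csv',)):
    """Return sorted entry names in path that end with any given extension.

    Matching ignores case, so '.csv' also matches 'REPORT.CSV'.
    """
    wanted = tuple(ext.lower() for ext in extensions)
    return sorted(
        name for name in os.listdir(path)
        if name.lower().endswith(wanted)
    )


# Demonstration on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('sales.csv', 'REPORT.CSV', 'notes.txt', 'image.png'):
        open(os.path.join(tmp, name), 'w').close()
    print(files_with_extensions(tmp))  # ['REPORT.CSV', 'sales.csv']
    print(files_with_extensions(tmp, ('.csv', '.txt')))
```

Uppercase letters sort before lowercase ones, which is why REPORT.CSV comes first in the output.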

Example #3: Loop Over Files In Subdirectories Recursively

import os

# Build the path to the sample folder from the current working directory.
script_dir = os.getcwd()
data_directory = 'data'
example_directory = 'FileLoopExample'
path = os.path.join(script_dir, data_directory, example_directory)

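# os.walk visits the folder and every subfolder beneath it.
# On each step, subdir is the folder being visited and files lists its file names.
for subdir, dirs, files in os.walk(path):
    for filename in files:
        print(filename)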
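
The recursive loop prints bare file names, which are not enough to open a file that lives in a subfolder. A minimal sketch of one way to recover the full path is to join subdir and filename on each step. The generator name walk_file_paths and the sample folder layout are hypothetical.

```python
import os
import tempfile


def walk_file_paths(path):
    """Yield the full path of every file under path, including subfolders."""
    for subdir, dirs, files in os.walk(path):
        for filename in files:
            # subdir is the folder currently being visited by os.walk.
            yield os.path.join(subdir, filename)


# Demonstration on a throwaway folder tree with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'sub'))
    open(os.path.join(tmp, 'top.csv'), 'w').close()
    open(os.path.join(tmp, 'sub', 'nested.csv'), 'w').close()
    for full_path in sorted(walk_file_paths(tmp)):
        # Show each path relative to the starting folder for readability.
        print(os.path.relpath(full_path, tmp))
```

Each yielded path can be passed directly to open() or to a CSV reader, whichever folder the file sits in.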
