Motivation: Practical Recursion on File Systems
File System Operations
and File Tree Traversal
CS111 Computer Programming
Department of Computer Science
Wellesley College
So far you’ve used recursion to make self-similar pictures in
cs1graphics, Turtle World, and Picture world.
But can recursion be used to solve more practical problems? YES!
Today we’ll see that the file system on your computer is organized
in a recursive way, and so it’s most natural to process it using
recursive functions. There are many practical recursive functions
that ‘‘walk’’ over trees of folders and files. You’ll see some in today’s
lecture and others on PS9.
FileSystemOps 15-2
Sample Directory: Expanded Folder View
Sample Directory:
File Tree View
lecture_15
testdir
lecture-15.ipynb
pset
hyp.py
graphics
cs1graphics.py
FileSystemOps 15-3
shrub
images
shrub1.png
remember
pubs.json
shrub2.png
ephemeral.py
shrub.py
memories.txt
tracks.csv
persistent.py
A file tree is recursively defined as:
(1) a file (a leaf of the tree); or
(2) a directory (a.k.a. folder)
(an intermediate node of the tree)
containing zero or more file trees
FileSystemOps 15-4
File System Operations: listdir
File System Operations: getcwd
Via the os module, Python provides a way to manipulate the
directories and files in a file system. To use these features, we first
need to import the os module:
import os
The os.listdir function returns a list of all files/directories in
the argument directory.
In [3]: os.listdir(os.getcwd())
Out[3]: ['.DS_Store', '.ipynb_checkpoints', 'lecture-15.ipynb', 'testdir']
os.listdir shows often hidden ‘‘dot files’’
The os.getcwd function returns the current working directory as
a string. This is the directory that the Python program is currently
“connected” to. All relative file names will be interpreted relative to
this directory.
In [1]: import os
'.' is a synonym for the connected directory
In [4]: os.listdir('.')
Out[4]: ['.DS_Store', '.ipynb_checkpoints', 'lecture-15.ipynb', 'testdir’]
In [5]: os.listdir('testdir')
Out[5]: ['.DS_Store', 'hyp.py', 'pset', 'pubs.json', 'remember', 'tracks.csv']
components of a file path are separated
In [6]: os.listdir('testdir/remember')
separates
Out[6]: ['.DS_Store', 'ephemeral.py', 'memories.txt',
'persistent.py']
by /
In [7]: os.listdir('testdir/pset')
Out[7]: ['.DS_Store', 'scene', 'shrub']
In [2]: os.getcwd()
Out[2]: '/Users/wendy/Downloads/lecture_15'
In [8]: os.listdir('testdir/pset/shrub')
Out[8]: ['.DS_Store', 'images', 'shrub.py']
FileSystemOps 15-5
File System Operations: path.exists
The os.path.exists function determines whether the given
name denotes a file/directory in the filesystem.
In [10]: os.path.exists('testdir/remember/memories.txt')
Out[10]: True
In [11]: os.path.exists('testdir/remember/catPlaysPiano.png')
Out[11]: False
In [9]: os.listdir('testdir/pset/shrub/images')
Out[9]: ['shrub1.png', 'shrub2.png']
FileSystemOps 15-6
File System Ops: path.isfile & path.isdir
os.path.isfile and os.path.isdir determine whether the given name
is a file or directory, respectively. If the file does not exist, these return False.
In [15]: os.path.isdir('testdir/remember')
Out[15]: True
In [16]: os.path.isfile('testdir/remember')
Out[16]: False
In [17]: os.path.isdir('testdir/remember/memories.txt')
Out[17]: False
In [12]: os.path.exists('testdir/remember')
Out[12]: True
In [18]: os.path.isfile('testdir/remember/memories.txt')
Out[18]: True
In [13]: os.path.exists('remember/memories.txt')
Out[13]: False
In [19]: os.path.isdir('memories.txt')
Out[19]: False
In [14]: os.path.exists('memories.txt')
Out[14]: False
In [20]: os.path.isfile('memories.txt')
Out[20]: False
FileSystemOps 15-7
FileSystemOps 15-8
File Tree Traversal: Print all Dirs and Files
In [25]: printFileTree('testdir')
testdir
testdir/hyp.py
testdir/pset
testdir/pset/scene
testdir/pset/scene/cs1graphics.py
testdir/pset/shrub
testdir/pset/shrub/images
testdir/pset/shrub/images/shrub1.png
testdir/pset/shrub/images/shrub2.png
testdir/pset/shrub/shrub.py
testdir/pubs.json
testdir/remember
testdir/remember/ephemeral.py
testdir/remember/memories.txt
testdir/remember/persistent.py
testdir/tracks.csv
In [26]: printFileTree('testdir/hyp.py')
testdir/hyp.py
In [27]: printFileTree('testdir/remember')
testdir/remember
testdir/remember/ephemeral.py
testdir/remember/memories.txt
testdir/remember/persistent.py
FileSystemOps 15-9
File Tree Traversal: Better Version
File Tree Traversal: Broken Version
def printFileTreeBroken(root):
'''Print all directories and files
(one per line) starting at root,
which is a directory or file name'''
Implicit else: pass
if os.path.isfile(root):
at end handles other
print root
file types and nonexistent
elif os.path.isdir(root):
file names by doing
print root
for fileOrDir in os.listdir(root):
nothing for them.
printFileTreeBroken(fileOrDir)
In [28]: printFileTreeBroken('testdir')
testdir
Why does this print only two names?
.DS_Store
In [29]: os.listdir('testdir')
Out[29]: ['.DS_Store', 'hyp.py', 'pset', 'pubs.json', 'remember',
'tracks.csv']
In [30]: os.path.isfile('hyp.py')
Out[30]: False
In [31]: os.path.isfile('testdir/hyp.py')
Out[31]: True
FileSystemOps 15-10
File Tree Traversal: Filter out Dot Files
def printFileTreeBetter(root):
if os.path.isfile(root):
print root
elif os.path.isdir(root):
print root
for fileOrDir in os.listdir(root):
printFileTreeBetter(os.path.join(root, fileOrDir))
This acts like root + '/' + fileOrDir,
but is clearer and less error prone.
In [32]: printFileTreeBetter('testdir')
testdir
testdir/.DS_Store
Dot files are still printed, but we’d like to hide them.
testdir/hyp.py
testdir/pset
testdir/pset/.DS_Store
testdir/pset/scene
testdir/pset/scene/cs1graphics.py
…
FileSystemOps 15-11
def printFileTree(root):
if os.path.isfile(root):
print root
elif os.path.isdir(root):
print root
for fileOrDir in os.listdir(root):
if fileOrDir[0] != '.': # filter out dot files
printFileTree(os.path.join(root, fileOrDir))
In [32]: printDirectoryTree('testdir')
testdir
testdir/hyp.py
testdir/pset
testdir/pset/scene
testdir/pset/scene/cs1graphics.py
testdir/pset/shrub
testdir/pset/shrub/images
testdir/pset/shrub/images/shrub1.png
testdir/pset/shrub/images/shrub2.png
…
FileSystemOps 15-12
File Tree Traversal: Print Files Only
def printFiles(root):
'''Print only the (nondirectory) files encountered
in a file tree traversal starting at root'''
if os.path.isfile(root):
print root
elif os.path.isdir(root):
for fileOrDir in os.listdir(root):
if fileOrDir[0] != '.':
printFiles(os.path.join(root, fileOrDir))
In [35]: printFiles('testdir')
testdir/hyp.py
testdir/pset/scene/cs1graphics.py
testdir/pset/shrub/images/shrub1.png
testdir/pset/shrub/images/shrub2.png
testdir/pset/shrub/shrub.py
testdir/pubs.json
testdir/remember/ephemeral.py
testdir/remember/memories.txt
testdir/remember/persistent.py
testdir/tracks.csv
File Tree Traversal: Print Dirs Only
def printDirs(root):
'''Print only the directories encountered in a
file tree traversal starting at root'''
if os.path.isdir(root):
print root
for fileOrDir in os.listdir(root):
if fileOrDir[0] != '.':
printDirs(os.path.join(root, fileOrDir))
In [34]: printDirs('testdir')
testdir
testdir/pset
testdir/pset/scene
testdir/pset/shrub
testdir/pset/shrub/images
testdir/remember
FileSystemOps 15-13
File Tree Traversal: Count Files Only
def countFiles(root):
'''Returns the number of (nondirectory) files encountered
in a file tree traversal starting at root'''
if os.path.isfile(root):
return 1
elif os.path.isdir(root):
filesHere = 0
for fileOrDir in os.listdir(root):
if fileOrDir[0] != '.':
filesHere += countFiles(os.path.join(root,
fileOrDir))
return filesHere
FileSystemOps 15-14
File Tree Traversal: Count Dirs Only
def countDirs(root):
'''Returns the number of directories encountered in a
file tree traversal starting at root'''
if os.path.isdir(root):
dirsHere = 1 # count the root directory itself
for fileOrDir in os.listdir(root):
if fileOrDir[0] != '.':
dirsHere += countDirs(os.path.join(root,
fileOrDir))
return dirsHere
else:
return 0 # nondirectory files don’t count
In [36]: countDirs('testdir')
Out[36]: 6
In [37]: countFiles('testdir')
Out[37]: 10
FileSystemOps 15-15
FileSystemOps 15-16
File System Operations: path.getsize
The os.path.getsize function returns the size of the file
(in bytes). For text files, this is the number of characters.
In [21]: os.path.getsize('testdir/remember/memories.txt')
Out[21]: 80
In [22]: os.path.getsize('testdir/remember/persistent.py')
Out[22]: 1634
In [23]: os.path.getsize('testdir/pset/shrub/images/shrub1.png')
Out[23]: 27248
In [24]: os.path.getsize('testdir/pset/shrub/images')
Out[24]: 136
The size of a directory is related to the
‘‘meta information’’ the directory holds
for subdirectories and files. It is not the
sum total of the sizes of the contained files.
FileSystemOps 15-17
© Copyright 2026 Paperzz