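This article shows how to list the files and directories inside a directory using Python 3. Throughout this article, we'll refer to the following example directory structure:

    mydir
    ├── alpha
    │   ├── a1.html
    │   └── a2.html
    ├── beta
    │   ├── b1.html
    │   └── b2.html
    ├── index.html
    └── script.py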
We'll assume the code examples will be saved in script.py above, and will be run from inside the mydir directory so that the relative path '.' always refers to mydir.
pathlib (Python 3.4 and up)
iterdir
To list the contents of a directory using Python 3.4 or higher, we can use the built-in pathlib library's iterdir() to iterate through the contents. In our example directory, we can write in script.py:

    from pathlib import Path

    for p in Path('.').iterdir():
        print(p)

When we run from inside mydir, we should see output like:
$ python3 script.py
alpha
beta
index.html
script.py
Because iterdir is non-recursive, it only lists the immediate contents of mydir and not the contents of subdirectories (like a1.html).
Note that each item returned by iterdir is also a pathlib.Path, so we can call any pathlib.Path method on the object. For example, to resolve each item as an absolute path, we can write in script.py:

    from pathlib import Path

    for p in Path('.').iterdir():
        print(p.resolve())
This will list the resolved absolute path of each item instead of just the filenames.
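Since every item is a pathlib.Path, the same idea extends to other Path methods such as is_dir() and is_file(). The following is a minimal sketch of separating subdirectories from files; the helper name split_entries is made up for this illustration and is not part of pathlib.

```python
from pathlib import Path


def split_entries(directory="."):
    """Return (dirs, files): sorted names of the immediate entries of directory.

    split_entries is a hypothetical helper used only for this example.
    """
    dirs, files = [], []
    for p in Path(directory).iterdir():
        if p.is_dir():
            dirs.append(p.name)
        elif p.is_file():
            files.append(p.name)
    return sorted(dirs), sorted(files)


if __name__ == "__main__":
    dirs, files = split_entries(".")
    print("directories:", dirs)
    print("files:", files)
```

Run from inside mydir, this should report alpha and beta as directories and index.html and script.py as files.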
Because iterdir returns a generator object (meant to be used in loops), we need to pass it to list() if we want to store the results in a list variable:

    from pathlib import Path

    files = list(Path('.').iterdir())
    print(files)
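One detail worth keeping in mind is that iterdir yields entries in arbitrary order, so the list above may not be alphabetical. A small sketch of getting a predictable order follows; sorted_names is a hypothetical helper name chosen for this example.

```python
from pathlib import Path


def sorted_names(directory="."):
    """Return the names of the immediate entries of directory, sorted alphabetically.

    sorted_names is a hypothetical helper used only for this example.
    """
    return sorted(p.name for p in Path(directory).iterdir())


if __name__ == "__main__":
    print(sorted_names("."))
```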
glob
We can also use pathlib.Path.glob to list all files (the equivalent of iterdir):

    from pathlib import Path

    for p in Path('.').glob('*'):
        print(p)
glob
If we want to filter our results using Unix glob command-style pattern matching, glob can handle that too. For example, if we only want to list .html files, we would write in script.py:

    from pathlib import Path

    for p in Path('.').glob('*.html'):
        print(p)
As with iterdir, glob returns a generator object, so we'll have to use list() if we want to convert it to a list:

    from pathlib import Path

    files = list(Path('.').glob('*.html'))
    print(files)
To recursively list the entire directory tree rooted at a particular directory (including the contents of subdirectories), we can use rglob. In script.py, we can write:

    from pathlib import Path

    for p in Path('.').rglob('*'):
        print(p)
This time, when we run script.py from inside mydir, we should see output like:
$ python3 script.py
alpha
beta
index.html
script.py
alpha/a1.html
alpha/a2.html
beta/b1.html
beta/b2.html
rglob is the equivalent of calling glob with **/ prepended to the pattern, so the following code is equivalent to the rglob code we just saw:

    from pathlib import Path

    for p in Path('.').glob('**/*'):
        print(p)
rglob
Just as with glob, rglob also allows glob-style pattern matching, but automatically does so recursively. In our example, to list all *.html files in the directory tree rooted at mydir, we can write in script.py:

    from pathlib import Path

    for p in Path('.').rglob('*.html'):
        print(p)
This should display all and only .html files, including those inside subdirectories:
$ python3 script.py
index.html
alpha/a1.html
alpha/a2.html
beta/b1.html
beta/b2.html
Since rglob is the same as calling glob with **/, we could also just use glob to achieve the same result:

    from pathlib import Path

    for p in Path('.').glob('**/*.html'):
        print(p)
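To make the equivalence between the two spellings concrete, here is a minimal sketch that collects the results of both into sorted lists so they can be compared directly. The function names html_via_rglob and html_via_glob are made up for this illustration.

```python
from pathlib import Path


def html_via_rglob(directory="."):
    """Return all .html paths under directory (relative to it), sorted, using rglob."""
    root = Path(directory)
    return sorted(p.relative_to(root) for p in root.rglob("*.html"))


def html_via_glob(directory="."):
    """Return the same listing using glob with a '**/' prefix on the pattern."""
    root = Path(directory)
    return sorted(p.relative_to(root) for p in root.glob("**/*.html"))
```

Calling both functions on mydir should give identical lists. Sorting is used here because neither method promises a particular order.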
pathlib
os.listdir
On any version of Python 3, we can use the built-in os library to list directory contents. In script.py, we can write:

    import os

    for filename in os.listdir('.'):
        print(filename)
Unlike with pathlib, os.listdir simply returns filenames as strings, so we can't call methods like .resolve() on the result items. To get full paths, we have to build them manually:

    import os

    root = '.'
    for filename in os.listdir(root):
        relative_path = os.path.join(root, filename)
        absolute_path = os.path.abspath(relative_path)
        print(absolute_path)
Another difference from pathlib is that os.listdir returns a list of strings, so we don't need to call list() on the result to convert it to a list:

    import os

    files = os.listdir('.')  # files is a list
    print(files)
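Because os.listdir returns bare names, telling files apart from directories also requires joining each name with the root before testing it. A minimal sketch, assuming a hypothetical helper called list_with_kinds:

```python
import os


def list_with_kinds(root="."):
    """Return sorted (name, kind) pairs for the immediate entries of root.

    kind is 'dir', 'file', or 'other'. list_with_kinds is a hypothetical
    helper used only for this example.
    """
    entries = []
    for name in os.listdir(root):
        # os.path.isdir/isfile need a path that is valid from the current
        # working directory, so join the bare name with root first.
        full_path = os.path.join(root, name)
        if os.path.isdir(full_path):
            kind = "dir"
        elif os.path.isfile(full_path):
            kind = "file"
        else:
            kind = "other"
        entries.append((name, kind))
    return sorted(entries)


if __name__ == "__main__":
    for name, kind in list_with_kinds("."):
        print(kind, name)
```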
glob
Also available on all versions of Python 3 is the built-in glob library, which provides Unix glob command-style filename pattern matching.
To list all items in a directory (roughly equivalent to os.listdir, except that glob skips names beginning with a dot by default), we can write in script.py:

    import glob

    for filename in glob.glob('./*'):
        print(filename)
This will produce output like:
$ python3 script.py
./alpha
./beta
./index.html
./script.py
Note that the root directory ('.' in our example) is simply included in the path pattern passed into glob.glob().
glob
To list only .html files, we can write in script.py:

    import glob

    for filename in glob.glob('./*.html'):
        print(filename)
Since Python versions lower than 3.5 do not have a recursive glob option, and Python versions 3.5 and up have pathlib.Path.rglob, we'll skip recursive examples of glob.glob here.
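For readers who are on Python 3.5 or later and still prefer the glob module, the recursive option mentioned above is the recursive=True keyword argument, which makes ** in the pattern match any number of nested directories. The following is only a brief sketch; find_html_recursive is a hypothetical helper name.

```python
import glob
import os


def find_html_recursive(root="."):
    """Return a sorted list of .html paths anywhere under root (Python 3.5+).

    find_html_recursive is a hypothetical helper used only for this example.
    """
    # With recursive=True, '**' matches zero or more directories, so files
    # directly inside root are included as well.
    pattern = os.path.join(root, "**", "*.html")
    return sorted(glob.glob(pattern, recursive=True))
```

Calling find_html_recursive('.') from inside mydir should return index.html plus the four .html files in alpha and beta, each prefixed with './'.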
os.walk
On any version of Python 3, we can use os.walk to list all the contents of a directory recursively.
os.walk() returns a generator object that can be used with a for loop. Each iteration yields a 3-tuple that represents a directory in the directory tree:
- current_dir: the path of the directory that the current iteration represents;
- subdirs: list of names (strings) of immediate subdirectories of current_dir; and
- files: list of names (strings) of files inside current_dir.
In our example, we can write in script.py:

    import os

    for current_dir, subdirs, files in os.walk('.'):
        # Current iteration directory
        print(current_dir)

        # Directories
        for dirname in subdirs:
            print('\t' + dirname)

        # Files
        for filename in files:
            print('\t' + filename)
This produces the following output:
$ python3 script.py
.
    alpha
    beta
    index.html
    script.py
./alpha
    a1.html
    a2.html
./beta
    b1.html
    b2.html
To get full paths instead of just filenames, we can write:
    import os

    for current_dir, subdirs, files in os.walk('.'):
        for dirname in subdirs:
            relative_path = os.path.join(current_dir, dirname)
            absolute_path = os.path.abspath(relative_path)
            print(absolute_path)

        for filename in files:
            relative_path = os.path.join(current_dir, filename)
            absolute_path = os.path.abspath(relative_path)
            print(absolute_path)
walk
To filter results based on filenames, we have to manually write pattern matching code. To accomplish that, we can use regular expressions or string methods on the filenames. For example, to only list .html files in our example directory, we can write in script.py:

    import os

    for current_dir, _, files in os.walk('.'):
        # Skip subdirs since we're only interested in files.
        for filename in files:
            if filename.endswith('.html'):
                relative_path = os.path.join(current_dir, filename)
                absolute_path = os.path.abspath(relative_path)
                print(absolute_path)
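The example above uses a string method. As a sketch of the regular-expression alternative, the following version matches both .html and .htm names case-insensitively. The pattern and the helper name find_matching are assumptions made for this illustration, so adjust the pattern to whatever filenames you actually need.

```python
import os
import re

# Hypothetical pattern: names ending in .html or .htm, in any letter case.
HTML_PATTERN = re.compile(r"\.html?$", re.IGNORECASE)


def find_matching(root=".", pattern=HTML_PATTERN):
    """Return a sorted list of paths under root whose filename matches pattern."""
    matches = []
    for current_dir, _, files in os.walk(root):
        # Subdirectory names are ignored; os.walk still descends into them.
        for filename in files:
            if pattern.search(filename):
                matches.append(os.path.join(current_dir, filename))
    return sorted(matches)
```

The returned paths are relative to root in the same way as in the earlier examples, so os.path.abspath can still be applied to each one if absolute paths are needed.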