Computer Science Atlas

Python 3 Examples: Read a File

February 2, 2021
 

Using open

Read as Text to String

UTF-8 (Unicode) Encoding

with open( 'myfile.txt', 'rb' ) as f:
    content = f.read().decode()

print( content )

.read() first loads the file's contents as raw bytes, then .decode() converts those bytes to a string. Called with no arguments, .decode() uses UTF-8 decoding rules.
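
As a point of comparison, the same result can usually be obtained by opening the file in text mode and passing encoding='utf-8' to open. The sketch below is only an illustration: the temporary directory and the sample text are assumptions added so that it runs on its own.

```python
import os
import tempfile

# Hypothetical sample file, created only so this sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join( tmp_dir, 'myfile.txt' )
with open( path, 'wb' ) as f:
    f.write( 'café ☕\n'.encode( 'utf-8' ) )

# Binary read followed by an explicit decode, as in the example above.
with open( path, 'rb' ) as f:
    decoded = f.read().decode( 'utf-8' )

# Text-mode read with the encoding passed to open.
with open( path, 'r', encoding='utf-8' ) as f:
    content = f.read()

print( decoded == content )
```

One difference to keep in mind: text mode translates '\r\n' line endings to '\n' by default, while the binary read followed by .decode() leaves them untouched.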

Python automatically closes the file f once the code inside the with ... as block (here, just the second line) finishes running, even if that code raises an exception.
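
One way to observe this is to inspect the file object's closed attribute inside and after the block. This is a minimal sketch; the temporary file and the variable names open_inside and closed_after are made up for the illustration.

```python
import os
import tempfile

# Hypothetical sample file, created only so this sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join( tmp_dir, 'myfile.txt' )
with open( path, 'wb' ) as f:
    f.write( b'hello\n' )

with open( path, 'rb' ) as f:
    content = f.read().decode()
    open_inside = f.closed    # the file is still open inside the block

closed_after = f.closed       # the file has been closed on leaving the block

print( open_inside, closed_after )
```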

System Default Encoding

If you are working with system files, you may prefer to use the system's default encoding, which may or may not be UTF-8, depending on your operating system. You can check what the default encoding is on your current system by calling:

import locale
print( locale.getpreferredencoding() )

Keep in mind that the result may differ when someone runs your Python script on a different system.

To read the file using system-default text encoding:

with open( 'myfile.txt', 'r' ) as f:
    content = f.read()

Notice that 'r' (text mode) is used instead of 'rb' (binary mode), and there is no longer a .decode() call after .read(). In text mode, .read() already returns a string, so there is nothing left to decode.
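
To illustrate why the choice of encoding matters, the sketch below reads the same UTF-8 bytes twice: once as UTF-8 and once as cp1252. Here cp1252 is only an assumed stand-in for a system whose default encoding is not UTF-8, and the temporary file is created just for the demonstration.

```python
import os
import tempfile

# Hypothetical sample file containing UTF-8 encoded text.
tmp_dir = tempfile.mkdtemp()
path = os.path.join( tmp_dir, 'myfile.txt' )
with open( path, 'wb' ) as f:
    f.write( 'café'.encode( 'utf-8' ) )

# Reading with the encoding the file was written in.
with open( path, 'r', encoding='utf-8' ) as f:
    as_utf8 = f.read()

# Reading with a different encoding (assumed here to mimic another
# system's default) silently produces the wrong characters.
with open( path, 'r', encoding='cp1252' ) as f:
    as_cp1252 = f.read()

print( as_utf8 )
print( as_cp1252 )
```

If you know which encoding a file uses, passing it explicitly avoids this kind of mismatch between systems.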

Read as Binary Data to bytes

with open( 'myfile.txt', 'rb' ) as f:
    content = f.read()

Notice that 'rb' (binary mode) is used here, and there is no .decode() call. After this code runs, content holds binary data of type bytes rather than a string.
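
The difference between bytes and a string can be checked directly. The following is a small sketch under the assumption of a sample file written just for this purpose; the variable names first_byte and text are illustrative.

```python
import os
import tempfile

# Hypothetical sample file, created only so this sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join( tmp_dir, 'myfile.txt' )
with open( path, 'wb' ) as f:
    f.write( b'hello\n' )

with open( path, 'rb' ) as f:
    content = f.read()

print( type( content ) )      # the bytes type, not str

# Indexing a bytes object gives an integer byte value, not a character.
first_byte = content[0]
print( first_byte )

# The bytes can still be converted to a string later if needed.
text = content.decode( 'utf-8' )
print( type( text ) )
```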

Using pathlib.Path (Python 3.5 and up)

Using open is convenient for reading files because open is built into the Python language, and you don't need to import any libraries to use it.

However, if you don't mind importing pathlib, or need to import it anyway for other code, it provides another way to read a file to a string. You must import pathlib explicitly, but the module ships with Python, so there are no packages to install.

Read as Text to String

UTF-8 (Unicode) Encoding

from pathlib import Path

content = Path( 'myfile.txt' ).read_text( 'utf-8' )

print( content )

System Default Encoding

As with open, you can read the file using your system's default encoding by calling read_text with no arguments:

from pathlib import Path

content = Path( 'myfile.txt' ).read_text()

print( content )

Read as Binary Data to bytes

If you want to read the file as binary data, you can call read_bytes() instead of read_text():

from pathlib import Path

content = Path( 'myfile.txt' ).read_bytes()

After running the code above, content holds binary data of type bytes rather than a string.
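
To tie the two approaches together, the sketch below checks that the pathlib methods return the same data as the equivalent open calls. The temporary file and its contents are assumptions made so that the example runs on its own.

```python
import tempfile
from pathlib import Path

# Hypothetical sample file, created only so this sketch is self-contained.
path = Path( tempfile.mkdtemp() ) / 'myfile.txt'
path.write_bytes( 'naïve\n'.encode( 'utf-8' ) )

# Binary read with pathlib and with open.
raw = path.read_bytes()
with open( path, 'rb' ) as f:
    raw_via_open = f.read()

# Text read with pathlib, naming the encoding as a keyword argument.
text = path.read_text( encoding='utf-8' )

print( raw == raw_via_open )
print( raw.decode( 'utf-8' ) == text )
```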
