Python 3: Download a Webpage or File from URL

Table of Contents

Download as Text
Read to String
Save to File (Works Only for Decoded Text Data)
Download as Binary Data to bytes
Read to Variable
Save to File (Works for Text or Binary Data)

Download as Text

Download the data as text if you want to store the webpage or file in a string and process it with string methods such as split() and find().
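
As a small illustration of that kind of string processing, the sketch below pulls the page title out of an HTML string with find() and splits it into words with split(). The sample string and the helper name extract_title are made up for this example; in practice the string would come from a download like the one in the next section.

```python
def extract_title(html):
    """Return the text between <title> and </title>, or None if absent."""
    start_tag = '<title>'
    start = html.find(start_tag)
    if start == -1:
        return None
    start += len(start_tag)
    end = html.find('</title>', start)
    if end == -1:
        return None
    return html[start:end].strip()


# Stand-in for a downloaded page (hypothetical sample data).
sample = (
    '<html><head><title>Example Domain</title></head>'
    '<body><p>Hello big world</p></body></html>'
)

title = extract_title(sample)
words = title.split()

print(title)
print(words)
```

For anything beyond simple lookups like this, a real HTML parser such as the standard library's html.parser is usually a safer choice than raw string searching.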

Read to String

from urllib.request import urlopen

with urlopen( 'https://example.com/' ) as webpage:
    content = webpage.read().decode()

print( content )

.read() downloads the data as bytes, then .decode() converts it to a string. With no argument, decode() assumes UTF-8. (Plain ASCII text is also valid UTF-8, so it needs no special handling.) If the text uses a different encoding, such as ISO-8859-1 (Latin-1), pass that encoding's name to decode():

content = webpage.read().decode( 'iso-8859-1' )
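
Rather than hard-coding the encoding, one option is to use the charset the server declares in its Content-Type header, when it declares one. The following is a minimal sketch of that idea: decode_body and download_text are hypothetical helper names, and falling back to UTF-8 is an assumption, not something the server guarantees. The demonstration at the bottom runs offline on sample bytes.

```python
from urllib.request import urlopen


def decode_body(raw, charset=None, fallback='utf-8'):
    """Decode raw bytes with the declared charset, or the fallback if none."""
    return raw.decode(charset or fallback)


def download_text(url):
    """Download url and decode it using the charset from Content-Type, if any."""
    with urlopen(url) as response:
        # get_content_charset() returns None when the header names no charset.
        charset = response.headers.get_content_charset()
        return decode_body(response.read(), charset)


# Offline demonstration on sample bytes encoded as ISO-8859-1.
raw = 'caf\u00e9'.encode('iso-8859-1')
text = decode_body(raw, 'iso-8859-1')

# ISO-8859-1 uses one byte per character, so both lengths match.
print(len(raw), len(text))
```

Decoding the same sample bytes as UTF-8 raises UnicodeDecodeError, which is the usual symptom of guessing the wrong encoding.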

Save to File (Works Only for Decoded Text Data)

from urllib.request import urlopen

# Download from URL and decode as UTF-8 text.
with urlopen( 'https://example.com/' ) as webpage:
    content = webpage.read().decode()

# Save to file as UTF-8 (without encoding=, Python uses the platform default).
with open( 'output.html', 'w', encoding='utf-8' ) as output:
    output.write( content )
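
To see why this approach works only for decoded text, here is a short offline sketch. A file opened in text mode ('w') accepts str but rejects bytes with a TypeError. The sample text and the temporary file path are placeholders for this example only.

```python
import os
import tempfile

text = 'Caf\u00e9 menu'        # stand-in for decoded page content
raw = text.encode('utf-8')     # what .read() returns before .decode()

path = os.path.join(tempfile.mkdtemp(), 'output.html')

# Writing undecoded bytes to a text-mode file fails.
try:
    with open(path, 'w', encoding='utf-8') as output:
        output.write(raw)
except TypeError as error:
    rejected = True
    print('text mode refused bytes:', error)
else:
    rejected = False

# Writing the decoded string works.
with open(path, 'w', encoding='utf-8') as output:
    output.write(text)

# Read the file back with the same encoding to confirm the round trip.
with open(path, 'r', encoding='utf-8') as saved:
    round_trip = saved.read()

print(round_trip == text)
```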

Download as Binary Data to bytes

If you don't need to use string operations like find() on the downloaded data, or if the data isn't text data at all (e.g., image, video, or Excel files), then you can simply treat it as binary data (type bytes).

Read to Variable

from urllib.request import urlopen

with urlopen( 'https://example.com/file.png' ) as file:
    content = file.read()
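
Once the data is in a bytes object, it can be inspected byte by byte without decoding. As one example, the sketch below checks whether the data starts with the 8-byte PNG file signature. The helper name looks_like_png and the sample bytes are illustrative stand-ins for real downloaded content.

```python
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def looks_like_png(data):
    """Return True if data starts with the 8-byte PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


# Stand-in for downloaded content (sample bytes, not a complete image).
content = PNG_SIGNATURE + b'\x00\x00\x00\rIHDR'

print(type(content).__name__, len(content))
print(looks_like_png(content))
print(looks_like_png(b'<!doctype html>'))
```

A check like this can catch the case where a server returns an HTML error page in place of the expected image.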

Save to File (Works for Text or Binary Data)

from urllib.request import urlopen

# Download from URL.
with urlopen( 'https://example.com/' ) as webpage:
    content = webpage.read()

# Save to file.
with open( 'output.html', 'wb' ) as download:
    download.write( content )
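
The code above reads the whole response into memory before writing it out. For large files, a possible alternative is to copy the response to disk in chunks with shutil.copyfileobj(). This is a sketch, not the only way to do it: save_stream and download_to_file are hypothetical helper names, and the demonstration uses an in-memory io.BytesIO object in place of a real network response so that it runs offline.

```python
import io
import os
import shutil
import tempfile
from urllib.request import urlopen


def save_stream(source, path):
    """Copy a binary file-like object to path in chunks."""
    with open(path, 'wb') as download:
        shutil.copyfileobj(source, download)


def download_to_file(url, path):
    """Stream the response body for url straight into path."""
    with urlopen(url) as response:
        save_stream(response, path)


# Offline demonstration: BytesIO stands in for the HTTP response.
payload = b'\x00\x01' * 50000
source = io.BytesIO(payload)
path = os.path.join(tempfile.mkdtemp(), 'output.bin')

save_stream(source, path)
print(os.path.getsize(path))
```

With a real URL, the call would be download_to_file( url, 'output.html' ), which works for text and binary data alike because nothing is decoded.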
