Download files in Playwright Python

Most actions on a web page produce a request that you can see in the browser's Network tab. When a file starts downloading, Playwright raises a "download" event on the page, so we just have to listen for that event.

Once we have the download, we can decide where to store the file, find out its suggested name, or read its contents.

For this example I will use the website below, but any site that offers a file download will do.

https://filesamples.com/formats/csv

Method 1: Perform And Download

Different pages start a download in different ways: some when you click a button, others when you hover over a link. Use this method when you know which action triggers the download and the download starts immediately or within a few seconds of that action.

from playwright.sync_api import sync_playwright

playwright = sync_playwright().start()
browser = playwright.chromium.launch(headless=False)
page = browser.new_page()
page.goto("https://filesamples.com/formats/csv")
# Wait up to 30 seconds for the "download" event triggered by the click.
with page.expect_download(timeout=30000) as downloading_file:
    page.locator("(//*[@class='output'] //a)[1]").click()
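
Because the trigger may be a click on one page and a hover on another, the waiting logic can be separated from the action itself. The following is a minimal sketch, not part of Playwright: the helper name wait_for_download and its parameters are made up for illustration, and it only relies on page.expect_download() and the .value property used above.

```python
def wait_for_download(page, trigger, timeout_ms=30000):
    """Run `trigger` and return the Download object it starts.

    `page` is a Playwright Page; `trigger` is any zero-argument callable
    that performs the action starting the download (a click, a hover, ...).
    The helper name and its parameters are illustrative only.
    """
    with page.expect_download(timeout=timeout_ms) as download_info:
        trigger()
    # .value holds the Download once the "download" event has fired.
    return download_info.value
```

With the page from the example above, it could be called as wait_for_download(page, lambda: page.locator("(//*[@class='output'] //a)[1]").click()); for a hover-triggered download, pass a lambda that calls .hover() instead.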

Methods and properties you can use on the downloading file:

There are a few other operations you can perform on the downloading file. Note that downloading_file is a wrapper around the event; the Download object itself is downloading_file.value.

downloading_file.is_done() – whether the download event has been received; returns True once it has (this alone does not mean the file has finished downloading)
downloading_file.value.failure() – waits for the download to finish, then returns None if it succeeded or an error message if it failed
downloading_file.value.cancel() – cancels the download that is currently in progress
downloading_file.value.url – the URL of the file (note, this is not the page URL)
downloading_file.value.path() – waits for the download to finish, then returns where the file was stored on your local machine (a temporary path)
downloading_file.value.suggested_filename – the file name suggested for the downloaded file
downloading_file.value.save_as(path="/path/where/to/save") – saves a copy of the file to a custom path

from playwright.sync_api import sync_playwright

playwright = sync_playwright().start()
browser = playwright.chromium.launch(headless=False)
page = browser.new_page()
page.goto("https://filesamples.com/formats/csv")
with page.expect_download(timeout=30000) as downloading_file:
    page.locator("(//*[@class='output'] //a)[1]").click()
# downloading_file.value is the Download object.
print("suggested_filename", downloading_file.value.suggested_filename)
# path() waits for the download to finish and returns a temporary path.
print("path", downloading_file.value.path())
print("url", downloading_file.value.url)
# failure() prints None when the download succeeded.
print("Is failed", downloading_file.value.failure())
# Copy the file from the temporary path to a location of your choice.
downloading_file.value.save_as(path="/Users/pavan/Downloads/new_download_with_playwright.csv")
print("is done", downloading_file.is_done())
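
The failure check and the save step can be combined into one small helper. This is only a sketch under stated assumptions: the name save_download and the target_dir argument are hypothetical, and the helper uses nothing beyond failure(), suggested_filename and save_as() from the list above, so it works with any object that exposes those three.

```python
from pathlib import Path


def save_download(download, target_dir):
    """Save a download under its suggested name and return the new path.

    `download` is expected to behave like Playwright's Download object
    (failure(), suggested_filename, save_as()). The helper name and the
    target_dir argument are illustrative, not part of Playwright.
    """
    error = download.failure()  # None on success, otherwise an error message
    if error is not None:
        raise RuntimeError(f"Download failed: {error}")

    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the last component in case the suggested name contains a path.
    destination = target_dir / Path(download.suggested_filename).name
    download.save_as(destination)
    return destination
```

In the script above it would be called as save_download(downloading_file.value, "/Users/pavan/Downloads"), which keeps the server's suggested file name instead of a hard-coded one.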

Once the file is downloaded, you can read it, check its size, or do whatever else your scenario requires.
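
Since the sample site serves CSV files, reading the saved file needs nothing more than the standard library. A small sketch follows; the function names read_csv_rows and file_size_bytes are made up for illustration, and the UTF-8 encoding is an assumption about the file.

```python
import csv
from pathlib import Path


def read_csv_rows(path):
    """Return all rows of a CSV file as lists of strings."""
    # UTF-8 is assumed here; adjust the encoding to match your file.
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


def file_size_bytes(path):
    """Return the size of the saved file on disk, in bytes."""
    return Path(path).stat().st_size
```

Pass either function the path you gave to save_as(), or the temporary path returned by path().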

Method 2: You do not know when the file downloads:

Sometimes you have no idea when a file is going to be downloaded. In such cases you cannot use the method above, because it requires you to know which action starts the download and to wait around that action.

Instead, we listen for an event from the page. When a file starts downloading, the "download" event fires, so we register a handler with page.on("download", ...) and put our code in that handler.

All the methods listed in Method 1 are available here as well. The handler receives the Download object directly, so call them on downloading_file without .value.

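from playwright.sync_api import sync_playwright

playwright = sync_playwright().start()
browser = playwright.chromium.launch(headless=False)
page = browser.new_page()
page.goto("https://filesamples.com/formats/csv")

def download_file_function(downloading_file):
    # Called whenever the page starts a download.
    print("hello")
    # path is a method; it waits for the download to finish.
    print(downloading_file.path())

# Register the handler before any action that may start a download.
page.on("download", download_file_function)
page.locator("(//*[@class='output'] //a)[1]").click()
# Keep the script alive so the event can arrive (5 seconds is arbitrary).
page.wait_for_timeout(5000)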
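
When several downloads may arrive at unpredictable times, one option is to have the handler only record each download and to save the files later. The class below is a rough sketch of that idea; DownloadCollector and its method names are hypothetical, and it assumes only that each recorded object offers suggested_filename and save_as() as described in Method 1.

```python
from pathlib import Path


class DownloadCollector:
    """Records Download objects delivered by the "download" event."""

    def __init__(self):
        self.downloads = []

    def handle(self, download):
        # Only remember the download here; the files are saved in save_all().
        self.downloads.append(download)

    def save_all(self, target_dir):
        """Save every recorded download into target_dir; return the paths."""
        target_dir = Path(target_dir)
        target_dir.mkdir(parents=True, exist_ok=True)
        saved = []
        for download in self.downloads:
            destination = target_dir / Path(download.suggested_filename).name
            download.save_as(destination)
            saved.append(destination)
        return saved
```

To use it, create collector = DownloadCollector(), register it with page.on("download", collector.handle), perform your page actions, and finally call collector.save_all("downloads"), where "downloads" is a placeholder directory name.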
