Automating Web Browsing
Selenium can automate web browsing, but why would you want to do that in the first place? There are three good reasons for browser automation, and those are Testing, Web Bots, and Web Scraping.
Web Application Testing
Websites have evolved into web applications, and like any other piece of software, they must be tested to ensure correct behavior. Automated tests cut the cost and time of testing and can run around the clock. They also speed up the regression testing that is needed after a bug fix or further development. The same scripts scale to a variety of browsers, devices, and environments, which makes cross-browser and cross-device testing much easier.
Anything you can do manually in a web browser can be automated using Selenium and Python. A script that does this is known as a Web Bot: a piece of software that executes commands or performs routine tasks without the user’s intervention. For example, let’s say you order the same burrito every day from a website. Instead of manually filling out the form each time, you can script the entire process. Any repetitive online task can be handled the same way with a Python script.
In order for Python to control a web browser, a piece of software called a Web Driver is needed. Two very popular drivers for using Selenium are the Firefox driver and the Chrome driver. We’ll look at both of these. Each of these drivers is an executable file. On Windows, Firefox uses geckodriver.exe, and Chrome uses chromedriver.exe. Once you download these files, Selenium needs to know where they are: you can either add their folder to your PATH or pass the file location to Selenium in code. In the examples below we combine the two. We call shutil.which(), which searches the folders on your PATH and returns the full path of the driver (or None if it is not found), and then pass that path to Selenium explicitly. For Firefox gecko, we are using the Win64 version. For the Chrome driver, we are using the 32-bit version.
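Because which() returns None when a file is not on the PATH, it can help to check the result before handing it to Selenium. The following is a minimal sketch of that check. The helper name locate_driver is hypothetical, and trying both the bare name and the .exe name is an assumption made so the same script can run on Windows and on other systems.

```python
from shutil import which


def locate_driver(*candidates):
    """Return the full path of the first driver executable found on PATH, else None."""
    for name in candidates:
        path = which(name)  # which() returns None when the name is not on PATH
        if path is not None:
            return path
    return None


# Try the bare name first, then the Windows file name used in this tutorial.
gecko_path = locate_driver("geckodriver", "geckodriver.exe")
if gecko_path is None:
    print("geckodriver was not found on PATH")
else:
    print("geckodriver found at", gecko_path)
```

Failing early with a clear message like this is usually easier to debug than the error Selenium raises when it is handed None as the driver location.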
Launching a Selenium-Controlled Browser
We are ready to control a browser from Python using Selenium! Here is how to launch Firefox using Selenium. Note that we are pointing to the geckodriver.exe file that we have downloaded as the executable_path. If you do not take this step, the browser will not launch correctly. When this code runs, it launches Firefox in the style you see below with an orange-striped theme in the URL field to indicate that the browser is being controlled via Selenium.
from selenium import webdriver
from shutil import which

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
Launching Chrome With Selenium
To launch a Chrome browser instead, we can simply change the driver in use like so. When the Chrome browser launches via Selenium, it displays a message that Chrome is being controlled by automated test software.
from selenium import webdriver
from shutil import which

driver = which('chromedriver.exe')
browser = webdriver.Chrome(executable_path=driver)
Closing the browser
You can manually close the Selenium-controlled browser by clicking on the X like you usually would. A better option is to shut down the browser explicitly in your code with the .quit() method once your script has finished its job.
from selenium import webdriver
from shutil import which

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.quit()
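If a script raises an exception halfway through, a .quit() call at the very end is never reached and the browser stays open. One way to guard against that is a small context manager, sketched below. The names managed_browser and FakeBrowser are hypothetical, and FakeBrowser is only a stand-in so the example runs without launching a real browser.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(factory):
    """Create a browser with factory() and always call .quit() on it afterwards."""
    browser = factory()
    try:
        yield browser
    finally:
        # Runs whether the body finished normally or raised an exception.
        browser.quit()


class FakeBrowser:
    """Hypothetical stand-in for a Selenium browser object."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


fake = FakeBrowser()
try:
    with managed_browser(lambda: fake) as browser:
        raise RuntimeError("simulated failure in the middle of a script")
except RuntimeError:
    pass

print(fake.closed)  # True: quit() was still called
```

With a real driver, the factory would be something like lambda: webdriver.Firefox(executable_path=driver), and the browser.get() calls would go inside the with block.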
Headless Browsing In Chrome or Firefox
You do not have to display a browser window to run your Selenium application if you do not want to. This is what is known as headless mode. To use headless mode with either browser, simply set the headless attribute on an Options() object and pass it to the driver.
from selenium import webdriver
from shutil import which
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
driver = which('geckodriver.exe')
browser = webdriver.Firefox(options=options, executable_path=driver)
browser.quit()
from selenium import webdriver
from shutil import which
from selenium.webdriver.chrome.options import Options

options = Options()
options.headless = True
driver = which('chromedriver.exe')
browser = webdriver.Chrome(options=options, executable_path=driver)
browser.quit()
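The launch code in this tutorial uses the Selenium 3 style. Newer Selenium 4 releases have moved away from the executable_path keyword and the options.headless attribute, so if the code above raises errors on your installed version, something along the following lines may work instead. This is a sketch, not a drop-in replacement: the function name launch_headless_firefox is hypothetical, and it assumes Selenium 4 with Firefox and geckodriver available.

```python
def launch_headless_firefox(driver_path=None):
    """Start Firefox without a visible window, using Selenium 4 style arguments.

    driver_path is optional; when it is omitted, Selenium is left to locate
    geckodriver on its own.
    """
    # Imported here so that defining the helper does not itself require Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    options.add_argument("-headless")  # Firefox command-line flag for headless mode

    # In Selenium 4 the driver location is passed through a Service object.
    service = Service(executable_path=driver_path) if driver_path else Service()
    return webdriver.Firefox(service=service, options=options)
```

A typical call would be browser = launch_headless_firefox(which('geckodriver.exe')), followed by browser.quit() when the work is done. The Chrome version follows the same pattern with the Options and Service classes from selenium.webdriver.chrome and the "--headless" argument.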
Fetching Specific Pages
To instruct the Selenium-controlled browser to fetch content from a specific page, you can use the .get() method. Here is an example of visiting a popular search engine.
from selenium import webdriver
from shutil import which

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.get('https://duckduckgo.com/')
Finding Elements On The Page
Once the browser visits a specific page, Selenium can find elements on the page with various methods. These are the methods you can use to find page elements. As you can see, there are a lot of them. In most cases, you’ll only need two or three to accomplish what you need to do. The .find_elements_by_css_selector() and .find_element_by_xpath() methods seem to be very popular.
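|Method name|WebElement object/list returned|
|---|---|
|find_element_by_class_name(name) / find_elements_by_class_name(name)|Elements that use the CSS class name|
|find_element_by_css_selector(selector) / find_elements_by_css_selector(selector)|Elements that match the CSS selector|
|find_element_by_id(id) / find_elements_by_id(id)|Elements with a matching id attribute value|
|find_element_by_link_text(text) / find_elements_by_link_text(text)|<a> elements that completely match the text provided|
|find_element_by_partial_link_text(text) / find_elements_by_partial_link_text(text)|<a> elements that contain the text provided|
|find_element_by_name(name) / find_elements_by_name(name)|Elements with a matching name attribute value|
|find_element_by_tag_name(name) / find_elements_by_tag_name(name)|Elements with a matching tag name (case insensitive; an <a> element is matched by ‘a’ and ‘A’)|
|find_element_by_xpath(xpath) / find_elements_by_xpath(xpath)|Elements that match the specified XPath|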
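Recent Selenium 4 releases no longer ship the find_element_by_* helpers. They use find_element() and find_elements() with a locator strategy from the By class instead. The sketch below shows how the old method names map to the new form. The names STRATEGIES and translate are hypothetical helpers written for this illustration, and the strings are the values of the corresponding By constants (for example, By.CSS_SELECTOR is 'css selector').

```python
# Suffix of the legacy method name -> locator strategy string (the value of
# the matching constant on selenium.webdriver.common.by.By).
STRATEGIES = {
    "class_name": "class name",
    "css_selector": "css selector",
    "id": "id",
    "link_text": "link text",
    "partial_link_text": "partial link text",
    "name": "name",
    "tag_name": "tag name",
    "xpath": "xpath",
}


def translate(legacy_name):
    """Map a legacy method name to (new method name, locator strategy)."""
    for prefix, method in (("find_elements_by_", "find_elements"),
                           ("find_element_by_", "find_element")):
        if legacy_name.startswith(prefix):
            suffix = legacy_name[len(prefix):]
            if suffix in STRATEGIES:
                return method, STRATEGIES[suffix]
    raise ValueError(f"not a recognised legacy locator method: {legacy_name!r}")


print(translate("find_element_by_id"))             # ('find_element', 'id')
print(translate("find_elements_by_css_selector"))  # ('find_elements', 'css selector')
```

In practice this means a call such as browser.find_element_by_id('user-message') becomes browser.find_element(By.ID, 'user-message') after importing By from selenium.webdriver.common.by.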
Locating A Text Input
We already know how to launch a web browser and visit a search engine website. Now let’s see how to select the text input on the page. There are many ways to select an element on a page, and one of the easiest is to use its XPath. First, you need to use the developer tools in your browser to find the XPath to use.
When selecting Copy XPath, we get this value: //*[@id="search_form_input_homepage"]
We can now use it in our program. When we run the program, printing the resulting element shows that it is a FirefoxWebElement, so locating the text input on the page was a success.
from selenium import webdriver
from shutil import which

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.get('https://duckduckgo.com/')
element = browser.find_element_by_xpath('//*[@id="search_form_input_homepage"]')
print(element)
<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="1302c443-53b9-4b4d-9354-bc93c9d5d7ba", element="bb944d54-6f29-479a-98af-69a70b0a41a1")>
Typing Into A Text Input
Once a text input is found, the program can type text into the input. To do this, the .send_keys() method is used.
from selenium import webdriver
from shutil import which

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.get('https://duckduckgo.com/')
element = browser.find_element_by_xpath('//*[@id="search_form_input_homepage"]')
element.send_keys('How do you automate a web browser?')
How To Press The Enter Key
Once you locate a text input and type some text into it, what is usually the next step? That’s right, you need to hit the Enter key in order for anything to happen. This can also be accomplished with the .send_keys() method, but you must first import the Keys class from Selenium. Here is how we do that. Notice that once the Enter key is pressed, the website returns a list of results, all from our Python script!
from selenium import webdriver
from shutil import which
from selenium.webdriver.common.keys import Keys

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.get('https://duckduckgo.com/')
element = browser.find_element_by_xpath('//*[@id="search_form_input_homepage"]')
element.send_keys('How do you automate a web browser?')
element.send_keys(Keys.RETURN)
Selenium Easy Practice Examples
The Selenium Easy website has a testing playground we can use to try some more common Selenium tasks. Below is an example of a text input with an associated button. We can type into the text field, and then click the button to display a message. We will use Selenium to write a Python script to complete this task.
Here is the code for this test. We’ll use the .find_element_by_id() method to locate the text input and the .find_element_by_xpath() method to locate the button. We then use .send_keys() to fill in the text input and the .click() method to click the button.
from selenium import webdriver
from shutil import which
from selenium.webdriver.common.keys import Keys

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.get('https://www.seleniumeasy.com/test/basic-first-form-demo.html')
input_element = browser.find_element_by_id('user-message')
input_element.send_keys('Check out this great message!')
show_message_button = browser.find_element_by_xpath('//*[@id="get-input"]/button')
show_message_button.click()
After running the test, we see the browser has successfully taken the actions we programmed. The text was entered, the button clicked, and the message displayed.
WebElement Attributes and Methods
This brings us to a discussion of the attributes and methods of web elements. Once an element is selected via Selenium and assigned to a variable, that variable has attributes and methods we can use to programmatically take action. This is how we are able to use things like .send_keys() and .click(). Here is a list of some of the commonly used WebElement attributes and methods.
.send_keys(*value)
Simulates typing into the element.
– value – A string for typing, or setting form fields. For setting file inputs, this could be a local file path.
Use this to send simple key events or to fill out form fields.
form_textfield = browser.find_element_by_name('username')
form_textfield.send_keys("admin")
This can also be used to set file inputs.
file_input = browser.find_element_by_name('profilePic')
file_input.send_keys("path/to/profilepic.gif")

.click()
Clicks the selected element.

.submit()
Submits a form.

.get_attribute(name)
Gets the given attribute or property of the element.
This method will first try to return the value of a property with the given name. If a property with that name doesn’t exist, it returns the value of the attribute with the same name. If there’s no attribute with that name, None is returned.
Values which are considered truthy, that is, equal to “true” or “false”, are returned as booleans. All other non-None values are returned as strings. For attributes or properties which do not exist, None is returned.
– name – Name of the attribute/property to retrieve.
# Check if the "active" CSS class is applied to an element.
is_active = "active" in target_element.get_attribute("class")

.clear()
Clears the text if it’s a text entry element.

.get_property(name)
Gets the given property of the element.
– name – Name of the property to retrieve.
text_length = target_element.get_property("text_length")

.is_displayed()
Whether the element is visible to a user.

.is_enabled()
Returns whether the element is enabled.

.is_selected()
Returns whether the element is selected. Can be used to check if a checkbox or radio button is selected.

.text
The text within the element, such as ‘hello’ in <span>hello</span>

.id
The internal ID Selenium uses for the element. This is not the HTML id attribute; use .get_attribute('id') for that.

.tag_name
The tag name, such as ‘li’ for an <li> element
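One caveat about the "active" class check shown for get_attribute: the in operator on a string is a substring test, so it would also report a match for a class such as "inactive". A stricter check splits the class attribute into its space-separated names first. The helper name has_class below is hypothetical, and the strings are made-up class attribute values.

```python
def has_class(class_attribute, class_name):
    """Return True if class_name is one of the space-separated names in class_attribute."""
    if not class_attribute:
        # get_attribute() can return None (or an empty string) when no class is set.
        return False
    return class_name in class_attribute.split()


print(has_class("btn active", "active"))    # True
print(has_class("btn inactive", "active"))  # False
print("active" in "btn inactive")           # True: the substring pitfall
```

With a real element, the call would look like has_class(target_element.get_attribute("class"), "active").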
Two Inputs And A Button Click
Here is another example from the Selenium Easy website. In this exercise, we want to create a Python script that uses Selenium to enter a value into two distinct input fields, then click a button on the page to operate on the values entered.
For the solution to this test, we are going to run Firefox in headless mode. We will use Selenium to enter a number in inputs one and two, then click a button to add the two together. Lastly, we will use Selenium to find the result on the page and print it out in the Python script. If the numbers add up to what we expect, then we know the test worked, with no visible browser window needed. When the script is run, we see the output of 17, so we know it worked since 10 + 7 = 17.
from selenium import webdriver
from shutil import which
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
driver = which('geckodriver.exe')
browser = webdriver.Firefox(options=options, executable_path=driver)
browser.get('https://www.seleniumeasy.com/test/basic-first-form-demo.html')
input_element_one = browser.find_element_by_id('sum1')
input_element_one.send_keys('10')
input_element_two = browser.find_element_by_id('sum2')
input_element_two.send_keys('7')
get_total_element = browser.find_element_by_xpath('//*[@id="gettotal"]/button')
get_total_element.click()
result_element = browser.find_element_by_id('displayvalue')
print(result_element.text)
Drag And Drop With ActionChains
Many things can be performed with Selenium using a single method call, so let’s take a look at a slightly more challenging example: drag and drop. Drag and drop operations have three basic steps. First, an object or text must be selected. Then it must be dragged to the desired area and finally dropped into place. To demonstrate this in Python, we’ll be using this dhtmlgoodies webpage, which will act as a practice ground for our script. The markup we will work on looks like this.
To implement a drag and drop in Selenium, we have to import the ActionChains class. Action chains extend Selenium by allowing the web driver to perform more complex tasks like dragging and dropping. We create an ActionChains object from the browser and call the .drag_and_drop() method, passing in the source and destination elements. Methods called on an ActionChains object do not act right away; the actions are stored in a queue. Finally, the .perform() method is chained onto the call to execute the queued actions. Let’s see this in action.
from selenium import webdriver
from shutil import which
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.get('http://www.dhtmlgoodies.com/scripts/drag-drop-custom/demo-drag-drop-3.html')
source_element = browser.find_element_by_xpath('//*[@id="box7"]')
destination_element = browser.find_element_by_xpath('//*[@id="box107"]')
actions = ActionChains(browser)
actions.drag_and_drop(source_element, destination_element).perform()
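The queue-then-perform behavior can be hard to picture, so here is a toy model of the pattern in plain Python. It is not Selenium's implementation: MiniActionChain is a hypothetical class, its methods only borrow names from ActionChains, and the "actions" just append text to a log. It does show why nothing happens until .perform() is called and why the calls can be chained.

```python
class MiniActionChain:
    """Toy model of the queue-then-perform pattern (not Selenium's code)."""

    def __init__(self):
        self._queue = []  # actions waiting to be executed
        self.log = []     # record of what has actually been "done"

    def _enqueue(self, message):
        # Store a callable for later instead of acting immediately.
        self._queue.append(lambda: self.log.append(message))
        return self  # returning self is what makes method chaining possible

    def click_and_hold(self, element):
        return self._enqueue(f"press mouse button on {element}")

    def move_to_element(self, element):
        return self._enqueue(f"move pointer to {element}")

    def release(self):
        return self._enqueue("release mouse button")

    def drag_and_drop(self, source, target):
        # A drag and drop is modelled here as three smaller queued steps.
        return self.click_and_hold(source).move_to_element(target).release()

    def perform(self):
        # Execute the queued actions in the order they were added.
        for action in self._queue:
            action()


chain = MiniActionChain().drag_and_drop("box7", "box107")
print(chain.log)  # [] because nothing has run yet
chain.perform()
print(chain.log)
```

The first print shows an empty list because the three steps are only queued; after .perform() the log contains them in order.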
Clicking Browser Buttons
Selenium can simulate clicks on various browser buttons as well through the following methods:
- browser.back() Clicks the Back button.
- browser.forward() Clicks the Forward button.
- browser.refresh() Clicks the Refresh/Reload button.
- browser.quit() Closes every browser window and ends the WebDriver session (browser.close() closes only the current window).
Example Application: Stock Quote Checker
Now we can put together everything we have learned about Selenium and Python working together to create a simple application that allows you to enter a ticker symbol into your program, and it will fetch and return the current quote to you. This process is placed in a loop, allowing the user to continue to enter tickers and get a quote. To end the program, the user can simply type the letter ‘q’ to quit the program. Here is the code and some sample output of looking up a few tickers like spy, aapl, and tsla. Also note that we use the time module to add some wait times, otherwise the program might fail if the remote webpage is not loaded in time.
import time
from selenium import webdriver
from shutil import which
from selenium.webdriver.common.keys import Keys

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.get('https://finance.yahoo.com')

ticker_to_lookup = True
while (ticker_to_lookup != "q"):
    ticker_to_lookup = input('What ticker do you want to look up? (q to quit) ')
    if ticker_to_lookup == 'q':
        # Shut the browser down before leaving the loop.
        browser.quit()
        break
    # Type the ticker into the quote lookup box and press Enter.
    quote_lookup_text_input = browser.find_element_by_xpath('//*[@id="Col2-0-SymbolLookup-Proxy"]/div/div/div/fieldset/input')
    quote_lookup_text_input.send_keys(ticker_to_lookup, Keys.RETURN)
    # Give the quote page time to load before reading the price.
    time.sleep(10)
    quote_span = browser.find_element_by_xpath(
        '/html/body/div/div/div/div/div/div/div/div/div/div/div/div/div/span')
    print(ticker_to_lookup + ' is currently ' + quote_span.text)
    # Return to the start page so the next lookup can use the same input box.
    browser.back()
    time.sleep(5)

What ticker do you want to look up? (q to quit) spy
spy is currently 283.71
What ticker do you want to look up? (q to quit) aapl
aapl is currently 287.26
What ticker do you want to look up? (q to quit) tsla
tsla is currently 736.51
What ticker do you want to look up? (q to quit) q
Process finished with exit code 0
Selenium Wait Functions
Selenium provides what are known as wait functions. They exist because modern websites often use asynchronous techniques like AJAX to update portions of the page without a reload. This provides a great user experience, but Selenium can run into problems if it tries to locate an element before that element has loaded: the lookup raises an exception and the script stops. Waits solve this by letting the web driver pause until an element is ready before interacting with it. Selenium offers two types of waits, explicit and implicit. An explicit wait is paired with a condition and blocks until that condition is satisfied or a timeout expires. An implicit wait is a setting on the driver that makes every element lookup keep polling the DOM for up to a given amount of time before giving up.
An Example Of Using A Wait Function
To get started using a wait in Selenium we need a few extra imports. Of course, we need the webdriver module to get started. Then we import three new names: By, WebDriverWait, and expected_conditions. We alias expected_conditions as EC to make writing the code easier.
from selenium import webdriver
from shutil import which
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
The page we are going to demonstrate using a Selenium wait for is at Google Earth. If you visit Google Earth, you can see that the upper navbar actually loads in slightly later than the rest of the page. If Selenium tried to locate a link in the navbar and click right away, it would fail. This is a good example of when we can use a Selenium wait so that the script works correctly, even when a fragment of the page has a slight delay loading.
from selenium import webdriver
from shutil import which
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = which('geckodriver.exe')
browser = webdriver.Firefox(executable_path=driver)
browser.get('https://www.google.com/earth/')
wait = WebDriverWait(browser, 10)
launchButton = wait.until(EC.element_to_be_clickable((By.XPATH, '/html/body/header/div/nav/ul/li/a')))
launchButton.click()
The code above makes use of an explicit wait through the WebDriverWait class. WebDriverWait(browser, 10) creates a wait object that gives up and raises a TimeoutException if the condition passed to .until() is not satisfied within 10 seconds (the second argument). The condition itself comes from the expected_conditions module: EC.element_to_be_clickable() takes a locator, here the XPath of the Launch Earth button, and the wait keeps checking until that button is clickable in the browser. Go ahead and inspect the button to see where the XPath comes from. Only once the button is clickable does the script move forward with the click. In one of the other examples in this tutorial, we used Python’s time.sleep() function to do something similar. A Selenium wait is a little more verbose in code, but your scripts will usually run faster with it because the wait returns as soon as the element is ready, whereas sleep() always pauses for the full amount of time no matter what.
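To make the difference from sleep() concrete, here is a minimal sketch of the polling idea behind an explicit wait, written in plain Python so it runs without a browser. It is not Selenium's implementation. The function wait_until and the element_ready stand-in are hypothetical, and the half-second default poll interval is chosen to mirror WebDriverWait's default.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns something truthy.

    Returns that value as soon as it appears, or raises TimeoutError once
    `timeout` seconds have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for "is the element on the page yet?": truthy on the third check.
attempts = {"count": 0}


def element_ready():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


found = wait_until(element_ready, timeout=2.0, poll_interval=0.01)
print(found, "after", attempts["count"], "checks")  # element after 3 checks
```

The call returns on the third check instead of waiting out the full two-second timeout, which is the same reason an explicit wait tends to be faster than a fixed sleep().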
Selenium Python Tutorial Summary
In this tutorial, we saw how to fully automate web-based tasks by directly controlling a web browser via Python code with the Selenium library. Selenium is quite powerful and allows you to complete any task you would otherwise do manually in a web browser like visiting various URLs, filling out forms, clicking page elements, using drag and drop, and more. The web browser is perhaps the most commonly used piece of software in an internet-connected age, and being able to automate and leverage this in code is a great skill to have.