Selenium Python Tutorial

Selenium is a tool for automating web browsers, and it works with languages like Python, Java, C#, Ruby, and others. We’ll be looking at how to use Selenium with Python in this tutorial. By using Python with Selenium, you can launch the browser of your choice, fetch any webpage, programmatically click links, fill out web forms, and use other browser functions like back, forward, and reload. Selenium is very powerful and offers more flexibility than web scraping tools like Beautiful Soup and Scrapy. Because it drives a real browser, it can work directly with pages rendered entirely in JavaScript, such as Single Page Applications, which tools that only download and parse HTML cannot do on their own. The tradeoff is that launching a browser for testing and scraping is slower and uses more memory.


Automating Web Browsing

Selenium can automate web browsing, but why would you want to do that in the first place? There are three good reasons for browser automation, and those are Testing, Web Bots, and Web Scraping.

Web Application Testing

Websites have evolved into Web Applications, and like any other piece of software, they must be tested to ensure correct behavior. Automated testing reduces cost and time and makes round-the-clock testing possible. It also speeds up the regression testing that may be required after debugging or further development of the software. Because the same scripts can be run against a variety of browsers, devices, and environments, cross-browser and cross-device testing becomes much easier.

Web Bots

Anything you can do manually using a web browser can be automated using Selenium and Python. This is what is known as a Web Bot. It is a piece of software that executes commands or performs routine tasks without the user’s intervention. This can be applied to any repetitive task online. For example, let’s say you order the same burrito every day from a website. Well, instead of manually filling out the form each time, you can instead script the entire process. Any repetitive online task can now be optimized by creating a Python script.


Web Drivers

In order for Python to control a web browser, a piece of software called a Web Driver is needed. Two very popular drivers for using Selenium are the Firefox driver and the Chrome driver. We’ll look at both of these. Each of these drivers is an executable file. Firefox uses geckodriver.exe, and Chrome uses chromedriver.exe. Once you download these files, you need to either add them to your path manually or specify the path programmatically. We will take the latter approach. For Firefox gecko, we are using the Win64 version. For the Chrome driver, we are using the 32-bit version.


Launching a Selenium-Controlled Browser

We are ready to control a browser from Python using Selenium! Here is how to launch Firefox using Selenium. Note that we are pointing to the geckodriver.exe file that we have downloaded as the executable_path. If you do not take this step, the browser will not launch correctly. When this code runs, it launches Firefox in the style you see below with an orange striped theme in the URL field to indicate that the browser is being controlled via Selenium.
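The launch step described above might look like the following minimal sketch. It assumes Selenium 4, where the driver path is wrapped in a Service object instead of being passed directly as executable_path the way Selenium 3 code did. The helper name launch_firefox and the path in the usage comment are illustrative and not part of Selenium.

```python
def launch_firefox(geckodriver_path):
    """Start a Selenium-controlled Firefox window and return the driver."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    # Selenium 4 style: the driver executable is described by a Service object.
    service = Service(executable_path=geckodriver_path)
    return webdriver.Firefox(service=service)


# Usage (hypothetical path, adjust to where you saved the download):
# browser = launch_firefox(r"C:\drivers\geckodriver.exe")
```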

[Image: Selenium Firefox browser]

Launching Chrome With Selenium

To launch a Chrome browser instead, we can simply change the driver in use like so. When the Chrome browser launches via Selenium, it displays a message that Chrome is being controlled by automated test software.
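As a rough sketch of that change, the Chrome version swaps in the Chrome driver class and its Service. This again assumes Selenium 4, and launch_chrome and the example path are made-up names for illustration.

```python
def launch_chrome(chromedriver_path):
    """Start a Selenium-controlled Chrome window and return the driver."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Only the driver class and the Service import differ from the Firefox case.
    service = Service(executable_path=chromedriver_path)
    return webdriver.Chrome(service=service)


# Usage (hypothetical path):
# browser = launch_chrome(r"C:\drivers\chromedriver.exe")
```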

[Image: Selenium Google Chrome browser]

Closing the browser

You can manually close the Selenium-controlled browser by clicking on the X like you usually would. A better option is to have your script explicitly shut down the browser with the .quit() method once it finishes its job.
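One way to make sure .quit() always runs, even when the script hits an error partway through, is a try/finally block. The sketch below is one possible pattern; run_and_quit and the task callback are hypothetical names used only for illustration.

```python
def run_and_quit(browser, task):
    """Run task(browser), then shut the browser down no matter what happened."""
    try:
        return task(browser)
    finally:
        # quit() closes every window and ends the driver session,
        # even if task raised an exception.
        browser.quit()


# Usage, assuming `browser` is a driver you launched earlier:
# title = run_and_quit(browser, lambda b: b.title)
```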

Headless Browsing In Chrome or Firefox

You do not have to open a visible browser window to run your Selenium application if you do not want to. This is what is known as headless mode. To use headless mode with either browser, simply set the Options() as needed.

Firefox
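For Firefox, a headless launch might look like this sketch. It assumes Selenium 4 and passes Firefox’s -headless command-line flag through the Options object. The helper name and the path are illustrative.

```python
def launch_headless_firefox(geckodriver_path):
    """Start Firefox without a visible window and return the driver."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    options.add_argument("-headless")  # Firefox flag for running with no window
    service = Service(executable_path=geckodriver_path)
    return webdriver.Firefox(service=service, options=options)
```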

Chrome
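For Chrome, the same idea can be sketched with Chrome’s own Options class and the --headless flag. As before, this assumes Selenium 4, and the helper name and path are illustrative.

```python
def launch_headless_chrome(chromedriver_path):
    """Start Chrome without a visible window and return the driver."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.add_argument("--headless")  # Chrome flag for running with no window
    service = Service(executable_path=chromedriver_path)
    return webdriver.Chrome(service=service, options=options)
```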


Fetching Specific Pages

To instruct the Selenium-controlled browser to fetch content from a specific page, you can use the .get() method. Here is an example of visiting a popular search engine web page.
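A minimal sketch of a page fetch is shown below. The helper name open_page is made up for illustration, and the URL is left as a parameter because the tutorial text does not name the search engine.

```python
def open_page(browser, url):
    """Navigate the browser to url and return the page title."""
    # With the default settings, get() returns once the page has loaded.
    browser.get(url)
    return browser.title


# Usage, assuming `browser` is a driver you launched earlier:
# print(open_page(browser, "https://example.com"))
```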

[Image: Selenium get method]


Finding Elements On The Page

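Once the browser visits a specific page, Selenium can find elements on the page with various methods. These are the methods you can use to find page elements. As you can see, there are a lot of them. In most cases, you’ll only need two or three to accomplish what you need to do. The .find_elements_by_css_selector() and .find_element_by_xpath() methods seem to be very popular. Each find_element_* method returns the first matching WebElement, while each find_elements_* method returns a list of every match. Note that these method names come from Selenium 3. Selenium 4 deprecated them, and later releases removed them in favor of browser.find_element(By.XPATH, value) and browser.find_elements(By.CSS_SELECTOR, value), using the By class from selenium.webdriver.common.by.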

Method names (single element / list of elements), followed by the WebElement object or list returned:

  • browser.find_element_by_class_name(name) / browser.find_elements_by_class_name(name): Elements that use the CSS class name
  • browser.find_element_by_css_selector(selector) / browser.find_elements_by_css_selector(selector): Elements that match the CSS selector
  • browser.find_element_by_id(id) / browser.find_elements_by_id(id): Elements with a matching id attribute value
  • browser.find_element_by_link_text(text) / browser.find_elements_by_link_text(text): <a> elements that completely match the text provided
  • browser.find_element_by_partial_link_text(text) / browser.find_elements_by_partial_link_text(text): <a> elements that contain the text provided
  • browser.find_element_by_name(name) / browser.find_elements_by_name(name): Elements with a matching name attribute value
  • browser.find_element_by_tag_name(name) / browser.find_elements_by_tag_name(name): Elements with a matching tag name (case insensitive; an <a> element is matched by ‘a’ and ‘A’)
  • browser.find_element_by_xpath(xpath) / browser.find_elements_by_xpath(xpath): Elements that have the specified xpath.
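To show the difference between the single-element and list-returning lookups, here is a small sketch written in the Selenium 4 style, where the locator strategy is passed via the By class. The helper name and the choice of an h1 heading and <a> links are assumptions made only for the example.

```python
def heading_and_link_texts(browser):
    """Return the first <h1> text and the text of every link on the page."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium.webdriver.common.by import By

    # find_element returns the first match and raises NoSuchElementException
    # if the page has no <h1> at all.
    heading = browser.find_element(By.TAG_NAME, "h1")

    # find_elements returns a list, which is simply empty when nothing matches.
    links = browser.find_elements(By.CSS_SELECTOR, "a")
    return heading.text, [link.text for link in links]
```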

Locating A Text Input

We already know how to launch a web browser and visit a search engine website. Now let’s see how to select the text input on the page. There are many ways to select an element on the page, and one of the easiest is to use its XPath. First, use the developer tools in your browser to find the XPath of the element.

[Image: how to find the XPath of a browser element]

When selecting Copy XPath, we get this value.

//*[@id="search_form_input_homepage"]

We can now use it in our program. When we run the program, printing the resulting element shows that it is a FirefoxWebElement, so locating the text input on the page was a success.
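The lookup might be sketched as follows, using the XPath copied above. This uses the Selenium 4 find_element(By.XPATH, ...) form; the constant and helper names are illustrative.

```python
# The XPath copied from the browser's developer tools.
SEARCH_INPUT_XPATH = '//*[@id="search_form_input_homepage"]'


def find_search_input(browser):
    """Return the search text input located by its XPath."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium.webdriver.common.by import By

    return browser.find_element(By.XPATH, SEARCH_INPUT_XPATH)


# Usage, assuming `browser` has already loaded the search page:
# print(find_search_input(browser))
```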

<selenium.webdriver.firefox.webelement.FirefoxWebElement (session="1302c443-53b9-4b4d-9354-bc93c9d5d7ba", element="bb944d54-6f29-479a-98af-69a70b0a41a1")>

Typing Into A Text Input

Once a text input is found, the program can type text into the input. To do this, the .send_keys() method is used.
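As a small sketch, typing into an element that has already been located could look like this. The helper name type_text is made up, and clearing the field first is an optional choice rather than something the tutorial requires.

```python
def type_text(element, text):
    """Replace whatever is in a text input with the given text."""
    element.clear()          # remove any text that is already in the field
    element.send_keys(text)  # simulate typing the characters one by one


# Usage, assuming `search_input` is a WebElement found earlier:
# type_text(search_input, "selenium python")
```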

[Image: Selenium send_keys method]

How To Press Enter Key

Once you locate a text input and type some text into it, what is usually the next step? That’s right, you need to hit the Enter key in order for anything to happen. This can also be accomplished with the .send_keys() method, but you must also import the Keys class from Selenium. Here is how we do that. Notice that once the Enter key is pressed, the website returns a list of results, all from our Python script!
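A sketch of typing a query and then pressing Enter is shown below. Keys.RETURN is one of the special-key constants Selenium provides; the helper name submit_query is illustrative.

```python
def submit_query(element, query):
    """Type a query into a text input and press Enter."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium.webdriver.common.keys import Keys

    element.send_keys(query)
    # Special keys are sent the same way as ordinary text.
    element.send_keys(Keys.RETURN)


# Usage, assuming `search_input` is a WebElement found earlier:
# submit_query(search_input, "selenium python")
```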

[Image: Selenium enter text and press Enter]


Selenium Easy Practice Examples

The Selenium Easy website has a testing playground we can use to try some more common Selenium tasks. Below is an example of a text input with an associated button. We can type into the text field, and then click the button to display a message. We will use Selenium to write a Python script to complete this task.

[Image: Selenium select text input and click button]

Here is the code for this test. We’ll use the .find_element_by_id() method to locate the text input and the .find_element_by_xpath() method to locate the button. We can also use .send_keys() to fill in the text input and the .click() method to click the button.
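The steps just described could be sketched roughly as below, using the Selenium 4 equivalents of .find_element_by_id() and .find_element_by_xpath(). The page’s actual element id and button XPath are not given in the text, so they are left as parameters; the helper name is also made up.

```python
def enter_text_and_click(browser, input_id, button_xpath, message):
    """Type message into a text input, then click the associated button."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium.webdriver.common.by import By

    # Locate the text field by its id attribute and fill it in.
    text_input = browser.find_element(By.ID, input_id)
    text_input.send_keys(message)

    # Locate the button by XPath and click it.
    button = browser.find_element(By.XPATH, button_xpath)
    button.click()
```

You would pass in the id and XPath you find by inspecting the practice page in your browser’s developer tools.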

After running the test, we see the browser has successfully taken the actions we programmed. The text was entered, button clicked, and message displayed.

[Image: Selenium test case success]

WebElement Attributes and Methods

This brings us to a discussion of the attributes and methods of web elements. Once an element is selected via Selenium and assigned to a variable, that variable has attributes and methods we can use to programmatically take action. This is how we are able to use things like .send_keys() and .click(). Here is a list of some of the commonly used WebElement attributes and methods.


element.send_keys()

Simulates typing into the element.

Args
– value – A string for typing, or setting form fields. For setting file inputs, this could be a local file path.

Use this to send simple key events or to fill out form fields

This can also be used to set file inputs.


element.click()

Clicks the selected element.


element.submit()

Submits a form.


element.get_attribute()

Gets the given attribute or property of the element.

This method will first try to return the value of a property with the given name. If a property with that name doesn’t exist, it returns the value of the attribute with the same name. If there’s no attribute with that name, None is returned.

Values equal to “true” or “false” are returned as booleans. All other non-None values are returned as strings. For attributes or properties which do not exist, None is returned.

Args
– name – Name of the attribute/property to retrieve.

Example
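A small sketch of the kind of check this method supports: testing whether a particular CSS class is applied to an element. The helper name has_css_class is illustrative.

```python
def has_css_class(element, class_name):
    """Return True if class_name is one of the element's CSS classes."""
    # get_attribute("class") gives the classes as one space-separated string,
    # or None when the element has no class attribute.
    classes = element.get_attribute("class") or ""
    return class_name in classes.split()


# Usage, assuming `element` is a WebElement found earlier:
# is_active = has_css_class(element, "active")
```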


element.clear()

Clears the text if it’s a text entry element.


element.get_property()

Gets the given property of the element.

Args
– name – Name of the property to retrieve.

Example
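As a hedged sketch, reading the value property of a text input returns what is currently in the field. The helper name current_value is made up for illustration.

```python
def current_value(element):
    """Return the live 'value' property of a form field."""
    # The 'value' property reflects what is in the field right now,
    # including text typed after the page loaded.
    return element.get_property("value")


# Usage, assuming `text_input` is a WebElement for an <input> field:
# print(current_value(text_input))
```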


element.is_displayed()

Whether the element is visible to a user.


element.is_enabled()

Returns whether the element is enabled.


element.is_selected()

Returns whether the element is selected. Can be used to check if a checkbox or radio button is selected.


element.text

The text within the element, such as ‘hello’ in <span>hello</span>


element.id

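The internal ID that Selenium uses to track the element. This is not the HTML id attribute, which you can read with element.get_attribute("id").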


element.tag_name

The tag name, such as ‘li’ for an <li> element


Two Inputs And A Button Click

Here is another example from the Selenium Easy website. In this exercise, we want to create a Python script that uses Selenium to enter a value into two distinct input fields, then click a button on the page to operate on the values entered.

[Image: two inputs with Selenium and Python]

For the solution to this test, we are going to run Firefox in headless mode. We will use Selenium to enter a number in inputs one and two, then click a button to add the two together. Lastly, we will use Selenium to find the result on the page and print it out in the Python script. If the numbers add up to what we expect, then we know the test worked, with no visible browser window needed. When the script is run, we see the output of 17. So we know it worked, since we would expect 10 + 7 to equal 17.

17

Drag And Drop With ActionChains

Many things can be performed with Selenium using a single method call, but some interactions take more work. We’ll take a look at a slightly more challenging example: drag and drop. A drag and drop has three basic steps. First, an object or text must be selected. Then it must be dragged to the desired area and finally dropped into place. To demonstrate this in Python, we’ll be using this dhtmlgoodies webpage, which will act as a practice ground for our script. The markup we will work on looks like this.

[Image: Selenium drag and drop example]

To implement a drag and drop in Selenium, we have to import the ActionChains class. Action chains extend Selenium by allowing the web driver to perform more complex tasks like dragging and dropping. When action methods are called on an ActionChains object, the actions are stored in a queue rather than run immediately. We call the .drag_and_drop() method, passing in the source and destination elements, and then chain a call to .perform() to execute the queued actions. Let’s see this in action.
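The queue-then-perform pattern described above can be sketched in a few lines. The helper name drag_onto is illustrative, and the source and target are assumed to be WebElements you have already located on the practice page.

```python
def drag_onto(browser, source, target):
    """Drag the source element and drop it onto the target element."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium.webdriver.common.action_chains import ActionChains

    actions = ActionChains(browser)
    # drag_and_drop only queues the action; nothing happens in the browser yet.
    actions.drag_and_drop(source, target)
    # perform() runs everything that has been queued.
    actions.perform()
```

The same thing is often written as one chained expression, ActionChains(browser).drag_and_drop(source, target).perform(), since each action method returns the chain itself.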

[Image: Selenium drag and drop success]


Clicking Browser Buttons

Selenium can simulate clicks on various browser buttons as well through the following methods:

  • browser.back() Clicks the Back button.
  • browser.forward() Clicks the Forward button.
  • browser.refresh() Clicks the Refresh/Reload button.
  • browser.quit() Closes every browser window and ends the WebDriver session.
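A short sketch of the navigation methods in sequence is shown below. The helper name and the two URL parameters are made up for the example; any two pages would do.

```python
def history_round_trip(browser, first_url, second_url):
    """Visit two pages, then step back, forward, and reload."""
    browser.get(first_url)
    browser.get(second_url)
    browser.back()      # like clicking the Back button
    browser.forward()   # like clicking the Forward button
    browser.refresh()   # like clicking Reload
    return browser.current_url
```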

Example Application: Stock Quote Checker

Now we can put together everything we have learned about Selenium and Python working together to create a simple application that allows you to enter a ticker symbol into your program, and it will fetch and return the current quote to you. This process is placed in a loop, allowing the user to continue to enter tickers and get a quote. To end the program, the user can simply type the letter ‘q’ to quit the program. Here is the code and some sample output of looking up a few tickers like spy, aapl, and tsla. Also note that we use the time module to add some wait times, otherwise the program might fail if the remote webpage is not loaded in time.

What ticker to you want to look up? (q to quit) spy
spy is currently 283.71
What ticker to you want to look up? (q to quit) aapl
aapl is currently 287.26
What ticker to you want to look up? (q to quit) tsla
tsla is currently 736.51
What ticker to you want to look up? (q to quit) q

Process finished with exit code 0

Selenium Wait Functions

Selenium has something known as wait functions. Wait functions exist because modern websites often use asynchronous techniques like AJAX to update portions of the webpage without reloading it. This provides a great user experience, but the Selenium driver may run into problems if it tries to locate an element on the page before it has loaded. This raises an exception in the script, and our program will not work correctly. Wait functions help by letting the web driver pause until an element is loaded before interacting with it. Selenium offers two types of waits, explicit and implicit. An explicit wait is paired with a condition and blocks until that condition is satisfied or a timeout expires. An implicit wait instead tells the driver to keep polling the DOM for up to a set amount of time whenever it looks up an element that is not yet available.

An Example Of Using A Wait Function

To get started using a wait in Selenium, we need to import a few things. Of course, we need the webdriver module to get started. Then we import three new names: the By class, the WebDriverWait class, and the expected_conditions module. We use an alias to reference expected_conditions as EC to make writing the code easier.

The page we are going to demonstrate using a Selenium wait for is at Google Earth. If you visit Google Earth, you can see that the upper navbar actually loads in slightly later than the rest of the page. If Selenium tried to locate a link in the navbar and click right away, it would fail. This is a good example of when we can use a Selenium wait so that the script works correctly, even when a fragment of the page has a slight delay loading.
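An explicit wait of this kind might be sketched as follows. The XPath of the button is not given in the text, so it is left as a parameter, and the helper name click_when_clickable is illustrative. The 10-second default mirrors the timeout discussed in this section.

```python
def click_when_clickable(browser, xpath, timeout=10):
    """Wait until the element at xpath is clickable, then click it."""
    # Imported inside the helper so it can be defined before Selenium is installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(browser, timeout)
    # until() keeps checking the condition and returns the element once it is
    # clickable; if the timeout passes first, it raises TimeoutException.
    button = wait.until(EC.element_to_be_clickable((By.XPATH, xpath)))
    button.click()
```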

[Image: using waits for dynamic content in Selenium]

The code above makes use of an explicit wait using the WebDriverWait class. Its second argument is the timeout in seconds, so with a value of 10 it raises a TimeoutException if the condition we specify is not satisfied within 10 seconds. Next, we create a condition for the explicit wait. To do this, we pass a condition from the expected_conditions module to the wait’s .until() method, which makes the program wait until a specific action can be completed. The code above is telling the program to wait until the Launch Earth button becomes clickable in the browser. We simply use the XPath of the button, which you can find by inspecting it in the developer tools. The line of code just before the click waits until that button is clickable before moving forward with the click. In one of the other examples in this tutorial, we used Python’s sleep() function to do something similar. Using a Selenium wait is a little more verbose in code, but your scripts will run faster because they act as soon as the element is ready, whereas sleep() always waits for the full specified time no matter what.

Selenium Python Tutorial Summary

In this tutorial, we saw how to fully automate web-based tasks by directly controlling a web browser via Python code with the Selenium library. Selenium is quite powerful and allows you to complete any task you would otherwise do manually in a web browser like visiting various URLs, filling out forms, clicking page elements, using drag and drop, and more. The web browser is perhaps the most commonly used piece of software in an internet-connected age, and being able to automate and leverage this in code is a great skill to have.