Web scraping is a common technique for fetching data from the internet for use in other applications. With an almost limitless amount of data available online, developers have built many tools for compiling that information efficiently. During web scraping, a program sends a request to a website, and the site sends back an HTML document as its response. Somewhere inside that document is the information you are interested in. To get at it quickly, the document is parsed, which lets us isolate the specific data points we care about. Three Python libraries commonly used for this are Requests, which fetches the page, and Beautiful Soup and lxml, which parse it. In this tutorial, we’ll put these tools to work and learn how to implement web scraping in Python.
Install Web Scraping Code
To follow along, run these three commands from the terminal. It’s also recommended to use a virtual environment to keep things clean on your system.
- pip install lxml
- pip install requests
- pip install beautifulsoup4
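Once the commands finish, a quick check from Python can confirm that the packages are importable. The sketch below is a convenience rather than part of the tutorial proper: the helper name missing_packages is made up for illustration, and it relies only on importlib from the standard library. Note that the beautifulsoup4 package is imported under the name bs4.

```python
import importlib.util


def missing_packages(names):
    """Return the import names from `names` that cannot be found (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# beautifulsoup4 installs under the import name "bs4"
missing = missing_packages(["lxml", "requests", "bs4"])
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All three libraries are ready to use.")
```

If anything is reported as missing, the most likely cause is that pip installed into a different environment than the one running the script.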
Find A Website To Scrape
To learn about how to do web scraping, we can test out a website called http://quotes.toscrape.com/ which looks like it was made for just this purpose.
From this website, maybe we would like to create a data store of all the authors, tags, and quotes from the page. How could that be done? Well, first we can look at the source of the page. This is the data that is actually returned when a request is sent to the website. So in the Firefox web browser, we can right-click on the page and choose “view page source”.
This displays the raw HTML markup of the page. It is shown here for reference.
<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Quotes to Scrape</title> <link rel="stylesheet" href="/static/bootstrap.min.css"> <link rel="stylesheet" href="/static/main.css"> </head> <body> <div class="container"> <div class="row header-box"> <div class="col-md-8"> <h1> <a href="/" style="text-decoration: none">Quotes to Scrape</a> </h1> </div> <div class="col-md-4"> <p> <a href="/login">Login</a> </p> </div> </div> <div class="row"> <div class="col-md-8"> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”</span> <span>by <small class="author" itemprop="author">Albert Einstein</small> <a href="/author/Albert-Einstein">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="change,deep-thoughts,thinking,world" / > <a class="tag" href="/tag/change/page/1/">change</a> <a class="tag" href="/tag/deep-thoughts/page/1/">deep-thoughts</a> <a class="tag" href="/tag/thinking/page/1/">thinking</a> <a class="tag" href="/tag/world/page/1/">world</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“It is our choices, Harry, that show what we truly are, far more than our abilities.”</span> <span>by <small class="author" itemprop="author">J.K. Rowling</small> <a href="/author/J-K-Rowling">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="abilities,choices" / > <a class="tag" href="/tag/abilities/page/1/">abilities</a> <a class="tag" href="/tag/choices/page/1/">choices</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“There are only two ways to live your life. One is as though nothing is a miracle. 
The other is as though everything is a miracle.”</span> <span>by <small class="author" itemprop="author">Albert Einstein</small> <a href="/author/Albert-Einstein">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="inspirational,life,live,miracle,miracles" / > <a class="tag" href="/tag/inspirational/page/1/">inspirational</a> <a class="tag" href="/tag/life/page/1/">life</a> <a class="tag" href="/tag/live/page/1/">live</a> <a class="tag" href="/tag/miracle/page/1/">miracle</a> <a class="tag" href="/tag/miracles/page/1/">miracles</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”</span> <span>by <small class="author" itemprop="author">Jane Austen</small> <a href="/author/Jane-Austen">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="aliteracy,books,classic,humor" / > <a class="tag" href="/tag/aliteracy/page/1/">aliteracy</a> <a class="tag" href="/tag/books/page/1/">books</a> <a class="tag" href="/tag/classic/page/1/">classic</a> <a class="tag" href="/tag/humor/page/1/">humor</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”</span> <span>by <small class="author" itemprop="author">Marilyn Monroe</small> <a href="/author/Marilyn-Monroe">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="be-yourself,inspirational" / > <a class="tag" href="/tag/be-yourself/page/1/">be-yourself</a> <a class="tag" href="/tag/inspirational/page/1/">inspirational</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“Try not to 
become a man of success. Rather become a man of value.”</span> <span>by <small class="author" itemprop="author">Albert Einstein</small> <a href="/author/Albert-Einstein">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="adulthood,success,value" / > <a class="tag" href="/tag/adulthood/page/1/">adulthood</a> <a class="tag" href="/tag/success/page/1/">success</a> <a class="tag" href="/tag/value/page/1/">value</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“It is better to be hated for what you are than to be loved for what you are not.”</span> <span>by <small class="author" itemprop="author">André Gide</small> <a href="/author/Andre-Gide">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="life,love" / > <a class="tag" href="/tag/life/page/1/">life</a> <a class="tag" href="/tag/love/page/1/">love</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“I have not failed. I've just found 10,000 ways that won't work.”</span> <span>by <small class="author" itemprop="author">Thomas A. 
Edison</small> <a href="/author/Thomas-A-Edison">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="edison,failure,inspirational,paraphrased" / > <a class="tag" href="/tag/edison/page/1/">edison</a> <a class="tag" href="/tag/failure/page/1/">failure</a> <a class="tag" href="/tag/inspirational/page/1/">inspirational</a> <a class="tag" href="/tag/paraphrased/page/1/">paraphrased</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“A woman is like a tea bag; you never know how strong it is until it's in hot water.”</span> <span>by <small class="author" itemprop="author">Eleanor Roosevelt</small> <a href="/author/Eleanor-Roosevelt">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="misattributed-eleanor-roosevelt" / > <a class="tag" href="/tag/misattributed-eleanor-roosevelt/page/1/">misattributed-eleanor-roosevelt</a> </div> </div> <div class="quote" itemscope itemtype="http://schema.org/CreativeWork"> <span class="text" itemprop="text">“A day without sunshine is like, you know, night.”</span> <span>by <small class="author" itemprop="author">Steve Martin</small> <a href="/author/Steve-Martin">(about)</a> </span> <div class="tags"> Tags: <meta class="keywords" itemprop="keywords" content="humor,obvious,simile" / > <a class="tag" href="/tag/humor/page/1/">humor</a> <a class="tag" href="/tag/obvious/page/1/">obvious</a> <a class="tag" href="/tag/simile/page/1/">simile</a> </div> </div> <nav> <ul class="pager"> <li class="next"> <a href="/page/2/">Next <span aria-hidden="true">→</span></a> </li> </ul> </nav> </div> <div class="col-md-4 tags-box"> <h2>Top Ten tags</h2> <span class="tag-item"> <a class="tag" style="font-size: 28px" href="/tag/love/">love</a> </span> <span class="tag-item"> <a class="tag" style="font-size: 26px" href="/tag/inspirational/">inspirational</a> </span> <span class="tag-item"> <a 
class="tag" style="font-size: 26px" href="/tag/life/">life</a> </span> <span class="tag-item"> <a class="tag" style="font-size: 24px" href="/tag/humor/">humor</a> </span> <span class="tag-item"> <a class="tag" style="font-size: 22px" href="/tag/books/">books</a> </span> <span class="tag-item"> <a class="tag" style="font-size: 14px" href="/tag/reading/">reading</a> </span> <span class="tag-item"> <a class="tag" style="font-size: 10px" href="/tag/friendship/">friendship</a> </span> <span class="tag-item"> <a class="tag" style="font-size: 8px" href="/tag/friends/">friends</a> </span> <span class="tag-item"> <a class="tag" style="font-size: 8px" href="/tag/truth/">truth</a> </span> <span class="tag-item"> <a class="tag" style="font-size: 6px" href="/tag/simile/">simile</a> </span> </div> </div> </div> <footer class="footer"> <div class="container"> <p class="text-muted"> Quotes by: <a href="https://www.goodreads.com/quotes">GoodReads.com</a> </p> <p class="copyright"> Made with <span class='sh-red'>❤</span> by <a href="https://scrapinghub.com">Scrapinghub</a> </p> </div> </footer> </body> </html> |
As you can see from the markup above, the data we want is buried in a lot of surrounding HTML. The purpose of web scraping is to access just the parts of the web page that we are interested in. Some developers use regular expressions for this task, and that can work for simple, predictable markup. The Python Beautiful Soup library is a much more user-friendly way to extract the information we want.
Building The Scraping Script
In PyCharm, we can add a new file that will hold the Python code to scrape our page.
scraper.py
    import requests
    from bs4 import BeautifulSoup

    url = 'http://quotes.toscrape.com/'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')

    print(soup)
The code above is the beginning of our Python scraping script. At the top of the file, we import the requests and BeautifulSoup libraries. Then we store the URL we want to scrape in the url variable. This is passed to the requests.get() function, and the result is assigned to the response variable. We use the BeautifulSoup() constructor to parse the response text into the soup variable, naming lxml as the parser. Last, we print out the soup variable, and you should see the HTML of the page in the console. Essentially, the script visits the website, reads the data, and views the page source much as we did manually above. The only difference is that this time, all we had to do was run the script to see the output. Pretty neat!
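In practice a request can fail or hang, so it may be worth checking the response before parsing it. The following is a minimal sketch under a couple of assumptions: the helper name fetch_html and the 10-second timeout are illustrative choices, not something this tutorial prescribes. It uses the timeout argument of requests.get() and the raise_for_status() method, which raises an exception for 4xx and 5xx responses.

```python
import requests


def fetch_html(url, timeout=10):
    """Fetch a page and return its HTML text (hypothetical helper)."""
    # Give up if the server does not answer within `timeout` seconds
    response = requests.get(url, timeout=timeout)
    # Raise requests.exceptions.HTTPError for 4xx and 5xx status codes
    response.raise_for_status()
    return response.text


# Usage, assuming the site is reachable:
# html = fetch_html('http://quotes.toscrape.com/')
```

The returned text can be handed to BeautifulSoup() exactly as response.text is in the script above.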
Traversing HTML Structures
HTML stands for HyperText Markup Language. An HTML document is built from elements, each marked up with a specific tag. HTML has many different tags, but the general layout involves three basic ones: an html tag, a head tag, and a body tag. These tags organize the document. In our case, we’ll mostly focus on the information within the body tag.

At this point, our script is able to fetch the HTML markup from our designated URL. The next step is to focus on the specific data we are interested in. If you use the inspector tool in your browser, it is fairly easy to see exactly which HTML markup is responsible for rendering a given piece of information on the page. As we hover the mouse pointer over a particular span tag, the associated text is automatically highlighted in the browser window. It turns out that every quote is inside a span tag that has a class of text.

This is how you work out how to scrape data: you look for patterns on the page and then write code that works on that pattern. Have a play around and notice that this works no matter where you place the mouse pointer; each quote maps to its own piece of HTML markup. Web scraping makes it easy to fetch all similar sections of an HTML document. That’s pretty much all the HTML we need to know to scrape simple websites.
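To make the html, head, and body layout concrete, here is a small sketch that parses a made-up stand-in document rather than the live page. It uses Python’s built-in html.parser instead of lxml so that it runs without any extra parser installed; that swap is a choice made for this example only.

```python
from bs4 import BeautifulSoup

# A tiny stand-in document (made up for illustration) with the three basic tags
html = """
<html>
  <head><title>Quotes to Scrape</title></head>
  <body>
    <span class="text" itemprop="text">A first quote.</span>
    <span class="text" itemprop="text">A second quote.</span>
  </body>
</html>
"""

soup = BeautifulSoup(html, 'html.parser')

# Tags can be reached as attributes of the soup object
print(soup.head.title.text)

# Attributes of a tag behave like dictionary entries
first_span = soup.body.find('span')
print(first_span['class'])     # class can hold several values, so this is a list
print(first_span['itemprop'])  # other attributes come back as plain strings
```

The class attribute is the pattern the rest of this tutorial searches on.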
Parsing HTML Markup
There is a lot of information in the HTML document, but Beautiful Soup makes it easy to find the data we want, sometimes with just one line of code. So let’s go ahead and search for all span tags that have a class of text. This should find all the quotes for us. When you want to find every occurrence of a tag on the page, you can use the find_all() method.
scraper.py
    import requests
    from bs4 import BeautifulSoup

    url = 'http://quotes.toscrape.com/'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    quotes = soup.find_all('span', class_='text')

    print(quotes)
When the code above runs, the quotes variable is assigned a list of all the elements in the HTML document that are span tags with a class of text. Printing the quotes variable shows each entire span tag along with its inner contents.
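Beautiful Soup offers more than one way to express the same search. As a hedged illustration on made-up markup (not the live page), the sketch below compares find_all() with the CSS-selector method select() and with find(), which returns only the first match. The html.parser backend is used here so the example does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Made-up markup that mimics the structure of two quote blocks
html = """
<div class="quote">
  <span class="text">First quote.</span>
  <span>by <small class="author">Author One</small></span>
</div>
<div class="quote">
  <span class="text">Second quote.</span>
  <span>by <small class="author">Author Two</small></span>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

by_find_all = soup.find_all('span', class_='text')  # every matching tag
by_select = soup.select('span.text')                # same search as a CSS selector
first_only = soup.find('span', class_='text')       # just the first match

print(len(by_find_all), len(by_select))
print(first_only)
```

Which form to use is mostly a matter of taste; the rest of this tutorial sticks with find_all().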
Beautiful Soup text property
The extra HTML markup returned by the script is not really what we are interested in. To get only the data we want, in this case the actual quotes, we can use the .text property that Beautiful Soup provides. Note the for loop at the end of the script, which iterates over all of the captured elements and prints only their text.
scraper.py
    import requests
    from bs4 import BeautifulSoup

    url = 'http://quotes.toscrape.com/'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    quotes = soup.find_all('span', class_='text')

    for quote in quotes:
        print(quote.text)
This gives us a nice output with just the quotes we are interested in.
    C:\python\vrequests\Scripts\python.exe C:/python/vrequests/scraper.py
    “The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”
    “It is our choices, Harry, that show what we truly are, far more than our abilities.”
    “There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”
    “The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
    “Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”
    “Try not to become a man of success. Rather become a man of value.”
    “It is better to be hated for what you are than to be loved for what you are not.”
    “I have not failed. I've just found 10,000 ways that won't work.”
    “A woman is like a tea bag; you never know how strong it is until it's in hot water.”
    “A day without sunshine is like, you know, night.”

    Process finished with exit code 0
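On other pages the text inside a tag may carry extra whitespace. As a small sketch, assuming a made-up span with line breaks around the quote, the get_text() method with strip=True trims that whitespace, and a plain string strip() can drop the curly quotation marks if they are not wanted.

```python
from bs4 import BeautifulSoup

# Made-up markup: the same kind of span, but with whitespace around the text
html = '<span class="text">\n  “A day without sunshine is like, you know, night.”\n</span>'
soup = BeautifulSoup(html, 'html.parser')
span = soup.find('span', class_='text')

raw = span.text                    # keeps the surrounding whitespace
clean = span.get_text(strip=True)  # trims whitespace from the text
bare = clean.strip('“”')           # also drops the curly quotation marks

print(repr(raw))
print(clean)
print(bare)
```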
To also find all the authors and print them alongside their quotes, we can use the code below. Following the same steps as before, we first manually inspect the page we want to scrape. We can see that each author is contained inside a <small> tag with an author class. So we use find_all() in the same way as before and store the result in a new authors variable. We also need to change the for loop to use the range() function so we can index into both the quotes and authors lists at the same time.
scraper.py
    import requests
    from bs4 import BeautifulSoup

    url = 'http://quotes.toscrape.com/'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    quotes = soup.find_all('span', class_='text')
    authors = soup.find_all('small', class_='author')

    for i in range(0, len(quotes)):
        print(quotes[i].text)
        print('--' + authors[i].text)
Now we get the quotes and each associated author when the script is run.
    C:\python\vrequests\Scripts\python.exe C:/python/vrequests/scraper.py
    “The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”
    --Albert Einstein
    “It is our choices, Harry, that show what we truly are, far more than our abilities.”
    --J.K. Rowling
    “There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”
    --Albert Einstein
    “The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
    --Jane Austen
    “Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”
    --Marilyn Monroe
    “Try not to become a man of success. Rather become a man of value.”
    --Albert Einstein
    “It is better to be hated for what you are than to be loved for what you are not.”
    --André Gide
    “I have not failed. I've just found 10,000 ways that won't work.”
    --Thomas A. Edison
    “A woman is like a tea bag; you never know how strong it is until it's in hot water.”
    --Eleanor Roosevelt
    “A day without sunshine is like, you know, night.”
    --Steve Martin

    Process finished with exit code 0
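Indexing with range() works, but Python’s built-in zip() is a common alternative for walking two lists in step. The sketch below uses two short stand-in lists of plain strings so that it runs on its own; in the script they would be built from the quotes and authors results.

```python
# Stand-in data: in the script these would come from
# [q.text for q in quotes] and [a.text for a in authors]
quote_texts = [
    '“Try not to become a man of success. Rather become a man of value.”',
    '“A day without sunshine is like, you know, night.”',
]
author_names = ['Albert Einstein', 'Steve Martin']

# zip() pairs the items by position and stops at the shorter list
for quote_text, author_name in zip(quote_texts, author_names):
    print(quote_text)
    print('--' + author_name)

pairs = list(zip(quote_texts, author_names))
```

Because zip() stops at the shorter list, a missing author would not raise an IndexError, though it could hide a mismatch between the two lists.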
Finally, we’ll add some code to fetch the tags for each quote as well. This one is a little trickier because we first need to fetch the outer wrapping div of each collection of tags. If we skipped this step, we could still fetch all the tags, but we wouldn’t know which quote and author pair each one belongs to. Once the outer divs are captured, we can drill down further by calling find_all() again on each one. From there, we add an inner loop to the first loop to complete the process.
    import requests
    from bs4 import BeautifulSoup

    url = 'http://quotes.toscrape.com/'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    quotes = soup.find_all('span', class_='text')
    authors = soup.find_all('small', class_='author')
    tags = soup.find_all('div', class_='tags')

    for i in range(0, len(quotes)):
        print(quotes[i].text)
        print('--' + authors[i].text)
        tagsforquote = tags[i].find_all('a', class_='tag')
        for tagforquote in tagsforquote:
            print(tagforquote.text)
        print('\n')
This code now gives us the following result. Pretty cool, right?!
    C:\python\vrequests\Scripts\python.exe C:/python/vrequests/scraper.py
    “The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”
    --Albert Einstein
    change
    deep-thoughts
    thinking
    world


    “It is our choices, Harry, that show what we truly are, far more than our abilities.”
    --J.K. Rowling
    abilities
    choices


    “There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”
    --Albert Einstein
    inspirational
    life
    live
    miracle
    miracles


    “The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
    --Jane Austen
    aliteracy
    books
    classic
    humor


    “Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”
    --Marilyn Monroe
    be-yourself
    inspirational


    “Try not to become a man of success. Rather become a man of value.”
    --Albert Einstein
    adulthood
    success
    value


    “It is better to be hated for what you are than to be loved for what you are not.”
    --André Gide
    life
    love


    “I have not failed. I've just found 10,000 ways that won't work.”
    --Thomas A. Edison
    edison
    failure
    inspirational
    paraphrased


    “A woman is like a tea bag; you never know how strong it is until it's in hot water.”
    --Eleanor Roosevelt
    misattributed-eleanor-roosevelt


    “A day without sunshine is like, you know, night.”
    --Steve Martin
    humor
    obvious
    simile



    Process finished with exit code 0
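Another way to keep each quote, author, and tag list together is to loop over the outer div with a class of quote and search inside each one. This is a sketch of that alternative, not the approach used in the script above. It runs on a shortened copy of one quote block from the page source shown earlier, uses html.parser so it needs no extra parser, and collects the results into dictionaries (the key names are chosen for illustration) that could later be written to a data store.

```python
import json

from bs4 import BeautifulSoup

# A shortened copy of one quote block from the page source shown earlier
html = """
<div class="quote">
  <span class="text">“It is our choices, Harry, that show what we truly are, far more than our abilities.”</span>
  <span>by <small class="author">J.K. Rowling</small></span>
  <div class="tags">
    <a class="tag" href="/tag/abilities/page/1/">abilities</a>
    <a class="tag" href="/tag/choices/page/1/">choices</a>
  </div>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

records = []
for block in soup.find_all('div', class_='quote'):
    # Every search below is limited to the current quote block
    records.append({
        'quote': block.find('span', class_='text').text,
        'author': block.find('small', class_='author').text,
        'tags': [a.text for a in block.find_all('a', class_='tag')],
    })

print(json.dumps(records, ensure_ascii=False, indent=2))
```

Since each search is scoped to one block, the association between quote, author, and tags comes from the page structure rather than from matching list positions.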
Practice Web Scraping
Another great resource for learning how to web scrape can be found at https://scrapingclub.com. There are many tutorials there that cover another Python web scraping package called Scrapy. In addition, there are several practice pages for scraping that we can use. We can start with this URL: https://scrapingclub.com/exercise/list_basic/?page=1
We simply want to extract the item name and price from each entry and display them as a list. Step one is to examine the source of the page to determine how we can search the HTML. It looks like we have some Bootstrap classes we can search on, among other things.
With this knowledge, here is our Python script for this scrape.
    import requests
    from bs4 import BeautifulSoup

    url = 'https://scrapingclub.com/exercise/list_basic/?page=1'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    items = soup.find_all('div', class_='col-lg-4 col-md-6 mb-4')
    count = 1
    for i in items:
        itemName = i.find('h4', class_='card-title').text.strip()
        itemPrice = i.find('h5').text
        print(f'{count}: {itemPrice} for the {itemName}')
        count += 1
    C:\python\vrequests\Scripts\python.exe C:/python/vrequests/scraper.py
    1: $24.99 for the Short Dress
    2: $29.99 for the Patterned Slacks
    3: $49.99 for the Short Chiffon Dress
    4: $59.99 for the Off-the-shoulder Dress
    5: $24.99 for the V-neck Top
    6: $49.99 for the Short Chiffon Dress
    7: $24.99 for the V-neck Top
    8: $24.99 for the V-neck Top
    9: $59.99 for the Short Lace Dress

    Process finished with exit code 0
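The prices come back as strings such as $24.99, which is fine for display but not for arithmetic. As a small sketch, assuming every price is a dollar sign followed by a plain decimal number (no thousands separators or other currencies), the hypothetical helper parse_price below converts them to Decimal values.

```python
from decimal import Decimal


def parse_price(text):
    """Turn a price string such as '$24.99' into a Decimal (hypothetical helper)."""
    return Decimal(text.strip().lstrip('$'))


# Three of the price strings from the output above
prices = [parse_price(p) for p in ['$24.99', '$29.99', '$49.99']]
print(sum(prices))  # total of the three prices
print(max(prices))  # highest price
```

Decimal is used instead of float here so that sums of prices stay exact.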
Web Scraping More Than One Page
The URL above is a single page of a paginated collection, as the page=1 in the URL shows. We can also set up a Beautiful Soup script to scrape more than one page at a time. Here is a script that collects all of the page links from the original page. Once those URLs are captured, the script issues a request to each individual page and parses out the results.
scraper.py
    import requests
    from urllib.parse import urljoin
    from bs4 import BeautifulSoup

    url = 'https://scrapingclub.com/exercise/list_basic/?page=1'
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    items = soup.find_all('div', class_='col-lg-4 col-md-6 mb-4')
    count = 1
    for i in items:
        itemName = i.find('h4', class_='card-title').text.strip()
        itemPrice = i.find('h5').text
        print(f'{count}: {itemPrice} for the {itemName}')
        count += 1

    # Collect the href of every numbered pagination link
    pages = soup.find('ul', class_='pagination')
    urls = []
    links = pages.find_all('a', class_='page-link')
    for link in links:
        if link.text.isdigit():
            urls.append(link.get('href'))

    count = 1
    for href in urls:
        # Resolve the href against the first page's URL
        newUrl = urljoin(url, href)
        response = requests.get(newUrl)
        soup = BeautifulSoup(response.text, 'lxml')
        items = soup.find_all('div', class_='col-lg-4 col-md-6 mb-4')
        for i in items:
            itemName = i.find('h4', class_='card-title').text.strip()
            itemPrice = i.find('h5').text
            print(f'{count}: {itemPrice} for the {itemName}')
            count += 1
Running that script scrapes all the pages in one go and outputs a long list like the one below. If every page prints the same nine items, as in this captured run, the page URLs are not being built correctly; resolving each href with urljoin() avoids that.
    C:\python\vrequests\Scripts\python.exe C:/python/vrequests/scraper.py
    1: $24.99 for the Short Dress
    2: $29.99 for the Patterned Slacks
    3: $49.99 for the Short Chiffon Dress
    4: $59.99 for the Off-the-shoulder Dress
    5: $24.99 for the V-neck Top
    6: $49.99 for the Short Chiffon Dress
    7: $24.99 for the V-neck Top
    8: $24.99 for the V-neck Top
    9: $59.99 for the Short Lace Dress
    1: $24.99 for the Short Dress
    2: $29.99 for the Patterned Slacks
    3: $49.99 for the Short Chiffon Dress
    4: $59.99 for the Off-the-shoulder Dress
    5: $24.99 for the V-neck Top
    6: $49.99 for the Short Chiffon Dress
    7: $24.99 for the V-neck Top
    8: $24.99 for the V-neck Top
    9: $59.99 for the Short Lace Dress
    10: $24.99 for the Short Dress
    11: $29.99 for the Patterned Slacks
    12: $49.99 for the Short Chiffon Dress
    13: $59.99 for the Off-the-shoulder Dress
    14: $24.99 for the V-neck Top
    15: $49.99 for the Short Chiffon Dress
    16: $24.99 for the V-neck Top
    17: $24.99 for the V-neck Top
    18: $59.99 for the Short Lace Dress
    19: $24.99 for the Short Dress
    20: $29.99 for the Patterned Slacks
    21: $49.99 for the Short Chiffon Dress
    22: $59.99 for the Off-the-shoulder Dress
    23: $24.99 for the V-neck Top
    24: $49.99 for the Short Chiffon Dress
    25: $24.99 for the V-neck Top
    26: $24.99 for the V-neck Top
    27: $59.99 for the Short Lace Dress
    28: $24.99 for the Short Dress
    29: $29.99 for the Patterned Slacks
    30: $49.99 for the Short Chiffon Dress
    31: $59.99 for the Off-the-shoulder Dress
    32: $24.99 for the V-neck Top
    33: $49.99 for the Short Chiffon Dress
    34: $24.99 for the V-neck Top
    35: $24.99 for the V-neck Top
    36: $59.99 for the Short Lace Dress
    37: $24.99 for the Short Dress
    38: $29.99 for the Patterned Slacks
    39: $49.99 for the Short Chiffon Dress
    40: $59.99 for the Off-the-shoulder Dress
    41: $24.99 for the V-neck Top
    42: $49.99 for the Short Chiffon Dress
    43: $24.99 for the V-neck Top
    44: $24.99 for the V-neck Top
    45: $59.99 for the Short Lace Dress
    46: $24.99 for the Short Dress
    47: $29.99 for the Patterned Slacks
    48: $49.99 for the Short Chiffon Dress
    49: $59.99 for the Off-the-shoulder Dress
    50: $24.99 for the V-neck Top
    51: $49.99 for the Short Chiffon Dress
    52: $24.99 for the V-neck Top
    53: $24.99 for the V-neck Top
    54: $59.99 for the Short Lace Dress

    Process finished with exit code 0
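Pagination links are often written as relative hrefs, so they need to be resolved against the page they came from before they can be requested. The sketch below shows one way to do that with urljoin() from the standard library. The href values are made-up examples of what such links might look like, not values copied from the site, and the duplicate check is an extra precaution in case the same page is linked more than once.

```python
from urllib.parse import urljoin

base_url = 'https://scrapingclub.com/exercise/list_basic/?page=1'

# Hypothetical href values, as they might appear in the pagination links
hrefs = ['?page=2', '?page=3', '?page=2']

page_urls = []
for href in hrefs:
    full_url = urljoin(base_url, href)  # resolve against the current page
    if full_url not in page_urls:       # skip links that repeat
        page_urls.append(full_url)

for page_url in page_urls:
    print(page_url)
```

Simply concatenating the base URL and the href would leave the original ?page=1 in place, which is why resolving the link is the safer choice.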
Python Web Scraping With Beautiful Soup Summary
Beautiful Soup is one of a few available libraries built for web scraping with Python. As we saw in this tutorial, it is very easy to get started with. Web scraping scripts can be used to gather and compile data from the internet for various types of data analysis projects, or whatever else your imagination comes up with.