Scraping Python books data from Amazon using Scrapy Framework

We learned how to scrape Twitter data using BeautifulSoup. But BeautifulSoup is slow, and we have to take care of many things ourselves.

Here we will see how to scrape data from websites using Scrapy.

I tried scraping details of Python books from Amazon.com using Scrapy and found it extremely fast and easy. We will see how to start working with Scrapy, create a scraper, scrape data, and save the data to a database.

The scraper code is available on GitHub. I dumped the data into a MySQL database and developed a mini Django app on top of it, which is available here.


Python Script 7: Scraping tweets using BeautifulSoup

Twitter is one of the most popular social networking services and is used by many of the world's prominent people. Tweets can be used for sentiment analysis.

In this article we will see how to scrape tweets using BeautifulSoup. We are not using the Twitter API because most APIs have rate limits.


Python Script 6: Wishing Merry Christmas using Python Turtle

Merry Christmas everyone.

Since today is Christmas, I thought of wishing everyone in a different way. I am a Python programmer and I love writing code, so I decided to do something with Python. After an hour I was ready with the script below, which wishes all of you a Merry Christmas using Python turtle.

The code is available on GitHub as well.

Code:
 

Output Video:

Happy learning.

 

Reference:
[1] https://coolpythoncodes.com/python-turtle/
[2] https://docs.python.org/3.6/library/turtle.html

How to backup database periodically on PythonAnywhere server

You can host your Django app effortlessly on a PythonAnywhere server. If your app uses a database, it is strongly recommended that you back it up regularly to avoid losing data.

This PythonAnywhere article explains how to take an SQL dump. We will extend that approach to back up the database periodically and delete old backup files.
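
The idea of a periodic backup with cleanup can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script from the full article: the function names, the seven-day retention period and the file naming scheme are all hypothetical, and the MySQL password is assumed to be supplied by an option file such as ~/.my.cnf rather than on the command line.

```python
import subprocess
import time
from pathlib import Path


def dump_command(user, host, database, when=None):
    """Build a mysqldump command and a timestamped name for its output file."""
    stamp = time.strftime("%Y-%m-%d_%H%M%S", when or time.localtime())
    filename = f"{database}_{stamp}.sql"
    # Assumption: the password is read from a MySQL option file, so it is not passed here.
    command = ["mysqldump", "-u", user, "-h", host, database]
    return command, filename


def delete_old_backups(backup_dir, keep_days, now=None):
    """Delete *.sql files in backup_dir last modified more than keep_days ago."""
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    removed = []
    for path in sorted(Path(backup_dir).glob("*.sql")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed


def run_backup(user, host, database, backup_dir, keep_days=7):
    """Write a fresh dump into backup_dir, then prune the old dumps."""
    command, filename = dump_command(user, host, database)
    target = Path(backup_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    with open(target, "wb") as out:
        # mysqldump writes the dump to stdout, which is redirected to the file.
        subprocess.run(command, stdout=out, check=True)
    delete_old_backups(backup_dir, keep_days)
    return target
```

A script that calls `run_backup(...)` with your own user name, database host and database name could then be run on a schedule, for example once a day.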


Python Script 5: How to find most popular technologies on Stack Overflow

This script crawls Stack Overflow pages to find the most popular technologies by counting the tags attached to each question.

Important: Please do not send too many requests. Respect the robots.txt file.

The code is also available on GitHub.

You will need to install the beautifulsoup and requests Python packages.
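
As a rough illustration of the counting step only, here is a small sketch that tallies tags from HTML that has already been downloaded. It is not the script from GitHub: to stay self-contained it uses the standard library's html.parser instead of beautifulsoup and makes no network requests, and it assumes that each tag is rendered as a link with the CSS class `post-tag`, which you should verify against the live pages. `TagCollector` and `count_tags` are hypothetical names.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the text of every <a> element whose class list contains 'post-tag'."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self._in_tag_link = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and "post-tag" in classes:
            self._in_tag_link = True

    def handle_data(self, data):
        if self._in_tag_link and data.strip():
            self.tags.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_tag_link = False


def count_tags(pages):
    """Return a Counter of tag names over an iterable of HTML page strings."""
    counts = Counter()
    for html in pages:
        collector = TagCollector()
        collector.feed(html)
        collector.close()
        counts.update(collector.tags)
    return counts


sample = """
<div><a class="post-tag" href="#">python</a> <a class="post-tag" href="#">django</a></div>
<div><a class="post-tag" href="#">python</a></div>
"""
print(count_tags([sample]).most_common(2))  # [('python', 2), ('django', 1)]
```

In a real crawl, each string passed to `count_tags` would be the body of one downloaded question-list page, fetched slowly enough to respect the site's limits.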

Code:
 


Other Scripts:

Opening top 10 Google search results in one hit.
Formatting and validating JSON.
Crawling all emails from a site.

 

Python Script 3: Validate and format JSON string

According to the official JSON website, JSON is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition – December 1999.

In this short article we will see how to validate and format a JSON string using Python.
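
A minimal sketch of both operations using only the standard library's json module might look like this. The helper names `format_json` and `is_valid_json` are made up for illustration, and sorting the keys is a choice of this sketch, not something the article prescribes.

```python
import json


def format_json(text, indent=4):
    """Return text re-serialised with indentation; raise ValueError if it is invalid."""
    parsed = json.loads(text)  # raises json.JSONDecodeError (a ValueError) on bad input
    return json.dumps(parsed, indent=indent, sort_keys=True)


def is_valid_json(text):
    """Return True when text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True


print(is_valid_json('{"name": "turtle"}'))  # True
print(is_valid_json("{'name': 'turtle'}"))  # False: JSON strings need double quotes
print(format_json('{"tags": ["python", "json"], "name": "turtle"}', indent=2))
```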

Format JSON string:
output:

Python Script 2 : Crawling all emails from a website

This is the second article in the series of Python scripts. In this article we will see how to crawl all the pages of a website and collect every email address found on them.

Important: Please note that some sites may not want you to crawl them. Please honour their robots.txt file. In some cases crawling may lead to legal action.
This article is for educational purposes only. Readers are requested not to misuse it.
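
The crawl-and-collect idea can be sketched as below. This is an illustrative outline rather than the script from the full article: the regular expressions are deliberately simple, all function names are hypothetical, and the page download is passed in as a `fetch` callable (an assumption of this sketch) so that the example itself makes no network requests.

```python
import re
from collections import deque
from urllib.parse import urljoin, urlparse

# Simplified patterns; real-world email and link syntax is broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_emails(html):
    """Return the set of email-like strings found in a page."""
    return set(EMAIL_RE.findall(html))


def same_site_links(html, page_url):
    """Return absolute links on the page that stay on page_url's host."""
    host = urlparse(page_url).netloc
    links = set()
    for href in HREF_RE.findall(html):
        absolute = urljoin(page_url, href).split("#")[0]
        if urlparse(absolute).netloc == host:
            links.add(absolute)
    return links


def crawl_emails(start_url, fetch, max_pages=50):
    """Breadth-first crawl from start_url; fetch(url) must return the page HTML."""
    to_visit = deque([start_url])
    seen = {start_url}
    emails = set()
    visited = 0
    while to_visit and visited < max_pages:
        url = to_visit.popleft()
        html = fetch(url)
        visited += 1
        emails |= extract_emails(html)
        for link in same_site_links(html, url):
            if link not in seen:
                seen.add(link)
                to_visit.append(link)
    return emails
```

The `max_pages` cap keeps the crawl bounded, and a real `fetch` should also pause between requests and check robots.txt before downloading a page.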

Python Script 1: Convert ebooks from epub to mobi format

We are starting a series of Python scripts that we can use in our daily life to automate mundane tasks and save some time.

This is the first article in the series. Recently I bought Amazon’s ebook reader, the Kindle Paperwhite 3. I purchased a few books from the Kindle store and downloaded most of the books in epub format. The Kindle doesn’t support the epub format, so you need to convert them to either mobi or azw3 format.
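
One possible way to automate such a conversion is sketched below. It assumes Calibre is installed and that its `ebook-convert` command-line tool is on the PATH; the excerpt above does not say which converter the full article uses, so treat the tool choice and the function names as assumptions.

```python
import subprocess
from pathlib import Path


def conversion_commands(folder, target_ext=".mobi"):
    """Return one ebook-convert command per .epub file that has no converted copy yet."""
    commands = []
    for epub in sorted(Path(folder).glob("*.epub")):
        target = epub.with_suffix(target_ext)
        if not target.exists():
            commands.append(["ebook-convert", str(epub), str(target)])
    return commands


def convert_all(folder, target_ext=".mobi"):
    """Run every pending conversion; needs Calibre's ebook-convert on the PATH."""
    for command in conversion_commands(folder, target_ext):
        subprocess.run(command, check=True)
```

Calling `convert_all("books")` would convert every epub file in a hypothetical `books` folder and skip those that already have a mobi copy.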
