Sending Emails Using Python and Gmail

We already know how to send emails using Office 365 and how to send bulk emails using the Mailgun API.

Here we will see how to send an email from your Python script or Django app using a Gmail account.

Gmail has a daily limit on the number of emails you can send. If you need to send bulk emails, use the Mailgun API.

Code:

Import dependencies:
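Something along these lines should be enough; everything comes from Python's standard library:

    import smtplib
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText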

Create message:
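A minimal sketch, with placeholder addresses and subject:

    from_address = "your_email@gmail.com"    # placeholder sender address
    to_address = "receiver@example.com"      # placeholder recipient address

    message = MIMEMultipart()
    message["From"] = from_address
    message["To"] = to_address
    message["Subject"] = "Test email from Python script"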

Create message body and attach to message:
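For a plain-text body, something like this works:

    body = "This email was sent from a Python script using a Gmail account."
    message.attach(MIMEText(body, "plain"))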

Send the message:
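A sketch of the sending step; note that Gmail normally expects an app password here rather than your account password:

    server = smtplib.SMTP("smtp.gmail.com", 587)
    server.starttls()                                  # upgrade the connection to TLS
    server.login(from_address, "your_app_password")    # placeholder credentials
    server.sendmail(from_address, to_address, message.as_string())
    server.quit()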

 

Complete code with comments:
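Putting the pieces together, a complete sketch (the addresses, subject and app password are placeholders):

    import smtplib
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText

    # placeholder credentials; Gmail normally requires an app password here
    from_address = "your_email@gmail.com"
    password = "your_app_password"
    to_address = "receiver@example.com"

    # create the message container and set the headers
    message = MIMEMultipart()
    message["From"] = from_address
    message["To"] = to_address
    message["Subject"] = "Test email from Python script"

    # create the plain-text body and attach it to the message
    body = "This email was sent from a Python script using a Gmail account."
    message.attach(MIMEText(body, "plain"))

    # connect to Gmail's SMTP server, upgrade to TLS, log in and send
    server = smtplib.SMTP("smtp.gmail.com", 587)
    server.starttls()
    server.login(from_address, password)
    server.sendmail(from_address, to_address, message.as_string())
    server.quit()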
 

The code is available on GitLab as well. Please share your views in the comments.

Host your Django application for free on PythonAnywhere servers.

 

Python Script 10: Collecting one million website links

I needed a collection of different website links to experiment with a Docker cluster, so I created this small script to collect one million website URLs.

The code is available on GitHub too.

Running the script:

Either create a new virtual environment using Python 3 or use an existing one on your system.

Install the dependencies.

Activate the virtual environment and run the code.

Code:
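A minimal sketch of the collector; the pagination URL pattern and the output filename are assumptions for illustration, not the site's actual scheme:

    import time

    import requests
    from bs4 import BeautifulSoup

    BASE_URL = "http://www.websitelists.in/"

    collected = []
    page = 1

    while len(collected) < 1000000:
        # hypothetical pagination pattern; adjust it to the site's actual URLs
        response = requests.get(BASE_URL + "website-list-" + str(page) + "/")
        if response.status_code != 200:
            break

        soup = BeautifulSoup(response.text, "html.parser")

        # links sit in anchor tags inside <td class="web_width"> cells
        for td in soup.find_all("td", {"class": "web_width"}):
            anchor = td.find("a")
            if anchor and anchor.get("href"):
                collected.append(anchor["href"])

        # one-second delay between requests to avoid HTTP 429 responses
        time.sleep(1)
        page += 1

    # dump the scraped links into a text file in the same directory
    with open("website_links.txt", "w") as f:
        f.write("\n".join(collected))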

 

We are scraping links from the site http://www.websitelists.in/. If you inspect the webpage, you can see an anchor tag inside a td tag with the class web_width. We will convert the page response into a BeautifulSoup object, find all such elements, and extract the href value from each of them.

Screenshot: one million site URLs

 

Although there is a natural delay of more than one second between consecutive requests, which is slow but easy on the server, I still introduced a one-second delay to avoid HTTP 429 (Too Many Requests) responses.

The scraped links will be dumped into a text file in the same directory.

 

Host your Django application for free on PythonAnywhere servers.

Featured image source: http://ehacking.net/

Python Script 7: Scraping tweets using BeautifulSoup

Twitter is one of the most popular social networking services, used by the most prominent people in the world. Tweets can be used to perform sentiment analysis.

In this article we will see how to scrape tweets using BeautifulSoup. We are not using the Twitter API, as most APIs have rate limits.

Continue reading “Python Script 7: Scraping tweets using BeautifulSoup”

Scraping 10000 tweets in 60 seconds using celery, RabbitMQ and Docker cluster with rotating proxy

In previous articles we used requests and BeautifulSoup to scrape data. Scraping data this way is slow (using Selenium is even slower). Sometimes we need data quickly, but if we try to speed up scraping using multi-threading or any other technique, we will start getting HTTP status 429, i.e. too many requests. We might get banned from the site as well.

The purpose of this article is to scrape lots of data quickly without getting banned, and we will do this using a Docker cluster of Celery and RabbitMQ along with Tor.

To achieve this, we will follow the steps below:

  1. Install Docker and docker-compose
  2. Download the boilerplate code to set up the Docker cluster in 5 minutes
  3. Understand the code
  4. Experiment with the Docker cluster
  5. Update the code to download tweets
  6. Use Tor to avoid getting banned (a minimal sketch of this idea follows the list)
  7. Speed up the process by increasing workers and concurrency
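To give a taste of step 6: Tor exposes a SOCKS proxy locally, and requests can route traffic through it. This sketch assumes Tor is running on its default port 9050 and that the requests[socks] extra is installed:

    import requests

    # Tor's default local SOCKS5 proxy; socks5h makes DNS resolution go through Tor too
    proxies = {
        "http": "socks5h://127.0.0.1:9050",
        "https": "socks5h://127.0.0.1:9050",
    }

    response = requests.get("https://httpbin.org/ip", proxies=proxies)
    print(response.json())   # shows the Tor exit node's IP instead of yours

Requesting a new Tor circuit gives a different exit IP, which is the idea behind the rotating proxy mentioned in the title.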

Note: This article is for educational purposes only. Do not send too many requests to any server. Respect the robots.txt file. Use an API if possible.

Let’s start.

Continue reading “Scraping 10000 tweets in 60 seconds using celery, RabbitMQ and Docker cluster with rotating proxy”

Scraping Python books data from Amazon using Scrapy Framework

We learned how to scrape Twitter data using BeautifulSoup. But that approach is slow, and we need to take care of multiple things ourselves.

Here we will see how to scrape data from websites using Scrapy.

I tried scraping Python book details from Amazon.com using Scrapy and found it extremely fast and easy. We will see how to start working with Scrapy, create a scraper, scrape data, and save the data to a database.
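To give an idea of how little code a Scrapy scraper needs, here is a minimal spider sketch; the spider name, start URL and CSS selectors are illustrative guesses, not the exact ones used in the article:

    import scrapy

    class PythonBooksSpider(scrapy.Spider):
        # hypothetical spider name and start URL, for illustration only
        name = "python_books"
        start_urls = ["https://www.amazon.com/s?k=python"]

        def parse(self, response):
            # CSS selectors are assumptions; Amazon's markup changes frequently
            for book in response.css("div.s-result-item"):
                yield {
                    "title": book.css("h2 a span::text").get(),
                    "price": book.css("span.a-price span.a-offscreen::text").get(),
                }

A spider like this can be run with scrapy runspider and its yielded items exported to JSON or CSV, or saved to a database through an item pipeline.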

The scraper code is available on GitHub. I dumped the data into a MySQL database and developed a mini Django app on top of it, which is available here.

Continue reading “Scraping Python books data from Amazon using Scrapy Framework”
