Scraping Python books data from Amazon using Scrapy Framework

We learned how to scrape Twitter data using BeautifulSoup. But BeautifulSoup is slow, and we need to take care of multiple things ourselves.

Here we will see how to scrape data from websites using Scrapy.

I tried scraping Python book details from Amazon.com using Scrapy and found it extremely fast and easy. We will see how to start working with Scrapy, create a scraper, scrape data and save the data to a database.

The scraper code is available on GitHub. I dumped the data into a MySQL database and built a mini Django app on top of it, which is available here.


Generating and Returning PDF as response in Django

We might need to generate a receipt or a report in PDF format in a Django app. In this article, we will see how to generate a dynamic PDF from HTML content and return it as a response.

Create a Django project. If you are not using a virtual environment, we strongly recommend that you do.

Installing Dependencies:

Once the virtual environment is ready and activated, install the dependencies below.

For pdfkit to work, we need wkhtmltopdf installed on our Linux system.


Using signals in Django to log changes in models

Sometimes we need to know who made what changes to which table. This might be required for legal audit purposes or for simple organisation-level logging.

There are multiple Django apps available online that can help you log model changes, but there is no fun in doing that. We will see how to do it without using a ready-made app and learn something in the process.

Signals:

Signals let a sender notify a receiver that some event has occurred and that some action needs to be performed.

For example, suppose we keep some data in a cache as well as in the DB. We read data from the cache and, if it is not found there, fall back to the DB. Whenever the DB is updated, we need to update the cache as well. But we might update the model from multiple views, so writing the cache update logic in every such view is tedious and unclean. This is where signals come into the picture.
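The cache example can be sketched in plain Python. This is only an illustration of the sender/receiver idea, not Django's implementation: the small Signal class, the post_save object, the save function and the dictionaries standing in for the cache and the DB are all assumptions made for this sketch. Django's own django.dispatch.Signal also offers connect and send methods, but it does much more than this.

```python
class Signal:
    """Tiny stand-in for a signal dispatcher (not Django's implementation)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        # Register a callable to be notified whenever the signal is sent.
        self._receivers.append(receiver)

    def send(self, sender, **kwargs):
        # Call every receiver and collect (receiver, return value) pairs.
        return [(r, r(sender=sender, **kwargs)) for r in self._receivers]


# Hypothetical "model saved" signal, plus dictionaries acting as cache and DB.
post_save = Signal()
cache = {}
db = {}


def refresh_cache(sender, instance, **kwargs):
    # Receiver: keep the cache in step with the DB after every save.
    cache[instance["id"]] = instance


post_save.connect(refresh_cache)


def save(instance):
    # Any view can call save(); none of them needs to know about the cache.
    db[instance["id"]] = instance
    post_save.send(sender="Book", instance=instance)


save({"id": 1, "title": "Learning Python"})
print(cache[1]["title"])
```

The point of the design is that the cache update lives in one receiver, so adding a new view that saves the model needs no extra cache code.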


Python Script 7: Scraping tweets using BeautifulSoup

Twitter is one of the most popular social networking services and is used by many of the world's most prominent people. Tweets can be used to perform sentiment analysis.

In this article we will see how to scrape tweets using BeautifulSoup. We are not using the Twitter API, as most APIs have rate limits.


py_instagram_dl – The Python Package to Download All pictures of an Instagram User

I created a small script to download all pictures of an Instagram user without using the APIs, as the APIs impose a few limitations such as rate limits.

After a few rounds of tweaking, optimising and beautifying the code, I thought of creating a Python package out of it. If you want to know how to create a distributable Python package, this article will be extremely helpful, as the steps are discussed in great detail.

You can find the py_instagram_dl package listed on PyPI.
The link is https://pypi.python.org/pypi/py-instagram-dl.

How to download all pictures of an Instagram user:
  • Create a virtual environment. This is optional but strongly recommended. You may follow this simple, step-by-step pocket guide on Python virtual environments.
  • Install dependencies. This package needs a few other Python packages to work.
  • Now install this package.
  • Use the installed package in your code.

Parameter Options:
The Download method has one mandatory and two optional parameters as of now.

Mandatory Parameter:
Parameter 1: Valid username of Instagram user.

Optional Parameters:
verbose : default value – True (boolean) : Decides whether information should be printed on screen. It is recommended to keep it set to True so that, in case of a large number of downloads, you can make sure the script is working and has not frozen.

wait_between_requests : default value – 0 (integer) : The time in seconds for which the script waits before sending the next request to Instagram to download a picture. It is recommended to pass a positive value for this parameter. If you are getting rate limit exceptions after downloading a few pictures, pass 1 for this parameter, i.e. wait for 1 second between requests.

Exceptions:

InvalidUsernameException: Raised when a non-existent username is provided.
RateLimitException: Raised when the rate limit is reached. Use the wait_between_requests parameter to avoid this.
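To show what the two optional parameters control, here is a rough sketch of a download loop that pauses between requests. It is not the package's source code: the download_all function, its fetch and sleep arguments and the file names are hypothetical, and only the parameter names verbose and wait_between_requests come from the description above.

```python
import time


def download_all(urls, fetch, wait_between_requests=0, verbose=True, sleep=time.sleep):
    """Fetch every URL in order, pausing between consecutive requests."""
    downloaded = []
    for index, url in enumerate(urls):
        # Wait before every request except the first one.
        if index and wait_between_requests > 0:
            sleep(wait_between_requests)
        downloaded.append(fetch(url))
        if verbose:
            # Progress output shows that the script has not frozen.
            print(f"downloaded {index + 1}/{len(urls)}: {url}")
    return downloaded


# Demonstration with a fake fetch and a fake sleep, so nothing hits the network.
pauses = []
result = download_all(
    ["a.jpg", "b.jpg", "c.jpg"],
    fetch=lambda url: url.upper(),
    wait_between_requests=1,
    sleep=pauses.append,
)
```

Passing the sleep function in as an argument is only a convenience for trying the loop without real delays.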

 

Source code.

Server Access Logging in Django using middleware

Some application admins need to know which user performed what action in the application. We also felt the need for such tracking, so we started developing an access log system for our application.

In this article we will see how to develop a server access logging app for a Django project.

We will store the following information:

  • URL/link visited
  • Request method (GET or POST)
  • GET or POST data
  • Referrer
  • IP address of the visitor
  • Session ID
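The fields listed above can be collected in a middleware along these lines. This is a minimal sketch rather than the app built in the article: the AccessLogMiddleware name and the plain list used for storage are assumptions. The attributes read from the request (path, method, GET, POST, META and session.session_key) are the ones Django provides, with session available only when the session middleware is enabled. The demonstration uses a fake request object so the block runs without Django installed.

```python
from types import SimpleNamespace


class AccessLogMiddleware:
    """Collects one log entry per request (sketch; storage is a plain list)."""

    entries = []  # stand-in for a database table

    def __init__(self, get_response):
        # Django passes the next handler in the chain as get_response.
        self.get_response = get_response

    def __call__(self, request):
        data = request.GET if request.method == "GET" else request.POST
        self.entries.append({
            "url": request.path,
            "method": request.method,
            # On a real QueryDict, dict() yields a list of values per key.
            "data": dict(data),
            "referrer": request.META.get("HTTP_REFERER"),
            "ip": request.META.get("REMOTE_ADDR"),
            "session_id": request.session.session_key,
        })
        return self.get_response(request)


# Fake request, used here only to exercise the middleware outside Django.
fake_request = SimpleNamespace(
    path="/books/",
    method="GET",
    GET={"page": "2"},
    POST={},
    META={"HTTP_REFERER": "https://example.com/", "REMOTE_ADDR": "127.0.0.1"},
    session=SimpleNamespace(session_key="abc123"),
)
middleware = AccessLogMiddleware(get_response=lambda request: "response")
response = middleware(fake_request)
print(AccessLogMiddleware.entries[-1])
```

In a real project the entry would be saved to a model instead of a list, and the class would be added to the MIDDLEWARE setting after the session middleware.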


Solving Django error ‘NoReverseMatch at’ URL with arguments ‘()’ and keyword arguments ‘{}’ not found.

Every Django developer encounters this error at least once.

Beginners spend many hours debugging the issue, jumping from question to question on Stack Overflow and posting in multiple groups on Facebook.

In this article we have tried to list the common mistakes developers make that lead to this error.


Virtual Environment in Python – A Pocket Guide

In almost every article, we have recommended using a virtual environment for developing any Python or Django project.

In this article, we will briefly cover the virtual environment in Python, its installation and its usage.

What is a Virtual Environment:

A virtual environment is an isolated Python environment that can be created using the virtualenv tool. It contains all the packages that a Python project requires. A Python project running in a virtual environment does not use the system-wide installed Python packages.
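As a small illustration of that isolation, the sketch below creates an environment from Python itself and checks whether the current interpreter is running inside one. Note that it uses the standard library venv module rather than the virtualenv tool described above, and that the in_virtual_environment helper and the temporary directory are assumptions made for this example.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtual_environment():
    # Inside an environment created by venv, sys.prefix points at the
    # environment, while sys.base_prefix points at the base installation.
    return sys.prefix != sys.base_prefix


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "env"
    # with_pip=False keeps the example fast; the environment has no pip.
    venv.create(env_dir, with_pip=False)
    # Every environment created by venv gets a pyvenv.cfg marker file.
    created = (env_dir / "pyvenv.cfg").exists()

print("environment created:", created)
print("running inside a virtual environment:", in_virtual_environment())
```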


Get latest Bitcoin and other crypto-currencies rates using python Django

Everybody is investing in bitcoins. James Howells is trying to dig a landfill site to get 7500 bitcoins that were dumped there in 2013.

To be a good investor, you need to keep track of the ups and downs in the market. There are multiple platforms where you can track the price of bitcoin, but for a Python programmer that is no fun. As Python programmers, we will develop our own project to get the latest prices of bitcoin and other crypto-currencies.

Let’s start.


Python Script 6: Wishing Merry Christmas using Python Turtle

Merry Christmas everyone.

Since it is Christmas today, I thought of wishing everyone in a different way. I am a Python programmer and I love writing code, so I decided to do something with Python. After an hour I was ready with the script below to wish all of you a Merry Christmas using Python turtle.

The code is available on GitHub as well.

Code :
 

Output Video:

Happy learning.

 

Reference:
[1] https://coolpythoncodes.com/python-turtle/
[2] https://docs.python.org/3.6/library/turtle.html