As you already know, Parsero is a free script written in Python which helps you to automatically audit the robots.txt file of a web server. In just a few seconds, you can get a lot of valuable information that you need when you are auditing a website.
This tool is available for download here:
https://github.com/behindthefirewalls/Parsero
And here you can read about what Parsero could already do in earlier versions:
http://www.behindthefirewalls.com/2013/12/parsero-tool-to-audit-robotstxt.html
How to install Parsero v0.6
Parsero is really easy to install. You can install it, for example, in Kali Linux. You only need to run the commands below.
apt-get install python3
apt-get install python3-pip
pip-3.2 install urllib3
pip-3.2 install beautifulsoup4
git clone https://github.com/behindthefirewalls/Parsero.git
What's new?
If you look at the Parsero help, you will see two new features:
- "-o" : To show only the available Disallow entries.
- "-sb" : To search Bing for indexed Disallow entries.
Showing only the available Disallows
In the picture below you will see the difference between using the "-o" option and not using it.
If the robots.txt file has only a few entries, I recommend you don't use the "-o" option, because seeing all the results helps you figure out what type of content the administrator wanted to hide. But if the file is bigger, you have a lot of information to analyze, and it is easier to perform the audit by getting only the links which can actually be visited.
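To make the idea behind "-o" more concrete, here is a minimal sketch in Python. It is not Parsero's actual code: the function names are hypothetical, and it assumes that an "available" entry is one that answers with an HTTP 200 status code. It also ignores wildcard patterns in Disallow lines.

```python
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen


def parse_disallows(robots_txt):
    """Return the paths from Disallow lines, in order, without duplicates."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "disallow":
            path = value.strip()
            if path and path not in paths:  # skip empty "Disallow:" lines
                paths.append(path)
    return paths


def http_status(url):
    """Fetch url and return its HTTP status code, or None if unreachable."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status
    except HTTPError as err:
        return err.code  # 4xx and 5xx answers still carry a status code
    except OSError:
        return None  # DNS failure, refused connection, timeout...


def available_disallows(base_url, robots_txt, get_status=http_status):
    """Keep only the Disallow entries that answer with HTTP 200 (assumption)."""
    urls = [urljoin(base_url, path) for path in parse_disallows(robots_txt)]
    return [url for url in urls if get_status(url) == 200]
```

The get_status parameter is there only so the filter can be tried without touching the network, for example by passing the "get" method of a dictionary that maps URLs to status codes.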
Searching the Disallow entries in Bing
The fact that the administrator wrote a robots.txt file to try to hide part of his content from the crawlers doesn't mean that the search engines don't index these Disallow entries.
For example, in the picture below, Parsero finds content indexed by Bing which shouldn't have been indexed. Parsero will show you the first 10 Bing results for the indexed Disallows.
If you CTRL+click on the links, your browser will open:
- White links: the search page in Bing.
- Green links: directly the result found in Bing (the content is not always available and sometimes you will get an HTTP 404 error).
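The Bing lookup described above can be approximated with a search restricted by the "site:" operator. The sketch below only builds the search URL for one Disallow entry. The function name is hypothetical and this is an assumption about how such a query could be built, not necessarily how Parsero does it.

```python
from urllib.parse import urlencode, urlsplit


def bing_site_query(base_url, disallow_path):
    """Build a Bing search URL restricted to one disallowed path."""
    host = urlsplit(base_url).netloc  # e.g. "example.com"
    query = "site:" + host + disallow_path
    # urlencode percent-encodes ":" and "/" so the query survives as one parameter
    return "https://www.bing.com/search?" + urlencode({"q": query})
```

Opening the returned URL in a browser shows whatever Bing has indexed under that path, if anything.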
Installation is sooo fruuustrating! now i understand why many prefer the linux enterprise (servers) more than the desktop versions. i've already installed python 3.4.2 and urllib3, parsero failed to run! windows 8 is a fail but linux is way more than failure, it terribly sucks, the linux community should really welcome newbies like us, sooo fruuustrating!
Yes, if you are not familiar with Python and Linux, it could be frustrating... That is why I added 2 options to install it in a really easy way:
1. Install Parsero from PyPI using these commands:
sudo apt-get install python3-pip
sudo pip3 install parsero
$ parsero -h
2. If you already have Python 3 installed:
git clone https://github.com/behindthefirewalls/Parsero.git
cd Parsero
sudo python3 setup.py install
$ parsero -h
3. Kali Linux
sudo apt-get update
sudo apt-get install parsero
$ parsero -h
All installation options are described here: https://github.com/behindthefirewalls/Parsero
Please, let me know if that helps you!!! ;P