Chapter 1. Instant Simple Botting with PHP

Welcome to Instant Simple Botting with PHP. This book explains the information and code you need to start simple botting with PHP. Using this book and PHP, you will learn the basics of HTTP requests and responses, get started with building your own bot, and learn how to parse and save the data that you harvest with your bot.

This document contains the following sections:

So, what is Simple Botting with PHP? lets you discover what simple botting with PHP actually is, what you can do with it, how you can create your own bots, and why it's so great.

Installation teaches you how to create your own command-line PHP applications, how to execute command-line PHP applications, the difference between using cURL and simple socket connections, and how to perform simple HTTP GET and POST requests.

Quick start will teach you how to create your own bot, implement the bot configuration settings, instantiate the bot and execute requests, and save data harvested by the bot.

Top 5 features you need to know about will help you find out how to perform five important botting tasks. By the end of this section, you will be able to parse harvested data, store parsed data in multiple ways, build bot logging, add stealth to your bots, and start creating advanced features for your bots such as link handling.

People and places you should get to know will provide you with various helpful suggestions and links to the project page, as well as articles and tutorials that can further assist you in developing powerful PHP bots. Open source projects are centered on a community of sharing information and tools.

So, what is Simple Botting with PHP?

In this book, I am going to explain how to create your own bots using PHP. You should already be familiar with PHP (Hypertext Preprocessor scripting language) and common built-in PHP functions. Throughout this book, I will only use common PHP functions that will be available in basic PHP installs. PHP is a good language to use to create your first robot because it is a popular and powerful language that can easily be tested in a web browser.

What is a robot? A robot (also called a bot, a web bot, or, when it navigates on its own, a spider) is a software application that systematically executes requests and handles responses for the benefit of its developer. These benefits can include activities such as gathering or harvesting data, checking a website for errors or links, checking e-mail, or handling more advanced tasks such as crawling and archiving multiple websites.

Why use robots? The benefits listed above are all good reasons to use bots. Furthermore, bots can save time by automating repetitive tasks. For example, say the company you work for has a project that requires data entry. A data directory on the local company server stores flat files that must be opened by an end user. The end user must then copy the records in each flat file line by line and paste the copied strings into various web application form fields. Finally, the end user submits the web application form and the data is saved in a proprietary database.

If there were only twenty flat files on the server with a total of five hundred records, it would probably be logical to have a data entry employee complete the task. However, say there were one thousand files with twenty-five thousand records. Now it might be more practical to develop a bot capable of scanning the files, extracting the records, and submitting the records through the web application using HTTP POST requests. In this book, you will learn the logic that will allow you to create a bot capable of completing these basic tasks; with practice, you can take that knowledge and build advanced bots that execute a wide variety of tasks.
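The file-scanning half of that task can be sketched in a few lines. This is only a minimal illustration under assumptions the text does not state: the flat files are taken to be plain text with one record per line and a .txt extension, and the function name collectRecords() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: gather every non-empty line (one record per line
 * is an assumption) from the flat files in a directory.
 */
function collectRecords($directory, $pattern = '*.txt')
{
    $records = [];

    // glob() returns false on error, so fall back to an empty list.
    $files = glob(rtrim($directory, '/') . '/' . $pattern) ?: [];

    foreach ($files as $file) {
        $lines = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
        if ($lines === false) {
            continue; // unreadable file: skip it rather than abort the run
        }
        foreach ($lines as $line) {
            $line = trim($line);
            if ($line !== '') {
                $records[] = $line; // keep only lines that contain data
            }
        }
    }

    return $records;
}
```

Each record returned by a helper like this would then become the payload of one POST request to the web application form.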

HTTP request types

Anything you commonly use on the Internet can also be used by a web bot. Obviously, we as end users use the Internet much differently than a web bot does. Most of the time, when you want to submit a form on a website, you simply fill in the HTML form and click on a submit button. The website processes the posted information (the HTTP request) and may redirect you to another page (the HTTP response), where the website owner thanks you for completing their form.

When we develop a bot, we must attack the same task using a different workflow. First, we need to program the bot so that it sends an HTTP request with the same data that would be submitted on the website's HTML form. Instead of having the bot click on the submit button, we simply set the request type sent by the bot. By doing this, we signal to the web server that we are sending data that we want it to digest. This type of request is called a POST request.
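To make this concrete, here is a rough sketch of what such a POST request looks like on the wire. The helper name buildPostRequest(), the host www.example.com, the path /contact, and the field names are all placeholders chosen for illustration; the sketch only builds the request text and does not send it.

```php
<?php
/**
 * Hypothetical helper: build the raw text of an HTTP/1.1 POST request
 * carrying URL-encoded form fields.
 */
function buildPostRequest($host, $path, array $fields)
{
    // Encode the fields the same way a browser encodes a submitted form.
    $body = http_build_query($fields, '', '&');

    $headers = [
        'POST ' . $path . ' HTTP/1.1',
        'Host: ' . $host,
        'Content-Type: application/x-www-form-urlencoded',
        'Content-Length: ' . strlen($body),
        'Connection: close',
    ];

    // A blank line separates the headers from the body.
    return implode("\r\n", $headers) . "\r\n\r\n" . $body;
}

// Example values only; the host, path, and fields are placeholders.
$request = buildPostRequest('www.example.com', '/contact', [
    'name'  => 'Ada Lovelace',
    'email' => 'ada@example.com',
]);
echo $request, "\n";
```

The first line of the output names the request type (POST), and the form data travels in the body after the blank line.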

Another type of request is a GET request, which is a more common and simpler request type. A GET request simply asks the server to provide a resource based on a URL. In our bots, we will be using both GET and POST requests. In simple terms, you can think of a GET request as telling a web server to get something for you (a getter method). A POST request, on the other hand, is like telling the server to set something for you (a setter method).
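The practical difference between the two request types is where the data travels. The following sketch, with a hypothetical helper named describeRequest() and placeholder example.com URLs, describes a request in the option format that PHP's HTTP stream wrapper accepts: a GET request puts its parameters in the URL, while a POST request puts them in the body.

```php
<?php
/**
 * Hypothetical helper: describe a request as stream-context options.
 * GET carries its parameters in the URL; POST carries them in the body.
 */
function describeRequest($method, $url, array $params = [])
{
    $query = http_build_query($params, '', '&');
    $http  = ['method' => $method];

    if ($method === 'POST') {
        $http['header']  = 'Content-Type: application/x-www-form-urlencoded';
        $http['content'] = $query;
    } elseif ($query !== '') {
        // Append to the URL, respecting any query string already present.
        $url .= (strpos($url, '?') === false ? '?' : '&') . $query;
    }

    return ['url' => $url, 'options' => ['http' => $http]];
}

$get  = describeRequest('GET', 'http://www.example.com/search', ['q' => 'php bots']);
$post = describeRequest('POST', 'http://www.example.com/contact', ['name' => 'Ada']);

echo $get['url'], "\n";                         // the parameters travel in the URL
echo $post['options']['http']['content'], "\n"; // the parameters travel in the body
```

To actually send either request, the options array can be passed to stream_context_create() and the resulting context handed to file_get_contents() as its third argument.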

Simple is smarter

If you are familiar with popular APIs (Application Programming Interfaces), you'll know that they work much the same as HTTP requests and responses. In fact, our web bots will act as an API to web servers. What do I mean by this? Most APIs work the same way: we request an action, which normally triggers a response that can then be consumed and used.

In much the same way, we will program our bot to request something, and after the request has been sent, our bot will fetch the response and execute various functions or methods. If a bot is developed correctly, we don't have to think about everything the bot is doing internally, in exactly the same way that we don't have to think about what an API is actually doing when we send a request; we just expect a response.
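As a rough illustration of "fetching the response", the sketch below splits a raw HTTP response into the pieces a bot usually acts on. The function name parseHttpResponse() is hypothetical, and the sketch makes simplifying assumptions: it does not handle chunked transfer encoding, and a repeated header keeps only its last value.

```php
<?php
/**
 * Hypothetical helper: split a raw HTTP response into status code,
 * headers, and body.
 */
function parseHttpResponse($raw)
{
    // The headers end at the first blank line.
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $body = isset($parts[1]) ? $parts[1] : '';

    // The status line looks like "HTTP/1.1 200 OK".
    $statusLine  = array_shift($headerLines);
    $statusParts = explode(' ', $statusLine, 3);
    $status = isset($statusParts[1]) ? (int) $statusParts[1] : 0;

    $headers = [];
    foreach ($headerLines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            // Header names are case-insensitive, so normalise them.
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }

    return ['status' => $status, 'headers' => $headers, 'body' => $body];
}
```

Code that uses the bot can then branch on the status value or read the body without caring how the response was obtained.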

Code example expectations

In order to build bots that mimic APIs and are simple to use, we need to develop them using PHP classes, which will allow us to use bot objects. If you are unfamiliar with PHP object-oriented programming (OOP), you should research it before we use classes and objects later on.
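As a preview of the general shape such a class might take, here is a minimal sketch. The class name SimpleBot, its constructor arguments, and the 10-second timeout are assumptions made for illustration and are not the class developed later in this book. The transport is injectable so that the class can be exercised without a network connection.

```php
<?php
/**
 * Hypothetical bot class; the name SimpleBot and its methods are
 * illustrative only.
 */
class SimpleBot
{
    private $userAgent;
    private $transport;

    /**
     * @param string        $userAgent value sent in the User-Agent header
     * @param callable|null $transport function ($url, array $options) that
     *                                 returns the response body; defaults to
     *                                 PHP's HTTP stream wrapper
     */
    public function __construct($userAgent, $transport = null)
    {
        $this->userAgent = $userAgent;
        $this->transport = $transport ?: function ($url, array $options) {
            $context = stream_context_create($options);
            return file_get_contents($url, false, $context);
        };
    }

    /** Send a GET request and return whatever the transport returns. */
    public function get($url)
    {
        $options = ['http' => [
            'method'     => 'GET',
            'user_agent' => $this->userAgent,
            'timeout'    => 10, // seconds; an arbitrary example value
        ]];

        return call_user_func($this->transport, $url, $options);
    }
}
```

With a class like this, calling code only needs to create the object, for example new SimpleBot('MyBot/0.1'), and call get() with a URL; the request details stay hidden inside the object. With the default transport, get() returns false if the request fails.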

In this book, I will be demonstrating PHP code using PHP 5.4 coding standards and plentiful code comments.
