Chapter 8

Leveraging Open-Source Intelligence

Richard Ackroyd, Senior Security Engineer, RandomStorm Limited

The old saying “If I had eight hours to chop down a tree, I’d spend six sharpening my axe” could not relate more perfectly to this phase. The reconnaissance work you perform will make or break a social engineering engagement. It will truly form the foundations of any work that follows it. There is a wealth of information at our fingertips. This chapter will show you how to find and manipulate it to aid in your assessment. Harvesting e-mail addresses, document metadata, corporate websites, and social media will all be covered.

Keywords

Passive Spider; FOCA; Metagoofil; photographic metadata; Exiftool; image picker; Wget; GeoSetter; reverse image search engines; PDFGrep; document obfuscation; theharvester; Sam Spade; Jigsaw; Recon-ng; social media; LinkedIn; Facebook; DNS records; CeWL; WHOIS records

Information in this chapter

• The Corporate Website

• Employee Names

• Staff Hierarchy

• Phone Numbers

• Employee Photos and Data

• Active and Passive Spidering

• Document Metadata

• FOCA

• Metagoofil

• Strings

• PDFGrep

• Photographic Metadata

• ExifTool

• Image Picker

• WGET

• GeoSetter

• Reverse Image Search Engines

• Not so metadata

• E-mail addresses & E-mail enumeration

• Phishing Attacks

• Password Attacks

• E-mail address naming conventions

• Social Media

• LinkedIn

• Facebook

• Twitter

• DNS Records & Enumeration

• Dnsrecon

• Subdomain brute forcing and enumeration

• CeWL

• Whois

Introduction

Chapter 7 introduced a model for creating targeted scenarios. This chapter is going to look at how to use publicly available information to profile a particular target.

This is a genuinely enjoyable part of social engineering. It is truly fascinating just how much information a typical business puts into the public domain. The scary thing is, most organizations don’t even realize that they are doing it. The information can find its way out in document metadata or the registration details for their public IP addresses. It is a truly fascinating aspect of social engineering that will also prove useful on penetration tests.

The chapter will begin by covering the corporate web presence, and how to interrogate it for useful information. That information could be basic in nature such as corporate direct dials and e-mail addresses. Alternatively, it could be used to provide the building blocks of a pretext. Pieces of seemingly uninteresting information such as partners, vendors, and clients suddenly become very interesting indeed.

The methods for recovering information directly from the corporate website, both passive and active in nature, will be examined.

Next, the chapter covers the process and tools behind the gathering of corporate e-mail addresses. E-mail addresses are gold dust on the vast majority of social engineering attacks. Not only do they provide a base for phishing attacks, but they give the naming convention of every other mailbox in the organization.

The tools that acquire these e-mail addresses often don’t tell us where they found them. We will trace them back to their source to show how useful this can be. As an example, during a recent social engineering gig, a good proportion of the e-mail addresses were actually found in documents published by staff members. These documents were not available on the corporate website, but did include direct dials, mobile phone numbers, and a slew of other useful intelligence.

Document metadata will be covered in the following section. This chapter takes a look at some popular tools for harvesting and parsing documents that are published on the corporate website. Amongst the things that are expected to be seen are usernames, folder structures of local workstations, e-mail addresses, and operating systems in the output. It goes without saying that this is useful to a social engineer.

Next is the investigation into social media sites, blogs, and forums, all of which can be useful sources of information about a potential target. Search engine harvesting will be the next port of call; it never ceases to amaze me just what a brilliant reconnaissance tool the Google search engine is. DNS and Whois records are briefly touched upon: how to get to them, which tools to use, and how to apply the intelligence gained to your engagement. Finally, this chapter provides guidance on how all this extracted information can be used, including some common tasks associated with the manipulation of the data, such as building user and password lists, which can be leveraged in further attacks. A lot of really great technologies, techniques, and tools will be covered in this chapter, so let’s get going.

The vast majority of tools used in this chapter are installed in Kali Linux. You can download a Virtual machine or ISO here: http://www.kali.org.

The corporate website

When looking to gather intelligence on any business, their corporate web presence is the obvious place to start. However, what sort of information can be retrieved that isn’t immediately visible? Where are the hidden gems and how can these be applied to an on-going assessment? It is often discovered that most businesses are surprised at the amount of information on offer to a potential attacker. What kind of information can be expected to be found, and why is it useful to social engineers as well as malicious individuals?

Business purpose

Understanding what a business does, what its ideals are, and how it operates are always key pieces of information. If the chosen approach is going to be impersonating a member of staff, thorough research is needed in order to pull it off. Understanding the basics of the organization, along with some of the lingo used within the trade, can go a long way. Knowing about a business can also help when making educated guesses about the type of systems that may be in use.

A little digging can also provide hints on how much resistance will be faced during a call or onsite visit. If the business is used to dealing with government and military agencies, the social engineer is more likely to be faced with well-trained individuals when it comes to information security. It is more likely that they will be more regimented as an organization and will be process driven. Other indicators to the potential security posture of an organization can include logos for information assurance standards such as ISO or PCI. Again, these are indicators that at least some people within the business will have had training that could make them more difficult to leverage.

A lot of what is done by social engineers is done blind, at least to start with, so it is recommended that as much as possible is gained from this exercise. Failing to prepare is preparing to fail.

Partners, clients, vendors

A lot of organizations publish lists of clients on their website, as well as partners and vendors. What better pretext could there be for an information gathering call? Calling into a help desk and impersonating a client is likely to get you some useful information. If the target organization has a client portal, it may be possible to gain access to it by attempting to have the client’s password reset. In a lot of cases, all it takes to gain somebody’s confidence is a little bit of knowledge that would be deemed private by them. This could be as simple as knowing the URL of the client portal and using the client’s name. In many cases, adding in a sense of urgency can apply all the pressure needed to get things moving.

E-mail addresses

E-mail addresses are exceptionally useful in social engineering attacks and these will be covered in more detail later in this chapter, including how to harvest them from publicly available resources. There are several ways to leverage them, e.g., choosing to go down the route of targeted or broad-scale phishing attacks, or using them to attack VPN and e-mail portals. While the latter may look a lot more like penetration testing territory, many social engineering engagements are made up of this blended approach.

The e-mail address can also indicate the internal username convention, which means if more employees can be identified, an even bigger e-mail and user list can be built. These can be used throughout the engagement for further attacks.

Employee names

It is likely that the number of employee names on a corporate website will be limited to directors and shareholders. Finding people further down the hierarchy is still possible, especially where organizations have blogs or news articles written by employees. Employee names are useful from an impersonation point of view or for name dropping when making a call. Most will argue that LinkedIn is a better resource for this kind of information. Just be certain that their current employment status is accurate. Impersonating an employee that no longer works for the target will be unlikely to provide any benefit.

Staff hierarchy

Understanding a staff hierarchy is always useful to a social engineer. A call claiming to be from someone in a position of authority can grease the wheels in unexpected ways. How many call center staff dare risk offending the person that is in charge of their entire department? Another angle to take is that of familiarity, or the lack thereof. Picking a staff member who is less likely to have regular contact with call center staff means less chance of being busted. There is nothing worse than going through the entire process of building your impersonation attempt only to have your target say “hey, you don’t sound like Bob…” At this point, you have probably set alarm bells ringing at an organizational level.

Phone numbers

Phone numbers, like e-mail addresses, are also key to the ongoing engagement. It is quite common to find that an organization will only publish the number for a central switchboard. It is often believed that this is more secure, as the switchboard staff are likely to have had role-specific security training. However, this is often not the case. The people who answer the phone are employed to help, and to help the business, not to be a hindrance. They will likely be taking large volumes of calls too, which means little time to vet each call and less chance of recognizing repeat callers. Consequently, switchboards are frequently targeted in order to access direct dials for other employees within a business. It is rarely the case that someone will be challenged for much other than their name and the purpose of the call. Even if they are, the organization will likely take thousands of inbound sales calls every year, so it is unlikely that any suspicions will be raised.

Another interesting vector relates to the lack of responsibility to authenticate somebody who has already been passed through by another member of staff. This can pose serious problems for an organization when it comes to information leakage. It shouldn’t be taken at face value that the person being spoken to is a client; just because somebody says they are a client doesn’t mean their identity shouldn’t be verified before any sensitive information is handed over.

Photos of employees and business locations

Later in this chapter, the intelligence potential of digital photographs, which contain all sorts of hidden information, will be investigated. This can include the type of device the photo was taken with, the geographic location, and more.

Then there is the more obvious intelligence within the photo. This can range from photographs of physical locations, to the interior layout of offices, and even ID badges. Images.google.com is an absolute gold mine for this kind of endeavor. For example, ID badges are often discovered in promotional photographs and then used to recreate duplicates. Even a roughly created ID badge can be enough to enable a social engineer to get around a building without any intervention from employees or security. In fact, it has been known for social engineers to very quickly fabricate a badge using an inkjet printer. Despite being printed on photo paper, it may still pass a close-up inspection by a security guard. The chances are all they are looking at is the photo, or they are merely going through the process.

Having established some of the useful information gathered from a corporate web presence what are the tools used to retrieve it?

Spidering

Spiders, sometimes known as Crawlers, are designed to index websites and their contents. They do this by starting with a seed URL and then identifying all hyperlinks on these sites. The Spider then visits each of these links, eventually building a full map of the website in question. Spiders are most commonly used by search engine providers such as Google and Bing. This forms the basis for how the content is delivered when it is searched for.

Spidering can be tremendously useful both to social engineers and penetration testers. This is because it allows the complete mapping out of a corporate web presence in an automated and time effective manner.

A spider can be employed against a website in both a passive and active fashion. Here’s a look at a tool that can be used to passively map a website’s structure.

Passive Spider

Any reconnaissance effort may be for nothing if it starts to raise alarms. Wouldn’t it be great if we could passively spider a website? As it happens this can be achieved through the use of a “spider by proxy.”

The big search engine providers, such as Google and Bing, spend a lot of time spidering the Internet so that the search engine user doesn’t have to. Why not take advantage of that?

Step up Passive Spider (https://github.com/RandomStorm/passive-spider), created by the award-winning Ryan Dewhurst (@ethicalhack3r)—http://www.ethicalhack3r.co.uk.

Passive Spider takes advantage of the search provider’s leg work by utilizing the search results to give the site layout.

At the time of writing, Passive Spider uses only the Bing search engine, however registration for a free Bing developer API key is required to perform any searches.

The installation requires Ruby but is very straightforward. It has been tested on OSX running Ruby 1.9.3 and also works in Kali Linux, again on Ruby 1.9.3.

Here are the installation instructions:

git clone https://github.com/RandomStorm/passive-spider.git
cd passive-spider
gem install bundler && bundle install

Alternatively, you can download the ZIP archive directly from GitHub (https://github.com) and extract the files.

In the passive-spider directory is a file called api_keys.config. This is where you need to enter the previously mentioned Bing search API key in between the speech marks.

Ensure that the pspider.rb file has execute permissions:

chmod +x pspider.rb

At this point, Passive Spider is ready to be run. Let’s see what information it gathers.

./pspider.rb --domain syngress.com
[+]---------------------[+]
[+] Passive Spider v0.2 [+]
[+]---------------------[+]
Shhhhh… by Ryan 'ethicalhack3r' Dewhurst
Part of The RandomStorm Open Source Initiative
[+] URLs: 59
http://booksite.syngress.com/
http://www.syngress.com/?cur=usd
http://booksite.syngress.com/companion/conrad/practice_exams.php
http://booksite.syngress.com/companion/conrad/
http://booksite.syngress.com/companion/issa.php
http://www.syngress.com/about-us
http://booksite.syngress.com/companion/special_interests.php
http://booksite.syngress.com/companion/conrad/podcasts.php
http://booksite.syngress.com/companion/conrad/Conrad_PracticeExamA/COU36289844/open.html
http://booksite.syngress.com/companion/certification.php
http://booksite.syngress.com/companion/hacking_penetration.php
http://booksite.syngress.com/9781597494250/content/
http://www.syngress.com/news/nacdl-cacjs-5th-annual-forensic-science-seminar-march-23-24-las-vegas/
http://www.syngress.com/special-interests/
http://www.syngress.com/events/5th-annual-cai-security-symposium---northern-ky-univ-oct-28th-highland-heights-ky/
http://booksite.syngress.com/companion/digital_forensics.php
http://booksite.syngress.com/Landy/index.php
http://www.syngress.com/news/securabits-100th-podcast---guest-harlan-carvey-and-craig-heffner-tonight-march-7th-7-30-est/
http://booksite.syngress.com/9781597494250/content/Video/HIPT/module06/index.html
http://www.syngress.com/events/7th-annual-scada-and-process-control-system-security-summit-orlando-florida/
http://www.syngress.com/information-security-and-system-administrators/Dictionary-of-Information-Security/
http://www.syngress.com/news/read-the-latest-review-for-digital-forensics-with-open-source-tools-by-altheide-and-carvey/
http://booksite.syngress.com/9781597494250/content/Video/HIPT/module02_E/index.html

As you can see, the output is very self-explanatory. We now have the basic layout of the target corporation’s website without having touched it. This sort of approach is highly recommended prior to commencing any more active types of testing. It may be that a more active spider is required for the entire site further down the line. This will be covered later in this chapter.

Passive Spider doesn’t stop there. It also displays any documents it found during the query, which again can be extremely useful to a social engineer. Why documents are so useful when looking at metadata will be covered later in this chapter. Additionally, documents often prove extremely useful because they contain contact details for employees.

Scrolling to the bottom of the results reveals any subdomains and interesting keywords that were found during the search.
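Because Passive Spider writes its results to standard output, they can be filtered and stored with ordinary shell tools. As a minimal sketch (the domain and file name here are only examples), the following saves a run to a file and then pulls out anything that looks like a PDF document:

./pspider.rb --domain example.com | tee pspider-output.txt
grep -i '\.pdf' pspider-output.txt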

Active spidering with OWASP Zed Attack Proxy

The Zed Attack Proxy (ZAP) is an easy to use integrated penetration testing tool for finding vulnerabilities in web applications.

It is designed to be used by people with a wide range of security experience and as such is ideal for developers and functional testers who are new to penetration testing.

ZAP provides automated scanners as well as a set of tools that allow you to find security vulnerabilities manually.

https://www.owasp.org/index.php/OWASP_Zed_Attack_Proxy_Project

ZAP is a yet another fantastic open-source tool that we can take advantage of, however the full suite of features it provides is way beyond the scope of this book. What we are interested in for now is its spidering ability.

The ZAP is so called because it proxies your connections out to your target of choice. This gives ZAP the ability to intercept and tamper with any outbound request or inbound response.

After launching ZAP, you configure your browser to point at it by configuring localhost and port 8080 in your proxy settings. I use the FoxyProxy add-on for Firefox, but there are alternatives out there for other browsers. You could just use the standard proxy settings if you choose. FoxyProxy and applications like it are far quicker if you are going to be changing things around a lot, and you probably will.

So, let’s get ZAP installed and the browser configured so that we can intercept traffic. For the purposes of this section, I’m going to assume you are using Firefox.

Step 1: Go to the FoxyProxy website: https://addons.mozilla.org/en-US/firefox/addon/foxyproxy-standard/.

Step 2: Click “Continue to download.”

Step 3: Click “Add to Firefox.”

Step 4: Firefox should prompt you to install FoxyProxy. Firefox will need to be restarted at this point.

That’s all there is to it. Next, we have to configure FoxyProxy to enable the sending of all the traffic to ZAP.

Step 1: Find the FoxyProxy button at the end of the URL bar and right click it. Choose options.

Step 2: Click “add new proxy” (Figure 8.1).

Step 3: Fill out the proxy settings for “localhost” and port 8080 (Figure 8.2).

FoxyProxy will automatically send HTTPS and HTTP traffic to ZAP when we enable it. You may have to configure HTTPS proxying separately in the browser or system settings if FoxyProxy is not being used.

Step 4: Click the General button at the top and give the entry a name, “Owasp ZAP” will do.

Step 5: Choose a color for the entry and click ok to finish up.

image
Figure 8.1 Add New Proxy.
image
Figure 8.2 FoxyProxy settings.

Now to enable the proxy just right click the little fox icon at the end of the address bar and select “Use OWASP ZAP for all URLs.” This option will send all of the HTTP and HTTPS traffic through ZAP. As it hasn’t been launched yet, none of the websites will load.

ZAP can be grabbed from the link provided at the beginning of this section; there is a Windows installer as well as a Mac version. The example given here is for running it on a Mac, but the process is much the same regardless. After installing and running ZAP, the following screen is presented (Figure 8.3).

image
Figure 8.3 ZAP initial screen.

So, let’s jump back to Firefox, browse a site, and see what happens. For the purposes of this book, this is demonstrated using the Damn Vulnerable Web Application (http://www.dvwa.co.uk), which is also included in the OWASP Broken Web Applications operating system (Figure 8.4).

image
Figure 8.4 ZAP requests.

This shows that the sites visited appear in the “Sites” box on the left. The individual HTTP requests are in the bottom pane. At this point the only thing of interest is the Spider, although the reader is encouraged to explore ZAP’s other functionality.

If the interest is in intercepting requests destined for an HTTPS site, the ZAP certificate will need to be accepted so that the traffic can be decrypted and re-encrypted.

Here is an overview of the Spidering process:

Step 1: Click the Spider button.

Step 2: Drop down the “Site” menu on the left-hand side and choose the site that the spider is to be deployed against.

Step 3: Click the play button and wait for it to finish (Figure 8.5).

image
Figure 8.5 ZAP spider.

Spidering a website can be intrusive and may cause issues on some systems. Please be aware of the risks and seek permission prior to performing any kind of active testing.

Why is this information useful to a social engineer?

Both passive and active spidering have been investigated, but why is this information useful to social engineers? The site layout is just a precursor to further exploration; it helps provide guidance to the reconnaissance path and ensure that time is spent effectively. It can provide at a glance information that can be useful to any social engineer such as business partners, portals, clients, vendors, and contact pages. It can also help to identify documents and key words across large sites. All of this information can be used as a platform for further reconnaissance. Ultimately this information will lead to the construction of one or more pretexts that form the basis of any assessment.

Spidering is just one of the tools that would be used very early on in the reconnaissance lifecycle. Having thoroughly covered this tool, it is pertinent to look at some of the other tools that are available for use by the social engineer.

Document metadata

Document metadata is basically attribute information stored within office documents. When a Microsoft Word or PDF document is created, it is automatically tagged with some metadata without the author really even knowing about it. This information can be retrieved by anybody who has the document.

Typically the metadata will be the user and business name selected when the office product was installed. At least some of the document metadata can be viewed by checking the document properties from within the office application (Figure 8.6).

image
Figure 8.6 Installation options being populated into the metadata of Microsoft Word documents.

This clearly demonstrates that, at the very least, the document may provide the name of the individual who created it. This can then be added to username lists or potentially used as a name drop during a call to an organization.

There are many other metadata tags that can be added by an individual, and quite a few that the application adds by itself.

It is common to find operating system versions, directory structures, and users within this hidden metadata. Additionally, the exact version of software used to create the file can also be found. Here are some of the tools that can be used to extract this intelligence:

Strings

Strings searches for printable strings within a file and its metadata and displays them within a terminal.

For example, here is the output of strings when run against a PDF file (Figure 8.7):

mac1:rich$ strings mypdfdocument.pdf

image
Figure 8.7 Strings output from PDF file.

In this particular case, both the operating system in use, Mac OSX 10.0.4, and the version of Acrobat used to create the file could be retrieved. Of course, it would take a long time to gather this information manually in the real world, especially if the customer has a lot of documents available on their website. Fortunately, there are numerous tools that can automate this process.
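Before reaching for those tools, strings itself can be scripted across a folder of downloaded files. A minimal sketch, assuming the documents have already been saved into a local docs directory (a hypothetical path):

# Print any creator/producer/author lines found in each PDF under docs/
for f in docs/*.pdf; do
  echo "== $f =="
  strings "$f" | grep -iE 'creator|producer|author'
done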

FOCA—http://www.informatica64.com/foca.aspx

FOCA is a Windows application designed as an information gathering tool for penetration testers. It is often regarded as one of the best tools out there for this kind of work. It covers a multitude of functionality beyond that which is covered in this book. The Pro version should be considered as a worthwhile investment for anyone using the application a great deal.

One of the first things that will strike the user about FOCA is how easy it is to use. It is all GUI driven, nicely laid out, and to most people will be very intuitive.

The process of extracting metadata is pretty much completely automated. FOCA is given a domain, and it goes away and digs out any documents that exist within it. Next, FOCA is instructed to download the documents and extract the metadata. It will categorize each type found and display them in an easily navigated tree. The metadata can then be exported to files so that it can be manipulated. It really doesn’t get any easier than this.

Below is a walk-through of the process using FOCA Free which can be downloaded from the link provided. An e-mail address is required but this is a small price to pay for a fantastic application.

At the first launch of FOCA, the user is greeted by a screen not unlike the one in Figure 8.8.

image
Figure 8.8 FOCA.

Here is the “step-by-step” guide:

Step 1: Click “Project” at the top left and then “New Project.”

Step 2: Give the project a name and then enter a domain in the “Domain Website” field. This should just be mydomain.com, not a URL.

Step 3: Tell FOCA where to store the project documents and add any notes that may be useful later (Figure 8.9).

Step 4: Click “Create” and choose a location to save the project file. The desktop will suffice for now.
Now, a screen like the one in Figure 8.10 will be seen.
Select Google and Bing, leaving the Extensions as they are. The Extensions are the types of file that FOCA is going to look for during its search.

Step 5: Click “Search All.”
Documents should start to populate the screen. The name, size, and type will all be visible.

Step 6: Now the document search has finished, right click on any document, and select “Download all.” This will download all of the documents to your previously configured location (Figure 8.11).

It may take some time to download all of the documents. Go and grab a coffee and check on it in 5 minutes.

Step 7: Now that all of the documents are downloaded, again right click on any document in the search results and choose “Extract All Metadata.”

image
Figure 8.9 FOCA—Setting up a project.
image
Figure 8.10 FOCA file extensions.
image
Figure 8.11 FOCA download documents.

This should result in a nicely organized view of the metadata (Figure 8.12).

image
Figure 8.12 FOCA metadata.

FOCA has extracted users, folder structures, software versions, and operating systems from all of the documents available, and all that was required was to give it a domain. During a real-world assessment, a greater amount of detail will be seen than is shown in this screenshot, assuming the client has not sanitized all documents before publishing them. To this day, the “Passwords” metadata category rarely lights up, but it is often the first thing that is checked.

For example, right clicking on any of the metadata categories on the left and exporting the results to a file can be especially useful for building user lists.

Additionally, right clicking within the documents pane to add a local file or directory for metadata extraction can prove extremely useful.

This has demonstrated some of the uses of FOCA; however, there are open-source alternatives to FOCA. Let’s take a look at one now.

Metagoofil

Metagoofil works in a similar way to FOCA. It starts by searching in Google and then downloads documents from the target website. Metagoofil can then start to strip metadata from the documents and present the results in a report. As with FOCA, Metagoofil is capable of retrieving usernames, software versions, e-mail addresses, and document paths.

Metagoofil can be obtained from http://code.google.com/p/metagoofil/, and has been tested in Linux and OSX. It should also be bundled in BackTrack and Kali Linux.

Here’s an example of the command line switches:

******************************************************
* Metagoofil Ver 2.2            *
* Christian Martorella          *
* Edge-Security.com            *
* cmartorella_at_edge-security.com   *
******************************************************  
 Usage: metagoofil options
    -d: domain to search
    -t: filetype to download (pdf,doc,xls,ppt,odp,ods,docx,xlsx,pptx)
    -l: limit of results to search (default 200)
    -h: work with documents in directory (use "yes" for local analysis)
    -n: limit of files to download
    -o: working directory (location to save downloaded files)
    -f: output file
 Examples:
 metagoofil.py -d apple.com -t doc,pdf -l 200 -n 50 -o applefiles -f results.html
 metagoofil.py -h yes -o applefiles -f results.html (local dir analysis)

It’s all very straightforward really. Give Metagoofil a domain name, tell it the type of documents to look at, and limit the search results and number of file downloads. These limits are going to be defined by the size of the client and the amount of time available. It is best to avoid downloading hundreds of each type of document if there is only a short reconnaissance window. Some clients may only have a handful of documents in any case. Here the tool is run to see what it brings back:

metagoofil -d offensivesite.com -t doc -l 200 -n 50 -o /root/Desktop/metadata/ -f results.html

The -d switch has been used to set the domain, -t to define .doc (Microsoft Word), followed by limiting the search results to 200 with the -l switch. The number of files downloaded per type is 50 as defined by the -n option. Next the choice was made to download the files to the metadata folder on the Desktop. Finally, the results were published to an HTML file with the -f option. Here’s a look at the output from the tool:

******************************************************
* Metagoofil Ver 2.2
* Christian Martorella
* Edge-Security.com
* cmartorella_at_edge-security.com
******************************************************
[-] Starting online search…
[-] Searching for doc files, with a limit of 200
  Searching 100 results…
 Searching 200 results…
Results: 8 files found
Starting to download 20 of them:
----------------------------------------
[1/20] /onoes=en
  [x] Error downloading /onoes=en
[2/20] http://www.offensivesite.com/docs/2323.doc
[3/20] http://www.offensivesite.com/docs/11.doc
[4/20] http://www.offensivesite.com/docs/22.doc
[5/20] http://www.offensivesite.com/docs/123.doc
[6/20] http://www.offensivesite.com/docs/122.doc
[7/20] http://www.offensivesite.com/docs/bob.doc
[8/20] http://www.offensivesite.com/docs/testing.doc
[9/20] http://www.offensivesite.com/docs/lotsometadata.doc
[10/20] http://www.offensivesite.com/docs/doc.doc
[11/20] http://www.offensivesite.com/docs/diary.doc
[12/20] http://www.offensivesite.com/docs/random.doc
[13/20] http://www.offensivesite.com/docs/things.doc
[14/20] http://www.offensivesite.com/docs/morethings.doc
[15/20] http://www.offensivesite.com/docs/manual.doc
[16/20] http://www.offensivesite.com/docs/passwords.doc
[17/20] http://www.offensivesite.com/docs/creditcardnumbers.doc
[18/20] http://www.offensivesite.com/docs/fortknoxdoorcodes.doc
[19/20] http://www.offensivesite.com/docs/safecombination.doc
[20/20] http://www.offensivesite.com/docs/deathstarplans.doc
[+] List of users found:
--------------------------
Edmond Dantès
Jim Seaman
Andrew Gilhooley
Charlotte Howarth
Bryn Bellis
Owen Bellis
Gavin Watson
Andrew Mason
James Pickard
John Martin
[+] List of software found:
-----------------------------
Microsoft Office Word
Microsoft Office Word
Microsoft Office Word OSX
Microsoft Word 10.0
Microsoft Word 9.0
[+] List of paths and servers found:
-------------------------------------
'C:\Documents and Settings\TheEmperor\My Documents\deathstarplans.doc'
'S:\My Documents\creditcardnumbers.doc'
'C:\Documents and Settings\chazzles\Application Data\Microsoft\Word\AutoRecovery save of passwords.doc'
'/Users/jseaman/Documents/safecombination.doc'
[+] List of e-mails found:
----------------------------
Edmond Dantè[email protected]
Jim [email protected]
Andrew [email protected]
Charlotte [email protected]
Bryn [email protected]
Owen [email protected]
Gavin [email protected]
Andrew [email protected]
James [email protected]
John [email protected]

Wow, we really hit the jackpot there. A single command and a few downloads later, a vast amount more data about our target is known. The results were also dropped into a nicely formatted HTML file for us by the -f command switch (Figure 8.13).

image
Figure 8.13 Metagoofil results HTML.

Metagoofil can work with locally downloaded files too. For example if somebody sends a file as a mail attachment, metagoofil can be tasked to strip the metadata from it. Simply use the -h command switch.

mac1:rich$ metagoofil -h yes -o /root/Desktop/metadata/ -f results2.html

This assumes we have put our local documents in the /root/Desktop/metadata directory. The process from here is identical. Metagoofil strips the document metadata and prints it to screen, as well as writing the results2.html file.

Why document metadata is useful to social engineers

I’m guessing at this point, you can see why this is a useful tool for social engineers. A couple of commands and a few downloads later, we have gathered an insane amount of information about our target organization. We know what software the target is using and in a lot of cases which operating system.

We know who creates their documents for publishing. We have a nice collection of e-mail addresses which could be used in a phishing attack or for pretexts where e-mail is the primary means of communication. Given that we know what software and OS they may be using we can also ensure that we choose attacks that are more likely to work given the target environment.

On top of all this, the internal usernames and the naming convention for those users have been harvested. This can be useful when performing a blended assessment or when attempting an attack on login portals or Webmail. It also means that LinkedIn and other social media sites can be farmed, obtaining a list of employees which is then used to build an even bigger list of e-mail addresses for any phishing attack. Not forgetting that each of those documents may well contain the basis for an effective pretext, so check through them religiously.

Photographic metadata

There is just no escaping the fact that every digital file created will contain its own fingerprint. This might be relatively harmless information, like the type of system that created it, or it could potentially be far more sensitive, like the exact geolocation at which the photograph was taken.

The age of the smartphone has ensured that these devices have become the most popular for taking pictures; remember that such devices also have built-in GPS functionality. That means that every photograph that is uploaded directly to Facebook, LinkedIn, or Twitter also has the location data contained within. This information is known as Exif (exchangeable image file format) data.

Here is a look at some tools for Exif data extraction.

Exiftool—http://www.sno.phy.queensu.ca/~phil/exiftool/

Exiftool is a free Exif reader for both Windows and OSX. This example uses the OSX version, but the Windows version works in much the same way.

Exiftool can be used to edit metadata as well as retrieve it, meaning it can be used to sanitize any corporate photographs before publishing them.

Running the application is simple: just tell it which photo to extract data from.

mac1:rich$ exiftool myphoto.jpg

Anyone following this through as they read the book will likely have a screen full of output. A lot of it is surplus to requirements for us as social engineers, but some things stand out (Figure 8.14).

image
Figure 8.14 Exiftool data.

The first screen grab shows some key pieces of information, namely the time the image was created and the type of phone and software in use. The latter will prove far more useful should exploits or potential asset recovery be on the agenda during the engagement; older versions of iOS, as in this example, are more susceptible to data retrieval. The time the image was taken is also useful, mainly to understand whether the image is still relevant to us. It may be that we are trying to identify the location of satellite offices for a business, but we know that they only opened after a specific date.
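Rather than wading through the full output every time, exiftool can be asked for specific tags. A hedged example, assuming the photo carries the standard tags that most smartphone images do:

mac1:rich$ exiftool -CreateDate -Make -Model -Software myphoto.jpg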

Next is the really interesting stuff; the latitude and longitude at which the image was taken (Figure 8.15). In this instance, the full string was 51°30′39.60″N, 0°5′6.60″W.

image
Figure 8.15 Exiftool geo location data.

My mapping tool of choice here is Google Maps, predictably. The string needs a little manipulation so that Google will accept it; all you need to do is remove “deg” from both the latitude and longitude. So, where was I when the photo was taken (Figure 8.16)? As it turns out, just down the road from the Gherkin in London.

image
Figure 8.16 Google Maps location.
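Alternatively, exiftool can be asked to print the coordinates in decimal degrees in the first place, which saves the manual tidying. This is a sketch using the -c (coordinate format) option; the file name is just an example:

mac1:rich$ exiftool -c "%.6f" -GPSPosition myphoto.jpg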

How can the process of image retrieval and metadata extraction be automated? Certainly downloading files individually and stripping them out one by one should be avoided. As it happens, someone, somewhere has tackled the issue already.

Here are some of the options.

Image Picker—a Firefox add-on—https://addons.mozilla.org/en-us/firefox/addon/image-picker/

Image Picker is an add-on for Firefox that will download all images from the page you are browsing. Installation of Firefox add-ons is very straightforward: click the link above, hit install, and then restart Firefox.

When Firefox restarts, there will be a little button that looks like a picture with a download arrow on it. Clicking this button will give the option to download all images within the tab.

This add-on is still only going to be really useful on image-heavy sites, as it will not spider the site looking for images. So, if the targets are Flickr or Picasa accounts, this could be a really useful tool, not to mention very straightforward to use.

Now there are a lot of photos, so Exiftool can be used to process the entire directory. The command is the same, just give it the directory name instead of the image name.

mac1:rich$ exiftool owlpictures

You could grep for the values you want if necessary. The following command will print the GPS position of each photo.

mac1:rich$ exiftool owlpictures | grep 'GPS Position'

Interestingly, Twitter sanitizes all Exif data from images uploaded to it, so this is no longer an avenue of interest for us. This wasn’t always the case, with some high-profile examples highlighting the issues. Roelof Temmingh, founder of Paterva (http://paterva.com/web6/), once gave a talk and demonstration that highlighted who was tweeting from within the confines of the NSA’s parking lot. Using Maltego to gather geolocation data, they were able to highlight potential employees of the NSA in very short order, as well as link rather startling personal data. It is out-of-the-box thinking like this that can really bolster a social engineering engagement, not to mention highlight that maybe we are looking in the wrong places when it comes to security.

Using Wget to download images from a site

Wget is a command line tool that can make HTTP, HTTPS, and FTP connections to a site, mainly for the purpose of automated retrieval of files. It is available for Linux, OSX, and Windows. If either OSX or Linux is being used, it is likely that it is already installed. If not, take a look here to download and install the package—http://www.gnu.org/software/wget/.

Wget can be instructed to spider links on a page and download any images it finds to a specific depth. It is basically going to do what Image Picker does, but on steroids. Remember that scene from “The Social Network” where Mark Zuckerberg needs to download all of the students’ profile pictures from the Harvard house face books? He used wget too.

mac1:rich$ wget -r -l1 -A.jpg www.offensivesite.com
mac1:rich$ exiftool www.offensivesite.com | grep 'GPS Position'

This command will recursively download files from the site and follow a single level of links, downloading images from each. It will then drop them all conveniently in a directory named www.offensivesite.com. Again, Exiftool is pointed at the directory to strip out all of the GPS data.
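On larger jobs, the whole grab-and-strip workflow can be rolled into a couple of commands, with exiftool writing the interesting tags out to a CSV file for later mapping. A sketch, reusing the hypothetical www.offensivesite.com target from above:

mac1:rich$ wget -r -l1 -A .jpg,.jpeg,.png www.offensivesite.com
mac1:rich$ exiftool -csv -GPSLatitude -GPSLongitude -Model www.offensivesite.com > gps.csv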

GeoSetter—http://www.geosetter.de/en/

GeoSetter is a Windows application which strips geo data from images and then builds a map from them. It is extremely simple to use and quickly highlights potential physical locations that could be used in your assessments. It supports the export of data to Google Earth and the editing of geoinformation, should you wish to sanitize an image.

GeoSetter is a GUI-driven application with a simple installer. Simply tell the application where to find the images you want to look at and it will do the rest. This can be achieved by clicking the “Images” menu item and then opening the correct folder. It will take a short while for the images to be imported if you have a large collection, so be patient, and check the progress indicator on the left below your images.

In the interests of science, I uploaded the contents of my iPhone’s photo library and Geomapped them (Figure 8.17).

image
Figure 8.17 GeoSetter in action.

Immediately, it can be identified just from the grouping of image locations where the majority of my time is spent. The iPhone’s owner lives just outside Leeds, Yorkshire, England; Leeds itself is half obscured under a mass of mapping pins.

Selecting all of the relevant images and choosing “Export to Google Earth” from the images menu is also a nice way to visualize the data. The end result is much like the Google map above, but with overlays of each of the images instead of map markers. This enables the swift identification of which images were taken where (Figure 8.18).

image
Figure 8.18 Google Earth output.

Piecing all of the data together is a fairly simple task. A corporate website or employee blog has been targeted and every image has been downloaded using wget. Next, either Exiftool or GeoSetter has been used to map out the locations at which the images were taken. Here are some examples of the kinds of useful intel this exercise can provide to a social engineer:

• Corporate locations and offices

• Data Center locations

• Potential spots at which employees of an organization go to socialize

• Clients and vendors related to the organization

• Corporate device types (iPhone, Android)

• Device names. (The iPhone device name is very often the user’s name too.)

Some organizations keep their facility locations closely guarded, so being able to show them the damage done by not educating their employees regarding Exif data can be a valuable exercise in and of itself.

Identifying potential locations where employees may hang out provides a social engineer with all sorts of opportunities. It could be as simple as swiping an RFID badge, or as complex as coercing information out of employees when they are off their guard.

Twitter was a gold mine for this kind of information, but that has changed now that Exif data is scrubbed on upload. This is a great move by Twitter and hopefully more will follow.

Reverse image search engines

Whilst on the topic of images, a serious look at reverse image search engines should be carried out. These services offer the ability to upload an image file and watch the search engine trace it back to its other locations. Some of these also attempt to match attributes within the image to other photographs stored online, e.g., colors and shapes.

There are plenty, although the most popular are arguably Google’s reverse image search—http://www.google.com/insidesearch/features/images/searchbyimage.html and Tineye—http://www.tineye.com.

Both of the services offer the ability to upload a file or provide a URL for the file. Please note that if an image is uploaded, it is important to ensure that the image rights are not granted to somebody else.

This kind of service is useful to social engineers because it helps to map a single image back to social networking accounts, blogs, Twitter accounts, corporate websites, and personal websites. As an example, a colleague’s LinkedIn photo was fed into Google reverse image search, and it immediately identified their Twitter account within the search results. If a picture of an employee is discovered on a corporate website but there is no idea who it is, this can therefore provide a tremendously useful function. The more online presence that can be linked to an individual, the more chance there is of constructing a believable pretext and retrieving further intelligence.

Not so metadata

There are other kinds of data within documents that cannot be ignored. They aren’t pieces of metadata, nor are they necessarily directly visible within the document, but they could potentially be the most damaging.

First, and most obvious of all, is document content. It isn’t completely unusual to find information relating to internal systems or members of staff, even entire contact lists within files uploaded to an organizational website. This is why it can often pay to scour these documents for interesting information. There are a handful of ways to go about this, some manual, some less so.

Using the built-in Finder search in OSX is one way to go about it. Simply browse to the folder that contains the files, click the magnifying glass in the top right, and search for keywords such as “password” or “system.” Another good idea is to search for the area code portion of a phone number. Finder will then return any matches, which should be investigated further.
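For those who prefer the command line, the same Spotlight index that Finder uses can be queried with mdfind on OSX. A minimal sketch, assuming the downloaded documents sit in a local docs folder and using a Leeds area code purely as an example:

mac1:rich$ mdfind -onlyin ~/Desktop/docs "password"
mac1:rich$ mdfind -onlyin ~/Desktop/docs "0113"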

PDFGrep—http://pdfgrep.sourceforge.net

If there was ever a tool that did what it said on the packet, pdfgrep is it. It can be pointed at either an individual PDF or a directory full of them, and it will search them all for a pattern of your choosing. This works with regular expressions as well as direct string matching.

In order to get the tool compiled in Kali Linux, poppler needs to be installed, which is a toolset for PDF rendering.

root@pentest:/pdfgrep-1.3.0# apt-get install libpoppler-cpp0

Then all that needs to be done is to follow the instructions in the INSTALL file inside the PDFGrep directory.

Briefly, the shell commands `./configure; make; make install’ should configure, build, and install this package.

Next, all that is needed is to issue each of those commands and watch out for any errors that crop up. Run ./configure first and wait for the process to finish without error, then do the same for “make” and “make install.”

Once this process is complete, there should be an executable file called “pdfgrep.” Running it couldn’t be simpler; here is an example of searching through documents for the word “password.”

root@pentest:/pdfgrep-1.3.0# pdfgrep -R password /root/Desktop/docs/
/root/Desktop/docs//email.pdf:Your password at first logon will be “Password1”

Given that the pattern can be a regular expression, the only limitation is your imagination. Strings that contain, start with, or end with certain values can all be searched for.

Even if the regex is not known, there are many great examples that are just a Google search away. Here are some examples of what to look out for within these documents, some more common than others; a couple of sample searches are sketched after the list.

• National Insurance (UK) or Social Security (US) numbers.

• Phone numbers

• E-mail addresses

• Postal codes (which then should lead you to addresses)

• Names (search for titles, Mr, Mrs, Dr, etc.).
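As a hedged sketch of a couple of the patterns above, the following searches the same /root/Desktop/docs/ folder for things that look like e-mail addresses and UK-style phone numbers. The regular expressions are deliberately loose and will need tuning for a real engagement:

root@pentest:/pdfgrep-1.3.0# pdfgrep -R -i '[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}' /root/Desktop/docs/
root@pentest:/pdfgrep-1.3.0# pdfgrep -R '0[0-9]{3,4}[ -]?[0-9]{3}[ -]?[0-9]{3,4}' /root/Desktop/docs/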

Pdfgrep is a great way to quickly find key pieces of data during any reconnaissance work. If Metagoofil or FOCA has been used, each will have already downloaded all of the PDF documents from the corporate website. Now, pdfgrep can be rerun against the folder looking for key words. This is not a replacement for manually reviewing each document, but it can help shave hours off this part of the assessment.

Document obfuscation

While this is certainly one of the more obscure document sanitization issues, it has been seen on several occasions.

The first time it was seen was on a penetration testing gig, but it was quickly realized that it could apply quite nicely to social engineering. The organization in question did not seem to have any real process for document sanitization prior to publication on their website. While the engineer was trawling through the list of documents downloaded by FOCA, they noticed that several of them had been obfuscated with black squares and rectangles. Opening the documents in the Adobe Creative Suite showed that the shapes could be moved away, revealing the sensitive data below. This can definitely be counted as one of the rarer issues, but it is always worth checking those documents through manually.

The Way Back Machine—http://archive.org/web/web.php

The Way Back Machine is an archive of older versions of websites. In some instances, it stretches back years and has regular snapshots of many websites. It can often be useful to check a target’s domain for sensitive information such as contact details and physical locations. While information security is a big deal today, you don’t have to go back too far to realize it wasn’t always this way.

As is clearly evident, the corporate web presence is a great source of information for a social engineer. Some of the things available are more obvious such as contact details, staff hierarchies, and business purpose. These, alongside information relating to clients, vendors, and partners can be used to form effective pretexts. When added to all of the fantastic intelligence that can be harvested using the tools discussed in this section, the engagement is starting to look very healthy. Let’s dive straight into e-mail addresses, how to find them, and their significance in engagements of this type.

E-mail addresses

The importance of acquiring a target’s e-mail addresses during an assessment cannot be overstated, yet they are given away without a thought. They are used to sign up for forums, online shopping accounts, social networks, and even personal blogs. Is the e-mail address a piece of information that should be taken more seriously? Should it be guarded like the crown jewels?

The general experience has always been that a lot of businesses don’t police what their users do with corporate e-mail addresses. Sometimes it might be better to only assign a mailbox to those that really need it.

So why are social engineers so interested in what is seemingly such an unimportant piece of information?

Phishing attacks

Phishing attacks have become very popular in the modern threat landscape. The reason for this might be twofold. First of all, they are incredibly easy to perform, at least to a basic standard. Second of all, people fall for them in their millions every single day. This seems like a winning combination for any would-be scammer.

Including a phishing exercise as part of a social engineering engagement is always worthwhile, especially where the attempt is targeted. That being said, broad-scope phishing attacks also have their place; it just boils down to timescales. Phishing attacks, and how to perform them, will be covered in greater detail in Chapter 9.

Password attacks

As already mentioned, a lot of engagements include elements of penetration testing as well as social engineering. For example, gathering e-mail addresses can allow an attack using Outlook Web Access (OWA). The e-mail address can also be broken down into different permutations to guess internal naming conventions for users. This information can then be used to attack VPN portals.

Insider knowledge

Gathering an e-mail address and then finding out where it is used on the Internet can lead to further useful information. This can lead to a pretext all of its own. Calling into an organization pretending to be a user who cannot log into his mailbox may seem cliché, but it has a long history of success. Just knowing the URL of an organization’s OWA or VPN device, along with the username, can create enough plausibility for a successful attack. How to find an organization’s assets will be covered in greater detail later in the chapter, when DNS enumeration techniques are looked at.

E-mail address conventions

Although briefly mentioned when looking at password attacks, one corporate e-mail address can be a real foot in the door.

Most organizations try to publish only generic mailboxes such as [email protected]. This makes it difficult for a social engineer to run any sort of phishing scam. If a single real user’s address can be obtained, all of that changes. A list of employees of the target business can now be gathered using LinkedIn (www.linkedin.com). This intel can be used to create a much larger list of potential target e-mail addresses, with a reasonable level of certainty that they will exist.

Now that several good reasons for wanting to harvest corporate e-mail addresses have been covered, here’s an overview of how this can be achieved.

theharvester—https://code.google.com/p/theharvester/

theharvester actually performs a lot more than just e-mail address retrieval; it also finds subdomains, employee names, hosts, and open ports, to name a few. The intention of the tool was to provide a platform for intelligence gathering during penetration testing. The information it returns is still of use to social engineers, probably more so than penetration testers strictly speaking.

The tool is included in BackTrack and Kali Linux by default.

theharvester is a command line tool, but is extremely straightforward to use as the examples from the documentation show:

Examples:
./theharvester.py -d microsoft.com -l 500 -b google
./theharvester.py -d microsoft.com -b pgp
./theharvester.py -d microsoft -l 200 -b linkedin

The -d command switch specifies your target domain or organization.

The -b command switch is the search mechanism you would like to employ, be that “Google,” “Bing,” or “all”.

The -l command switch limits the number of results you will retrieve.

Let’s run theharvester against an actual domain and see what it brings back.

root@pentest:~# theharvester -d syngress.com -b all
Full harvest.
[+] Emails found:
------------------
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]

The output has been reduced for the sake of brevity, but theharvester has very quickly and efficiently identified 12 e-mail addresses that can be used in an engagement, all by harvesting search engines. It is important to note that a domain name needn’t be provided; just passing the company name to theharvester is sufficient. This may lead to some inaccuracy in the results, so do be careful before using them during an attack. It can also lead to results for the other top-level domains (TLDs), such as .com, .co.uk, and .org, which may otherwise have been missed. The general finding is that specifying the full domain returns the most usable results; therefore, it is recommended that each of the TLDs is manually cycled through.

As with Metagoofil, which was covered earlier, an output of the results to an HTML file is possible for easier viewing (Figure 8.19).

root@pentest:~# theharvester -d syngress.com -b all -f results.html

image
Figure 8.19 theharvester results.
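Since the -f switch writes each run to an HTML report, cycling through the TLDs recommended above needn’t be a manual chore. A small shell loop is one way to sketch it (the TLD list is just an example):

for tld in com co.uk org net; do
  theharvester -d syngress.$tld -b all -f results_$tld.html
done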

Having established the naming convention for the target domain’s e-mail addresses, LinkedIn can be harvested to create more. Again, theharvester provides the functionality.

root@pentest:/# theharvester -d syngress -b linkedin
[-] Searching in Linkedin.
  Searching 100 results.
Users from Linkedin:
=================
Amy Pedersen
Larry Pesce
Shawn Tooley
Vitaly Osipov
Elsevier
Becky Pinkard
Vitaly Osipov
Eli Faskha
Gilbert Verdian
Alberto Revelli
Raj Samani
Cherie Amon
Amy Pedersen
David Harley CITP FBCS CISSP
Chris Gatford
Arno Theron
Lawrence Pingree
Christopher Lathem
Craig Edwards
Justin Clarke
Byungho Min

Obviously this list needs a little manipulation to get it into a usable format, but the attack surface is growing by the minute.
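As a rough sketch of that manipulation, assuming the names have been saved one per line to a file such as linkedin_names.txt (a hypothetical filename) and that a firstname.lastname convention has been confirmed for the target domain, candidate addresses can be generated from the shell:

root@pentest:~# awk '{print tolower($1)"."tolower($NF)"@syngress.com"}' linkedin_names.txt | sort -u

Entries that carry extra words, such as a bare company name or qualifications after a surname, will still need weeding out by hand, which is exactly the manual tidying referred to above.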

theharvester can discover more than we will cover at this point, and I do encourage you to try it for yourself. All of the functionality is very self-explanatory and accessible.

FOCA

FOCA has already been touched upon when looking specifically at document metadata, so this will be kept brief. It was seen there that e-mail addresses could be retrieved from within documents and exported to a text file so that the output of all tools could be combined. It is good policy to use several tools to gather any piece of intelligence, to ensure complete coverage. Good social engineers are always on the lookout for new ways to harvest open-source intelligence, so the output of each tool should be manipulated and built into a master list of intelligence.
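As a minimal sketch of building that master list, assuming each tool's e-mail output has already been dumped to its own text file (the filenames here are hypothetical), standard shell tools will do the merging:

root@pentest:~# cat foca_emails.txt harvester_emails.txt metagoofil_emails.txt | tr 'A-Z' 'a-z' | sort -u > master_emails.txt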

Metagoofil

As with FOCA, the basics of using Metagoofil have already been covered. Metagoofil will strip the e-mail addresses out of document metadata and print them to screen in a usable format. As always, it is highly recommended that intelligence is gathered from as many different sources as possible and stored in a master list. As a quick refresher, the command should look something like this:

metagoofil -d syngress.com -t doc,pdf -l 200 -n 50 -o /root/Desktop/metadata/ -f results.html

Don't forget that more than just .doc and .pdf can be specified. It is always worth checking for other document types, as you never know what interesting pieces of information may be discovered.
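For example, the -t list can be widened to take in the other common Office formats. Treat this as a sketch and tune the list to whatever the target actually publishes:

metagoofil -d syngress.com -t doc,docx,pdf,xls,xlsx,ppt,pptx -l 200 -n 50 -o /root/Desktop/metadata/ -f results.html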

Whois

Whois records often have administrative, technical, and registrant contacts attached. Each of these can contain e-mail addresses to add to our list. Whois queries can be run from a Linux command line with ease:

root@pentest:/# whois microsoft.com
Registrant:
    Domain Administrator
    Microsoft Corporation
    One Microsoft Way
    Redmond WA 98052
    US
    [email protected] +1.4258828080 Fax: +1.4259367329

The output has been shortened for brevity, but it provides an idea of the final product. It is quite common for these records to be removed or sanitized to avoid information disclosure, but they should still be checked during each engagement.

Sam Spade

Sam Spade is a free Windows utility that can be used for a multitude of reconnaissance exercises. It may not be the latest and greatest, having been around for about as long as I can remember, but it still has some nice features that can be added to any reconnaissance effort.

Unfortunately, the official website http://www.samspade.org is down at the time of writing, with no sign of being reinstated. Luckily, there are plenty of places around the web still hosting the installer; try the following link in the meantime:

http://www.majorgeeks.com/files/details/sam_spade.html.

Sam Spade is more of a toolkit than a one-trick pony. It covers everything from DNS enumeration to website crawling, and for this exercise it is the latter that is of interest.

Several options are available when crawling a website, one of which is to search for e-mail addresses and present them in the output. It can also mirror the site to a local directory for further investigation, or even clone sites for use in phishing attacks (Figure 8.20).

image
Figure 8.20 Sam Spade.

Because Sam Spade has been around so long, most people don’t think it has a lot of relevance any more, but it is another useful tool in the box. Sure there are other ways to get the same functionality, but having options is always useful.

Jigsaw

Recently acquired by Salesforce, Jigsaw is a contact management site that was initially crowdsourced. While full access to the site is not free, the level of information available is vast.

A search for Syngress brings back 341 contact records, which include names and positions. Drilling down into each record reveals individual e-mail addresses and phone numbers. Drilling down costs points, which can be paid for, but in reality only one e-mail address is needed to establish the convention and build a list from it. Given that some credits are available for free, it is possible to play around and get the e-mail addresses needed. Viewing the names of each employee is not only free, but they are presented in a list for easy manipulation (Figure 8.21).

image
Figure 8.21 Jigsaw results.

The format above will literally copy and paste straight into Excel or a tool of choice for manipulation elsewhere—cat, sed, awk, etc. So if we take the first contact on the list and drill down, an e-mail address format can be grabbed (Figure 8.22).

image
Figure 8.22 Jigsaw results.

Now that we know the convention is firstname.lastname, we can build an e-mail address list for further attacks. I copied the list above into Excel, then copied the name column out into a text file called “emailsort.txt” for further editing. “awk” is then used from the command line to chop things around a little.

ssclownboat$ awk -F, '{print $2"."$1}' emailsort.txt | sed 's/$/@elsevier.com/'
[email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 [email protected]
 Trygve.Anderson R. [email protected]
 [email protected]
 [email protected]
 [email protected]
…Output truncated for brevity…

What I did here is read the file, which contained users in lastname,firstname format, and swapped those values around with awk so that the first name came first. I then used “sed” to add “@elsevier.com” to the end of each line. If you suspect the list contains duplicates, you could go further and sort for unique entries with “sort -u.” You could achieve the same results in Excel by importing the lastname,firstname text file using a comma as the delimiter, swapping the fields around, and using concatenate to add in the @elsevier.com.
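As a small variation on the same pipeline, the names can also be lowercased and deduplicated in one pass; the tolower() calls and the sort -u are additions to the command shown above:

ssclownboat$ awk -F, '{print tolower($2)"."tolower($1)}' emailsort.txt | sed 's/$/@elsevier.com/' | sort -u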

Recon-ng—https://bitbucket.org/LaNMaSteR53/recon-ng—also includes a handful of Jigsaw modules

The “recon/contacts/gather/http/web/jigsaw” module only needs to be given a company name to work its magic. It will pull each contact record and add them to its database for your use.

recon-ng > use recon/contacts/gather/http/web/jigsaw
recon-ng [jigsaw] > #
recon-ng [jigsaw] > set company syngress
COMPANY => syngress
recon-ng [jigsaw] > run
[*] Gathering Company IDs…
[*] Query: http://www.jigsaw.com/FreeTextSearchCompany.xhtml?opCode=search&freeText=syngress
[*] Unique Company Match Found: 4604397
[*] Gathering Contact IDs for Company ‘4604397’…
[*] Query: http://www.jigsaw.com/SearchContact.xhtml?rpage=1&opCode=showCompDir&companyId=4604397
[*] Fetching BotMitigationCookie…
[*] Query: http://www.jigsaw.com/SearchContact.xhtml?rpage=1&opCode=showCompDir&companyId=4604397
[*] Gathering Contacts…
[*] [44073477] Cathy Boyer - Sales and Marketing (Saint Louis, MO - United States)
[*] [44089692] Steve Mackie - Sales and Marketing (Everett, WA - United States)
[*] [44164766] Ben Cox - Manager Global Infrastructure Development (Kidlington - United Kingdom)
[*] [44289059] Ian Hagues - Delta BI Analyst (Kidlington - United Kingdom)
[*] [45455694] Daniela D Georgescu - Executive Publisher (New York, NY - United States)

The contacts can be copied from the database and you can start building a list of potential e-mail addresses from it based on the earlier examples.

As has been seen in this section, there are a multitude of techniques that can be used to gather or guess e-mail addresses. Next comes social media and how it can be leveraged in social engineering engagements.

Social media

Social networking sites have always been a gold mine of information for social engineers. People upload their entire lives to sites such as Facebook without giving a second thought to their privacy.

According to Mark Zuckerberg, founder of Facebook, the age of privacy is over. But of course he would say that. As far as he is concerned, the more information he has about you, the better. Why? Because it enables Facebook to commoditize you more efficiently. They can deliver adverts in a more targeted fashion and ensure that click through rates are as high as possible. This is of course making the assumption that targeted advertising is the limit of Facebook’s intentions.

The more social networking has become a part of everyday life, the more it has come under scrutiny in the news media. Privacy has been an on-going concern with major changes to the way that social networks operate to account for this. It has been interesting to see the impact of these mainstream news articles. The vast majority of people who use sites such as Facebook are far more aware of privacy issues now than they ever were. This proves that if awareness exercises are delivered in the right way, and are relevant to the recipient, they can be very effective indeed.

Given that social network sites have tightened up their game, can social engineers still exploit them? To answer this, we will look at some of the more popular social media sites and identify any useful intelligence.

LinkedIn

LinkedIn is basically the corporate world’s equivalent of Facebook. It is an online networking application that allows people to connect with others throughout various industries. Most people reading this will already have an account and some sort of profile on the site.

The bulk of the useful information needed relates to profiles. A LinkedIn profile is in essence an online CV. People add their employment history, skills and specializations, photograph, and references. It is also possible to endorse people for their specific skill set.

Basically, LinkedIn is a gold mine of information for anyone trying to track somebody or somebody’s skill set down. For this reason, LinkedIn is exceptionally popular in sales and recruitment environments. It is this level of employee information that makes LinkedIn priceless to social engineers.

What is often found is that an organization will manage their own group on LinkedIn. What is great about this functionality is that it provides us with a ready-made list of employees with their entire history. If a social engineer is seeking to impersonate a member of staff, this level of intelligence is going to prove useful. As already touched upon earlier, it can also be useful for building lists of e-mail addresses, assuming that the convention has already been discovered for the addresses. In other words, it expands the scope or attack surface.

It is worth noting at this point that LinkedIn is not entirely free, and it is not anonymous either: when you view people's profiles, they may be able to see that you have done so. For this reason, it is beneficial to sign up for the pro version, avoid using your individual profile, and change the “What others see when you view their profile” option in the settings control panel, selecting the radio button for “You will be totally anonymous.”

There is one limitation of the free version of LinkedIn that can be bypassed in any case. Searching for an organization name will reveal a list of employees, but it doesn't reveal the full name of each person. Typically, the first page of results will show full names, and the rest will show only the first name and last initial.

Syngress was searched for as a business, and then Elsevier, its parent organization, was selected. On the first page of results the name “Steve.E” was discovered, and clicking on the profile simply states “upgrade for full name.” The preferred option would be to get it for free. Simply select everything to the right of the picture, from the job title down, and copy and paste it into Google (Figure 8.23).

image
Figure 8.23 LinkedIn reconnaissance.

You might not expect this to work, but the second hit was exactly what was needed. Clicking it reveals the full name of the employee, who can then be added to the list of targets or used as a name drop during a call (Figure 8.24).

image
Figure 8.24 LinkedIn details revealed.

It was actually one of our sales guys that showed me this hack, and I was blown away not only by its simplicity, but again by the devious nature of our sales team.

We already covered a tool that can harvest information from LinkedIn, the immeasurably useful “theharvester”.

theharvester -d elsevier -l 500 -b linkedin

The output will need to be manually verified before it can be used in an engagement, as theharvester will pick up past employees of an organization as well as current ones.

Recon-ng—https://bitbucket.org/LaNMaSteR53/recon-ng

Recon-ng is a reconnaissance framework for social engineers and penetration testers. It ships with Kali Linux, so you can launch it by typing “recon-ng” from a command shell. I would recommend getting the latest version with “git clone https://bitbucket.org/LaNMaSteR53/recon-ng.git” and then running ./recon-ng.py to get up and running. If you are familiar with Metasploit, then this will feel familiar; navigation is very similar, as is the setup of each module.

The linkedin_auth module relies on both a user account on LinkedIn and an API key to harvest contact details. Once you have configured this you simply set the target company name and run the module (Figure 8.25).

image
Figure 8.25 Recon-ng output.

Setting this up is quite simple. Sign up for a developer API key, then assign the key to Recon-ng with the “keys add” command. Recon-ng will guide you through the rest of the setup.
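As a loose sketch only, the flow mirrors the Jigsaw and Twitter sessions shown elsewhere in this chapter. The module path and the linkedin_api key name below are assumptions based on that naming pattern rather than confirmed output, so check “show modules” and “keys list” in your own copy before relying on them:

recon-ng > keys add linkedin_api myapikeygoeshere
recon-ng > use recon/contacts/gather/http/api/linkedin_auth
recon-ng [linkedin_auth] > set COMPANY syngress
recon-ng [linkedin_auth] > run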

There are more long-game approaches when it comes to the use of LinkedIn. For example, setting up fake profiles and networking with the target employees. While this angle can yield results, it is beyond the scope of this section.

Facebook

Facebook is, without a doubt, the most popular social networking site in existence; however, this is mainly from a personal usage standpoint. While there are corporate presences on Facebook, they are certainly more limited in scope than on LinkedIn.

There are several areas of potential information gathering that can be exploited. First of all, the techniques documented earlier can be used to harvest any images from the corporate profile. In this instance it's not a case of looking for GPS location data, as Facebook strips this upon upload, but of looking for photographs of locations, office interiors, or employees.

Another avenue of approach is to look at the comments on any posts and see whether the commenting accounts can be linked to employees of the target.

One of the more powerful tools available to a social engineer is Graph Search. This Facebook functionality allows search terms to be crafted relating to almost anything a Facebook user has touched. The possibilities are nearly endless, but they obviously require the target(s) to have profiles that allow some level of access. Let's take a quick look at some useful terms and see what the fallout is.

“People who like Microsoft”—An interesting search term that could identify potential partners, employees, or investors. At the very least this can be classed as useful information, but with some manual investigation it could be a lot more. Try it with your employer and see if the results are employees; suffice it to say that a good proportion of them will be. Now start digging through profiles and looking for useful information. It is recommended to start with the “places” functionality. When a user checks in or adds a location to a post, this is where it shows up. It is in essence a map of employees' whereabouts and is likely to contain corporate locations.

“People who work at Microsoft”—Now things really start to get juicy. A full list of people who claim to work for Microsoft can be seen, including job titles. Each of these profiles potentially contains sensitive information that could be used in a social engineering engagement.

This can also be modified to look at potential ex-employees with “people who worked at Microsoft in 2012.”

“Photos taken in Redmond, Washington of Microsoft”—If a rough physical location is known, this search can turn up fantastic results. You could go from knowing nothing about a target to having a rough layout of the grounds as well as an idea of physical security.

“Photos of people who work at Microsoft taken in Redmond, Washington”—This would be a useful search term when looking for photos of employees that may contain ID badges. Again, the results may also contain useful information on physical security.

“Places in Washington, District of Columbia Microsoft employees visited”—This is an interesting search term when looking for employee hangouts. The results can be narrowed down further to the exact employees who went to those locations: in the results, click the “Microsoft Employees were here” button to get a list of people. This will automatically populate the search term, for example, “Microsoft employees who visited 9:30 Club.”

Of course, access can be restricted through the use of the privacy settings in an account. A lot of people may not know that this is even possible, let alone have taken the time to lock it down. It is akin to having a Google search engine for people and their lives.

Recon-ng had a module for harvesting sensitive data from Graph Search via the API, but the functionality appears to have been removed. I predict a slew of reconnaissance tools on the horizon for this functionality.

Let’s have a brief look at Twitter to round up the social networking sites.

Twitter

Twitter is another of the more popular social media sites in existence. Unlike Facebook or LinkedIn, the concept of a profile within Twitter is very limited. You can add a little detail about yourself, but nowhere close to the level of LinkedIn or Facebook. This narrows our attack surface to an extent, but Twitter can still sometimes be a useful source of intelligence.

Twitter is another social media provider that is constantly in the public eye, and as a result it has gradually tightened up security. As an example, Exif data is now scrubbed upon upload, so we can no longer harvest location data from a user's photos.

Even though things have been locked down to an extent, there are still some interesting angles to pursue.

Recon-ng

Recon-ng steps up to the plate again with a useful reconnaissance module. The Twitter contact gathering module searches for users who have mentioned, or been mentioned by, the handle you provide. This can help to map out potential colleagues for use down the line.

You will need to sign up for a Twitter API key at https://dev.twitter.com. Possessing a Twitter account allows you to sign in with it, but there is still a requirement to fill out a form stating the intention of the application and one or two other details.

Launching recon-ng is as simple as typing it into a command shell and waiting for the console to pop up. Then just choose the module with the “use” command.

recon-ng > use recon/contacts/gather/http/api/twitter
recon-ng [twitter] > show options
 Name Current Value Req Description
 ---- ------------- --- -----------
 DTG  no date-time group in the form YYYY-MM-DD
 HANDLE  yes target twitter handle

At this point, you can start configuring your api keys with the “keys add” command. You will configure both the twitter_api and twitter_secret keys. These are actually called the “consumer key” and the “consumer secret” in the Twitter control panel.

recon-ng [twitter] > keys add twitter_api myconsumerkeygoeshere
[*] Key 'twitter_api' added.
recon-ng [twitter] > keys add twitter_secret myconsumersecretgoeshere
[*] Key 'twitter_secret' added.

Next, just set the handle of choice and run the module:

recon-ng [twitter] > set HANDLE @David_Cameron
HANDLE => @David_Cameron
recon-ng [twitter] > run
[*] Searching for users mentioned by the given handle.
[*] Searching for users who mentioned the given handle.
 +---------------------------------------+
 |  Handle  |  Name  |  Time  |
 +---------------------------------------+
 | StopLeseMajeste | Emilio Esteban | Mon Aug 26 16:23:20 +0000 2013 |
 | AuthorSaraKhan | Sara Khan   | Mon Aug 26 16:23:00 +0000 2013 |
 | HIGHtenedStoner | UK4legalWeeD  | Mon Aug 26 16:08:12 +0000 2013 |

The tool should enable the narrowing down of potential colleagues without having to manually dig through the target’s Twitter history. Once correlated with data from LinkedIn and Facebook, the accuracy of the intelligence should increase exponentially.

DNS records

Enumerating DNS records can lead to some interesting finds when it comes to social engineering and penetration testing. For example, knowing the location of an organization's webmail service, along with a username and e-mail address, can immediately grant you credibility if you call the help desk. Most first-tier help desk operators aren't going to know about subdomain brute forcing and will assume that if you know the URL, you are an employee. I have been in situations where I have successfully had an employee's domain password reset over the phone with only these pieces of information.

From a penetration testing point of view, this information can be used to identify key assets that could be leveraged during password attacks. As I have already noted, a lot of engagements include elements of both social engineering and penetration testing.

So, given that we have covered how to get the e-mail addresses of employees, how do we go about using DNS in an engagement?

Dnsrecon—https://github.com/darkoperator/dnsrecon—Twitter—@Carlos_Perez

Dnsrecon is my go-to tool when it comes to DNS reconnaissance. It is easy to use, fast, and flexible. It ships with Kali and BackTrack, so fire up a system and test it as we walk through the motions.

Here’s how the standard output looks:

root@pentest:~# dnsrecon -d syngress.com
[*] Performing General Enumeration of Domain: syngress.com
[!] Wildcard resolution is enabled on this domain
[!] It is resolving to 92.242.132.15
[!] All queries will resolve to this address!!
[-] DNSSEC is not configured for syngress.com
[*]      SOA ns.elsevier.co.uk 193.131.222.35
[*]      NS ns0-s.dns.pipex.net 158.43.129.83
[*]      NS ns0-s.dns.pipex.net 2001:600:1c0:e000::35:2a
[*]      NS ns.elsevier.co.uk 193.131.222.35
[*]      NS ns1-s.dns.pipex.net 158.43.193.83
[*]      NS ns1-s.dns.pipex.net 2001:600:1c0:e001::35:2a
[*]      MX syngress.com.inbound10.mxlogic.net 208.65.144.3
[*]      MX syngress.com.inbound10.mxlogic.net 208.65.145.2
[*]      MX syngress.com.inbound10.mxlogic.net 208.65.145.3
[*]      MX syngress.com.inbound10.mxlogic.net 208.65.144.2
[*]      MX syngress.com.inbound10.mxlogicmx.net 208.65.145.2
[*]      MX syngress.com.inbound10.mxlogicmx.net 208.65.144.2
[*]      A syngress.com 50.87.186.171
[*] Enumerating SRV Records
[-] No SRV Records Found for syngress.com
[*] 0 Records Found

Basically, the Start of Authority (SOA) record has been enumerated and the name server (NS) and mail exchange (MX) records recovered. This information could be useful in some scenarios, but there is nothing particularly exciting here.

Subdomain brute forcing

“Brute forcing” is a bit of a misnomer here, because we are going to use a list, or dictionary, but the principle still stands. Dnsrecon is instructed to use a list of possible subdomains and attempt a name lookup against each. Any that resolve successfully are printed to screen, granting insight into the target's public-facing footprint. Due to the speed at which these names can be resolved, it is possible to get through sizeable lists in very little time. While it is unlikely that any disruption will be caused to a DNS server, it is always worth bearing in mind the impact an attack may have on a system.

Thankfully, dnsrecon ships with a standard name list that can be used to get started. Here’s how the command looks to start with:

root@pentest:~# dnsrecon -d apple.com -t brt -D /usr/share/dnsrecon/namelist.txt
[*] Performing host and subdomain brute force against apple.com
[*]      CNAME access.apple.com www.access.apple.com
[*]      A www.access.apple.com 17.254.3.40
[*]      CNAME apple.apple.com apple.com
[*]      A apple.com 17.172.224.47
[*]      A apple.com 17.149.160.49
[*]      A apple.com 17.178.96.59
[*]      A asia.apple.com 17.172.224.30
[*]      A asia.apple.com 17.149.160.30
[*]      A asia.apple.com 17.83.137.5
[*]      A au.apple.com 17.254.20.46
[*]      A b2b.apple.com 17.254.2.97
[*]      A bz.apple.com 17.151.62.52
[*]      A bz.apple.com 17.151.62.54
[*]      A bz.apple.com 17.151.62.53

A great number of confirmed subdomains come back in the output. The command structure is straightforward: -d defines the domain to look at, -t specifies the type, in this case brt or “brute force,” and -D supplies the list of potential subdomains, here the standard name list that ships with dnsrecon in Kali. There were a lot more results than those listed, too. Try experimenting with different domains to see what can be found.

There are some wordlists and alternative methods that are definitely worth exploring. First of all, Ryan Dewhurst A.K.A @ethicalhack3r did some research on the topic. By leveraging the Alexa top 1 million, and attempting a zone transfer against each domain, he was able to get a 6% success rate. This obviously yielded massive amounts of data as far as plain text lists go. Ryan kindly split that data up into more usable files for us to use. You can check out the full post here: http://www.ethicalhack3r.co.uk/zone-transfers-on-the-alexa-top-1-million-part-2/#more-17123.

If Ryan's “subdomains top 5000” list is used, are better results seen? Over a broad range of targets this is highly likely, as these are real-world subdomains drawn from a wide scope.

As a direct comparison, dnsrecon was run against apple.com with both the standard “namelist.txt” and Ryan's “subdomains-top1mil-5000.txt.” The namelist.txt file returned 194 records; the “subdomains-top1mil-5000.txt” file returned 408. Certainly food for thought in any case.
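The comparison run is identical to the earlier command, just with the alternative list handed to -D. The path below is illustrative, so point it at wherever you saved Ryan's file:

root@pentest:~# dnsrecon -d apple.com -t brt -D /root/wordlists/subdomains-top1mil-5000.txt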

What if we wanted to be more targeted with our attempts though? What can we do if we want to generate a list that is specific to the business we are targeting?

CeWL—http://www.digininja.org/projects/cewl.php

Robin Wood's CeWL project is a Ruby application that will spider a target website and build a wordlist based on that site's content. While the vast majority of use cases for the tool involve building password lists, good results have also been observed when using it as a subdomain list generator. In essence, it will provide about as targeted a list as we could hope for: every single word in the list comes directly from the target's website.

Here’s how the command looks:

root@pentest:/home/cewl# ruby cewl.rb --help

The help file is always the best place to start. I'll let you play around with the options; all we need for now is as follows:

root@pentest:/home/cewl# ruby cewl.rb --depth 1 www.apple.com

Pay attention to the depth option. As touched upon when talking about spidering, we don't want to go and crawl every link on a gigantic site. It will take forever, and if you are mid-engagement, you may even set alarm bells ringing.

CeWL will print the output to screen by default, but you could pipe it to a file if needed.

The CeWL-generated wordlist returned 171 records overall. While it doesn't return anywhere near as many results as the 5000-entry file, it will often pick out the handful that would otherwise have been missed because they are very business specific. This is why it is always a good idea to use both methods to ensure good coverage. Our sorted and trimmed subdomain list comes to 282 entries in total. Try these methods against your own domain and see how few are missed.
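A minimal sketch of building that combined list and feeding it back into dnsrecon, assuming the CeWL output has been piped to a file and the other lists sit in the paths used earlier (all filenames here are illustrative):

root@pentest:~# ruby cewl.rb --depth 1 www.apple.com > cewl_words.txt
root@pentest:~# cat /usr/share/dnsrecon/namelist.txt subdomains-top1mil-5000.txt cewl_words.txt | tr 'A-Z' 'a-z' | sort -u > subdomains_combined.txt
root@pentest:~# dnsrecon -d apple.com -t brt -D subdomains_combined.txt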

Whois records

We touched briefly upon Whois records when talking about e-mail address harvesting. As well as e-mail addresses, they also provide us with other useful intelligence.

First of all, a Whois lookup can identify the address space assigned to an organization. Start by performing an nslookup against the target website, then perform a Whois against the IP address that comes back.

root@pentest:~# nslookup www.apple.com
Server: 172.16.55.2
Address: 172.16.55.2#53
Non-authoritative answer:
www.apple.com canonical name=www.isg-apple.com.akadns.net.
www.isg-apple.com.akadns.net canonical name=www.apple.com.edgekey.net.
www.apple.com.edgekey.net canonical name=e3191.dscc.akamaiedge.net.
Name:    e3191.dscc.akamaiedge.net
Address: 95.100.205.15
root@pentest:~# whois 95.100.205.15
inetnum:   95.100.192.0 - 95.100.207.255
netname:   AKAMAI-PA
descr:    Akamai Technologies
country:   EU
admin-c:   NARA1-RIPE
tech-c:   NARA1-RIPE
status:   ASSIGNED PA
mnt-by:   AKAM1-RIPE-MNT
mnt-routes: AKAM1-RIPE-MNT
source:   RIPE # Filtered

In this case, the address space belongs to Akamai, but that won't always be true on your assessments. It is always worth checking whether the target has its own registered address space, and then performing reconnaissance against that. Between this and your earlier DNS reconnaissance, you should have a good picture of their public-facing resources.

Whois records also contain physical addresses, e-mail addresses, and phone numbers. One of our sales guys was recently trying to get through to the HQ of an organization, but the call center staff would not provide the number. A quick Whois later and we had the first number in the direct-dial pool for the HQ; somebody answered, and we spoke to the person we needed. This is not only useful in itself, but also evidence of the kind of information that nontechnical people assume is not publicly available. Calling in with what is deemed to be privileged information adds weight to your claims.

Another piece of information that comes up a lot can also be leveraged by a social engineer. When you run the Whois and get back a hosting company, you already have a potential pretext. You could go with “Hi, it's Rob from XYZ hosting, we noticed a red light on one of the servers in your cab earlier, have you noticed any issues?” At this point you can get a feel for how on-guard the individual is and wait to see what the response is; they may at least confirm that their systems are indeed hosted there, even if only by saying they hadn't noticed a problem. You could then go down the route of offering to KVM onto the system to figure out what is going on, and you may end up being given credentials that will work elsewhere. Whatever pitch you choose, you can at least make the call with a usable pretext.

An alternative would be to register an e-mail domain that is similar to the hosting company, and send a planned outage e-mail for a business critical time of the day. The reaction to this e-mail is likely to be panic, and you may just catch someone off-guard who will click your malicious link. What your e-mail contains will be covered in our chapter on e-mail attacks. As a starter though, I would say a cloned customer portal for the hosting company would be a safe bet.

It is surprising what can be gleaned from a simple Whois. In many cases, organizations will have the records sanitized to avoid abuse.

Making use of the intel

We have covered the manipulation of data as we have moved through this chapter. In essence, there is data we will make use of indirectly and data we will use directly. By indirectly, I mean calling up a help desk and name dropping a VPN portal's URL or a user's e-mail address. By directly, I'm thinking more along the lines of straight password attacks against the target's assets. To perform attacks of this nature, we need to build user lists from the data we have harvested.

Let's take the earlier Jigsaw module in Recon-ng as an example. We used the “recon/contacts/gather/http/web/jigsaw” module to gather contacts for a business. Once the module has finished, we can view the results by entering “show contacts.” We can also drop the results into a comma-separated values (CSV) file by entering “use reporting/csv_file” and then typing “run.” The output will look a little like this:

"Walt","Christensen","","Vice President Shared Services","Maryland Heights, MO","United States"
"Wendy","Bibby","","General Manager","New York, NY","United States"
"Wendy","McMullen","","Senior Marketing Manager","Philadelphia, PA","United States"
"Wendy","Shiou","","Manager Planning and Analysis and Finance","New York, NY","United States"
"Wesley","Stark","","Director, Software Engineering","New York, NY","United States"
"Willem","Wijnen","","Test Engineer","New York, NY","United States"
"William","Schmitt","","Executive Publisher","New York, NY","United States"

As with the last Jigsaw example, my route now would be to try to establish the actual e-mail address convention, typically by looking on the corporate website or checking Whois records. Let's say, for the sake of argument, that the target uses firstname.lastname@offensivesite.com. We can manipulate the above CSV file to build that list for us.

root@pentest:~/Desktop# sed 's/"//g' contacts.txt | awk -F, '{print $1"."$2"@offensivesite.com"}'
Walt.Christensen@offensivesite.com
Wendy.Bibby@offensivesite.com
Wendy.McMullen@offensivesite.com
Wendy.Shiou@offensivesite.com
Wesley.Stark@offensivesite.com
Willem.Wijnen@offensivesite.com
William.Schmitt@offensivesite.com

Obviously this is of most benefit where the lists are much larger, as they were in this instance; for sanity's sake the output has been kept brief. The quotes are stripped out first, then the first and second fields (first name and last name) are printed based on a comma delimiter, and finally the @companyname.com is appended to the end of each line. There are more elegant ways to go about this, but this is quick and it works, so it is a case of experimenting to find the approach that suits you best. Microsoft users may prefer to use Excel to manipulate data in this fashion, but even for a Windows guy or girl, it's definitely worth having a Linux VM on hand to perform data manipulation.

Try combining the above list with the e-mail addresses gathered during the entire reconnaissance exercise. This combined list can be split into groups for targeted and broad-scope phishing attacks, as well as being used for password attacks against public-facing portals. All of this really boils down to the scope that has been agreed. As already mentioned, hybrid assessments include elements of penetration testing and social engineering, so password attacks against OWA are not beyond the realms of possibility.

The Metasploit Framework ships with a module for password attacks against OWA. The “auxiliary/scanner/http/owa_login” module only needs to be configured with the following details to work:

RHOST—The target IP Address of the OWA system.

RPORT—The port that OWA is listening on, which is typically 443.

USER_FILE—The output we have created above. Often “firstname.lastname” will work as well as the full e-mail address.

PASSWORD—Start with the basics. Password1, password, password1. If a lot of contacts have been retrieved, the chances of finding one of these are high. Don’t forget to try the organization’s name too.

When the module is run, it will try the specified password against every user that has been harvested, displaying any successes or failures. Be aware that there is a real likelihood of locking out accounts, and unless someone can be convinced to give out the lockout policy over the phone, you will be flying blind. Err on the side of caution and keep the attempts to a couple every 30 minutes or so; this should stay below the most common lockout policy of three attempts with a 30-minute reset timer.
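A minimal sketch of how that configuration might look in msfconsole, using the option names listed above (the prompt format, IP address, and file path are placeholders, and option names can vary between Metasploit versions):

msf > use auxiliary/scanner/http/owa_login
msf auxiliary(owa_login) > set RHOST 203.0.113.10
msf auxiliary(owa_login) > set RPORT 443
msf auxiliary(owa_login) > set USER_FILE /root/Desktop/emails.txt
msf auxiliary(owa_login) > set PASSWORD Password1
msf auxiliary(owa_login) > run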

If a password attack grants access to a handful of accounts, what happens next? If the scope allows for it, check for sensitive information within the e-mails. Further access details may be discovered for additional systems, or for information that the client has asked to be compromised.

Having completed this step, the entire corporate address list could be downloaded, enabling yet another password attack against OWA. This may provide a more privileged account than was already possessed.

This account could also be used as a base for further phishing attacks, only this time they will be coming from inside the business and are much more likely to be trusted. Bear in mind that the account's legitimate owner is likely to get wise when they see the responses come pouring in.

Another interesting idea is to attach a malicious file to a meeting request or calendar entry. People are far less likely to be suspicious of these, since the warnings we are all used to hearing focus on e-mail attachments being the root of all evil. Another angle is to try these accounts against other systems. It is common to find that VPN systems are tied into Active Directory for authentication, and it may be possible to gain remote access to internal servers through Citrix. At this point, though, we are getting far beyond the scope of this text and well into the realms of penetration testing.

Summary

In this chapter, we have covered many aspects of reconnaissance for social engineering and penetration testing. We have looked at using the corporate website as a source of intelligence and highlighted the types of information that could be retrieved. We have looked at search engine harvesting and the impact of living in such a connected age. We then looked at e-mail address harvesting and discussed why this seemingly harmless piece of information can be used against us when it finds its way into the wrong hands.

No discussion on reconnaissance would be complete without looking at the more popular social networking sites. While there are far more than just the big three that we discussed, these are certainly the most relevant today. We took a look at bypassing some of LinkedIn’s restrictions to gain contact details, as well as using Facebook’s graph search to surprising effect.

We also looked at how to use DNS and Whois records to augment your assessment, and how they often contain more information than is sensible.

To finish things up, we had a brief look at manipulating that data with a view to using it in an attack.

As luck would have it, or maybe careful planning, we will now move into e-mail attack vectors within our next chapter. That means we get to use a lot of the data that we have just gathered in a series of attacks.
