An Introduction to Open Source Intelligence

Open source intelligence refers to information collected from the public domain. Information must first be analyzed and applied before it is truly considered intelligence. An attacker has little use for mountains of data until it is analyzed and can be exploited.

When I am performing a penetration test for an organization I usually begin with a cursory Google search or two. Once I have a basic understanding of the organization I move on to more advanced information collection techniques.

My goal when conducting any penetration test is to interact with the target as little as possible. By collecting my information from third-party sources I can avoid making contact with an organization’s systems until it is time to strike. Most attackers do not like to advertise their presence when evaluating a network. When penetration testing, it is critical to adopt the mindset of an attacker in order to conduct an accurate assessment.

In this article I explain the concept of open source intelligence and how it pertains to the world of cybersecurity and penetration testing. After that I describe some common OSINT tools. This should give you the skills you need to perform your own OSINT assessment.

Open Source Intelligence: The Information Goldmine

Open source intelligence is invaluable to malicious cybercriminals and penetration testers alike. The more information an attacker can gain about an organization and its employees, the more likely the attack will succeed.

The field of open source intelligence is not limited to discovering an organization’s technical vulnerabilities or network configuration. Open source intelligence can also include the abundance of information that employees freely publish in the public domain.

I always probe the ‘human firewall’ when performing a penetration test and OSINT is the tool that helps me do so.

Beyond the hardware

Today’s connected, social-media-driven Internet provides a near infinite canvas for users to share their every thought, picture, video, and much more. The analysis of this information can provide a frighteningly accurate picture of a user’s whereabouts, habits, weaknesses, and vulnerabilities.

Just think about that geo-tagged library of photos you have taken all over town. Many online services do very little to strip out all of this location data from your uploaded photos. Meta information like this is invaluable to an attacker trying to understand their target.
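To make the point about location data concrete, here is a minimal sketch of how GPS metadata turns into a point on a map. EXIF stores latitude and longitude as degrees, minutes, and seconds plus a reference letter (N, S, E, or W). The function name and the sample coordinates are my own illustrative assumptions; reading the tags out of an actual image file is left to whatever EXIF library you prefer.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and West longitudes are negative.
    if ref in ("S", "W"):
        decimal = -decimal
    return round(decimal, 6)

# Hypothetical values, as they might appear in a photo's GPS tags.
latitude = dms_to_decimal(40, 26, 46.0, "N")
longitude = dms_to_decimal(79, 58, 56.0, "W")
print(latitude, longitude)
```

A pair of decimal coordinates like this can be pasted straight into any mapping service, which is why a photo that still carries its GPS tags gives away so much.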

Cybercriminals are already making use of all the information we freely spew onto the Internet. Now penetration testers must use the same tactics to effectively assess an organization’s attack surface.

Before I delve into the different ways and sources from which a persistent attacker may attempt to gather their intelligence I will describe the different types of information an attacker may attempt to collect.

Types of Open Source Intelligence

Strictly speaking, open source intelligence is any intelligence collected from the public domain. An attacker, however, will generally be interested in certain kinds of information. The information I describe below can give a persistent attacker (or pen tester, for that matter) a significant upper hand.

User and Employee Information

Knowing the names and job titles of a few key employees is enough to launch a rather convincing spear phishing attack. When I am performing a social engineering test, I like to monitor employees’ social media profiles for a while. With this information I can initiate a very convincing spear phishing attack and hopefully gain easy access to a network.

Below is a non-exhaustive list of some types of information an attacker may attempt to collect.

  • Names
  • Addresses
  • Phone numbers
  • Pictures
  • Job titles
  • Social networks

The more information an attacker can gather, the better prepared they will be. On top of that, users are more than happy to document their entire personal lives on the Internet, giving attackers deep insights.

Organizational Information

In addition to information about employees, an attacker will also gather information about the organization itself. This will give the attacker an idea of the organization’s digital footprint. Once again, intelligence is power. The more information an attacker can gather about an organization, the higher their chances of success.

  • Domain names
  • IP addresses

Vulnerability Information

Once attackers have some information about an organization, they can begin to search for any known vulnerabilities. Public security disclosures and vulnerability information are a valuable asset for persistent attackers.

If an attacker can use an already known vulnerability, they save the time they would otherwise spend hunting for a new way to exploit a system.

  • Advisories
  • Security reports
  • Vulnerability disclosures
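As a rough sketch of how public vulnerability information gets applied, the snippet below compares software versions observed during reconnaissance against a small advisory list. Everything in it is hypothetical: the product names, the version numbers, and the advisory identifiers are placeholders I made up for illustration, and a real assessment would draw on published advisories instead.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.4.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))

# Hypothetical advisories: product -> (advisory id, first fixed version).
ADVISORIES = {
    "examplecms": ("EXAMPLE-2020-001", "5.2.3"),
    "examplemail": ("EXAMPLE-2020-002", "1.8.0"),
}

def find_known_issues(observed):
    """Return advisories whose fixed version is newer than the observed one."""
    findings = []
    for product, version in observed.items():
        advisory = ADVISORIES.get(product)
        if advisory is None:
            continue  # nothing published for this product in our list
        advisory_id, fixed_in = advisory
        if parse_version(version) < parse_version(fixed_in):
            findings.append((product, version, advisory_id))
    return findings

# Hypothetical inventory pieced together from public sources.
observed_software = {"examplecms": "5.1.9", "examplemail": "1.8.0", "otherapp": "3.0"}
for product, version, advisory_id in find_known_issues(observed_software):
    print(f"{product} {version} may be affected by {advisory_id}")
```

Comparing versions as tuples of integers avoids the classic string-comparison mistake where "5.10" sorts before "5.9".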

One of my favorite discoveries when performing a penetration test is an unprotected, unpatched system with a remote code execution vulnerability. This is an instant way into the network and usually spells certain doom for the rest of it. Better me than a real attacker, right?

A persistent attacker will search as long as they need to in order to gain an upper hand during an attack on an organization. The previously mentioned items are not exhaustive by any means, but should give an idea of what an attacker might search for.

I always believe in using the right tool for the job. Collecting OSINT by hand is one option, but now I will describe a few common OSINT gathering tools.

Open Source Intelligence Gathering Tools

Now that I have described the types of information that an attacker will seek, I will explain how an attacker can effectively collect such information.

Most of the tools I describe below are open source or allow free registration. Some third-party services also offer more information for a fee.

Other forms of intelligence require careful extraction of small amounts of information from sometimes hostile targets. OSINT is the exact opposite. There is an astounding volume of public data, and the difficulty lies in collecting and analyzing it effectively. Luckily there are countless tools that make collecting this information much easier.

I will start with a tool that helps organize collected information. Then I will describe other available tools for filling in the gaps.


Maltego

I want to start with Maltego as it is an excellent tool for organizing any kind of intelligence information. The community version is free to use and requires only a simple registration.

I like to keep my information organized when doing a penetration test. The amount of information I collect with various tools can quickly become overwhelming. Keeping information organized also helps you see new connections.

The simple interface allows you to arrange information and make connections. An added benefit is the numerous modules, which Maltego calls transforms, that can perform further searches on known information. These include domain lookups, extracting company information from WHOIS records, performing username lookups on social media platforms, and much more. The unique visual layout makes it easy to discover patterns and discrepancies.
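The idea of linking pieces of intelligence together can be sketched in a few lines of plain Python. To be clear, this is not Maltego’s API or data model; it is a toy graph I am assuming purely for illustration, with made-up entities, that shows why storing findings as connected nodes makes hidden relationships easy to surface.

```python
from collections import defaultdict, deque

class IntelGraph:
    """A tiny undirected graph of OSINT entities (people, domains, addresses)."""

    def __init__(self):
        self.links = defaultdict(set)

    def connect(self, a, b):
        """Record that two entities are related."""
        self.links[a].add(b)
        self.links[b].add(a)

    def path(self, start, goal):
        """Breadth-first search for the shortest chain linking two entities."""
        queue = deque([[start]])
        seen = {start}
        while queue:
            route = queue.popleft()
            if route[-1] == goal:
                return route
            for neighbor in sorted(self.links[route[-1]]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(route + [neighbor])
        return None  # no known connection

# Hypothetical findings, using reserved example names and addresses.
graph = IntelGraph()
graph.connect("person:Alice Example", "email:alice@example.com")
graph.connect("email:alice@example.com", "domain:example.com")
graph.connect("domain:example.com", "ip:192.0.2.10")
print(graph.path("person:Alice Example", "ip:192.0.2.10"))
```

Asking for the path between a person and a server is the text-only equivalent of spotting a chain of lines on a visual graph.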


Google

Yes, Google. Google is a very effective tool for finding open source intelligence deep on the Internet. The sheer volume of information that is searchable is astounding. Google’s crawlers are constantly seeking out new information and cataloging it away.

With a few specially crafted Google queries you can utilize the full power of the search engine. These queries are known as Google Dorks.

For example, the following query will find sites that mention ‘wp-content’ in their URL.

  inurl:wp-content


An attacker may use Google Dorking to find exposed files such as Excel documents or PDFs. It is not uncommon for an attacker to find exposed password files that have been mistakenly indexed by Google.
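Dork queries are just search operators strung together, so they are easy to generate. The helper below is a small sketch that combines the site:, filetype:, and inurl: operators for a single domain. The function name and its parameters are my own, and example.com stands in for a domain you are authorized to assess.

```python
def build_dork(domain, filetype=None, inurl=None, phrase=None):
    """Compose a Google query from common search operators."""
    parts = [f"site:{domain}"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    if phrase:
        # Quoting forces an exact-phrase match.
        parts.append(f'"{phrase}"')
    return " ".join(parts)

# example.com is a placeholder for a domain you are authorized to assess.
for filetype in ("xls", "pdf"):
    print(build_dork("example.com", filetype=filetype))
print(build_dork("example.com", inurl="wp-content"))
```

Restricting every query with site: keeps the results focused on the organization being assessed and not the whole Internet.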

Once I have an understanding of an organization I will typically try a few Google Dorks to find any low-hanging fruit. The number of times I have found exposed sensitive information in Google is quite astounding.


Shodan

After I have done some basic reconnaissance of my target, I typically move on to Shodan next. Shodan is like Google for Internet-connected devices. This tool can give me an idea of an organization’s infrastructure and security posture very quickly.

Because Shodan’s crawlers, not my own machine, are the ones making contact with the target servers, my searches leave no trace of me in the target’s logs.

Registering for an account allows for use of the API, as well as much more detailed information about each host. The true power of Shodan is in the search feature. Searches can range from simply showing hosts with port 80 exposed, to showing all hosts that utilize the WordPress CMS.

I will be writing a follow-up article to help you make the most of this powerful tool, so check back soon.

Some organizations also make regular use of Shodan. If an organization’s private development server is showing up in a public search engine’s data then something is likely misconfigured. By periodically reviewing search results in Shodan, an organization can monitor its public attack surface for any anomalies.
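Periodic monitoring boils down to comparing what is exposed now with what was exposed last time. The sketch below does not call Shodan at all; it assumes you have already exported two sets of results as (IP, port) pairs, and the addresses shown are reserved documentation addresses that I picked for illustration.

```python
def compare_exposure(baseline, current):
    """Return services that appeared or disappeared since the baseline."""
    baseline, current = set(baseline), set(current)
    return {
        "new": sorted(current - baseline),
        "gone": sorted(baseline - current),
    }

# Hypothetical (ip, port) pairs from two search exports.
last_week = [("192.0.2.10", 443), ("192.0.2.11", 80)]
this_week = [("192.0.2.10", 443), ("192.0.2.11", 80), ("192.0.2.25", 3389)]

changes = compare_exposure(last_week, this_week)
print(changes["new"])
```

Anything that lands in the "new" list, such as a remote desktop port that nobody remembers opening, is worth investigating before an attacker does.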


theHarvester

This is a powerful tool capable of collecting a ton of OSINT. theHarvester can gather emails, hosts, employee names and more from the Internet. This tool queries multiple search engines and even PGP key servers to discover any information about an organization.
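To give a feel for what harvesting involves, here is a minimal sketch that pulls email addresses for one domain out of a block of text. This is not theHarvester’s code; it is a simplified regular expression I am assuming for illustration, and the sample text and addresses are made up.

```python
import re

def extract_emails(text, domain):
    """Find unique addresses at `domain` (or its subdomains) in a block of text."""
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*"
        + re.escape(domain)
        + r"(?![A-Za-z0-9-])",  # do not match longer names like example.community
        re.IGNORECASE,
    )
    return sorted({match.lower() for match in pattern.findall(text)})

# Hypothetical page content that a search engine might have cached.
sample = """
Contact Jane at Jane.Doe@example.com or the help desk at
support@help.example.com. Press: press@example.org.
"""
print(extract_emails(sample, "example.com"))
```

Lowercasing and de-duplicating the matches matters in practice, because the same address tends to turn up many times across different sources.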

Attackers typically use this tool in the early phases of an attack. After a full ‘harvest’ a penetration tester can gain a rather complete understanding of the target footprint.

An organization may also periodically run theHarvester on themselves to evaluate what an attacker can see about the organization.

This is also a great tool for investigating the history of a domain name. Whenever I purchase a new domain name I typically run theHarvester. This will show me if there are any email addresses or other resources still floating around on the Internet.


Open Source Intelligence: A Double Edged Sword

In this article I have described what open source intelligence is, and how it can be collected. I have also described how an attacker may use this information to do harm. The tools that I have described in this article are freely available to anyone with an interest in collecting open source intelligence.

Open source intelligence can benefit both penetration testers and cybercriminals when evaluating an organization’s footprint. I hope this article helped you understand the power of open source intelligence and how it can be used as a tool to protect an organization. Remember that when trying to protect yourself from attackers it is important to think like an attacker.

Although the specifics of each attack may vary, the overall lifecycle of an attack is generally well defined. I describe this further in my article on the cyber kill chain.

I hope you enjoyed this article. My next few articles will focus on analysis of OSINT. I look forward to writing many more articles about the world of cybersecurity. Eventually I will also describe specific ways in which attackers apply this information.

Are there any tools or techniques I have missed? Please leave me a comment below. I am also quite interested in how other professions take advantage of open source intelligence outside the world of cybersecurity.

As always, if you have any questions please drop me a comment below and I will get back to you!
