Reverse Engineering of PDF Document Honeyfiles

 


Reverse engineering is the process of taking apart an object, such as a hardware device or a software application, to understand how it was designed and built.

Previously, we discussed honeyfiles as a type of honeypot used to detect cybercriminals or insider threats within a computer system or organizational network. During that discussion, an idea came to mind: what if we reverse-engineer a honeyfile to understand how it works, especially on platforms like Canarytokens?

I have discovered some techniques for analyzing PDF-based honeyfiles created from Word documents. In this article, we will focus specifically on PDF documents: how they function and how they connect back to the Canarytokens platform.

Tools of the Trade:

  1. PDF Stream Dumper

  2. PDFiD

To begin our analysis, we'll use PDFiD to scan the PDF honeyfile we downloaded from the Canarytokens platform. This tool helps identify potentially suspicious elements in the PDF structure, such as embedded JavaScript, automatic actions, or file attachments. Let’s run the tool and review the output to better understand how the document is constructed and what triggers the callback to Canarytokens.
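To make it clearer what this kind of scan does, here is a minimal Python sketch of a keyword counter. It is not PDFiD itself, only an illustration of the same idea: it counts how often certain PDF names appear in the raw bytes of the file. The keyword list is an illustrative subset, the function name `count_keywords` is my own, and the sketch does not handle obfuscated names.

```python
import re

# Illustrative subset of PDF names that often point to active content.
KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA",
            b"/URI", b"/EmbeddedFile", b"/ObjStm"]


def count_keywords(data: bytes) -> dict:
    """Count occurrences of each keyword in the raw bytes of a PDF."""
    counts = {}
    for keyword in KEYWORDS:
        # The lookahead stops /JS from also matching inside /JavaScript.
        pattern = re.escape(keyword) + rb"(?![A-Za-z0-9])"
        counts[keyword.decode()] = len(re.findall(pattern, data))
    return counts
```

To try it on a file, read the PDF in binary mode and pass the bytes in, for example `count_keywords(open("honeyfile.pdf", "rb").read())`, where `honeyfile.pdf` is a placeholder file name. Because the scan only looks at raw bytes, anything stored inside a compressed stream will not be counted.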


As we can see, PDFiD reports no /URI or /OpenAction entries. At first glance, this might seem suspicious or even suggest that the file isn't doing anything, but that's not necessarily the case. PDFiD counts keywords in the raw bytes of the file, so entries stored inside compressed streams will not appear in its output. Instead, we can switch to PDF Stream Dumper to take a closer look inside the PDF and analyze its internal objects and streams more thoroughly.


You can perform a search through the list of objects within PDF Stream Dumper and look for anything related to Canarytokens, similar to what’s shown in the image below:
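Outside the GUI, the same search can be approximated with a short Python script. The sketch below is not how PDF Stream Dumper works internally. It simply assumes the interesting streams are Flate-compressed (zlib), tries to inflate every `stream ... endstream` block, and looks for URLs containing the word "canarytokens". The function name `find_urls` and the regular expressions are my own choices.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# Rough URL pattern; stops at whitespace and PDF string delimiters.
URL_RE = re.compile(rb"https?://[^\s()<>]+")


def find_urls(data: bytes, needle: bytes = b"canarytokens") -> list:
    """Return URLs containing `needle`, found in the raw file or in inflated streams."""
    chunks = [data]
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj ignores the end-of-line bytes left before "endstream".
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not a Flate stream (for example an image), skip it

    hits = []
    for chunk in chunks:
        for url in URL_RE.findall(chunk):
            if needle in url.lower() and url not in hits:
                hits.append(url)
    return hits
```

Run it the same way as any byte-level scan, for example `find_urls(open("honeyfile.pdf", "rb").read())`, where `honeyfile.pdf` is a placeholder file name. Streams that use other filters or encryption are skipped, so an empty result does not prove that the file contains no callback URL.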



