Gathering Indicators of Compromise Through Malware Datasets

 



Well, based on my personality, I prefer to discover things on my own, manually. I already use several platforms to look up malicious URLs, but over time a question came to mind: where do these links actually originate, and how do others find them? I don't want to always be a user who depends on someone else's data; I want to understand the source myself. Eventually, I came up with an idea for producing my own feed of new malicious URLs: collect a malware dataset and execute the samples in bulk. I believe this is one of the core techniques used by most security vendors and threat intelligence platforms.

What do we need in order to gather indicators such as URLs and C2 panels without relying on third-party feed platforms, whether commercial or free?

Requirements:

  1. Lab Environment

    • Virtualization software such as VMware or VirtualBox

  2. Malware Dataset

    • A collection of malware samples for testing and analysis

  3. Network Monitoring & Extraction Tools

    • ApateDNS: to capture and redirect DNS requests

    • URL Revealer: to extract URLs and network connections during malware execution

   
We also need a script that executes every sample in the folder containing our malware. The batch script is available on the JustPaste.it platform at the following link: https://justpaste.it/jvzz8

Malware datasets can be downloaded from various platforms. One good source is VX-Underground: https://vx-underground.org/Samples

When we extract the malware samples from the downloaded dataset, the files have a .vir extension rather than .exe. We need to rename all of them to .exe so they can be executed. I will guide you through how to do this. The image below shows the original file extensions from the malware dataset:

To change the file extensions to .exe, first select all the files and rename them to malware (Windows will automatically number them as malware (1), malware (2), etc.). After that, open the Command Prompt (CMD) and navigate to the original path, meaning the folder where the malware files are located, as shown in the image below:

In the Command Prompt, type the following command:

ren *.vir *.exe

This means:

  • ren stands for rename

  • *.vir selects all files with the .vir extension

  • *.exe gives each of them the .exe extension instead

Renaming does not change the contents of the files; it only sets the .exe extension so that Windows will run the samples as executables.
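For anyone who prefers a script over the Command Prompt, the same rename step can be sketched in Python. This is only a minimal sketch: the function name rename_vir_to_exe is my own, and skipping a file when the target name already exists is an assumption about what you would want, not something ren does.

```python
from pathlib import Path


def rename_vir_to_exe(folder):
    """Rename every *.vir file in `folder` to *.exe and return the new names.

    Only the file name changes; the file contents are left untouched.
    """
    renamed = []
    for sample in sorted(Path(folder).glob("*.vir")):
        target = sample.with_suffix(".exe")
        if target.exists():
            # Assumption: never overwrite an existing file with the same name.
            continue
        sample.rename(target)
        renamed.append(target.name)
    return renamed
```

Call it with the path of the folder that holds the extracted samples, for example rename_vir_to_exe(r"C:\path\to\samples") (a placeholder path), and run it only inside the lab VM.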



You need to create a folder named malware and place it on your Desktop. All the malware samples that you extracted from the dataset and renamed to .exe should be placed inside this malware folder.

The batch script used to execute the malware samples must be placed outside of the malware folder. If the folder is not named exactly malware, the batch script will not execute any files.
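Since the batch script depends on this exact layout, a quick check before launching it can save a wasted run. The following is a minimal sketch; check_layout is a hypothetical helper of mine, and it only verifies the folder name and the file extensions described above, not the batch script itself.

```python
from pathlib import Path


def check_layout(base_dir):
    """Return a list of problems with the expected `malware` folder layout."""
    problems = []
    folder = Path(base_dir) / "malware"
    if not folder.is_dir():
        problems.append(f"no folder named 'malware' in {base_dir}")
        return problems
    names = [p.name for p in folder.iterdir() if p.is_file()]
    if not any(n.lower().endswith(".exe") for n in names):
        problems.append("no .exe samples found in the malware folder")
    leftovers = [n for n in names if n.lower().endswith(".vir")]
    if leftovers:
        problems.append(f"{len(leftovers)} sample(s) still have the .vir extension")
    return problems
```

An empty list means the folder is named correctly and contains only renamed samples; pass the Desktop path (the folder that contains both malware and the batch script) as base_dir.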

Before launching the batch script, make sure you have already run URLRevealer.exe to simulate the internet connection.

Once URLRevealer.exe is running, it will generate a file named URL_Revealer_output.txt. This file contains a log of the captured DNS and HTTP traffic, showing which domains or IPs the executable files attempted to communicate with. This log can then be used for further investigation and analysis.
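To turn that log into a list of indicators, something like the sketch below could be used. I am assuming here that URL_Revealer_output.txt is plain text with URLs, hostnames, and IPv4 addresses somewhere on each line; the exact log format is not documented in this post, so the regular expressions and the function name extract_indicators are my own assumptions and may need adjusting.

```python
import re
from urllib.parse import urlsplit

# Assumed patterns; the real log format may require changes.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
HOST_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}\b", re.IGNORECASE
)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")  # does not validate octet range


def _add(seen, value):
    """Append value to the list only once, keeping first-seen order."""
    if value and value not in seen:
        seen.append(value)


def extract_indicators(log_text):
    """Collect unique URLs, hostnames, and IPv4 addresses from log text."""
    urls, hosts, ips = [], [], []
    for line in log_text.splitlines():
        for raw in URL_RE.findall(line):
            url = raw.rstrip(".,;)")
            _add(urls, url)
            try:
                name = urlsplit(url).hostname
            except ValueError:
                name = None  # malformed URL; keep it in urls only
            if name:
                _add(ips if IPV4_RE.fullmatch(name) else hosts, name)
        # Match bare hosts and IPs outside URLs, so URL paths such as
        # "gate.php" are not mistaken for hostnames.
        rest = URL_RE.sub(" ", line)
        for ip in IPV4_RE.findall(rest):
            _add(ips, ip)
        for host in HOST_RE.findall(rest):
            _add(hosts, host.lower())
    return {"urls": urls, "hosts": hosts, "ips": ips}
```

Read the file with open("URL_Revealer_output.txt", encoding="utf-8", errors="replace").read() and pass the text in. The bare-hostname match is a heuristic, so file names that appear in the log (for example something.exe) can show up as false positives and should be reviewed by hand.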

Well, now everything is ready. Our malware dataset is placed inside the malware folder, the batch script is located outside the folder, and we have already launched URLRevealer.exe.

Now it's time to run the batch script and observe what URLRevealer.exe captures for us. Before executing anything, however, make sure that your virtual lab environment is fully disconnected from the internet, with no inbound or outbound connections. This is critical to avoid any real-world impact or unintentional spread of the malware.
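One way to double-check the isolation from inside the VM is to try a few outbound connections and confirm that they all fail. This is a rough sketch under my own assumptions: the probe addresses (two well-known public DNS resolvers) and the name is_isolated are my choices, and if a tool in the lab answers connections on behalf of any IP address, the check will report a connection even though nothing left the VM.

```python
import socket

# Assumed probe targets: public DNS resolvers that are normally reachable.
DEFAULT_PROBES = [("1.1.1.1", 53), ("8.8.8.8", 53)]


def is_isolated(probes=DEFAULT_PROBES, timeout=2.0, connect=socket.create_connection):
    """Return True only if every outbound TCP probe fails."""
    for host, port in probes:
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # Unreachable, refused, or timed out: what we want in the lab.
            continue
        conn.close()
        return False  # at least one probe got out
    return True
```

If is_isolated() returns False, stop and fix the VM's network settings before running the batch script. The connect parameter exists only so the function can be tested without touching the network.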

URL Revealer PoC:



Batch Script PoC:



Result PoC:


