AI Script Analysis, Hunting, and Decompilation Updates

AI Script Decompilation

Earlier this year, we introduced initial support for PyInstaller malware, including basic decompilation of compiled Python bytecode (PYC files). Since then, we’ve expanded that support with AI-assisted decompilation to help analyze more complex scripts and newer versions of Python.

Example Of Decompiled Python Code

This feature is still under active development. For now, only a limited set of scripts is eligible for AI decompilation while we gather feedback and refine accuracy. Combined with traditional decompilation methods, however, we already cover over 90% of use cases. Support will continue to expand as the feature matures.
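
For readers less familiar with what a PYC file actually holds, here is a minimal sketch using only the Python standard library. It compiles a tiny sample, skips the 16-byte header used by Python 3.7 and later, and captures a disassembly of the bytecode. This is not how UnpacMe's decompiler works; `SOURCE`, `load_code_from_pyc`, and `disassemble_pyc` are illustrative names. Note that `marshal` is generally only reliable for bytecode produced by the same Python version that is running the script, which is one reason cross-version decompilation is harder.

```python
import dis
import io
import marshal
import py_compile
import tempfile
from pathlib import Path

# Illustrative sample source; any small script would do.
SOURCE = "def greet(name):\n    return 'hello ' + name\n"


def load_code_from_pyc(pyc_path):
    """Load the code object stored in a PYC file (same Python version only)."""
    data = Path(pyc_path).read_bytes()
    # Python 3.7+ header: 4-byte magic, 4-byte flags, 8 bytes of
    # mtime + source size (or a source hash). The marshalled code follows.
    return marshal.loads(data[16:])


def disassemble_pyc(pyc_path):
    """Return a text disassembly of the bytecode in a PYC file."""
    code = load_code_from_pyc(pyc_path)
    buf = io.StringIO()
    dis.dis(code, file=buf)  # also recurses into nested functions
    return buf.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample.py"
    src.write_text(SOURCE)
    pyc = py_compile.compile(str(src), cfile=str(Path(tmp) / "sample.pyc"))
    listing = disassemble_pyc(pyc)

print(listing)
```

The listing is raw opcodes rather than source code. Turning those opcodes back into readable Python is the decompilation step, and the opcode set changes between Python releases.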

Script Summarization Updates

We’ve updated ATIP Script Summaries to better highlight the core functionality of extracted scripts. In reviewing obfuscated samples, we noticed that previous summaries often focused too heavily on the obfuscation techniques themselves, rather than the core script behaviour.

In many cases, obfuscation is used to bloat the script and obscure its purpose, without hiding key logic or API interactions. Often, important function calls and calls to the Windows API remain unobfuscated, making it possible for ATIP to extract the script's behaviours while ignoring the obfuscation.
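
To illustrate the point about unobfuscated API names, here is a minimal sketch that pulls Windows API names out of a script padded with junk statements. It is a plain regular-expression search, not ATIP's summarization method. The watch list, the `find_api_calls` helper, and the AutoIt-like sample are all made up for the example.

```python
import re

# Hypothetical watch list; a real one would be far longer.
WATCHED_APIS = (
    "VirtualAlloc",
    "WriteProcessMemory",
    "CreateRemoteThread",
    "URLDownloadToFileW",
)

API_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in WATCHED_APIS) + r")\b"
)


def find_api_calls(script_text):
    """Return watched API names in order of first appearance, without repeats."""
    found = []
    for match in API_PATTERN.finditer(script_text):
        name = match.group(1)
        if name not in found:
            found.append(name)
    return found


# Made-up sample: junk assignments surround the calls that matter.
SAMPLE = '''
$qzxv = 4471 * 3
$junk1 = "aaaaaaaaaaaaaaaa"
$p = DllCall("kernel32.dll", "ptr", "VirtualAlloc", "ptr", 0, "int", 4096)
$junk2 = StringReverse($junk1)
DllCall("kernel32.dll", "handle", "CreateRemoteThread", "handle", $h, "ptr", $p)
$p2 = DllCall("kernel32.dll", "ptr", "VirtualAlloc", "ptr", 0, "int", 64)
'''

apis = find_api_calls(SAMPLE)
print(apis)
```

Even with the padding left in place, the two API names stand out, which is why a summary can skip the noise and still describe what the script does.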

These updates aim to reduce noise from obfuscation and improve how summaries surface meaningful behaviours, providing analysts with a more accurate view of what the script actually does.

YARA Hunting at Scale

We’ve recently made improvements and bug fixes to YARA Hunting on UnpacMe to better handle our growing dataset and user base. When we first built YARA Lightning Hunt, we were scanning gigabytes of samples. Both have grown significantly since the feature launched, and we are now routinely scanning terabytes of samples.

As the volume of samples and users grew, we began running into subtle and sometimes unexpected issues:

  • Rule matching performance
  • Edge cases in our distributed batching and scan logic
  • Failure management and error reporting
  • Memory and CPU pressure during large hunts
  • Race conditions when unifying results under high load

Solving these required rethinking parts of our pipeline and tightening up areas that had previously scaled “well enough.” The result is a faster, more stable hunting platform!
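
As a rough illustration of two items on that list, batching with failure reporting and merging results safely, here is a minimal sketch. It is not UnpacMe's pipeline: a plain substring match stands in for a real YARA scan, and `make_batches`, `scan_batch`, `hunt`, and the sample data are hypothetical. Workers return their results instead of writing to shared state, so the merge happens in a single thread and a failing batch is recorded rather than silently dropped.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_batches(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def scan_batch(batch, pattern):
    """Stand-in for a YARA scan: report samples containing `pattern`."""
    return [name for name, data in batch if pattern in data]


def hunt(samples, pattern, batch_size=2, workers=4):
    """Scan batches in parallel; return (sorted matches, failed batches)."""
    matches, failures = [], []
    batches = make_batches(samples, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            pool.submit(scan_batch, batch, pattern): index
            for index, batch in enumerate(batches)
        }
        # Results are merged here, in one thread, so no lock is needed.
        for future in as_completed(futures):
            try:
                matches.extend(future.result())
            except Exception as exc:
                # Record which batch failed and why instead of losing it.
                failures.append((futures[future], repr(exc)))
    return sorted(matches), failures


# Made-up samples; the last one has no data and makes its batch fail.
SAMPLES = [
    ("a.py", b"import os; evil()"),
    ("b.au3", b"benign"),
    ("c.py", b"evil() again"),
    ("d.js", b"nothing"),
    ("broken", None),
]

matches, failures = hunt(SAMPLES, b"evil")
print(matches, failures)
```

Sorting the merged matches keeps the output stable regardless of which batch finishes first.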

If you’re writing or testing YARA rules, you should notice smoother and more consistent results - and if you run into any issues, we’re always interested in hearing about them.

New Script Repository

In addition to scaling our YARA Lightning Hunts, we’ve introduced a new Scripts sample repository, which can now be selected when launching a YARA scan.

This repository is tailored for hunting malicious scripts, such as Python or AutoIt, and excludes non-script files to improve scan precision. The Scripts repository can be particularly useful for targeting script-based threats and reducing noise in your results.

YARA-X

With the recent release of YARA-X v1.0.0, we’ve upgraded UnpacMe to use this latest stable version. YARA-X scans now account for approximately 30% of scans on the platform.

Now that YARA-X has reached its first stable release and traditional YARA is officially in maintenance mode, we plan to make YARA-X the default engine for new scans on UnpacMe in the coming months.

Happy Hunting!