
Crawler JS Reverse Engineering: Webpack Tips

Concept:

Webpack is a static module bundler for modern JavaScript applications. When webpack processes an application, it internally builds a dependency graph from one or more entry points and then combines every module your project needs into one or more bundles. On sites built this way, most page content is rendered by JavaScript rather than delivered as static HTML.

Identification:

  1. In the page source, most of the page is made up of script tags.

  2. Most webpack projects load a recognizable webpack JS file, which can be found among the page's scripts.
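A quick additional check can be run from the console. The sketch below assumes the site keeps webpack's default global names ("webpackJsonp" in webpack 4, "webpackChunk" followed by the project name in webpack 5); sites can rename these, so an empty result proves nothing. The function name findWebpackGlobals and the sample object are made up for illustration.

```javascript
// Returns the names of properties that look like webpack chunk registries.
// Assumption: the default names are in use (webpackJsonp / webpackChunk<name>).
function findWebpackGlobals(globalObject) {
  return Object.keys(globalObject).filter(function (key) {
    return /^webpack(Jsonp|Chunk)/.test(key) && Array.isArray(globalObject[key]);
  });
}

// Stand-in for the browser's window so the sketch also runs under Node.js.
const sampleWindow = {
  webpackJsonp: [],
  webpackChunkdemo: [],
  jQuery: function () {},
};

const found = findWebpackGlobals(sampleWindow);
console.log(found); // [ 'webpackJsonp', 'webpackChunkdemo' ]
```

In a real page, call findWebpackGlobals(window) instead of passing the sample object.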

Structure:

After webpack bundling, the JS is essentially a self-executing function (IIFE) that acts as a loader for the modules. Common structures are as follows:

Type 1:

As shown:

js
!function (modules) {           // modules: an array in which each element is a module function
  function d(n) { /* ... */ }   // the loader (also called the dispatcher): every module is loaded and executed through it
  d(0);                         // use the loader to run the first module
}([ /* module functions */ ]);

Type 2:

As shown:

js
!function (modules) {           // modules: an object whose values are module functions
  function d(n) { /* ... */ }   // the loader (also called the dispatcher): every module is loaded and executed through it
  d("1x2y");                    // use the loader to run the module whose key is "1x2y"
}({ /* key: module function */ });

Type 3:

The third type is the most common. When there are many modules, they are split across several JS files (chunks). A global variable window["webpackJsonp"] = [ ] is defined to hold the modules that are imported dynamically, and the push() method of window["webpackJsonp"] is then replaced with webpackJsonpCallback(). As a result, calling window["webpackJsonp"].push() actually executes webpackJsonpCallback(). push() receives a single array with up to three elements: the first is an array of chunk IDs, the second is an array or object defining the module functions, and the third (optional) lists the modules to execute.

Each JS module file starts with

js
(window.webpackJsonp = window.webpackJsonp || []).push([[2], {}]); // [2] lists the chunk IDs; {} holds the module functions
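The interplay between the global array and the replaced push() can be sketched as follows. This is a simplified model loosely based on webpack 4 output, not webpack's actual runtime: fakeWindow stands in for window so the code runs under Node.js, the chunk contents are invented, and the third element is treated as a flat list of module ids to run.

```javascript
const fakeWindow = {};
const modules = {};   // module id -> module function, filled in by chunks
const installed = {}; // module id -> cached module object
const executed = [];  // exports of the entry modules that were run

// The loader, same idea as in Types 1 and 2.
function d(id) {
  if (installed[id]) return installed[id].exports;
  const module = (installed[id] = { exports: {} });
  modules[id].call(module.exports, module, module.exports, d);
  return module.exports;
}

// Called once per chunk with [chunkIds, moreModules, entriesToRun?].
function webpackJsonpCallback(data) {
  const moreModules = data[1];
  const entries = data[2] || [];
  for (const id of Object.keys(moreModules)) {
    modules[id] = moreModules[id]; // register the chunk's modules
  }
  for (const entry of entries) {
    executed.push(d(entry)); // run the optional entry modules
  }
}

// Runtime side: create the global array and replace its push().
const jsonpArray = (fakeWindow.webpackJsonp = fakeWindow.webpackJsonp || []);
jsonpArray.push = webpackJsonpCallback;

// Chunk side: what the first line of a chunk file does.
(fakeWindow.webpackJsonp = fakeWindow.webpackJsonp || []).push([
  [2],
  { abc: function (module) { module.exports = "hello from chunk 2"; } },
  ["abc"],
]);

console.log(executed[0]); // hello from chunk 2
```

Because every chunk funnels through the same callback, a breakpoint inside it is a convenient place to watch which modules a page registers.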

Reverse Extraction JS Ideas (1)

  1. First, find where the encrypted parameter is generated.

In this example, the encrypted parameter is sign.

  2. Find the loader function.

Look for a call such as n('xx') that loads a module, set a breakpoint on it and refresh the page. Hovering over the function name at the breakpoint usually leads straight to the loader function.

  3. Set a breakpoint on the call() or apply() invocation inside the loader function, then extract the module containing the encryption function together with the modules it depends on.

  4. Export the function that generates the encrypted parameter (or the loader itself) as a global variable so it can be called from outside the bundle.

Reverse Extraction JS Ideas (2)

  1. Find the entry point of the function that generates the encrypted parameter and determine which module contains it.

  2. Find the encryption module and extract it.

After the module that provides the m function is loaded, keep following the other module loads until you reach the encryption module.

  3. Copy the code to a local file, working upwards from the encryption function, then fix any missing functions or parameters as errors appear.

Once the encryption module has been extracted and the missing functions and parameters are filled in, try running it.

Note: to collect all module code related to the encryption function, proceed as follows:

  1. Set two breakpoints at the encryption function (method): one inside it and one after it.

  2. When the breakpoint inside the encryption function (method) is hit, step into the loader function and set a breakpoint after the loader's call line (like return e[n].call(r.exports, r, r.exports, d)).

  3. Resume to the loader breakpoint and enter the Hook function in the console, adapting the Hook function code to the loader function in question.

  4. Remove the loader breakpoint and resume to the breakpoint after the encryption function (method).

  5. Enter window._wbpk in the console to get all module code related to the encryption function.

Other ideas are being updated...
