Cryptocurrency Spoofing – A Tale of Savage Manipulation.

In the previous post in our technical blog series, we discussed how pump-and-dump manipulation is carried out by sleazy entities and how it can be detected with supervised machine learning models. Now, let’s dive deep into another common technique, spoofing, which market manipulators use to influence investors’ decision-making.

Spoofing takes place when a manipulator places large bogus ask (selling) or bid (buying) orders into the market to trick other investors into believing that there is tremendous supply of, or demand for, a cryptocurrency. The manipulator has no intention of letting these orders be matched and cancels them when they are about to be matched. These orders are known as passive orders. The volume of these passive selling or buying orders tends to be larger than usual.

Spoof orders are placed in one of two ways:

1. The passive sell price is lower than the current ask price.

2. The passive buy price is higher than the current bid price.

Consider the first case. The manipulator intends to buy a cryptocurrency at a price lower than the current ask price, so they place a large-volume sell order at a passive price below the current ask price. Other innocent investors follow suit, lowering their own ask prices on the expectation that the price will decrease.

The manipulator then withdraws the large spoofed sell orders and buys the remaining sell orders from the other investors, who were not cautious about the manipulated price.
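
The sequence above can be walked through with a toy order book. This is only an illustrative sketch: the `Ask` class, the owner names, and every price and volume are made-up assumptions, and real investors would not all reprice to exactly the spoofed level.

```python
from dataclasses import dataclass


@dataclass
class Ask:
    """A resting sell order in a toy order book (hypothetical structure)."""
    owner: str
    price: float
    volume: float


# Step 0: honest sellers quote at 100.0 (made-up numbers).
book = [Ask("seller_1", 100.0, 2.0), Ask("seller_2", 100.0, 3.0)]
price_before = min(order.price for order in book)

# Step 1: the manipulator posts a large passive sell order below the current ask.
spoof = Ask("manipulator", 99.0, 500.0)
book.append(spoof)

# Step 2: the other sellers follow suit and lower their asks (assumed behaviour).
for order in book:
    if order.owner != "manipulator":
        order.price = spoof.price

# Step 3: the manipulator cancels the spoofed order before it can be matched.
book.remove(spoof)

# Step 4: the manipulator buys the remaining sell orders at the lowered price.
bought_volume = sum(order.volume for order in book)
cost = sum(order.price * order.volume for order in book)
saving = price_before * bought_volume - cost

print(bought_volume, cost, saving)  # 5.0 495.0 5.0
```

In this toy run the manipulator pays 495.0 instead of 500.0 for the same 5.0 units.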

In order to analyze and model these events, we need to use the level 2 data obtained from the order books and create a model-training protocol for the machine/neural network to learn and deploy it as a rule-based expert system. So let us discuss the mathematical protocol that can enable our neural network to detect anomalies and predict the occurrence of spoofing at any instant in time.

Spoofing sucks. WMB rocks!

Thanks to the ingenious research published by three Thai financial researchers – Teema Leangarun, Poj Tangamchit, and Suttipong Thajchayapong, a rule-based expert system can be designed and fine-tuned for the cryptocurrency scenario.
For the sake of simplicity and explanatory convenience, we are going to assume an arbitrary sliding window (sample data distribution) pertaining to Bitcoin that visualizes the price (BTC) against the volume of orders (V).

The primary drivers for the arbitrary sliding window are the following:

$P=\text{price of the asset}$

$V =\text{volume of the asset}$

$t =\text{the time index}$

The secondary drivers for the arbitrary sliding window are:

$P^{ cancel}_{sell}(t) = \text{\small the price of sell orders that had been canceled at the time ' t '}$

$P^{ matched}_{buy}(t) = \text{\small the price of the last buy order that had been matched at the time ' t '}$

$V^{ cancel}_{sell}(t) = \text{\small the volume of sell orders that had been canceled at the time ' t '}$

$V^{ matched}_{sell}(t) = \text{\small the volume of sell orders that had been matched at the time ' t '}$

$V^{ cancel}_{total}(t) = \text{\small the total volume of orders that had been canceled at the time ' t '}$
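
As a rough sketch of how these drivers might be held in code, each time index of the sliding window can be stored as one record. The `WindowStep` class, its field names, and the sample values below are assumptions made for illustration; they are not taken from the referenced research.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowStep:
    """Quantities observed at one time index t (hypothetical container)."""
    p_cancel_sell: float    # P^cancel_sell(t)
    p_matched_buy: float    # P^matched_buy(t)
    v_cancel_sell: float    # V^cancel_sell(t)
    v_matched_sell: float   # V^matched_sell(t)
    v_cancel_total: float   # V^cancel_total(t)


# A three-step window with made-up values; the list index plays the role of t.
window = [
    WindowStep(10000.0, 10000.0, 0.0, 1.0, 0.0),
    WindowStep(10010.0, 10000.0, 1.0, 1.5, 1.0),
    WindowStep(9990.0, 9985.0, 50.0, 0.5, 52.0),
]

for t, step in enumerate(window, start=1):
    print(t, step.p_cancel_sell, step.v_cancel_sell)
```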

Now, there are three essential conditions to predict if a spoofing event has occurred:

1. Order cancellation price close to the current buying or selling price

The absolute value of the relative difference between the price of the canceled sell orders and the price of the last matched buy order, which stands in for the current ask price, has to be lower than the threshold of 50% (i.e., threshold factor = 0.5).

$\text{ spoofing condition A } = \{\text{'True', } |\frac{P^{ cancel}_{sell}(t) - P^{ matched}_{buy}(t)}{P^{ cancel}_{sell}(t)}| < 0.5 \text{; 'False', otherwise}\}$
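
Condition A can be sketched as a small function. The function name, the parameter names, and the sample prices are illustrative assumptions; only the 0.5 threshold comes from the formula above.

```python
def spoofing_condition_a(p_cancel_sell, p_matched_buy, threshold=0.5):
    """Return True when the canceled sell price is close to the last matched buy price."""
    if p_cancel_sell == 0:
        raise ValueError("p_cancel_sell must be non-zero")
    relative_gap = abs((p_cancel_sell - p_matched_buy) / p_cancel_sell)
    return relative_gap < threshold


print(spoofing_condition_a(9950.0, 10000.0))  # relative gap is about 0.005 -> True
print(spoofing_condition_a(4000.0, 10000.0))  # relative gap is 1.5 -> False
```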

2. High cancellation volume

The volume of canceled sell orders at time t is more than five times the cumulative volume of matched orders from the starting point of the window up to time t - 1.

$\text{ spoofing condition B } = \{\text{'True', } V^{ cancel}_{sell}(t) > 5 \times (\sum_{i = 1}^{t - 1} V^{ matched}(i)) \text{; 'False', otherwise}\}$
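
A minimal sketch of condition B follows, assuming the matched volumes for time indices 1 to t - 1 are available as a plain list. The function and parameter names and the sample volumes are hypothetical.

```python
def spoofing_condition_b(v_cancel_sell_t, v_matched_history, factor=5.0):
    """Return True when the canceled sell volume at time t exceeds
    `factor` times the matched volume accumulated over times 1..t-1."""
    return v_cancel_sell_t > factor * sum(v_matched_history)


history = [2.0, 1.5, 0.5]  # V^matched(i) for i = 1..t-1, summing to 4.0
print(spoofing_condition_b(25.0, history))  # 25.0 > 20.0 -> True
print(spoofing_condition_b(20.0, history))  # 20.0 > 20.0 is False (strict inequality)
```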

3. High volume matched at the last buying or selling order

The volume of matched buy orders at time t is more than 50% of the cumulative volume of matched orders from the starting point of the window up to time t - 1.

$\text{ spoofing condition C } = \{\text{'True', } V^{ matched}_{buy}(t) > 0.5 \times (\sum_{i = 1}^{t - 1} V^{ matched}(i)) \text{; 'False', otherwise}\}$
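
Condition C has the same shape as condition B with a factor of 0.5. The sketch below again assumes the earlier matched volumes are held in a list; the names and sample values are illustrative only.

```python
def spoofing_condition_c(v_matched_buy_t, v_matched_history, factor=0.5):
    """Return True when the matched buy volume at time t exceeds
    `factor` times the matched volume accumulated over times 1..t-1."""
    return v_matched_buy_t > factor * sum(v_matched_history)


history = [2.0, 1.5, 0.5]  # V^matched(i) for i = 1..t-1, summing to 4.0
print(spoofing_condition_c(3.0, history))  # 3.0 > 2.0 -> True
print(spoofing_condition_c(1.0, history))  # 1.0 > 2.0 -> False
```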

When all three conditions are satisfied, a trading event is more likely to be a spoof trading event. The conditional logic for the neural network model combines the conditions with a logical AND, so that all three must hold to affirm spoof trading.

$\text{ spoofing } = \text{(spoofing condition A) }\wedge\text{ (spoofing condition B) }\wedge\text{ (spoofing condition C) }$
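
Put together, the rule can be swept over a sliding window while keeping a running sum of matched volume. This is a sketch under stated assumptions: the dictionary keys, the function name, and the sample data are hypothetical, and the handling of the first step (where the running sum is zero) is a choice made here, not something the source specifies.

```python
def detect_spoofing(steps, price_threshold=0.5, cancel_factor=5.0, match_factor=0.5):
    """Return one boolean per time step: True when conditions A, B and C all hold."""
    flags = []
    cumulative_matched = 0.0  # sum of V^matched(i) for i < t
    for step in steps:
        gap = abs((step["p_cancel_sell"] - step["p_matched_buy"]) / step["p_cancel_sell"])
        cond_a = gap < price_threshold
        cond_b = step["v_cancel_sell"] > cancel_factor * cumulative_matched
        cond_c = step["v_matched_buy"] > match_factor * cumulative_matched
        flags.append(cond_a and cond_b and cond_c)
        cumulative_matched += step["v_matched"]  # history for the next step
    return flags


# Made-up window: only the last step combines a large cancellation with a large matched buy.
steps = [
    {"p_cancel_sell": 10000.0, "p_matched_buy": 10000.0,
     "v_cancel_sell": 0.0, "v_matched_buy": 1.0, "v_matched": 2.0},
    {"p_cancel_sell": 10010.0, "p_matched_buy": 10000.0,
     "v_cancel_sell": 1.0, "v_matched_buy": 0.5, "v_matched": 2.0},
    {"p_cancel_sell": 9990.0, "p_matched_buy": 9985.0,
     "v_cancel_sell": 50.0, "v_matched_buy": 3.0, "v_matched": 3.0},
]
print(detect_spoofing(steps))  # [False, False, True]
```

Because the running sum starts at zero, any positive volumes at the very first step would satisfy conditions B and C trivially, so a practical detector would probably need a warm-up period before trusting the flags.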

The neural network learns from the training data and separates the events into classes. The non-linearity from the activation functions (ReLU, tanh, etc.) comes in handy to create better decision boundaries and identify the spoofed events more accurately. The output of the neural network is a binary classification variable that indicates, for each event, whether it is predicted to be manipulated.

$\text{ probability of spoofing } = \{\text{1 : Manipulated event, 0: otherwise }\}$
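
To give a feel for how a ReLU non-linearity and a thresholded output can express this kind of decision, here is a deliberately tiny network with hand-picked weights that reproduces the AND of the three conditions. It is a toy sketch, not a trained model: the weights, the function names, and the 0.5 cut-off are all assumptions chosen for illustration.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tiny_network(cond_a, cond_b, cond_c):
    """One ReLU hidden unit plus a sigmoid output, with hand-picked weights."""
    inputs = [float(cond_a), float(cond_b), float(cond_c)]
    # Hidden unit: weights of 1 and a bias of -2, so it is positive only if all inputs are 1.
    hidden = relu(sum(inputs) - 2.0)
    # Output unit: squashes the hidden activation into a score between 0 and 1.
    score = sigmoid(10.0 * hidden - 5.0)
    label = 1 if score >= 0.5 else 0  # 1: manipulated event, 0: otherwise
    return score, label


print(tiny_network(True, True, True)[1])   # 1
print(tiny_network(True, True, False)[1])  # 0
```

A trained network would learn its weights from labelled order book data instead of having them set by hand.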

Stay tuned to this technical blog series to find out more on WatermelonBlock’s engineering culture and other exciting articles that we have for you!

References:

1. “Stock Price Manipulation Detection Based On Mathematical Models”, Teema Leangarun et al.