Sybil Detection on the Scroll Blockchain
Welcome to our guide on how we detect Sybil addresses on the Scroll blockchain. In simple terms, our system looks for clusters of addresses that appear to act in unison, a common pattern in fraud and Sybil activity, where one actor controls many addresses. Although we plan to support many blockchains in the future, this guide focuses on how we work with Scroll.
1. Data Extraction
Where the Data Comes From: We begin by pulling on-chain transaction data from the scroll.transactions table on Dune. This gives us all the raw transactions that occur on the Scroll blockchain.
What We’re Looking For: The raw data includes details such as the source of funds, timestamps, amounts, and contract addresses. We then use this data to carry out the two main steps of our analysis.
2. Step 1 – Bulk Transfers (Cluster Aggregations)
What It Means: This step groups addresses that have received funds from the same source. The idea is that if a single source sends funds to more than five addresses (ignoring known exchanges, decentralized exchanges, and contract addresses), it might be a coordinated effort to farm tokens or airdrops, which is classic Sybil behavior.
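The grouping rule above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the function name, the (source, recipient) input shape, and the excluded_addresses set are hypothetical, and the sketch assumes that an excluded address is skipped whether it appears as the sender or the recipient.

```python
from collections import defaultdict

# "More than five addresses" means at least six recipients.
MIN_CLUSTER_SIZE = 6


def find_bulk_transfer_clusters(transfers, excluded_addresses):
    """Group recipient addresses by the source that funded them.

    transfers: iterable of (source, recipient) pairs (hypothetical shape).
    excluded_addresses: known exchanges, DEXs, and contracts to ignore.
    Assumption: a transfer is skipped if either side is excluded.
    """
    recipients_by_source = defaultdict(set)
    for source, recipient in transfers:
        if source in excluded_addresses or recipient in excluded_addresses:
            continue
        recipients_by_source[source].add(recipient)

    # Keep only sources that funded enough distinct addresses.
    return {
        source: recipients
        for source, recipients in recipients_by_source.items()
        if len(recipients) >= MIN_CLUSTER_SIZE
    }
```

Using a set per source means repeated transfers to the same recipient count once, so a cluster is sized by distinct addresses rather than by transaction count.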
How We Analyze Bulk Transfers
A. Temporal Patterns
- Batching by Time: We sort the transaction timestamps for a given cluster and group those that occur within a short window (e.g. within 30 minutes) as a “batch.”
- What It Tells Us: More and larger batches suggest that transactions are happening in coordinated bursts, a red flag for Sybil activity.
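As a rough sketch of the batching idea, the Python below starts a new batch whenever the gap to the previous transaction exceeds the window. Measuring the window between consecutive transactions (rather than from the first transaction of the batch) is an assumption, as are the function name and the min_size parameter, which drops isolated transactions.

```python
from datetime import datetime, timedelta


def batch_by_time(timestamps, window=timedelta(minutes=30), min_size=2):
    """Split timestamps into batches of closely spaced transactions.

    Assumption: a batch continues while each transaction follows the
    previous one by at most `window`; groups smaller than `min_size`
    are not reported as batches.
    """
    groups = []
    for ts in sorted(timestamps):
        if groups and ts - groups[-1][-1] <= window:
            groups[-1].append(ts)   # close enough: extend current batch
        else:
            groups.append([ts])     # gap too large: start a new batch
    return [g for g in groups if len(g) >= min_size]


base = datetime(2024, 1, 1, 12, 0)
times = [base + timedelta(minutes=m) for m in (0, 10, 25, 120, 130, 400)]
batches = batch_by_time(times)
largest = max(len(b) for b in batches)
```

In this example the first three transactions form one batch, the next two form a second, and the final transaction stands alone, so it is not counted.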
B. Amount Patterns
- Grouping Similar Amounts: We check if many transactions have nearly the same value (within a 15% difference) and group them together.
- Low Variety is Suspicious: If most transactions involve nearly identical amounts (e.g., 80% of transactions sharing a similar amount), it hints at an automated process.
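One way to sketch the amount check is to sort the amounts and start a new group whenever a value is more than 15% above the smallest value in the current group. The function names are hypothetical, and both the grouping rule and the definition of the "unique" ratio (number of groups divided by number of transactions) are assumptions made for illustration.

```python
def group_similar_amounts(amounts, tolerance=0.15):
    """Group amounts that lie within `tolerance` of the group's smallest value."""
    groups = []
    for amount in sorted(amounts):
        if groups and amount <= groups[-1][0] * (1 + tolerance):
            groups[-1].append(amount)
        else:
            groups.append([amount])
    return groups


def amount_pattern_summary(amounts, tolerance=0.15):
    """Summarize how repetitive the amounts are.

    Assumption: "unique" is measured as distinct groups / transactions.
    """
    groups = group_similar_amounts(amounts, tolerance)
    similar = sum(len(g) for g in groups if len(g) > 1)
    unique_ratio = len(groups) / len(amounts) if amounts else 0.0
    return {"similar_transactions": similar, "unique_ratio": unique_ratio}


amounts = [1.0, 1.0, 1.05, 1.0, 1.1, 1.0, 1.0, 1.0, 5.0, 20.0]
summary = amount_pattern_summary(amounts)
```

Here eight of the ten amounts land in one group, while 5.0 and 20.0 each form a group of their own.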
C. Putting It Together
The analysis produces a description (e.g., “Found 3 transaction batches, Largest batch contains 10 transactions, Found 8 similar amount transactions and Low variety in transaction amounts: 20.00% unique”) and calculates a risk score based on these findings. The higher the risk score from bulk transfers, the more likely the cluster is involved in Sybil activity.
3. Step 2 – User Behavior
What It Means: Once clusters are identified through bulk transfers, we analyze how each address in the cluster behaves. This step helps us decide if the addresses are acting like independent users or as part of a coordinated Sybil attack.
How We Analyze User Behavior
A. Method Patterns (Contract Interaction)
We count which transaction methods (contract function calls) are used. For example, if 70% of the transactions in a cluster call the same method, that indicates coordinated behavior, likely driven by automation.
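The method count can be illustrated with a small Python sketch. The function name and the method labels are hypothetical; the 70% threshold comes from the example above.

```python
from collections import Counter

DOMINANT_METHOD_THRESHOLD = 0.7  # the 70% example from the text


def dominant_method_share(methods):
    """Return the most-used method and its share of all transactions."""
    if not methods:
        return None, 0.0
    method, count = Counter(methods).most_common(1)[0]
    return method, count / len(methods)


# Hypothetical method labels for the transactions of one cluster.
methods = ["swap"] * 7 + ["bridge"] * 2 + ["approve"]
method, share = dominant_method_share(methods)
is_coordinated = share >= DOMINANT_METHOD_THRESHOLD
```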
B. Value Patterns
We check if many transactions have nearly the same value. If the transactions involve almost identical amounts, it is a strong indicator that the behavior is automated.
C. Time Patterns Within the Cluster
We review the time gaps between transactions. If multiple transactions occur almost simultaneously (e.g., within 5 minutes and in sequences of at least 3), it suggests automated behavior. We calculate metrics such as the average time gap, the longest sequence, and the total number of sequences.
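The timing metrics could be computed along these lines. This is a sketch under stated assumptions: the function name is hypothetical, a "sequence" is taken to be a run of transactions in which each one follows the previous by at most 5 minutes, and the average gap is taken over all consecutive transactions in the cluster, not only those inside sequences.

```python
from datetime import datetime, timedelta


def rapid_sequence_metrics(timestamps, max_gap=timedelta(minutes=5), min_length=3):
    """Compute average gap, longest rapid sequence, and sequence count."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return {"average_gap_seconds": None, "longest_sequence": 0, "sequence_count": 0}

    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = sum(g.total_seconds() for g in gaps) / len(gaps)

    # Walk the gaps and record the length of each run of rapid transactions.
    run_lengths = []
    run = 1
    for gap in gaps:
        if gap <= max_gap:
            run += 1
        else:
            run_lengths.append(run)
            run = 1
    run_lengths.append(run)

    sequences = [n for n in run_lengths if n >= min_length]
    return {
        "average_gap_seconds": average_gap,
        "longest_sequence": max(sequences, default=0),
        "sequence_count": len(sequences),
    }


base = datetime(2024, 1, 1)
times = [base + timedelta(minutes=m) for m in (0, 2, 4, 60, 61, 62, 63, 200)]
metrics = rapid_sequence_metrics(times)
```

In this example the transactions at minutes 0, 2, and 4 form one sequence and those at minutes 60 to 63 form another, while the last transaction is isolated.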
D. Overall Behavior Risk
Each aspect (method, value, and timing) receives a risk score, and these scores are weighted and combined into an overall behavior risk score for the cluster.
4. The Final Sybil Score & Risk Ranges
After processing both the Bulk Transfers and User Behavior, we compute an overall Sybil Score for each cluster. This score indicates how likely it is that the addresses in the cluster are Sybil.
Calculation: The final Sybil Score is derived by weighting the two parts of our analysis. We assign a weight of 0.3 to the Bulk Transfers score and a weight of 0.7 to the User Behavior score:
final_sybil_score = (bulk_transfers_score * 0.3) + (user_behavior_score * 0.7)
Risk Ranges:
- No-Risk: The address is not part of any cluster.
- Low-Risk: Sybil score between 0 and 30.
- Moderate-Risk: Sybil score between 30 and 60.
- High-Risk: Sybil score above 60.
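The formula and ranges above translate directly into code. In the sketch below, the function names are hypothetical, None stands for an address that belongs to no cluster, and the handling of the boundaries is an assumption: the ranges as written overlap at 30 and 60, so the sketch treats exactly 30 and exactly 60 as Moderate-Risk.

```python
BULK_TRANSFERS_WEIGHT = 0.3
USER_BEHAVIOR_WEIGHT = 0.7


def final_sybil_score(bulk_transfers_score, user_behavior_score):
    """Weighted combination of the two analysis scores."""
    return (bulk_transfers_score * BULK_TRANSFERS_WEIGHT
            + user_behavior_score * USER_BEHAVIOR_WEIGHT)


def risk_level(score):
    """Map a Sybil score to a risk range.

    Assumptions: `None` means the address is in no cluster, and scores
    of exactly 30 or 60 fall into Moderate-Risk.
    """
    if score is None:
        return "No-Risk"
    if score < 30:
        return "Low-Risk"
    if score <= 60:
        return "Moderate-Risk"
    return "High-Risk"


score = final_sybil_score(bulk_transfers_score=80, user_behavior_score=50)
level = risk_level(score)
```

With a Bulk Transfers score of 80 and a User Behavior score of 50, the result is 80 * 0.3 + 50 * 0.7 = 59, which falls in the Moderate-Risk range. Because User Behavior carries the larger weight, a cluster with a high bulk-transfer score but ordinary behavior does not reach High-Risk on its own.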
This scoring system allows us to quickly assess the risk level of a cluster and flag those that may indicate coordinated Sybil behavior.
5. Bringing It All Together
From Data to Decision:
- Data Storage: We store on-chain transactions from Scroll (via Dune data) in our database.
- Analysis – Step 1: Bulk Transfers groups addresses funded by the same source into clusters, then analyzes each cluster's transaction timings and amounts.
- Analysis – Step 2: User Behavior dives into how addresses in these clusters act—examining methods, values, targets, and timing.
- Final Sybil Score: By combining the two analyses (with weights of 0.3 for Bulk Transfers and 0.7 for User Behavior), we generate a Sybil Score that helps determine the risk level.
This systematic approach enables our platform to flag suspicious activity and protect users from Sybil attacks.
Conclusion
Our Sybil detection system on the Scroll blockchain is built on two main pillars:
- Bulk Transfers (Cluster Aggregations): We identify groups of addresses that receive funds from the same source.
- User Behavior: We analyze the transaction patterns within these clusters, including method usage, transaction values, targets, and timing.
By combining these insights and calculating a weighted Sybil Score, we can reliably flag clusters as No-Risk, Low-Risk, Moderate-Risk, or High-Risk. This helps maintain a secure and trustworthy ecosystem on the blockchain.
If you have any questions or need further clarification, please feel free to reach out!