A Novel Methodology for HYIP Operators' Bitcoin Addresses Identification

Kentaroh Toyoda, P. Takis Mathiopoulos, Tomoaki Ohtsuki

Research output: Contribution to journal (Article)

Abstract

Bitcoin is one of the most popular decentralized cryptocurrencies to date. However, it has been widely reported that it can be used for investment scams known as high yield investment programs (HYIPs). Although identifying HYIP operators' Bitcoin addresses is very important from a security forensics point of view, no systematic method that reliably collects and identifies such addresses has so far been proposed in the open technical literature. This paper introduces a novel methodology that efficiently collects a large number of HYIP operators' Bitcoin addresses and identifies them through a novel analysis of their transaction histories. First, a scraping-based method is proposed that collects more than 2,000 HYIP operators' Bitcoin addresses from the Internet, thus providing a large number of HYIP samples. Second, a supervised machine learning technique that classifies whether or not a given Bitcoin address belongs to an HYIP operator is introduced and its performance is evaluated. The proposed classification method is based on two novel approaches: a rate conversion technique that mitigates the effect of Bitcoin price volatility, and a sampling technique that reduces the computational cost without sacrificing classification performance. Extensive computer simulation experiments on close to 30,000 real Bitcoin addresses show that the proposed methodology achieves excellent performance: 95% of the HYIP addresses are correctly classified while the false positive rate stays below 4.9%. To further validate the classifier's ability to detect HYIP operators' Bitcoin addresses, it has also been tested against a recently published list of HYIP addresses, where it maintained its detection accuracy with a 93.75% success rate.
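The abstract names a rate conversion technique, a sampling technique, and two evaluation rates, but does not spell out how any of them is computed. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes rate conversion means valuing each transfer in USD at that day's exchange rate, and that sampling means keeping a uniform random subset of an address's transactions. The names `Tx`, `to_usd`, `sample_txs`, and `tpr_fpr`, as well as the prices used, are hypothetical.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Tx:
    """One transaction seen at an address (hypothetical record layout)."""
    day: date          # calendar day of the transaction
    amount_btc: float  # amount transferred, in BTC


def to_usd(txs, usd_per_btc):
    """Assumed rate conversion: value each transfer at that day's BTC/USD price."""
    return [tx.amount_btc * usd_per_btc[tx.day] for tx in txs]


def sample_txs(txs, max_txs, seed=0):
    """Assumed sampling: keep at most max_txs transactions, chosen uniformly."""
    if len(txs) <= max_txs:
        return list(txs)
    return random.Random(seed).sample(txs, max_txs)


def tpr_fpr(y_true, y_pred):
    """True and false positive rates for binary labels (1 = HYIP operator)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return tp / pos, fp / neg


# Made-up prices: the same 0.5 BTC is worth very different USD amounts.
prices = {date(2017, 1, 1): 1000.0, date(2017, 12, 17): 19000.0}
txs = [Tx(date(2017, 1, 1), 0.5), Tx(date(2017, 12, 17), 0.5)]
print(to_usd(txs, prices))                      # [500.0, 9500.0]
print(tpr_fpr([1, 1, 0, 0], [1, 0, 1, 0]))      # (0.5, 0.5)
```

In these terms, the reported results correspond to a true positive rate of 0.95 with a false positive rate below 0.049. Features for a classifier would then be computed from the converted, sampled amounts rather than from raw BTC values.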

Original language: English
Article number: 8731919
Pages (from-to): 74835-74848
Number of pages: 14
Journal: IEEE Access
Volume: 7
DOIs: https://doi.org/10.1109/ACCESS.2019.2921087
Publication status: Published - 2019 Jan 1

Fingerprint

Classifiers
Learning systems
Internet
Sampling
Computer simulation
Experiments
Electronic money

Keywords

  • Bitcoin
  • blockchain analysis
  • data mining
  • forensics
  • HYIP (high yield investment programs)

ASJC Scopus subject areas

  • Computer Science(all)
  • Materials Science(all)
  • Engineering(all)

Cite this

A Novel Methodology for HYIP Operators' Bitcoin Addresses Identification. / Toyoda, Kentaroh; Takis Mathiopoulos, P.; Ohtsuki, Tomoaki.

In: IEEE Access, Vol. 7, 8731919, 01.01.2019, p. 74835-74848.

Toyoda, Kentaroh ; Takis Mathiopoulos, P. ; Ohtsuki, Tomoaki. / A Novel Methodology for HYIP Operators' Bitcoin Addresses Identification. In: IEEE Access. 2019 ; Vol. 7. pp. 74835-74848.
@article{d280d06d74094aa6a81a8cd1d7fbae08,
title = "A Novel Methodology for HYIP Operators' Bitcoin Addresses Identification",
keywords = "Bitcoin, blockchain analysis, data mining, forensics, HYIP (high yield investment programs)",
author = "Kentaroh Toyoda and {Takis Mathiopoulos}, P. and Tomoaki Ohtsuki",
year = "2019",
month = "1",
day = "1",
doi = "10.1109/ACCESS.2019.2921087",
language = "English",
volume = "7",
pages = "74835--74848",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - A Novel Methodology for HYIP Operators' Bitcoin Addresses Identification

AU - Toyoda, Kentaroh

AU - Takis Mathiopoulos, P.

AU - Ohtsuki, Tomoaki

PY - 2019/1/1

Y1 - 2019/1/1

KW - Bitcoin

KW - blockchain analysis

KW - data mining

KW - forensics

KW - HYIP (high yield investment programs)

UR - http://www.scopus.com/inward/record.url?scp=85068341241&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85068341241&partnerID=8YFLogxK

U2 - 10.1109/ACCESS.2019.2921087

DO - 10.1109/ACCESS.2019.2921087

M3 - Article

VL - 7

SP - 74835

EP - 74848

JO - IEEE Access

JF - IEEE Access

SN - 2169-3536

M1 - 8731919

ER -