Authors:
(1) MICHAEL PACHECO, Software Analysis and Intelligence Lab (SAIL) at Queen’s University, Canada;
(2) GUSTAVO A. OLIVA, Software Analysis and Intelligence Lab (SAIL) at Queen’s University, Canada;
(3) GOPI KRISHNAN RAJBAHADUR, Centre for Software Excellence at Huawei, Canada;
(4) AHMED E. HASSAN, Software Analysis and Intelligence Lab (SAIL) at Queen’s University, Canada.
Table of Links
2 Background and 2.1 Blockchain
4 Computing Transaction Processing Times
5 Data Collection and 5.1 Data Sources
6 Results
6.1 RQ1: How long does it take to process a transaction in Ethereum?
7 Can a simpler model be derived? A post-hoc study
11 Conclusion, Disclaimer, and References
A. COMPUTING TRANSACTION PROCESSING TIMES
B. RQ1: GAS PRICE DISTRIBUTION FOR EACH GAS PRICE CATEGORY
B.1 Sensitivity Analysis on Block Lookback
C. RQ2: SUMMARY OF ACCURACY STATISTICS FOR THE PREDICTION MODELS
D. POST-HOC STUDY: SUMMARY OF ACCURACY STATISTICS FOR THE PREDICTION MODELS
Ethereum is one of the most popular platforms for the development of blockchain-powered applications. These applications are known as ÐApps. When engineering ÐApps, developers need to translate requests captured in the front-end of their application into one or more smart contract transactions. Developers need to pay for these transactions and, the more they pay (i.e., the higher the gas price), the faster the transaction is likely to be processed. Developing cost-effective ÐApps is far from trivial, as developers need to optimize the balance between cost (transaction fees) and user experience (transaction processing times). Online services have been developed to provide transaction issuers (e.g., ÐApp developers) with an estimate of how long transactions will take to be processed given a certain gas price. These estimation services are crucial in the Ethereum domain and several popular wallets such as Metamask rely on them. However, despite their key role, their accuracy has not been empirically investigated so far. In this paper, we quantify the transaction processing times in Ethereum, investigate the relationship between processing times and gas prices, and determine the accuracy of state-of-the-practice estimation services. Our results indicate that transactions are processed in a median of 57s and that 90% of the transactions are processed within 8m. We also show that higher gas prices result in faster transaction processing times with diminishing returns. In particular, we observe no practical difference in processing time between expensive and very expensive transactions. With regard to the accuracy of processing time estimation services, we observe that they are equivalent. However, when stratifying transactions by gas prices, we observe that Etherscan’s Gas Tracker is the most accurate estimation service for very cheap and cheap transactions. EthGasStation’s Gas Price API, in turn, is the most accurate estimation service for regular, expensive, and very expensive transactions. In a post-hoc study, we design a simple linear regression model with only one feature that outperforms the Gas Tracker for very cheap and cheap transactions and that performs as accurately as the EthGasStation model for the remaining categories. Based on our findings, ÐApp developers can make more informed decisions concerning the choice of the gas price of their application-issued transactions.
1 INTRODUCTION
Blockchain is a novel software technology that enables secure and decentralized processing of digital transactions. The first mainstream blockchain platform was Bitcoin, which popularized the concept of cryptocurrencies. In the Bitcoin platform, the cryptocurrency is also called bitcoin (with a lowercase ‘b’) and it is represented by the code BTC. The primary purpose of the Bitcoin platform is to enable the transfer of BTCs among user accounts. That is, the Bitcoin platform exists to process cryptocurrency transactions.
After Bitcoin, many other blockchain platforms have been developed. A special class of these platforms, known as programmable blockchains, has recently gained particular prominence. Unlike Bitcoin, programmable blockchains also host and execute smart contracts in addition to supporting cryptocurrency transactions. A smart contract is a stateful, general-purpose computer program that is typically written in a high-level, object-oriented programming language (e.g., Solidity). One of the most popular programmable blockchain platforms is Ethereum. In Ethereum, a user account can send contract transactions. A contract transaction triggers the execution of a function defined in a smart contract.
Programmable blockchains enable the development of blockchain-powered applications. In the world of Ethereum, these applications are known as decentralized applications or simply ÐApps. Due to the inherent properties of a blockchain (e.g., security, distributed processing), ÐApps have the potential to transform how businesses currently operate. Indeed, this transformational potential yielded a critical demand for professionals with blockchain expertise. A recent report by LinkedIn [4] states: “Last year, cloud computing, artificial intelligence, and analytical reasoning led LinkedIn’s global list of the most in-demand hard skills. They’re all on the list again this year, but a skill we weren’t even looking at a year ago – blockchain – tops the list of most in-demand hard skills for 2020.”
When engineering a ÐApp, developers need to translate requests captured in the frontend of their application into one or more contract transactions. For example, assume that a finance company wishes to develop a bank ÐApp on top of Ethereum. The developers of this bank application would thus need to translate financial operations (e.g., pay a bill) into one or more contract transactions. In order to deliver a pleasant end-user experience, these transactions need to be processed as quickly as possible by the nodes that maintain the blockchain. Yet, the actual amount of time that it takes to process a transaction in Ethereum depends on several factors, including: the gas price set for the transaction (an Ethereum-specific form of transaction fees), the blockchain utilization level (i.e., how large the current workload is), and the transaction prioritization algorithms employed by the miner nodes (i.e., those entities that select and effectively process transactions in the blockchain). In other words, despite the critical role of transaction processing time in the final end-user experience, determining such time is far from trivial.
Of the three aforementioned factors influencing transaction processing time, only the gas price can be controlled by the transaction issuer (e.g., ÐApp developers). In the bank example described above, developers would likely achieve fast transaction processing times by setting a very high gas price. However, setting high gas prices for all transactions would likely render the application economically unviable. In other words, the challenge is to dynamically determine the cheapest gas price that will provide the best possible end-user experience (transaction processing time).
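To make this trade-off concrete, the following minimal sketch picks the cheapest gas price whose estimated processing time stays within a target. It is an illustration only: the function name `cheapest_gas_price`, the estimate table, and the 120-second target are hypothetical and are not taken from any estimation service or from the data analyzed in this paper.

```python
from typing import Dict, Optional


def cheapest_gas_price(estimates: Dict[float, float],
                       max_wait_s: float) -> Optional[float]:
    """Return the lowest gas price whose estimated wait is within max_wait_s.

    `estimates` maps a gas price (in gwei) to an estimated processing
    time (in seconds). Returns None when no price meets the target.
    """
    eligible = [price for price, wait in estimates.items()
                if wait <= max_wait_s]
    return min(eligible) if eligible else None


# Hypothetical estimates (gas price in gwei -> estimated seconds).
estimates = {10.0: 900.0, 20.0: 240.0, 35.0: 60.0, 50.0: 55.0}

# Paying 50 gwei instead of 35 gwei saves only 5 seconds here, so the
# cheaper of the two prices already satisfies a 120-second target.
print(cheapest_gas_price(estimates, max_wait_s=120.0))
```

The quality of such a choice depends entirely on how accurate the estimates are, which is the question this study examines.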
Online services have been developed to support transaction issuers (e.g., ÐApp developers) in choosing appropriate gas prices. Currently, the two most popular services are Etherscan and EthGasStation. These services provide real-time estimates of processing times for a given gas price (or set of gas prices). The rationale is that, by analyzing these estimates, transaction issuers can make a more informed gas price choice. Despite the popularity of these two services, the accuracy of their processing time estimates remains unclear. In addition, Etherscan’s service is proprietary and black box (i.e., its internal workings are undisclosed, preventing an interpretation of how the model operates).
In this study, we empirically investigate transaction processing times in Ethereum. More specifically, we determine the typical processing times, investigate the relationship between processing times and gas prices, and evaluate the accuracy of processing time estimation services. In the following, we list our research questions and the key results that we obtained:
• RQ1: How long does it take to process a transaction in Ethereum? Transactions are processed in a median of 57s. Also, 90% of them are processed within 8m. We also observe that higher gas prices result in faster transaction processing times with diminishing returns (e.g., there is no practical difference between the processing times of expensive and very expensive transactions).
• RQ2: How accurate are the estimates for transaction processing time provided by Etherscan and EthGasStation? Etherscan and EthGasStation use two prediction models each. Our results show that the four studied models are equivalent, with a median absolute error in the range of 40.8s to 58.2s. However, in a stratified analysis based on gas price categories, we observe that the Etherscan Gas Tracker (proprietary, black box) is the most accurate model for very cheap and cheap transactions. The EthGasStation Gas Price API, in turn, is the most accurate model for the remaining price categories (regular, expensive, and very expensive).
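The accuracy measure mentioned in RQ2, the median absolute error, and its stratification by gas price category can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names and the sample records are hypothetical, and the numbers are made up rather than drawn from the study's dataset.

```python
from collections import defaultdict
from statistics import median


def median_absolute_error(records):
    """Median of |actual - predicted| over all records.

    Each record is (category, actual_seconds, predicted_seconds).
    """
    return median([abs(actual - predicted)
                   for _, actual, predicted in records])


def stratified_median_absolute_error(records):
    """Median absolute error computed separately per gas price category."""
    errors = defaultdict(list)
    for category, actual, predicted in records:
        errors[category].append(abs(actual - predicted))
    return {category: median(errs) for category, errs in errors.items()}


# Hypothetical records: (gas price category, actual s, predicted s).
records = [
    ("cheap", 300, 250),
    ("cheap", 500, 620),
    ("cheap", 200, 210),
    ("expensive", 30, 45),
    ("expensive", 40, 38),
]

overall = median_absolute_error(records)
by_category = stratified_median_absolute_error(records)
print(overall, by_category)
```

A single overall figure can hide large differences between categories, which is why a model that looks equivalent to the others overall may still be clearly better or worse for, say, cheap transactions.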
Based on the results from RQ1 and RQ2, we conducted a post-hoc study in which we aimed to design a simple and interpretable model that was at least as accurate as the existing top-performing models. In this study, we show that a simple linear regression model that builds on only one feature is able to perform at least as accurately as the top-performing models for all price categories. In particular, our model outperforms the Etherscan Gas Tracker for very cheap and cheap transactions, which are the transactions whose processing times are the most difficult to predict.
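For readers unfamiliar with the model class, a linear regression with a single feature can be fitted in closed form, as in the minimal sketch below. It assumes ordinary least squares, which this section does not confirm as the fitting procedure used in the post-hoc study. The section also does not name the feature, so `feature` is a placeholder and the data points are invented for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Fit y ~ intercept + slope * x by ordinary least squares.

    Returns (slope, intercept). Requires at least two distinct x values.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, value):
    """Predicted response for a single feature value."""
    return intercept + slope * value


# Invented data: one placeholder feature and a response in seconds.
feature = [1.0, 2.0, 3.0, 4.0]
processing_time_s = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(feature, processing_time_s)
print(slope, intercept, predict(slope, intercept, 5.0))
```

With only a slope and an intercept, such a model is easy to inspect, which contrasts with a proprietary, black-box estimator.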
The results of our paper support ÐApp developers in making more informed decisions concerning the gas price of their application-issued transactions. Furthermore, our descriptive statistics of processing times in Ethereum should be of value to those who are considering the development of ÐApps on top of this blockchain platform.
The contributions of our study are as follows: (i) designing an approach to collect transaction processing times, which enables future studies in the area, (ii) characterizing transaction processing times for different gas price categories (very cheap, cheap, regular, expensive, and very expensive), (iii) determining how accurate the existing processing time estimation services are, and (iv) developing a model that outperforms the existing estimation services. A supplementary package with the data analyzed in this study is made available online[1].
Paper organization. This paper is organized as follows. Section 2 introduces the key concepts that we use throughout this paper. Section 3 describes a motivating example, which clarifies how a practitioner can use a processing time estimation service in practice. Section 4 describes how we compute transaction processing times. Section 5 outlines the data collection process of our study. Section 6 presents the motivation, approach, and our findings for each research question. Section 7 presents our post-hoc study. Section 8 discusses the implications of our findings. Section 9 presents related work. Section 10 discusses the threats to the validity of our findings. Finally, Section 11 concludes the study.
This paper is
[1] https://bit.ly/2YzfcKt. For the final version of the paper, the data will be made available through a permanent link to a GitHub repository.