Running a Full Node: A Comprehensive Guide to Maintaining a Blockchain Ledger Locally

26 Jul 2024

Abstract and I. Introduction

II. Expectations of Blockchain Accessibility

III. Approach I: Maintain a Ledger Locally - Run a Full Node

IV. Approach II: Query a Third-Party Ledger-Node-As-A-Service (NAAS)

V. Approach III: Light Node - External Query & Local Verification

VI. Concluding Remark and References

III. APPROACH I: MAINTAIN A LEDGER LOCALLY - RUN A FULL NODE

For any user, the most straightforward way to access a blockchain is to maintain its ledger on their own computer. A user does so by running a full node, which constantly interacts with the rest of the blockchain network to keep track of the blocks on the ledger, even if the user does not participate in profit-generating activities such as mining or staking. Blockchain protocols usually encourage this behavior, as it is frequently cited as helping to strengthen the peer-to-peer network and improve its security guarantees [27].

Most blockchain node software provides an optional interface that lets users fetch refined information from the node they own. The interface is usually implemented with a mechanism known as Remote Procedure Call (RPC) [42]. In this request-response mechanism, the user’s node acts as a server: it accumulates information about the ledger from the network and answers the user’s queries whenever they initiate one (see Fig. 2). Most RPC queries are transmitted as JSON messages between the client and the node. For example, in Bitcoin, a user can query the balance of their wallet by calling getbalance, which is sent as a JSON request via HTTP to the node [8].

The result from the node is a numerical value indicating the balance available in the loaded wallet.
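The request-response exchange described above can be sketched in Python as follows. This is a minimal illustration, not the node's reference client: the helper names build_rpc_request and call_node, the request id "example", and the URL and credentials in the commented usage are all assumptions chosen for the example. Only the getbalance method name and the JSON-over-HTTP transport come from the text.

```python
import base64
import json
import urllib.request


def build_rpc_request(method, params=None, request_id="example"):
    """Build the JSON body of a Bitcoin-style JSON-RPC call (hypothetical helper)."""
    return {
        "jsonrpc": "1.0",
        "id": request_id,          # echoed back by the node so replies can be matched
        "method": method,          # e.g. "getbalance"
        "params": params if params is not None else [],
    }


def call_node(url, user, password, method, params=None):
    """POST one JSON-RPC request to a node and return the "result" field of the reply."""
    body = json.dumps(build_rpc_request(method, params)).encode("utf-8")
    # HTTP basic authentication with the RPC credentials configured on the node.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=body,  # supplying data makes this a POST request
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        reply = json.loads(response.read().decode("utf-8"))
    if reply.get("error") is not None:
        raise RuntimeError(f"RPC error: {reply['error']}")
    return reply["result"]


# Usage against a locally running node (assumed address and credentials):
# balance = call_node("http://127.0.0.1:8332/", "rpcuser", "rpcpassword", "getbalance")
# print(balance)
```

Keeping the payload builder separate from the network call makes it possible to inspect the exact JSON that would be sent without having a node running.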

Fig. 2. Example of a user interacting with a full node they own via RPC. The full node constantly synchronizes with nodes in the blockchain network to keep track of information on the ledger. Different clients use RPC to interact with the node to query information about the blockchain.

A. Cost Consideration

As the data on any blockchain continues to grow, running a node capable of accessing and storing all information on the ledger (known as a full node in most blockchains) is becoming increasingly expensive. Official documentation and some unofficial tutorials outline the minimum system specifications for Bitcoin [7], Ethereum [26], Solana [50], Zcash [21], Litecoin [5], Ripple [53], Dash [18], and Monero [14] (see Table I).

In particular, in blockchains based heavily on smart contracts, such as Ethereum and Solana, full nodes have high minimum system requirements: a casual user is unlikely to have an always-on server with 700 GB of disk space, 8 GB of RAM, and a 10 Mbps Internet connection dedicated to running an Ethereum full node.
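To make the comparison concrete, the sketch below checks a machine against the Ethereum full node figures quoted above (700 GB disk, 8 GB RAM, 10 Mbps). The NodeRequirement type and the shortfalls and free_disk_gb helpers are hypothetical names introduced for illustration. Only free disk space is measured here; RAM and bandwidth are assumed to be supplied by the user, since the Python standard library does not report them portably.

```python
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRequirement:
    disk_gb: float
    ram_gb: float
    bandwidth_mbps: float


# Figures quoted in the text for an Ethereum full node.
ETHEREUM_FULL_NODE = NodeRequirement(disk_gb=700, ram_gb=8, bandwidth_mbps=10)


def shortfalls(available, required):
    """Return {resource: missing amount} for every resource below the requirement."""
    gaps = {}
    for field in ("disk_gb", "ram_gb", "bandwidth_mbps"):
        missing = getattr(required, field) - getattr(available, field)
        if missing > 0:
            gaps[field] = missing
    return gaps


def free_disk_gb(path="."):
    """Free disk space at `path` in decimal gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


# Usage: RAM and bandwidth values are entered by hand (assumed, not measured).
# this_machine = NodeRequirement(disk_gb=free_disk_gb(), ram_gb=16, bandwidth_mbps=50)
# print(shortfalls(this_machine, ETHEREUM_FULL_NODE))
```

An empty result means the machine meets all three quoted minimums; it says nothing about whether the machine can stay online full time, which is the other half of the requirement.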

Meanwhile, if the user wants to access more advanced information, they may need additional indexes built on top of a full node, which require even more hardware. For instance, an Ethereum archive node that can access all historical states takes more than 12 TB of disk space to run [26]. Solana’s RPC nodes, which provide RPC functionality, also require a 16-core CPU in addition to more than 256 GB of RAM [50]. Further accessibility features, such as subscribing to certain events, may require the node to expose a WebSocket interface, which further increases the hardware demand. Therefore, in most scenarios, casual users outsource at least part of the ledger to an external system and query that system when necessary.
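As an illustration of what event subscription looks like at the message level, the sketch below builds an Ethereum-style eth_subscribe request and distinguishes pushed notifications from ordinary replies. It only constructs and parses JSON; opening the WebSocket connection itself is left out, and the helper names subscription_request and is_notification are assumptions made for this example.

```python
import json
from itertools import count

_request_ids = count(1)  # simple increasing ids so replies can be matched to requests


def subscription_request(event, *extra_params):
    """Build an eth_subscribe message asking the node to push `event` updates."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "eth_subscribe",
        "params": [event, *extra_params],
    })


def is_notification(raw_message):
    """True if the node pushed this message for a subscription.

    Pushed notifications use the method "eth_subscription" and carry no "id",
    whereas the reply to a request echoes the request's "id".
    """
    message = json.loads(raw_message)
    return message.get("method") == "eth_subscription" and "id" not in message


# Usage: the returned string would be sent over an already open WebSocket.
# print(subscription_request("newHeads"))
```

Unlike the one-shot request-response pattern, the node must keep the connection open and track each subscriber, which is one reason such features add to the node's resource demand.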

B. Open Questions

Traditional research on the cost of running blockchain nodes usually focuses on mining [51] [54], since mining is the only profit-generating activity in proof-of-work blockchains. Because the cost of running a full node is negligible compared to the cost of mining equipment, it is usually overlooked.

While proof-of-stake systems involve no mining, the amount a validator must stake to participate usually outweighs the cost of running a full node, so the cost of running the system itself receives little attention in economic analyses [28].

As a result, although the most secure way for a user to interact with a blockchain is to run their own full node locally, the cost of keeping a full node online is not well studied. In particular, we note that for some popular payment systems, such as Zcash and Litecoin, little information is available on the hardware required to run a full node that synchronizes with the network. Studies are needed to determine the exact cost for users to access blockchains in the most secure way.

On the other hand, there have been attempts to run Bitcoin and Litecoin full nodes on a Raspberry Pi 3, eliminating the need for a traditional server entirely [37] [38]. It remains to be determined whether this approach can be adopted more widely to reduce the cost of running a full node.

Authors:

(1) Zhongtang Luo, Purdue University (luo401@purdue.edu);

(2) Rohan Murukutla, Supra (r.murukutla@supraoracles.com);

(3) Aniket Kate, Purdue University / Supra (aniket@purdue.edu).


This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.