CODE_COLOR: CODE_YELLOW_MAINNET
RELEASE_VERSION: 2.0.0
PROTOCOL_UPGRADE: TRUE
DATABASE_UPGRADE: TRUE
SECURITY_UPGRADE: FALSE
Protocol Upgrade Voting
This release contains two protocol upgrades. The first protocol upgrade introduces the congestion control feature, and the second upgrade introduces a set of protocol changes for stateless validation. There are therefore two relevant protocol version voting dates. This doesn't require any action on the part of node operators, as your node will follow the current protocol version's rules as long as you upgrade. Validator nodes will vote first for protocol version 68, and then for version 69 after 68 is adopted.
Voting for the upgrade to protocol version 68 starts on Monday 2024-08-12 14:00:00 UTC, and voting for the upgrade to protocol version 69 starts on Wednesday 2024-08-14 04:00:00 UTC.
Congestion Control
Relevant NEP: near/NEPs#539
RPC API change: #11419
Congestion control limits the transactions and receipts included in chunks when a shard is under load. Before this protocol change, when a shard was under heavy load, transactions could take a long time to be included because a very long queue of delayed receipts took up chunk space. Once that queue grew to an unreasonable size, it was difficult for users to tell when things would settle down and their transactions would be included again.
The congestion control feature adds fields to the chunk header that indicate the load on the corresponding shard in terms of delayed and outgoing receipt gas, and the protocol rules limit transactions and receipts destined for congested shards, mitigating the problem of delayed receipt queues growing unbounded to unreasonable levels.
Things to note
Transactions can be rejected with a “ShardCongested” error if the chain is heavily congested.
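One way a client might cope with this rejection is to retry with exponential backoff. The following is a minimal sketch, not a documented client API: `submit` is a hypothetical callable standing in for whatever RPC client you use, and detecting the error by the text "ShardCongested" in the exception message is an assumption about how your client surfaces it.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def submit_with_backoff(
    submit: Callable[[], T],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `submit`, retrying with exponential backoff on ShardCongested.

    `submit` is a hypothetical function that sends one transaction and raises
    an exception on rejection. Matching on the text "ShardCongested" is an
    assumption about how the client reports the error.
    """
    for attempt in range(max_attempts):
        try:
            return submit()
        except Exception as err:
            congested = "ShardCongested" in str(err)
            last_attempt = attempt == max_attempts - 1
            if not congested or last_attempt:
                raise  # unrelated error, or retries exhausted
            sleep(base_delay_s * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists only so the delay can be replaced in tests; errors other than congestion are re-raised immediately.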
Stateless Validation
Relevant NEP: near/NEPs#509
This protocol upgrade is the main reason for designating this release as version 2.0.0. It alters the roles of validators within the network and fundamentally changes how state transitions are validated. Currently, when a node (validator or not) receives a new block and downloads all its chunks, it applies the transactions and receipts contained in those chunks by looking up and modifying values in the state of each shard. This means nodes need to keep the full state of all shards on disk. The stateless validation protocol upgrade allows nodes to check the validity of chunks without keeping a local copy of the full state.
There will be two roles after the upgrade: chunk producers and chunk validators. Chunk producers hold the state of the shard they are assigned to in memory and produce chunks for that shard. Chunk validators do not maintain any state locally and rotate on every block to verify the state transition of a shard. A node may act as both a chunk producer and a chunk validator. Whether a node is a chunk producer depends on stake: the top 100 nodes by stake are chunk producers.
Changes to validator roles
Here is a high-level summary of the changes to validator roles.
TL;DR:
- All validators:
  - Serve as chunk validators
  - Additional network usage is expected
  - Less disk usage is expected
- Top 100 validators by stake:
  - In addition to chunk validator role, serve as chunk/block producers and block validators
  - Do not have to track all shards
  - State of tracked shard is loaded into memory
- All other validators:
  - Don’t need to track any shard
Overview of different validator roles
- [NO CHANGES] Block producers (top 100 validators):
  - (Same as today) Produce blocks, (new) including waiting for chunk endorsements
  - (Same as today) Maintain chunk parts (i.e. participate in data availability based on Reed-Solomon erasure encoding)
  - (Same as today) Should have a high barrier of entry for security reasons, to make block double signing harder.
  - (New) No longer require tracking any shard
- Chunk producers (top 100 validators):
  - (Same as today) Produce chunks
  - (Same as today) Must track the shard they produce chunks for
  - (New) Produce and distribute state witnesses to chunk validators
- [New] Chunk validators (all validators):
  - Validate state witnesses and send chunk endorsements to block producers
  - Do not require tracking any shard
  - Must collectively have a majority of all the validator stake, to ensure the security of chunk validation.
Reward calculation
Reward calculation will remain the same and be based on online ratio.
Changes to contract requirements
- New size limits were introduced to limit the size of the state witness:
  - Transaction size must not exceed 1.5MB (used to be 4MB).
  - Receipt size must not exceed 4MB (this may later be reduced to 1.5MB).
  - A receipt is not allowed to read more than 4MB of state during a single function call.
  - Total size of transactions included in two consecutive chunks is limited to 4MB.
- Cross-shard bandwidth is more limited than before.
- Increased cost of sending receipts:
  - Sending a receipt to a shard (cross contract call) now costs 50 TGas / MiB. Previously the cost varied between 2 and 18 TGas / MiB.
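To get a feel for the new receipt-sending cost, the arithmetic can be sketched as below. It covers only the size-proportional figure quoted above (50 TGas / MiB, previously 2 to 18 TGas / MiB); any fixed per-receipt or execution fees are ignored, and the function and constant names are illustrative rather than taken from nearcore.

```python
MIB = 1024 * 1024

NEW_TGAS_PER_MIB = 50        # from the release notes
OLD_TGAS_PER_MIB = (2, 18)   # previous range, from the release notes


def receipt_send_cost_tgas(size_bytes: int, tgas_per_mib: float = NEW_TGAS_PER_MIB) -> float:
    """Size-proportional cost, in TGas, of sending a receipt of `size_bytes`.

    Illustrative only: fixed fees and execution costs are not included.
    """
    return tgas_per_mib * size_bytes / MIB


if __name__ == "__main__":
    size = 512 * 1024  # a 0.5 MiB receipt
    old_low, old_high = (receipt_send_cost_tgas(size, rate) for rate in OLD_TGAS_PER_MIB)
    print(f"before: {old_low} to {old_high} TGas")
    print(f"after:  {receipt_send_cost_tgas(size)} TGas")
```

Contracts that pass large payloads across shards are the ones most affected, since the cost now scales at the higher rate for every MiB sent.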
Config changes
Overview
This section outlines the configuration changes that node operators must apply to their nodes.
Estimated protocol upgrade timeline
- Current Protocol version: 67
- Voting on protocol version 68 upgrade: Monday, August 12, 2024, at 14:00 UTC
- Protocol version 68 upgrade: approximately Tuesday, August 13, 2024, at 17:00 UTC (2 epochs after voting)
- Voting on protocol version 69 upgrade: Wednesday, August 14, 2024, at 04:00 UTC
- Protocol version 69 upgrade: approximately Thursday, August 15, 2024, at 07:00 UTC (2 epochs after voting)
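The estimated upgrade times above follow from adding two epochs to each voting start. A small sketch of that arithmetic, assuming the approximate 13.5-hour epoch length given for the transition period (real epochs vary slightly, so the results are estimates):

```python
from datetime import datetime, timedelta, timezone

EPOCH_LENGTH = timedelta(hours=13.5)  # approximate, per the release notes
EPOCHS_UNTIL_UPGRADE = 2              # upgrade happens 2 epochs after voting


def estimated_upgrade_time(voting_start: datetime) -> datetime:
    """Return the estimated activation time for a protocol version."""
    return voting_start + EPOCHS_UNTIL_UPGRADE * EPOCH_LENGTH


vote_68 = datetime(2024, 8, 12, 14, 0, tzinfo=timezone.utc)
vote_69 = datetime(2024, 8, 14, 4, 0, tzinfo=timezone.utc)

print(estimated_upgrade_time(vote_68).isoformat())  # 2024-08-13T17:00:00+00:00
print(estimated_upgrade_time(vote_69).isoformat())  # 2024-08-15T07:00:00+00:00
```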
About transition period:
The transition period is the time between the initial voting on Monday, August 12, 2024, at 14:00 UTC and the upgrade to Protocol Version 69, which occurs 2 epochs after the second voting. Each epoch is approximately 13.5 hours, but may vary slightly. We anticipate the upgrade will happen around Thursday, August 15, 2024, at 07:00 UTC. To verify if the transition is complete, simply check the protocol version; it should read 69.
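Checking the protocol version can be scripted. The sketch below assumes the node's JSON status response has a top-level `protocol_version` field and that the node's RPC is reachable at `http://localhost:3030/status`; both the URL and the field name are assumptions to adjust for your setup.

```python
import json
from urllib.request import urlopen

TARGET_PROTOCOL_VERSION = 69


def protocol_version(status: dict) -> int:
    """Extract the active protocol version from a status response.

    Assumes a top-level "protocol_version" field.
    """
    return int(status["protocol_version"])


def transition_complete(status: dict, target: int = TARGET_PROTOCOL_VERSION) -> bool:
    """True once the network runs the target protocol version or later."""
    return protocol_version(status) >= target


def fetch_status(url: str = "http://localhost:3030/status") -> dict:
    """Fetch the status document from a node (URL is an assumption)."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

A typical call would be `transition_complete(fetch_status())`, run periodically until it returns `True`.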
When to make configuration change
- After upgrading to neard binary 2.0.0: Make your configuration changes once the neard 2.0.0 binary is applied BUT BEFORE voting for the protocol version 68 upgrade.
- After protocol version 69 upgrade: Make any necessary configuration updates following the protocol version 69 upgrade.
Which section to follow
- Top 100 validators (current block producers): Follow the section for Top-100 validators.
- Validators ranked 100-110 (current chunk-only producers with potential to become block producers): Follow the section for Top-100 Validators.
- Note: validator rankings are based on stake and can change each epoch. If you're within the 100-110 range, consider following the Top-100 section to stay informed about block validation.
- Validators ranked below 110 (current chunk-only producers): Follow the section for Non-top-100 validators.
- RPC/Archival Node Operators: Follow the section for RPC or Archival Nodes.
Top-100 validator
Neard 2.0.0 on protocol 67 (before any protocol upgrade)
- Upgrade node to meet hardware requirements:
  - RAM:
    - MainNet: 80+GB
    - TestNet: 40+GB
  - Network bandwidth:
    - Recommended burst capacity is 1 Gbps up/down.
    - Sustained usage rate will be below 100 Mbps up/down.
- Set store.load_mem_tries_for_tracked_shards in config.json to true.
- Verify that tracked_shards in config.json is set to a non-empty list (e.g. [0]).
- Recommendation: apply network optimization below.
Failover node follows the same configuration and hardware requirements.
Neard 2.0.0 on protocol 68 (after the network upgrades to protocol 68, but before version 69)
No action required.
Neard 2.0.0 on protocol 69 (after two protocol upgrades are done and version 69 is the currently active protocol version)
- (Optional) Decrease the RAM capacity. New requirements:
  - MainNet: 32+GB
  - TestNet: 32+GB
- Set tracked_shards in config.json to [].
- Recommendation: apply network optimization below.
The failover node can be replaced with a smaller one too. It should follow the above configuration, with an extra field added to config.json:
"tracked_shadow_validator": "<validator_id>"
This is because we no longer track all shards and the failover node needs to know which validator’s shard to follow.
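Deriving a failover configuration from the validator's own can be sketched as follows. It assumes `tracked_shards` and `tracked_shadow_validator` are top-level keys of config.json, as the text above suggests; the helper name and the example validator id are hypothetical.

```python
import copy
import json


def failover_config(validator_config: dict, validator_id: str) -> dict:
    """Return a copy of the validator config adapted for a failover node.

    Applies the post-protocol-69 settings described above: an empty
    tracked_shards list plus the shadow-validator field.
    """
    cfg = copy.deepcopy(validator_config)  # leave the caller's dict untouched
    cfg["tracked_shards"] = []
    cfg["tracked_shadow_validator"] = validator_id
    return cfg


if __name__ == "__main__":
    primary = {"tracked_shards": [], "store": {"load_mem_tries_for_tracked_shards": True}}
    # "example-validator.near" is a placeholder account id.
    print(json.dumps(failover_config(primary, "example-validator.near"), indent=2))
```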
Non top-100 validator
Neard 2.0.0 on protocol 67 (before any protocol upgrade)
- Upgrade node to meet hardware requirements:
  - RAM:
    - MainNet: 32+GB
    - TestNet: 32+GB
- Set load_mem_tries_for_tracked_shards in config.json to false.
- Verify that tracked_shards in config.json is set to a non-empty list (e.g. [0]).
- Recommendation: apply network optimization below.
Failover node follows the same configuration and hardware requirements.
Neard 2.0.0 on protocol 68 (after the network upgrades to protocol 68, but before version 69)
No action required.
Neard 2.0.0 on protocol 69 (after two protocol upgrades are done and version 69 is the currently active protocol version)
- Set load_mem_tries_for_tracked_shards in config.json to true.
- Set tracked_shards in config.json to [].
The failover node can be replaced with a new one configured as above. In that case, add an extra field to config.json:
"tracked_shadow_validator": "<validator_id>"
This is because we no longer track all shards and the failover node needs to know which validator’s shard to follow.
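The per-role, per-protocol-version settings from the two validator sections can be checked mechanically. This is a rough sketch rather than an official tool: it assumes `load_mem_tries_for_tracked_shards` lives under the `store` object and `tracked_shards` at the top level of config.json, and it treats a missing mem-tries setting as a problem instead of guessing neard's default.

```python
from typing import List


def config_problems(config: dict, top_100: bool, protocol_version: int) -> List[str]:
    """List mismatches between `config` and the settings in these notes."""
    problems = []
    after_69 = protocol_version >= 69

    # Top-100 validators enable mem tries from the start; everyone else
    # enables them only once protocol version 69 is active.
    want_mem_tries = top_100 or after_69
    mem_tries = config.get("store", {}).get("load_mem_tries_for_tracked_shards")
    if mem_tries is not want_mem_tries:
        problems.append(
            f"store.load_mem_tries_for_tracked_shards should be {want_mem_tries}, got {mem_tries}"
        )

    tracked = config.get("tracked_shards")
    if after_69 and tracked != []:
        problems.append("tracked_shards should be [] after protocol version 69")
    if not after_69 and not tracked:
        problems.append("tracked_shards should be a non-empty list before protocol version 69")
    return problems
```

An empty result means the configuration matches the notes for that role and protocol version.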
RPC or archival node
Neard 2.0.0 on protocol 67 (before any protocol upgrade)
No action required.
Neard 2.0.0 on protocol 68 (after the network upgrades to protocol 68, but before version 69)
No action required.
Neard 2.0.0 on protocol 69 (after two protocol upgrades are done and version 69 is the currently active protocol version)
No action required.
Hardware requirements
- RAM (for both MainNet and TestNet): 24+GB
- No changes to disk size and network bandwidth requirements
Network optimization
These are permanent recommendations for system tunables which apply for all protocol versions starting immediately. It is recommended to use the persistent configuration file so that the changes are not lost upon system reboot.
If you already have larger maximum buffer sizes configured there is no need to decrease them, but make sure to disable tcp_slow_start_after_idle.
Temporary configuration (does not persist across system reboots):
sudo sysctl -w net.core.rmem_max=8388608
sudo sysctl -w net.core.wmem_max=8388608
sudo sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608'
sudo sysctl -w net.ipv4.tcp_wmem='4096 16384 8388608'
sudo sysctl -w net.ipv4.tcp_slow_start_after_idle=0
Persistent configuration placed in /etc/sysctl.d/local.conf:
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 16384 8388608
net.ipv4.tcp_slow_start_after_idle = 0
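To confirm that the tunables took effect, the current values can be compared against the recommendations. The sketch below reads values from `/proc/sys` on Linux and, following the note above, accepts maximum buffer sizes larger than recommended while requiring `tcp_slow_start_after_idle` to be 0. The function names are illustrative.

```python
from pathlib import Path
from typing import Dict, List

MIN_MAX_BUFFER = 8388608  # recommended maximum buffer size in bytes

TUNABLES = (
    "net.core.rmem_max",
    "net.core.wmem_max",
    "net.ipv4.tcp_rmem",
    "net.ipv4.tcp_wmem",
    "net.ipv4.tcp_slow_start_after_idle",
)


def read_sysctl(name: str) -> str:
    """Read one tunable from /proc/sys, normalizing whitespace (Linux only)."""
    path = Path("/proc/sys") / name.replace(".", "/")
    return " ".join(path.read_text().split())


def tunable_problems(current: Dict[str, str]) -> List[str]:
    """Return a description of every tunable that falls short."""
    problems = []
    for name in ("net.core.rmem_max", "net.core.wmem_max"):
        if int(current[name]) < MIN_MAX_BUFFER:
            problems.append(f"{name} is below {MIN_MAX_BUFFER}")
    for name in ("net.ipv4.tcp_rmem", "net.ipv4.tcp_wmem"):
        # Value format is "min default max"; only the max is checked.
        if int(current[name].split()[2]) < MIN_MAX_BUFFER:
            problems.append(f"{name} max is below {MIN_MAX_BUFFER}")
    if int(current["net.ipv4.tcp_slow_start_after_idle"]) != 0:
        problems.append("net.ipv4.tcp_slow_start_after_idle should be 0")
    return problems


if __name__ == "__main__":
    values = {name: read_sysctl(name) for name in TUNABLES}
    print(tunable_problems(values) or "all tunables look fine")
```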
Failover procedures
Typical incident recovery plan on mainnet with a primary validator node and a secondary node
- Copy over node_key.json to secondary node.
- Copy over validator_key.json to secondary node.
- Stop the primary node.
- Stop the secondary node.
- Restart the secondary node.
Validator key hot swap
This is a new method that can be used to move the validator key to another node with near-zero downtime:
- Copy over validator_key.json to secondary node.
- Stop the primary node.
- Send a SIGHUP signal to the new node (without restart).
The new node will pick up the validator key without restart and it will start validating.
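The final step of the hot swap, sending SIGHUP to the running process on the new node, could be scripted as in this minimal sketch. How the process id of neard is obtained (service manager, pid file, process listing) depends on your deployment, so it is left as a parameter; the function name is illustrative.

```python
import os
import signal


def request_key_reload(pid: int) -> None:
    """Send SIGHUP to the process with the given pid (POSIX only).

    Per the notes above, the node reacts by picking up the validator key
    without a restart. The caller needs permission to signal the process.
    """
    os.kill(pid, signal.SIGHUP)
```

Run this only after validator_key.json has been copied to the new node and the primary node has been stopped, so that two nodes never validate with the same key at once.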