We’re delighted to have reached another huge milestone on our development journey approaching mainnet launch – the completion of Stage 1 of our Counter Stake incentivised testnet initiative! The network has now been comprehensively tested and the latest iteration of our Counter Stake testnet (CS-2006) has been running perfectly for over two weeks – in short, Stage 1 has been a huge success!
Throughout the six iterations of our Counter Stake testnet, beginning with CS-2001 and ending with CS-2006, we have worked closely with our Counter Stake validators to identify and fix a wide array of issues to achieve a functional running network. We are now ready to progress onto Stage 2 in the coming days, the final stage of Counter Stake.
At this juncture we wanted to provide a summary to the Matic community on exactly how Stage 1 went. In this article, we will cover each iteration of the Counter Stake testnet to explain exactly what took place during each phase, which functionalities have been tested, and which issues have been identified and overcome.
Where do we go from here
We will soon commence Stage 2, an adversarial stage in which validators will be incentivized to attempt to break the network. This stage is crucial, as it creates an environment in which validators dedicate their best efforts to breaking the test network and exposing any remaining critical bugs in the system. In addition, unlike in Stage 1, we will provide frequent updates on Stage 2 to our entire community, so that everyone is kept regularly informed of its progress.
Highlight: 250+ validators registered an active interest in participating.
Our Counter Stake Stage 1 journey began with our first staking testnet, CS-2001. The amount of interest from the community in joining Counter Stake as validators was incredible; we received over 1000 registrations overall. Of those, 250+ actively registered to take part in the initial phase of Stage 1.
On February 13th, we rolled out our first testnet for Stage 1, named CS-2001. We initially began with 10 Matic Foundation nodes in addition to 10 external validators, and continued to add more external validators in a staggered manner.
After running uninterrupted for around one week, we encountered a trivial bug and decided to implement a fix and move onto our next testnet iteration, in order to avoid further issues for the network and the validators.
We released our first Counter Stake Weekly Update to keep our validator community abreast of the current situation.
Highlight: Our Staking Dashboard was deployed for the first time.
We corrected the issue found with CS-2001, added some improvements to the network and launched CS-2002, the next iteration of our Counter Stake testnet.
We again started adding validators to the network in a staggered manner, and reached a total of 51 validators on this testnet before we encountered our next network issue:
“We encountered a moderate-severity bug in Heimdall Core which resulted in validator nodes stopping syncing after a particular block number (137849). This issue was encountered by several new validators that were attempting to set up their node.” – Delroy Bosco, Product Manager, Matic Network
We attempted to fix this by releasing an updated version of Heimdall and asking the community to perform certain steps to apply it on their end. This fix worked for a while; however, we later encountered the same issue at a different block height and decided to work on a solution to deploy with the next CS testnet iteration.
CS-2002 was also the phase during which we first released our Staking Dashboard to the validators, and we revealed a sneak peek of it to the wider community.
We were thrilled by the positive response from the validator community regarding the dashboard! Although this was only the first version released to the public, the feedback was overwhelmingly positive. In particular, the dashboard was praised for its attractive UI and ease of use.
Using the dashboard, the validators were able to move away from the command line and interact with the network seamlessly, in an ultra-streamlined manner; it was now possible to complete the staking process and become a validator in just a few clicks of a button. In addition, validators could now easily monitor their performance statistics including uptime, checkpoints signed, rewards earned etc.
The validators were hugely valuable in providing feedback for minor improvements to the interface. With their input, we continued to improve the Staking Dashboard throughout the next iterations of the testnet.
Highlight: Our first bug fix on a running network was conducted.
After fixing the Heimdall Core bug identified in CS-2002, we launched the CS-2003 testnet and added all 51 of the validators which were active on the previous testnet.
We eventually encountered a medium-severity issue which was causing a failure on the Bor node for some of the validators. This issue was the result of our setting a fixed hard gas limit for a transaction: as more validators were added to the network, the gas required crossed this maximum threshold.
We had a fix identified and ready for this, and we asked the community to follow a set of instructions to uninstall Heimdall and install an updated version. Once completed, this activated changes which solved the overall gas restriction issue. This was the first time we provided a fix on a running network, where other validators had to apply the hotfix in order to keep their nodes running continuously.
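To make the failure mode concrete, here is a minimal sketch of how a hard-coded gas limit can be outgrown as the validator set expands. It assumes, purely for illustration, that the gas a transaction needs grows linearly with the number of validators. Every constant and function name below is hypothetical and not taken from the Heimdall or Bor code.

```python
# All values are hypothetical and chosen only to show the shape of the problem.
BASE_GAS = 100_000           # assumed fixed overhead of the transaction
GAS_PER_VALIDATOR = 10_000   # assumed extra gas needed per validator
FIXED_GAS_LIMIT = 1_000_000  # assumed hard-coded gas limit


def gas_required(validator_count: int) -> int:
    """Gas the transaction needs under the linear-growth assumption."""
    return BASE_GAS + GAS_PER_VALIDATOR * validator_count


def fits_fixed_limit(validator_count: int) -> bool:
    """True while the required gas stays within the hard-coded limit."""
    return gas_required(validator_count) <= FIXED_GAS_LIMIT


def max_validators_under_limit() -> int:
    """Largest validator count the fixed limit can accommodate."""
    return (FIXED_GAS_LIMIT - BASE_GAS) // GAS_PER_VALIDATOR
```

Under these made-up numbers the transaction fits comfortably with a small validator set and starts failing once the set grows past the point the fixed limit was sized for, which is why a limit that looks safe at launch can break later.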
A total of 80+ validators were onboarded to CS-2003. However, as the number of validators increased, we started encountering a new issue: a bug causing non-determinism, which led to different nodes computing different app hashes depending on when they received a transaction. We quickly identified a workaround and decided to launch the next iteration of the CS testnet.
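The sketch below illustrates in general terms why this kind of bug splits a network. It is not the actual Heimdall logic: the state layout, function names and values are assumptions made for the example. If any part of a node's state depends on local information, such as when that node happened to see a transaction, two honest nodes processing the same transactions end up with different state and therefore different app hashes.

```python
import hashlib


def app_hash(state: dict) -> str:
    """Hash a canonical (key-sorted) encoding of the application state."""
    encoded = ";".join(f"{key}={state[key]}" for key in sorted(state))
    return hashlib.sha256(encoded.encode()).hexdigest()


def apply_nondeterministic(txs: list, local_arrival_time: dict) -> dict:
    """Buggy transition: records when THIS node received each transaction."""
    return {tx: local_arrival_time[tx] for tx in txs}


def apply_deterministic(txs: list, block_height: int) -> dict:
    """Safe transition: uses only data every node agrees on (the block)."""
    return {tx: block_height for tx in txs}


txs = ["tx1", "tx2"]

# Hypothetical arrival times: node B received tx2 slightly later than node A.
node_a = apply_nondeterministic(txs, {"tx1": 100, "tx2": 101})
node_b = apply_nondeterministic(txs, {"tx1": 100, "tx2": 105})
hashes_diverge = app_hash(node_a) != app_hash(node_b)

# With a transition based only on agreed block data, the hashes match.
hashes_agree = app_hash(apply_deterministic(txs, 7)) == app_hash(
    apply_deterministic(txs, 7)
)
```

The general remedy is to derive state only from data that consensus has already agreed on, so that every node replaying the same blocks reaches the same hash.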
Highlight: First testnet with 100 active validators.
CS-2004 was our most stable network to date. This was largely the result of rigorous internal testing which was carried out by the Matic team for almost a week. The new testnet was then rolled out to the community, and we onboarded all of the 81 validators which had successfully staked with us on the previous testnet.
Most of the validators were now well versed in the setup process. Within a day or two, almost all of the validators had set up their nodes and staked successfully. We were thrilled to see community members assisting each other with trivial issues when the Matic team was unavailable for support. We added more validators to the network until we reached 100. Running a testnet with 100 validators was incredible!
One week into the running of CS-2004, we began our mini-contests for the validator community, to give them the opportunity to earn some additional rewards for their excellent work. The first of these was the Re-stake and Claim mini-contest, which saw an overwhelming response from the community – more than 50 participants took part. Congratulations to the 30 winners of the first contest!
CS-2004 remained stable and we commenced our second mini-contest, Delegation. This contest ran for 4 days and saw a huge influx of 80+ entries from participants. The winners were announced here.
During this phase, we invited a total of 130+ validators to participate in the network, with 100 of them becoming active validators. Some of the inactive validators used their own initiative to replace validators (those who were not running their nodes correctly or had downtime due to issues in the node code) via the Staking Dashboard, which was excellent to see!
However, shortly after the Delegation contest, we encountered conditions that again led to non-determinism amongst the nodes, resulting in different app hashes, and decided to move onto a fresh CS testnet to sidestep the issue temporarily until a permanent fix was put in place.
CS-2005 was short-lived: the persistent issue of different nodes getting different app hashes was encountered again, and we decided to stop the testnet immediately in order to find an effective solution.
We updated the validator community with details of the problem:
There is a known issue in the network node code for a circular dependency between Heimdall and Bor. Simply speaking, there can be instances when Heimdall reads data from Bor, but does not receive it due to failed network calls – and vice versa with Bor reading data from Heimdall.
Shortly after, we provided an update on the solution we were working on:
Basically this issue is most certainly caused because Heimdall node does certain network calls to the Ethereum staking smart contracts. If these fail, in certain cases, the node does not know how to recover and panics. We are working on a sidechannel implementation to fix this – but essentially if Goerli connectivity is ensured, you probably won’t face this issue. However, we have made an internal implementation ready and we are currently testing the solution. Once the fix is ready for production we will roll it out to the testnet.
Although the sidechannel fix was still under development, we decided to restart the network with adequate protections against failures of certain network calls to the Matic contracts on Goerli, in order to test other features such as validator replacement and signer node changes.
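As a general illustration of guarding a node against a flaky remote call, here is a minimal retry-with-backoff sketch. It is neither the sidechannel implementation nor the specific protections we deployed. The names `call_with_retry` and `CallFailed` and the retry parameters are hypothetical. The idea is simply that a failed call is retried a bounded number of times and then surfaces as a well-defined error the caller can handle, rather than crashing the node.

```python
import time


class CallFailed(Exception):
    """Raised when a remote call keeps failing after all retry attempts."""


def call_with_retry(fetch, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run fetch(), retrying on ConnectionError with exponential backoff.

    fetch      -- zero-argument callable performing the remote call
    attempts   -- maximum number of tries before giving up
    base_delay -- delay in seconds before the first retry, doubled each time
    sleep      -- injectable sleep function (handy for testing)
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except ConnectionError as err:
            last_error = err
            # Only wait if another attempt is still to come.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    # Surface a clear, catchable error instead of letting the node panic.
    raise CallFailed(f"remote call failed after {attempts} attempts") from last_error
```

A caller would wrap each remote read in `call_with_retry` and decide explicitly what to do on `CallFailed`, for example skipping the current round and trying again later.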
Highlight: Achieving a persistently running network.
Once again, we conducted rigorous internal testing on CS-2006 before rolling out to the validator community, to ensure all components were functioning correctly. We gradually began adding validators to the new network, until we reached 100 active validators.
CS-2006 has now been running successfully without any issues for more than two weeks, since April 13th. No bugs have been found during this phase. Two weeks of testing is much longer than the traditional timeframe to declare the network a success, but we wanted to be as confident as possible that we have now achieved a persistently running network.
Concluding thoughts: onto the final stage
We’d like to say a huge thank you to all of the Counter Stake participants who have contributed to making Stage 1 a huge success! Together, we’ve identified and fixed an array of issues, refined the Staking Dashboard and we are now ready to move onto Stage 2.
Stage 2 will see our validators essentially attempt to ‘break’ the network, in order to ensure with certainty that all bugs and issues have been identified and rectified during Stage 1. This is the final hurdle before our mainnet is ready to launch, and we couldn’t be more excited!
Mainnet launch is now clearly within our sights.