fix: add panic catch on operator calling FFI#1196
Merged
entropidelic merged 11 commits intostagingfrom Oct 18, 2024
Merged
Conversation
added 7 commits
October 7, 2024 17:40
MarcosNicolau
requested changes
Oct 9, 2024
Collaborator
|
Would be nice a review from @Oppen |
Contributor
|
This sucessfully caught panics for the merkle tree verification and sp1 verification but failed when I tested with risc0. |
Contributor
Author
Could you check again, please? |
MarcosNicolau
approved these changes
Oct 16, 2024
Member
MarcosNicolau
left a comment
There was a problem hiding this comment.
It is working for me now! 🚀
Oppen
reviewed
Oct 16, 2024
Contributor
There was a problem hiding this comment.
This seems to have been committed by accident.
Oppen
reviewed
Oct 16, 2024
Oppen
approved these changes
Oct 16, 2024
Contributor
Oppen
left a comment
There was a problem hiding this comment.
Works for me and the code looks good. Just remove the leftover binaries.
PatStiles
approved these changes
Oct 17, 2024
Contributor
PatStiles
left a comment
There was a problem hiding this comment.
Caught all panics on a local testnet.
uri-99
approved these changes
Oct 18, 2024
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
We have some calls to external Rust functions on operator Golang code. If any of those functions panics, that error is propagated to the golang application and it fails completly. We want to catch those errors so this doesn't happen.
Changes
recover()onVerifySp1Proof,VerifyRiscZeroReceiptandVerifyMerkleTreeBatch.catch_unwindfrom Rust side forverify_sp1_proof_ffi,verify_risc_zero_receipt_ffiandverify_merkle_tree_batch_ffi.verify_sp1_proof_ffi,verify_risc_zero_receipt_ffiandverify_merkle_tree_batch_ffito returni32instead ofboolto inform about errors.How to test
In order to test properly, we first have to take the
stagingbranch as baseline and add apanic!()in the following FFI Rust functions (one at a time):verify_merkle_tree_batch_ffionmerkle_tree/lib/src/lib.rsverify_sp1_proof_ffionsp1/lib/src/lib.rsverify_risc_zero_receipt_ffionrisc_zero/lib/src/lib.rsThis way, we will make the golang operator crash on a call to any of these FFIs, stoping execution immediately.
Note: Explicit instructions on how to run the operator are provided in the last section of this PR. The operator must be rebuilt after modifying the FFI functions
verify_merkle_tree_batch_ffiwe have to run the hole system and start sending transactions with:make batcher_send_infinite_sp1verify_sp1_proof_ffiwe have to run the hole system and start sending transactions with:make batcher_send_infinite_sp1verify_risc_zero_receipt_ffiwe have to run the hole system and start sending transactions with:make batcher_send_risc0_burst BURST_SIZE=5Once you have tested the crashing behaviour, you can now switch to
fix/operator-ffi-panic-catchbranch and repeat the process, but this time the panics will have to be added to theinnerversions of the provided FFI functions, which are:inner_verify_merkle_tree_batch_ffionmerkle_tree/lib/src/lib.rsinner_verify_sp1_proof_ffionsp1/lib/src/lib.rsinner_verify_risc_zero_receipt_ffionrisc_zero/lib/src/lib.rsThese should also be tested individually.
After executing you should see some errors being logged in the operator, providing information about the rust code panics. If the operator hasn't crashed, then everything is working properly.
How to run the operator
make anvil_start_with_block_timemake aggregator_startmake batcher_start_localmake build_operatormake operator_register_and_start CONFIG_FILE=config-files/config-operator-1.yamlCloses #1174