Restoring a node causes LatestMessagesLogIsMalformed error

Description

Even though rnode behaves correctly when nodes are booted for the first time, trying to restart a node after previously killing it causes errors:

➜ ~ rn_validator 2
running validator nr 2 bootstrapping from rnode://a9a35d1563af6d65be24992bb8318af6adbd0945@127.0.0.1?protocol=10400&discovery=10404
18:03:10.909 [main] INFO c.r.n.configuration.Configuration$ - Trying to load configuration file: /Users/rabbit/.rnode2/rnode.conf
18:03:11.012 [main] INFO c.r.n.configuration.Configuration$ - Starting with profile default
18:03:11.556 [main] INFO coop.rchain.node.Main$ - RChain Node 0.8.3.git1e2dcb05 (1e2dcb052cb1e523c66705290bec0b222f36dec7)
18:03:11.566 [main] INFO coop.rchain.node.NodeEnvironment$ - Using data dir: /Users/rabbit/.rnode2
18:03:12.835 [main] ERROR c.r.b.BlockDagFileStorage$ - Latest messages log is malformed
Exception in thread "main" coop.rchain.blockstorage.LatestMessagesLogIsMalformed$
	at coop.rchain.blockstorage.LatestMessagesLogIsMalformed$.<clinit>(errors.scala)
	at coop.rchain.blockstorage.BlockDagFileStorage$.$anonfun$readLatestMessagesData$6(BlockDagFileStorage.scala:631)
	at cats.data.EitherT.$anonfun$flatMap$1(EitherT.scala:93)
	at monix.eval.internal.TaskRunLoop$.startFull(TaskRunLoop.scala:147)
	at monix.eval.Task$.unsafeStartNow(Task.scala:4249)
	at monix.eval.internal.TaskBracket$BaseStart$$anon$1.onSuccess(TaskBracket.scala:143)
	at monix.eval.internal.TaskRunLoop$.startFull(TaskRunLoop.scala:143)
	at monix.eval.Task$.unsafeStartNow(Task.scala:4249)
	at monix.eval.internal.TaskBracket$BaseStart.apply(TaskBracket.scala:131)
	at monix.eval.internal.TaskBracket$BaseStart.apply(TaskBracket.scala:121)
	at monix.eval.internal.TaskRestartCallback.run(TaskRestartCallback.scala:65)
	at monix.execution.internal.Trampoline.monix$execution$internal$Trampoline$$immediateLoop(Trampoline.scala:66)
	at monix.execution.internal.Trampoline.startLoop(Trampoline.scala:32)
	at monix.execution.schedulers.TrampolineExecutionContext$JVMOptimalTrampoline.startLoop(TrampolineExecutionContext.scala:146)
	at monix.execution.internal.Trampoline.execute(Trampoline.scala:39)
	at monix.execution.schedulers.TrampolineExecutionContext.execute(TrampolineExecutionContext.scala:65)
	at monix.execution.schedulers.BatchingScheduler.execute(BatchingScheduler.scala:50)
	at monix.execution.schedulers.BatchingScheduler.execute$(BatchingScheduler.scala:47)
	at monix.execution.schedulers.ExecutorScheduler.execute(ExecutorScheduler.scala:34)
	at monix.execution.Callback$Base.onSuccess(Callback.scala:229)
	at monix.execution.Callback.apply(Callback.scala:49)
	at monix.execution.Callback.apply(Callback.scala:41)
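
The failure comes from BlockDagFileStorage.readLatestMessagesData, which rejects the on-disk latest-messages log when it cannot be parsed back into whole records. The sketch below illustrates one way a log like this can become unreadable: if the process dies mid-append, a partial trailing record is left behind, and a strict reader refuses the whole file. The record layout (a 32-byte validator key followed by a 32-byte block hash) and all names here are assumptions for illustration, not the actual RChain implementation.

import java.nio.file.{Files, Paths}

object LatestMessagesLogSketch {
  final case class Record(validator: Array[Byte], blockHash: Array[Byte])

  val RecordSize = 64 // 32-byte validator key + 32-byte block hash (assumed layout)

  def read(path: String): Either[String, List[Record]] = {
    val bytes = Files.readAllBytes(Paths.get(path))
    if (bytes.length % RecordSize != 0)
      // A crash between appends leaves a partial trailing entry; a strict
      // reader then rejects the entire log, matching the error above.
      Left("Latest messages log is malformed")
    else
      Right(
        bytes
          .grouped(RecordSize)
          .map(r => Record(r.take(32), r.drop(32)))
          .toList
      )
  }
}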

The way to reproduce it:

1. Run 3 nodes as validators and form a p2p network.
2. Run a few deploys and proposes on each of the nodes.
3. Kill one of the nodes gracefully (Ctrl+C).
4. Bring that node back up.
What is expected: the node reuses its file storages to restore the DAG.
What happens: the exception above is thrown instead.
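
If the malformed log really is just a partial trailing record left by an interrupted write, one conceivable mitigation (an illustration only, not a fix confirmed by this ticket) is to truncate the log back to the last whole record on startup instead of aborting. Assuming the same 64-byte record layout as the sketch above:

import java.io.RandomAccessFile

object LatestMessagesLogRecovery {
  val RecordSize = 64L // same assumed layout as the reader sketch above

  // Drop any partial trailing record so a strict reader can proceed.
  def truncateToWholeRecords(path: String): Unit = {
    val file = new RandomAccessFile(path, "rw")
    try {
      val wholeBytes = (file.length() / RecordSize) * RecordSize
      if (wholeBytes != file.length()) file.setLength(wholeBytes)
    } finally file.close()
  }
}

The trade-off is that at worst the most recent latest-message update is dropped, which the node would then need to re-derive from the block store when it replays blocks.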

Environment

None

Status

Assignee

Daniyar Itegulov

Reporter

Daniyar Itegulov

Priority

Medium

Affects versions

None

Components

Sprint

None

Epic Link

None

Labels

None

Fix versions
