Architecture
L1 Ethereum (self-hosted node, 10.169.21.21)
↓
Sequencer (10.169.20.40) → op-geth + op-node + op-batcher + op-proposer
↓
Replica 1 (10.169.20.41) → op-geth + op-node
Replica 2 (10.169.21.42) → op-geth + op-node
↓
Blockscout explorer (10.169.20.30)
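K2 operations owner
Operations owner: mrchi (SRE lead)
Responsibilities:
- Deploying, monitoring, and maintaining the nodes of the K2 L2 rollup chain
- Node synchronization and state verification after the L1 contracts are deployed
- High availability for the multi-node Sequencer / Replica architecture
- Cross-team collaboration with the smart contract and development teams
- Planning and running failure drills
- Security incident response and execution of contingency plans
Operations milestones:
- 2024-05-15: L1 contract deployment completed (Ethereum mainnet)
- 2024-05-20: primary Sequencer node goes live
- 2024-05-25: both Replica nodes finish syncing
- 2024-05-28: monitoring and alerting connected
- 2024-06-10: first failure drill completed
- 2024-06-15: decentralized Sequencer options evaluated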
Server plan
| Host | Spec (CPU / RAM / disk) | Role |
|---|---|---|
| prod-eth-node | 8 cores / 32 GB / 1500 GB | Self-hosted L1 ETH node |
| prod-sequencer | 4 cores / 16 GB / 1000 GB | Sequencer + Batcher + Proposer |
| prod-replica1 | 4 cores / 16 GB / 1000 GB | Replica node |
| prod-replica2 | 4 cores / 16 GB / 1000 GB | Replica node |
| prod-explorer | 8 cores / 16 GB / 200 GB | Blockscout explorer |
Installing dependencies
# Ubuntu 22.04
apt install -y git curl make jq
# direnv (per-directory environment variables)
apt install direnv -y
cat >> /root/.bashrc <<'EOF'
eval "$(direnv hook bash)"
EOF
source /root/.bashrc
# Go 1.21.3
wget -O /usr/local/src/go1.21.3.linux-amd64.tar.gz https://studygolang.com/dl/golang/go1.21.3.linux-amd64.tar.gz
cd /usr/local/src/
tar -C /usr/local -xzf go1.21.3.linux-amd64.tar.gz
cat >> /root/.bashrc <<'EOF'
export PATH=$PATH:/usr/local/go/bin
export GO111MODULE=on
export GOPROXY=https://goproxy.cn
EOF
source /root/.bashrc
go version
# Node.js v20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt update && sudo apt install -y nodejs
npm install -g yarn pnpm
# Foundry
curl -L https://foundry.paradigm.xyz | bash
source /root/.bashrc
# The installer only sets up foundryup; run it to install forge and cast
foundryup
# just (command runner)
cd /usr/local/src/
curl -s https://api.github.com/repos/casey/just/releases/latest | grep "browser_download_url.*x86_64-unknown-linux-musl.tar.gz" | cut -d '"' -f 4 | wget -i -
tar -xzf just-*-x86_64-unknown-linux-musl.tar.gz
mv just /usr/local/bin/
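Before moving on to the build, it can help to confirm that everything installed above is actually on PATH. The following is a minimal sketch, not part of the original procedure; `missing_tools` is a made-up helper name.

```shell
#!/usr/bin/env bash
# Print the name of every command in the argument list that is not on PATH.
# `missing_tools` is a hypothetical helper, not part of any OP Stack tooling.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example: check the tools installed in this section.
# An empty result means nothing is missing.
# missing_tools git curl make jq direnv go node yarn pnpm forge cast just
```

Running the commented example in a fresh shell also confirms that the PATH changes written to /root/.bashrc have taken effect.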
Building from source
1. optimism (v1.7.3)
mkdir -pv /data/github/
cd /data/github/
git clone --recurse-submodules https://github.com/ethereum-optimism/optimism.git
cd optimism && git checkout v1.7.3
# Check dependency versions
./packages/contracts-bedrock/scripts/getting-started/versions.sh
# Build
make op-node op-batcher op-proposer
2. op-contracts (v1.3.0)
cd /data/github/
git clone https://github.com/ethereum-optimism/optimism.git op-contract
cd op-contract && git checkout op-contracts/v1.3.0
pnpm install && pnpm build
3. op-geth (v1.101311.0)
cd /data/github/
git clone https://github.com/ethereum-optimism/op-geth.git
cd op-geth && git checkout v1.101311.0
make geth
Account generation and funding
cd /data/github/optimism
./packages/contracts-bedrock/scripts/getting-started/wallets.sh
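The script generates four key accounts:
- Admin: deploys and upgrades contracts
- Batcher: publishes Sequencer transaction data to L1
- Proposer: publishes L2 transaction results to L1
- Sequencer: signs blocks on the p2p network
Funding requirements (the Sequencer account only signs blocks and sends no L1 transactions, so it needs no funds):
- Admin: 0.5 ETH
- Proposer: 1.0 ETH
- Batcher: 0.5 ETH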
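A quick way to confirm the funding step before deploying is to compare each balance against its minimum. The sketch below is an illustration only: `meets_minimum` is a hypothetical helper, and the commented usage assumes that `.envrc` exports `GS_ADMIN_ADDRESS` and `L1_RPC_URL`, and that `cast balance --ether` prints the balance in ETH.

```shell
#!/usr/bin/env bash
# Succeed when a balance (in ETH) is at least the required minimum (in ETH).
# awk is used because shell arithmetic cannot compare decimal fractions.
meets_minimum() {
  awk -v have="$1" -v min="$2" 'BEGIN { exit !(have + 0 >= min + 0) }'
}

# Usage sketch (needs network access to the L1 node; variable names are assumptions):
# balance=$(cast balance --ether "$GS_ADMIN_ADDRESS" --rpc-url "$L1_RPC_URL")
# meets_minimum "$balance" 0.5 || echo "Admin account is underfunded"
```

The same call with 1.0 and 0.5 covers the Proposer and Batcher accounts.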
L1 contract deployment
Configuring environment variables
cp /data/github/optimism/.envrc.example /data/deploy/.envrc
cd /data/deploy
# Edit .envrc and fill in the account addresses and private keys
# L1_RPC_URL points at the self-hosted ETH node
export L1_RPC_URL=http://10.169.21.21:8545
export L1_BECON_URL=http://10.169.21.21:3500
export L1_RPC_KIND=standard
export DEPLOYMENT_CONTEXT=k2
# Activate
direnv allow .
Generating the configuration file
cd /data/deploy/contracts-bedrock
./scripts/getting-started/config.sh
mv deploy-config/getting-started.json deploy-config/k2.json
Key configuration values:
{
"l1StartingBlockTag": "0x4be9c8afeff60b3e7f3209a7cb26dec1024e3d54f0c5670f2f49e563c8309977",
"l1ChainID": 1,
"l2ChainID": 328527,
"l2BlockTime": 2,
"l1BlockTime": 12,
"maxSequencerDrift": 600,
"sequencerWindowSize": 3600,
"channelTimeout": 300,
"p2pSequencerAddress": "0xBD2978F35B0eE13993848137130882B031a89869",
"batchInboxAddress": "0xff00000000000000000000000000000000328527",
"batchSenderAddress": "0x7dE808312aF134f353feCaC2f5A20d9f13f1434D",
"l2OutputOracleSubmissionInterval": 1800,
"l2OutputOracleStartingBlockNumber": 0,
"l2OutputOracleStartingTimestamp": 1716174432,
"finalizationPeriodSeconds": 604800,
"proxyAdminOwner": "0xxxx",
"baseFeeVaultRecipient": "0xyyy",
"l1FeeVaultRecipient": "0xyyy",
"sequencerFeeVaultRecipient": "0xyyy",
"finalSystemOwner": "0x46bD52E9978F556534B6b1D97F7bd265C2bf7A73",
"superchainConfigGuardian": "0x46bD52E9978F556534B6b1D97F7bd265C2bf7A73",
"baseFeeVaultMinimumWithdrawalAmount": "0x1bc16d674ec80000",
"l1FeeVaultMinimumWithdrawalAmount": "0x1bc16d674ec80000",
"sequencerFeeVaultMinimumWithdrawalAmount": "0x1bc16d674ec80000",
"baseFeeVaultWithdrawalNetwork": 0,
"l1FeeVaultWithdrawalNetwork": 0,
"sequencerFeeVaultWithdrawalNetwork": 0,
"gasPriceOracleOverhead": 2100,
"gasPriceOracleScalar": 1000000,
"enableGovernance": false,
"governanceTokenSymbol": "OP",
"governanceTokenName": "Optimism",
"governanceTokenOwner": "0x46bD52E9978F556534B6b1D97F7bd265C2bf7A73",
"l2GenesisBlockGasLimit": "0x1c9c380",
"l2GenesisBlockBaseFeePerGas": "0x3b9aca00",
"l2GenesisRegolithTimeOffset": "0x0",
"eip1559Denominator": 50,
"eip1559DenominatorCanyon": 250,
"eip1559Elasticity": 6,
"l2GenesisDeltaTimeOffset": null,
"l2GenesisCanyonTimeOffset": "0x0",
"systemConfigStartBlock": 0,
"requiredProtocolVersion": "0x0000000000000000000000000000000000000000000000000000000000000000",
"recommendedProtocolVersion": "0x0000000000000000000000000000000000000000000000000000000000000000",
"faultGameAbsolutePrestate": "0x03c7ae758795765c6664a5d39bf63841c71ff191e9189522bad8ebff5d4eca98",
"faultGameMaxDepth": 44,
"faultGameMaxDuration": 1200,
"faultGameGenesisBlock": 0,
"faultGameGenesisOutputRoot": "0x0000000000000000000000000000000000000000000000000000000000000000",
"faultGameSplitDepth": 14,
"preimageOracleMinProposalSize": 1800000,
"preimageOracleChallengePeriod": 86400
}
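Several of these values are easier to sanity-check once converted to familiar units. The snippet below is a small sketch of that arithmetic using only values from the config above; it assumes, as in the OP Stack deploy config, that `sequencerWindowSize` is counted in L1 blocks and `l2OutputOracleSubmissionInterval` in L2 blocks.

```shell
#!/usr/bin/env bash
# Convert a 0x-prefixed hex quantity to decimal.
hex_to_dec() { printf '%d\n' "$1"; }

hex_to_dec 0x1c9c380            # l2GenesisBlockGasLimit (gas)
hex_to_dec 0x3b9aca00           # l2GenesisBlockBaseFeePerGas (wei)
hex_to_dec 0x1bc16d674ec80000   # fee vault minimum withdrawal (wei)

echo $(( 3600 * 12 / 3600 ))    # sequencerWindowSize: L1 blocks x 12 s, in hours
echo $(( 1800 * 2 / 60 ))       # output submission interval: L2 blocks x 2 s, in minutes
echo $(( 604800 / 86400 ))      # finalizationPeriodSeconds, in days
```

This prints 30000000 (the gas limit), 1000000000 (1 gwei), 2000000000000000000 (2 ETH), followed by 12 hours, 60 minutes, and 7 days.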
Fetching the L1 starting block
cast block finalized --rpc-url $L1_RPC_URL | grep -E "(timestamp|hash|number)"
# Output:
# hash 0x22854ffba21864efe571b6e3195d2e1c1941b41aaf8b8b5f97ea34f1615e9438
# number 5986354
# timestamp 1716790752
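In the getting-started config script, the hash and timestamp of this block are what end up in `l1StartingBlockTag` and `l2OutputOracleStartingTimestamp`. As a sketch, a small awk filter can pull a single field out of the `cast block` output; `block_field` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the value of one field (hash, number, timestamp) from `cast block` output on stdin.
# The field name must match exactly, so "hash" does not pick up "parentHash".
block_field() {
  awk -v key="$1" '$1 == key { print $2; exit }'
}

# Sample input copied from the output shown above.
sample='hash 0x22854ffba21864efe571b6e3195d2e1c1941b41aaf8b8b5f97ea34f1615e9438
number 5986354
timestamp 1716790752'

printf '%s\n' "$sample" | block_field hash
printf '%s\n' "$sample" | block_field timestamp
```

Against a live node the same filter reads from the pipe directly: `cast block finalized --rpc-url $L1_RPC_URL | block_field hash`.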
Deploying the contracts
cd /data/deploy/contracts-bedrock
forge script scripts/Deploy.s.sol:Deploy --private-key $GS_ADMIN_PRIVATE_KEY --broadcast --rpc-url $L1_RPC_URL --with-gas-price 7.00gwei --legacy
Deployment result (key contracts):
{
"L2OutputOracleProxy": "0xaE25ea4Cc185585Fa6abf344F3354bf8207Cd7D1",
"OptimismPortalProxy": "0x872902b91fB2aa95147fCDc346a567B7970DBe47",
"L1StandardBridgeProxy": "0x8a471dF117E2fEA79DACE93cF5f6dd4217931Db7",
"SystemConfigProxy": "0xD32FbeaC71164D71Aa62640231D022e285472D1E",
"DisputeGameFactoryProxy": "0x0CE5684754c44822B2351617eC561d2aB89bc781"
}
Updating environment variables
cd /data/deploy
cat >> .envrc <<'EOF'
export L2OO_ADDR=0xaE25ea4Cc185585Fa6abf344F3354bf8207Cd7D1
EOF
direnv allow .
L2 initialization
Generating the configuration files
cd /data/deploy/op-node
./op-node genesis l2 --deploy-config /data/deploy/contracts-bedrock/deploy-config/k2.json --l1-deployments /data/deploy/contracts-bedrock/deployments/k2/.deploy --outfile.l2 genesis.json --outfile.rollup rollup.json --l1-rpc $L1_RPC_URL
# Generate the JWT secret
openssl rand -hex 32 > jwt.txt
# Copy to op-geth
cp genesis.json jwt.txt /data/deploy/op-geth/
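op-geth and op-node authenticate to each other with this shared JWT secret: 32 random bytes, hex-encoded. An empty or truncated file is an easy mistake to miss, so here is a minimal sketch of a format check; `valid_jwt_secret` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Succeed when the file holds exactly 64 hex characters (optionally 0x-prefixed),
# which is the shape of the output of `openssl rand -hex 32`.
valid_jwt_secret() {
  tr -d '\n' < "$1" | grep -Eq '^(0x)?[0-9a-fA-F]{64}$'
}

# Usage sketch:
# valid_jwt_secret /data/deploy/op-geth/jwt.txt || echo "jwt.txt is malformed"
```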
Initializing the data directory
cd /data/deploy/op-geth
mkdir -pv datadir
./geth init --datadir=datadir genesis.json
Running the Sequencer node
op-geth
cat > /etc/supervisor/conf.d/prod-k2-op-geth.conf <<'EOF'
[program:prod-k2-op-geth]
process_name=%(program_name)s
command=/bin/bash -c 'source /data/deploy/.envrc && /data/deploy/op-geth/geth --datadir ./datadir --http --http.corsdomain="*" --http.vhosts="*" --http.addr=0.0.0.0 --http.api=web3,debug,eth,txpool,net,engine --ws --ws.addr=0.0.0.0 --ws.port=8546 --syncmode=full --gcmode=archive --nodiscover --maxpeers=0 --networkid=328527 --authrpc.vhosts="*" --authrpc.addr=0.0.0.0 --authrpc.port=8551 --authrpc.jwtsecret=./jwt.txt --rollup.disabletxpoolgossip=true'
user=root
stopsignal=Kill
directory=/data/deploy/op-geth
autostart=true
autorestart=true
startsecs=3
startretries=3
stderr_logfile=/data/logs/op-geth/prod-k2-op-geth-err.log
stdout_logfile=/data/logs/op-geth/prod-k2-op-geth-out.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
EOF
op-node
cat > /etc/supervisor/conf.d/prod-k2-op-node.conf <<'EOF'
[program:prod-k2-op-node]
process_name=%(program_name)s
command=/bin/bash -c 'source /data/deploy/.envrc && /data/deploy/op-node/op-node --l2=ws://localhost:8551 --l2.jwt-secret=./jwt.txt --sequencer.enabled --sequencer.l1-confs=5 --verifier.l1-confs=4 --rollup.config=./rollup.json --rpc.addr=0.0.0.0 --rpc.port=8547 --rpc.enable-admin --p2p.sequencer.key=$GS_SEQUENCER_PRIVATE_KEY --p2p.listen.ip=0.0.0.0 --p2p.listen.tcp=9003 --p2p.listen.udp=9003 --l1=$L1_RPC_URL --l1.beacon=$L1_BECON_URL --l1.rpckind=$L1_RPC_KIND'
user=root
stopsignal=Kill
directory=/data/deploy/op-node
autostart=true
autorestart=true
startsecs=3
startretries=3
stderr_logfile=/data/logs/op-node/prod-k2-op-node-err.log
stdout_logfile=/data/logs/op-node/prod-k2-op-node-out.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
EOF
op-batcher
cat > /etc/supervisor/conf.d/prod-k2-op-batcher.conf <<'EOF'
[program:prod-k2-op-batcher]
command=/bin/bash -c 'source /data/deploy/.envrc && /data/deploy/op-batcher/op-batcher --l2-eth-rpc=http://localhost:8545 --rollup-rpc=http://localhost:8547 --poll-interval=1s --sub-safety-margin=6 --num-confirmations=1 --safe-abort-nonce-too-low-count=3 --resubmission-timeout=30s --rpc.addr=0.0.0.0 --rpc.port=8548 --rpc.enable-admin --l1-eth-rpc=$L1_RPC_URL --private-key=$GS_BATCHER_PRIVATE_KEY'
user=root
stopsignal=Kill
directory=/data/deploy/op-batcher
autostart=true
autorestart=true
startsecs=3
startretries=3
stderr_logfile=/data/logs/op-batcher/prod-k2-op-batcher-err.log
stdout_logfile=/data/logs/op-batcher/prod-k2-op-batcher-out.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
EOF
op-proposer
cat > /etc/supervisor/conf.d/prod-k2-op-proposer.conf <<'EOF'
[program:prod-k2-op-proposer]
command=/bin/bash -c 'source /data/deploy/.envrc && /data/deploy/op-proposer/op-proposer --poll-interval=12s --rpc.port=8560 --rollup-rpc=http://127.0.0.1:8547 --l2oo-address=$L2OO_ADDR --private-key=$GS_PROPOSER_PRIVATE_KEY --l1-eth-rpc=$L1_RPC_URL'
user=root
stopsignal=Kill
directory=/data/deploy/op-proposer
autostart=true
autorestart=true
startsecs=3
startretries=3
stderr_logfile=/data/logs/op-proposer/prod-k2-op-proposer-err.log
stdout_logfile=/data/logs/op-proposer/prod-k2-op-proposer-out.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10
EOF
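Supervisor refuses to load a program whose log directory does not exist, and it only picks up new files in conf.d after a reload. The following sketch covers both steps; it assumes the log layout used in the configs above (`/data/logs/<component>`), and `make_log_dirs` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Create one log directory per component under a base directory.
make_log_dirs() {
  local base="$1" component
  shift
  for component in "$@"; do
    mkdir -p "$base/$component"
  done
}

# Usage sketch on the Sequencer host:
# make_log_dirs /data/logs op-geth op-node op-batcher op-proposer
# supervisorctl reread   # parse the new files in /etc/supervisor/conf.d
# supervisorctl update   # add and start the newly defined programs
# supervisorctl status   # each program should reach RUNNING
```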
Replica node deployment
Copying files
# Copy from the Sequencer node to each Replica
# Replica 1 (10.169.20.41)
scp -P51022 /data/deploy/op-geth/geth root@prod-replica1:/data/deploy/op-geth/
scp -P51022 /data/deploy/op-geth/genesis.json root@prod-replica1:/data/deploy/op-geth/
scp -P51022 /data/deploy/.envrc root@prod-replica1:/data/deploy/
scp -P51022 /data/deploy/op-node/op-node root@prod-replica1:/data/deploy/op-node/
scp -P51022 /data/deploy/op-node/rollup.json root@prod-replica1:/data/deploy/op-node/
# Replica 2 (10.169.21.42)
scp -P51022 /data/deploy/op-geth/geth root@prod-replica2:/data/deploy/op-geth/
scp -P51022 /data/deploy/op-geth/genesis.json root@prod-replica2:/data/deploy/op-geth/
scp -P51022 /data/deploy/.envrc root@prod-replica2:/data/deploy/
scp -P51022 /data/deploy/op-node/op-node root@prod-replica2:/data/deploy/op-node/
scp -P51022 /data/deploy/op-node/rollup.json root@prod-replica2:/data/deploy/op-node/
Replica initialization
# Run on the Replica node
cd /data/deploy/op-geth
mkdir -pv datadir
./geth init --datadir=datadir genesis.json
# Generate the JWT secret
openssl rand -hex 32 > jwt.txt
cp jwt.txt ../op-node/
Replica op-geth (key difference)
[program:replica1-k2-op-geth]
command=/bin/bash -c 'source /data/deploy/.envrc && /data/deploy/op-geth/geth --datadir ./datadir --http --http.corsdomain="*" --http.vhosts="*" --http.addr=0.0.0.0 --http.api=web3,debug,eth,txpool,net,engine --ws --ws.addr=0.0.0.0 --ws.port=8546 --syncmode=full --gcmode=archive --nodiscover --maxpeers=0 --networkid=328527 --authrpc.vhosts="*" --authrpc.addr=0.0.0.0 --authrpc.port=8551 --authrpc.jwtsecret=./jwt.txt --rollup.disabletxpoolgossip=true --rollup.sequencerhttp=$SEQUENCER_URL'
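Key difference: --rollup.sequencerhttp=$SEQUENCER_URL points at the Sequencer node. SEQUENCER_URL is not among the variables set in .envrc earlier, so it has to be exported there on each Replica.
Replica op-node (key difference)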
[program:replica1-k2-op-node]
command=/bin/bash -c 'source /data/deploy/.envrc && /data/deploy/op-node/op-node --l2=ws://localhost:8551 --l2.jwt-secret=./jwt.txt --sequencer.l1-confs=5 --verifier.l1-confs=4 --rollup.config=./rollup.json --rpc.addr=0.0.0.0 --rpc.port=8547 --p2p.static=/ip4/10.169.20.40/tcp/9003/p2p/16Uiu2HAmEosNsVUxM457Fk2VKDGyMuXRXQMQgdzpKiSZqjoeUVeC --p2p.listen.ip=0.0.0.0 --p2p.listen.tcp=9003 --p2p.listen.udp=9003 --rpc.enable-admin --p2p.sequencer.key=$GS_SEQUENCER_PRIVATE_KEY --l1.beacon=$L1_BECON_URL --l1=$L1_RPC_URL --l1.rpckind=$L1_RPC_KIND'
Key difference: --p2p.static specifies the Sequencer node's address and peer ID
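The `--p2p.static` value is a libp2p multiaddr built from the Sequencer's IP, its p2p TCP port, and its peer ID. One way to look up the peer ID is the `opp2p_self` method on the Sequencer's op-node RPC. Treat that call and the `static_peer_addr` helper below as an illustrative sketch rather than the procedure actually used for K2.

```shell
#!/usr/bin/env bash
# Build the multiaddr expected by --p2p.static from IP, TCP port, and peer ID.
# `static_peer_addr` is a hypothetical helper.
static_peer_addr() {
  printf '/ip4/%s/tcp/%s/p2p/%s\n' "$1" "$2" "$3"
}

# Values taken from the Replica op-node command above.
static_peer_addr 10.169.20.40 9003 16Uiu2HAmEosNsVUxM457Fk2VKDGyMuXRXQMQgdzpKiSZqjoeUVeC

# Usage sketch: read the peer ID from the Sequencer's op-node (RPC port 8547 above).
# peer_id=$(curl -s -H "Content-Type: application/json" -X POST \
#   --data '{"jsonrpc":"2.0","method":"opp2p_self","params":[],"id":1}' \
#   http://10.169.20.40:8547 | jq -r .result.peerID)
```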
Verifying synchronization
Checking peer connections
grep -r "connected to peer" /data/logs/op-node/replica1-k2-op-node-out.log
# Output
t=2024-05-31T02:39:40+0000 lvl=info msg="connected to peer" peer=16Uiu2HAmK8oVSbeJjgtfUAVEVfC75GFnSVgMZQ5VfZgm6NX8XWE3 addr=/ip4/10.169.20.40/tcp/9003
Checking sync progress
grep -r "Starting P2P sync client event loop" /data/logs/op-node/replica1-k2-op-node-out.log
Verifying block consistency
# Query the latest local block
local_number=$(curl -s -H "Content-Type: application/json" -X POST --data '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["latest",false],"id":1}' http://localhost:8545 | jq .result.number | tr -d '"')
echo $(($local_number))
# Query the latest block on the Sequencer
master_number=$(curl -s -H "Content-Type: application/json" -X POST --data '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["latest",false],"id":1}' https://rpc.nal.network | jq .result.number | tr -d '"')
echo $(($master_number))
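Matching heights alone do not prove that two nodes are on the same chain; comparing the block hash at one common height does. The sketch below illustrates that extra check. `block_hash_at` and `hashes_match` are hypothetical helpers, and the endpoints are the ones queried above.

```shell
#!/usr/bin/env bash
# Fetch the hash of block $2 (a hex quantity such as 0x10) from JSON-RPC endpoint $1.
block_hash_at() {
  curl -s -H "Content-Type: application/json" -X POST \
    --data "{\"jsonrpc\":\"2.0\",\"method\":\"eth_getBlockByNumber\",\"params\":[\"$2\",false],\"id\":1}" \
    "$1" | jq -r .result.hash
}

# Succeed only when both hashes are present and equal, ignoring letter case.
hashes_match() {
  local a b
  a=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  b=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  [ -n "$a" ] && [ "$a" != "null" ] && [ "$a" = "$b" ]
}

# Usage sketch: compare both nodes at the Replica's current height
# ($local_number is the hex height obtained above).
# hashes_match "$(block_hash_at http://localhost:8545 "$local_number")" \
#              "$(block_hash_at https://rpc.nal.network "$local_number")" \
#   && echo "same chain at $local_number" || echo "hash mismatch at $local_number"
```

Using the Replica's height as the common point avoids asking the Replica for a block it has not imported yet.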
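Handling a testnet fork
Symptoms
The block heights of the primary node and the replica nodes had drifted too far apart for the replicas to resync automatically.
Emergency response
# 1. Switch Nginx traffic to the Sequencer
# Edit the testnet-rpc.nal.network configuration
upstream testnet-rpc {
# Comment out the Replica nodes
#server 10.7.66.87:8545;
#server 10.7.102.137:8545;
# Point only at the Sequencer
server 10.7.113.106:8545;
}
# 2. Copy data from the Sequencer to the Replica
# Run on the Sequencer node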
cd /data/deploy/op-geth/datadir/
mkdir -pv /data/nfs/op/20240813/datadir
cp -r /data/deploy/op-geth/datadir/* /data/nfs/op/20240813/datadir/
# Run on the Replica node
supervisorctl stop test-replica1-op-geth
supervisorctl stop test-replica1-op-node
cd /data/deploy/op-geth/
mv datadir datadir_bad_0813
cp -r /data/nfs/op/20240813/datadir ./
# Restart
supervisorctl start test-replica1-op-geth
supervisorctl start test-replica1-op-node
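The Replica-side file steps can be wrapped in one function so that the bad data directory is always kept aside before the copy. This is a sketch that assumes op-geth and op-node are already stopped; `swap_datadir` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Move <workdir>/datadir aside as datadir_bad_<tag>, then copy in <snapshot>/datadir.
# Refuses to run if the snapshot is missing or the backup name is already taken.
swap_datadir() {
  local workdir="$1" snapshot="$2" tag="$3"
  [ -d "$snapshot/datadir" ] || { echo "snapshot not found: $snapshot/datadir" >&2; return 1; }
  [ ! -e "$workdir/datadir_bad_$tag" ] || { echo "backup already exists" >&2; return 1; }
  mv "$workdir/datadir" "$workdir/datadir_bad_$tag" &&
    cp -r "$snapshot/datadir" "$workdir/"
}

# Usage sketch, between the supervisorctl stop and start commands above:
# swap_datadir /data/deploy/op-geth /data/nfs/op/20240813 0813
```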
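Postmortem
Root cause: the genesis block hash did not match the L2 genesis hash recorded in rollup.json (the upgrade activation timestamps affect the genesis block).
Fix: after starting op-geth, query the actual genesis block hash and rewrite rollup.json with it.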
curl -H "Content-Type: application/json" -X POST --data '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["0x0",false],"id":1}' http://localhost:8545 | jq .result.hash
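This comparison can be scripted: read the L2 genesis hash that op-node expects from rollup.json and compare it with what op-geth reports. A minimal sketch, assuming rollup.json stores the hash under `.genesis.l2.hash`; `rollup_l2_genesis_hash` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Print the L2 genesis hash recorded in a rollup.json file.
rollup_l2_genesis_hash() {
  jq -r '.genesis.l2.hash' "$1"
}

# Usage sketch, run from /data/deploy/op-node with op-geth listening on port 8545:
# expected=$(rollup_l2_genesis_hash rollup.json)
# actual=$(curl -s -H "Content-Type: application/json" -X POST \
#   --data '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["0x0",false],"id":1}' \
#   http://localhost:8545 | jq -r .result.hash)
# [ "$expected" = "$actual" ] && echo "genesis OK" || echo "MISMATCH: $expected vs $actual"
```

Running the check right after `geth init`, before op-node is started for the first time, surfaces a mismatch before any blocks are produced.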
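Improvements:
- Document the genesis block hash verification procedure
- Add a monitoring alert on block height difference (triggered above 100 blocks)
- Take regular full backups of datadir to shorten fork recovery time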
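The height-difference alert can start life as a small script run from cron. The sketch below uses the 100-block threshold from the list above; `block_lag` and `lag_alert` are hypothetical helpers that accept the hex heights returned by eth_getBlockByNumber.

```shell
#!/usr/bin/env bash
# Print how many blocks the replica ($1) is behind the sequencer ($2).
# Bash arithmetic accepts both 0x-prefixed hex and decimal inputs.
block_lag() {
  echo $(( $2 - $1 ))
}

# Succeed (meaning: raise an alert) when the lag exceeds the threshold, default 100.
lag_alert() {
  local lag
  lag=$(block_lag "$1" "$2")
  [ "$lag" -gt "${3:-100}" ]
}

block_lag 0x10 0x7a   # replica at 16, sequencer at 122
lag_alert 0x10 0x7a && echo "ALERT: replica is more than 100 blocks behind"
```

With these sample heights the lag is 106 blocks, so the alert line is printed.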
This article was first published on wr.mrchi.cn. Please credit the source when republishing.