在解析前,需要准备比特币区块数据。可在安装比特币QT客户端后,到指定目录下获取。注意不需要全部同步完成,只需要开启同步后,有第一个区块数据即可。文件路径,分别是:
  • Mac: $HOME/Library/Application Support/Bitcoin/blocks/blkXXXX.dat
  • Windows: %APPDATA%/Bitcoin/blocks/blkXXXX.dat
比特币钱包
通过Go代码的方式,一步一步讲解比特币区块的解析逻辑。 下图是比特币区块数据结构,主要由区块头和交易清单组成。 比特币区块结构
name字节用途
神奇数magic4固定为0xD9B4BEF9,作为区块间的分隔符
区块大小block_size4记录当前区块的大小
区块头信息header80记录当前区块的头部信息
交易总数tx_count1-9当前区块所含记录数
交易清单tx_list*记录当前区块交易细节
其中,神奇数固定为0xD9B4BEF9,作为区块的分隔标识符。而比特币的交易数据是以128Mb为上限记录在dat文件中的。利用此神奇数和区块大小,可将区块数据读取。
先打开文件blk00001.bat

1
2
3
4
5
f, err := os.Open(name)
if err != nil {
log.Fatal(err)
}
defer f.Close()

再按分隔符读取区块数据:

1
2
3
4
5
6
7
8
9
10
for {
mg := ReadUint32(f)
if mg != MARGIC { // MARGIC uint32 = 0xd9b4bef9
Failt(fmt.Errorf("expect margic %x,not %x", margic, mg))
}
//get block size
blockSize := ReadUint32(f)
//get block bytes
blockData := ReadBytes(f, uint64(blockSize))
}

区块头信息

区块头是重要的信息,保障了区块内容的不可篡改。因此一些轻钱包,可不下载完整的区块,而只需获取区块头,便可验证交易。
字节用途
版本号4区块的版本号
上一区块头Hash32上一区块的 SHA256 Hash值,以确保上一区块无法修改,继续加强稳定性
Merkle树根值32交易Merkle树根节点Hash
时间戳4区块生成时间UNIX格式
目标值4本区块头Hash必须>=本值
随机数4一个使得本区块头Hash>=目标值的任意数。
区块头信息中记录上一个区块头HASH值,是对历史交易的再次确认。比特币中只有在6次确认后才交易稳定。
比特币区块链简图
那么此Hash是对头部所有内容经过两次SHA256加密获得,而在内容中含有交易记录的Merkle数根值,可以保障交易内容摘要也在此HASH范围内。注意每个区块的HASH并不存储在当前区块中,其他区块要引用此区块时需对此区块头进行HASH计算。

1
2
3
4
5
6
7
8
9
10
11
12
13
// head
head := decodeHeader(r)
head.Hash = DoubleHashH(blockData[:80]) //头部80个字节
func decodeHeader(r io.Reader) (h BlockHeader) {
h.Version = ReadUint32(r)
h.PrevBlock = ReadHash(r)
h.MerkleRoot = ReadHash(r)
h.Timestamp = time.Unix(int64(ReadUint32(r)), 0)
h.Bits = ReadUint32(r)
h.Nonce = ReadUint32(r)
return
}

此处的Merkle树便是一个Hash二叉树。本区块中所有交易,按时间顺序将单笔交易的HASH值(TXID)两两组成一个二叉树的叶子,根便是最终的Hash值。 通过对每笔交易进行HASH签名,构成唯一的根HASH存放在区块头中,以确保每笔交易不可伪造,不可重复。

交易记录

交易记录便是整个账簿的一页内容,详细记录当下所有比特币交易信息。每个比特币账户下的收支被永久的记录在区块中。
在解析交易记录前,需要获取总记录数,这个总记录数是一个不固定长度数值。

1
2
3
4
5
6
// transaction
txCount := ReadVarInt(r)
txs := make([]*MsgTx, txCount)
for i := uint64(0); i < txCount; i++ {
txs[i] = decoderTX(r)
}

name字节用途
版本version4比特币协议版本
输入数tx_in_count1+区块中所有输入记录数
输入明细tx_in40+区块中所有支出明细
输出数tx_out_count1+区块中所有输出记录数
输出明细tx_out40+区块中所有输出明细
交易时间戳local_time4交易被网络确认的UNIX时间
其中,支出和接收明细是衔接在一块,拥有相同的数据结构。每一单笔收支明细都有自己的编号以供查询。

1
2
3
4
5
6
7
8
9
10
11
12
13
msg := &MsgTx{}
msg.Version = ReadUint32(r)
txInCount := ReadVarInt(r)
msg.TxIn = make([]*TxIn, txInCount)
for i := uint64(0); i < txInCount; i++ {
//input
}
txOutCount := ReadVarInt(r)
msg.TxOut = make([]*TxOut, txOutCount)
for i := uint64(0); i < txOutCount; i++ {
//output
}
msg.LockTime = ReadUint32(r)

输出和输入明细稍有不同,输入:
name字节用途
输出记录Hashhash32本输入资金来源的输出记录HASH
脚本长度length1+交易脚本可变长度值
脚本内容script+本输入的脚本内容,包含签名信息
Sequence4当前无用途的一个固定数值:0xFFFFFFFF

1
2
3
4
5
6
7
8
tx := TxIn{}
// prev
tx.PreviousOutPoint.Hasx = ReadHash(r)
tx.PreviousOutPoint.Index = ReadUint32(r)
scriptLen := ReadVarInt(r)
tx.SignatureScript = ReadBytes(r, scriptLen)
tx.Sequence = ReadUint32(r)

输出信息仅仅包含两项信息,
name字节用途
金额value8接收方获得的金额,单位:聪
公钥脚本长度length1+交易脚本可变长度值
公钥脚本length接收方信息脚本

1
2
3
4
tx := TxOut{}
tx.Value = ReadUint64(r)
scriptLen := ReadVarInt(r)
tx.PkScript = ReadBytes(r, scriptLen)

区块完整数据结构

比特币区块数据以节约空间和简洁为前提,布局为一个区块头和交易明细结构。下图是完整的区块结构。
比特币区块数据结构

协助

实际在解析比特币区块时,为简化代码,提取了些公共方法。以方便错误处理和获取不同类型的数据,具体如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
type Hash [HashSize]byte
const (
HashSize = 32
)
// String returns the Hash as the hexadecimal string of the byte-reversed
// hash.
func (hash Hash) String() string {
for i := 0; i < HashSize/2; i++ {
hash[i], hash[HashSize-1-i] = hash[HashSize-1-i], hash[i]
}
return hex.EncodeToString(hash[:])
}
func DoubleHashH(b []byte) Hash {
first := sha256.Sum256(b)
return Hash(sha256.Sum256(first[:]))
}
func Failt(err error) {
log.Output(2, err.Error())
os.Exit(1)
}

常见的读取单项值方法:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
func ReadHash(r io.Reader) Hash {
var h Hash
_, err := io.ReadFull(r, h[:])
if err != nil {
Failt(err)
}
return h
}
func ReadBytes(r io.Reader, size uint64) []byte {
b := make([]byte, size)
_, err := io.ReadFull(r, b)
if err != nil {
Failt(err)
}
return b
}
func ReadUint32(r io.Reader) uint32 {
var v uint32
err := binary.Read(r, binary.LittleEndian, &v)
if err != nil {
Failt(err)
}
return v
}
func ReadUint64(r io.Reader) uint64 {
var v uint64
err := binary.Read(r, binary.LittleEndian, &v)
if err != nil {
Failt(err)
}
return v
}
// ReadVarInt reads a variable length integer from r and returns it as a uint64
func ReadVarInt(r io.Reader) uint64 {
var discriminant uint8
err := binary.Read(r, binary.LittleEndian, &discriminant)
if err != nil {
Failt(err)
}
var rv uint64
switch discriminant {
case 0xff:
err = binary.Read(r, binary.LittleEndian, &rv)
if err != nil {
Failt(err)
}
// The encoding is not canonical if the value could have been
// encoded using fewer bytes.
min := uint64(0x100000000)
if rv < min {
Failt(fmt.Errorf("var int %x < min value %x", rv, min))
}
case 0xfe:
var v uint32
err = binary.Read(r, binary.LittleEndian, &v)
if err != nil {
Failt(err)
}
rv = uint64(v)
// The encoding is not canonical if the value could have been
// encoded using fewer bytes.
min := uint64(0x10000)
if rv < min {
Failt(fmt.Errorf("var int %x < min value %x", rv, min))
}
case 0xfd:
var v uint16
err = binary.Read(r, binary.LittleEndian, &v)
if err != nil {
Failt(err)
}
rv = uint64(v)
// The encoding is not canonical if the value could have been
// encoded using fewer bytes.
min := uint64(0xfd)
if rv < min {
Failt(fmt.Errorf("var int %x < min value %x", rv, min))
}
default:
rv = uint64(discriminant)
}
return rv
}

结尾

上文中还有许多内容未详细说明,如脚本是什么?Merkle树如何构建的?交易费如何计算?交易中的脚本的用途等。待后续一一写文。
本文完整源代码见:github gist

package main
import (
"bytes"
"crypto/sha256"
"encoding/binary"
"encoding/hex"
"fmt"
"io"
"log"
"math"
"os"
"time"
)
const (
MARGIC uint32 = 0xd9b4bef9
)
func main() {
log.SetFlags(log.Ltime | log.Lmicroseconds | log.Lshortfile)
//filename := "./data/blk_0_to_4.dat"
filename := "/Users/ysqi/Library/Application Support/Bitcoin/blocks/blk00001.dat"
blocks := decodeBlocks(filename, 10)
for _, b := range blocks {
fmt.Printf("block version %x, block hash:%s, prev hash=%s \n", b.Head.Version, b.Head.Hash, b.Head.PrevBlock)
fmt.Printf("\t bits:%x,nonce:%x,number of tx:%d, merkle root:%s \n", b.Head.Bits, b.Head.Nonce, len(b.Transactions), b.Head.MerkleRoot)
var sum uint64
for _, t := range b.Transactions {
for i, input := range t.TxIn {
fmt.Printf("\t input %2d: Sequence:%d,index=%d,phash=%s \n",
i, input.Sequence, input.PreviousOutPoint.Index, input.PreviousOutPoint.Hasx)
}
for i, o := range t.TxOut {
sum += o.Value
fmt.Printf("\t output %d,value=%d \n", i, o.Value)
}
}
fmt.Printf("\t %f \n", ToBit(sum))
}
}
func ToBit(satoshi uint64) float64 {
return float64(satoshi) / math.Pow10(8)
}
func decodeBlocks(name string, needBlocks int) []*Block {
f, err := os.Open(name)
if err != nil {
log.Fatal(err)
}
defer f.Close()
var blooks []*Block
for {
mg := ReadUint32(f)
if mg != MARGIC {
Failt(fmt.Errorf("expect margic %x,not %x", margic, mg))
}
//get block size
blockSize := ReadUint32(f)
//get block bytes
blockData := ReadBytes(f, uint64(blockSize))
// create a byte reader for single block
r := bytes.NewReader(blockData)
// head
head := decodeHeader(r)
head.Hash = DoubleHashH(blockData[:80])
// transaction
txCount := ReadVarInt(r)
txs := make([]*MsgTx, txCount)
for i := uint64(0); i < txCount; i++ {
txs[i] = decoderTX(r)
}
blooks = append(blooks, &Block{
Head: head,
Transactions: txs,
})
if len(blooks) >= needBlocks {
break
}
}
return blooks
}
func decodeHeader(r io.Reader) (h BlockHeader) {
h.Version = ReadUint32(r)
h.PrevBlock = ReadHash(r)
h.MerkleRoot = ReadHash(r)
h.Timestamp = time.Unix(int64(ReadUint32(r)), 0)
h.Bits = ReadUint32(r)
h.Nonce = ReadUint32(r)
return
}
type Block struct {
Head BlockHeader
Transactions []*MsgTx
}
type BlockHeader struct {
Hash Hash
Version uint32
PrevBlock Hash
MerkleRoot Hash
Timestamp time.Time
Bits uint32 // difficulty @see: https://en.bitcoin.it/wiki/Difficulty
Nonce uint32
}
type MsgTx struct {
Version uint32
TxIn []*TxIn
TxOut []*TxOut
LockTime uint32
}
// TxIn defines a bitcoin transaction input.
type TxIn struct {
PreviousOutPoint OutPoint
SignatureScript []byte
Sequence uint32
}
// TxOut defines a bitcoin transaction output.
type TxOut struct {
Value uint64
PkScript []byte
}
type OutPoint struct {
Hasx Hash
Index uint32
}
func decoderTX(r io.Reader) *MsgTx {
msg := &MsgTx{}
msg.Version = ReadUint32(r)
txInCount := ReadVarInt(r)
msg.TxIn = make([]*TxIn, txInCount)
for i := uint64(0); i < txInCount; i++ {
tx := TxIn{}
msg.TxIn[i] = &tx
// prev
tx.PreviousOutPoint.Hasx = ReadHash(r)
tx.PreviousOutPoint.Index = ReadUint32(r)
scriptLen := ReadVarInt(r)
tx.SignatureScript = ReadBytes(r, scriptLen)
tx.Sequence = ReadUint32(r)
}
txOutCount := ReadVarInt(r)
msg.TxOut = make([]*TxOut, txOutCount)
for i := uint64(0); i < txOutCount; i++ {
tx := TxOut{}
msg.TxOut[i] = &tx
tx.Value = ReadUint64(r)
scriptLen := ReadVarInt(r)
tx.PkScript = ReadBytes(r, scriptLen)
}
msg.LockTime = ReadUint32(r)
return msg
}
type Hash [HashSize]byte
const (
HashSize = 32
)
// String returns the Hash as the hexadecimal string of the byte-reversed
// hash.
func (hash Hash) String() string {
for i := 0; i < HashSize/2; i++ {
hash[i], hash[HashSize-1-i] = hash[HashSize-1-i], hash[i]
}
return hex.EncodeToString(hash[:])
}
func DoubleHashH(b []byte) Hash {
first := sha256.Sum256(b)
return Hash(sha256.Sum256(first[:]))
}
func Failt(err error) {
log.Output(2, err.Error())
os.Exit(1)
}
func ReadHash(r io.Reader) Hash {
var h Hash
_, err := io.ReadFull(r, h[:])
if err != nil {
Failt(err)
}
return h
}
func ReadBytes(r io.Reader, size uint64) []byte {
b := make([]byte, size)
_, err := io.ReadFull(r, b)
if err != nil {
Failt(err)
}
return b
}
func ReadUint32(r io.Reader) uint32 {
var v uint32
err := binary.Read(r, binary.LittleEndian, &v)
if err != nil {
Failt(err)
}
return v
}
func ReadUint64(r io.Reader) uint64 {
var v uint64
err := binary.Read(r, binary.LittleEndian, &v)
if err != nil {
Failt(err)
}
return v
}
// ReadVarInt reads a variable length integer from r and returns it as a uint64.
func ReadVarInt(r io.Reader) uint64 {
var discriminant uint8
err := binary.Read(r, binary.LittleEndian, &discriminant)
if err != nil {
Failt(err)
}
var rv uint64
switch discriminant {
case 0xff:
err = binary.Read(r, binary.LittleEndian, &rv)
if err != nil {
Failt(err)
}
// The encoding is not canonical if the value could have been
// encoded using fewer bytes.
min := uint64(0x100000000)
if rv < min {
Failt(fmt.Errorf("var int %x < min value %x", rv, min))
}
case 0xfe:
var v uint32
err = binary.Read(r, binary.LittleEndian, &v)
if err != nil {
Failt(err)
}
rv = uint64(v)
// The encoding is not canonical if the value could have been
// encoded using fewer bytes.
min := uint64(0x10000)
if rv < min {
Failt(fmt.Errorf("var int %x < min value %x", rv, min))
}
case 0xfd:
var v uint16
err = binary.Read(r, binary.LittleEndian, &v)
if err != nil {
Failt(err)
}
rv = uint64(v)
// The encoding is not canonical if the value could have been
// encoded using fewer bytes.
min := uint64(0xfd)
if rv < min {
Failt(fmt.Errorf("var int %x < min value %x", rv, min))
}
default:
rv = uint64(discriminant)
}
return rv
}