Chapter 10. Networking

Jimmy Song, "Programming Bitcoin: Chapter 10. Networking", 2019. 03



The peer-to-peer network that Bitcoin runs on is what gives it a lot of its robustness. More than 65,000 nodes are running on the network as of this writing and are communicating constantly. The Bitcoin network is a broadcast network, or gossip network. Every node is announcing different transactions, blocks, and peers that it knows about.

One thing to note about the networking protocol is that it is not consensus-critical. The same data can be sent from one node to another using some other protocol and the blockchain itself will not be affected.

Network Messages

Every network message is wrapped in an envelope that contains the actual payload. The envelope has the following fields:

  • network magic (4 bytes): Magic value indicating message origin network, and used to seek to next message when stream state is unknown

  • command (12 bytes): ASCII string identifying the packet content, NULL padded (non-NULL padding results in packet rejected)

  • payload length (4 bytes): Length of payload in number of bytes

  • payload checksum (4 bytes): First 4 bytes of hash256 of the payload (= sha256(sha256(payload)))

  • payload
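The field layout above can be checked by hand. The following is a minimal sketch rather than the book's code: `split_envelope` and the local `hash256` helper are names chosen here for illustration. It slices a raw message at the fixed offsets (4, 12, 4, and 4 bytes) and recomputes the checksum with Python's standard hashlib. The sample bytes are the message from Exercise 2.

```python
import hashlib


def hash256(data):
    """Two rounds of SHA-256, as used for the envelope checksum."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def split_envelope(raw):
    """Split a raw network message into its five envelope fields."""
    magic = raw[0:4]
    command = raw[4:16].rstrip(b'\x00')            # drop the NULL padding
    length = int.from_bytes(raw[16:20], 'little')  # payload length
    checksum = raw[20:24]
    payload = raw[24:24 + length]
    if hash256(payload)[:4] != checksum:
        raise ValueError('checksum does not match')
    return magic, command, length, checksum, payload


raw = bytes.fromhex('f9beb4d976657261636b000000000000000000005df6e0e2')
magic, command, length, checksum, payload = split_envelope(raw)
print(magic.hex(), command, length, checksum.hex(), payload)
```

Note that even an empty payload has a checksum, because hash256 of the empty byte string is still a 32-byte digest.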

The code to handle network messages requires us to create a new class:

NETWORK_MAGIC = b'\xf9\xbe\xb4\xd9' 
TESTNET_NETWORK_MAGIC = b'\x0b\x11\x09\x07'

class NetworkEnvelope:

    def __init__(self, command, payload, testnet=False):
        self.command = command
        self.payload = payload
        if testnet:
            self.magic = TESTNET_NETWORK_MAGIC
        else:
            self.magic = NETWORK_MAGIC

    def __repr__(self):
        return '{}: {}'.format(
            self.command.decode('ascii'),
            self.payload.hex(),
        )

Exercise 1

Write the parse method for NetworkEnvelope.

@classmethod
def parse(cls, s, testnet=False):
    magic = s.read(4)
    if magic == b'':
        # An empty read means the peer closed the connection.
        raise IOError('Connection reset!')
    if testnet:
        expected_magic = TESTNET_NETWORK_MAGIC
    else:
        expected_magic = NETWORK_MAGIC
    if magic != expected_magic:
        # The message is for a different network, or the stream is out of sync.
        raise SyntaxError('magic is not right {} vs {}'.format(
            magic.hex(), expected_magic.hex()))
    # The command is NULL padded to 12 bytes.
    command = s.read(12)
    command = command.strip(b'\x00')
    payload_length = little_endian_to_int(s.read(4))
    checksum = s.read(4)
    payload = s.read(payload_length)
    # The checksum is the first 4 bytes of hash256 of the payload.
    calculated_checksum = hash256(payload)[:4]
    if calculated_checksum != checksum:
        raise IOError('checksum does not match')
    return cls(command, payload, testnet=testnet)

Exercise 2

Determine what this network message is: f9beb4d976657261636b000000000000000000005df6e0e2

>>> from network import NetworkEnvelope
>>> from io import BytesIO
>>> message_hex = 'f9beb4d976657261636b000000000000000000005df6e0e2'
>>> stream = BytesIO(bytes.fromhex(message_hex))
>>> envelope = NetworkEnvelope.parse(stream)
>>> print(envelope.command)
b'verack'
>>> print(envelope.payload)
b''

Exercise 3

Write the serialize method for NetworkEnvelope.

def serialize(self):
    result = self.magic
    result += self.command + b'\x00' * (12 - len(self.command))
    result += int_to_little_endian(len(self.payload), 4)
    result += hash256(self.payload)[:4]
    result += self.payload
    return result

Parsing the Payload

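The payload of the version message consists of the protocol version, the sender's services, a timestamp, the receiver's services, IP, and port, the sender's services, IP, and port, a nonce, the user agent, the latest block height, and an optional relay flag. The fields are meant to give enough information for two nodes to be able to communicate.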

The following network services are currently assigned:

  • NODE_GETUTXO: The node supports the "getutxo" message, which lets SPV nodes query UTXOs to check for double spends

  • NODE_BLOOM: The node supports Bloom filters

  • NODE_WITNESS: The node supports Segwit

  • NODE_NETWORK_LIMITED: The node relays and verifies all transactions but serves only the most recent 288 blocks
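The services field is a bit field, so several of these flags can be advertised at once. The sketch below assumes the bit positions given in the Bitcoin protocol documentation (bits 1, 2, 3, and 10 for the four services listed above). The `Services` enum and `decode_services` helper are illustrative names, not part of the book's code.

```python
from enum import IntFlag


class Services(IntFlag):
    """Service bits as assumed from the protocol documentation."""
    NODE_GETUTXO = 1 << 1
    NODE_BLOOM = 1 << 2
    NODE_WITNESS = 1 << 3
    NODE_NETWORK_LIMITED = 1 << 10


def decode_services(raw):
    """Decode the 8-byte little-endian services field into flag names."""
    value = int.from_bytes(raw, 'little')
    return {flag.name for flag in Services if value & flag}


advertised = (Services.NODE_BLOOM | Services.NODE_WITNESS
              | Services.NODE_NETWORK_LIMITED)
# The version message stores the field as 8 bytes, little-endian.
raw = int(advertised).to_bytes(8, 'little')
print(raw.hex(), sorted(decode_services(raw)))
```

The integer value of `advertised` is what would be passed as the `services` argument of a version message.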

Setting some reasonable defaults, our VersionMessage class looks like this:

import time
from random import randint


class VersionMessage:
    command = b'version'

    def __init__(self, version=70015, services=0, timestamp=None,
                 receiver_services=0,
                 receiver_ip=b'\x00\x00\x00\x00', receiver_port=8333,
                 sender_services=0,
                 sender_ip=b'\x00\x00\x00\x00', sender_port=8333,
                 nonce=None, user_agent=b'/programmingbitcoin:0.1/',
                 latest_block=0, relay=False):
        self.version = version
        self.services = services
        if timestamp is None:
            self.timestamp = int(time.time())
        else:
            self.timestamp = timestamp
        self.receiver_services = receiver_services
        self.receiver_ip = receiver_ip
        self.receiver_port = receiver_port
        self.sender_services = sender_services
        self.sender_ip = sender_ip
        self.sender_port = sender_port
        if nonce is None:
            # randint is inclusive on both ends, so the upper bound must fit in 8 bytes.
            self.nonce = int_to_little_endian(randint(0, 2**64 - 1), 8)
        else:
            self.nonce = nonce
        self.user_agent = user_agent
        self.latest_block = latest_block
        self.relay = relay

Exercise 4

Write the serialize method for VersionMessage.

def serialize(self):
    result = int_to_little_endian(self.version, 4)
    result += int_to_little_endian(self.services, 8)
    result += int_to_little_endian(self.timestamp, 8)
    result += int_to_little_endian(self.receiver_services, 8)
    result += b'\x00' * 10 + b'\xff\xff' + self.receiver_ip
    result += self.receiver_port.to_bytes(2, 'big')
    result += int_to_little_endian(self.sender_services, 8)
    result += b'\x00' * 10 + b'\xff\xff' + self.sender_ip
    result += self.sender_port.to_bytes(2, 'big')
    result += self.nonce
    result += encode_varint(len(self.user_agent))
    result += self.user_agent
    result += int_to_little_endian(self.latest_block, 4)
    if self.relay:
        result += b'\x01'
    else:
        result += b'\x00'
    return result
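The two address lines in serialize store an IPv4 address as a 16-byte IPv4-mapped IPv6 address (ten zero bytes, then ff ff, then the four IPv4 bytes), followed by a big-endian port. Here is a small sketch of that encoding and its inverse using the standard ipaddress module. The helper names `encode_net_addr` and `decode_net_addr` are hypothetical.

```python
import ipaddress


def encode_net_addr(ip, port):
    """Encode an IPv4 address and port the way the version message stores them."""
    ipv4 = ipaddress.IPv4Address(ip).packed        # 4 bytes
    mapped = b'\x00' * 10 + b'\xff\xff' + ipv4     # 16-byte IPv4-mapped IPv6
    return mapped + port.to_bytes(2, 'big')        # the port is big-endian


def decode_net_addr(raw):
    """Inverse of encode_net_addr: return (dotted IPv4 string, port)."""
    address = ipaddress.IPv6Address(raw[:16])
    port = int.from_bytes(raw[16:18], 'big')
    return str(address.ipv4_mapped), port


encoded = encode_net_addr('127.0.0.1', 8333)
print(encoded.hex())
print(decode_net_addr(encoded))
```

The port is the one field in this message that is big-endian, which is why serialize uses to_bytes(2, 'big') instead of int_to_little_endian.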

Network Handshake

The network handshake is how nodes establish communication:

  • A wants to connect to B and sends a version message.

  • B receives the version message, responds with a verack message, and sends its own version message.

  • A receives the version and verack messages and sends back a verack message.

  • B receives the verack message and continues communication.
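The four steps above can be replayed without any networking. This is only an illustrative sketch: the queues stand in for the two directions of a connection, command names stand in for full messages, and `simulate_handshake` is a made-up helper.

```python
from collections import deque


def simulate_handshake():
    """Replay the four handshake steps between A and B over in-memory queues."""
    inbox = {'A': deque(), 'B': deque()}
    log = []

    def send(sender, receiver, command):
        inbox[receiver].append(command)
        log.append((sender, command))

    # Step 1: A opens with its version message.
    send('A', 'B', 'version')
    # Step 2: B acknowledges it and sends its own version.
    assert inbox['B'].popleft() == 'version'
    send('B', 'A', 'verack')
    send('B', 'A', 'version')
    # Step 3: A has now seen both messages and acknowledges B's version.
    received = {inbox['A'].popleft(), inbox['A'].popleft()}
    assert received == {'verack', 'version'}
    send('A', 'B', 'verack')
    # Step 4: B receives the verack; the handshake is complete.
    assert inbox['B'].popleft() == 'verack'
    return log


for sender, command in simulate_handshake():
    print(sender, 'sends', command)
```

Each side sends exactly one version and one verack, so a node can treat the handshake as done once it has received both from its peer.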

Once the handshake is finished, A and B can communicate however they want. Note that there is no authentication here, and it’s up to the nodes to verify all data that they receive. If a node sends a bad transaction or block, it can expect to get banned or disconnected.

Connecting to the Network

Network communication is tricky due to its asynchronous nature. To experiment, we can establish a connection to a node on the network synchronously:

>>> import socket
>>> from network import NetworkEnvelope, VersionMessage
>>> # This is a server I’ve set up for testnet. The testnet port is 18333 by default.
>>> host = 'testnet.programmingbitcoin.com'
>>> port = 18333
>>> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> sock.connect((host, port))
>>> # We create a stream to be able to read from the socket.
>>> # A stream made this way can be passed to all the parse methods.
>>> stream = sock.makefile('rb', None)
>>> # The first step of the handshake is to send a version message.
>>> version = VersionMessage()
>>> # We now send the message in the right envelope (testnet magic for a testnet node).
>>> envelope = NetworkEnvelope(version.command, version.serialize(), testnet=True)
>>> sock.sendall(envelope.serialize())
>>> while True:
...     # This line will read any messages coming in through our connected socket.
...     new_message = NetworkEnvelope.parse(stream, testnet=True)
...     print(new_message)

Connecting in this way, we can’t send until we’ve received and can’t respond intelligently to more than one message at a time. A more robust implementation would use an asynchronous library (like asyncio in Python 3) to send and receive without being blocked.
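As a rough illustration of the asynchronous approach, the sketch below reads one envelope from an asyncio.StreamReader. It is not the book's implementation: `read_envelope` and `demo` are hypothetical names, and the demo feeds a canned verack message into the reader instead of opening a real connection.

```python
import asyncio
import hashlib

NETWORK_MAGIC = b'\xf9\xbe\xb4\xd9'


async def read_envelope(reader):
    """Read one message from a StreamReader without blocking other tasks."""
    header = await reader.readexactly(24)  # magic + command + length + checksum
    magic = header[:4]
    command = header[4:16].rstrip(b'\x00')
    length = int.from_bytes(header[16:20], 'little')
    payload = await reader.readexactly(length)
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    if magic != NETWORK_MAGIC or checksum != header[20:24]:
        raise ValueError('bad magic or checksum')
    return command, payload


async def demo():
    # Feed canned bytes into the reader; no socket is opened here.
    reader = asyncio.StreamReader()
    reader.feed_data(bytes.fromhex(
        'f9beb4d976657261636b000000000000000000005df6e0e2'))
    reader.feed_eof()
    return await read_envelope(reader)


print(asyncio.run(demo()))
```

Against a real peer, the reader would come from `asyncio.open_connection(host, port)`, and a separate task could keep sending while this coroutine awaits incoming data.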

We also need a verack message class, which we’ll create here:

class VerAckMessage:
    command = b'verack'

    def __init__(self):
        pass

    @classmethod
    def parse(cls, s):
        return cls()

    def serialize(self):
        return b''

Let’s now automate this by creating a class that will handle the communication for us:

class SimpleNode:

    def __init__(self, host, port=None, testnet=False, logging=False):
        if port is None:
            if testnet:
                port = 18333
            else:
                port = 8333
        self.testnet = testnet
        self.logging = logging
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.connect((host, port))
        self.stream = self.socket.makefile('rb', None)

    def send(self, message):
        '''Send a message to the connected node'''
        # The command property and serialize method are expected to exist
        # on the message object.
        envelope = NetworkEnvelope(
            message.command, message.serialize(), testnet=self.testnet)
        if self.logging:
            print('sending: {}'.format(envelope))
        self.socket.sendall(envelope.serialize())

    def read(self):
        '''Read a message from the socket'''
        envelope = NetworkEnvelope.parse(self.stream, testnet=self.testnet)
        if self.logging:
            print('receiving: {}'.format(envelope))
        return envelope

    def wait_for(self, *message_classes):
        '''Wait for one of the messages in the list'''
        # Waiting on message classes, along with the synchronous nature of
        # this class, makes for easier programming. A commercial-strength
        # node would definitely not use something like this.
        command = None
        command_to_class = {m.command: m for m in message_classes}
        while command not in command_to_class:
            envelope = self.read()
            command = envelope.command
            # Answer version and ping automatically while waiting.
            # PingMessage and PongMessage are not shown in these notes.
            if command == VersionMessage.command:
                self.send(VerAckMessage())
            elif command == PingMessage.command:
                self.send(PongMessage(envelope.payload))
        # Parse the payload with the matching class (needs `from io import BytesIO`).
        return command_to_class[command].parse(BytesIO(envelope.payload))
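The wait_for method refers to PingMessage and PongMessage, which these notes do not define. A minimal sketch of what they could look like follows, assuming the ping payload is an 8-byte nonce that the pong echoes back unchanged. The class bodies are an assumption, not a copy of the book's code.

```python
from io import BytesIO


class PingMessage:
    command = b'ping'

    def __init__(self, nonce):
        self.nonce = nonce  # 8-byte nonce chosen by the sender

    @classmethod
    def parse(cls, s):
        return cls(s.read(8))

    def serialize(self):
        return self.nonce


class PongMessage:
    command = b'pong'

    def __init__(self, nonce):
        self.nonce = nonce  # must echo the nonce from the ping

    @classmethod
    def parse(cls, s):
        return cls(s.read(8))

    def serialize(self):
        return self.nonce


ping = PingMessage.parse(BytesIO(bytes.fromhex('0123456789abcdef')))
pong = PongMessage(ping.serialize())
print(pong.command, pong.serialize().hex())
```

Replying to pings matters because a peer may drop a connection that stays silent for too long.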

Now that we have a node, we can handshake with another node:

>>> node = SimpleNode('testnet.programmingbitcoin.com', testnet=True)
>>> # Most nodes don’t care about the fields in version like IP address.
>>> # We can connect with the defaults and everything will be just fine.
>>> version = VersionMessage()
>>> # We start the handshake by sending the version message.
>>> node.send(version)
>>> verack_received = False
>>> version_received = False
>>> # We only finish when we’ve received both verack and version.
>>> while not (verack_received and version_received):
...     # We expect to receive a verack for our version and the other node’s version.
...     # We don’t know in which order they will arrive, though.
...     message = node.wait_for(VersionMessage, VerAckMessage)
...     if message.command == VerAckMessage.command:
...         verack_received = True
...     else:
...         version_received = True
...         node.send(VerAckMessage())

Exercise 5

Write the handshake method for SimpleNode.

def handshake(self):
    version = VersionMessage()
    self.send(version)
    self.wait_for(VerAckMessage)

Getting Block Headers

When any node first connects to the network, the data that’s most crucial to get and verify is the block headers.

For full nodes, downloading the block headers allows them to asynchronously ask for full blocks from multiple nodes, parallelizing the download of the blocks.

For light clients, downloading headers allows them to verify the proof-of-work in each block.
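To make the proof-of-work check concrete, here is a self-contained sketch that verifies the mainnet genesis block header using only hashlib. The `bits_to_target` and `check_pow` functions are simplified stand-ins written for this example, and the header bytes are assembled field by field from the well-known genesis block values.

```python
import hashlib

# 80-byte mainnet genesis block header, built from its six fields.
GENESIS_HEADER = bytes.fromhex(
    '01000000'      # version
    + '00' * 32     # previous block hash (none for the genesis block)
    + '3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a'  # merkle root
    + '29ab5f49'    # timestamp
    + 'ffff001d'    # bits
    + '1dac2b7c'    # nonce
)


def bits_to_target(bits):
    """Expand the 4-byte compact bits field into the full target integer."""
    exponent = bits[-1]
    coefficient = int.from_bytes(bits[:-1], 'little')
    return coefficient * 256 ** (exponent - 3)


def check_pow(header):
    """Return True if hash256 of the header, read little-endian, is below the target."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    proof = int.from_bytes(digest, 'little')
    return proof < bits_to_target(header[72:76])


print(len(GENESIS_HEADER), check_pow(GENESIS_HEADER))
```

Because the check needs only the 80 header bytes, a light client can validate the work behind a block without downloading any of its transactions.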

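Nodes can give us the block headers without taking up much bandwidth. The command to get the block headers is called getheaders. Its payload consists of the protocol version (4 bytes, little-endian), the number of hashes (varint), the starting block hash (32 bytes, little-endian), and the ending block hash (32 bytes, little-endian).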

Here's what the GetHeadersMessage class looks like:

class GetHeadersMessage:
    command = b'getheaders'

    def __init__(self, version=70015, num_hashes=1,
                 start_block=None, end_block=None):
        self.version = version
        # For the purposes of this chapter, we’re going to assume that
        # the number of block header groups is 1.
        # A more robust implementation would handle more than a single block group,
        # but we can download the block headers using a single group.
        self.num_hashes = num_hashes
        # A starting block is needed, otherwise we can’t create a proper message.
        if start_block is None:
            raise RuntimeError('a start block is required')
        self.start_block = start_block
        # The ending block we assume to be null,
        # or as many as the server will send to us if not defined.
        if end_block is None:
            self.end_block = b'\x00' * 32
        else:
            self.end_block = end_block

Exercise 6

Write the serialize method for GetHeadersMessage.

def serialize(self):
    result = int_to_little_endian(self.version, 4)
    result += encode_varint(self.num_hashes)
    result += self.start_block[::-1]
    result += self.end_block[::-1]
    return result
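As a quick check of the layout, the sketch below builds a getheaders payload that starts from the mainnet genesis block. It mirrors the serialize method above but is standalone: `getheaders_payload` is a hypothetical helper, and it assumes, as the class does, a single hash and an all-zero end block.

```python
GENESIS_HASH = bytes.fromhex(
    '000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f')


def getheaders_payload(start_block, end_block=b'\x00' * 32, version=70015):
    """Build a getheaders payload carrying a single starting hash."""
    payload = version.to_bytes(4, 'little')  # protocol version
    payload += b'\x01'                       # varint: number of hashes = 1
    payload += start_block[::-1]             # hashes go on the wire little-endian
    payload += end_block[::-1]               # all zeros: send as many as possible
    return payload


payload = getheaders_payload(GENESIS_HASH)
print(len(payload), payload[:5].hex())
```

The payload is always 69 bytes in this single-hash form: 4 for the version, 1 for the varint, and 32 for each of the two hashes.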

Headers Response

We can now create a node, handshake, and then ask for some headers:

>>> from io import BytesIO
>>> from block import Block, GENESIS_BLOCK
>>> from network import SimpleNode, GetHeadersMessage
>>> node = SimpleNode('mainnet.programmingbitcoin.com', testnet=False)
>>> node.handshake()
>>> genesis = Block.parse(BytesIO(GENESIS_BLOCK))
>>> getheaders = GetHeadersMessage(start_block=genesis.hash())
>>> node.send(getheaders)

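Now we need a way to receive the headers from the other node. The other node will send back the headers command. Its payload is a varint giving the number of headers, followed by that many block headers, each followed by a transaction count that is always 0.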

We can use the same parsing engine as when parsing a full block:

class HeadersMessage:
    command = b'headers'

    def __init__(self, blocks):
        self.blocks = blocks

    @classmethod
    def parse(cls, stream):
        num_headers = read_varint(stream)
        blocks = []
        for _ in range(num_headers):
            # Each block gets parsed with the Block class’s parse method,
            # using the same stream that we have.
            blocks.append(Block.parse(stream))
            # The number of transactions is always 0 and is a remnant of block parsing.
            num_txs = read_varint(stream)
            # If we didn’t get 0, something is wrong.
            if num_txs != 0:
                raise RuntimeError('number of txs not 0')
        return cls(blocks)
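The parse method relies on read_varint, and the serialize methods earlier rely on encode_varint. Both come from the book's helper module and are not shown in these notes. The following sketch shows how Bitcoin's variable-length integer encoding works and is written from the encoding rules, not copied from the book.

```python
from io import BytesIO


def encode_varint(i):
    """Encode an integer as a Bitcoin varint."""
    if i < 0xfd:
        return bytes([i])                          # single byte
    elif i < 0x10000:
        return b'\xfd' + i.to_bytes(2, 'little')   # 0xfd prefix + 2 bytes
    elif i < 0x100000000:
        return b'\xfe' + i.to_bytes(4, 'little')   # 0xfe prefix + 4 bytes
    elif i < 0x10000000000000000:
        return b'\xff' + i.to_bytes(8, 'little')   # 0xff prefix + 8 bytes
    raise ValueError('integer too large: {}'.format(i))


def read_varint(s):
    """Read a Bitcoin varint from a stream."""
    prefix = s.read(1)[0]
    if prefix == 0xfd:
        return int.from_bytes(s.read(2), 'little')
    elif prefix == 0xfe:
        return int.from_bytes(s.read(4), 'little')
    elif prefix == 0xff:
        return int.from_bytes(s.read(8), 'little')
    return prefix  # values below 0xfd are stored directly


encoded = encode_varint(2000)
print(encoded.hex(), read_varint(BytesIO(encoded)))
```

A headers message carries up to 2,000 headers, so its count usually takes the three-byte form, while the trailing transaction count of 0 is a single zero byte.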

Given the network connection that we’ve set up, we can download the headers, check their proof-of-work, and validate the block header difficulty adjustments as follows:

>>> from io import BytesIO
>>> from network import SimpleNode, GetHeadersMessage, HeadersMessage
>>> from block import Block, GENESIS_BLOCK, LOWEST_BITS
>>> from helper import calculate_new_bits
>>> previous = Block.parse(BytesIO(GENESIS_BLOCK))
>>> first_epoch_timestamp = previous.timestamp
>>> expected_bits = LOWEST_BITS
>>> count = 1
>>> node = SimpleNode('mainnet.programmingbitcoin.com', testnet=False)
>>> # Handle the communication
>>> node.handshake()
>>> for _ in range(19):
...     getheaders = GetHeadersMessage(start_block=previous.hash())
...     node.send(getheaders)
...     headers = node.wait_for(HeadersMessage)
...     for header in headers.blocks:
...         # Check that the proof-of-work is valid.
...         if not header.check_pow():
...             raise RuntimeError('bad PoW at block {}'.format(count))
...         # Check that the current block is after the previous one.
...         if header.prev_block != previous.hash():
...             raise RuntimeError('discontinuous block at {}'.format(count))
...         if count % 2016 == 0:
...             # At the end of the epoch, calculate the next bits/target/difficulty.
...             time_diff = previous.timestamp - first_epoch_timestamp
...             expected_bits = calculate_new_bits(previous.bits, time_diff)
...             print(expected_bits.hex())
...             # Store the first block of the epoch to calculate bits at the end of the epoch.
...             first_epoch_timestamp = header.timestamp
...         # Check that the bits/target/difficulty is what we expect
...         # based on the previous epoch calculation.
...         if header.bits != expected_bits:
...             raise RuntimeError('bad bits at block {}'.format(count))
...         previous = header
...         count += 1
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
6ad8001d
28c4001d
71be001d
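The calculate_new_bits helper used above is imported from the book's helper module and is not shown here. The sketch below illustrates the adjustment rule it is based on, working on target integers rather than the compact bits encoding. `adjust_target` is a hypothetical name, and the constants are the standard two-week epoch and the lowest-difficulty target that corresponds to ffff001d.

```python
TWO_WEEKS = 60 * 60 * 24 * 14
MAX_TARGET = 0xffff * 256 ** (0x1d - 3)  # target for bits ffff001d


def adjust_target(previous_target, time_diff):
    """Scale the target by how long the last epoch took relative to two weeks."""
    # Clamp the measured time to between 1/4 and 4 times the expected two weeks.
    time_diff = max(TWO_WEEKS // 4, min(time_diff, TWO_WEEKS * 4))
    new_target = previous_target * time_diff // TWO_WEEKS
    # The target can never be easier than the lowest difficulty.
    return min(new_target, MAX_TARGET)


# A slow epoch cannot raise the target above the maximum.
print(adjust_target(MAX_TARGET, 3 * TWO_WEEKS) == MAX_TARGET)
# An epoch that took one week halves the target, doubling the difficulty.
print(adjust_target(MAX_TARGET, TWO_WEEKS // 2) == MAX_TARGET // 2)
```

The cap at the maximum target is consistent with the output above, where the first fifteen epochs all stay at ffff001d before the bits start to change.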
References

  • Chapter source code: https://github.com/jimmysong/programmingbitcoin/tree/master/code-ch10

  • The number of connected full nodes (2019-09-10): https://bitnodes.earn.com

  • Known magic values and message types: https://en.bitcoin.it/wiki/Protocol_documentation#Message_types

  • Network Handshake, Mastering Bitcoin