The peer-to-peer network that Bitcoin runs on gives it much of its robustness. As of this writing, more than 65,000 nodes are running on the network and communicating constantly. The Bitcoin network is a broadcast network, or gossip network: every node announces the transactions, blocks, and peers that it knows about.
One thing to note about the networking protocol is that it is not consensus-critical. The same data can be sent from one node to another using some other protocol and the blockchain itself will not be affected.
Network Messages
Every network message is wrapped in an envelope that contains the actual payload. A common network message has the following fields:
network magic (4 bytes): Magic value indicating message origin network, and used to seek to next message when stream state is unknown
command (12 bytes): ASCII string identifying the packet content, NULL padded (a packet with non-NULL padding is rejected)
payload length (4 bytes): Length of payload in number of bytes
payload checksum (4 bytes): First 4 bytes of the hash256 of the payload (= sha256(sha256(payload)))
payload
The code to handle network messages requires us to create a new class, NetworkEnvelope. Its serialize method looks like this:
def serialize(self):
    result = self.magic
    result += self.command + b'\x00' * (12 - len(self.command))
    result += int_to_little_endian(len(self.payload), 4)
    result += hash256(self.payload)[:4]
    result += self.payload
    return result
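Parsing is the inverse of serialize. The following is a minimal, self-contained sketch rather than the class's actual parse method: the function name parse_envelope and the local hash256 helper are assumptions made for illustration, and the magic value used is the mainnet one.

```python
import hashlib
from io import BytesIO

MAINNET_MAGIC = b'\xf9\xbe\xb4\xd9'  # mainnet network magic


def hash256(data):
    '''Two rounds of SHA-256.'''
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def parse_envelope(stream, expected_magic=MAINNET_MAGIC):
    '''Read one envelope from a binary stream and return (command, payload).'''
    magic = stream.read(4)
    if magic != expected_magic:
        raise RuntimeError('unexpected magic: {}'.format(magic.hex()))
    # The command is NULL padded to 12 bytes, so strip the padding.
    command = stream.read(12).rstrip(b'\x00')
    payload_length = int.from_bytes(stream.read(4), 'little')
    checksum = stream.read(4)
    payload = stream.read(payload_length)
    if hash256(payload)[:4] != checksum:
        raise RuntimeError('checksum does not match')
    return command, payload


# A verack message has an empty payload, which makes for a short example.
raw = MAINNET_MAGIC + b'verack' + b'\x00' * 6 + b'\x00' * 4 + hash256(b'')[:4]
command, payload = parse_envelope(BytesIO(raw))
print(command, payload)
```

Checking the magic first and the checksum last means a corrupted or misaligned stream is rejected before the payload is handed to any message parser.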
Parsing the Payload
The figure below shows the parsed payload of a version message. The fields are meant to give two nodes enough information to communicate.
The following network services are currently assigned:
NODE_GETUTXO: Supports the "getutxo" message, which lets SPV nodes query UTXOs to check for double spends
NODE_BLOOM: Supports Bloom filters
NODE_WITNESS: Supports Segwit
NODE_NETWORK_LIMITED: Supports relaying and verifying all transactions, and serves at least the most recent 288 blocks
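The services field is a bitfield, so a node can advertise several of these at once. The sketch below decodes it; the bit positions are taken from the Bitcoin protocol documentation rather than from the text above, NODE_NETWORK (a full node) is added for completeness, and the function names are hypothetical.

```python
# Bit values as listed in the Bitcoin protocol documentation (assumed here,
# since the text above gives only the names).
SERVICE_FLAGS = {
    'NODE_NETWORK': 1 << 0,
    'NODE_GETUTXO': 1 << 1,
    'NODE_BLOOM': 1 << 2,
    'NODE_WITNESS': 1 << 3,
    'NODE_NETWORK_LIMITED': 1 << 10,
}


def parse_services(raw):
    '''Interpret the 8 little-endian bytes of the services field as an integer.'''
    return int.from_bytes(raw, 'little')


def decode_services(services):
    '''Return the names of the service flags that are set.'''
    return [name for name, bit in SERVICE_FLAGS.items() if services & bit]


services = parse_services(bytes.fromhex('0d04000000000000'))
print(decode_services(services))
```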
With some reasonable defaults set, the serialize method of our VersionMessage class looks like this:
def serialize(self):
    result = int_to_little_endian(self.version, 4)
    result += int_to_little_endian(self.services, 8)
    result += int_to_little_endian(self.timestamp, 8)
    result += int_to_little_endian(self.receiver_services, 8)
    result += b'\x00' * 10 + b'\xff\xff' + self.receiver_ip
    result += self.receiver_port.to_bytes(2, 'big')
    result += int_to_little_endian(self.sender_services, 8)
    result += b'\x00' * 10 + b'\xff\xff' + self.sender_ip
    result += self.sender_port.to_bytes(2, 'big')
    result += self.nonce
    result += encode_varint(len(self.user_agent))
    result += self.user_agent
    result += int_to_little_endian(self.latest_block, 4)
    if self.relay:
        result += b'\x01'
    else:
        result += b'\x00'
    return result
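The b'\x00' * 10 + b'\xff\xff' prefix in serialize is the IPv4-mapped IPv6 form, which lets a 4-byte IPv4 address fill the 16-byte address field. A small sketch of that encoding using Python's ipaddress module follows; the helper name encode_net_addr is hypothetical and not part of VersionMessage.

```python
import ipaddress


def encode_net_addr(ip, port):
    '''Encode an IPv4 or IPv6 address and a port as 16 + 2 bytes.'''
    address = ipaddress.ip_address(ip)
    if address.version == 4:
        # IPv4 addresses are stored as IPv4-mapped IPv6: ::ffff:a.b.c.d
        packed = b'\x00' * 10 + b'\xff\xff' + address.packed
    else:
        packed = address.packed
    # Unlike most fields in the message, the port is big-endian.
    return packed + port.to_bytes(2, 'big')


encoded = encode_net_addr('127.0.0.1', 8333)
print(encoded.hex())
```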
Network Handshake
The network handshake is how nodes establish communication:
A wants to connect to B and sends a version message.
B receives the version message, responds with a verack message, and sends its own version message.
A receives the version and verack messages and sends back a verack message.
B receives the verack message and continues communication.
Once the handshake is finished, A and B can communicate however they want. Note that there is no authentication here, and it’s up to the nodes to verify all data that they receive. If a node sends a bad transaction or block, it can expect to get banned or disconnected.
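From one node's point of view, the handshake is complete once it has received both the peer's version and the peer's verack, in whichever order they arrive. The sketch below tracks only that state; the class HandshakeState is a hypothetical illustration and does no networking.

```python
class HandshakeState:
    '''Track the two events that complete a handshake from one side.'''

    def __init__(self):
        self.version_received = False
        self.verack_received = False

    def on_message(self, command):
        '''Record a received command and return the commands to send in reply.'''
        if command == b'version':
            self.version_received = True
            # Every version message is acknowledged with a verack.
            return [b'verack']
        if command == b'verack':
            self.verack_received = True
        return []

    @property
    def complete(self):
        return self.version_received and self.verack_received


# Node A has already sent its version; B's replies may arrive in either order.
state = HandshakeState()
replies = []
for command in (b'verack', b'version'):
    replies += state.on_message(command)
print(state.complete, replies)
```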
Connecting to the Network
Network communication is tricky due to its asynchronous nature. To experiment, we can establish a connection to a node on the network synchronously:
>>> import socket
>>> from network import NetworkEnvelope, VersionMessage
>>> # This is a server I’ve set up for testnet. The testnet port is 18333 by default.
>>> host = 'testnet.programmingbitcoin.com'
>>> port = 18333
>>> # The connection is named sock so that it doesn’t shadow the socket module.
>>> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> sock.connect((host, port))
>>> # We create a stream to be able to read from the socket.
>>> # A stream made this way can be passed to all the parse methods.
>>> stream = sock.makefile('rb', None)
>>> # The first step of the handshake is to send a version message.
>>> version = VersionMessage()
>>> # We now send the message in the right envelope (with the testnet magic).
>>> envelope = NetworkEnvelope(version.command, version.serialize(), testnet=True)
>>> sock.sendall(envelope.serialize())
>>> while True:
...     # This line will read any messages coming in through our connected socket.
...     new_message = NetworkEnvelope.parse(stream, testnet=True)
...     print(new_message)
Connecting in this way, we can’t send until we’ve received and can’t respond intelligently to more than one message at a time. A more robust implementation would use an asynchronous library (like asyncio in Python 3) to send and receive without being blocked.
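As a rough illustration of the asynchronous approach, the sketch below reads one envelope with asyncio. It is not part of the network module: read_envelope and demo are hypothetical names, and the bytes are fed into the reader by hand so that no peer is needed.

```python
import asyncio
import hashlib


def hash256(data):
    '''Two rounds of SHA-256.'''
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


async def read_envelope(reader):
    '''Read one envelope from an asyncio.StreamReader.'''
    # Fixed-size part: magic (4) + command (12) + length (4) + checksum (4).
    header = await reader.readexactly(24)
    command = header[4:16].rstrip(b'\x00')
    length = int.from_bytes(header[16:20], 'little')
    payload = await reader.readexactly(length)
    if hash256(payload)[:4] != header[20:24]:
        raise RuntimeError('checksum does not match')
    return command, payload


async def demo():
    # With a real peer, asyncio.open_connection(host, port) returns a
    # (reader, writer) pair and the reader is used in the same way.
    reader = asyncio.StreamReader()
    payload = bytes.fromhex('0102030405060708')
    raw = b'\xf9\xbe\xb4\xd9' + b'ping' + b'\x00' * 8
    raw += len(payload).to_bytes(4, 'little') + hash256(payload)[:4] + payload
    reader.feed_data(raw)
    reader.feed_eof()
    return await read_envelope(reader)


result = asyncio.run(demo())
print(result)
```

Because each read is awaited, other tasks (for example, one that sends messages) can run while this one waits for data.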
We also need a verack message class, which we’ll create here:
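Because a verack message has an empty payload, the class can be very small. A minimal sketch, assuming it follows the same command, parse, and serialize interface as the other message classes:

```python
class VerAckMessage:
    command = b'verack'

    def __init__(self):
        pass

    @classmethod
    def parse(cls, stream):
        # Nothing to read: a verack message has no payload.
        return cls()

    def serialize(self):
        # The payload is empty; the envelope supplies everything else.
        return b''
```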
Let’s now automate this by creating a class that will handle the communication for us:
class SimpleNode:
    def __init__(self, host, port=None, testnet=False, logging=False):
        if port is None:
            if testnet:
                port = 18333
            else:
                port = 8333
        self.testnet = testnet
        self.logging = logging
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.connect((host, port))
        self.stream = self.socket.makefile('rb', None)
    def send(self, message):
        '''Send a message to the connected node'''
        # The send method sends a message over the socket.
        # The command property and serialize methods are expected to exist in the message object.
        envelope = NetworkEnvelope(
            message.command, message.serialize(), testnet=self.testnet)
        if self.logging:
            print('sending: {}'.format(envelope))
        self.socket.sendall(envelope.serialize())
    def read(self):
        '''Read a message from the socket'''
        # The read method reads a new message from the socket.
        envelope = NetworkEnvelope.parse(self.stream, testnet=self.testnet)
        if self.logging:
            print('receiving: {}'.format(envelope))
        return envelope
    def wait_for(self, *message_classes):
        '''Wait for one of the messages in the list'''
        # The wait_for method lets us wait for any one of several commands
        # (specifically, message classes).
        # Along with the synchronous nature of this class, a method like this makes for easier programming.
        # A commercial-strength node would definitely not use something like this.
        command = None
        command_to_class = {m.command: m for m in message_classes}
        while command not in command_to_class:
            envelope = self.read()
            command = envelope.command
            if command == VersionMessage.command:
                self.send(VerAckMessage())
            elif command == PingMessage.command:
                self.send(PongMessage(envelope.payload))
        return command_to_class[command].parse(envelope.stream())
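wait_for refers to PingMessage and PongMessage, which are not shown here. A minimal sketch, assuming each carries an 8-byte nonce as its entire payload and that a pong simply echoes the nonce from the ping:

```python
from io import BytesIO


class PingMessage:
    command = b'ping'

    def __init__(self, nonce):
        self.nonce = nonce

    @classmethod
    def parse(cls, stream):
        # The payload is just the 8-byte nonce.
        return cls(stream.read(8))

    def serialize(self):
        return self.nonce


class PongMessage:
    command = b'pong'

    def __init__(self, nonce):
        self.nonce = nonce

    @classmethod
    def parse(cls, stream):
        return cls(stream.read(8))

    def serialize(self):
        return self.nonce


# Answering a ping: echo its nonce back in a pong.
ping = PingMessage.parse(BytesIO(bytes.fromhex('0102030405060708')))
pong = PongMessage(ping.serialize())
print(pong.serialize().hex())
```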
Now that we have a node, we can handshake with another node:
>>> node = SimpleNode('testnet.programmingbitcoin.com', testnet=True)
>>> # Most nodes don’t care about the fields in version like IP address.
>>> # We can connect with the defaults and everything will be just fine.
>>> version = VersionMessage()
>>> # We start the handshake by sending the version message.
>>> node.send(version)
>>> verack_received = False
>>> version_received = False
>>> # We only finish when we’ve received both verack and version.
>>> while not (verack_received and version_received):
...     # We expect to receive a verack for our version and the other node’s version.
...     # We don’t know in which order they will arrive, though.
...     message = node.wait_for(VersionMessage, VerAckMessage)
...     if message.command == VerAckMessage.command:
...         verack_received = True
...     else:
...         # wait_for has already answered the version message with a verack.
...         version_received = True
Exercise 5
Write the handshake method for SimpleNode.
def handshake(self):
    version = VersionMessage()
    self.send(version)
    self.wait_for(VerAckMessage)
Getting Block Headers
When any node first connects to the network, the data that’s most crucial to get and verify is the block headers.
For full nodes, downloading the block headers allows them to asynchronously ask for full blocks from multiple nodes, parallelizing the download of the blocks.
For light clients, downloading headers allows them to verify the proof-of-work in each block.
Nodes can give us the block headers without taking up much bandwidth. The command to get the block headers is called getheaders; the following figure shows a parsed getheaders message.
Here's what the GetHeadersMessage class looks like:
class GetHeadersMessage:
    command = b'getheaders'
    def __init__(self, version=70015, num_hashes=1,
                 start_block=None, end_block=None):
        self.version = version
        # For the purposes of this chapter, we’re going to assume that
        # the number of block header groups is 1.
        # A more robust implementation would handle more than a single block group,
        # but we can download the block headers using a single group.
        self.num_hashes = num_hashes
        # A starting block is needed, otherwise we can’t create a proper message.
        if start_block is None:
            raise RuntimeError('a start block is required')
        self.start_block = start_block
        # The ending block we assume to be null,
        # or as many as the server will send to us if not defined.
        if end_block is None:
            self.end_block = b'\x00' * 32
        else:
            self.end_block = end_block
Exercise 6
Write the serialize method for GetHeadersMessage.
def serialize(self):
    result = int_to_little_endian(self.version, 4)
    result += encode_varint(self.num_hashes)
    result += self.start_block[::-1]
    result += self.end_block[::-1]
    return result
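To see what these bytes look like, here is a self-contained sketch that builds a getheaders payload starting from the mainnet genesis block. The function serialize_getheaders and the local encode_varint are stand-ins written for this example, not the helpers used by the class.

```python
def encode_varint(i):
    '''Encode an integer as a Bitcoin varint.'''
    if i < 0xfd:
        return bytes([i])
    elif i < 0x10000:
        return b'\xfd' + i.to_bytes(2, 'little')
    elif i < 0x100000000:
        return b'\xfe' + i.to_bytes(4, 'little')
    elif i < 0x10000000000000000:
        return b'\xff' + i.to_bytes(8, 'little')
    raise ValueError('integer too large: {}'.format(i))


def serialize_getheaders(version, start_block, end_block=b'\x00' * 32):
    '''Build a getheaders payload with a single starting block hash.'''
    result = version.to_bytes(4, 'little')
    result += encode_varint(1)  # number of hashes
    # Block hashes are sent in little-endian byte order.
    result += start_block[::-1]
    result += end_block[::-1]
    return result


GENESIS_HASH = bytes.fromhex(
    '000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f')
payload = serialize_getheaders(70015, GENESIS_HASH)
print(len(payload), payload.hex())
```

The payload is 69 bytes: 4 for the version, 1 for the hash count, and 32 each for the starting and ending block hashes.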
Headers Response
We can now create a node, handshake, and then ask for some headers. Next we need a way to receive the headers from the other node, which will send back the headers command. The figure below shows parsed block headers.
We can use the same parsing engine as when parsing a full block:
class HeadersMessage:
    command = b'headers'
    def __init__(self, blocks):
        self.blocks = blocks
    @classmethod
    def parse(cls, stream):
        num_headers = read_varint(stream)
        blocks = []
        for _ in range(num_headers):
            # Each block gets parsed with the Block class’s parse method,
            # using the same stream that we have.
            blocks.append(Block.parse(stream))
            # The number of transactions is always 0 and is a remnant of block parsing.
            num_txs = read_varint(stream)
            # If we didn’t get 0, something is wrong.
            if num_txs != 0:
                raise RuntimeError('number of txs not 0')
        return cls(blocks)
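The same layout can be exercised without the Block class. The sketch below parses a headers payload into raw 80-byte headers and checks the proof-of-work of the mainnet genesis header; parse_headers, check_pow, and bits_to_target are simplified stand-ins written for this example, not the methods used above.

```python
import hashlib
from io import BytesIO

GENESIS_HEADER = bytes.fromhex(
    '01000000'      # version
    + '00' * 32     # previous block (none for the genesis block)
    + '3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a'  # merkle root
    + '29ab5f49'    # timestamp
    + 'ffff001d'    # bits
    + '1dac2b7c'    # nonce
)


def hash256(data):
    '''Two rounds of SHA-256.'''
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def read_varint(stream):
    '''Read a Bitcoin varint from a stream.'''
    i = stream.read(1)[0]
    if i == 0xfd:
        return int.from_bytes(stream.read(2), 'little')
    elif i == 0xfe:
        return int.from_bytes(stream.read(4), 'little')
    elif i == 0xff:
        return int.from_bytes(stream.read(8), 'little')
    return i


def parse_headers(stream):
    '''Parse a headers payload into a list of raw 80-byte headers.'''
    headers = []
    for _ in range(read_varint(stream)):
        headers.append(stream.read(80))
        # Each header is followed by a transaction count that is always 0.
        if read_varint(stream) != 0:
            raise RuntimeError('number of txs not 0')
    return headers


def bits_to_target(bits):
    '''Expand the 4-byte bits field into the full target integer.'''
    exponent = bits[-1]
    coefficient = int.from_bytes(bits[:-1], 'little')
    return coefficient * 256 ** (exponent - 3)


def check_pow(header):
    '''The header hash, read as a little-endian integer, must be below the target.'''
    proof = int.from_bytes(hash256(header), 'little')
    return proof < bits_to_target(header[72:76])


payload = b'\x01' + GENESIS_HEADER + b'\x00'
headers = parse_headers(BytesIO(payload))
print(len(headers), check_pow(headers[0]))
```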
Given the network connection that we’ve set up, we can download the headers, check their proof-of-work, and validate the block header difficulty adjustments as follows:
>>> from io import BytesIO
>>> from network import SimpleNode, GetHeadersMessage, HeadersMessage
>>> from block import Block, GENESIS_BLOCK, LOWEST_BITS
>>> from helper import calculate_new_bits
>>> previous = Block.parse(BytesIO(GENESIS_BLOCK))
>>> first_epoch_timestamp = previous.timestamp
>>> expected_bits = LOWEST_BITS
>>> count = 1
>>> node = SimpleNode('mainnet.programmingbitcoin.com', testnet=False)
>>> # Handle the communication
>>> node.handshake()
>>> for _ in range(19):
...     getheaders = GetHeadersMessage(start_block=previous.hash())
...     node.send(getheaders)
...     headers = node.wait_for(HeadersMessage)
...     for header in headers.blocks:
...         # Check that the proof-of-work is valid.
...         if not header.check_pow():
...             raise RuntimeError('bad PoW at block {}'.format(count))
...         # Check that the current block is after the previous one.
...         if header.prev_block != previous.hash():
...             raise RuntimeError('discontinuous block at {}'.format(count))
...         if count % 2016 == 0:
...             # At the end of the epoch, calculate the next bits/target/difficulty.
...             time_diff = previous.timestamp - first_epoch_timestamp
...             expected_bits = calculate_new_bits(previous.bits, time_diff)
...             print(expected_bits.hex())
...             # Store the first block of the epoch to calculate bits at the end of the epoch.
...             first_epoch_timestamp = header.timestamp
...         # Check that the bits/target/difficulty is what we expect
...         # based on the previous epoch calculation.
...         if header.bits != expected_bits:
...             raise RuntimeError('bad bits at block {}'.format(count))
...         previous = header
...         count += 1
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
6ad8001d
28c4001d
71be001d
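calculate_new_bits is imported from helper and is not shown here. The following is a sketch of the usual retargeting rule, not the helper's actual code: the new target is the old target scaled by time_diff over two weeks, with time_diff clamped to between a quarter and four times that period, and the result capped at the maximum target. The cap is an assumption that is consistent with the repeated ffff001d values above. The sketch also assumes targets of at least 3 bytes.

```python
TWO_WEEKS = 60 * 60 * 24 * 14
MAX_TARGET = 0xffff * 256 ** (0x1d - 3)


def bits_to_target(bits):
    '''Expand the 4-byte bits field into the full target integer.'''
    exponent = bits[-1]
    coefficient = int.from_bytes(bits[:-1], 'little')
    return coefficient * 256 ** (exponent - 3)


def target_to_bits(target):
    '''Compress a target integer back into the 4-byte bits field.'''
    raw = target.to_bytes(32, 'big').lstrip(b'\x00')
    if raw[0] > 0x7f:
        # A leading byte above 0x7f would read as negative, so pad with a zero.
        exponent = len(raw) + 1
        coefficient = b'\x00' + raw[:2]
    else:
        exponent = len(raw)
        coefficient = raw[:3]
    return coefficient[::-1] + bytes([exponent])


def calculate_new_bits(previous_bits, time_diff):
    '''Compute the bits for the next 2016-block epoch.'''
    # Limit the adjustment to a factor of 4 in either direction.
    time_diff = max(TWO_WEEKS // 4, min(time_diff, TWO_WEEKS * 4))
    new_target = bits_to_target(previous_bits) * time_diff // TWO_WEEKS
    # Never go easier than the lowest difficulty.
    new_target = min(new_target, MAX_TARGET)
    return target_to_bits(new_target)


lowest_bits = bytes.fromhex('ffff001d')
# An epoch mined in half the expected time halves the target.
print(calculate_new_bits(lowest_bits, TWO_WEEKS // 2).hex())
# An epoch mined too slowly at the lowest difficulty stays at the lowest difficulty.
print(calculate_new_bits(lowest_bits, TWO_WEEKS * 2).hex())
```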
The number of connected full nodes (2019-09-10): https://bitnodes.earn.com
Known magic values: https://en.bitcoin.it/wiki/Protocol_documentation#Message_types