Much of Bitcoin's robustness comes from the peer-to-peer network that it runs on. More than 65,000 nodes are running on the network as of this writing and are communicating constantly. The Bitcoin network is a broadcast network, or gossip network: every node announces the transactions, blocks, and peers that it knows about.
One thing to note about the networking protocol is that it is not consensus-critical. The same data can be sent from one node to another using some other protocol and the blockchain itself will not be affected.
Network Messages
Every network message is wrapped in an envelope that contains the actual payload. The fields of a common network message are listed below.
network magic (4 bytes): Magic value indicating message origin network, and used to seek to next message when stream state is unknown
command (12 bytes): ASCII string identifying the packet content, NULL padded (non-NULL padding results in packet rejected)
payload length (4 bytes): Length of payload in number of bytes
payload checksum (4 bytes): First 4 bytes of hash256 of the payload (= sha256(sha256(payload)))
payload
The code to handle network messages requires us to create a new class:
def serialize(self):
    result = self.magic
    result += self.command + b'\x00' * (12 - len(self.command))
    result += int_to_little_endian(len(self.payload), 4)
    result += hash256(self.payload)[:4]
    result += self.payload
    return result
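The serialize method above depends on helpers that are not shown here. As a minimal, self-contained sketch of the whole envelope round trip, the class below uses only the standard library. The `magic` constructor argument and the inlined `hash256` helper are assumptions made for this illustration, not necessarily how the chapter's own `network` module is organized.

```python
import hashlib
from io import BytesIO

# Mainnet magic bytes; testnet uses b'\x0b\x11\x09\x07'.
NETWORK_MAGIC = b'\xf9\xbe\xb4\xd9'


def hash256(data):
    '''Two rounds of SHA-256, as used for the envelope checksum.'''
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


class NetworkEnvelope:
    '''Illustrative envelope: magic, command, length, checksum, payload.'''

    def __init__(self, command, payload, magic=NETWORK_MAGIC):
        self.command = command
        self.payload = payload
        self.magic = magic

    def serialize(self):
        result = self.magic
        # The command is NULL-padded to exactly 12 bytes.
        result += self.command + b'\x00' * (12 - len(self.command))
        result += len(self.payload).to_bytes(4, 'little')
        result += hash256(self.payload)[:4]
        result += self.payload
        return result

    @classmethod
    def parse(cls, stream, magic=NETWORK_MAGIC):
        found_magic = stream.read(4)
        if found_magic == b'':
            raise RuntimeError('connection reset')
        if found_magic != magic:
            raise RuntimeError('unexpected network magic')
        command = stream.read(12).rstrip(b'\x00')
        payload_length = int.from_bytes(stream.read(4), 'little')
        checksum = stream.read(4)
        payload = stream.read(payload_length)
        # Reject the message if the checksum does not match the payload.
        if hash256(payload)[:4] != checksum:
            raise RuntimeError('checksum does not match')
        return cls(command, payload, magic=magic)


raw = NetworkEnvelope(b'verack', b'').serialize()
parsed = NetworkEnvelope.parse(BytesIO(raw))
```

Parsing from a stream rather than from a bytes object means the same method can read from a `BytesIO` in tests and from a socket file object on a live connection.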
Parsing the Payload
The figure below shows the parsed payload of a version message. The fields are meant to give enough information for two nodes to be able to communicate.
The following network services are currently assigned:
NODE_GETUTXO: Supports the "getutxo" message, which queries UTXOs so that SPV nodes can check for double spends
NODE_BLOOM: Supports Bloom filters
NODE_WITNESS: Supports Segwit
NODE_NETWORK_LIMITED: Supports relaying and verifying all transactions, but serves only the most recent 288 blocks
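The services field is a bit field, so each service above corresponds to one bit of an 8-byte little-endian integer. The sketch below shows one way to decode it. The bit positions are the ones used on the Bitcoin network; `NODE_NETWORK` (bit 0, a full node serving the whole chain) is not in the list above but is included for completeness. The function name `decode_services` is hypothetical.

```python
# Service flags and their bit positions in the services field.
NODE_NETWORK = 1 << 0
NODE_GETUTXO = 1 << 1
NODE_BLOOM = 1 << 2
NODE_WITNESS = 1 << 3
NODE_NETWORK_LIMITED = 1 << 10

SERVICE_NAMES = {
    NODE_NETWORK: 'NODE_NETWORK',
    NODE_GETUTXO: 'NODE_GETUTXO',
    NODE_BLOOM: 'NODE_BLOOM',
    NODE_WITNESS: 'NODE_WITNESS',
    NODE_NETWORK_LIMITED: 'NODE_NETWORK_LIMITED',
}


def decode_services(services):
    '''Return the names of all known service bits set in the integer.'''
    return [name for bit, name in SERVICE_NAMES.items() if services & bit]


# A node advertising full-chain, Segwit, and limited-network support.
services = NODE_NETWORK | NODE_WITNESS | NODE_NETWORK_LIMITED
# The field is serialized as 8 bytes, little-endian, in the version message.
serialized = services.to_bytes(8, 'little')
```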
Setting some reasonable defaults, our VersionMessage class looks like this:
def serialize(self):
    result = int_to_little_endian(self.version, 4)
    result += int_to_little_endian(self.services, 8)
    result += int_to_little_endian(self.timestamp, 8)
    result += int_to_little_endian(self.receiver_services, 8)
    result += b'\x00' * 10 + b'\xff\xff' + self.receiver_ip
    result += self.receiver_port.to_bytes(2, 'big')
    result += int_to_little_endian(self.sender_services, 8)
    result += b'\x00' * 10 + b'\xff\xff' + self.sender_ip
    result += self.sender_port.to_bytes(2, 'big')
    result += self.nonce
    result += encode_varint(len(self.user_agent))
    result += self.user_agent
    result += int_to_little_endian(self.latest_block, 4)
    if self.relay:
        result += b'\x01'
    else:
        result += b'\x00'
    return result
Network Handshake
The network handshake is how nodes establish communication:
A wants to connect to B and sends a version message.
B receives the version message, responds with a verack message, and sends its own version message.
A receives the version and verack messages and sends back a verack message.
B receives the verack message and continues communication.
Once the handshake is finished, A and B can communicate however they want. Note that there is no authentication here, and it’s up to the nodes to verify all data that they receive. If a node sends a bad transaction or block, it can expect to get banned or disconnected.
Connecting to the Network
Network communication is tricky due to its asynchronous nature. To experiment, we can establish a connection to a node on the network synchronously:
>>> import socket
>>> from network import NetworkEnvelope, VersionMessage
>>> # This is a server I’ve set up for testnet. The testnet port is 18333 by default.
>>> host = 'testnet.programmingbitcoin.com'
>>> port = 18333
>>> socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> socket.connect((host, port))
>>> # We create a stream to be able to read from the socket.
>>> # A stream made this way can be passed to all the parse methods.
>>> stream = socket.makefile('rb', None)
>>> # The first step of the handshake is to send a version message.
>>> version = VersionMessage()
>>> envelope = NetworkEnvelope(version.command, version.serialize())
>>> # We now send the message in the right envelope.
>>> socket.sendall(envelope.serialize())
>>> while True:
...     # This line will read any messages coming in through our connected socket.
...     new_message = NetworkEnvelope.parse(stream)
...     print(new_message)
Connecting in this way, we can’t send until we’ve received and can’t respond intelligently to more than one message at a time. A more robust implementation would use an asynchronous library (like asyncio in Python 3) to send and receive without being blocked.
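To give a rough idea of what the asynchronous approach might look like, here is a sketch that reads one envelope from an `asyncio.StreamReader`. It is an illustration under stated assumptions, not the implementation used in the rest of this chapter: the function names `read_envelope` and `demo` are hypothetical, and the reader is fed from memory instead of a real connection so that the example runs offline.

```python
import asyncio
import hashlib

TESTNET_MAGIC = b'\x0b\x11\x09\x07'


def hash256(data):
    '''Two rounds of SHA-256.'''
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


async def read_envelope(reader, magic=TESTNET_MAGIC):
    '''Read one message without blocking other tasks on the event loop.'''
    # The fixed-size header is 4 + 12 + 4 + 4 = 24 bytes.
    header = await reader.readexactly(24)
    if header[:4] != magic:
        raise RuntimeError('unexpected network magic')
    command = header[4:16].rstrip(b'\x00')
    length = int.from_bytes(header[16:20], 'little')
    checksum = header[20:24]
    payload = await reader.readexactly(length)
    if hash256(payload)[:4] != checksum:
        raise RuntimeError('checksum does not match')
    return command, payload


async def demo():
    # Stand-in for a network connection: feed a verack message by hand.
    reader = asyncio.StreamReader()
    message = (TESTNET_MAGIC + b'verack'.ljust(12, b'\x00')
               + (0).to_bytes(4, 'little') + hash256(b'')[:4])
    reader.feed_data(message)
    reader.feed_eof()
    return await read_envelope(reader)


result = asyncio.run(demo())
```

With a real peer, the reader would come from `asyncio.open_connection(host, port)`, and a separate task could write to the paired `StreamWriter` while this coroutine waits for data.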
We also need a verack message class, which we’ll create here:
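A minimal sketch of such a class follows, assuming it uses the same `command`, `parse`, and `serialize` convention as the other message classes in this chapter. A verack message carries no payload, so there is nothing to store or parse.

```python
from io import BytesIO


class VerAckMessage:
    '''Acknowledges a version message; the payload is always empty.'''
    command = b'verack'

    def __init__(self):
        pass

    @classmethod
    def parse(cls, stream):
        # Nothing to read: verack has no fields.
        return cls()

    def serialize(self):
        return b''


verack = VerAckMessage.parse(BytesIO(b''))
```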
Let’s now automate this by creating a class that will handle the communication for us:
class SimpleNode:

    def __init__(self, host, port=None, testnet=False, logging=False):
        if port is None:
            if testnet:
                port = 18333
            else:
                port = 8333
        self.testnet = testnet
        self.logging = logging
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.connect((host, port))
        self.stream = self.socket.makefile('rb', None)

    def send(self, message):
        '''Send a message to the connected node'''
        # The send method sends a message over the socket.
        # The command property and serialize methods are expected to exist in the message object.
        envelope = NetworkEnvelope(
            message.command, message.serialize(), testnet=self.testnet)
        if self.logging:
            print('sending: {}'.format(envelope))
        self.socket.sendall(envelope.serialize())

    def read(self):
        '''Read a message from the socket'''
        # The read method reads a new message from the socket.
        envelope = NetworkEnvelope.parse(self.stream, testnet=self.testnet)
        if self.logging:
            print('receiving: {}'.format(envelope))
        return envelope

    def wait_for(self, *message_classes):
        '''Wait for one of the messages in the list'''
        # The wait_for method lets us wait for any one of several commands
        # (specifically, message classes).
        # Along with the synchronous nature of this class, a method like this makes for easier programming.
        # A commercial-strength node would definitely not use something like this.
        command = None
        command_to_class = {m.command: m for m in message_classes}
        while command not in command_to_class.keys():
            envelope = self.read()
            command = envelope.command
            if command == VersionMessage.command:
                self.send(VerAckMessage())
            elif command == PingMessage.command:
                self.send(PongMessage(envelope.payload))
        return command_to_class[command].parse(envelope.stream())
Now that we have a node, we can handshake with another node:
>>> node = SimpleNode('testnet.programmingbitcoin.com', testnet=True)
>>> # Most nodes don’t care about the fields in version like IP address.
>>> # We can connect with the defaults and everything will be just fine.
>>> version = VersionMessage()
>>> # We start the handshake by sending the version message.
>>> node.send(version)
>>> verack_received = False
>>> version_received = False
>>> # We only finish when we’ve received both verack and version,
>>> # so we keep looping while either one is still missing.
>>> while not verack_received or not version_received:
...     # We expect to receive a verack for our version and the other node’s version.
...     # We don’t know in which order they will arrive, though.
...     message = node.wait_for(VersionMessage, VerAckMessage)
...     if message.command == VerAckMessage.command:
...         verack_received = True
...     else:
...         version_received = True
...         node.send(VerAckMessage())
Exercise 5
Write the handshake method for SimpleNode.
def handshake(self):
    version = VersionMessage()
    self.send(version)
    self.wait_for(VerAckMessage)
Getting Block Headers
When any node first connects to the network, the data that’s most crucial to get and verify is the block headers.
For full nodes, downloading the block headers allows them to asynchronously ask for full blocks from multiple nodes, parallelizing the download of the blocks.
For light clients, downloading headers allows them to verify the proof-of-work in each block.
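To make the proof-of-work check concrete, here is a self-contained sketch that verifies the mainnet genesis block header using only the standard library. The function names mirror helpers used elsewhere in this chapter, but their bodies here are assumptions written for this example. The check passes when the header hash, read as a little-endian integer, is below the target encoded in the 4-byte bits field.

```python
import hashlib


def hash256(data):
    '''Two rounds of SHA-256.'''
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits):
    '''Expand the compact 4-byte bits field into the full target integer.'''
    exponent = bits[-1]
    coefficient = int.from_bytes(bits[:-1], 'little')
    return coefficient * 256 ** (exponent - 3)


def check_pow(header):
    '''Return True if the 80-byte header satisfies its own target.'''
    if len(header) != 80:
        raise ValueError('a block header is exactly 80 bytes')
    bits = header[72:76]
    proof = int.from_bytes(hash256(header), 'little')
    return proof < bits_to_target(bits)


# The mainnet genesis block header, assembled field by field.
GENESIS_HEADER = (
    bytes.fromhex('01000000')    # version
    + bytes(32)                  # previous block (none)
    + bytes.fromhex('3ba3edfd7a7b12b27ac72c3e67768f61'
                    '7fc81bc3888a51323a9fb8aa4b1e5e4a')  # merkle root
    + bytes.fromhex('29ab5f49')  # timestamp
    + bytes.fromhex('ffff001d')  # bits
    + bytes.fromhex('1dac2b7c')  # nonce
)

genesis_ok = check_pow(GENESIS_HEADER)
# Block hashes are conventionally displayed in reversed byte order.
genesis_id = hash256(GENESIS_HEADER)[::-1].hex()
```

Because the check needs only the 80-byte header, a light client can validate the work behind a block without downloading any of its transactions.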
Nodes can give us the block headers without taking up much bandwidth. The command to get the block headers is called getheaders; the following figure shows a parsed getheaders message.
Here's what the GetHeadersMessage class looks like:
class GetHeadersMessage:
    command = b'getheaders'

    def __init__(self, version=70015, num_hashes=1, start_block=None, end_block=None):
        self.version = version
        # For the purposes of this chapter, we’re going to assume that
        # the number of block header groups is 1.
        # A more robust implementation would handle more than a single block group,
        # but we can download the block headers using a single group.
        self.num_hashes = num_hashes
        # A starting block is needed, otherwise we can’t create a proper message.
        if start_block is None:
            raise RuntimeError('a start block is required')
        self.start_block = start_block
        # The ending block we assume to be null,
        # or as many as the server will send to us if not defined.
        if end_block is None:
            self.end_block = b'\x00' * 32
        else:
            self.end_block = end_block
Exercise 6
Write the serialize method for GetHeadersMessage.
def serialize(self):
    result = int_to_little_endian(self.version, 4)
    result += encode_varint(self.num_hashes)
    result += self.start_block[::-1]
    result += self.end_block[::-1]
    return result
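The serialize method relies on `encode_varint`, and the headers parser later relies on its counterpart `read_varint`; neither is shown in this section. The sketch below gives one possible version of both, together with a standalone function that builds the same getheaders payload for the mainnet genesis block. The function `serialize_getheaders` is hypothetical and exists only to keep the example self-contained.

```python
from io import BytesIO


def encode_varint(i):
    '''Encode an integer as a Bitcoin variable-length integer.'''
    if i < 0xfd:
        return bytes([i])
    elif i < 0x10000:
        return b'\xfd' + i.to_bytes(2, 'little')
    elif i < 0x100000000:
        return b'\xfe' + i.to_bytes(4, 'little')
    elif i < 0x10000000000000000:
        return b'\xff' + i.to_bytes(8, 'little')
    raise ValueError('integer too large: {}'.format(i))


def read_varint(stream):
    '''Read a variable-length integer from a stream.'''
    prefix = stream.read(1)[0]
    if prefix == 0xfd:
        return int.from_bytes(stream.read(2), 'little')
    elif prefix == 0xfe:
        return int.from_bytes(stream.read(4), 'little')
    elif prefix == 0xff:
        return int.from_bytes(stream.read(8), 'little')
    return prefix


def serialize_getheaders(start_block, end_block=bytes(32),
                         version=70015, num_hashes=1):
    '''Build a getheaders payload with a single starting block hash.'''
    result = version.to_bytes(4, 'little')
    result += encode_varint(num_hashes)
    # Hashes are given in display order and sent in reversed byte order.
    result += start_block[::-1]
    result += end_block[::-1]
    return result


GENESIS_HASH = bytes.fromhex(
    '000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f')
payload = serialize_getheaders(GENESIS_HASH)
```

A headers response holds at most 2,000 headers, so its count always fits in the three-byte `0xfd` form of the varint.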
Headers Response
We can now create a node, handshake, and then ask for some headers:
Now we need a way to receive the headers from the other node. The other node will send back the headers command. The figure below shows parsed block headers.
We can use the same parsing engine as when parsing a full block:
class HeadersMessage:
    command = b'headers'

    def __init__(self, blocks):
        self.blocks = blocks

    @classmethod
    def parse(cls, stream):
        num_headers = read_varint(stream)
        blocks = []
        for _ in range(num_headers):
            # Each block gets parsed with the Block class’s parse method,
            # using the same stream that we have.
            blocks.append(Block.parse(stream))
            # The number of transactions is always 0 and is a remnant of block parsing.
            num_txs = read_varint(stream)
            # If we didn’t get 0, something is wrong.
            if num_txs != 0:
                raise RuntimeError('number of txs not 0')
        return cls(blocks)
Given the network connection that we’ve set up, we can download the headers, check their proof-of-work, and validate the block header difficulty adjustments as follows:
>>> from io import BytesIO
>>> from network import SimpleNode, GetHeadersMessage, HeadersMessage
>>> from block import Block, GENESIS_BLOCK, LOWEST_BITS
>>> from helper import calculate_new_bits
>>> previous = Block.parse(BytesIO(GENESIS_BLOCK))
>>> first_epoch_timestamp = previous.timestamp
>>> expected_bits = LOWEST_BITS
>>> count = 1
>>> # Handle the communication
>>> node = SimpleNode('mainnet.programmingbitcoin.com', testnet=False)
>>> node.handshake()
>>> for _ in range(19):
...     getheaders = GetHeadersMessage(start_block=previous.hash())
...     node.send(getheaders)
...     headers = node.wait_for(HeadersMessage)
...     for header in headers.blocks:
...         # Check that the proof-of-work is valid.
...         if not header.check_pow():
...             raise RuntimeError('bad PoW at block {}'.format(count))
...         # Check that the current block is after the previous one.
...         if header.prev_block != previous.hash():
...             raise RuntimeError('discontinuous block at {}'.format(count))
...         if count % 2016 == 0:
...             # At the end of the epoch, calculate the next bits/target/difficulty.
...             time_diff = previous.timestamp - first_epoch_timestamp
...             expected_bits = calculate_new_bits(previous.bits, time_diff)
...             print(expected_bits.hex())
...             # Store the first block of the epoch to calculate bits at the end of the epoch.
...             first_epoch_timestamp = header.timestamp
...         # Check that the bits/target/difficulty is what we expect
...         # based on the previous epoch calculation.
...         if header.bits != expected_bits:
...             raise RuntimeError('bad bits at block {}'.format(count))
...         previous = header
...         count += 1
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
ffff001d
6ad8001d
28c4001d
71be001d
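The session above imports `calculate_new_bits` from a helper module without showing it. As a hedged sketch of what such a helper might compute, the code below scales the previous target by the time the epoch actually took relative to two weeks, clamps the adjustment to a factor of 4 in either direction, and caps the result at the lowest-difficulty target. The function bodies and the `MAX_TARGET` constant are assumptions for illustration; only the function name comes from the session.

```python
TWO_WEEKS = 60 * 60 * 24 * 14
# The easiest allowed target, corresponding to bits ffff001d.
MAX_TARGET = 0xffff * 256 ** (0x1d - 3)


def bits_to_target(bits):
    '''Expand the compact 4-byte bits field into the full target integer.'''
    exponent = bits[-1]
    coefficient = int.from_bytes(bits[:-1], 'little')
    return coefficient * 256 ** (exponent - 3)


def target_to_bits(target):
    '''Compress a target integer into the compact 4-byte bits field.'''
    raw_bytes = target.to_bytes(32, 'big').lstrip(b'\x00')
    if raw_bytes[0] > 0x7f:
        # A leading byte above 0x7f would read as negative, so shift right.
        exponent = len(raw_bytes) + 1
        coefficient = b'\x00' + raw_bytes[:2]
    else:
        exponent = len(raw_bytes)
        coefficient = raw_bytes[:3]
    return coefficient[::-1] + bytes([exponent])


def calculate_new_bits(previous_bits, time_differential):
    '''Compute the bits for the next 2016-block epoch.'''
    # Clamp the measured time to between 1/4 and 4 times two weeks.
    time_differential = min(time_differential, TWO_WEEKS * 4)
    time_differential = max(time_differential, TWO_WEEKS // 4)
    new_target = bits_to_target(previous_bits) * time_differential // TWO_WEEKS
    # Difficulty never drops below the minimum.
    new_target = min(new_target, MAX_TARGET)
    return target_to_bits(new_target)


lowest_bits = bytes.fromhex('ffff001d')
slow_epoch = calculate_new_bits(lowest_bits, TWO_WEEKS * 2)
fast_epoch = calculate_new_bits(lowest_bits, TWO_WEEKS // 2)
```

A cap of this kind would explain why the first 15 epochs in the output stay at `ffff001d`: an epoch that takes longer than two weeks at minimum difficulty cannot make the target any easier.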