diff --git a/website/config/_default/config.toml b/website/config/_default/config.toml index 35a2d015..e6936183 100644 --- a/website/config/_default/config.toml +++ b/website/config/_default/config.toml @@ -62,7 +62,7 @@ rel = "sitemap" tag = "tags" [permalinks] - blog = "/blog/:title/" + blog = "/blog/:year/:month/:slugorfilename/" # docs = "/docs/1.0/:sections[1:]/:title/" [minify.tdewolff.html] diff --git a/website/content/en/blog/btc-network/index.md b/website/content/en/blog/btc-network/index.md new file mode 100644 index 00000000..84051fbd --- /dev/null +++ b/website/content/en/blog/btc-network/index.md @@ -0,0 +1,99 @@ +--- +title: "Modelling the Bitcoin Network" +description: | + (draft). This blog post does not have enough content yet. +excerpt: | + (draft). This blog post does not have enough content yet. +date: 2023-02-13 +draft: true +weight: 50 +images: ["taylor-vick-M5tzZtFCOfs-unsplash.jpg"] +categories: ["Constructions"] +tags: ["bitcoin", "network", "assumptions"] +contributors: ["Patrik Keller"] +pinned: false +homepage: false +--- + +**TL;DR:** (draft). This blog post does not have enough content yet. + +In this blog post I'll briefly present a simple but realistic network +scenario that models Bitcoin as it is deployed today. We'll have a look +at the Bitcoin mining statistics and model our +[virtual network]({{< method "simulator" >}}#inputs) accordingly. + +We obtain an estimate of the hash-rate distribution from +[blockchain.com](https://www.blockchain.com/explorer/charts/pools). +The following table shows mining statistics for the last 7 days before +December 14th 2022. 
+ +| Miner / Pool | Relative Hash-Rate | Blocks Mined | +| ------------ | ------------------ | ------------ | +| Foundry USA | 24.817% | 271 | +| AntPool | 20.147% | 220 | +| F2Pool | 15.110% | 165 | +| Binance Pool | 13.736% | 150 | +| ViaBTC | 10.440% | 114 | +| Braiins Pool | 5.311% | 58 | +| Poolin | 2.473% | 27 | +| BTC.com | 1.923% | 21 | +| Luxor | 1.648% | 18 | +| Mara Pool | 1.465% | 16 | +| Ultimus | 0.641% | 7 | +| SBI Crypto | 0.641% | 7 | +| BTC M4 | 0.366% | 4 | +| Titan | 0.275% | 3 | +| Unknown | rest, about 1% | 11 | + +There are 14 big and identifiable miners. The other participants' hash +rates sum up to about 1%. We simplify the network by combining the +unknown participants into a single node. Hence we end up with `n = 15`. + +1092 blocks have been mined over the past 7 days, implying an average +block interval of about 554 seconds. The underlying proof-of-work +puzzle can only be solved by repeated guessing. Thus individual block +intervals are exponentially distributed. We define `mining_delay()` to +meet these observations. `select_miner()` returns a random node +according to the hash-rate distribution in the table. + +Regarding the communication, we make another simplification. We assume +that all nodes are connected to each other and that blocks propagate in +6 seconds with a uniformly distributed jitter of ± 2 seconds. + +We further assume that all nodes are honest. We load [the specification +of Nakamoto consensus]({{< protocol "nakamoto" >}}) from the +module `nakamoto`. 
+ +```python +import nakamoto # protocol specification +from numpy import random + + +def mining_delay(): + return random.exponential(scale=554) + + +def select_miner(): + hash_rates = [0.24817, 0.20147, ..., 0.00275, 0.01] # from table + return random.choice(range(n), p=hash_rates) + + +def neighbours(i): + return [x for x in range(n) if x != i] + + +def message_delay(src, dst): + return 6 + random.uniform(low=-2, high=2) + + +def roots(): + return nakamoto.roots() + + +def validity(block): + return nakamoto.validity(block) + + +def node(i): + return (nakamoto.init, nakamoto.update, nakamoto.mining) +``` diff --git a/website/content/en/blog/btc-network/taylor-vick-M5tzZtFCOfs-unsplash.jpg b/website/content/en/blog/btc-network/taylor-vick-M5tzZtFCOfs-unsplash.jpg new file mode 100644 index 00000000..3d75e703 Binary files /dev/null and b/website/content/en/blog/btc-network/taylor-vick-M5tzZtFCOfs-unsplash.jpg differ diff --git a/website/content/en/blog/emulating-gamma/index.md b/website/content/en/blog/emulating-gamma/index.md index 7a09161d..22e914ab 100644 --- a/website/content/en/blog/emulating-gamma/index.md +++ b/website/content/en/blog/emulating-gamma/index.md @@ -52,8 +52,8 @@ first in case of a block race. I like this approach because it reduces a wide variety of possible network conditions into a single, explainable parameter. The big drawback is the close coupling of the definition with the [attacked -protocol]({{< protocol "nakamoto" >}}) and the [corresponding attack -space](#missing-pieces). +protocol]({{< protocol "nakamoto" >}}) and the corresponding attack +space. ## CPR's Generic Network @@ -65,8 +65,8 @@ sending node, outgoing connection, and message. {{< img-figure src="net4.png" alt="Fully-connected network with 4 nodes" >}} CPR network example. Circles represent nodes, arrows represent outgoing -connections. The four nodes are fully connected; message forwarding is -not necessary. +connections. 
The four nodes are fully connected, thus message +forwarding is not necessary. {{< /img-figure >}} In the remainder of this post, I will describe how to model CPR networks @@ -79,38 +79,37 @@ network assumptions compatible with the Selfish Mining literature. ## Reordering Messages In the Selfish Mining literature, $\gamma$ represents the share of -defenders (by hash-rate) who adopt the attacker's block in case of a -block race. The term block race refers to a situation where one of the -honest nodes mines a new block $b_0$ while the attacker already has -mined---but kept private---a block $b_1$ of the same height. If the -attacker decides to release $b_1$ at the very same time, the defenders' -reactions depend on individual message propagation delays and hence the -network assumptions. Defender's will proceed mining on the block they -received first. Attackers with network-level capabilities can delay -propagation of $b_0$ and hence make the defenders prefer the attacker's -block $b_1$. The prevalent assumption is that, after each block race -$\gamma$ of the defender's hash-rate is used to mine on $b_1$, the rest -on $b_0$. - -The block race situation uses what BFT researches would call message -reordering. The honest miner sends $b_0$ before the attacker sends -$b_1$, but (some of) the other defender nodes receive $b_1$ before $b_0$. -The $\gamma$ parameter abstractly defines how often and to which -defenders this reordering succeeds. - -Based on the more general notion of message reordering, we can derive a -CPR-network exhibiting a given $\gamma.$ +defenders (weighted by hash-rate) who adopt the attacker's block in case +of a block race. The term *block race* refers to a situation where one +of the honest nodes mines a new block $b_0$ while the attacker already +has mined---but kept private---a block $b_1$ of the same height. If the +attacker decides to release $b_1$ at the very same time it is not +obvious how the defenders will react. 
The [protocol]({{< protocol +"nakamoto" >}}) tells nodes to prefer the block they receive first, but +which block that is depends on individual message propagation delays. Attackers with +network-level capabilities can delay propagation of $b_0$ and hence make +the defenders prefer the attacker's block $b_1$. The prevalent +assumption is that, after each block race, $\gamma$ of the defenders' +hash-rate is used to mine on $b_1$, the rest on $b_0$. + +This attack relies on what BFT researchers would typically call *message +reordering*. The honest miner sends $b_0$ before the attacker sends +$b_1$, but (some of) the remaining defender nodes receive $b_1$ before +$b_0$. The $\gamma$ parameter abstractly defines how often the +reordering succeeds. +Based on this more general notion of message reordering, we can derive a +CPR network exhibiting a given $\gamma$. ## Constructing a Network We start by imposing a couple of restrictions on the design space. -First, the network should consist of $n \geq 3$ nodes. One attacker and -at least two defenders. We need at least two defenders---one sends a -message before the attacker does, the other receives the two messages, -potentially in reversed order. Second, all defenders have the same -hash-rate. Third, the nodes are fully-connected, that is, each node has -outgoing connections to all other nodes. Forth, the attacker receives -all messages immediately. Fifth, message delays between defenders are +First, the network should consist of $n \geq 3$ nodes. We need one +attacker and at least two defenders---one sends a message before the +attacker does, the other receives the two messages in potentially +reversed order. Second, all defenders have the same hash-rate. Third, +the nodes are fully-connected, that is, each node has outgoing +connections to all other nodes. Fourth, the attacker receives all +messages immediately. Fifth, message delays between defenders are constant but small, let's say $\varepsilon$. 
Sixth, all outgoing connections from the attacker to defender have the same properties. @@ -126,7 +125,7 @@ With these restrictions in place, we have only one dimension left. How do we have to delay messages from attacker to defender, such that $\gamma$ of the block races end in favor of the attacker? -There are one attacker node and $n - 1$ defender nodes. One of the +We have one attacker node and $n - 1$ defender nodes. One of the defenders sends a message---in Selfish Mining that would be a freshly mined block. The other $n - 2$ defenders receive the message after $\varepsilon$ time. The attacker receives it immediately, sends another @@ -176,7 +175,7 @@ $\gamma$-assumption in Selfish Mining. ## Limiting Gamma -It is common practice to evaluate Selfish Mining for $\gamma = 1$. +Related literature evaluates Selfish Mining for $\gamma = 1$. This is not possible with our approach. But $\gamma = 1$ is also not possible in practice. @@ -194,28 +193,28 @@ first. In our network with $n-1$ defenders this manifest as follows. After any block race, $\frac{1}{n-1}$ of the defenders' hash-rate will be used to mine on the honest block. Consequently, $\gamma$ must be lower or equal $1 - -\frac{1}{n-1} = \frac{n-2}{n-1}$. In reverse, in order to emulate a -given $\gamma$ we have to simulate at least $n \geq \frac{1}{1 - -\gamma}+ 1$ nodes. Setting $\gamma = 1$ implies $n = \infty$ and makes -simulations unfeasible. +\frac{1}{n-1} = \frac{n-2}{n-1}$. Thus, in order to emulate a given +$\gamma$ we have to simulate at least $n \geq \frac{1}{1 - \gamma}+ 1$ +nodes. Setting $\gamma = 1$ implies $n = \infty$ and makes simulations +unfeasible. Astute readers will now (at the latest) start questioning the formulas in the previous section. Why does the limitation presented here not show -up above? It's because one of the steps requires qualification. We used -that $D_i$ is uniformly distributed on the interval from $0$ to $d$. 
-Then we said that $P[D_i < \varepsilon] = \frac{\varepsilon}{d}$. That's -true if $d \geq \varepsilon$ but otherwise the probability is $1$. -As a result, $E[X] \leq n-2$ and $\gamma \leq \frac{n-2}{n-1}$. +up above? It's because one of the steps above requires qualification. We +used that $D_i$ is uniformly distributed on the interval from $0$ to +$d$. Then we said that $P[D_i < \varepsilon] = \frac{\varepsilon}{d}$. +That's true if $d \geq \varepsilon$ but otherwise the probability is +$1$. As a result, $E[X] \leq n-2$ and $\gamma \leq \frac{n-2}{n-1}$. -In the real world, most of the hash-rate is concentrated around a view +In the real world, most of the hash-rate is concentrated around a few nodes. In December 2022, the two biggest Bitcoin mining pools have 26% -and 19% of the overall hash-rate. Assume the bigger one turns malicious -and starts selfish mining. That leaves 74% of the hash rate to the -defender, of which about 26% belong to the second-biggest miner -(the 19% one). In 26% of the block races, the second-biggest miner will -be the sender of the block which the attacker tries to outrun. So in -26% of the cases 26% of the defender hash-rate will not mine on the -attacker's block. This alone makes $\gamma > 0.94$ unrealistic. +and 19% of the overall hash-rate. Let's assume the bigger one turns +malicious and starts selfish mining. That leaves 74% of the hash-rate to +the defender, of which about 26% belong to the second-biggest miner (the +19% one). In 26% of the block races, the second-biggest miner will be +the sender of the block which the attacker tries to outrun. So in 26% of +the cases 26% of the defender hash-rate will not mine on the attacker's +block. This alone makes any $\gamma > 0.94$ unrealistic. ## Choosing Parameters @@ -226,7 +225,7 @@ description]({{< method "simulator" >}}) additionally sets the attacker's relative hash-rate $\alpha$, and the network's overall mining rate $\lambda$. 
-A couple of restrictions follow either from definitions of the +A couple of restrictions follow either from the definitions of the parameters themselves or from the construction above. * Gamma and alpha are fractions, thus $0 \leq \gamma \leq 1$ and $0 \leq @@ -249,22 +248,22 @@ to avoid them. That's possible by setting $\varepsilon \ll \lambda^{-1}$, that is, making the defender's message delay much smaller than the (expected) block interval. A nice trick is to set $\lambda = 1$ which implies that the natural orphan rate is bounded by $\varepsilon$. -We typically use $\lambda = 1$ and $\varepsilon = 10^{-5}$. +We typically use $\lambda = 1$ and $\varepsilon = 10^{-9}$. The network size $n$ matters as well. In Selfish Mining, after *each* block race *exactly* $\gamma$ of the defender hash-rate mines on the attacker's block. In the constructed network, the number of defenders mining on the attacker's block is random. The expected value is $\gamma$ -by construction. But the variance depends on the number of defenders. +by construction, but the variance depends on the number of defenders. Higher $n$ implies more messages per block race and---by the law of large numbers---less variance around $\gamma$. ## Specification -Last step is to produce some Python-like specification that works as +The last step is to produce some Python-like specification that works as input for our [network simulator]({{< method "simulator" >}}). As -promised, it works with any protocol specification (supplied in Python -module `protocol`) and attack (supplied in `attacker`). +promised above, it works with any protocol specification (supplied in +Python module `protocol`) and attack (supplied in `attacker`). 
```python import protocol # protocol specification @@ -275,7 +274,7 @@ from numpy import random n = 42 # network size alpha = 1 / 3 # attacker's relative hash-rate gamma = 0.5 # Selfish Mining's Gamma -epsilon = 1e-5 # propagation delay honest --> honest +epsilon = 1e-9 # propagation delay honest --> honest lambda_ = 1 / 600 # network's mining rate # Safety checks: @@ -290,7 +289,7 @@ def mining_delay(): return random.exponential(scale=1 / lambda_) -def miner(): +def select_miner(): dhr = (1 - alpha) / (n - 1) # defender hash rate hash_rates = [alpha] + [dhr] * (n - 1) return random.choice(range(n), p=hash_rates) diff --git a/website/content/en/blog/launching-website/index.md b/website/content/en/blog/launching-website/index.md index fc8553e9..1736e24d 100644 --- a/website/content/en/blog/launching-website/index.md +++ b/website/content/en/blog/launching-website/index.md @@ -1,9 +1,9 @@ --- title: "Launching the CPR website" description: | - Introducing the CPR website. + (draft). Introducing the CPR website. excerpt: | - Introducing the CPR website. + (draft). Introducing the CPR website. 
date: 2022-11-28 draft: true weight: 50 diff --git a/website/content/en/docs/methods/concurrent_programming_asyncio.py b/website/content/en/docs/methods/concurrent_programming_asyncio.py new file mode 100644 index 00000000..23a331a2 --- /dev/null +++ b/website/content/en/docs/methods/concurrent_programming_asyncio.py @@ -0,0 +1,62 @@ +import asyncio +import time + +loop = asyncio.new_event_loop() + + +def delay(n, fun, *args): + delay.count += 1 + loop.call_later(n, decr_and_call, fun, *args) + + +def decr_and_call(fun, *args): + delay.count -= 1 + fun(*args) + + +delay.count = 0 + + +def delay_until(prop, fun, *args): + if prop(): + fun(*args) + else: + delay(1e-6, delay_until, prop, fun, *args) + + +## end of asyncio boilerplate + +## begin of program / script + + +def f(x): + f.count += 1 + t = time.time() - f.start + print(f"at second {t:.0f}: {x}") + + +f.start = time.time() +f.count = 0 + +f("a") +delay(2, f, "d") +delay(0, f, "c") +delay(1, delay, 2, f, "f") +delay_until(lambda: f.count >= 4, f, "e") +f("b") + +## end of program / script + +## begin of asyncio boilerplate + + +async def main(): + while True: + assert delay.count >= 0 + if delay.count == 0: + return + else: + await asyncio.sleep(1e-6) + + +loop.run_until_complete(main()) diff --git a/website/content/en/docs/methods/protocol-specification.md b/website/content/en/docs/methods/protocol-specification.md index a35c411a..f2d981b1 100644 --- a/website/content/en/docs/methods/protocol-specification.md +++ b/website/content/en/docs/methods/protocol-specification.md @@ -45,14 +45,12 @@ functions mentioned above. ## Blockchain -Protocols specify and use a global, append-only data-structure. [A-priori, blocks form a directed acyclic graph (DAG)](../virtual-environment#blobs-hashes-blockchain). Each block has -an arbitrary number of parent blocks and can store arbitrary data. The -`validity` function then restricts what blocks can be appended and -thereby imposes a certain structure on the DAG. 
- -The `validity` function takes a block as argument. It may return `True` +an arbitrary number of parent blocks and can store arbitrary data. +Protocol designers can impose additional structure by specifying a +restrictive `validity` function. +This function takes a block as argument. It may return `True` or `False`, or it may fail. The specification consumer must ensure that all appended blocks are valid, that is, `validity` returns `True`. @@ -99,7 +97,8 @@ subset](../virtual-environment#visibility-and-communication) of the blockchain depending on what information the other nodes share, when they share it, and how messages propagate through the network. -Nodes are specified with three functions `init`, `update`, and `mining`. +Protocol designers specify the behaviour of honest nodes through three +functions `init`, `update`, and `mining`. `init` takes a list of blocks as argument and returns the node's initial state. The specification consumer must ensure that the list of blocks @@ -119,22 +118,22 @@ block. {{< /code-figure >}} `update` informs the node about new blocks. It takes as argument the -node's old state, the new block, and a string indicating how the new -block became visible locally. It returns a new state, a list of blocks -to share, and a list of blocks to append without proof-of-work. The -protocol designer must ensure that +node's old state, the new block, and a string indicating the source of +the block. It returns a new state, a list of blocks to share, and a list +of blocks to append without proof-of-work. 
The specification consumer +must ensure that * `update` is called on all blocks as they become locally visible, -* the new block given as argument and all its ancestors are locally visible, +* the new block (second argument) and all its ancestors are locally visible, * the returned state will be given as first argument to the next update, * the to-be-shared blocks are validated and sent to the other nodes, * the to-be-appended blocks are validated, appended to the global block DAG, and made visible locally, -* the `event` argument is `"mining"` if the new block was mined +* the third argument is `"mining"` if the new block was mined locally, -* the `event` argument is `"append"` if the new block was appended locally +* the third argument is `"append"` if the new block was appended locally without proof-of-work, and -* the `event` argument is `"network"` if the new block was received from the +* the third argument is `"network"` if the new block was received from the network. {{< code-figure >}} @@ -181,14 +180,14 @@ that * `b.has_pow()` is true if and only if `b` was appended through proof-of-work, and * when a node learns about a successful proof-of-work with -`update(old_state, new_block, event = "mining")` it holds that +`update(old_state, new_block, "mining")` it holds that `new_block == mining(old_state)`. {{< code-figure >}} ```python def mining(b: Block): - return Block(height=b.height + 1, parents=[b], miner=Env.my_id) + return Block(height=b.height + 1, parents=[b], miner=my_id) ``` In [Nakamoto consensus]({{< protocol "nakamoto" >}}) all blocks require @@ -223,20 +222,11 @@ timestamps in orphaned blocks cannot be used. In short, difficulty adjustment is a complex control problem. It even has its own line of research. -Luckily, DA is somewhat orthogonal to consensus. In practice, -proof-of-work protocols require a DAA but in theory can often get away -without. 
Depending on the analysis, we either assume that the true -hash-rate is constant and known or we assume that the DAA does a perfect -job. To support the latter, protocols designers have to specify what -kind of growth the DAA should control for. In Bitcoin, the growth rate -equals increase in block height per time. Other protocols might choose a -different metric. - -The protocol specification defines a single function `progress` that -maps a given block (tip of blockchain) to a scalar value. The engineer -implementing the protocol will contribute a DAA that controls for -constant progress per time. The analyst sometimes assumes constant -progress per time. +Protocol designers use the `progress` function to specify what kind of +growth the DAA should control for. The function maps a given block (tip +of blockchain) to a scalar value. The engineer implementing the protocol +will contribute a DAA that controls for constant progress per time. The +analyst may just assume constant progress per time. {{< code-figure >}} @@ -261,20 +251,20 @@ participant, namely the holders of the private keys and hence users. In principle, crypto-currency users and node operators can be disjunct. -Users may operate a node but they do not have to. But in practice, +Users may operate a node but they do not have to. But in practice blockchain protocols motivate participation as operator by handing out crypto-currency denoted rewards. Hence node operators usually are crypto-currency users. As protocol designers, we want our protocol to accommodate as many -applications as possible. We avoid imposing requirements on the +applications as possible. We thus avoid imposing requirements on the crypto-currency. We just assume that each node has access to a unique identifier `my_id` that can receive rewards. In practice, `my_id` would be the account or wallet address of the node's operator. The specification itself however does not have any notion of crypto-currency address or wallet. 
-We specify reward calculation using four functions `local_tip`, `global_tip`, +We specify incentive mechanisms using four functions `local_tip`, `global_tip`, `history` and `reward`. * `local_tip` takes a node's state as argument and returns its preferred @@ -283,7 +273,7 @@ tip of the chain. tips. * `history` calculates the linear history of the best tip. * `reward` maps a block (in the linear history) to reward assignments. -* A reward assignment assigns a scalar reward to a node id. +* Each reward assignment assigns a scalar reward to a node id. {{< code-figure >}} @@ -326,6 +316,8 @@ calculate different useful metrics like * individual rewards for each block, and * accumulated historic rewards per node for any tip of the chain. -The reward API can model situations where miners get assigned rewards -for blocks that are not part of the linear history. This happens for -example in [tree-structured voting]({{< protocol "parallel-tree" >}}). +Note that this reward specification can model situations where miners +get assigned rewards for blocks that are not part of the linear history. +This happens for example in [tree-structured voting]({{< protocol +"parallel-tree" +>}}). diff --git a/website/content/en/docs/methods/simulator.md b/website/content/en/docs/methods/simulator.md index d67b077d..bfd5b023 100644 --- a/website/content/en/docs/methods/simulator.md +++ b/website/content/en/docs/methods/simulator.md @@ -13,8 +13,6 @@ weight: 230 toc: true --- -{{< alert icon="👉" text="WIP. You are looking at an unfinished page." />}} - Having talked about our [model for virtual protocol executions](../virtual-environment) and about [how we specify protocols](../protocol-specification), we can proceed with the @@ -25,6 +23,9 @@ in a compiled language to achieve good performance. ## Inputs +The simulator takes as input the following assumptions about the network +and participating nodes. + `n: int` defines the number of nodes, `range(n)` addresses the individual nodes. 
@@ -48,77 +49,34 @@ invalid blocks. `node(i: int) -> tuple[init, update, mining]` defines the behaviour of nodes as discussed in the [the section on protocol specifications](../protocol-specification). In the simplest case, when -all nodes are honest, `node(i)` for any `i` returns the tree functions +all nodes are honest, `node(i)` for all `i` returns the three functions `init`, `update`, and `mining` as defined in the protocol specification. In attack scenarios, `node(i)` returns nonconforming functions for the malicious nodes. -`mining_delay() -> float` and `miner() -> int` model proof-of-work -difficulty and hash-rates of the nodes. `mining_delay()` defines the -time between consecutive mining events. It is typically a random -function returning independent and exponentially distributed values. -`miner()` defines which node is successful at a particular mining event. -It is typically a random function which selects nodes based on their -relative hash-rate. - -### Example - -We provide an intuition about the simulator inputs by walking through a -realistic (but still simplified) example scenario. We consider [Nakamoto -consensus]({{< protocol "nakamoto" >}}) as it is deployed in Bitcoin. -The following table shows mining statistics for the last 7 days before -Dec. 14 2022. We got it from -[blockchain.com](https://www.blockchain.com/explorer/charts/pools). - -| Miner / Pool | Relative Hash-Rate | Blocks Mined | -| ------------ | ------------------ | ------------ | -| Foundry USA | 24,817% | 271 | -| AntPool | 20,147% | 220 | -| F2Pool | 15,110% | 165 | -| Binance Pool | 13,736% | 150 | -| ViaBTC | 10,440% | 114 | -| Braiins Pool | 5,311% | 58 | -| Poolin | 2,473% | 27 | -| BTC.com | 1,923% | 21 | -| Luxor | 1,648% | 18 | -| Mara Pool | 1,465% | 16 | -| Ultimus | 0,641% | 7 | -| SBI Crypto | 0,641% | 7 | -| BTC M4 | 0,366% | 4 | -| Titan | 0,275% | 3 | -| Unknown | rest, about 1% | 11 | - -We observe that there are 14 big and identifiable miners. 
The other -participants' hash rates sum up to about 1%. In our simplified scenario -we combine the unknown participants into a single node. Hence we end up -with `n = 15`. - -1092 blocks have been mined over the past 7 days, implying an average -block interval of about 554 seconds. Individual block intervals are -exponentially distributed. To meet the average block interval, we scale -the distribution by factor 554 and let `mining_delay()` do the sampling. -`miner()` returns a random node according to the hash-rate distribution -in the table. - -Regarding the communication, me make a another simplification. We assume -that all nodes are connected to each other and that blocks propagate in -6 seconds with a uniformly distributed jitter of ± 2 seconds. - -We further assume that all nodes are honest and that [the specification -of Nakamoto consensus]({{< protocol "nakamoto" >}}) is provided in the -Python module `nakamoto`. +`mining_delay() -> float` and `select_miner() -> int` model +proof-of-work difficulty and hash-rates of the nodes. `mining_delay()` +defines the time between consecutive mining events. It is typically a +random function returning independent and exponentially distributed +values. `select_miner()` defines which node is successful at a +particular mining event. It is typically a random function which selects +nodes based on their relative hash-rate. 
+ +{{< code-figure >}} ```python import nakamoto # protocol specification from numpy import random +n = 7 + def mining_delay(): - return random.exponential(scale=554) + return random.exponential(scale=600) -def miner(): - hash_rates = [0.24817, 0.20147, ..., 0.00275, 0.01] # from table +def select_miner(): + hash_rates = [w / sum(range(1, n + 1)) for w in range(1, n + 1)] # weights 1..n, normalized to sum to 1 return random.choice(range(n), p=hash_rates) @@ -127,7 +85,7 @@ def neighbours(i): def message_delay(src, dst): - return 6 + random.uniform(low=-2, high=2) + return 6 def roots(): @@ -142,39 +100,293 @@ def node(i): return (nakamoto.init, nakamoto.update, nakamoto.mining) ``` -## Naive real time simulation +Example network with `n = 7` nodes following the [Nakamoto +consensus]({{< protocol "nakamoto" >}}) protocol. All messages are +delayed for 6 seconds. The block interval will be about 10 minutes. +Hash-rates are chosen arbitrarily. + +{{< /code-figure >}} + +## Concurrent Programming + +We describe the simulator as a concurrent program. +The program does not use parallelism---all computations happen +sequentially in a single thread. +We introduce two primitives to delay function evaluation. + +- `delay(n, fun, *args)` schedules the evaluation of `fun(*args)` in `n` + seconds. While waiting, other computations may happen concurrently. +- `delay_until(prop, fun, *args)` schedules the evaluation of +`fun(*args)` as soon as `prop()` returns `True`. While `prop()` returns +`False` other computations may take place concurrently. + +Consider the following example program which prints the letters *a* to +*f* in order. We invite the curious reader to take a look at the +[complete program](../concurrent_programming_asyncio.py), including +an implementation of the two primitives `delay` and `delay_until` using +Python's `asyncio`. 
+ +{{< code-figure >}} + +```python +def f(x): + f.count += 1 + t = time.time() - f.start + print(f"at second {t:.0f}: {x}") + + +f.start = time.time() +f.count = 0 + +f("a") +delay(2, f, "d") +delay(0, f, "c") +delay(1, delay, 2, f, "f") +delay_until(lambda: f.count >= 4, f, "e") +f("b") + +## expected output: +# at second 0: a +# at second 0: b +# at second 0: c +# at second 2: d +# at second 2: e +# at second 3: f +``` + +Example program demonstrating the two scheduling primitives `delay` and +`delay_until`. +{{< /code-figure >}} -TODO. High level description of the simulator as concurrent program with -waiting. +## Block DAG and Visibility -## Discrete event simulation +As [discussed before]({{< method "virtual-environment" >}}), the +simulator maintains a DAG of all blocks mined or appended by any +node. Individual nodes, however, have only a partial view of this DAG. In +our program, we set local views as follows. -TODO. Speeding things up with DES. +```python +dag = DAG() +... +# in this context dag represents the global view on _all_ blocks +with dag.set_view(i): + ... + # in this context visibility of parents and children is restricted + # according to the local view of node i +... +# in this context dag represents the global view on _all_ blocks +``` + +Local views are read-only. The node functions `init`, `mining`, and +`update` cannot append blocks to the DAG. Instead, they return requests +for appending a block to the simulator. We describe below how the +simulator handles these requests. + +The local view makes the node's id accessible to the node by setting +`my_id = i` within the context. + +## Initialization + +We start each simulation with the initialization of the block DAG, +the protocol's root blocks, and the nodes' state. + +```python +# initialize empty block DAG +dag = DAG() + +# obtain drafts for the protocol's root blocks, convert them into +# actual blocks, and make them visible to all nodes. 
+root_blocks = [dag.add(b) for b in roots()]
+for b in root_blocks:
+    for i in range(n):
+        dag.make_visible(b, i)
+
+# nodes' state
+state = [None for _ in range(n)]
+for i in range(n):
+    init, _update, _mining = node(i)
+    with dag.set_view(i):
+        state[i] = init(root_blocks)
+```

-## Brainstorm
+## Proof-of-Work

-To be removed.
+After initialization, the simulator starts a concurrent loop which
+models proof-of-work and mining. The loop repeatedly samples and waits
+for a mining delay, selects a random node as the successful miner,
+obtains a block draft from this miner, converts the draft into a block,
+checks block validity, and informs the miner about the freshly mined
+block.

-State:
+```python
+def proof_of_work():
+    # select random miner
+    i = select_miner()
+    # obtain block draft from miner, based on the miner's current state
+    _init, _update, mining = node(i)
+    with dag.set_view(i):
+        draft = mining(state[i])
+    # convert draft to block w/ proof-of-work
+    block = dag.add(draft)
+    block.has_pow = True
+    # enforce validity of all appended blocks
+    if validity(block):
+        # inform miner about freshly mined block
+        deliver(block, i, "proof-of-work")
+    else:
+        dag.remove(block)
+    # continue proof-of-work loop after random delay
+    delay(mining_delay(), proof_of_work)
+
+
+# enter proof-of-work loop
+delay(mining_delay(), proof_of_work)
+```

-* Global DAG.
-* State for each node.
-* Block-visibility for each node.
+## Block Delivery

-Time:
+Block delivery is what we call informing a node about a new block.
+From the perspective of a node, new blocks might be freshly mined
+locally, just appended locally, or received from the network.
+
+We handle these deliveries in the `deliver` function. The function makes
+the block visible to the node, applies the node's `update` function, and
+handles the returned instructions to share existing or append new
+blocks.
+
+```python
+def deliver(block, i, event):
+    # update local visibility
+    dag.make_visible(block, i)

-(implementation detail, nice to know but maybe not fully describe this
-here)
+    # simulate node i's action
+    _init, update, _mining = node(i)
+    with dag.set_view(i):
+        ret = update(state[i], block, event)

-* Discrete event simulation
-* Time-ordered queue of future events
-* Infinite loop consuming & handling first event in the queue
-* Event handler might queue future events
-* Time between events is skipped
-* Single non-blocking process
+    # store updated state
+    state[i] = ret.state

-Appends:
+    # handle communication
+    for msg in ret.share:
+        broadcast(i, msg)

-Communication:
+    # handle appends w/o proof-of-work
+    for draft in ret.append:
+        append(i, draft)
+```
+
+The [protocol specification]({{< method "protocol-specification" >}})
+assumes that blocks are delivered in order, that is, that the parents of
+a delivered block have been delivered before. For our previous use of
+`deliver` in the proof-of-work loop, this invariant holds trivially.
+The following wrapper ensures in-order delivery for messages received
+from the network.
+
+```python
+def deliver_in_order(block, i, event):
+    def deliver_once(block, i, event):
+        # skip blocks that reached node i through another path already
+        if not dag.is_visible(block, i):
+            deliver(block, i, event)
+
+    def all_parents_visible():
+        for parent in block.parents():
+            if not dag.is_visible(parent, i):
+                return False
+        return True
+
+    delay_until(all_parents_visible, deliver_once, block, i, event)
+```
+
+## Communication
+
+Nodes may request block broadcasts from their `update` function. The
+simulator's [`deliver`](#block-delivery) function forwards these
+requests to the `broadcast` function below. For each of the sending
+node's (`src`) neighbours, this function samples a message delay (`t`)
+and---after waiting for the delay---delivers the block to the receiver
+(`dst`).
+
+```python
+def broadcast(src, block):
+    for dst in neighbours(src):
+        t = message_delay(src, dst)
+        # argument order follows deliver_in_order(block, i, event)
+        delay(t, deliver_in_order, block, dst, "network")
+```
+
+## Non-PoW Blocks
+
+Nodes may request block appends from their `update` function. The
+simulator's [`deliver`](#block-delivery) function forwards these
+requests to the `append` function below. Without delay, the function
+appends the block draft to the DAG, ensures validity, and delivers the
+new block to the requesting node. Blocks originating from this
+mechanism do not carry a proof-of-work.
+
+```python
+def append(i, draft):
+    block = dag.add(draft)
+    block.has_pow = False
+    if validity(block):
+        deliver(block, i, "append")
+    else:
+        dag.remove(block)
+```
+
+## Discrete Event Simulation
+
+[Above](#concurrent-programming) we suggested using concurrent
+programming to implement the scheduling primitives `delay` and
+`delay_until`. If implemented naively, the simulator would spend most
+of the time waiting for delays to pass or checking whether properties
+have become true. A delay of `n` seconds would actually take `n`
+seconds to simulate. Simulations would be real-time, but real-time is
+inconveniently slow in the proof-of-work blockchain context.
+
+We speed up the simulation by skipping the delayed time instead of
+waiting for it to pass. The approach is called [discrete-event
+simulation](https://en.wikipedia.org/wiki/Discrete-event_simulation).
+The trick is to maintain a time-ordered queue of future events and
+define a loop that handles the scheduled events one after another until
+none are left. In the context of this simulator, future events are
+delayed function calls. Handling an event is as simple as evaluating the
+call. Each call may schedule further delayed calls. The time-ordering of
+the queue ensures that the calls happen in the same order as they would
+with naive waiting.
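The event queue described above can be sketched as a small, runnable scheduler. This is an illustration under assumptions, not the optimized simulator: the class name `Scheduler`, its attributes, and the sequence counter used as a tie-breaker are hypothetical choices. The sketch replays the a to f example from the concurrent programming section with virtual time, so the run finishes immediately while reporting the same timestamps.

```python
import heapq
import itertools


class Scheduler:
    """Hypothetical discrete-event scheduler with a virtual clock."""

    def __init__(self):
        self.now = 0
        self._queue = []  # heap of (time, sequence number, fun, args)
        self._seq = itertools.count()  # tie-breaker for events at equal times
        self._waiting = []  # pending (prop, fun, args) from delay_until

    def delay(self, seconds, fun, *args):
        event = (self.now + seconds, next(self._seq), fun, args)
        heapq.heappush(self._queue, event)

    def delay_until(self, prop, fun, *args):
        self._waiting.append((prop, fun, args))

    def _fire_ready(self):
        # evaluate waiting calls whose property holds; repeat, because
        # one call may make another property true
        progress = True
        while progress:
            progress = False
            pending, self._waiting = self._waiting, []
            for prop, fun, args in pending:
                if prop():
                    fun(*args)
                    progress = True
                else:
                    self._waiting.append((prop, fun, args))

    def run(self):
        self._fire_ready()
        while self._queue:
            # jump the clock to the next event instead of waiting
            self.now, _, fun, args = heapq.heappop(self._queue)
            fun(*args)
            self._fire_ready()


sim = Scheduler()
log = []


def f(x):
    log.append((sim.now, x))


f("a")
sim.delay(2, f, "d")
sim.delay(0, f, "c")
sim.delay(1, sim.delay, 2, f, "f")
sim.delay_until(lambda: len(log) >= 4, f, "e")
f("b")
sim.run()

for t, x in log:
    print(f"at second {t}: {x}")
```

The sequence number matters for two reasons: events scheduled for the same time are handled in the order they were scheduled, and the heap never has to compare two function objects, which Python cannot order.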
+
+```python
+from itertools import count
+from queue import PriorityQueue
+
+time = 0
+queue = PriorityQueue()
+ticket = count()  # tie-breaker: equal-time events keep scheduling order
+props = []
+
+
+def delay(seconds, fun, *args):
+    queue.put((time + seconds, next(ticket), fun, args))
+
+
+def delay_until(prop, fun, *args):
+    props.append((prop, fun, args))
+
+
+# simulator code goes here
+...
+
+while not queue.empty():
+    # dequeue and handle next event
+    time, _, fun, args = queue.get()
+    fun(*args)
+    # re-evaluate delay_until properties
+    next_props = []
+    for prop, fun, args in props:
+        if prop():
+            fun(*args)
+        else:
+            next_props.append((prop, fun, args))
+    props = next_props
+```

-Proof-of-work:
+Note that using `delay_until` can cause significant overhead, because
+every pending property is re-evaluated after each event. We introduced
+this scheduling primitive to improve readability of the simulator
+pseudo-code. We completely avoid this kind of scheduling in our
+optimized simulator implementation.
diff --git a/website/content/en/docs/methods/virtual-environment.md b/website/content/en/docs/methods/virtual-environment.md
index 8b92679d..5cc0fbc4 100644
--- a/website/content/en/docs/methods/virtual-environment.md
+++ b/website/content/en/docs/methods/virtual-environment.md
@@ -110,8 +110,8 @@ relationship from block $a$ are called ancestors of $a$. Blocks that
 can be reached with the child relationship from block $a$ are called
 descendants of $a$. Blocks can store arbitrary data in named fields.
 Blocks, their parents, and their fields are persistent. That is,
-changing a block creates a copy of the block. Freshly copied blocks do
-not have any descendants.
+modifying a block creates a copy of the block. Freshly modified blocks
+do not have any descendants.

 Blocks and the parent relationship form a directed acyclic graph
 which we call block DAG. Within the DAG, blocks without parents are
@@ -156,7 +156,7 @@ In practice, verifying hash-links requires full knowledge of the
 referenced blob. Without knowing the blob, its hash cannot be computed.
 Without knowing the hash, the blob cannot be referred to.
It is impossible to distinguish between a link that was intentionally -broken---let's say by replacing the hash with a random number---and a +broken---let's say by replacing it with a random number---and a link to an existing but locally unavailable blob. For the virtual environment, we introduce the concept of local block @@ -164,21 +164,22 @@ visibility. Each block can be either visible to a node or not. Blocks that are locally visible to a node will stay locally visible to this node forever. If a block is locally visible, then all its ancestors are also locally visible. This restriction does not apply to block children -where we local visibility might be partial. +where local visibility can be incomplete. -In other words, we assume that nodes have partial knowledge about the +To summarize, we assume that nodes have partial knowledge about the block DAG. A node's knowledge grows over time as it learns about new blocks in the order imposed by the DAG. -Nodes can create blocks. For this, all parents must be locally visible. -Freshly created blocks are invisible to all nodes but their creator. -Nodes can decide to share blocks. Depending on the communication -assumptions (made elsewhere), the virtual environment makes visible -shared blocks, including all their ancestors, to the other nodes. -Typically, blocks which have never been shared, and whose descendants -have not been shared, will not be visible to any nodes but its creator. -We say these blocks are withheld by their creator. Usually, honest nodes -do not withhold blocks but malicious nodes do. +Nodes can create blocks and append them to the DAG. Naturally, all +parent blocks must be locally visible to the appending node. Freshly +created blocks are invisible to all nodes but their creator. Nodes can +decide to share blocks. If so, the virtual environment makes visible +shared blocks, including all their ancestors, to the other nodes +according to the communication assumptions made elsewhere. 
Typically, +blocks which have never been shared, and whose descendants have not been +shared, will not be visible to any nodes but their creator. We say these +blocks are withheld by their creator. Usually, honest nodes do not +withhold blocks but malicious nodes do. We model the determinism of hash-linked lists by allowing nodes to re-create blocks. If two nodes create two identical blocks---that is, @@ -203,7 +204,7 @@ the nodes themselves. Blocks with a proof-of-work are unique, that means they cannot be re-created by appending a block with the same parents and fields. -Then, the virtual environment chooses which nodes succeed when at +The virtual environment chooses which nodes succeed when at proof-of-work. Whenever a node succeeds, the environment obtains from the node a block proposal (parents, fields, and field contents), creates the corresponding block, sets the proof-of-work property to true, and diff --git a/website/content/en/docs/prologue/faq.md b/website/content/en/docs/prologue/faq.md index fd6ca2e9..eae29ed1 100644 --- a/website/content/en/docs/prologue/faq.md +++ b/website/content/en/docs/prologue/faq.md @@ -27,4 +27,5 @@ no capacity to think about proof-of-stake. Please don't hesitate. You'll find my mail address in [my GitHub profile](https://github.com/pkel). For public communication, consider -creating a [new issue](https://github.com/pkel/cpr/issues). +creating a [new GitHub issue](https://github.com/pkel/cpr/issues) or a [new +GitHub discussion](https://github.com/pkel/cpr/discussions). diff --git a/website/content/en/docs/prologue/introduction.md b/website/content/en/docs/prologue/introduction.md index 33233dc7..4707d4c4 100644 --- a/website/content/en/docs/prologue/introduction.md +++ b/website/content/en/docs/prologue/introduction.md @@ -15,18 +15,77 @@ weight: 100 toc: true --- -## Get started +## Work in Progress -There are different ways for getting started, depending on what you want to do. +The whole website is under construction. 
+There's not much information
+for newcomers yet. At some point I want to provide tutorials for the
+following use-cases.

-Unfortunately, documentation is lacking for all of them. At some point I
-want to provide tutorials the following use-cases.
-
-- Evaluate existing protocols in a virtual environment.
-- Evaluate existing attacks in a virtual environment.
-- Search attacks against existing protocols using reinforcement learning.
+- Evaluate specified protocols in a virtual environment.
+- Evaluate supported attacks in a virtual environment.
+- Search for attacks against specified protocols using reinforcement learning.
 - Specification of protocols.
 - Specification of attack spaces.

 If you want to start now, don't hesitate to contact
 [me](https://github.com/pkel) for support.
+
+## Overview
+
+CPR's core component is a network simulation engine for proof-of-work
+protocols. The engine takes a protocol specification and a network
+topology as input, then simulates the execution of the protocol in the
+network over time. Additional tooling enables automated search for
+attack strategies with reinforcement learning.
+
+I'm working on detailed documentation so that my tooling becomes
+useful for others. So far, I've documented most parts of the core
+simulation engine, most of the specified protocols, and some useful
+network topologies.
+
+The [virtual environment page]({{< method "virtual-environment" >}})
+informally describes CPR's approach to network simulation. The [protocol
+specification page]({{< method "protocol-specification"
+>}}) describes how we specify protocols. The [simulator page]({{< method
+"simulator" >}}) describes how the simulator executes the specified
+protocols in a virtual network. Read these pages in sequence to learn
+about the inner workings of CPR. Then, continue exploring the different
+[proof-of-work consensus protocols]({{< protocol "overview" >}}).
+
+## Python/RL Quickstart
+
+CPR provides an OpenAI Gym environment for attack search with Python RL
+frameworks. You can install it from PyPI if you meet the following
+**requirements**.
+
+- Unix-like operating system with x86_64 support. I've tested Linux
+  (Fedora and Ubuntu) and Mac OSX.
+- CPython, version >= 3.9
+- `pip` package installer for Python.
+
+In a terminal, run `pip install cpr-gym`. Then, you are ready to run the
+following Python snippet, which simulates 2016 steps (mined or received
+blocks) of honest behaviour in [Nakamoto consensus]({{}}).
+
+```python
+import gym
+import cpr_gym
+
+env = gym.make("cpr-nakamoto-v0", episode_len=2016)
+obs = env.reset()
+done = False
+while not done:
+    action = env.policy(obs, "honest")
+    obs, rew, done, info = env.step(action)
+```
+
+Until I catch up with the documentation, consider reading [our
+tests](https://github.com/pkel/cpr/tree/master/gym/tests) to find out
+how to attack the other protocols and how to set important parameters
+like the attacker capabilities [alpha and gamma]({{< post
+"emulating-gamma" >}}).
+
+If the PyPI package does not work for you, consult the [GitHub
+project](https://github.com/pkel/cpr) for instructions on building the
+project from source.
diff --git a/website/content/en/docs/protocols/nakamoto.md b/website/content/en/docs/protocols/nakamoto.md index 9f0bdebf..e0c573a6 100644 --- a/website/content/en/docs/protocols/nakamoto.md +++ b/website/content/en/docs/protocols/nakamoto.md @@ -87,7 +87,7 @@ def update(old: Block, new: Block, event: string): def mining(b: Block): - return Block(height=b.height + 1, parents=[b], miner=Env.my_id) + return Block(height=b.height + 1, parents=[b], miner=my_id) ``` ### Difficulty Adjustment diff --git a/website/content/en/docs/protocols/parallel-tree.md b/website/content/en/docs/protocols/parallel-tree.md index bef7471f..45eae742 100644 --- a/website/content/en/docs/protocols/parallel-tree.md +++ b/website/content/en/docs/protocols/parallel-tree.md @@ -51,10 +51,10 @@ honest behaviour. Imagine a situation where two miners---one weak (let's say 2% of the hash-rate) and one strong (20% of the hash-rate)---mine blocks around the same time and create a fork. Obviously, the weak miner tries to confirm her block, the strong miner tries to confirm the other. -We do not know what the other miners (72% of the hash-rate) do. If we +We do not know what the other miners (78% of the hash-rate) do. If we assume that they are unbiased, that is, they mine one or the other block -with equal probability, we end up in a situation were 38% of the hash -rate tries to confirm the weak miners block and 62% of the hash rate +with equal probability, we end up in a situation were 41% of the hash +rate tries to confirm the weak miners block and 59% of the hash rate tries to confirm the strong miners block. Only one of the blocks will end up in the final blockchain. The other does not get rewards. 
The punishment mechanism for non-linearity is biased in favour of the strong diff --git a/website/content/en/docs/protocols/tailstorm.md b/website/content/en/docs/protocols/tailstorm.md index ee7923c4..419dacb8 100644 --- a/website/content/en/docs/protocols/tailstorm.md +++ b/website/content/en/docs/protocols/tailstorm.md @@ -148,7 +148,7 @@ def preference(old: Block, new: Block): def summarize(b: Block): - assert b.kind == "block" + assert b.kind == "summary" if len(confirming_sub_blocks(b)) < k: return [] # summary infeasible else: diff --git a/website/layouts/shortcodes/post.html b/website/layouts/shortcodes/post.html new file mode 100644 index 00000000..d39dfa63 --- /dev/null +++ b/website/layouts/shortcodes/post.html @@ -0,0 +1 @@ +{{ printf "/blog/%s" (.Get 0) | relref . }} diff --git a/website/package.json b/website/package.json index fedf908a..4618c151 100644 --- a/website/package.json +++ b/website/package.json @@ -17,7 +17,7 @@ "init": "shx rm -rf .git && git init -b main", "create": "exec-bin node_modules/.bin/hugo/hugo new", "prestart": "npm run clean", - "start": "exec-bin node_modules/.bin/hugo/hugo server --bind=0.0.0.0 --disableFastRender", + "start": "exec-bin node_modules/.bin/hugo/hugo server -D --bind=0.0.0.0 --disableFastRender", "prebuild": "npm run clean", "build": "exec-bin node_modules/.bin/hugo/hugo --gc --minify", "build:preview": "npm run build -D -F",