Improved recovery #301

Open
opened 2025-10-21 10:35:29 +00:00 by a1denvalu3 · 5 comments
a1denvalu3 commented 2025-10-21 10:35:29 +00:00 (Migrated from github.com)

Improved Recovery (Efficiency and Privacy)

The way recovery is currently performed is a privacy disaster: the wallet leaks the entire spending history in an attempt to recover unspent user funds. This issue describes the current recovery method, followed by a revised approach which reduces the number of leaked unissued notes by a logarithmic factor and limits the number of leaked issued notes to an arbitrary d.

Current Recovery Modal

Let T be the index of the nonce that was issued in the last BlindMessage (meaning there is no other issued note whose nonce index O > T).

[0] |--------------------------------------------------------------------------------- [T] >                                                                 

On the first iteration our current approach sets a batch size b, then computes the nonces and relative BlindMessages in [0, b-1] and queries the Mint find out which one of them are issued and spent.

[0] |xxxxxxxxxx [b-1] --------------------------------------------------------------- [T] > 
                                                                                 

On the second iteration, we check the next batch of BlindMessagess in [b, 2b-1].

[0] |xxxxxxxxxxxxxxxxxxxx [2b-1] --------------------------------------------------- [T] > 
                                                                                 

This process continues until the $i$-th iteration, when we find that the last batch was returned empty or partially empty.

[0] |xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx [T] > ***************  [i*b-1]

After which it follows that if u is the number of un-issued notes in the last batch, then T = i\cdot b-u-1, and we conclude.
Obviously, this procedure is terrible for privacy because we're linking together our entire wallet activity history and -on top of that- we're revealing nonces for which BlindMessages were unissued (the segment part marked with *).

Other than being non-privacy friendly, this approach is also relatively in-efficient: it takes \mathcal{O}(\frac{n}{b}) network requests, with n being the total number of issued notes.

Improved Approach

We subdivide our improved approach into two "phases" to make it more digestable:

  1. we enstablish a property for spent and issued notes.
  2. we introduce a binary search method to find T with logarithmically fewer queries.

1. Spent-Issued Invariant

Let d be an arbitrary size and K be the index of the nonce used for the last spent note (meaning that there is no other spent note whose nonce is of index O > K). Our invariant provides T - d < K \leq T always .

In english, this means: "The unspent ecash notes are always situated in a region of the nonce space that is never larger than a fixed, pre-decided, size."

Visualization:

[0] |=============================== [d] ============= [K] -------------------------------------- [T] > 

Where = marks nonces used for spent notes and - marks nonces used for issued notes.

This property alone doesn't help us: we still need to scan [0, 2^{32}-1] to find K and T.
But we at least know that, when we find K, T can't be far away (and vice-versa).

This is where things get interesting. Turns out we can use binary search to efficiently determine T and then K, leaking far fewer notes that we would with a linear scan.

Here's how it works. At iteration i = 0 we have:

\displaylines{
\begin{aligned}
r^{(0)} &\leftarrow 2^{32}-1\\
l^{(0)} &\leftarrow 0\\
m^{(0)} &= \frac{l^{(0)}+r^{(0)}}{2}
\end{aligned}
}
i = 0
[l = 0]|-----------------------------------x [m] ---------------------------------- [r = 2^32-1] >                                                                 

The wallet only queries the Mint with the BlindedMessage for the nonce with index m.
If m was issued then we restrict the range from the left (we update l). Vice-versa if m is un-issued (empty response) we restrict the range from the right (update r).

i = 1
[l]|--------------- x [m] --------------- [r] ---------------------------------------- [2^32-1] >

i = 2
|=================== [l] ----- x[m] ----- [r] ---------------------------------------- [2^32-1] >

i = n
...

Formally, we have:

\displaylines{
(r^{(i+1)}, l^{(i+1)}) = 
\begin{cases}
r^{(i)}, \frac{l^{(i)} + r^{(i)}}{2} & \text{if m was issued}\\
\frac{l^{(i)} + r^{(i)}}{2}, l^{(i)} &  \text{if m was un-issued}
\end{cases}
}

At the end of this process, we are ideally left with l = m = r, which is our T.
We can repeat this a second time to find K, now with a much smaller admissible range which, due to our invariant, is [T-d, T]

Once both T and K are determined, we can get the unspent notes with a single network request to the mint containing T-K blinded messages whose nonces have indices in [T-K, T].

Note

In this last step, we also save network calls because we don't need to determine the spent status of the notes
we recovered.

Result

Privacy

We improve on privacy, by only revealing \log_2{N} + \log_2{d} + d notes at worst, with N being the nonce space (i.e. 2^{32}).

Efficiency

We improve on efficiency for the same reason: we make fewer network requests.

Consequences

This results don't come for free. Maintaining the invariant described above would require:

  1. Storing an additional counter for K
  2. Storing the index of the secret derivation inside the Proof object.
  3. When sending, we give up our ability to coin-select most efficiently. Instead we sum the value of ecash notes in order of derivation until we get to at least the target amount. ALTERNATIVELY, we could allow arbitrary coin-selection with the condition that -if d is exceeded by a quantity e - we reissue the oldest e unspent notes and update K.
  4. When receiving, we would need to make sure that the gap between T and K doesn't get larger than d. Therefore, we would have to re-issue some of the notes from index K onwards, under bigger denominations. If this can't be done the wallet would be at capacity and the operation should be aborted.
## Improved Recovery (Efficiency and Privacy) The way recovery is currently performed is a privacy disaster: the wallet leaks the entire spending history in an attempt to recover unspent user funds. This issue describes the current recovery method, followed by a revised approach which reduces the number of leaked **unissued** notes by a logarithmic factor and limits the number of leaked **issued** notes to an arbitrary $d$. ## Current Recovery Modal Let $T$ be the index of the nonce that was issued in the last `BlindMessage` (meaning there is no other issued note whose nonce index $O > T$). ``` [0] |--------------------------------------------------------------------------------- [T] > ``` On the first iteration our current approach sets a batch size $b$, then computes the nonces and relative `BlindMessage`s in $[0, b-1]$ and queries the Mint find out which one of them are issued and spent. ``` [0] |xxxxxxxxxx [b-1] --------------------------------------------------------------- [T] > ``` On the second iteration, we check the next batch of `BlindMessages`s in $[b, 2b-1]$. ``` [0] |xxxxxxxxxxxxxxxxxxxx [2b-1] --------------------------------------------------- [T] > ``` This process continues until the $i$-th iteration, when we find that the last batch was returned empty or partially empty. ``` [0] |xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx [T] > *************** [i*b-1] ``` After which it follows that if $u$ is the number of un-issued notes in the last batch, then $T = i\cdot b-u-1$, and we conclude. Obviously, this procedure is terrible for privacy because we're linking together our entire wallet activity history and -on top of that- we're revealing nonces for which `BlindMessage`s were unissued (the segment part marked with `*`). Other than being non-privacy friendly, this approach is also relatively in-efficient: it takes $\mathcal{O}(\frac{n}{b})$ network requests, with $n$ being the total number of issued notes. ## Improved Approach We subdivide our improved approach into two "phases" to make it more digestable: 1. we enstablish a property for spent and issued notes. 2. we introduce a binary search method to find $T$ with logarithmically fewer queries. ### 1. Spent-Issued Invariant Let $d$ be an arbitrary size and $K$ be the index of the nonce used for the *last* **spent** note (meaning that there is no other **spent** note whose nonce is of index $O > K$). Our invariant provides $T - d < K \leq T$ **always** . In english, this means: *"The unspent ecash notes are always situated in a region of the nonce space that is never larger than a fixed, pre-decided, size."* Visualization: ``` [0] |=============================== [d] ============= [K] -------------------------------------- [T] > ``` Where `=` marks nonces used for spent notes and `-` marks nonces used for issued notes. This property alone doesn't help us: we still need to scan $[0, 2^{32}-1]$ to find $K$ and $T$. But we at least know that, when we find $K$, $T$ can't be far away (and vice-versa). ### 2. Nonce space binary search This is where things get interesting. Turns out we can use [binary search](https://en.wikipedia.org/wiki/Binary_search) to efficiently determine $T$ and then $K$, leaking far fewer notes that we would with a linear scan. Here's how it works. At iteration $i = 0$ we have: ```math \displaylines{ \begin{aligned} r^{(0)} &\leftarrow 2^{32}-1\\ l^{(0)} &\leftarrow 0\\ m^{(0)} &= \frac{l^{(0)}+r^{(0)}}{2} \end{aligned} } ``` ``` i = 0 [l = 0]|-----------------------------------x [m] ---------------------------------- [r = 2^32-1] > ``` The wallet only queries the Mint with the `BlindedMessage` for the nonce with index $m$. If $m$ was issued then we restrict the range from the left (we update $l$). Vice-versa if $m$ is un-issued (empty response) we restrict the range from the right (update $r$). ``` i = 1 [l]|--------------- x [m] --------------- [r] ---------------------------------------- [2^32-1] > i = 2 |=================== [l] ----- x[m] ----- [r] ---------------------------------------- [2^32-1] > i = n ... ``` Formally, we have: ```math \displaylines{ (r^{(i+1)}, l^{(i+1)}) = \begin{cases} r^{(i)}, \frac{l^{(i)} + r^{(i)}}{2} & \text{if m was issued}\\ \frac{l^{(i)} + r^{(i)}}{2}, l^{(i)} & \text{if m was un-issued} \end{cases} } ``` At the end of this process, we are ideally left with $l = m = r$, which is our $T$. We can repeat this a second time to find $K$, now with a much smaller admissible range which, due to our invariant, is $[T-d, T]$ Once both $T$ and $K$ are determined, we can get the unspent notes with a **single** network request to the mint containing $T-K$ blinded messages whose nonces have indices in $[T-K, T]$. > [!NOTE] > In this last step, we also save network calls because we don't need to determine the spent status of the notes > we recovered. ## Result ### Privacy We improve on privacy, by only revealing $\log_2{N} + \log_2{d} + d$ notes at worst, with $N$ being the nonce space (i.e. $2^{32}$). ### Efficiency We improve on efficiency for the same reason: we make fewer network requests. ## Consequences This results don't come for free. Maintaining the invariant described above would require: 1. Storing an additional counter for $K$ 2. Storing the index of the secret derivation inside the `Proof` object. 3. When sending, we give up our ability to *coin-select* most efficiently. Instead we sum the value of ecash notes in order of derivation until we get to at least the target amount. **ALTERNATIVELY**, we could allow arbitrary coin-selection with the condition that -if $d$ is exceeded by a quantity $e$ - we reissue the oldest $e$ unspent notes and update $K$. 4. When receiving, we would need to make sure that the gap between $T$ and $K$ doesn't get larger than $d$. Therefore, we would have to re-issue some of the notes from index $K$ onwards, under bigger denominations. If this can't be done the wallet would be at capacity and the operation should be aborted.
davidcaseria commented 2025-10-21 14:30:22 +00:00 (Migrated from github.com)

When sending, we give up our ability to coin-select most efficiency. Instead we sum the value of ecash notes in order of derivation until we get to at least the target amount.

Since recovery is infrequent, ideally never used, and coin selection is a routine operation for a wallet, it doesn't seem to me that the benefit justifies this consequence.

> When sending, we give up our ability to coin-select most efficiency. Instead we sum the value of ecash notes in order of derivation until we get to at least the target amount. Since recovery is infrequent, ideally never used, and coin selection is a routine operation for a wallet, it doesn't seem to me that the benefit justifies this consequence.
a1denvalu3 commented 2025-10-21 14:58:11 +00:00 (Migrated from github.com)

Since recovery is infrequent

recovery as it is, however infrequent, destroys the privacy properties of the ecash retroactively for the entire history of the wallet.

Note that I'm not expecting to see this implemented any time soon. It's necessary but definetely not urgent.

> Since recovery is infrequent recovery as it is, however infrequent, destroys the privacy properties of the ecash retroactively for the entire history of the wallet. Note that I'm not expecting to see this implemented any time soon. It's necessary but definetely not urgent.
robwoodgate commented 2025-10-23 12:08:00 +00:00 (Migrated from github.com)

This is a fantastic use for binary search, but the consequences 3 & 4 are pretty restrictive, and may be hard to swallow.

3) Coinselection in order of derivation

The loss of good coinselection is a blow - if you HAVE to take oldest proofs in order, you can't optimize for fees. For example, you might end up burning loads of single unit proofs in fees, rather than being able to use them judiciously to cover fees in larger swaps.

The only upside to that might be that it takes "old proofs" out of circulation faster, which is useful for key rotations.

But I'm not sure the gain will outweigh the potential pain.

Maybe this is less of a concern if the wallet does good target denomination management?

4) Arbitrary capacity (d)

Again, this creates high potential for fee churn, unless the wallet is good at managing target denominations.

Do you envisage the user being able to set this arbitrary capacity (ie take on more privacy leak risk for convenience)?

I wonder if we need the d restriction at all... with a binary search, even a search from origin is never going to leak a fantastic number of proofs vs an arbitrary d. The chop will narrow down the field pretty quickly.

Less consequential option?

Maybe the "halfway-house" is a recovery that does binary search to find T, starting at some "typical" (arbitrary) point nearer the start than the end of the nonce space.

That way, there isn't as much leakage of unissued nonces as current method.

Be interesting to see what people think of the pros and cons of this approach.

This is a fantastic use for binary search, but the consequences 3 & 4 are pretty restrictive, and may be hard to swallow. ### 3) Coinselection in order of derivation The loss of good coinselection is a blow - if you HAVE to take oldest proofs in order, you can't optimize for fees. For example, you might end up burning loads of single unit proofs in fees, rather than being able to use them judiciously to cover fees in larger swaps. The only upside to that might be that it takes "old proofs" out of circulation faster, which is useful for key rotations. But I'm not sure the gain will outweigh the potential pain. Maybe this is less of a concern if the wallet does good target denomination management? ### 4) Arbitrary capacity (`d`) Again, this creates high potential for fee churn, unless the wallet is good at managing target denominations. Do you envisage the user being able to set this arbitrary capacity (ie take on more privacy leak risk for convenience)? I wonder if we need the `d` restriction at all... with a binary search, even a search from origin is never going to leak a fantastic number of proofs vs an arbitrary `d`. The chop will narrow down the field pretty quickly. ### Less consequential option? Maybe the "halfway-house" is a recovery that does binary search to find `T`, starting at some "typical" (arbitrary) point nearer the start than the end of the nonce space. That way, there isn't as much leakage of unissued nonces as current method. Be interesting to see what people think of the pros and cons of this approach.
a1denvalu3 commented 2025-10-25 08:47:43 +00:00 (Migrated from github.com)

I wonder if we need the d restriction at all...

The d-invariant is necessary if you want to ensure a full recovery (meaning you are certain you're not leaving behind nonces for unspent notes).

BUT here's how I think it could be relaxed: we could allow arbitrary coin-selection with the condition that -if d is exceeded by a quantity $e$- we reissue the oldest e unspent notes and update K.

This is way simpler and more compatible with the current wallet implementations.

> I wonder if we need the `d` restriction at all... The d-invariant is necessary if you want to ensure a full recovery (meaning you are certain you're not leaving behind nonces for unspent notes). **BUT** here's how I think it could be relaxed: we could allow arbitrary coin-selection with the condition that -if $d$ is exceeded by a quantity $e$- we *reissue* the oldest $e$ unspent notes and update $K$. This is way simpler and more compatible with the current wallet implementations.
gandlafbtc commented 2025-11-03 10:57:40 +00:00 (Migrated from github.com)

There is also the option to have zero leaks, by downloading all BM, signatures and spent secrets from the mint and do the search locally.

Not elegant, but best privacy

There is also the option to have zero leaks, by downloading all BM, signatures and spent secrets from the mint and do the search locally. Not elegant, but best privacy
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
forgejo-admin/nuts#301
No description provided.