[ExI] possible scheme for privacy

Mon Jul 21 13:21:46 UTC 2014

On Monday, July 21, 2014 12:09 AM, Rafal Smigrodzki wrote,
> On Sun, Jul 20, 2014 at 12:52 PM, Harvey Newstrom wrote:
> > In TOR, a far-reaching entity can monitor a very large number of ISPs
> > and exit nodes to match traffic patterns between the sender (real IP and
> > encrypted message) and the TOR exit node (fake IP and unencrypted
> > message) to link the real IP with the unencrypted message.  Although TOR is
> > generally safe, there is no way to prevent a big enough monitoring system
> > from catching everything.
> > (https://en.wikipedia.org/wiki/Tor_(anonymity_network)#cite_note-torproject-fail-both-ends-32)
>
> ### How about using a continuous data stream from all users to cover up
> actual usage pattern? If I am sending and receiving an encrypted one-time
> pad-randomized 1 kb/s 24/7, I can send and receive an arbitrary number of
> text messages without an adversary being able to determine when or what I
> send to or from whom, unless they have the one time pad and full access to
> the stream of data and access to the nodes routing my data stream, or they
> have hardware access on my end (i.e. they own me anyway).

In theory, yes, this would be the perfect answer.  However, the devil is in the details, as usual.
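To make the proposal concrete, here is a minimal sketch of a constant-rate stream in Python.  It assumes "1 kb/s" means roughly one kilobit per second, so it emits one 125-byte cell per tick.  The cell layout (a one-byte length header followed by payload and random padding) and the names CELL_SIZE and make_cells are hypothetical, chosen only for illustration.  Encryption is left out; a real system would have to encrypt every cell so the header and padding are indistinguishable from data.

```python
import os

CELL_SIZE = 125  # bytes per tick; assumes "1 kb/s" means about 1 kilobit per second


def make_cells(messages, ticks, cell_size=CELL_SIZE):
    """Return exactly `ticks` fixed-size cells, whether or not there is real data.

    Real message bytes fill the cells first; everything else is random padding.
    Bytes that do not fit within `ticks` cells are simply dropped in this sketch.
    """
    queue = b"".join(messages)
    payload_max = cell_size - 1  # one byte is reserved for the length header
    cells = []
    for _ in range(ticks):
        chunk, queue = queue[:payload_max], queue[payload_max:]
        # Header byte = number of real payload bytes (0 means pure cover traffic).
        padding = os.urandom(payload_max - len(chunk))
        cells.append(bytes([len(chunk)]) + chunk + padding)
    return cells


idle = make_cells([], ticks=10)
busy = make_cells([b"meet at noon", b"bring the keys"], ticks=10)

# An observer counting cells and bytes sees the same thing in both cases.
print([len(c) for c in idle] == [len(c) for c in busy])
```

At this level of abstraction the idle and busy streams have the same number of cells and the same cell sizes.  The sketch says nothing about how those cells are actually put on the wire.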

The problem is the way applications, operating systems, and routers fragment packets and send them.  Even if all users send identical messages at identical times, their different environments will split them into different numbers of packets and buffer them into different timing patterns.  So the traffic-analysis fingerprint would still vary between users doing the exact same thing.
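As a rough illustration of the fragmentation point, the toy model below sends the same 4000-byte message from two hypothetical environments.  The function packetize and its parameters (segment size, application write size, flush interval) are assumptions made up for this sketch, not a model of any real TCP/IP stack.  The value 1460 is the usual maximum segment size on a 1500-byte Ethernet MTU; the other numbers are arbitrary.

```python
def packetize(message_len, mss, app_write_size, flush_interval_ms):
    """Return a list of (send_time_ms, packet_size) tuples for one message.

    The application hands data to the OS in chunks of `app_write_size`,
    one chunk every `flush_interval_ms`; the OS splits each chunk into
    packets of at most `mss` bytes.
    """
    packets = []
    t = 0
    remaining = message_len
    while remaining > 0:
        write = min(app_write_size, remaining)
        remaining -= write
        while write > 0:
            size = min(mss, write)
            packets.append((t, size))
            write -= size
        t += flush_interval_ms
    return packets


# Same message, two hypothetical environments.
user_a = packetize(4000, mss=1460, app_write_size=4096, flush_interval_ms=5)
user_b = packetize(4000, mss=1400, app_write_size=1024, flush_interval_ms=20)

print(user_a)  # three packets, all sent at t=0
print(user_b)  # four packets, spread over 60 ms
```

Both users send exactly 4000 bytes, yet the packet counts, sizes, and timings differ, and that difference is what a traffic analyst would see.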

And it's more than just the intended traffic that needs to be made uniform.  The underlying TCP/IP stacks on the operating systems and routers will differ in how they handle lost packets, resend requests, time-out durations, dynamic window sizes, throughput optimization, buffer sizes and related delays, etc.  To make the traffic look identical, all users would have to use the exact same hardware, software, operating systems, TCP/IP stacks, patching levels, router brands, memory/disk sizes and delays, number of hops in their local network, and constant unchanging traffic loads on their local networks.  To be even more extreme, timing delays or error rates could differ based on which brand of Ethernet cable they use and how far the cables are from the electrical wires in each home.  There would be no way to make everything exactly identical.

Beyond the above items that might be within the user's control, there is no way all users could obtain the same distance/delay to their local ISP, or have all ISPs using the exact same hardware, software, operating systems, TCP/IP stacks, patching levels, router brands, memory/disk sizes and delays, number of hops in their local network, and constant unchanging traffic loads on their metropolitan area networks.  The extremely complex chain of connectivity between each user and their ISP will add traffic analysis signatures unique to that user, but outside their control, somewhere between their location and their ISP.

At first glance, this seems unlikely to be doable by individual users.  Maybe if a whole apartment building or neighborhood block merged their traffic and tunneled it through a shared VPN, they might be able to mask individual differences.  But then they would be traceable back to that local group.  As long as each person has an individual data stream to their ISP, they will probably have unique traffic analysis signatures.
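The shared-tunnel idea can be sketched the same way.  In the toy example below, tunnel_view and the apartment names and byte counts are hypothetical; it only shows that an observer outside the tunnel sees the sum of everyone's traffic, so two different assignments of activity to apartments can look identical from outside.

```python
def tunnel_view(per_user_bytes):
    """Bytes per second as seen by an observer outside the shared tunnel."""
    return [sum(second) for second in zip(*per_user_bytes.values())]


# Two scenarios: the activity of apt_1 and apt_3 is swapped.
scenario_1 = {"apt_1": [900, 0, 0], "apt_2": [100, 500, 0], "apt_3": [0, 500, 300]}
scenario_2 = {"apt_1": [0, 500, 300], "apt_2": [100, 500, 0], "apt_3": [900, 0, 0]}

print(tunnel_view(scenario_1))
print(tunnel_view(scenario_2))
```

The outside view is the same in both scenarios, so the observer cannot tell which apartment was active.  The total is still visible, though, and it still points back to that one building.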

--
Harvey Newstrom   www.HarveyNewstrom.com