This post will focus on the performance and scalability of fteproxy. Specifically, we're going to look at its maximum throughput in its default configuration and how it scales as the number of simultaneously connected users increases. This is motivated, in part, by the spike in fteproxy users in February, which raises the question: how many fteproxy Tor bridges do we need if the number of users increases from hundreds, to thousands, to tens of thousands, and so on?
TL;DR We can support at least two orders of magnitude more users with the existing infrastructure of six Tor bridges.
Our Environment
We spun up two AWS instances. Our client is an m3.large instance in the US East (N. Virginia) region. Our server is an m3.medium instance in the US West (Oregon) region. This asymmetry ensures that the bottleneck in our tests is the server-side m3.medium instance, so we can observe its performance limitations.
To establish a baseline for throughput between the two instances, we use iperf. On our server we run:
$ iperf -s -p 8081
This spins up a listener on the server’s port 8081. Then on our client we run:
$ iperf -c $SERVER_IP -p 8081 -P 1
------------------------------------------------------------
Client connecting to $SERVER_IP, TCP port 8081
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  3] local $CLIENT_IP port 34918 connected with $SERVER_IP port 8081
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.1 sec   192 MBytes   160 Mbits/sec
iperf establishes a single connection, sends as much data as possible for 10 seconds, and then reports the throughput. This test reports ~160 Mbits/sec, a result that was consistent across multiple invocations.
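To confirm that this figure is stable, we can repeat the measurement a few times. A minimal sketch, assuming the server-side listener above is still running (tail grabs iperf's final throughput summary line):

$ # run five 10-second baseline measurements
$ for i in 1 2 3 4 5; do iperf -c $SERVER_IP -p 8081 -P 1 -t 10 | tail -n 1; done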
fteproxy: Single-Connection Performance
Next, we want to use iperf to test the performance of fteproxy between the client and the server. We install fteproxy 0.2.19 via PyPI and then spin up the fteproxy client on our client machine:
$ fteproxy --mode client --server_ip $SERVER_IP
and the fteproxy server on our server:
$ fteproxy --mode server --server_ip 0.0.0.0
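It helps to keep fteproxy's default port layout in mind, since the commands that follow depend on it: the client listens locally on port 8079 and forwards encoded traffic to the server on port 8080, and the server relays decoded traffic to a proxy port, 8081, on its side. That is why our iperf listener sits on port 8081. The two commands above are equivalent to spelling out the defaults explicitly; a sketch, with flag names and port numbers per the fteproxy 0.2.x documentation:

$ # client: listen on 127.0.0.1:8079, forward to the fteproxy server on port 8080
$ fteproxy --mode client --client_ip 127.0.0.1 --client_port 8079 --server_ip $SERVER_IP --server_port 8080

$ # server: listen on 0.0.0.0:8080, relay to the iperf listener on 127.0.0.1:8081
$ fteproxy --mode server --server_ip 0.0.0.0 --server_port 8080 --proxy_ip 127.0.0.1 --proxy_port 8081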
Then, client-side, we can run iperf against fteproxy's local listening port in order to tunnel iperf's traffic through fteproxy:
$ iperf -c 127.0.0.1 -p 8079 -P 1
------------------------------------------------------------
Client connecting to 127.0.0.1, TCP port 8079
TCP window size: 2.50 MByte (default)
------------------------------------------------------------
[  3] local 127.0.0.1 port 53831 connected with 127.0.0.1 port 8079
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.1 sec   170 MBytes   142 Mbits/sec
Single-connection performance turns out to be exceptional. We achieve 142 Mbits/sec through the tunnel, compared to 160 Mbits/sec with a direct connection: a throughput penalty of only about 11%.
fteproxy: Multiple-Connection Performance
As our final test, we use the same configuration as in the single-connection test, but this time use iperf's "-P" switch to create multiple simultaneous data streams. We performed this test for P = 10, 20, …, 100 and observed the following:
With 100 simultaneous streams, fteproxy achieves roughly 1.5 Mbits/sec of throughput per stream, compared to 3.8 Mbits/sec per stream with a direct connection.
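For reference, a sweep like this is easy to script. A minimal sketch, assuming the fteproxy tunnel above is still up; for multi-stream runs, iperf's final [SUM] line reports the aggregate throughput, which we divide by P to get the per-stream figure:

$ # sweep from 10 to 100 parallel streams in steps of 10
$ for P in $(seq 10 10 100); do echo "P=$P"; iperf -c 127.0.0.1 -p 8079 -P $P | tail -n 1; done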
Conclusion
A cursory analysis of server-side performance during these tests indicates that fteproxy is, in fact, CPU-bound with 100 simultaneous users. However, this test is extreme in the sense that every connection attempts to send data as aggressively as possible; in practice, a user who is, say, browsing the web through fteproxy will present a far less demanding workload. Hence, an fteproxy server hosted on an m3.medium instance will likely scale well beyond 100 simultaneous users.
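One way to check this server-side, while a multi-stream test is running, is to sample the fteproxy process's CPU usage. A minimal sketch using pidstat from the sysstat package; the pgrep pattern is an assumption about how the process is named:

$ # report fteproxy's CPU usage once per second
$ pidstat -u -p $(pgrep -f fteproxy | head -n 1) 1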
Given that fteproxy usage in Tor has peaked at roughly 300 users/day, it appears that fteproxy will scale to serve two orders of magnitude more users with our current infrastructure of six servers.