How Onion Routing Works: Circuit Building, Relay Types and Hidden Services

Onion routing is the foundational technology behind the Tor network, the most widely deployed anonymous communication system in the world. Originally developed in the mid-1990s at the United States Naval Research Laboratory by Paul Syverson, Michael Reed, and David Goldschlag, onion routing was designed to protect intelligence communications from traffic analysis. The technology was later released as open-source software, and the Tor Project was established as a nonprofit organization to maintain and develop it. Today, Tor serves millions of users worldwide, including journalists, activists, law enforcement, military personnel, and ordinary citizens who value their privacy. This article provides a comprehensive technical examination of how onion routing works, from the fundamental cryptographic principles through circuit construction, relay architecture, directory authorities, and the hidden services protocol that enables .onion addresses.

Fundamental Principles of Onion Routing

The core insight of onion routing is that anonymity can be achieved by separating identification from routing. In normal internet communication, every packet contains both the source and destination IP addresses, allowing any observer along the network path to determine who is communicating with whom. Onion routing breaks this linkage by encrypting the communication in multiple layers and routing it through a series of intermediate nodes, each of which can only decrypt one layer of encryption. No single node in the path knows both the origin and the destination of the communication. The entry node knows who the user is but not where they are going. The exit node knows the destination but not who the user is. The middle node knows neither. This architecture provides unlinkability: the property that an observer cannot determine that two communication endpoints are connected.

The "onion" metaphor refers to the layered encryption structure. When a Tor client sends data through a circuit, it encrypts the payload once for each relay in the path (three times for a standard circuit), using the symmetric session key it negotiated with that relay during circuit construction rather than the relay's public key. The result is nested layers of encryption like the layers of an onion. As the message passes through each relay, one layer of encryption is peeled away, leaving a cell that the relay forwards to the next hop. The complete Tor specification documents this process in exhaustive technical detail.

Cryptographic Foundations

Tor's security rests on well-established cryptographic primitives. The current Tor protocol uses a combination of Ed25519 (for long-term relay identity keys, alongside legacy RSA identity keys that are still published for compatibility), Curve25519 (for ephemeral key agreement via the ntor handshake), AES-128-CTR (for symmetric stream encryption), and SHA-256 and SHA3-256 (for hashing and integrity verification, with SHA-1 still present in some legacy parts of the protocol). The ntor handshake, introduced in Tor 0.2.4, replaced the older TAP (Tor Authentication Protocol) handshake and provides significantly better performance and security properties, including forward secrecy and one-way authentication.

Forward secrecy is a critical property: it means that even if a relay's long-term keys are compromised in the future, previously recorded traffic cannot be decrypted. This is achieved through the use of ephemeral Diffie-Hellman key exchanges. Each circuit uses fresh key material that is discarded when the circuit is closed. The cryptographic implementation details are specified in the Tor specification repository on GitHub, which contains the authoritative technical documentation for the protocol.

The ntor Handshake

The ntor handshake is a one-round-trip authenticated key exchange protocol. The client knows the relay's identity (in the original ntor handshake, the fingerprint of the relay's legacy RSA identity key; the newer ntor-v3 variant uses the relay's Ed25519 identity key) and its ntor onion key (a Curve25519 public key), both of which are published in the relay's descriptor. The client generates an ephemeral Curve25519 keypair and sends the ephemeral public key, along with the relay's identity and onion key, to the relay. The relay generates its own ephemeral keypair, combines the client's ephemeral public key with both its ephemeral private key and its onion private key, and responds with its ephemeral public key and a proof of knowledge of the resulting shared secret. The client performs the complementary computation and checks the proof. Both parties then derive symmetric keys for encrypting the circuit's traffic. This handshake provides protection against a passive attacker who records the exchange, and it authenticates the relay to the client (though not the client to the relay, which is intentional for anonymity).
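The shape of this exchange can be sketched in a few lines of Python. This is only an illustration of the algebra: it assumes a toy Diffie-Hellman group (integers modulo 2**255 - 19) in place of Curve25519, made-up HMAC labels, and a hypothetical relay identity string, so it is neither secure nor wire-compatible with Tor.

```python
import hashlib
import hmac
import secrets

# Toy group: integers modulo the prime 2**255 - 19 with base 2.
# This is NOT Curve25519 and is not secure; it only mirrors the algebra.
P = 2**255 - 19
G = 2

def keypair():
    """Return (private, public) for the toy Diffie-Hellman group."""
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)

def to_bytes(n):
    return n.to_bytes(32, "big")

def derive(secret_parts, relay_id, B, X, Y):
    """Hash both DH results together with the public transcript."""
    transcript = b"".join(to_bytes(v) for v in (*secret_parts, B, X, Y)) + relay_id
    key_seed = hmac.new(b"toy-ntor:key", transcript, hashlib.sha256).digest()
    auth = hmac.new(b"toy-ntor:auth", transcript, hashlib.sha256).digest()
    return key_seed, auth

def handshake():
    relay_id = b"relay-fingerprint"   # hypothetical identity digest
    b, B = keypair()                  # relay's published onion key
    x, X = keypair()                  # client's ephemeral key, sent to the relay

    # Relay side: fresh ephemeral key, two DH computations against X.
    y, Y = keypair()
    relay_seed, relay_auth = derive((pow(X, y, P), pow(X, b, P)), relay_id, B, X, Y)

    # Client side: the same two secrets, computed from Y and B with x.
    client_seed, expected_auth = derive((pow(Y, x, P), pow(B, x, P)), relay_id, B, X, Y)

    # The client accepts only if the relay proved knowledge of b.
    authenticated = hmac.compare_digest(relay_auth, expected_auth)
    return client_seed, relay_seed, authenticated
```

Calling `handshake()` returns the two derived key seeds and whether the relay's proof verified. Because both ephemeral private values are discarded afterwards, a later leak of the relay's onion key alone would not be enough to recompute the seed, which is the forward-secrecy property described earlier.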

Circuit Building: The Path Selection Algorithm

When a Tor client needs to communicate, it first builds a circuit -- a path through the Tor network that, for ordinary traffic, consists of three relays: a guard (entry) node, a middle node, and an exit node. The circuit-building process is incremental, meaning the client extends the circuit one hop at a time, negotiating a separate set of encryption keys with each relay.

Step 1: Guard Node Selection

The client first connects to a guard node. Guard nodes (also called entry guards) are a critical security feature introduced to defend against a class of statistical attacks. In the original Tor design, the client selected all three relays randomly for each circuit. This meant that over time, there was a high probability that at least one circuit would use a malicious entry node and a malicious exit node simultaneously, enabling a traffic correlation attack. Entry guards solve this by restricting the client to a small set of long-lived entry nodes. The current implementation maintains a set of primary guards that the client uses persistently for months. Even if some relays in the network are malicious, the probability of a correlation attack is bounded by the probability of selecting a malicious guard in the first place, rather than increasing with each new circuit.
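A small worked calculation shows why this bound matters. The sketch below assumes a simplified model that is not taken from the Tor specification: a single persistent guard, a fraction `c` of both entry and exit capacity under adversary control, and independent relay choices per circuit.

```python
def p_compromised_without_guards(c, n):
    """Chance that at least one of n circuits has a malicious entry AND exit,
    when every hop is picked fresh and a fraction c of capacity is malicious."""
    return 1 - (1 - c * c) ** n

def p_compromised_with_guard(c, n):
    """With one persistent guard, the client is exposed only if that guard is
    malicious (probability c); exits are still drawn fresh for each circuit."""
    return c * (1 - (1 - c) ** n)

if __name__ == "__main__":
    for circuits in (10, 100, 1000):
        print(circuits,
              round(p_compromised_without_guards(0.05, circuits), 3),
              round(p_compromised_with_guard(0.05, circuits), 3))
```

Under these assumptions the first probability climbs toward 1 as the number of circuits grows, while the second never exceeds `c`, which is the bounding behaviour described above.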

Guard selection is weighted by bandwidth: relays that contribute more bandwidth to the network are more likely to be selected, ensuring efficient use of network resources. The client establishes a TLS connection to the guard node and performs the ntor handshake to negotiate circuit-layer encryption keys. At this point, the client has a one-hop circuit to the guard.
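Bandwidth-weighted selection can be illustrated as follows. This is a minimal sketch: the relay names and weights are hypothetical, and the real path selection algorithm applies additional position-specific weights and constraints that are omitted here.

```python
import random

def pick_weighted(relays, rng=random):
    """Pick one relay with probability proportional to its bandwidth weight.

    relays maps a relay name to its consensus bandwidth weight.
    """
    names = list(relays)
    weights = [relays[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Hypothetical guard candidates and weights, for illustration only.
GUARD_CANDIDATES = {"bigrelay": 9000, "midrelay": 1000, "smallrelay": 100}

if __name__ == "__main__":
    print(pick_weighted(GUARD_CANDIDATES))
```

Over many draws, `bigrelay` is chosen roughly nine times as often as `midrelay`, which is how capacity is matched to load.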

Step 2: Middle Node Extension

The client then extends the circuit to a middle node by sending an EXTEND2 cell through the guard. This cell is encrypted with the guard's circuit key, so the guard can decrypt it and see that it should forward a CREATE2 cell to the middle node. The CREATE2 cell contains the client's ephemeral public key for the ntor handshake with the middle node. The guard relays the handshake data but cannot derive the resulting keys, because it holds neither the client's ephemeral private key nor the middle node's private keys -- it only learns the identity of the middle node it should connect to. The middle node responds with a CREATED2 cell, which the guard wraps in an EXTENDED2 cell and sends back to the client. The client now shares encryption keys with both the guard and the middle node, and the guard only knows that the client is communicating through it to the middle node.

Step 3: Exit Node Extension

The circuit is extended to the exit node in exactly the same manner. The client sends an EXTEND2 cell that is encrypted under two layers: the outer layer for the guard (which strips it and forwards to the middle) and the inner layer for the middle node (which strips it and forwards a CREATE2 cell to the exit node). After this handshake completes, the client has a fully constructed three-hop circuit with independent encryption keys for each hop.

When the client sends data through the circuit, it encrypts the payload three times: first with the exit node's key, then with the middle node's key, then with the guard's key. Each relay strips one layer of encryption and forwards the result. The exit node receives the original plaintext (or the TLS-encrypted data if the destination uses HTTPS) and forwards it to the destination server.
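The layering order can be demonstrated with a toy cipher. The sketch below assumes a SHA-256-based keystream as a stand-in for AES-128-CTR and uses hypothetical hop keys; it shows only the wrap and unwrap order, not Tor's real cell format or integrity checks.

```python
import hashlib

def keystream_xor(key, data):
    """XOR data with a SHA-256-based keystream (toy stand-in for AES-128-CTR)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)

def client_wrap(payload, hop_keys):
    """Apply the exit's layer first and the guard's layer last.

    hop_keys is ordered [guard, middle, exit].
    """
    cell = payload
    for key in reversed(hop_keys):
        cell = keystream_xor(key, cell)
    return cell

def relay_unwrap(cell, key):
    """Each relay removes exactly one layer with its own circuit key."""
    return keystream_xor(key, cell)

if __name__ == "__main__":
    keys = [b"guard-key", b"middle-key", b"exit-key"]  # hypothetical keys
    cell = client_wrap(b"GET / HTTP/1.1", keys)
    for key in keys:                                   # guard, middle, exit
        cell = relay_unwrap(cell, key)
    print(cell)
```

Only after the exit removes the last layer does the original payload reappear; the guard and middle relay see ciphertext both before and after their own step.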

Relay Types and Their Roles

Guard (Entry) Relays

Guard relays occupy the most security-sensitive position in the Tor network. Because the guard is the only relay that knows the client's real IP address, a compromised guard can identify the user (though not, on its own, what the user is connecting to). To become a guard, a relay must meet stability and bandwidth requirements, hold the Stable and Fast flags, and have been running long enough to be considered familiar to the directory authorities (in practice, roughly eight days or more). The directory authorities assign the Guard flag to relays that meet these criteria. Users should be aware that their ISP can observe that they are connecting to a Tor guard node (though not what they are doing through Tor), which is why pluggable transports such as obfs4 and Snowflake exist: obfs4 makes the connection look like unstructured random data, and Snowflake carries it over WebRTC, so that the traffic is harder to recognize as Tor.

Middle Relays

Middle relays serve as the intermediary hop in the circuit. They receive encrypted traffic from the guard and forward it to the exit (or to another middle relay in circuits longer than three hops, which can occur with onion services). Middle relays are in the safest position for operators because they never see the client's IP address and never see the destination or the unencrypted traffic. For this reason, operating a middle relay carries minimal legal risk, and the Tor Project actively encourages volunteers to run them. The source code for the Tor relay software is available at github.com/torproject/tor.

Exit Relays

Exit relays are the final hop in the circuit and are responsible for making connections to destination servers on the regular internet. They can see the unencrypted traffic (if the destination does not use TLS) and the destination address, but they cannot see the client's IP address. Exit relays are the most resource-intensive and legally sensitive relays to operate because they are the apparent source of all traffic passing through them. Exit relay operators often receive abuse complaints and must configure exit policies that specify which ports and destinations they are willing to connect to. Many exit relays restrict their policies to common ports like 80 (HTTP) and 443 (HTTPS) to reduce exposure to abuse.
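The first-match logic of an exit policy can be sketched as below. This is a simplified model that assumes port-only rules and ignores destination addresses; the tuple format and the `WEB_ONLY` policy are hypothetical and are not Tor's actual `ExitPolicy` syntax.

```python
def exit_allows(policy, port):
    """Return True if the first matching rule accepts the port.

    policy is a list of (action, low_port, high_port) tuples, a simplified
    model of exit policy lines that ignores addresses entirely.
    """
    for action, low, high in policy:
        if low <= port <= high:
            return action == "accept"
    return False  # assumption: treat an unmatched port as rejected

# Hypothetical policy: web traffic only, everything else rejected.
WEB_ONLY = [
    ("accept", 80, 80),
    ("accept", 443, 443),
    ("reject", 1, 65535),
]

if __name__ == "__main__":
    for port in (80, 443, 25):
        print(port, exit_allows(WEB_ONLY, port))
```

Rule order matters: because evaluation stops at the first match, the catch-all reject must come last or it would shadow the accept rules.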

Directory Authorities

The Tor network's topology is managed by a set of nine hard-coded directory authorities operated by trusted members of the Tor community. These authorities collectively maintain the network consensus: an hourly-updated document that lists every known relay in the network along with its capabilities, bandwidth, flags, and keys. The consensus is produced through a voting protocol where each authority independently measures and evaluates the relays, publishes a vote, and then the votes are combined into a consensus document that is signed by a majority of authorities.

Directory authorities assign flags to relays based on measured performance and behavior. Important flags include Guard (eligible to be selected as an entry guard), Exit (configured to allow exit traffic), Stable (long uptime and reliable), Fast (high bandwidth), HSDir (willing to serve as a hidden service directory), and Valid (appears to be correctly configured and not obviously malicious). The network consensus also includes bandwidth weights that determine the probability of each relay being selected for circuits. The full directory protocol is documented in the Tor directory specification.
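The way individual votes turn into consensus flags can be approximated with a simple majority rule. This sketch assumes every authority votes on every flag and uses hypothetical authority names; the directory specification has further rules that are not modelled here.

```python
from collections import Counter

def consensus_flags(votes):
    """Combine per-authority flag votes for one relay by simple majority.

    votes maps an authority name to the set of flags it assigned.
    """
    tally = Counter(flag for flags in votes.values() for flag in flags)
    needed = len(votes) // 2 + 1
    return {flag for flag, count in tally.items() if count >= needed}

if __name__ == "__main__":
    votes = {
        "auth1": {"Fast", "Stable", "Guard"},
        "auth2": {"Fast", "Stable"},
        "auth3": {"Fast"},
    }
    print(sorted(consensus_flags(votes)))
```

In this example the relay keeps Fast and Stable but not Guard, since only one of the three authorities voted for it.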

Bandwidth authorities, a subset of the directory authorities, run the sbws (Simple Bandwidth Scanner) tool to actively measure each relay's throughput and publish bandwidth weights. This ensures that relays with more capacity handle proportionally more traffic, optimizing network performance.

Onion Services (Hidden Services)

Onion services, formerly known as hidden services, allow servers to offer services through the Tor network without revealing their IP address. Both the client and the server are anonymous to each other, communicating through a rendezvous point within the Tor network. The current version, v3 onion services (specified in the rend-spec-v3 document), uses 56-character .onion addresses derived from the service's Ed25519 public key.
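The address encoding is compact enough to show directly. The sketch below follows the layout described in the v3 rendezvous specification (public key, two-byte checksum, version byte, base32-encoded); the function names are our own, and the key in the usage example is an arbitrary byte string rather than a real service key.

```python
import base64
import hashlib

VERSION = b"\x03"

def onion_address(pubkey):
    """Encode a 32-byte Ed25519 public key as a v3 .onion hostname."""
    if len(pubkey) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes")
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + VERSION).digest()[:2]
    label = base64.b32encode(pubkey + checksum + VERSION).decode("ascii").lower()
    return label + ".onion"

def is_valid_onion(hostname):
    """Check the length, version byte, and checksum of a v3 .onion hostname."""
    if not hostname.endswith(".onion"):
        return False
    label = hostname[:-len(".onion")]
    if len(label) != 56:
        return False
    try:
        raw = base64.b32decode(label.upper())
    except ValueError:
        return False
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    expected = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
    return version == VERSION and checksum == expected

if __name__ == "__main__":
    address = onion_address(bytes(range(32)))  # arbitrary example key
    print(address, is_valid_onion(address))
```

Because the 35 encoded bytes map to exactly 56 base32 characters and the final byte is always the version number 3, every v3 address ends in the letter `d`. The checksum lets a client reject most mistyped addresses before any network lookup.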

How Onion Services Work

The onion service protocol involves several steps. First, the service generates a long-term Ed25519 identity keypair. The .onion address is derived from the public key using a specific encoding scheme. The service then builds circuits to several relays designated as introduction points and publishes signed descriptors to a distributed hash table (DHT) maintained by relays with the HSDir flag. These descriptors are encrypted and contain the addresses of the introduction points along with authentication keys.

When a client wants to connect to an onion service, it first retrieves the service's descriptor from the HSDir DHT. The descriptor lookup uses a blinded key derived from the service's public key and the current time period, which prevents HSDir relays from learning which services they are hosting. The client decrypts the descriptor to learn the introduction points. It then builds a circuit to a relay that it selects as a rendezvous point and sends the rendezvous point's identity, together with a one-time secret, to the service through an introduction point. The service builds its own circuit to the rendezvous point and presents the secret, and the rendezvous point joins the two circuits. The result is a six-hop path between the client and the service (three hops from the client to the rendezvous point, three hops from the service to the rendezvous point).

The HSDir Distributed Hash Table

The Hidden Service Directory (HSDir) is a distributed storage system where onion service descriptors are stored. Relays with the HSDir flag participate in this system. The descriptor for a given onion service is stored on a set of HSDirs determined by a hash derived from the service's blinded public key, a time period number, and a shared random value generated by the directory authorities. This design ensures that descriptors are distributed across multiple relays and that the set of responsible HSDirs rotates over time, making it difficult for an adversary to persistently target the HSDirs for a specific service.
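The placement logic can be sketched as a hash ring. The time period arithmetic below follows the 1440-minute period and 12-hour offset given in the v3 rendezvous specification; everything else is an illustrative simplification, including the hash input layout, the `replicas` and `spread` values, and the relay identities, none of which match the exact wire format.

```python
import bisect
import hashlib

def time_period_number(unix_seconds, period_minutes=1440, offset_minutes=720):
    """Return the time period number for a Unix timestamp."""
    minutes = unix_seconds // 60
    return (minutes - offset_minutes) // period_minutes

def _h(*parts):
    # Illustrative hash layout, not the specification's exact encoding.
    return hashlib.sha3_256(b"|".join(parts)).digest()

def responsible_hsdirs(blinded_key, relay_ids, srv, period, replicas=2, spread=3):
    """Pick the relays that follow each replica's position on the ring."""
    period_bytes = period.to_bytes(8, "big")
    ring = sorted((_h(b"node", rid, srv, period_bytes), rid) for rid in relay_ids)
    positions = [pos for pos, _ in ring]
    chosen = []
    for replica in range(1, replicas + 1):
        index = _h(b"desc", blinded_key, bytes([replica]), period_bytes)
        start = bisect.bisect_right(positions, index)
        picked = 0
        for step in range(len(ring)):
            if picked == spread:
                break
            rid = ring[(start + step) % len(ring)][1]
            if rid not in chosen:     # skip relays already holding a copy
                chosen.append(rid)
                picked += 1
    return chosen

if __name__ == "__main__":
    relays = [bytes([i]) * 20 for i in range(20)]   # hypothetical identities
    period = time_period_number(1460546101)
    print(period, len(responsible_hsdirs(b"blinded", relays, b"srv", period)))
```

Because both the relay positions and the descriptor position depend on the period number and the shared random value, an adversary cannot know far in advance which relays will be responsible for a given service.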

V3 onion services introduced significant improvements over the v2 protocol, including stronger cryptography (Ed25519 and Curve25519 instead of RSA-1024), resistance to HSDir-based deanonymization attacks, and client authorization that allows services to restrict access to authorized clients.

Traffic Analysis and Attacks on Tor

Despite its strong cryptographic foundations, Tor has known limitations against certain classes of attacks. The most significant is traffic correlation, also known as end-to-end correlation. If an adversary can observe traffic entering the Tor network (at the guard) and traffic leaving the Tor network (at the exit or at the destination), they can correlate the two streams based on timing, volume, and pattern. This attack does not require breaking any cryptography -- it exploits the fundamental property that Tor is a low-latency network that does not introduce significant delays or reorder packets.
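The idea behind correlation can be illustrated with a basic statistic. The sketch below assumes the observer has reduced each flow to byte counts per fixed time window and compares them with a Pearson correlation coefficient; the numbers are invented, and published attacks use considerably more sophisticated methods.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Invented byte counts per time window, for illustration only.
ENTRY_FLOW = [10, 80, 5, 60, 0, 95, 20, 70]      # seen entering the network
EXIT_SAME = [12, 77, 6, 58, 1, 90, 22, 69]       # same flow seen leaving
EXIT_OTHER = [40, 42, 70, 15, 66, 30, 12, 55]    # a different user's flow

if __name__ == "__main__":
    print(pearson(ENTRY_FLOW, EXIT_SAME), pearson(ENTRY_FLOW, EXIT_OTHER))
```

The matching flow scores close to 1 even though no byte counts are identical, which is why low-latency forwarding leaks so much: the shape of the traffic survives the trip through the network.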

Academic research has demonstrated that traffic correlation attacks are feasible even with relatively noisy observations. The 2013 paper by Johnson et al., "Users Get Routed: Traffic Correlation on Tor by Realistic Adversaries," showed that a moderate adversary controlling a fraction of the internet's autonomous systems could deanonymize a significant fraction of Tor users over time. The Tor Project has acknowledged this limitation and has implemented several mitigations, including entry guards (which reduce the window of vulnerability), padding cells (which add noise to traffic patterns), and vanguards (which protect the guard nodes of onion services from identification).

Sybil Attacks

A Sybil attack involves an adversary running many relays in the Tor network to increase their probability of being selected in circuits. If the adversary controls both the guard and the exit of a circuit, they can perform traffic correlation. The directory authorities attempt to detect and remove suspicious relays, and the bandwidth authority system limits the influence of relays that claim more bandwidth than they actually provide. The Tor Community portal discusses ongoing efforts to improve relay vetting and Sybil detection.

Website Fingerprinting

Website fingerprinting attacks attempt to identify which website a Tor user is visiting by analyzing the encrypted traffic between the user and the guard node. Each website produces a characteristic traffic pattern based on the number and sizes of resources loaded. Machine learning models trained on these patterns have achieved high accuracy in laboratory settings, though real-world effectiveness is reduced by factors such as caching, dynamic content, and background traffic. The Tor Project has implemented padding-based defenses and continues to research more effective countermeasures.

Running a Tor Relay

Contributing a relay to the Tor network strengthens anonymity for all users by increasing the diversity and capacity of available paths. Setting up a middle relay on a Linux server is straightforward:

# Install Tor on Debian/Ubuntu
sudo apt update
sudo apt install tor

# Edit the Tor configuration
sudo nano /etc/tor/torrc

# Add these lines for a middle relay:
ORPort 9001
Nickname YourRelayName
ContactInfo your-email@example.com
ExitRelay 0
SocksPort 0

# Restart Tor
sudo systemctl restart tor

# Monitor your relay's log output
# (on Debian/Ubuntu the daemon normally runs as the tor@default unit)
sudo journalctl -u tor@default -f

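Once the relay has confirmed that its ORPort is reachable and uploaded its descriptor, it typically appears in the Tor network consensus within a few hours. The traffic it carries then ramps up gradually over the following days and weeks as the bandwidth authorities measure it. You can monitor its status through the Tor Metrics portal. For a comparison with alternative anonymity networks, see our article on the I2P network.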

Current Developments and Future Directions

The Tor network continues to evolve. Recent and ongoing work includes improved congestion control (replacing the traditional fixed-window flow control with algorithms that adapt to measured circuit conditions), Walking Onions (a proposal that aims to reduce the bandwidth overhead of distributing the full network consensus to every client), and proof-of-work defenses for onion services (which require clients to solve a computational puzzle before connecting, making denial-of-service attacks more expensive). The Tor Project also continues to improve its censorship circumvention tools, with Snowflake allowing volunteers to run ephemeral WebRTC-based proxies in their browsers.
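The economics of a proof-of-work defense can be shown with a hashcash-style puzzle. This is a generic stand-in: Tor's onion service defense uses a different puzzle (Equi-X), and the challenge string and difficulty below are arbitrary values chosen for illustration.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest):
    """Count how many leading bits of the digest are zero."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def solve(challenge, difficulty_bits):
    """Search for a nonce whose hash has enough leading zero bits."""
    for nonce in count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce

def verify(challenge, nonce, difficulty_bits):
    """Verification costs a single hash, however hard the puzzle was."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits

if __name__ == "__main__":
    nonce = solve(b"service-challenge", 12)   # arbitrary example values
    print(nonce, verify(b"service-challenge", nonce, 12))
```

The asymmetry is the point: the client performs thousands of hashes on average at this difficulty, the service performs one, and the service can raise the difficulty when it is under load.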

Understanding onion routing at this level of detail is essential for anyone who relies on Tor for their anonymity. The protocol's security properties are not absolute -- they are contingent on correct usage, a diverse and healthy relay network, and an adversary model that does not include global passive surveillance. For practical guidance on integrating Tor into a comprehensive security strategy, see our OPSEC fundamentals guide and our article on Monero privacy for protecting financial transactions.