Sending Digital Information Over a Wire

  • One way to convey digital information across distances is through copper wire (Ethernet cable). Here we just vary the voltage in the wire between two states A and B. When we are at A, we are sending a 0, and when we are at B we are sending a 1.

  • What is voltage? Voltage is the difference in electric potential between two points in an electric field.

  • These states are called symbols.

  • The number of symbols sent per second is measured in a unit called baud. If your symbol rate is 1 symbol per second, you are sending information at 1 baud.

Intro to Fiber Optics and RF Encoding

Two more ways of conveying digital information across distances.

  • Fiber optic cables: basically, they use reflection to give light the ability to go around corners. The symbols are (for example) "light is on" and "light is off".

  • Radio waves: an antenna is constantly radiating radio waves. Information is conveyed through the phase of the wave: e.g. a phase of 0 radians is a 0, a phase of π/4 radians is a 1.

Clock synchronization and Manchester coding

  • All these methods require that sender and receiver agree on when to sample the continuous signal in order to convert it to binary information. This means that they need to have synchronized clocks. Another way of saying this is that sender and receiver need to agree on the same function $\mathbb{R} \rightarrow \{0, 1\}$ from time to bit values.

  • How can we synchronize clocks between distant computers?

    • They could both use a third party to keep their clocks in sync, e.g. GPS satellites (which have very accurate clocks by necessity).

    • We could give each one a highly accurate (e.g. atomic) clock that we promised wouldn't get out of sync.

    • The sender could transmit its clock signal at the same time as the actual data, along a separate connection. Then the receiver interprets the data according to the sender's clock signal, not its own. But the issue with this is that by transmitting clock and data separately, we run the risk that they will get out of sync.

    • Or (what is actually done) we combine clock and data into one signal. Given a data signal that stays at its state for 2 seconds and a clock that flips every second, we XOR the data signal with the clock signal.

    • Now the receiver interprets a transition in one direction (say, low to high) as a 0 and a transition in the other direction as a 1; which direction means which is a matter of convention. Note that transitions are only counted when they occur in the middle of a period. So there is still clock-dependence, but the encoding is much more robust to clock errors. In particular, the receiver can identify if its clock is wrong because there will no longer always be a transition in the middle of each period.
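
The XOR-with-clock scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not a production line coder: the function names are made up for this example, each bit period is modeled as two half-period samples, and the choice that a 0 becomes low-then-high is an assumption (the opposite convention is also in use).

```python
def manchester_encode(bits):
    """XOR each data bit with a clock that is 0 for the first half of the
    bit period and 1 for the second half (assumed convention)."""
    clock = (0, 1)
    signal = []
    for bit in bits:
        for tick in clock:
            signal.append(bit ^ tick)
    return signal


def manchester_decode(signal):
    """Recover the data bits, checking for a transition mid-period."""
    if len(signal) % 2 != 0:
        raise ValueError("signal must contain two half-periods per bit")
    bits = []
    for i in range(0, len(signal), 2):
        first_half, second_half = signal[i], signal[i + 1]
        if first_half == second_half:
            # No transition in the middle of the period: the receiver's
            # clock is probably misaligned with the sender's.
            raise ValueError(f"no mid-period transition at bit {i // 2}")
        # The clock is 0 during the first half, so first_half == bit ^ 0.
        bits.append(first_half)
    return bits


encoded = manchester_encode([1, 0, 1])
decoded = manchester_decode(encoded)
```

If the receiver samples half a period late, some pair of samples will have no transition between them, and `manchester_decode` raises instead of silently returning wrong bits. That is the robustness property the notes describe.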

The Importance of Framing

  • How do computers agree on how to chunk the stream of bits into bytes? They agree upon a framing scheme, which starts with agreeing on where a transmission begins.

    • One protocol for this is called HDLC -- it has a certain bit pattern (01111110) which denotes the start of a frame. After you see this pattern, the very next bit is the first in the transmission.

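    • This means we need to escape the start-of-frame sequence whenever it appears naturally in our data. HDLC does this with "bit stuffing": the sender inserts a 0 after any five consecutive 1s in the data, and the receiver removes it again.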

    • Another protocol is Ethernet -- it uses a different bit pattern and has a distinctive preamble before the bit pattern.

  • There is an efficiency/error-recovery tradeoff when choosing your frame size, because an error spoils the whole frame. Make frames too small and you will not send data efficiently (you will waste time sending start-of-frame bit patterns). Make them too large and any given error corrupts more data. In practice most frame sizes are between 64 bytes and 1500 bytes.
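
As a rough sketch of how escaping the start-of-frame pattern can work, the following Python models HDLC-style bit stuffing on lists of bits. The function names are hypothetical, and the code assumes well-formed input (it does not detect a corrupted stream in which five 1s are followed by something other than the stuffed 0).

```python
FLAG = [0, 1, 1, 1, 1, 1, 1, 0]  # start-of-frame pattern from the notes


def stuff(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out = []
    run = 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == 1 else 0
        if run == 5:
            out.append(0)  # stuffed bit: breaks up any run of six 1s
            run = 0
    return out


def unstuff(bits):
    """Remove the 0 that follows every run of five consecutive 1s."""
    out = []
    run = 0
    i = 0
    while i < len(bits):
        bit = bits[i]
        out.append(bit)
        run = run + 1 if bit == 1 else 0
        if run == 5:
            i += 1  # skip the stuffed 0 (assumes well-formed input)
            run = 0
        i += 1
    return out


def contains_flag(bits):
    """True if the flag pattern appears anywhere in the bit list."""
    n = len(FLAG)
    return any(bits[i:i + n] == FLAG for i in range(len(bits) - n + 1))


payload = [0, 1, 1, 1, 1, 1, 1, 0]       # data that happens to equal the flag
stuffed = stuff(payload)                  # safe to send after the real flag
frame = FLAG + stuffed
```

Because the stuffed payload can never contain six 1s in a row, the receiver can treat any occurrence of the flag as a genuine frame boundary.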

Frame Formats

  • Ethernet addresses, also called MAC addresses, are unique identifiers built into hardware that are required for communication over Ethernet because you need to specify where your message is going. Interestingly, manufacturers coordinate to make sure they're unique.

Lower Layers of the OSI Model

  • The OSI model (with Wikipedia's descriptions because I feel I am unlikely to summarize better than them).

    1. Physical: Transmission and reception of raw bit streams over a physical medium.
    2. Data Link: Reliable transmission of data frames between two nodes connected by a physical layer.
    3. Network: Structuring and managing a multi-node network, including addressing, routing and traffic control.
    4. Transport: Reliable transmission of data segments between points on a network, including segmentation, acknowledgement and multiplexing.
    5. Session: Managing communication sessions, i.e. continuous exchange of information in the form of multiple back-and-forth transmissions between two nodes.
    6. Presentation: Translation of data between a networking service and an application; including character encoding, data compression and encryption/decryption.
    7. Application: High-level APIs, including resource sharing, remote file access.

The Internet Protocol

  • Basic layout: Ethernet networks (where nodes are identified by MAC addresses) are connected to the Internet (where nodes are identified by IP addresses) by routers. MAC addresses identify other nodes locally; IP addresses identify nodes globally.

  • The Internet is a collection of routers, each of which has a set of rules telling it how to forward packets based on their destination IP addresses. A router has N links to other routers ("interfaces") and can forward a packet out of any of them. If the interfaces are 1, 2 and 3, the rules ("forwarding table") might look like this:

    • 172.17 / 16 $\rightarrow$ 2 (meaning: route the packet to interface 2 if the first 16 bits of its destination match 172.17).
    • 172.17.6 / 24 $\rightarrow$ 3 (meaning: route the packet to interface 3 if the first 24 bits of its destination match 172.17.6).
  • Precedence is always given to the rule that matches the most bits, so here a packet destined for 172.17.6.4 matches both rules and would be routed to interface 3.
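
The "most bits wins" rule is usually called longest-prefix matching. Here is a small sketch using Python's standard `ipaddress` module; the table mirrors the two example rules written out as full CIDR prefixes, and the `lookup` function and the test addresses are illustrative assumptions rather than how a real router stores its table.

```python
import ipaddress

# (prefix, outgoing interface), mirroring the two example rules
FORWARDING_TABLE = [
    (ipaddress.ip_network("172.17.0.0/16"), 2),
    (ipaddress.ip_network("172.17.6.0/24"), 3),
]


def lookup(destination, table=FORWARDING_TABLE):
    """Return the interface of the longest matching prefix, or None."""
    address = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in table if address in net]
    if not matches:
        return None  # a real router would fall back to a default route
    # The rule with the longest prefix (most matching bits) wins.
    _, interface = max(matches, key=lambda match: match[0].prefixlen)
    return interface


print(lookup("172.17.6.4"))  # matches both rules; the /24 wins
print(lookup("172.17.9.1"))  # matches only the /16
print(lookup("10.0.0.1"))    # matches nothing
```

A linear scan like this is fine for two rules; real routers use specialized data structures because their tables hold many prefixes.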

ARP: Mapping Between IP and Ethernet

  • Suppose I am a laptop connected to an Ethernet switch in San Francisco, and I want to send a packet to Boston. There's a local router that can route my packet, but in order to use Ethernet to send data to it I need to know its MAC address. How can I discover that? (In general, how do I learn about my router?)

    • We broadcast a message to the whole Ethernet network we are on, saying "Whoever owns this IP address, please tell me your MAC address."

    • The protocol used here is ARP (Address Resolution Protocol).

    • We can broadcast a message to the whole Ethernet network by sending a message with the special destination address ff:ff:ff:ff:ff:ff. The switch will send this message to everyone.
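
To make the broadcast concrete, here is a sketch that builds the raw bytes of an ARP request inside an Ethernet frame. The field layout follows the standard ARP packet format for IPv4 over Ethernet; the helper names, the MAC address, and the IP addresses are made up for illustration, and the code only constructs the bytes (actually sending them would need a raw socket and elevated privileges).

```python
import socket
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")  # ff:ff:ff:ff:ff:ff
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Build an Ethernet frame asking who owns target_ip."""
    source = mac_to_bytes(sender_mac)
    ethernet_header = BROADCAST_MAC + source + struct.pack("!H", ETHERTYPE_ARP)
    arp_packet = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length
        4,                            # protocol address length
        1,                            # operation: request
        source,                       # sender MAC
        socket.inet_aton(sender_ip),  # sender IP
        bytes(6),                     # target MAC: unknown, so all zeros
        socket.inet_aton(target_ip),  # target IP we are asking about
    )
    return ethernet_header + arp_packet


# Hypothetical addresses, chosen only for this example.
frame = build_arp_request("02:00:00:00:00:01", "192.168.1.10", "192.168.1.1")
```

The owner of the target IP answers with an ARP reply addressed directly to the sender's MAC, which is how the laptop learns the router's address.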

Looking at ARP and Ping Packets

  • I asked this SO question, let's see if it gets any results.

  • A packet may, and often will, have a destination MAC address at the Ethernet level that belongs to a different machine than its destination IP address at the IP level. The MAC address is the next hop it is being sent to right now, while the IP address is where it is going overall.

Hop-by-hop Routing

  • A forwarding table might offer several equally good routes for a packet to reach a given destination.

TCP: Transmission Control Protocol

  • TCP handles the problems of ordering packets, guarding against packet loss, distinguishing between multiple conversations that might be going on between computers A and B, and flow control.

  • Each TCP connection is uniquely identified by the four-tuple of (source IP, destination IP, source port, destination port).
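
One way to picture the four-tuple is as a dictionary key that a host uses to sort incoming segments into conversations. The sketch below is a toy model only: `ConnectionKey`, `deliver`, and the addresses and ports are hypothetical, and real TCP stacks keep far more state per connection.

```python
from collections import namedtuple

ConnectionKey = namedtuple(
    "ConnectionKey", ["src_ip", "dst_ip", "src_port", "dst_port"]
)


def deliver(table, src_ip, dst_ip, src_port, dst_port, payload):
    """File an incoming payload under the connection it belongs to."""
    key = ConnectionKey(src_ip, dst_ip, src_port, dst_port)
    table.setdefault(key, []).append(payload)
    return key


connections = {}

# Two conversations between the same pair of hosts and the same server
# port; they differ only in the client's source port.
deliver(connections, "10.0.0.5", "10.0.0.9", 50001, 80, b"GET /a")
deliver(connections, "10.0.0.5", "10.0.0.9", 50002, 80, b"GET /b")
deliver(connections, "10.0.0.5", "10.0.0.9", 50001, 80, b"more data")
```

After these three calls `connections` holds two entries, which is how computers A and B can keep several conversations apart even though the IP addresses are identical.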

TCP Connection Walkthrough

  • TCP is a fundamentally bidirectional protocol; it doesn't matter who is client and who is server once you connect.