\newcommand{\Enc}{\mathsf{Enc}} \newcommand{\Dec}{\mathsf{Dec}} \newcommand{\Gen}{\mathsf{Gen}} \newcommand{\cM}{\mathcal{M}} \newcommand{\cK}{\mathcal{K}} \newcommand{\card}[1]{\vert{#1}\vert}

Introduction

A toy example: match-making

Alice and Bob want to find out if they are meant for each other. Each of them have two choices: either they love the other person or they do not. Now, they wish to perform some interaction that allows them to determine whether there is a match (i.e., if they both love each other) or not—and nothing more. For instance, if Bob loves Alice, but Alice does not love him back, Bob does not want to reveal to Alice that he loves her.

Note that the desired function is simply an AND gate that takes an input from Alice and an input from Bob. Also, if we have a trusted third party Charlie, Charlie solve the problem. The question is, can Alice and Bob compute AND without trusting any others?

The protocol

Assume that Alice and Bob have access to five cards, three identical hearts(♥) and two identical clubs(♣). Also assume they have a round dish.

  1. Put 1 heart face down on the top of dish.
  2. Both A/B: encode Yes by heart-club, No by club-heart.
  3. Both A/B: face down their cards on left/right of dish.
  4. A secretly rotate the cards with dish
  5. B secretly rotate the cards with dish
  6. Open all cards, A-B are matched if and only if 3 hearts in a row.

Analysis

We need to show both correctness and privacy. The correctness is easy. The privacy can be argued by enumeration: all 3 cases {No-No, Yes-No, No-Yes} yield the same sequence (H,C,H,C,H) that is cyclic identical.

Discuss

Alternatively we can use 2 hearts and 3 clubs to compute AND. To compute OR, we can encode Yes/No oppositely. To compute XOR, we can use 2 hearts and 2 clubs. Unfortunately, these protocols do not compose when we want to compose gates.

Course outline

We will cover:

  • Essential primitives: one-way functions (OWF), pseudorandom generators (PRG), pseudorandom functions (PRF), encryption (symmetric key (SK), public key (PK)), authentication (message authentication codes (MAC), signatures)
    • Are they different / related? Sure, they serve different purposes. How to study them systematically?
    • Construction of the essentials (basic number theory and assumptions)
  • Modern crypto (cool stuff): zero-knowledge proofs (ZKP), secure two-party computation (2PC), secure multiparty computation (MPC), fully homomorphic encryption (FHE), my research (oblivious RAM (ORAM), doubly efficient private information retrieval (DEPIR), RAM-FHE)

Topics in a tree

Related but almost NOT cover:

Classical cryptography: hidden writing

Historically, human considered the scenario of encryption in communication.

  • Alice ~~~ mm ~~~> Bob Eavesdropper Eve, an adversary, may be listening on the channel.

Alice/Bob want to hide the message from Eve. To do so, they share two algorithms Enc,Dec\Enc, \Dec secretly and before the communication.

  • Alice ~~~ ctct ~~~>Bob
  • ctEnc(m)ct \gets \Enc(m), ciphertext, where mm is the plaintext
  • Bob recovers plaintext by Dec(ct)\Dec(ct)
  • yA(x)y \gets A(x) denotes algo AA computes on input xx and gets output yy.

Notice: it is important that which info is public (known to all A/B/E) and which is private. What if Eve knows Enc\Enc or Dec\Dec?

Kerckhoffs’s principle

The enemy knows the system.

~ Claude Shannon. Communication Theory of Secrecy Systems. The Bell System Technical Journal, 1949. [https://ieeexplore.ieee.org/document/6769090]

Reason: the algorithms are eventually leaked to Eve. We shall be conservative.

Consequence: let algos public, but keep a short secret key kk.

Generalize: sample key kGenk \gets \Gen

Note: (Gen,Enc,Dec)(\Gen, \Enc, \Dec) can not be all deterministic. What if only Gen\Gen is randomized?

Classical encryption

Rotation, substitution, enigma, …. They are mostly broken now. DES is introduced in 70s and broken 90s. AES is introduced in 90s, but now?

Breaking German Army Ciphers Jefferson disk

Modern cryptography: provable security

A principle driven science (instead of an art). Modern cryptography relies on the following paradigms:

  • Providing mathematical definitions of security.
  • Providing precise mathematical assumptions (e.g. “factoring is hard”, where hard is formally defined).
  • Providing proofs of security, i.e., proving that, if some particular scheme can be broken, then it contradicts the assumption.

Definition of secure encryption

We want to accurately model the A-B communication.

Attempt 1

The adversary cannot learn (all or part of) the key from the ciphertext.

It can leak the plaintext.

Attempt 2

The adversary cannot learn the plaintext from the ciphertext.

Leaking some function of the plaintext can be fatal.

Attempt 3

The adversary cannot learn any function of, or any part of the plaintext from the ciphertext.

The adversary may already learned something even not looking at ct.

Intuitive Definition

Given some a priori information, the adversary cannot learn any additional information about the plaintext by observing the ciphertext.

Formal definition

Let K\cK be the space of keys, and let M\cM be the space of all messages. We want to model the information as probability distributions.

Definition: Private-key encryption.

(Gen,Enc,Dec)(\Gen,\Enc,\Dec) is said to be a private-key encryption scheme over the messages space M\cM and the keyspace K\cK if the following syntax holds.

  1. Gen\Gen is a randomized algorithm that returns a key kKk \in \cK. We denote by kGenk \gets \Gen the process of generating kk.
  2. Enc\Enc is a potentially randomized algorithm that on input a key kKk\in\cK and a message mMm \in \cM, outputs a ciphertext cc. We denote by cEnck(m)c \gets \Enc_k(m) the computation of Enc\Enc on kk and mm.
  3. Dec\Dec is a deterministic algorithm that on input input a key kk and a ciphertext cc outputs a message mMm \in \cM. We denote by mDeck(c)m' \gets \Dec_k(c) the computation of Dec\Dec.
  4. (Correctness.) For all mMm \in \cM,

    Pr[kGen:Deck(Enck(m))=m]=1.\Pr[k \gets \Gen : \Dec_k(\Enc_k(m)) = m] = 1.

    (the probability is taken over the randomness of Gen,Enc\Gen, \Enc.)

Definition: Shannon Secrecy.

The private-key encryption scheme (M,K,Gen,Enc,Dec)(\cM,\cK,\Gen,\Enc,\Dec) is Shannon-secret with respect to the distribution DD over M\cM if for all mMm' \in \cM and for all cc,

Pr[kGen;mD:m=m  Enck(m)=c]=Pr[mD:m=m].\Pr[k \gets \Gen; m \gets D : \qquad m = m' \ | \ \Enc_k(m) = c] \quad = \quad \Pr[m \gets D : m = m'].

An encryption scheme is said to be Shannon secret if it is Shannon secret with respect to all distributions DD over M\cM.

The RHS is the distribution of messages without cc. The LHS is the distribution conditioned on observing cc. Is the definition good if we skip the quantifier for “all distribution DD”?

An alternative intuition is that the distribution of ciphertexts for any two messages are identical.

Definition: Perfect Secrecy.

The private-key encryption scheme (M,K,Gen,Enc,Dec)(\cM,\cK,\Gen,\Enc,\Dec) is perfectly secret if for all m1,m2Mm_1, m_2 \in \cM, and for all cc,

Pr[kGen:Enck(m1)=c]=Pr[kGen:Enck(m2)=c].\Pr[k \gets \Gen : \Enc_k(m_1) = c] \quad = \quad \Pr[k \gets \Gen : \Enc_k(m_2) = c].

Note: this definition is simpler and easier to use.

Claim:

Perfect secrecy implies Shannon secrecy.

Proof:

Suppose that (M,K,Gen,Enc,Dec)(\cM,\cK,\Gen,\Enc,\Dec) is perfectly secret. For any DD, any cc, and any mˉ\bar m, we have

Prk,m[m=mˉEnck(m)=c]=Prk,m[m=mˉEnck(m)=c]/Prk,m[Enck(m)=c]=Prk,m[Enck(m)=cm=mˉ]Prm[m=mˉ]/Prk,m[Enck(m)=c]\begin{align*} \Pr_{k,m}[m = \bar m | \Enc_k(m) = c] & = \Pr_{k,m}[m = \bar m \cap \Enc_k(m) = c] / \Pr_{k,m}[\Enc_k(m) = c]\\ & = \Pr_{k,m}[\Enc_k(m) = c | m = \bar m] \Pr_m[m = \bar m] / \Pr_{k,m}[\Enc_k(m) = c]\\ \end{align*}

Notice that mm is a shared random variable in both events, so, we can not say that they are independent events.

We will write Prk[Enck(mˉ)]\Pr_{k}[\Enc_k(\bar m)] instead of Prk,m[Enck(m)=cm=mˉ]\Pr_{k,m}[\Enc_k(m) = c | m = \bar m], and we want to show it equals to Prk,m[Enck(m)]\Pr_{k,m}[\Enc_k(m)] (note mˉ\bar m is not r.v. but mm is).

Prk,m[Enck(m)=c]=mMPrk[Enck(m)=c]Prm[m=m]=Prk[Enck(mˉ)=c]mMPrm[m=m]=Prk[Enck(mˉ)=c]1\begin{align*} & \Pr_{k,m}[\Enc_k(m) = c] \\ = & \sum_{m' \in \cM} \Pr_{k}[\Enc_k(m') = c] \Pr_{m}[m = m'] \\ = & \Pr_{k}[\Enc_k(\bar m) = c] \sum_{m' \in \cM} \Pr_{m}[m = m'] \\ = & \Pr_{k}[\Enc_k(\bar m) = c] \cdot 1 \end{align*}

The first eq is just sum of prob. The second use perfect secrecy: Prk[Enck(mˉ)=c]=Prk[Enck(m)=c]\Pr_{k}[\Enc_k(\bar m) = c] = \Pr_{k}[\Enc_k(m') = c] for any mˉ\bar m and mm'. The third is also sum of prob. That implies

Prk[Enck(mˉ)=c]=Prk,m[Enck(m)=c].\Pr_{k}[\Enc_k(\bar m) = c] = \Pr_{k,m}[\Enc_k(m) = c].

That gives Shannon secrecy:

Prk,m[m=mˉEnck(m)=c]=Prm[m=mˉ]Prk[Enck(mˉ)=c]/Prk,m[Enck(m)=c]=Prm[m=mˉ].\begin{align*} & \Pr_{k,m}[m = \bar m | \Enc_k(m) = c] \\ & = \Pr_{m}[m = \bar m] \cdot \Pr_{k}[\Enc_k(\bar m) = c] / \Pr_{k,m}[\Enc_k(m) = c] \\ & = \Pr_{m}[m = \bar m]. \end{align*}

Claim:

Shannon secrecy implies perfect secrecy.

One-Time Pad

Definition: One-Time Pad.

The One-Time Pad encryption scheme is described by the following 5-tuple (M,K,Gen,Enc,Dec)(\cM, \cK, \Gen, \Enc, \Dec):

  • M={0,1}n\cM = \{0, 1\}^n
  • K={0,1}n\cK = \{0, 1\}^n
  • Gen\Gen: k:=k1k2kn{0,1}nk := k_1 k_2 \dots k_n \gets \{0, 1\}^n
  • Enck(m1m2mn)\Enc_k(m_1m_2\dots m_n): c1c2cnc_1 c_2 \dots c_n where ci=mikic_i = m_i \oplus k_i
  • Deck(c1c2cn)\Dec_k(c_1c_2\dots c_n): m1m2mnm_1 m_2 \dots m_n where mi=cikim_i = c_i \oplus k_i

The \oplus operator represents the binary XOR operation.

Theorem:

One-Time Pad is a perfectly secure private-key encryption scheme.

Proof:

We need to prove correctness and privacy.

The cost of OTP is the long key kk.

One-Time Pad is Optimal in Key Length

Theorem: (Shannon)

If scheme (M,K,Gen,Enc,Dec)(\cM, \cK, \Gen, \Enc, \Dec) is a perfectly secret private-key encryption scheme, then KM\card{\cK} \geq \card{\cM}.

Proof:

Let cEnck(m)c \gets \Enc_k(m) be a fixed ciphertext for some fixed k,mk, m. Let P:=m:Deck(c)=m for any kP := \\{m : \Dec_k'(c) = m \text{ for any } k' \\}. We have PK<M\card{P} \leq \card{\cK} \lt \card{\cM} as Dec\Dec is deterministic. So, there exists m2Pm_2 \notin P. Then, it follows that

Prk[Enck(m2)=c]=0\Pr_k[\Enc_k(m_2) = c] = 0

by correctness. However, we have Prk[Enck(m)=c]>0\Pr_k[\Enc_k(m) = c] \gt 0, and it violates perfect secrecy.

Notice that we can quantify the difference of probability (which yields a stronger theorem) by quantifying K\card{\cK}.