Introduction

About me

  • PhD: Cornell, postdoc: Northeastern
  • Undergrad major: chemistry
  • Worked in the industry as SE

About this course

  • Goal: provable security, 1) define the desired security, 2) construct and prove a system
  • Rigorous and theoretic approach, comfortable with mathematical proofs
  • Prerequisites: Discrete Mathematics, Probability, Theory of Computation

Logistics

  • Website (weikailin.github.io/cs6222-fa23), my email (wklin-course)
  • Office hours (Tue 9:30, Rice Hall 505)
  • Course works: 4 HW, final proj, final exam, quizzes
  • HW policies:

    The goal: critical def, rigorous proof, to solve future challenges

    1. in-class, reference, office hours
    2. other in-class students (ack)
    3. other pub (cite)
    4. ready-made solutions like other people, AI, solutions in prev courses (avoid)

    Eg, wrote your own, then google without edit. Good. Another eg, wrote your own, then google and edit. Shall cite or ack.

    Internalize your writeup. No direct copying. Must be explainable orally. Edit history (Overleaf) is recommended.

    See website for details.

  • Survey (quiz) due on Aug 23

A toy example: match-making

Alice and Bob want to find out if they are meant for each other. Each of them has two choices: either they love the other person or they do not. Now, they wish to perform some interaction that allows them to determine whether there is a match (i.e., whether they both love each other), and nothing more. For instance, if Bob loves Alice but Alice does not love him back, Bob does not want to reveal to Alice that he loves her.

Note that the desired function is simply an AND gate that takes one input from Alice and one input from Bob. Also, if we have a trusted third party Charlie, then Charlie can solve the problem. The question is, can Alice and Bob compute AND without trusting anyone else?

The protocol

Assume that Alice and Bob have access to five cards, three identical hearts(♥) and two identical clubs(♣). Also assume they have a round dish.

  1. Put one heart face down at the top of the dish.
  2. Alice and Bob each encode Yes as heart-club and No as club-heart.
  3. Alice and Bob each place their two cards face down, on the left and right of the dish respectively.
  4. Alice secretly rotates the dish (and the cards with it).
  5. Bob secretly rotates the dish (and the cards with it).
  6. Open all cards. Alice and Bob are matched if and only if three hearts appear in a row (consecutively around the dish).

Analysis

We need to show both correctness and privacy. Correctness is easy to check. Privacy can be argued by enumeration: the three non-matching cases {No-No, Yes-No, No-Yes} all yield the same sequence (H,C,H,C,H) up to cyclic rotation, so after the secret rotations the opened cards look the same in all three cases.
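The enumeration can also be checked mechanically. The sketch below is only one way to model the dish: it assumes the cards are read clockwise starting from the top card, and that each party's Yes puts the heart in the slot next to the top card (this is how we interpret the heart-club / club-heart encoding here). The names `lay_out`, `canonical`, and `three_hearts_in_a_row` are made up for this illustration.

```python
from itertools import product

HEART, CLUB = "H", "C"

def lay_out(alice_yes, bob_yes):
    """Cards clockwise around the dish, starting from the top card.

    Assumed layout: [top, Bob near, Bob far, Alice far, Alice near],
    where "near" is the slot adjacent to the top card.
    Yes puts the heart in the near slot, No puts the club there.
    """
    bob = [HEART, CLUB] if bob_yes else [CLUB, HEART]      # near, far
    alice = [CLUB, HEART] if alice_yes else [HEART, CLUB]  # far, near
    return [HEART] + bob + alice

def rotate(cards, shift):
    shift %= len(cards)
    return cards[shift:] + cards[:shift]

def canonical(cards):
    """Representative of a circular arrangement: its smallest rotation."""
    return min(tuple(rotate(cards, s)) for s in range(len(cards)))

def three_hearts_in_a_row(cards):
    n = len(cards)
    return any(all(cards[(i + j) % n] == HEART for j in range(3)) for i in range(n))

# What the opened cards look like, up to rotation, for each pair of inputs.
views = {(a, b): canonical(lay_out(a, b)) for a, b in product([False, True], repeat=2)}
non_matching = {views[k] for k in views if k != (True, True)}

for (a, b), view in views.items():
    print(a, b, "".join(view), three_hearts_in_a_row(list(view)))
```

Since the two secret rotations hide the starting position, the opened cards reveal only the circular arrangement, which is why comparing canonical rotations is enough in this model.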

Discuss

Alternatively, we can use 2 hearts and 3 clubs to compute AND. To compute OR, we can encode Yes/No oppositely. To compute XOR, we can use 2 hearts and 2 clubs. Unfortunately, these protocols do not compose directly when we want to combine gates into a larger circuit.

Course outline

We will cover:

  • Essential primitives: one-way functions (OWF), pseudorandom generators (PRG), pseudorandom functions (PRF), encryption (symmetric key (SK), public key (PK)), authentication (message authentication codes (MAC), signatures)
    • Are they different / related? Sure, they serve different purposes. How to study them systematically?
    • Construction of the essentials (basic number theory and assumptions)
  • Modern crypto (cool stuff): zero-knowledge proofs (ZKP), secure two-party computation (2PC), secure multiparty computation (MPC), fully homomorphic encryption (FHE), my research (oblivious RAM (ORAM), doubly efficient private information retrieval (DEPIR), RAM-FHE)

Topics in a tree

Related topics that we will mostly NOT cover:

  • System security
  • Blockchain
  • (Really math) Number theory
  • Quantum comp

Classical cryptography: hidden writing

Historically, people considered encryption in the setting of communication.

  • Alice ~~~ m ~~~> Bob
  • Eavesdropper Eve, an adversary, may be listening on the channel.

Alice and Bob want to hide the message from Eve. To do so, they secretly agree on two algorithms Enc, Dec before the communication.

  • Alice ~~~ ct ~~~> Bob
  • $ct \leftarrow \mathsf{Enc}(m)$ is the ciphertext, where $m$ is the plaintext
  • Bob recovers the plaintext by computing $\mathsf{Dec}(ct)$
  • $y \leftarrow A(x)$ denotes that algorithm $A$ runs on input $x$ and outputs $y$.

Notice: it is important to specify which information is public (known to all of Alice, Bob, and Eve) and which is private. What if Eve knows Enc or Dec?

Kerckhoffs's principle

The enemy knows the system.

~ Claude Shannon. Communication Theory of Secrecy Systems. The Bell System Technical Journal, 1949. [https://ieeexplore.ieee.org/document/6769090]

Reason: the algorithms are eventually leaked to Eve, so we shall be conservative.

Consequence: make the algorithms public, but keep a short secret key $k$.

Generalize: sample the key $k \leftarrow \mathsf{Gen}$.

Note: (Gen, Enc, Dec) cannot all be deterministic. What if only Gen is randomized?

Classical encryption

Rotation, substitution, Enigma, … They are mostly broken now. DES? AES?
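To see why a rotation cipher is considered broken once the algorithm is public, here is a minimal sketch of a shift cipher over lowercase letters together with an exhaustive key search. The function names and the sample plaintext are made up for illustration; with only 26 keys, Eve can simply try them all.

```python
import string

ALPHABET = string.ascii_lowercase

def shift_enc(key, plaintext):
    """Rotate each lowercase letter forward by `key`; leave other characters alone."""
    out = []
    for ch in plaintext:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)

def shift_dec(key, ciphertext):
    # Rotating backward by `key` undoes the encryption.
    return shift_enc(-key, ciphertext)

def brute_force(ciphertext):
    """Eve knows the algorithm, so she decrypts under every possible key."""
    return [(key, shift_dec(key, ciphertext)) for key in range(26)]

ct = shift_enc(3, "attack at dawn")
candidates = brute_force(ct)
for key, guess in candidates:
    print(key, guess)
```

Among the 26 candidate decryptions, the only one that reads as English is the real plaintext, so the key space is far too small to hide anything.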

Breaking German Army Ciphers

Modern cryptography: provable security

A principle-driven science (instead of an art). Modern cryptography relies on the following paradigms:

  • Providing mathematical definitions of security.
  • Providing precise mathematical assumptions (e.g. “factoring is hard”, where hard is formally defined).
  • Providing proofs of security, i.e., proving that, if some particular scheme can be broken, then it contradicts the assumption.

Definition of secure encryption

We want to accurately model the A-B communication.

Attempt 1

The adversary cannot learn (all or part of) the key from the ciphertext.

This still allows the scheme to leak the plaintext: for example, $\mathsf{Enc}_k(m) = m$ reveals nothing about the key.

Attempt 2

The adversary cannot learn the plaintext from the ciphertext.

Leaking some function of the plaintext can be fatal.

Attempt 3

The adversary cannot learn any function of, or any part of, the plaintext from the ciphertext.

The adversary may already have learned something even without looking at ct.

Intuitive Definition

Given some a priori information, the adversary cannot learn any additional information about the plaintext by observing the ciphertext.

Formal definition

Let K be the space of keys, and let M be the space of all messages. We want to model the information as probability distributions.

Definition: Private-key encryption.

(Gen,Enc,Dec) is said to be a private-key encryption scheme over the message space M and the key space K if the following syntax holds.

  1. Gen is a randomized algorithm that returns a key $k \in K$. We denote by $k \leftarrow \mathsf{Gen}$ the process of generating $k$.
  2. Enc is a potentially randomized algorithm that, on input a key $k \in K$ and a message $m \in M$, outputs a ciphertext $c$. We denote by $c \leftarrow \mathsf{Enc}_k(m)$ the computation of Enc on $k$ and $m$.
  3. Dec is a deterministic algorithm that, on input a key $k$ and a ciphertext $c$, outputs a message $m \in M$. We denote by $m \leftarrow \mathsf{Dec}_k(c)$ the computation of Dec.
  4. (Correctness.) For all $m \in M$,

    $$\Pr[k \leftarrow \mathsf{Gen} : \mathsf{Dec}_k(\mathsf{Enc}_k(m)) = m] = 1.$$

    (The probability is taken over the randomness of Gen and Enc.)

Definition: Shannon Secrecy.

The private-key encryption scheme (M,K,Gen,Enc,Dec) is Shannon-secret with respect to the distribution D over M if for all $m' \in M$ and for all $c$,

$$\Pr[k \leftarrow \mathsf{Gen}; m \leftarrow D : m = m' \mid \mathsf{Enc}_k(m) = c] = \Pr[m \leftarrow D : m = m'].$$

An encryption scheme is said to be Shannon-secret if it is Shannon-secret with respect to all distributions D over M.

The RHS is the prior distribution of the message, before seeing c. The LHS is the distribution conditioned on observing c. Is the definition still good if we drop the quantifier over all distributions D?

An alternative intuition is that the distributions of ciphertexts for any two messages are identical.

Definition: Perfect Secrecy.

The private-key encryption scheme (M,K,Gen,Enc,Dec) is perfectly secret if for all $m_1, m_2 \in M$, and for all $c$,

$$\Pr[k \leftarrow \mathsf{Gen} : \mathsf{Enc}_k(m_1) = c] = \Pr[k \leftarrow \mathsf{Gen} : \mathsf{Enc}_k(m_2) = c].$$

Note: this definition is simpler and easier to use.
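For tiny message and key spaces, the definition can be checked by brute force. The sketch below assumes a uniformly random key and a deterministic Enc, and compares the ciphertext distribution of every message. The helper names and the two toy schemes (2-bit XOR, and a 1-bit key reused on both message bits) are made up for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def ciphertext_distribution(keys, enc, m):
    """Distribution of Enc_k(m) when k is uniform over `keys` (deterministic Enc assumed)."""
    counts = Counter(enc(k, m) for k in keys)
    return {c: Fraction(n, len(keys)) for c, n in counts.items()}

def is_perfectly_secret(keys, messages, enc):
    """True iff every message induces the same ciphertext distribution."""
    dists = [ciphertext_distribution(keys, enc, m) for m in messages]
    return all(d == dists[0] for d in dists)

bits2 = list(product([0, 1], repeat=2))  # all 2-bit strings

def xor_enc(k, m):
    # 2-bit key, bitwise XOR with the 2-bit message.
    return tuple(ki ^ mi for ki, mi in zip(k, m))

def reused_bit_enc(k, m):
    # 1-bit key XORed into both message bits: still decryptable, but leaky.
    return tuple(k ^ mi for mi in m)

print(is_perfectly_secret(bits2, bits2, xor_enc))          # True
print(is_perfectly_secret([0, 1], bits2, reused_bit_enc))  # False
```

In the second scheme the ciphertext reveals whether the two message bits are equal, so messages 00 and 01 can never produce the same ciphertext.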

Claim:

Perfect secrecy implies Shannon secrecy.

Proof:

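Suppose that (M,K,Gen,Enc,Dec) is perfectly secret. For any D, any $c$, and any $\bar{m} \in M$, we have

$$\Pr_{k,m}[m = \bar{m} \mid \mathsf{Enc}_k(m) = c] = \frac{\Pr_{k,m}[m = \bar{m} \wedge \mathsf{Enc}_k(m) = c]}{\Pr_{k,m}[\mathsf{Enc}_k(m) = c]}.$$

Then, we want to split the joint probability so that we can cancel it with the denominator. The two events are not independent, so we rearrange

$$\Pr_{k,m}[m = \bar{m} \wedge \mathsf{Enc}_k(m) = c] = \Pr_{k,m}[\mathsf{Enc}_k(m) = c \mid m = \bar{m}] \cdot \Pr_m[m = \bar{m}].$$

We will write $\Pr_k[\mathsf{Enc}_k(\bar{m}) = c]$ instead of $\Pr_{k,m}[\mathsf{Enc}_k(m) = c \mid m = \bar{m}]$, and we want to show that it equals $\Pr_{k,m}[\mathsf{Enc}_k(m) = c]$ (note that $\bar{m}$ is a fixed message, whereas $m$ is a random variable).

$$\Pr_{k,m}[\mathsf{Enc}_k(m) = c] = \sum_{m' \in M} \Pr_k[\mathsf{Enc}_k(m') = c] \cdot \Pr_m[m = m'] = \Pr_k[\mathsf{Enc}_k(\bar{m}) = c] \cdot \sum_{m' \in M} \Pr_m[m = m'] = \Pr_k[\mathsf{Enc}_k(\bar{m}) = c] \cdot 1.$$

The first equality is the law of total probability. The second uses perfect secrecy: $\Pr_k[\mathsf{Enc}_k(\bar{m}) = c] = \Pr_k[\mathsf{Enc}_k(m') = c]$ for any $\bar{m}$ and $m'$. The third holds because the probabilities sum to 1. That implies

$$\Pr_k[\mathsf{Enc}_k(\bar{m}) = c] = \Pr_{k,m}[\mathsf{Enc}_k(m) = c].$$

That gives Shannon secrecy:

$$\Pr_{k,m}[m = \bar{m} \mid \mathsf{Enc}_k(m) = c] = \frac{\Pr_m[m = \bar{m}] \cdot \Pr_k[\mathsf{Enc}_k(\bar{m}) = c]}{\Pr_{k,m}[\mathsf{Enc}_k(m) = c]} = \Pr_m[m = \bar{m}].$$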
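As a sanity check of the claim on a tiny example, the sketch below takes the 2-bit XOR scheme with a uniform key (which is perfectly secret) and a skewed prior D, and computes the posterior of the message given each ciphertext by Bayes' rule. The prior values are hypothetical numbers chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

bits2 = list(product([0, 1], repeat=2))  # message space = key space = {0,1}^2

def xor_enc(k, m):
    return tuple(ki ^ mi for ki, mi in zip(k, m))

# A skewed prior D over messages (hypothetical numbers for illustration).
prior = dict(zip(bits2, [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]))

def posterior(c):
    """Pr[m = m' | Enc_k(m) = c] for every m', with k uniform and m drawn from `prior`."""
    pr_key = Fraction(1, len(bits2))
    # Joint probability Pr[m = m' and Enc_k(m) = c] for each candidate m'.
    joint = {
        m: prior[m] * sum(pr_key for k in bits2 if xor_enc(k, m) == c)
        for m in bits2
    }
    total = sum(joint.values())  # Pr[Enc_k(m) = c]
    return {m: p / total for m, p in joint.items()}

# Shannon secrecy for this prior: observing c does not change the distribution of m.
shannon_secret_for_prior = all(posterior(c) == prior for c in bits2)
print(shannon_secret_for_prior)
```

Exact fractions are used so that the comparison between posterior and prior is not affected by floating-point rounding.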

Claim:

Shannon secrecy implies perfect secrecy.

One-Time Pad

Definition: One-Time Pad.

The One-Time Pad encryption scheme is described by the following 5-tuple (M,K,Gen,Enc,Dec):

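  • $M = \{0,1\}^n$
  • $K = \{0,1\}^n$
  • Gen: $k := k_1 k_2 \ldots k_n \leftarrow \{0,1\}^n$ (sampled uniformly at random)
  • $\mathsf{Enc}_k(m_1 m_2 \ldots m_n)$: $c_1 c_2 \ldots c_n$ where $c_i = m_i \oplus k_i$
  • $\mathsf{Dec}_k(c_1 c_2 \ldots c_n)$: $m_1 m_2 \ldots m_n$ where $m_i = c_i \oplus k_i$

The operator $\oplus$ represents the binary XOR operation.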

Theorem:

One-Time Pad is a perfectly secret private-key encryption scheme.

Proof:

We need to prove correctness and privacy.

The cost of OTP is the long key k.
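A minimal sketch of the One-Time Pad follows. It works on bytes instead of individual bits (an assumption made for convenience), and uses Python's `secrets` module for key sampling. It illustrates both parts of the proof: decryption recovers the message, and for any fixed message and ciphertext there is exactly one key, namely their XOR, that maps one to the other, so every ciphertext is equally likely whatever the message is.

```python
import secrets

def gen(n):
    """Sample a uniformly random n-byte key."""
    return secrets.token_bytes(n)

def enc(key, message):
    if len(key) != len(message):
        raise ValueError("the key must be exactly as long as the message")
    return bytes(k ^ m for k, m in zip(key, message))

def dec(key, ciphertext):
    # XOR is its own inverse, so decryption is the same operation.
    return enc(key, ciphertext)

message = b"meet at noon"
key = gen(len(message))
ciphertext = enc(key, message)
recovered = dec(key, ciphertext)

# Correctness: decryption undoes encryption.
print(recovered == message)

# Privacy intuition: the unique key consistent with (message, ciphertext)
# is message XOR ciphertext.
print(enc(ciphertext, message) == key)
```

The length check in `enc` makes the cost visible: the key must be as long as the message.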

One-Time Pad is Optimal in Key Length

Theorem: (Shannon)

If scheme (M,K,Gen,Enc,Dec) is a perfectly secret private-key encryption scheme, then $|K| \ge |M|$.

Proof:

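Assume for contradiction that $|K| < |M|$. Let $c \leftarrow \mathsf{Enc}_k(m)$ be a fixed ciphertext for some fixed $m \in M$ and some fixed $k$ that Gen outputs with positive probability. Let $P := \{m' : \mathsf{Dec}_{k'}(c) = m' \text{ for some } k' \in K\}$. We have $|P| \le |K| < |M|$ as Dec is deterministic. So, there exists $m_2 \in M \setminus P$. Then, it follows that

$$\Pr_k[\mathsf{Enc}_k(m_2) = c] = 0$$

by correctness. However, we have $\Pr_k[\mathsf{Enc}_k(m) = c] > 0$, which violates perfect secrecy.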

Notice that we can quantify the difference of probability (which yields a stronger theorem) by quantifying |K|.
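The counting argument in the proof can be replayed on a toy scheme with too few keys. The scheme below (a 1-bit key XORed into both bits of a 2-bit message) is made up for illustration. The code builds the set P of messages that a fixed ciphertext can decrypt to, and picks a message outside P.

```python
from itertools import product

messages = list(product([0, 1], repeat=2))  # |M| = 4
keys = [0, 1]                               # |K| = 2 < |M|

def enc(k, m):
    return tuple(k ^ mi for mi in m)

def dec(k, c):
    return tuple(k ^ ci for ci in c)

# Fix some message and key, and the resulting ciphertext c.
m, k = (0, 0), 1
c = enc(k, m)

# P: every message that c can decrypt to under some key. |P| <= |K|.
P = {dec(key, c) for key in keys}

# Since |P| < |M|, some message m2 lies outside P.
m2 = next(x for x in messages if x not in P)

# Number of keys that encrypt each message to c.
hits_m = sum(enc(key, m) == c for key in keys)
hits_m2 = sum(enc(key, m2) == c for key in keys)
print(c, sorted(P), m2, hits_m, hits_m2)
```

No key encrypts `m2` to `c`, while at least one key encrypts `m` to `c`, so the two ciphertext distributions differ and perfect secrecy fails.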