Notes on “An Introduction to Mathematical Cryptography”

可能某些地方mathjax解析有问题。。

Introduction

比较喜欢Asymmetric Cryptography，再加上对一些公钥密码算法的某些攻击算法只有一个粗浅的认识，于是听说这本书上有相关的原理，就去找到了这本书看了起来。

Chapter 1: An Introduction to Cryptography

由Caesar Cipher (also sometimes referred to as a shift cipher)引入，介绍了一波Symmetric Cipher。

Origin of the word Cryptography:

Cryptography, the methodology of concealing the content of messages, comes from the Greek root words kryptos, meaning hidden, and graphikos, meaning writing. The modern scientific study of cryptography is sometimes referred to as cryptology.

Cryptanalysis of Simple Substitution Ciphers

Cryptanalysis explaination:

The process of decrypting a message without knowing the underlying key is called cryptanalysis.

注意这边，密码分析在不知道key的情况下，而非只知道密文的情况，条件很宽松。也就是说，一个密码系统的安全，要全部依赖于这个key。

Powerful tools against simple Substitution Cipher: Frequency Analysis.

http://storage.googleapis.com/books/ngrams/books/datasetsv2.html

单字母频率分析

Frequency of letters in English

双字母组合频率分析

Most common English bigrams

Divisibility and Greatest Common Divisors

这边就是一些关于数论的东西了，基本在An Friendly Introduction to Number Theory里面看过了。

The Euclidean Algorithm: The division step is executed at most $2\log_2(b) + 2$ times. It has been proven that the Euclidean algorithm takes no more than $1.45\log_2(b) + 1.68$ iterations, and that the average number of iterations for randomly chosen $a$ and $b$ is approximately $0.85\log_2(b) + 0.14$.

欧几里得算法很快，$O(\log(n))$的复杂度。

Extended Euclidean Algorithm: 用来求二元一次方程$au + bv = gcd(a, b)$的整数解。

Modular Arithmetic

The theory of congurences is a powerful method in number theory that is based on the simple idea of clock arithmetic.

加法，减法，乘法，模逆。

用扩展欧几里得算法求模逆，$O(\log(n))$的复杂度。

The ring of integers modulo m: $$ \mathbb{Z}/m\mathbb{Z} = {0, 1, 2, …, m-1} $$

Numbers that have inverses are called units.

$$ (\mathbb{Z}/m\mathbb{Z})* = {a \in \mathbb{Z}/m\mathbb{Z}: gcd(a, m) = 1} $$

The set $(\mathbb{Z}/m\mathbb{Z})*$ is called the group of units modulo m.

Euler’s phi function (Euler’s totient function): $$ \phi(m) = #(\mathbb{Z}/m\mathbb{Z})* = #{0 \le a < m: gcd(a,m)=1} $$

The Fast Powering Algorithm == Fast Exponentiation Algorithm == Square and Mutilply Algorithm 快速幂算法

Prime Numbers, Unique Factorization, and Finite Fields

The Fundamental Theorem of Arithmetic: $$ a = p_1^{e_1}\cdot p_2^{e_2}\cdot p_3^{e_3}\cdot \cdot \cdot p_r^{e_r} $$

We denote this power by $ord_p(a)$ and call it the order (or exponent) of p in a. View $ord_p$ as a function, called valuation. $$ ord_p: {1, 2, 3, …} \longrightarrow {0, 1, 2, 3, … } $$

实际上这个$ord_p$还可以进行扩展，将domain(定义域)扩展到有理数，将range(值域)扩展到所有整数(加上负整数)。详见http://math.uga.edu/~pete/4400intro.pdf Page 12.

Field:

Finite fields are of fundamental importance throughout cryptography, and indeed throught all of mathematics.

Finite fields == Galois fields. $\mathbb{F}_p == GF(p)$.

计算逆元的两种方法:

Fermat’s little Theorem
Extended Euclidean algorithm

Primitive Root Theorem:

Primitive roots of $\mathbb{F}p$ or generators of $\mathbb{F}{p}^{}$ are the elements of $\mathbb{F}_{p}^{}$ having order p-1.

Cryptography Before the Computer Age

一个挺好的密码学历史介绍。

Caesar $\longrightarrow$ 单表替换 $\longrightarrow$ Vignere(简单的多表替换)$\longrightarrow$ Enigma(复杂的多表替换)

Symmetric Ciphers

$$ e: \mathcal{K} \times \mathcal{M} \rightarrow \mathcal{C} $$

$$ d: \mathcal{K} \times \mathcal{C} \rightarrow \mathcal{M} $$

Kerckhoff’s principle: The security of a cryptosystem should depend only on the secrecy of the key, and not on the secrecy of the encryption algorithm itself.

If ($\mathcal{K}$, $\mathcal{M}$, $\mathcal{C}$, $e$, $d$) is to be a successful cipher, it must have the following properties:

For any key $k \in \mathcal{K}$ and plaintext $m \in \mathcal{M}$, it must be easy to compute the ciphertext $e_k(m)$.

For any key $k \in \mathcal{K}$ and ciphertext $c \in \mathcal{C}$, it must be easy to compute the plaintext $d_k(C)$.

Given one or more ciphertexts $c_1, c_2, …, c_n \in \mathcal{C}$ encrypted using the key $k \in \mathcal{K}$, it must be very difficult to compute any of the corresponding plaintexts $d_{k}(c_1), …, d_{k}(c_n)$ without knowledge of $k$.

Given one or more pairs of plaintexts and their corresponding ciphertexts, ($m_1, c_1$), ($m_2, c_2$), …, ($m_n, c_n$), it must be very difficult to decrypt any ciphertext $c$ that is not in the given list without knowing $k$. This property is called security against a known plaintext attack.

For any list of plaintexts $m_1, …, m_n \in \mathcal{M}$ chosen by the adversary, even with knowledge of the corresponding ciphertexts $e_k(m_1), …, e_k(m_n)$, it is very difficult to decrypt any ciphertexts $c$ that is not in the given list without knowing $k$. This is known as security against a chosen plaintext attack. N.B. In this attack, the adversary is allowed to choose $m_1, …, m_n$, as opposed to a known plaintext attack, where the attacker is given a list of plaintext/ciphertext pairs not of his choosing.

Encoding Schemes: 编码方案 != 加密方案两者目的不同。编码方案只是为了将数据从一种类型转化为另一种类型，应当被公众知晓。而加密方案则是为了隐藏信息，为了保证信息的机密性。

Symmetric Encryption of Encoded Blocks: 字节序与整数之间的一个转化。

exhaustive search attack(brute-force attack), meet-in-the-middle(collison) attacks

Examples of Symmetric Ciphers: 模(质数)运算下的乘法, 模运算下的加法(shift or Caesar cipher), 模运算下的线性运算(affine cipher), 模运算下的矩阵运算(Hill cipher), 异或运算(stream cipher)。

Random Bit Sequences and Symmetric Cipher: 伪随机数的一些理论上的东西

Asymmetric Ciphers Make a First Appearance: 对称密码算法中，如何交换密钥key? $\longrightarrow$ *public key (or asymmetric) cryptography.

Elgamal, RSA, Goldwassser-Micali, ECC, GGH and NTRU.

Chapter 2: Discrete Logarithms and Diffie-Hellman

The Birth of Public Key Cryptography

An extremely important event: in 1976, Whitfield Diffie and Martin Hellman published their now famous paper entitled “New Directions in Cryptography.”

We stand today on the brink of a revolution in cryptography.

Public key encryption cryptosystem 的概念第一次被提出。

需要用到one-way trapdoor function

one-way trapdoor function

并不真正意义上的one-way function

Two years later, two major papers describing the PKC were published:

the RSA scheme of Rivest, Shamir, and Adleman
the knapsack scheme of Merkle and Hellam (broken)

The Discrete Logarithm Problem

The discrete logarithm problem is a mathematical problem that arises in many settings, including the mod $p$ version and the elliptic curve version.

$$ \mathbb{F}_p == \mathbb{Z}/p\mathbb{Z}. $$

$\mathbb{F}_p$ is a field with a prime number of elements.

$\mathbb{F}_p^{*}$ is the list of elements $1, g, g^2, g^3, …, g^{p-2}$ where $g$ is the generator (or primitive element) of the group.

Definition of the DLP: Let $g$ be a primitive root for $\mathbb{F}_p$. The Discrete Logarithm Problem (DLP) is the problem of finding an exponent $x$ such that $$ g^x \equiv h \quad (\text{mod}\ p). $$

The number $x$ is called the discrete logarithm of $h$ to the base $g$ and is denoted by $\log_g(h)$.

Powers $627^i\ \text{mod}\ 941\ \text{for}\ i = 1,2,3,…$

Discrete distribution of exponent mod

Generalized definition of the DLP: Let $G$ be a group whose group law we denote by the symbol $*$. The Discrete Logarithm Problem for $G$ is to determine, for any two given elements $g$ and $h$ in $G$, an integer $x$ satisfying $$ \begin{matrix} \underbrace{g * g * g * … * g} \newline \text{totally}\ x \ \text{times} \end{matrix} = h. $$

Diffile-Hellman Key Exchange

一个基于DLP的交换随机密钥的方法

Diffie-Hellman key exchange

The Elgamal Public Key Cryptosystem

One probabilistic encryption scheme.

Elgamal PKC

An Overview of the Theory of Groups

Definition of group

A group consists of a set $G$ and a rule, which we denote by $$, for combining two elements $a, b \in G$ to obtain an element $a * b \in G$. The composition operation $$ is required to have the following three properties: [Identity Law] There exists an $e \in G$ such that $e * a = a * e = a \quad \text{for every } a \in G$. [Inverse Law] For every $a \in G$ there is a (unique) $a^{-1} \in G$ satisfying $a * a^{-1} = a^{-1} * a = e$. [Associative Law] $a * (b * c) = (a * b) * c$ for all $a, b, c \in G$.

If, in addition, composition satisfies the [Commutative Law] $a * b = b * a$ for all $a, b \in G$, then the group is called a commutative group or an abelian group.

order: the smallest integer $d$ such that $a^d = e$.

Two important properties of the orders of group elements:

Let $G$ be a finite group. Then every element of $G$ has finite order. Further, if $a \in G$ has order $d$ and if $a^k = e$, then $d|k$.
(Lagrange’s Theorem). Let $G$ be a finite group and let $a \in G$. Then the order of $a$ divides the order $G$.

How Hard Is the Discrete Logarithm Problem?

How can we quantify “hard”?

A natural measure of hardness is the approximate number of operations necesssary for a person or a computer to solve the problem using the most efficient method currently known.

Definition of Order Notation:

Let $f(x)$ and $g(x)$ be functions of $x$ taking values that are positive. We say that “$f$ is big-$\mathcal{O}$ of $g$” and write $$ f(x) = \mathcal{O}(g(x)) $$ if there are positive constants $c$ and $C$ such that $$ f(x) \le c g(x) \quad \text{for all}\ x \ge C. $$ In particular, we write $f(x) = \mathcal{O}(1)$ if $f(x)$ is bounded for all $x \ge C$.

If the limit $$ \lim_{x \rightarrow \infty}{\frac{f(x)}{g(x)}} $$ exists (and is finite), the $f(x) = \mathcal{O}(g(x))$.

Order notation allows us to define several fundamental concepts that are used to get a rough handle on the computational complexity of mathematical problems.

Suppose that there is a constant $a \ge 0$, independent of the size of the input, such that if the input is $\mathcal{O}(k)$bits long, then

polynomial time (easy): there is a constant $A \ge 0$, such that it takes $\mathcal{O}(k^A)$ steps to solve the problem.
- linear time: it takes $\mathcal{O}(k^1)$ steps to solve the problem, i.e., $A = 1$.
- quadratic time: it takes $\mathcal{O}(k^2)$ steps to solve the problem, i.e., $A = 2$.
exponential time (hard): there is a constant $c \gt 0 $, such that it takes $\mathcal{O}(e^{ck})$ steps to solve the problem.
subexponential time: for every $\epsilon \gt 0$, they solve the problem in $\mathcal{O}_{\epsilon}(e^{\epsilon k})$ steps.

Method to solve DLP in $\mathbb{F}_p^{*}$:

Trial-and-error method: $\mathcal{O}(p) = \mathcal{O}(2^k)$, which takes expoential time.
Pohlig-Hellman algorithm: $p-1$ is smooth, then quite easy.
Shanks’s Babystep-Giantstep algorithm: $\mathcal{O}(\sqrt{N}\cdot \log{N})$ + $\mathcal{O}(\sqrt{N})$ storage.
Index Calculus algorithm: $\mathcal{O}(e^{c \sqrt{(\log_p)(\log{log_p})}})$, which is a subexponential algorithm.

Method to solve DLP in $\mathbb{F}_p$: $$ x\cdot g \equiv h \quad (\text{mod}\ p) $$

Extended Euclidean algorithm: $\mathcal{O}(\log{p})$, which is a linear-time algorithm.

The discrete logarithm problems in different groups may display levels of difficulty for their solutions.

Another sort of group called the elliptic curve: the best known algorithm to solve the DLP requires $\mathcal{O}(\sqrt{N})$ steps. Thus it takes exponetial time to solve the elliptic curve discrete logarithm problem (ECDLP).

A Collision Algorithm for the DLP

In this section we describe a discrete logarithm algorithm due to Shanks. It is an example of a collision, or meet-in-the-middle, algorithm.

Shanks’s algorithm works in any group, not just $\mathbb{F}_p^*$.

先是一个暴力破解的算法：

Screen Shot 2020-02-13 at 8.17.03 PM

The idea behind a collision algorithm is to make two lists and look for an element that appears in both lists.

Screen Shot 2020-02-13 at 8.18.21 PM

用空间换时间。

Screen Shot 2020-02-13 at 8.19.03 PM

Screen Shot 2020-02-13 at 8.19.42 PM

代码实现：

[https://github.com/soreatu/Cryptography/blob/master/Shank’s%20Babystep-Giantstep%20Algorithm.py](https://github.com/soreatu/Cryptography/blob/master/Shank's Babystep-Giantstep Algorithm.py)

The Chinese Remainder Theorem

中国剩余定理。

Number Theory里学过，不多说。

然后又探讨了下mod p下的平方根，可以看一下

[http://www.soreatu.com/essay/Notes%20on%20Quadratic%20Congruence.html](http://www.soreatu.com/essay/Notes on Quadratic Congruence.html)。

Screen Shot 2020-02-14 at 1.08.42 PM

from paper

The Pohlig-Hellman Algorithm

这一节就是来讲如何用中国剩余定理来解DLP。

Screen Shot 2020-02-13 at 8.30.10 PM

中国剩余定理不仅仅只是一个定理，更是一种思维方式。

在同余运算中，可以用它把任何模数$m$都拆成若干个素数$p$作为模数的形式。

$$ g^x \equiv h \pmod{p}. $$

注意这个指数$x$实际上是$\bmod{p-1}$的（费马小定理）。

我们可以尝试去对$p-1$进行因数分解，将其化简为若干个素数乘积的形式，进而将这个大的DLP化为若干个小的DLP。

Screen Shot 2020-02-13 at 8.35.56 PM

将原来big order的DLP化为了若干个small order的DLP，然后一一求解小DLP，再用CRT得到在big order下的解

The Pohlig-Hellman algorithm thus tells us that the discrete logarithm problem in a group $G$ is not secure if the order of the group is a product of small primes.

但是上面只把big order化成了若干个prime powers，而非primes。

We now explain the algorithm that reduces the discrete logarithm problem for elements of prime power order to the discrete logarithm problem for elements of prime order.

The idea is simple: if $g$ has order $q^e$, then $g^{q^{e-1}}$ has order $q$. The trick is to repeat this process several times and then assemble the information into the final answer.

现在来解一个以prime power作为order的小DLP。

$$ g^x \equiv h \pmod{p} $$

其中$g$的order为$q^e$。

解这个的关键一点就在于，我们可以把$x$写成$q$进制的形式： $$ x = x_0 + x_1 q + x_2 q^2 + \dots + x_{e-1} q^{e-1} \quad \text{with}\ 0 \le x_i < q $$ 然后依次算出$x_0, x_1, \dots$

观察到$g’ = g^{q^{e-1}}$的order为$q$，所以我们可以尝试在$g^x \equiv h \pmod{p}$两边同时加上$q^{e-1}$次方， $$ (g^x)^{q^{e-1}} \equiv h^{q^{e-1}} \pmod{p} $$ 把$x$的$q$进制形式代入， $$ \begin{aligned} (g^{x_0 + x_1 q + x_2 q^2 + \dots + x_{e-1} q^{e-1}})^{q^{e-1}} &\equiv h^{q^{e-1}} \pmod{p}\newline g^{x_0 q^{e-1}} \cdot (g^{q^e})^{x_1 + x_2 q + \dots + x_{e-1} q^{e-2}} &\equiv h^{q^{e-1}} \pmod{p}\newline (g^{q^{e-1}})^{x_0} &\equiv h^{q^{e-1}} \pmod{p} \end{aligned} $$

其中$g’ = g^{q^{e-1}}$的order即为$q$，上式可以化为 $$ (g’)^{x_0} \equiv h^{q^{e-1}} \pmod{p} $$ 这是一个order为$q$的DLP，容易求出$x_0$。

再依次求出$x_1, x_2, \dots$

Screen Shot 2020-02-13 at 9.02.47 PM

Screen Shot 2020-02-13 at 9.03.25 PM

Rings, Quotient Rings, Polynomial Rings, and Finite Fields

下面的就是一些abstract algebra里面的内容了，十分烧脑，不过也挺有意思的。

是理解进阶的椭圆曲线和NTRU格密码的基础。

Groups are fundamental objects that appear in many areas ofmathematics. A group $G$ is a set and an operation that allows us to “multiply” two elements to obtain a third element.

Another fundamental objects in mathematics, called a ring, is a set having two operations. These two operations are analogous to ordinary addition and multiplication, and they are linked by the distribution law.

An Overview of The Theory of Rings

Screen Shot 2020-02-13 at 10.43.32 PM

在这本书里，ring特指commutative rings with (multiplicative) identity。

Screen Shot 2020-02-13 at 10.45.33 PM

例子：

$\mathbb{Q}$: field
$\mathbb{Z}$: ring
$\mathbb{Z}/n\mathbb{Z}$: ring, and field if $n$ is a prime
$\mathbb{F}_p$: field
$\mathbb{Z}[x]$: ring
If $R$ is any ring, we can form a ring of polynomials whose coefficients are taken from the ring $R$.

Divisibility and Quotient Rings

The concept of divisibility, originally introduced for the integers $\mathbb{Z}$ can be generalized to any ring.

在广义上的环里，也有整除的定义。

Screen Shot 2020-02-13 at 10.52.19 PM

Elements that have multiplicative inverses and elements that have only trivial factorizations are special elements of a ring.

Screen Shot 2020-02-13 at 10.54.08 PM

例如：在$\mathbb{Z}$中，$\pm1$就是unit，而所有素数$p$（正的和负的）就是irreducible。且任何整数都有一个性质：unique factorization (every integer factors uniquely into a product of irreducible integers, up to rearranging the order of the factors and throwing in some extra factors of 1 and -1)

通过整除，我们也可以把同余推广：

Screen Shot 2020-02-13 at 11.02.17 PM

借此，我们就可以从old rings中构建出new rings。

Screen Shot 2020-02-13 at 11.07.15 PM

商环：等价类的集合

Polynomial Rings and the Euclidean Algorithm

另一方面，我们可以从一个已知的ring $R$，构造出一个polynomial ring. $$ R[x] = { a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n : n\ge 0\ \text{and}\ a_0, a_1, \dots, a_n \in R}. $$ 我们定义一个非零多项式的degree为多项式中最高项的次数，用$\deg(a)$来表示。

如果 $$ a(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n $$ 那么，$deg(a) = n$。

我们将$a_n$称做$a(x)$的leading coefficient（首项系数？）。

如果一个非零多项式的首项系数为1，我们就称之为monic polynomial（首一多项式）。

Especially important are those polynomial rings in which the ring $R$ is a field.

One reason why it is so useful to take R to be a field $\mathbb{F}$ is because virtually all of the properties of $\mathbb{Z}$ that we proved are also true for the polynomial ring $\mathbb{F}[x]$.

下面就主要探究由域$\mathbb{F}$所形成的多项式域$\mathbb{F}[x]$的一些性质。

对于$\mathbb{F}[x]$，我们也可以进行多项式除法。

Screen Shot 2020-02-13 at 11.19.29 PM

对于$\mathbb{F}[x]$，我们也有公约数和最大公约数。

Screen Shot 2020-02-13 at 11.21.27 PM

对于$\mathbb{F}[x]$，我们也有扩展欧几里得算法。

Screen Shot 2020-02-13 at 11.23.36 PM

It is not hard to see that the units in a polynomial ring $\mathbb{F}[x]$ are precisely the nonzero constant polynomials, i.e., the nonzero elements of $\mathbb{F}$.

对于$\mathbb{F}[x]$，也具有uniquely factorization这个性质。

Screen Shot 2020-02-13 at 11.25.52 PM

Quotients of Polynomial Rings and Finite Fields of Prime Power Order

前面我们搞过了商环和多项式环，现在我们把它们两个结合起来，来搞一下多项式商环。

Screen Shot 2020-02-13 at 11.33.52 PM

多项式域里的所有非零多项式，均同余等价于（可由xxx唯一表示）度数不大于$\deg(m)$的那些多项式。

我们用这个来数一下多项式商域里的元素个数。

Screen Shot 2020-02-13 at 11.36.55 PM

度数为0的有$p$个，度数为1的有$p$个，…；最高度数为$d-1$，因此有$p^d$个。

下面我们给出一个多项式商环里的units的一个特性，这个能让我们从finite fields中构造出新的finite fields。

Screen Shot 2020-02-13 at 11.39.49 PM

和除数互素的多项式就是unit。

再用一下前面的扩展欧几里得算法。 $$ a \cdot u + m \cdot v = 1 $$ 取模$m$， $$ a \cdot u \equiv 1 \pmod{m} $$ 如果这个$m$恰巧是一个irreducible polynomial，那么这个多项式商域就是一个新的域。

Screen Shot 2020-02-13 at 11.42.13 PM

而我们所熟知的复数集$\mathbb{C}$就是一个很好的例子：

Screen Shot 2020-02-13 at 11.43.00 PM

$\mathbb{F}_p$是一个有$p$个元素的finite field，如果我们尝试进行$\mathbb{F}_p[x]/m$，其中$m$的度数为$d$，那么就能构建出一个拥有$p^d$（prime order）个元素的finite field。

Screen Shot 2020-02-13 at 11.46.07 PM

Screen Shot 2020-02-13 at 11.46.38 PM

因此，若想要构建一个有$p^d$个元素的域，我们需要去找到一个在$\mathbb{F}_p[x]$中度数为$d$的不可约多项式$m$，然后构建多项式商域$\mathbb{F}_p[x]/m$即可。

总结如下：

Screen Shot 2020-02-13 at 11.49.29 PM

这些域又被称为Galois fields，用$GF(p^d)$来表示，以纪念19世纪的法国数学家Evariste Galois（abstract algebra的创始人）。

If $\mathbb{F}$ is a finite field, then $\mathbb{F}$ has $p^d$ elements for some prime $p$ and some $d \ge 1$.

有限域里元素的个数只能是prime或者prime power个。

费马小定理也可以推广到域中。

Screen Shot 2020-02-13 at 11.54.19 PM

Chapter 3: Integer Factorization and RSA

Euler’s Formula and Roots Modulo $pq$

给出了在模$pq$下的欧拉定理：

Screen Shot 2020-02-14 at 11.08.50 AM

并将DLP与RSA的底层难题做了一个对比： $$ \text{DLP} : a^x \equiv b \pmod{p} $$

$$ \text{RSA} : x^e \equiv c \pmod{pq} $$

DLP是计算指数的困难，而RSA则是计算底数的困难。

The security of RSA relies on the assumption that it is difficult to take $e$th roots modulo $N=pq$.

If we know how to factor $N$, then it is again easy to compute $e$th roots.

如果我们知道$N$的因数分解，我们就可以知道$\phi(N)$。

如果 $$ \gcd(e, \phi(N)) = 1 $$ 令 $$ d \equiv e^{-1} \pmod{\phi(N)} $$ 那么 $$ x^e \equiv c \pmod{N} $$ 有唯一解 $$ x \equiv c^d \pmod{N} $$

We can often make the computation faster by using a smaller value of $d$.

Let $g = \gcd(p-1, q-1)$ and suppose that we solve the following congruence for $d$: $$ d e \equiv 1 \pmod{ \frac{(p-1)(q-1)}{g}} $$ Euler’s formula says that $a^{(p-1)(q-1)/g} \equiv 1 \pmod{pq}$.

Hence, $$ (c^d)^e = c^{de} = c^{1 + k(p-1)(q-1)/g} = c \cdot (c^{(p-1)(q-1)/g})^k \equiv c \pmod{pq} $$

$\frac{(p-1)(q-1)}{g}$实际上就是$\text{lcm}(p-1, q-1)$。

可以用更小的$d$来使运算更快一些。

Screen Shot 2020-02-14 at 11.27.34 AM

Factorization via Difference of Squares

The most powerful factorization methods known today rely on one of the simplest identities in all of mathematics.

平方差公式： $$ X^2 - Y^2 = (X + Y)(X - Y) $$

两个平方数的差等于一个乘积。

如果我们想要分解一个数$N$，那么我们可以找到这样一个的$b$，使得$N + b^2$是一个平方数$a^2$。

那么 $$ N = a^2 - b^2 = (a + b)(a - b) $$ 这样，我们就能有效地找到$N$的一个因数分解。

One example of factorization

如果$N$很大，那么能寻找这样的$a, b$的概率是极低的。

事实上，只要 $$ kN = a^2 - b^2 = (a + b)(a - b), $$ 就有一定的概率使得$N$的因数散落在等式右边，只要去计算$\text{gcd}(N, a + b)$和$\text{gcd}(N, a - b)$就能找到$N$的因数。

Factorization_2

为什么不尝试$2N + b^2$？ $$ \begin{equation} 2N + b^2 \equiv 2 + b^2 \equiv \left{ \begin{aligned} 2 + 0 \equiv 2 &\quad (\text{mod}\ 4) \quad \text{if}\ b \ \text{is even},\newline 2 + 1 \equiv 3 &\quad (\text{mod}\ 4) \quad \text{if}\ b \ \text{is odd}. \end{aligned} \right. \end{equation} $$ 而如果$x$一个平方数的话，必须要是$x \equiv 0\ \text{or}\ 1\ (\text{mod}\ 4)$.

$N$的倍数在模$N$的情况下，同余0。因此，去寻找$kN = a^2 - b^2$，也就是去寻找 $$ a^2 \equiv b^2 \quad (\text{mod}\ N).\tag{3-1} $$ 这样就可以用模运算来阐明我们的目标。

事实上，直接（暴力）去寻找满足(3-1)的$a, b$也是不可行的。

但可以用如下的一个three-step process来寻找:

This procedure, in one form or another, underlies most modern methods of factorization.

Three-step process

Factorization_31

Factorization_32

Factorization_4

…

步骤1需要$c_i$能被分解为小质数的乘积 => Smooth Numbers

步骤2需要$c_{i_{1}}, c_{i_{2}}, …$的乘积的质因数分解的阶数都为偶数 => $GF(2)$上解线性方程组

步骤3得到的gcd还很有可能是1

Smooth Numbers, Sieves, and Building Relations for Factorization

Smooth Numbers那边没怎么看懂。。

Definidtion. An integer $n$ is called B-smooth if all of its prime factors are less than or equal to B.

Smooth_Numbers_1

polynomial < subexponential < exponential

我们需要至少找到和小于B的所有质数的个数一样多的B-smooth numbers。

Proposition. Let $L(X) = e^{\sqrt{(\ln{X})(\ln{\ln{X}})}}$, let $N$ be a large integer, and set $B = {L(N)}^{\frac{1}{\sqrt{2}}}$.

(a) We expect to check approximately ${L(N)}^{\frac{1}{\sqrt{2}}}$ random numbers modulo $N$ in order to find $\pi(B)$ numbers that are B-smooth.

(b) We expect to check approximately ${L(N)}^{\frac{1}{\sqrt{2}}}$ random numbers of the form $a^2 (\text{mod}\ N)$ in order to find enough B-smooth numbers to factor $N$.

Hence the factorization procedure described above should have a *subexponential running time.

Two sieves:

Quadratic sieve
Number field sieve

How can we efficiently find many numbers $a > \sqrt{N}$ such that each $a^2\ (\text{mod}\ N)$ is B-smooth?

Allow slightly larger values of $a$ and use an efficient cancellation process called a sieve to simultaneously create a large number of values $a^2 \ (\text{mod}\ N)$ that are B-smooth.

Pomerance’s quadratic sieve: still the fastest known method for factoring large numbers $N = pq$ up to about $2^{350}$.

For numbers considerably larger than this, say larger than $2^{450}$, the more comlicated number field sieve holds the world record for quickest factorization.

Quadratic Sieve

我们需要列出形如$a^2\ (\text{mod}\ N)$的B-smooth numbers。

考虑多项式： $$ F(T) = T^2 - N. $$ 从$a = \lfloor\sqrt{N}\rfloor + 1$开始，列出$F(a), F(a+1), F(a+2), …, F(b)$.

对于所有小于$B$的素数$p$，找出方程 $$ t^2 \equiv N \quad (\text{mod}\ p) $$ 的两解（无解直接跳过）。

用$p$除$F(a), F(a+1), F(a+2), …, F(b)$中是解的数。

直到最后结果为$1$的数，即为B-smooth numbers.

Number Field Sieve

抽代，wsl

Theorem. Under some reasonable assumptions, the expected running time of the number field sieve to factor the number $N$ is $L_{1/3}(N)^c$ for a small value of $c$.

只有当$N$充分大的时候，Number Field Sieve才比别的方法快。

小于$10^{100}$: Quadratic Sieve

大于$10^{130}$: Number Field Sieve

The Index Calculus Method for Computing Discrete Logarithms in $\mathbb{F}_p$

The indes calculus is a method for solving the discrete logarithm problem in a finite field $\mathbb{F}_p$.

我们想要解离散对数问题 $$ g^x \equiv h \quad (\text{mod}\ p),\tag{3-8} $$ 其中$p, g, h$均已知。

我们不直接去解(3-8)，我们先解 $$ g^x \equiv l \quad (\text{mod}\ p) \quad\quad \text{for all primes}\ l \le B. $$ 也就是，我们先算出所有小于B的质数$l$的离散对数$\log_{g}(l)$.

再来看 $$ h\cdot g^{-k} \ (\text{mod}\ p) \quad \text{for}\ k = 1, 2, … $$ 直到我们找到了一个$k$使得$h\cdot g^{-k} \ (\text{mod}\ p)$是B-smooth的。

那么，对于这个$k$，有： $$ h\cdot g^{-k} \equiv \prod_{l \le B} l^{e_l} \quad (\text{mod}\ p) $$ 可以对上式取对数： $$ log_g(h) \equiv k + \sum_{l \le B} e_l\cdot log_g(l) \quad (\text{mod}\ p-1) $$ 即可求出我们想要的$x = log_g (h) $

如何解小质数$l$的离散对数?

随机的指数$i$，计算 $$ g_i \equiv g^i \quad (\text{mod}\ p) $$ 如果$g_i$是B-smooth的，那么可以把它分解成 $$ g_i = \prod_{l \le B} l^{u_l(i)} $$ 也就是 $$ i \equiv log_g(g_i) \equiv \sum_{l \le B} u_l(i) \cdot log_g (l) \quad (\text{mod} p-1). $$ 上式中，仅$log_g(l)$未知。只需找到$\pi(B)$个像上式一样的方程，即可用线代知识解出变量$log_g(l)$.

Index_Calculus_1

Index_Calculus_2

Index_Calculus_3

In any case, the index calculus is a subexponential algorithm for solving the discrete logarithm problem in $\mathbb{F}_p$.

Quadratic Residues and Quadratic Reciprocity

If $g$ is a primitive root modulo $p$, then $$ \begin{equation} g^m\ \text{is a} \left{ \begin{aligned} &\text{quadratic residue} &\text{if}\ m\ \text{is even},\newline &\text{quadratic nonresidue} &\text{if}\ m\ \text{is odd}. \end{aligned} \right. \end{equation} $$ 二次剩余，二次非剩余；Legendre symbol；二次互反律；Jacobi symbol.

如果$(\frac{a}{b}) = 1$，其中$b$是一个奇数，那么$a$是一个模$b$的平方？

如果$b$是一个质数，当然是。

如果$b = p\cdot q$，有两种情况可以使得$(\frac{a}{b}) = 1$，也就是$1 = 1\cdot 1$和$1 = (-1)\cdot (-1)$.

Case 1: $(\frac{a}{p}) = (\frac{a}{q}) = 1$. $a$是一个模$pq$的平方.

Case 2: $(\frac{a}{p}) = (\frac{a}{q}) = -1$. $a$不是一个模$pq$的平方.

因此如果$b = pq$，那么从$(\frac{a}{b})=1$中是无法看出来$a$究竟是不是模$b$的平方剩余。

Probabilistic Encryption and the Goldwasser-Micali Cryptosystem

Base on the idea:

Let $p$ and $q$ be (secret) prime numbers and let $N = pq$ be given. For a given integer $a$, determine whether $a$ is a square modulo $N$, i.e., determine whether there exists an integer $u$ satisfying $u^2 \equiv a \pmod{N}.$

Note that Bob, who knows how to factor $N = p q$, is able to solve this problem very easily, since $$ a \ \text{is a square modulo } pq \quad \text{if and only if}\quad (\frac{a}{p}) = 1 \quad \text{and}\quad (\frac{a}{q}) = 1 $$ Eve, on the other hand, has a harder time, since she knows only he value of N. She can compute $(\frac{a}{N})$, but cannot tell whether $a$ is a square modulo $N$.

Screen Shot 2020-02-12 at 1.57.42 AM

Screen Shot 2020-02-12 at 1.56.30 AM

Note: The Goldwasswer-Micali public key cryptosystem has a message expansion ratio of 1000, hence impractical.

Chapter 4: Digital Signature

Chapter 5: Combinatorics, Probability, and Information Theory

Chapter 6: Elliptic Curves and Cryptography

Chapter 7: Lattices and Cryptography

Public key cryptosystems exclusively base on the difficulty of:

factoring large numbers,
finding discrete logarithms in a finite group.

A new type of hard problem arising in the theory of lattices can be uesd as the basis for a public key cryptosystem.

Advantages:

Faster encryption/decryption
quantum resistence

A Congruential Public Key Cryptosystem

A toy model of a real public key cryptosystem(NTRU).

2维的NTRU。

Screen Shot 2020-01-09 at 8.34.33 PM

实际上是一个CVP。

Due to Gauss, there is an extremely rapid method for finding short vectors in two-dimensional lattices.

CTF Challenge:

2019巅峰极客线上赛——NTURE

wp：https://xz.aliyun.com/t/7163

Subset-Sum Problems and Knapsack Cryptosystems

The Subset-Sum Problem

Suppose that you are given a list of positive integers ($M_1, M_2, …, M_n$) and another integer $S$.

Find a subset of the elements in the list whose sum is $S$.

$n$个元素的集合，有$2^n$个子集。要找到一个确切的满足条件的子集，很难。

令 $$ \mathbf{M} = (M_1, M_2, …, M_n) $$ 为one list of public positive integers。

取一个secret binary vector $$ \boldsymbol{x} = {x_1, x_2, …, x_n} $$ 其中每一个$x_i$不是0就是1。

计算 $$ S = \sum_{i=1}^{n}{x_i M_i} $$ subset-sum problem就是要找到一个binary vector能够算得同样的$S$（并不一定要是原来那个）。

存在一个obvious的平方根$\mathcal{O(2^{n/2})}$的collision算法。

Screen Shot 2020-02-10 at 12.35.36 AM

Screen Shot 2020-02-10 at 12.36.58 AM

n很大的话，想要求解一个子集和问题是很困难的。

但是如果有trapdoor information就会变得很容易去求解。

由此可以构建一个public key cryptosystem。

Definition.

A superincreasing sequence of integers is a list of positive integers $\mathbf{r} = (r_1, r_2, \dots, r_n)$ with the property that $$ r_{i+1} \ge 2r_i \quad \text{for all}\ 1 \le i \le n-1. $$

超增序列有一个性质：

Let $\mathbf{r} = (r_1, r_2, \dots, r_n)$ be a superincreasing sequence. Then $$ r_k > r_{k-1} + \dots + r_2 + r_1 \quad \text{for all}\ 2\le k \le n. $$

因此超增序列的subset-sum problem是非常容易去解的：

Screen Shot 2020-02-10 at 12.46.49 AM

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15


def solve(M, S):
    n = len(M)
    x = [0]*n
    i = n - 1
    while i >= 0:
        if S >= M[i]:
            x[i] = 1
            S -= M[i]
        i -= 1
    return x

M = (3, 11, 24, 50, 115)
S = 142
print(solve(M, S))
# [1, 0, 1, 0, 1]

Merkle和Hellman由此提出了一个基于超增序列的subset-sum problem的public key cryptosystem。

Screen Shot 2020-02-10 at 12.54.37 AM

Screen Shot 2020-02-10 at 12.58.05 AM

Cryptosystems based on disguised subset-sum problem are known as subset-sum cryptosystems or knapsack cryptosystems. The general idea is to start with a secret superincreasing sequence, disguise it using secret modular linear operations, and publish the disguised sequence as the public key.

密钥空间大小问题：

If $r_1$ is too small, then there are easy attacks, so we must insist that $r_1 > 2^n$. $$ r_n > 2r_{n-1} > 4r_{n-2} > \dots > 2^nr_1 > 2^{2n}. $$

$$ M_i = O(2^{2n}), \quad S = O(2^{2n}) $$

n = 160 —> 51200bits

Attacks:

The first attack, by Shamir, Odlyzko, Lagarias and others, used various ad hoc methods.
After the publication of LLL paper in 1985, it became clear that knapsack-based cryptosystems have a fundamental weakness.

Suppose S as a subset-sum from the set $\mathbf{M} = (m_1, \dots, m_n)$.

构造矩阵

Screen Shot 2020-02-10 at 1.10.23 AM

即为lattice $$ L = {a_1\mathbf{v_1} + a_2\mathbf{v_2} + \dots + a_n\mathbf{v_n} + a_{n+1}\mathbf{v_{n+1}} : a_1, a_2, \dots, a{n+1} \in \mathbb{Z}}. $$

假设$\mathbf{x} = (x_1, \dots, x_n)$是这个subset-sum problem的一个解，那么lattice L就会包含这样一个向量： $$ \mathbf{t} = \sum^n_{i=1}x_i\mathbf{v_i} - v_{n+1} = (2x_1 - 1, 2x_2 - 1, \dots, 2x_n - 1, 0) $$

最后一项为0的原因是$S = x_1m_1 + \dots + x_nm_n$

注意到$\mathbf{t}$中的每一项都只能是$\pm1$，因此$| \mathbf{t} | = \sqrt{n}$。

但是lattice每一个基向量$| \mathbf{v}_i| = O(2^{2n})$，所以L里面不太可能有（除零向量外）比$\mathbf{t}$还要短的向量。

因此，如果有一种算法（lattice reduction algorithms）能快速找到上述lattice里的最短向量，就可以恢复出明文$\mathbf{x}$。

LLL
LLL-BKZ (LLL variants)

A Brief Review of Vector Space

跟线性代数的关联度很大。

Definition:

Screen Shot 2020-02-10 at 1.33.50 AM

。。。定义太多了。

给定一组基底，如何将其正交化？

Screen Shot 2020-02-10 at 1.35.58 AM

这个算法对理解LLL算法挺关键的。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29


# sage
def GramSchmidtAlgorithm(M):
    '''

    Given a basis for a vector space $V \in R^m$, creates
    an orthogonal basis for $V$.

    INPUT:

    * "M" -- a matrix.

    OUTPUT:

    * "G" -- a matrix consisting of vectors that are pairwise orthogonal.

    NOTE:

    Better way to do this -- sage: G, _ = M.gram_schmidt()

    '''
    n = M.nrows()
    G = matrix(RR, n)
    for i in range(n):
        vi = M[i]
        for j in range(i):
            mu = (M[i]*G[j]) / G[j].norm()^2
            vi -= mu * G[j]
        G[i] = vi
    return G

Lattice: Basic Definition and Properties

Definition of lattice.

Screen Shot 2020-02-10 at 1.40.26 AM

Lattice：一组基向量的整系数线性组合。

Lattice可能有很多个基底。它们之间被由整数组成且行列式为$\pm1$的矩阵所联系。

An integral (or integer) lattice is:

a lattice with all of whose vectors have integer coordinations
an additive subgroup of $\mathbb{Z}^m$ for some $m \ge 1$

Lattice is:

a set of points
a discrete additive subgroup of $\mathbb{R}^m$

Screen Shot 2020-02-10 at 2.02.21 AM

Screen Shot 2020-02-10 at 2.02.40 AM

$\mathbb{R}^n$里的每一点$w$都可以写为 $$ w = t + v \quad \text{for a unique}\ t \in \mathcal{F} \quad \text{and a unique}\ v \in L. $$

All fundamental domains of a lattice $L$ have the same volume, which turns out to be an extremely important invariant of the lattice.

Screen Shot 2020-02-10 at 2.06.58 AM

Upper bound for the derterminat of a lattice:

Screen Shot 2020-02-10 at 2.09.07 AM

$\text{Vol}(\mathcal{F})$就是由基向量所组成的矩阵的秩。 $$ \text{det}(L) = \text{Vol}(\mathcal{F}) = \text{det}(M) $$ 是lattice中的不变量。

Short Vectors in Lattices

The Shortest Vector problem and the Closest Vector Problem

lattice中的两大根本难题：

The Shortest Vector Problem (SVP)

Find a shortest nonzero vector in a lattice $L$, i.e., find a nonzero vector $v \in L$ that minimizes the Euclidean norm $| v|$.
The Closest Vector Problem (CVP)

Given a vector $w \in \mathbb{R}^m$ that is not in $L$, find a vector $v \in L$ that is closest to $w$, i.e., find a vector $v \in L$ that minimizes the Euclidean norm $| w-v|$.

最短向量（最近向量）可能不止一个，我们只需要求出一个就可以了。

In full generality, CVP is known to be $NP$-hard and SVP is $NP$-hard under a certain “randomized reduction hypothesis”.

CVP is considered to be “a little bit harder” than SVP, since CVP can often be reduced to SVP in a slightly higher dimension.

In real world scenarios, cryptosystems based on $NP$-hard or $NP$-complete problems tend to rely on a particular subclass of problems, either to achieve efficiency or to allow the creation of a trapdoor.

一些SVP和CVP的变种：

Shortest Basis Problem (SBP)

Find a basis $v_1, \dots, v_n$ for a lattice that is shortest in some sense.
Approximate Shortest Vector Problem (apprSVP)

Let $\Psi(n)$ be a function of n. In a lattice $L$ of dimension n, find a nonzero vector that is no more than $\Psi(n)$ times longer than a shortest nonzero vector.
Approximate Closest Vector Problem (apprCVP)

This is the same as apprSVP, but now we are looking for a vector that is an approximate solution to CVP, instead of an approximate solution to SVP.

Hermite’s Theorem and Minkowski’s Theorem

Lattice中的最短向量到底有多长？取决于lattice的维度和$det(L)$。

Hermite’s Theorem

Every lattice $L$ of dimension $n$ contains a nonzero vector $v \in L$ satisfying $$ | v | \le \sqrt{n}\ \det(L)^{1/n}. $$

给出了最短向量的一个上限。

对于基底$v_1, \dots, v_n$有： $$ \det(L) \le |v_1| |v_2| \dots |v_n| \le n^{n/2}\det(L) $$ 给出了所有基向量乘积的一个上下界。

Hadamard ratio of the basis $mathcal{B} = {v_1, \dots, v_n}$: $$ \mathcal{H}(\mathcal{B}) = (\frac{\det(L)}{|v_1| |v_2| \dots |v_n|})^{1/n}. $$ Thus $0 < \mathcal{H}(\mathcal{B}) \le 1$, and the closer that the value is to 1, the more orthogonal are the vectors in the basis.

可以用这个来衡量一个基底的正交程度。

Minkowski’s Theorem

这边有点高深了。。

Screen Shot 2020-02-10 at 9.13.33 PM

Screen Shot 2020-02-10 at 9.13.53 PM

可以用这个来证Hermite’s Theorem。

The Gaussian Heuristic

gamma函数：

Screen Shot 2020-02-10 at 9.18.40 PM

然后一通操作，可以优化一下Hermite’s Theorem的结果。

Gaussian expected shortest length

Screen Shot 2020-02-10 at 9.23.24 PM

对于比较小的$n$，有一个公式： $$ \sigma(L) = (\Gamma(1 + n/2) \det(L))^{1/n} / \sqrt{\pi}. $$ 可以用这个公式来大致地估量一下最短向量的长度。

Babai’s Algorithm and Using a “Good” Basis to Solve apprCVP

如果基底是正交的（十分接近正交），那么就很容易解SVP和CVP。

Screen Shot 2020-02-10 at 9.38.16 PM

$$ w = v + \epsilon_1 v_1 + \epsilon_2 v_2 + \dots + \epsilon_n v_n \quad \text{for some}\ 0 \le \epsilon_1, \epsilon_, \dots, \epsilon < 1. $$

因此只要让小于1/2的$\epsilon_i$变为0，大于1/2的$\epsilon_i$变为1即可。

“good” basis没问题，但是如果是"bad" basis就会有很大问题。

Screen Shot 2020-02-10 at 9.33.09 PM

Screen Shot 2020-02-10 at 9.42.38 PM

具体的算法：

Screen Shot 2020-02-10 at 9.33.31 PM

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14


# sage
def BabaisClosestVertexAlgorithm(L, w):
    '''
    INPUT:
    * "L" -- a matrix representing the LLL-reduced basis (v1, ..., vn) of a lattice.
    * "w" -- a target vector to approach to.
    OUTPUT:
    * "v" -- a approximate closest vector.
    '''
    vt = L.solve_left(w)
    v = vector(ZZ, [0]*len(L[0]))
    for t, vi in zip(vt, L):
        v += round(t) * vi
    return v

Cryptosystems Based on Hard Lattice Problems

几大重要的基于lattice的cryptosystems：

Ajtai-Dwork cryptosystem
- Provably secure unless a worst-case lattice problem can be solved in polynomial time.
- key size: $O(n^4)$.
- Nguyen and Stern showed that any practical and efficient implementation of the Ajtai-Dwork system is insecure.
GGH cryptosystem of Goldreich, Goldwasser, and Halevi
- Private key: good basis; public key: bad basis.
- Message: a binary vector $m$
- Encryption: linear combination $\sum{m_iv_i^{\text{bad}}}$with the bad basis, and then perturbs the sum by adding a small random vector $r$.
- Decryption: knowing a good basis, use Babai’s algorithm to recover $m$.
- Nguyen showed that a transformation of the original GGH encryption scheme reduced the problem to an easier CVP.
NTRU cryptosystem proposed by Hoffstein, Pipher, and Silverman
- is most naturally described in terms of quotients of polynomial rings.
- hard problem underlying is easily transformed into an SVP (for key recovery) or a CVP (for plaintext recovery) in a special class of lattices.
- public key size: $O(n \log{n})$.

Roughly speaking, in order to achieve $k$ bits of security, encryption and decryption for Elgamal, RSA, and ECC require $O(k^3)$ operations, while encryption and decryption for lattice-based systems require only $O(k^2)$ operations. Further, the simple linear algebra operations used by lattice-based systems are very easy to implement in hardware and software. However, it must be noted that the security analysis of lattice-based cryptosystems is not nearly as well understood as it is for factorization and discrete logarithm-based systems.

The GGH Public Key Cryptosystem

推荐阅读：https://kel.bz/post/lattices/

Screen Shot 2020-02-10 at 10.06.27 PM

GGH is an example of a probabilistic cryptosystem.

Convolution Polynomial Rings

下面就是NTRU了。

NTRU的一些前置知识。

卷积多项式商环？

$x^i$取模$x^n$，系数$a_i$取模$q$。

Screen Shot 2020-02-11 at 1.01.44 AM

Screen Shot 2020-02-11 at 1.02.43 AM

Screen Shot 2020-02-11 at 1.05.08 AM

Screen Shot 2020-02-11 at 1.06.13 AM

The NTRU Public Key Cryptosystem

实际上RSA、Diffile-Hellman、Elgamal、ECC都是group-based。

而NTRU则是ring-based。

NTRUEncrypt

NTRU(pronounced $en-tr\bar{u}$): “N-th degree Truncated polynomial Ring Units”

N：维度，prime
p, q：两个模数，gcd(N, q) = gcd(p, q) = 1

Screen Shot 2020-02-11 at 1.12.52 AM

Screen Shot 2020-02-11 at 1.14.26 AM

为了让长度尽可能地短。

Public parameters：(N, p, q, d)
Private parameters：$f(x) \in \mathcal{T}(d+1, d) \quad \text{and}\ \quad g(x) \in \mathcal{T}(d, d)$

Screen Shot 2020-02-11 at 1.19.23 AM

The condition $q>(6d+1)$ ensures that decryption never fails.

For additional efficiency and to reduce the size of the public key, it may be advantageous to choose a smaller value of $q$.

Decryption failures have the potential to reveal private key information to an attacker.

NTRUEncrypt is an example of a probabilistic cryptosystem.

It is a bad idea for Bob to send the same message twice using different random elements.

$F_q(x) \in R_q$ tend to be randomly and uniformly distributed modulo $q$, same as $h(x)$ and $e(x)$.

The most time consuming part of encryption and decryption is the convolution product.

NTRUEncrypt encryption and decryption take $O(N^2)$ steps, where each step is extremely fast.

Screen Shot 2020-02-11 at 1.25.38 AM

Screen Shot 2020-02-11 at 1.26.00 AM

Mathematical Problems for NTRUEncrypt

A hidden relationship: $$ f(x) \star h(x) \equiv g(x) \pmod{q} $$ where $f(x)$ and $g(x)$ have very small coefficients, and $h(x)$ appear to be random integers modulo $q$.

Thus breaking NTRUEncrypt by finding the private key comes down to solving the following problem:

Screen Shot 2020-02-11 at 1.28.56 AM

实际上，如果$(f(x), g(x))$是一个解，那么$(x^k \star f(x), x^k \star g(x))$也是一个解。

Any pair of polynomials $(f(x), g(x))$ with sufficiently small coefficients and satisfying $f(x) \star h(x) \equiv g(x) \pmod{q}$ serves as an NTRU decryption key.

The problem is not practically solvable by a brute-force or collision search.

It is proved that solving the NTRU key recovery problem is (almost certainly) equivalent to solving SVP in a certain class of lattices.

The use of lattice reduction is currently the best known method to recover an NTRU private key from the public key.

Example. NTRUEncrypt with parameters $$ (N, p, q, d) = (251, 3, 257, 83) $$ requires $2^{381.6}$ times check.

A collision search take $O(3^{N/2}/\sqrt{N})$ steps.

NTRUEncrypt as a Lattice Cryptosystem

如何用lattice来描述NTRUEncrypt？

The NTRU Lattice

$$ h(x) = h_0 + h_1 x + \dots + h_{N-1} x^{N-1} $$

The NTRU lattice $L_h^{\text{NTRU}}$ associated to $h(x)$ is the 2N-dimensional lattice spanned by the rows of the matirx

Screen Shot 2020-02-11 at 1.38.48 AM

Screen Shot 2020-02-11 at 8.16.18 PM

两个private parameters $(f(x), g(x))$其实就是这个lattice上的一个格点。

Screen Shot 2020-02-11 at 8.17.22 PM

且$(f, g)$就是这个lattice上的最短向量。

Screen Shot 2020-02-11 at 8.19.32 PM

Screen Shot 2020-02-11 at 8.21.07 PM

Quantifying the Security of an NTRU Lattice

The security of NTRUEncrypt depends at least on the difficulty of solving SVP in $L_h^{\text{NTRU}}$.

More generally, if Eve can solve apprSVP in $L_h^{\text{NTRU}}$ to within a factor of approximately $N^{\epsilon}$ for some $\epsilon < \frac{1}{2}$, then the short vector that she finds will probably serve as a decryption key.

Experiments suggest that values of $N$ in the range from 250 to 1000 yield security levels comparable to currently secure implementation of RSA, Elgamal, and ECC.

Lattice-Based Digital Signature Schemes

如何从lattice难题中构造出电子签名方案？

The GGH Digital Signature Scheme

Sign：只有拥有"good" basis（private key）的人才能解出CVP。

Verify：通过验证给定的lattice point（signature）的确是离目标向量最短的。

Screen Shot 2020-02-11 at 8.45.50 PM

The signature $s \in L$ is a solution to apprCVP for the vector $d \in R^n$, so signing a document is equivalent to solving apprCVP.

Screen Shot 2020-02-11 at 8.50.05 PM

Screen Shot 2020-02-11 at 8.50.20 PM

The GGH signature scheme suffers the same drawback as the GGH cryptosystem, namely security requires lattices of high dimension, which in turn lead to very large public verification keys.

However, both GGH and NTRU signature schemes have a more serious shortcoming which we now describe.

Transcript Analysis

Screen Shot 2020-02-11 at 9.29.13 PM

Screen Shot 2020-02-11 at 9.29.31 PM

这种签名方法是不安全的，容易暴露private key。

Rejection Sampling

如何防范这种transcript attacks？

An alternative method of thwarting transcript attacks was proposed by Lyubashevsky. It is based on the idea from statistics called rejection sampling, in which one generates samples from a desired probability distribution by using samples from another distribution.

没怎么看懂。。。

The NTRU Modular Lattice Signature Scheme

Lyubashevsky gave an example of a transcript-secure signature scheme based on the learning with errors (LWE) problem.

NTRUMLS：NTRU modular lattice signature scheme

Screen Shot 2020-02-12 at 12.31.14 AM

Screen Shot 2020-02-12 at 12.31.30 AM

An attacker, using only the publick key $h$, can create a transcript of signed document hashes that is indistinguishable from a transcript created using the private key $(f, g)$. Hence the latter transcript (created using the private key) contains no information about the private key.

Lattice Reduction Algorithms

In this section we describe an algorithm called LLL that solves apprSVP and/or apprCVP to within a factor of $C^n$, where $C$ is a small constant and $n$ is the dimension of the lattice.

Ultimately, the security of lattice-based cryptosystems depends on the inability of LLL and other lattice reduction algorithms to efficiently solve apprSVP and apprCVP to within a factor of, say, $O(\sqrt(n))$.

Screen Shot 2020-02-12 at 1.16.25 AM

Babai’s algorithm有两个版本：

the closest vertex algorthm
the closes plane algorithm (更好一些)

Generalizations of LLL

还有很多对LLL算法有优化或者推广的算法。

Most of these methods involve trading increased running time for improved output.

The deep insertion method

LLL算法中交换的是$v_k$和$v_{k-1}$，这个算法是把$v_k$插到前面的两个$v_{i-1}$和$v_i$之间。
BKZ-LLL

基于一个叫做Korkin-Zolotarev (KZ) reduced的东西。

把交换那一步换成了对一个区块的reduction。

维度为$n$，区块大小为$\beta$，那么复杂度为$O(\beta^{c\beta}n^d)$。

Applications of LLL to Cryptoanalysis

实际上LLL算法在cryptoanalysis里面有很多应用

Attacks on knapsack public key cryptosystems to more recent analysis of lattice-based cryptosystems such as Ajtai-Dwork, GGH, and NTRU.
Lattice reduction attacks on RSA in certain situations.
…

这里主要介绍一下LLL算法对四个cryptosystems的攻击（congruential, knapsack, GGH, NTRU)

Congruential Cryptosystems

2维的NTRU，直接用Gauss Lattice Reduction或者LLL把$(f, g)$求出来就可以了。

一道CTF题目：https://xz.aliyun.com/t/7163

Applying LLL to Knapsacks

直接对那个knapsack lattice用LLL就可以了。

Screen Shot 2020-02-12 at 1.30.05 AM

When using LLL to solve subset-sum problems, it is often helpful to multiply $m_1, \dots, m_n, S$ by a large constant $C$. 能使Gaussian expected shortest vector 变大$C^{1/(n+1)}$倍。