Public-Private Key Cryptography

This document summarizes my experience with public-private key cryptography across the Bitcoin, Ethereum, and Solana blockchains.

Let's start with a comparison of Web 2.0 and Web 3.0 authentication.

How Banks Do Authentication:


Username and Password:

  • Traditional banks rely on a username and password for authentication.

  • These credentials allow you to:

    • View your funds.

    • Transfer funds.

    • Review your transaction history.

How Blockchains Do Authentication:

Public-Private Keypair:

Blockchain accounts are secured through a public-private keypair.

  • A public-private keypair consists of two keys used in asymmetric cryptography.

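Public Key:

  • The public key can be shared openly. It identifies the account (addresses are derived from it) and lets anyone verify signatures produced by the matching private key.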

Private Key:

  • The private key is a secret string that must be kept confidential.

  • It is used to sign transactions and prove ownership of the associated public key.

  • Never share your private key with anyone.


Bits and Bytes

Why learn this before working with public-private key pairs?

Private keys are generated and stored as bits and bytes. Understanding bits, bytes, and the common encodings and decodings makes it much easier to read and work with private keys.

What is a Bit?

  • A bit is the smallest unit of data in a computer.

  • It can have one of two values: 0 or 1.

  • All programs and code you write are eventually converted to 0's and 1's.

Analogy:

  • Think of a bit like a light switch that can either be off (0) or on (1).

Bit Representation in JavaScript:

  • Here, x represents a single bit with a value of 0.
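The snippet this note refers to is not preserved in this copy. A minimal sketch, assuming a plain variable named x as in the text:

```javascript
// A single bit can only hold one of two values: 0 or 1.
const x = 0;
console.log(x); // 0
```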

What is a Byte?

  • A byte is a group of 8 bits.

  • It’s the standard unit of data used to represent a single character in memory (one ASCII character fits in one byte).

Possible Values:

  • Since each bit can be either 0 or 1, a byte can have 2^8 (256) possible values, ranging from 0 to 255.

  • Example: The binary sequence 11001010 represents 202 in decimal, that is 128 + 64 + 8 + 2 (we'll cover this in the assignment below).

Byte Representation:

  • Here, x is a byte, representing the decimal value 202, which is equivalent to 11001010 in binary.
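The original snippet is missing here. A small sketch of what the description implies, assuming the same variable name x; the `bits` helper variable is only for illustration:

```javascript
// One byte holding the decimal value 202.
const x = 202;

// toString(2) renders the value in binary; padStart keeps all 8 bits visible.
const bits = x.toString(2).padStart(8, "0");
console.log(bits); // "11001010"

// parseInt with radix 2 converts the binary string back to a number.
console.log(parseInt("11001010", 2)); // 202
```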

Array of Bytes:

  • This is an array containing multiple bytes.
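The array itself is not shown in this copy. As an illustration only, with arbitrarily chosen values:

```javascript
// A plain JavaScript array whose elements are all valid byte values (0..255).
const bytes = [202, 244, 1, 23];
console.log(bytes.length); // 4
```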

Using Uint8Array in JavaScript:

Definition

  • Uint8Array is a typed array in JavaScript that represents an array of 8-bit unsigned integers (bytes).

Advantages:

  • Memory Efficiency: Uses less space; each value takes only 1 byte.

  • Constraints: Every element is kept within the byte range 0 to 255; a value outside that range wraps around modulo 256 instead of being stored as-is.

Example:

  • This code creates a Uint8Array with four bytes, each within the valid byte range.
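The code this bullet describes is not preserved. A sketch consistent with the description, using arbitrary example values:

```javascript
// Four bytes, each inside the valid range 0..255.
const bytes = new Uint8Array([0, 255, 127, 128]);

console.log(bytes.length);     // 4 elements
console.log(bytes.byteLength); // 4 bytes of storage, one per element
```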

Example:
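The second example is also missing from this copy. One behaviour worth illustrating here is what happens to out-of-range values: a Uint8Array does not throw, it stores the value modulo 256. The sketch below shows that, and contrasts it with Uint8ClampedArray, which clamps instead:

```javascript
const bytes = new Uint8Array(3);
bytes[0] = 255; // fits as-is
bytes[1] = 256; // wraps around to 0
bytes[2] = 300; // wraps around to 44 (300 - 256)
console.log(Array.from(bytes)); // [ 255, 0, 44 ]

// Uint8ClampedArray pins out-of-range values to the nearest bound instead.
const clamped = new Uint8ClampedArray([300, -5]);
console.log(Array.from(clamped)); // [ 255, 0 ]
```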


Why Use Uint8Array Over Native Arrays?

Memory Efficiency:

  • Native arrays in JavaScript hold generic numbers, which are 64-bit (8-byte) floating-point values regardless of the actual size of the number.

  • Uint8Array stores each number using only 1 byte, which is sufficient for values between 0 and 255.

Constraints:

  • Uint8Array guarantees that each element stays between 0 and 255. Out-of-range values are reduced modulo 256 (for example, 256 becomes 0) rather than rejected, so validate inputs if a silent wrap-around would be a bug.


Encodings

  • When working with computers, data is often represented in a format that is not human-readable, such as binary or bytes.

  • Encoding is the process of converting this data into a more readable format.

  • Some common encodings include ASCII, Hex, Base64, and Base58.

  • These encodings help us represent binary data in a more understandable way.

1. ASCII (American Standard Code for Information Interchange)

  • 1 character = 7 bits

  • ASCII is one of the oldest encodings used to represent text in computers. Each character in ASCII corresponds to a specific number (ranging from 0 to 127), which is represented in binary.

  • For example, the letter 'A' is represented by the number 65 in ASCII, which is 01000001 in binary.

Converting Bytes to ASCII
Converting ASCII to Bytes
Using Uint8Array for ASCII
ASCII to Uint8Array
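The expandable examples under these titles are not preserved in this copy. A minimal sketch of the four conversions, with hypothetical helper names chosen for illustration:

```javascript
// Bytes (plain array) -> ASCII string.
function bytesToAscii(byteArray) {
  return byteArray.map((byte) => String.fromCharCode(byte)).join("");
}

// ASCII string -> bytes (plain array).
function asciiToBytes(asciiString) {
  return [...asciiString].map((char) => char.charCodeAt(0));
}

// Uint8Array -> ASCII string. TextDecoder defaults to UTF-8,
// which is identical to ASCII for byte values 0..127.
function uint8ToAscii(bytes) {
  return new TextDecoder().decode(bytes);
}

// ASCII string -> Uint8Array. TextEncoder always produces UTF-8.
function asciiToUint8(asciiString) {
  return new TextEncoder().encode(asciiString);
}

console.log(bytesToAscii([72, 101, 108, 108, 111])); // "Hello"
console.log(asciiToBytes("Hello"));                  // [ 72, 101, 108, 108, 111 ]
console.log(uint8ToAscii(asciiToUint8("Hello")));    // "Hello"
```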

💡 ASCII table - Table of ASCII codes, characters and symbols (ascii-code.com); HTML ASCII Reference (w3schools.com)

2. Hexadecimal (Hex)

  • 1 character = 4 bits

  • Hexadecimal is a base-16 encoding system that uses 16 characters: 0-9 and A-F. It is commonly used in programming and digital systems to represent binary data in a more compact and readable format.

  • Each hex digit represents four bits (a nibble), and two hex digits represent one byte.

Converting Array to Hex
Converting Hex to Array
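The original examples are missing here. A sketch of both directions, assuming hypothetical helper names bytesToHex and hexToBytes:

```javascript
// Each byte becomes exactly two hex digits (padStart keeps the leading zero).
function bytesToHex(bytes) {
  return Array.from(bytes, (byte) => byte.toString(16).padStart(2, "0")).join("");
}

// Every pair of hex digits becomes one byte.
function hexToBytes(hex) {
  if (hex.length % 2 !== 0) throw new Error("hex string must have an even length");
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

console.log(bytesToHex(new Uint8Array([72, 101, 108, 108, 111]))); // "48656c6c6f"
console.log(Array.from(hexToBytes("48656c6c6f")));                 // [ 72, 101, 108, 108, 111 ]
```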

💡 Hex (Base16) encoder & decoder, a simple online tool 🧰 (hexator.com)

3. Base64

  • 1 character = 6 bits

  • Base64 is an encoding scheme that represents binary data in an ASCII string format. It uses 64 different characters (A-Z, a-z, 0-9, +, /). It is commonly used in data transfer, encoding images, and storing complex data as text.

Encoding to Base64
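The original example is not preserved. A short sketch, assuming a Node.js environment where the global Buffer class is available:

```javascript
const bytes = new Uint8Array([72, 101, 108, 108, 111]); // "Hello"

// Encode: 5 bytes -> 8 Base64 characters (including one "=" of padding).
const base64 = Buffer.from(bytes).toString("base64");
console.log(base64); // "SGVsbG8="

// Decode back into raw bytes.
const decoded = new Uint8Array(Buffer.from(base64, "base64"));
console.log(Array.from(decoded)); // [ 72, 101, 108, 108, 111 ]
```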

💡 Base64 Encode/Decode; Base64 Decode/Encode

4. Base58

  • Base58 is similar to Base64 but uses a different set of characters to avoid visually similar characters (e.g., 0 and O, l and 1) and to make the encoded output more user-friendly.

  • It is often used in Bitcoin and other cryptocurrencies for encoding addresses and other data.

Encoding to Base58
Decoding from Base58
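The original examples most likely relied on a library; they are not preserved here. The sketch below instead implements the Bitcoin-style Base58 alphabet directly with BigInt so that it has no dependencies. The function names are hypothetical:

```javascript
// Bitcoin Base58 alphabet: no 0, O, I, or l.
const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

function base58Encode(bytes) {
  // Interpret the bytes as one big-endian integer.
  let value = 0n;
  for (const byte of bytes) value = value * 256n + BigInt(byte);

  // Repeatedly divide by 58, collecting digits from least to most significant.
  let encoded = "";
  while (value > 0n) {
    encoded = ALPHABET[Number(value % 58n)] + encoded;
    value /= 58n;
  }
  // Each leading zero byte is written as the character "1".
  for (const byte of bytes) {
    if (byte !== 0) break;
    encoded = "1" + encoded;
  }
  return encoded;
}

function base58Decode(text) {
  let value = 0n;
  for (const char of text) {
    const digit = ALPHABET.indexOf(char);
    if (digit === -1) throw new Error(`invalid Base58 character: ${char}`);
    value = value * 58n + BigInt(digit);
  }
  const bytes = [];
  while (value > 0n) {
    bytes.unshift(Number(value % 256n));
    value /= 256n;
  }
  // Each leading "1" stands for one leading zero byte.
  for (const char of text) {
    if (char !== "1") break;
    bytes.unshift(0);
  }
  return new Uint8Array(bytes);
}

const encoded = base58Encode(new Uint8Array([0, 0, 1]));
console.log(encoded);                           // "112"
console.log(Array.from(base58Decode(encoded))); // [ 0, 0, 1 ]
```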

Hashing vs Encryption

Hashing

  • Hashing converts data into a fixed-size string of characters, known as a hash.

Key points:

  • Deterministic: The same input will always produce the same hash.

  • Fixed Size: Regardless of the input size, the output hash will always be the same length.

  • One-Way Function: It is computationally infeasible to reverse a hash and retrieve the original input data.

  • Collision Resistance: It is computationally difficult to find two different inputs that produce the same hash.

Common Hashing Algorithms:

  • SHA-256: Widely used in blockchain technology, ensuring data integrity.

  • MD5: Once popular for checksums, now considered insecure due to vulnerabilities.

SHA-256 (Secure Hash Algorithm - 256 bit)

SHA-256 is a cryptographic hash function that outputs a 256-bit hash value. It is widely used in blockchain technology for ensuring data integrity, most notably in Bitcoin (Ethereum relies mainly on the related but distinct Keccak-256). The function maps data of any size to a fixed-size hash, and it is computationally infeasible to recover the original input from that hash.

Mathematical Equation for SHA-256

The SHA-256 algorithm involves a series of logical and bitwise operations applied to an input message. Here's a simplified view of the SHA-256 hashing process:

  1. Padding: The input message is padded to a length that is a multiple of 512 bits: a single 1 bit, then 0 bits, then the original message length as a 64-bit integer.

  2. Initialize Hash Values: SHA-256 initializes eight 32-bit hash values ($H_0, H_1, \ldots, H_7$) to fixed constants.

  3. Process Message in 512-bit Chunks: The padded message is divided into 512-bit blocks. For each block, SHA-256 runs 64 rounds that use logical functions and bitwise operations like AND, OR, XOR, and right rotations, plus addition modulo $2^{32}$, to compress the message.

  4. Final Hash Value: After processing all chunks, the final 256-bit hash is formed by concatenating the updated hash values: $H_0 \,\|\, H_1 \,\|\, \cdots \,\|\, H_7$.

Example of SHA-256 Calculation:

For the input "hello world", the resulting SHA-256 hash is:
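The hash value itself is missing from this copy. It can be reproduced with Node's built-in crypto module (shown in JavaScript for consistency with the other examples in this document):

```javascript
const { createHash } = require("node:crypto");

// Hash the UTF-8 bytes of the string and print the digest as hex.
const sha256Hex = createHash("sha256").update("hello world").digest("hex");
console.log(sha256Hex);
// b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
```

The output is 64 hex characters, that is 256 bits.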

SHA-256 Mathematical Representation
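The formula that belongs here is not preserved in this copy. The block below is a reconstruction based on the standard SHA-256 round as specified in FIPS 180-4, restricted to the symbols explained in the list that follows. Here $a, \ldots, h$ are the eight working variables (initialized from $H_0, \ldots, H_7$ for each block), $i$ is the round index, and all additions are modulo $2^{32}$:

```latex
\begin{aligned}
T_1 &= h + \Sigma_1(e) + \mathrm{Ch}(e, f, g) + K_i + W_i \\
\mathrm{Ch}(x, y, z) &= (x \land y) \oplus (\lnot x \land z) \\
\Sigma_1(x) &= \mathrm{ROTR}^{6}(x) \oplus \mathrm{ROTR}^{11}(x) \oplus \mathrm{ROTR}^{25}(x)
\end{aligned}
```

After the 64 rounds of a block, each working variable is added back into the corresponding hash value, for example $H_0 \leftarrow H_0 + a$.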

Where:

  • $H_i$ is the hash value

  • $\mathrm{Ch}$ is the choose function

  • $\Sigma_1$ is the uppercase sigma 1 function

  • $W_i$ is the message schedule

  • $K_i$ is the round constant

SHA-256 Implementation (Python):

MD5 (Message Digest Algorithm 5)

MD5 is another cryptographic hash function that produces a 128-bit hash value. It was once widely used for verifying data integrity but is now considered insecure due to its susceptibility to collision attacks.

Mathematical Equation for MD5

The MD5 algorithm processes the message in 512-bit chunks and outputs a 128-bit hash. Like SHA-256, MD5 also involves padding and dividing the message into blocks.

  1. Padding: The input message is padded to ensure that the total length is congruent to 448 modulo 512 (i.e., 64 bits less than a multiple of 512). The padding is similar to SHA-256: a single 1 bit, then 0 bits, followed by the original message length as a 64-bit value, which brings the total to a multiple of 512 bits.

  2. Initialize MD5 Constants: MD5 initializes four 32-bit variables (A, B, C, D): A = 0x67452301, B = 0xefcdab89, C = 0x98badcfe, D = 0x10325476.

  3. Process Message in 512-bit Chunks: For each 512-bit chunk, the algorithm performs 64 iterations divided into four rounds of 16. Each round applies a non-linear function (F, G, H, I), bitwise operations, and additions.

  4. Final Hash Value: After processing all the chunks, the final MD5 hash is formed by concatenating the updated variables (A, B, C, D): $A \,\|\, B \,\|\, C \,\|\, D$.

Example of MD5 Calculation:

For the input "hello world", the resulting MD5 hash is:
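The hash value is missing from this copy as well. It can be reproduced with Node's built-in crypto module:

```javascript
const { createHash } = require("node:crypto");

// MD5 produces a 128-bit digest, printed here as 32 hex characters.
const md5Hex = createHash("md5").update("hello world").digest("hex");
console.log(md5Hex);
// 5eb63bbbe01eeed093cb22bb8f5acdc3
```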

MD5 Mathematical Representation
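The formula that belongs here is not preserved. The block below is a reconstruction of a single MD5 step in the form given by RFC 1321, written with the symbols explained in the list that follows. Additions are modulo $2^{32}$, and $F$ is shown for the first round (rounds two to four use $G$, $H$, and $I$ in its place):

```latex
\begin{aligned}
b_{i+1} &= b_i + \big( (a_i + F(b_i, c_i, d_i) + X[k] + T[i]) \lll s \big) \\
a_{i+1} &= d_i, \qquad c_{i+1} = b_i, \qquad d_{i+1} = c_i \\
F(b, c, d) &= (b \land c) \lor (\lnot b \land d)
\end{aligned}
```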

Where:

  • $a_i, b_i, c_i, d_i$ are 32-bit words

  • $F$ is a nonlinear function

  • $X[k]$ is the k-th 32-bit word of the message

  • $T[i]$ is the i-th element of the table of constants

  • $\lll s$ denotes a left bit rotation by s positions


Security and Integrity:

  • SHA-256: Considered secure and widely used in blockchain for data integrity and digital signatures.

  • MD5: Known for vulnerabilities (collision attacks), making it insecure for cryptographic uses but still used in non-critical scenarios like file checksums.

Comparison: SHA-256 vs MD5

Feature | SHA-256 | MD5
Output Size | 256 bits | 128 bits
Security | Secure for most uses | Vulnerable to collisions
Use in Blockchain | Widely used | Not recommended
Speed | Slower | Faster

Encryption

  • Encryption converts plaintext into ciphertext using an algorithm and a key.

Key points:

  • Reversible: With the correct key, the ciphertext can be decrypted back to plaintext.

  • Key-Dependent: The security of encryption relies on the secrecy of the key.

Types of Encryptions:


1. Symmetric Encryption:

Definition: The same key is used for both encryption and decryption.

Common Algorithms:

  • AES (Advanced Encryption Standard)

  • DES (Data Encryption Standard)

Example:
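The original example is not preserved. The sketch below uses AES-256-GCM from Node's built-in crypto module; the choice of GCM mode and the encrypt/decrypt helper names are assumptions made for illustration, not necessarily what the original showed:

```javascript
const { randomBytes, createCipheriv, createDecipheriv } = require("node:crypto");

// The single shared secret: the same 256-bit key encrypts and decrypts.
const key = randomBytes(32);

function encrypt(plaintext) {
  const iv = randomBytes(12); // fresh 96-bit nonce for every message
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // The auth tag lets the receiver detect tampering.
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

function decrypt({ iv, ciphertext, tag }) {
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const sealed = encrypt("hello world");
console.log(sealed.ciphertext.toString("hex")); // differs on every run
console.log(decrypt(sealed));                   // "hello world"
```

Anyone holding `key` can both encrypt and decrypt, which is exactly the property that distinguishes symmetric from asymmetric schemes.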

💡 AES Encryption and Decryption Online (devglan.com)

2. Asymmetric Encryption:


Uses a pair of keys – a public key and a private key – for encryption and decryption.

Key Pair:

  • Public Key: Can be shared openly and is used to encrypt data.

  • Private Key: Must be kept confidential and is used to decrypt data encrypted with the corresponding public key.

Common Algorithms:

  • RSA (Rivest–Shamir–Adleman)

  • ECC (Elliptic Curve Cryptography) - Used by ETH and BTC

  • EdDSA (Edwards-curve Digital Signature Algorithm) - Used by SOL

Common Elliptic Curves:

  • secp256k1: Used in Bitcoin (BTC) and Ethereum (ETH).

  • ed25519: Used in Solana (SOL).

💡 How Elliptic Curves Work

Use Cases of Public-Key Cryptography:

  • SSL/TLS Certificates: Ensuring secure communication over the internet.

  • SSH Keys: For secure server access or pushing code to GitHub.

  • Blockchains and Cryptocurrencies: Ensuring secure and verifiable transactions.


💡 How signing works on a blockchain:

  • A message on the blockchain is signed using the private key.

  • A miner verifies the transaction using the signature and the public key.

  • Demo: Blockchain Demo: Public / Private Keys & Signing (andersbrownworth.com)

Creating a public/private keypair


EdDSA - Edwards-curve Digital Signature Algorithm - ED25519

Using @noble/ed25519
Using @solana/web3.js
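The library examples under these titles are not preserved in this copy. To stay dependency-free, the sketch below swaps in Node's built-in crypto module rather than @noble/ed25519 or @solana/web3.js; it shows the same steps (generate a keypair, sign, verify) on the same curve:

```javascript
const { generateKeyPairSync, sign, verify } = require("node:crypto");

// Generate an Ed25519 keypair.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Sign with the private key. Ed25519 hashes internally, so the algorithm argument is null.
const message = Buffer.from("hello world");
const signature = sign(null, message, privateKey);

// Anyone with the public key can check the signature.
const isValid = verify(null, message, publicKey, signature);

// The raw public key is 32 bytes; the JWK export exposes it as base64url in the "x" field.
const rawPublicKey = Buffer.from(publicKey.export({ format: "jwk" }).x, "base64url");

console.log(rawPublicKey.length); // 32
console.log(signature.length);    // 64
console.log(isValid);             // true
```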

ECDSA (Elliptic Curve Digital Signature Algorithm) - secp256k1

Using @noble/secp256k1
Using ethers
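The library examples under these titles are not preserved either. As a dependency-free stand-in for @noble/secp256k1 and ethers, the sketch below uses Node's built-in crypto module with the secp256k1 curve. It signs a SHA-256 digest for simplicity; real chains differ in what they hash (Ethereum, for instance, signs Keccak-256 digests), so treat this only as an illustration of the ECDSA flow:

```javascript
const { generateKeyPairSync, sign, verify } = require("node:crypto");

// Generate an ECDSA keypair on the secp256k1 curve.
const { publicKey, privateKey } = generateKeyPairSync("ec", { namedCurve: "secp256k1" });

// Sign the SHA-256 digest of the message; the result is a DER-encoded signature.
const message = Buffer.from("hello world");
const signature = sign("sha256", message, privateKey);

// Verify with the public key only.
const isValid = verify("sha256", message, publicKey, signature);
console.log(isValid); // true
```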

Hierarchical Deterministic (HD) Wallet

HD wallets generate a tree of key pairs from a single seed, allowing users to manage multiple addresses from one root seed.

Problem:

  • Traditionally, maintaining multiple wallets required storing multiple public-private key pairs.

  • This is cumbersome and risky, as losing any one of these keys can result in the loss of associated funds.

Solution - BIP-32:

  • Bitcoin Improvement Proposal 32 (BIP-32), introduced by Bitcoin Core developer Pieter Wuille in 2012, addresses this problem by standardizing the derivation of private and public keys from a single master seed.

  • BIP-32 introduced the concept of hierarchical deterministic (HD) wallets, which use a tree-like structure to manage multiple accounts easily.

How to Create an HD Wallet

Mnemonics

  • A mnemonic phrase, or seed phrase, is a human-readable sequence of words used to generate a cryptographic seed.

  • BIP-39 (a companion standard that builds on BIP-32) defines how mnemonic phrases are generated and converted into a seed.

Example Code to Generate a Mnemonic:
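The original example (most likely a library call) is not preserved. The sketch below instead shows what BIP-39 does under the hood for a 12-word phrase: 128 bits of entropy plus a 4-bit checksum are split into twelve 11-bit indices into the wordlist. The 2,048-word list itself is not included, so the function, whose name is hypothetical, stops at the indices:

```javascript
const { randomBytes, createHash } = require("node:crypto");

// Turn 128 bits of entropy into the twelve 11-bit word indices defined by BIP-39.
function entropyToWordIndices(entropy) {
  if (entropy.length !== 16) throw new Error("this sketch only handles 128-bit entropy");

  // Checksum: the first 4 bits (entropy bits / 32) of SHA-256(entropy).
  const firstHashByte = createHash("sha256").update(entropy).digest()[0];

  let bits = "";
  for (const byte of entropy) bits += byte.toString(2).padStart(8, "0");
  bits += firstHashByte.toString(2).padStart(8, "0").slice(0, 4);

  // 132 bits -> 12 groups of 11 bits, each an index in 0..2047.
  const indices = [];
  for (let i = 0; i < bits.length; i += 11) {
    indices.push(parseInt(bits.slice(i, i + 11), 2));
  }
  return indices;
}

const indices = entropyToWordIndices(randomBytes(16));
console.log(indices.length); // 12

// With the BIP-39 English wordlist loaded as WORDLIST (not included here),
// the phrase would be: indices.map((i) => WORDLIST[i]).join(" ")
```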


Seed Phrase

  • The seed is a 512-bit binary value derived from the mnemonic phrase (BIP-39 specifies PBKDF2 with HMAC-SHA512 and 2,048 iterations). This seed is used to generate the master private key.

Example Code to Generate a Seed from a Mnemonic:
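The original example is not preserved. A sketch of the BIP-39 seed derivation using only Node's built-in crypto module; the function name mnemonicToSeed is chosen for illustration, and the phrase shown is the well-known all-zero-entropy test phrase, never to be used for real funds:

```javascript
const { pbkdf2Sync } = require("node:crypto");

// BIP-39: seed = PBKDF2-HMAC-SHA512(password = mnemonic,
//                                   salt = "mnemonic" + passphrase,
//                                   iterations = 2048, length = 64 bytes)
function mnemonicToSeed(mnemonic, passphrase = "") {
  const password = Buffer.from(mnemonic.normalize("NFKD"), "utf8");
  const salt = Buffer.from(("mnemonic" + passphrase).normalize("NFKD"), "utf8");
  return pbkdf2Sync(password, salt, 2048, 64, "sha512");
}

const mnemonic =
  "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about";
const seed = mnemonicToSeed(mnemonic);
console.log(seed.length);          // 64 bytes = 512 bits
console.log(seed.toString("hex"));
```

An optional passphrase changes the salt, so the same words with a different passphrase lead to an entirely different wallet.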


Derivation Paths

  • Derivation paths specify a systematic way to derive various keys from the master seed.

  • They allow users to recreate the same set of addresses and private keys from the seed across different wallets, ensuring interoperability and consistency.

  • A derivation path is typically expressed in a format like m / purpose' / coin_type' / account' / change / address_index.

  • m: Refers to the master node, or the root of the HD wallet.

  • purpose: A constant that defines the purpose of the wallet (e.g., 44' for BIP44, which is a standard for HD wallets).

  • coin_type: Indicates the type of cryptocurrency (e.g., 0' for Bitcoin, 60' for Ethereum, 501' for Solana).

  • account: Specifies the account number (e.g., 0' for the first account).

  • change: This is either 0 or 1, where 0 typically represents external addresses (receiving addresses), and 1 represents internal addresses (change addresses).

  • address_index: A sequential index to generate multiple addresses under the same account and change path.

Example Code for Deriving Paths and Generating Keys:
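The original example is not preserved. The sketch below is an assumption about one reasonable way to show derivation without dependencies: it implements SLIP-0010 hardened derivation for ed25519 (the scheme commonly used for Solana paths such as m/44'/501'/0'/0'), which needs only HMAC-SHA512. Bitcoin and Ethereum use BIP-32 derivation on secp256k1, which additionally requires elliptic-curve arithmetic and is not shown. Function names are hypothetical:

```javascript
const { createHmac } = require("node:crypto");

const HARDENED_OFFSET = 0x80000000;

// "m/44'/501'/0'/0'" -> [44, 501, 0, 0]. ed25519 supports hardened segments only.
function parseHardenedPath(path) {
  const [root, ...segments] = path.split("/");
  if (root !== "m") throw new Error("path must start with m");
  return segments.map((segment) => {
    if (!/^\d+'$/.test(segment)) throw new Error(`segment must be hardened: ${segment}`);
    return parseInt(segment, 10);
  });
}

function derivePath(path, seed) {
  // Master node: HMAC-SHA512 keyed with the curve label, applied to the seed.
  let digest = createHmac("sha512", "ed25519 seed").update(seed).digest();
  let key = digest.subarray(0, 32);
  let chainCode = digest.subarray(32);

  for (const index of parseHardenedPath(path)) {
    const indexBytes = Buffer.alloc(4);
    indexBytes.writeUInt32BE(index + HARDENED_OFFSET);
    // Child node: HMAC-SHA512(chainCode, 0x00 || parentKey || index).
    const data = Buffer.concat([Buffer.alloc(1, 0), key, indexBytes]);
    digest = createHmac("sha512", chainCode).update(data).digest();
    key = digest.subarray(0, 32);
    chainCode = digest.subarray(32);
  }
  return { key, chainCode };
}

// Example seed for illustration only; a real wallet passes the 64-byte BIP-39 seed.
const seed = Buffer.from("000102030405060708090a0b0c0d0e0f", "hex");
const first = derivePath("m/44'/501'/0'/0'", seed);
const second = derivePath("m/44'/501'/1'/0'", seed);
console.log(first.key.toString("hex"));  // 32-byte private key seed for account 0
console.log(second.key.toString("hex")); // a different key for account 1
```

Because every step is deterministic, the same seed and path always reproduce the same key, which is what lets different wallets restore identical accounts.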



Additional

💡 Can you guess the 12-word recovery phrase [Explanation with Calculations] 💡

Understanding the 12-Word Recovery Phrase

  1. Mnemonic Phrases (BIP39 Standard):

  • The 12-word recovery phrase is based on the BIP39 standard, which is commonly used to generate and restore wallets.

  • These phrases are used to generate the wallet's private key. The words are chosen from a specific list of 2,048 words (known as the BIP39 wordlist).

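  2. Combinatorial Explosion:

  • A 12-word recovery phrase can be any ordered sequence of 12 words from this list, and words may repeat.

  • The number of possible sequences of 12 words from a list of 2,048 words is astronomical. Strictly, the last 4 bits of a BIP39 phrase are a checksum, so only $2^{128}$ of these sequences are valid phrases, but that does not change the conclusion.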

Computation of Combinations

To compute the total number of possible 12-word combinations:

$\text{Total Combinations} = 2048^{12}$

Let's calculate that:

$2048^{12} = 2^{132} \approx 5.444517870735016 \times 10^{39}$

This is approximately $5.4 \times 10^{39}$ possible combinations.

Probability of Guessing Correctly

The probability of correctly guessing a 12-word recovery phrase is the inverse of the number of combinations:

$\text{Probability} = \frac{1}{2048^{12}} \approx 1.8 \times 10^{-40}$

This probability is incredibly small, making it nearly impossible to guess the correct recovery phrase by chance.

Computational Effort and Time

Let’s assume you could check a huge number of phrases per second:

Hypothetical Scenario:

  • Suppose you could check 1 billion ($10^9$) phrases per second. This is an unrealistically high number but will help illustrate the difficulty.

  • Number of seconds in a year: 31,536,000 seconds/year

  • Number of checks per year: $10^9 \times 31{,}536{,}000 \approx 3.1536 \times 10^{16}$

Even at this rate, it would take:

$\frac{5.4 \times 10^{39}}{3.1536 \times 10^{16}} \approx 1.7 \times 10^{23} \text{ years}$

This is longer than the current age of the universe (roughly $1.4 \times 10^{10}$ years) by many orders of magnitude.
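The arithmetic above can be checked exactly with BigInt. This is only a sketch of the calculation; the variable names are chosen for illustration and the checking rate is the hypothetical one from the text:

```javascript
// Every 12-word sequence from a 2,048-word list.
const combinations = 2048n ** 12n; // same value as 2n ** 132n

// Hypothetical attacker: one billion guesses per second, all year long.
const checksPerSecond = 10n ** 9n;
const secondsPerYear = 31_536_000n;
const years = combinations / (checksPerSecond * secondsPerYear);

console.log(combinations.toString());
// 5444517870735015415413993718908291383296
console.log(Number(years).toExponential(1)); // on the order of 1.7e+23 years
```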

Practical Considerations

  • Random Generation Is Impractical: Generating a random 12-word phrase and finding a matching wallet by brute force is practically impossible due to the enormous number of possible combinations.

  • Cryptographic Security: Modern cryptocurrencies are designed with security in mind, making brute force attacks infeasible.

Conclusion

It is theoretically possible to find a 12-word recovery phrase by luck or by generating random phrases, but the probability of success is so low that it is effectively impossible. Even with the most powerful computational resources, the time required would exceed the age of the universe by an unimaginable factor. Cryptocurrencies rely on this extremely low probability to ensure the security of wallet keys, making it virtually impossible to guess or brute-force someone's private key or recovery phrase.
