Public - Private Key Cryptography
This document summarizes my experience with public-private key cryptography across the Bitcoin, Ethereum, and Solana blockchains.
Last updated
Traditional banks rely on a username and password for authentication.
These credentials allow you to:
View your funds.
Transfer funds.
Review your transaction history.
Blockchain accounts are secured through a public-private keypair.
A public-private keypair consists of two keys used in asymmetric cryptography.
The public key is a string that can be shared openly with anyone.
It acts like your "account number" on the blockchain.
Example: Ethereum Address on Etherscan
The private key is a secret string that must be kept confidential.
It is used to sign transactions and prove ownership of the associated public key.
Never share your private key with anyone.
A bit is the smallest unit of data in a computer.
It can have one of two values: 0 or 1.
All programs and code you write are eventually converted to 0s and 1s.
Think of a bit like a light switch that can either be off (0) or on (1).
Bit Representation in JavaScript:
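The original snippet is not preserved in this copy. A minimal stand-in, assuming a plain numeric variable named x, might look like this:

```javascript
// A single bit held in an ordinary JavaScript number (illustrative only;
// JavaScript has no dedicated one-bit type).
const x = 0;
console.log(x); // 0
```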
Here, x represents a single bit with a value of 0.
A byte is a group of 8 bits.
It's the standard unit of data, and it is large enough to represent a single ASCII character in memory.
Since each bit can be either 0 or 1, a byte can have 2^8 (256) possible values, ranging from 0 to 255.
Example: The binary sequence 11001010 represents a specific value in decimal (we'll cover this in the assignment below).
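The original snippet is not preserved in this copy. As a sketch, assuming a variable named x that holds the byte, the conversion between binary and decimal can be checked with built-in JavaScript features:

```javascript
// One byte written with JavaScript's binary literal syntax (0b...).
const x = 0b11001010;

// 128 + 64 + 8 + 2 = 202
console.log(x); // 202

// Convert in both directions.
const asBinary = x.toString(2).padStart(8, "0"); // "11001010"
const asDecimal = parseInt("11001010", 2);       // 202
console.log(asBinary, asDecimal);
```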
Here, x is a byte, representing the decimal value 202, which is equivalent to 11001010 in binary.
Multiple bytes can be stored together in an array.
Using Uint8Array in JavaScript:
Uint8Array is a typed array in JavaScript that represents an array of 8-bit unsigned integers (bytes).
Memory Efficiency: Uses less space; each value takes only 1 byte.
Constraints: Every element stays within 0 to 255, the maximum value a byte can hold. Out-of-range values are not rejected; they wrap around modulo 256 (for example, 256 is stored as 0).
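The original snippet is not preserved in this copy. A minimal sketch with four arbitrary example values, which also shows what happens to out-of-range input:

```javascript
// Four bytes stored in a typed array (the values are arbitrary examples).
const bytes = new Uint8Array([0, 255, 127, 128]);

console.log(bytes.length);                 // 4 elements
console.log(bytes.byteLength);             // 4 bytes of backing storage
console.log(Uint8Array.BYTES_PER_ELEMENT); // 1

// Out-of-range values do not throw; they wrap modulo 256.
const wrapped = new Uint8Array([256, 257, -1]);
console.log(Array.from(wrapped)); // 0, 1, 255
```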
This code creates a Uint8Array with four bytes, each of which stays within the valid byte range.
Why Use Uint8Array Over Native Arrays?
JavaScript numbers are 64-bit floating-point values, so a native array can spend up to 8 bytes per number regardless of how small the number is.
Uint8Array stores each number using only 1 byte, which is sufficient for values between 0 and 255.
Uint8Array guarantees that no element exceeds 255. It does this by wrapping out-of-range values modulo 256 rather than raising an error, so inputs should still be validated before they are stored.
When working with computers, data is often represented in a format that is not human-readable, such as binary or bytes.
Encoding is the process of converting this data into a more readable format.
Some common encodings include ASCII, Hex, Base64, and Base58.
These encodings help us represent binary data in a more understandable way.
1 character = 7 bits
ASCII is one of the oldest encodings used to represent text in computers. Each character in ASCII corresponds to a specific number (ranging from 0 to 127), which is represented in binary.
For example, the letter 'A' is represented by the number 65 in ASCII, which is 01000001 in binary.
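As a small illustration of the mapping between characters, numbers, and bits, the sketch below uses only built-in JavaScript APIs. The sample string "Hello" is an arbitrary choice.

```javascript
// Character -> ASCII code -> binary string.
const code = "A".charCodeAt(0);                   // 65
const binary = code.toString(2).padStart(8, "0"); // "01000001"

// Text -> bytes and bytes -> text. TextEncoder emits UTF-8, which is
// identical to ASCII for code points 0 to 127.
const bytes = new TextEncoder().encode("Hello");
const text = new TextDecoder().decode(new Uint8Array([72, 101, 108, 108, 111]));

console.log(code, binary);
console.log(Array.from(bytes)); // 72, 101, 108, 108, 111
console.log(text);              // Hello
```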
💡 ASCII table - Table of ASCII codes, characters and symbols (ascii-code.com); HTML ASCII Reference (w3schools.com)
1 character = 4 bits
Hexadecimal is a base-16 encoding system that uses 16 characters: 0-9 and A-F. It is commonly used in programming and digital systems to represent binary data in a more compact and readable format.
Each hex digit represents four bits (a nibble), and two hex digits represent one byte.
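A short sketch of the byte-to-hex relationship, assuming a Node.js environment (Buffer is Node-specific) and four arbitrary example bytes:

```javascript
const bytes = new Uint8Array([202, 244, 1, 23]);

// Each byte becomes exactly two hex digits.
const hex = Buffer.from(bytes).toString("hex"); // "caf40117"

// Decoding reverses the process.
const decoded = new Uint8Array(Buffer.from(hex, "hex"));

// The same encoding without Buffer, one byte at a time.
const manual = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");

console.log(hex, manual, Array.from(decoded));
```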
💡 Hex (Base16) encoder & decoder, a simple online tool 🧰 (hexator.com)
1 character = 6 bits
Base64 is an encoding scheme that represents binary data in an ASCII string format. It uses 64 different characters (A-Z, a-z, 0-9, +, /). It is commonly used in data transfer, encoding images, and storing complex data as text.
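A minimal sketch of Base64 encoding and decoding, assuming Node.js and its Buffer class; the inputs are arbitrary examples:

```javascript
// Text -> Base64 -> text.
const encoded = Buffer.from("hello", "utf8").toString("base64"); // "aGVsbG8="
const decoded = Buffer.from(encoded, "base64").toString("utf8"); // "hello"

// Every 3 input bytes (24 bits) map to 4 output characters (4 x 6 bits).
const threeBytes = Buffer.from([202, 244, 1]).toString("base64");

console.log(encoded, decoded, threeBytes.length);
```

The trailing = in the first result is padding, added when the input length is not a multiple of 3 bytes.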
💡 Base64 Encode/Decode
Base58 is similar to Base64 but drops the characters that are easy to confuse visually (0 and O, I and l) as well as + and /, which makes the encoded output more user-friendly.
It is often used in Bitcoin and other cryptocurrencies for encoding addresses and other data.
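To make the idea concrete, here is a hedged sketch of a Base58 encoder and decoder using the Bitcoin alphabet. The function names base58Encode and base58Decode are illustrative; production code would normally rely on a maintained library.

```javascript
// Bitcoin's Base58 alphabet: no 0, O, I, or l.
const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

function base58Encode(bytes) {
  // Interpret the bytes as one big-endian integer.
  let value = 0n;
  for (const byte of bytes) value = value * 256n + BigInt(byte);

  // Repeatedly divide by 58, collecting remainders as digits.
  let encoded = "";
  while (value > 0n) {
    encoded = ALPHABET[Number(value % 58n)] + encoded;
    value /= 58n;
  }

  // Each leading zero byte is written as a leading "1".
  for (const byte of bytes) {
    if (byte !== 0) break;
    encoded = "1" + encoded;
  }
  return encoded;
}

function base58Decode(text) {
  let value = 0n;
  for (const char of text) {
    const digit = ALPHABET.indexOf(char);
    if (digit === -1) throw new Error(`Invalid Base58 character: ${char}`);
    value = value * 58n + BigInt(digit);
  }
  const bytes = [];
  while (value > 0n) {
    bytes.unshift(Number(value % 256n));
    value /= 256n;
  }
  // Restore one zero byte per leading "1".
  for (const char of text) {
    if (char !== "1") break;
    bytes.unshift(0);
  }
  return Uint8Array.from(bytes);
}

const sample = base58Encode(new TextEncoder().encode("hello world"));
console.log(sample, new TextDecoder().decode(base58Decode(sample)));
```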
Hashing converts data into a fixed-size string of characters, known as a hash.
Deterministic: The same input will always produce the same hash.
Fixed Size: Regardless of the input size, the output hash will always be the same length.
One-Way Function: It is computationally infeasible to recover the original input data from a hash.
Collision Resistance: It is computationally difficult to find two different inputs that produce the same hash.
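The first two properties are easy to observe directly. The sketch below assumes Node.js and its built-in crypto module; the helper name sha256Hex and the sample inputs are illustrative.

```javascript
const { createHash } = require("crypto");

// Hypothetical helper: hex-encoded SHA-256 digest of a string.
const sha256Hex = (input) => createHash("sha256").update(input).digest("hex");

const a = sha256Hex("blockchain");
const b = sha256Hex("blockchain");
const c = sha256Hex("Blockchain");           // one character changed
const big = sha256Hex("x".repeat(1_000_000)); // much larger input

console.log(a === b);              // true: deterministic
console.log(a.length, big.length); // 64 64: fixed size (64 hex digits = 256 bits)
console.log(a === c);              // false: a small change gives a different hash
```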
SHA-256: Widely used in blockchain technology, ensuring data integrity.
MD5: Once popular for checksums, now considered insecure due to vulnerabilities.
SHA-256 is a cryptographic hash function that outputs a 256-bit hash value. It is widely used in blockchain technology for ensuring data integrity, most notably in Bitcoin (Ethereum relies mainly on a different hash function, Keccak-256). The function maps data of any size to a fixed-size hash, and recovering the original input from the hash is computationally infeasible.
Mathematical Equation for SHA-256
The SHA-256 algorithm involves a series of logical and bitwise operations applied to an input message. Here's a simplified view of the SHA-256 hashing process:
Padding: The input message is padded to a length that is a multiple of 512 bits.
Initialize Hash Values: SHA-256 initializes eight constant hash values ($H_0, H_1, H_2, \ldots, H_7$).
Process Message in 512-bit Chunks: The padded message is divided into 512-bit blocks. For each block, SHA-256 uses logical functions and bitwise operations like AND, OR, XOR, and right rotations to compress the message.
Final Hash Value: After processing all chunks, the final 256-bit hash is formed by concatenating the updated hash values: $H_0 \,\|\, H_1 \,\|\, \cdots \,\|\, H_7$.
Example of SHA-256 Calculation:
For the input "hello world", the resulting SHA-256 hash is:
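The value itself is not preserved in this copy, but it can be reproduced. A minimal sketch, assuming Node.js and its built-in crypto module:

```javascript
const { createHash } = require("crypto");

const digest = createHash("sha256").update("hello world", "utf8").digest("hex");

console.log(digest);
// b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
console.log(digest.length * 4); // 256 (bits)
```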
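The SHA-256 compression function is described with the following symbols:
$H_i$ is the $i$-th hash value
$\mathrm{Ch}$ is the choose function
$\Sigma_1$ is the uppercase sigma 1 function
$W_i$ is the $i$-th word of the message schedule
$K_i$ is the $i$-th round constant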
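The equation these symbols belong to is not preserved in this copy. As a reconstruction from the SHA-256 specification (FIPS 180-4) rather than the author's own formula, each of the 64 rounds computes the following. The working variables $a, \ldots, h$ start out as $H_0, \ldots, H_7$, and $\Sigma_0$ and $\mathrm{Maj}$ are the specification's companion functions, which the list above does not mention.

$$
\begin{aligned}
T_1 &= h + \Sigma_1(e) + \mathrm{Ch}(e, f, g) + K_i + W_i \\
T_2 &= \Sigma_0(a) + \mathrm{Maj}(a, b, c) \\
\mathrm{Ch}(x, y, z) &= (x \land y) \oplus (\lnot x \land z) \\
\Sigma_1(x) &= \mathrm{ROTR}^{6}(x) \oplus \mathrm{ROTR}^{11}(x) \oplus \mathrm{ROTR}^{25}(x)
\end{aligned}
$$

All additions are modulo $2^{32}$. After each round the working variables shift down by one position, with $a = T_1 + T_2$ and $e = d + T_1$.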
SHA-256 Implementation (Python):
MD5 is another cryptographic hash function that produces a 128-bit hash value. It was once widely used for verifying data integrity but is now considered insecure due to its susceptibility to collision attacks.
Mathematical Equation for MD5
The MD5 algorithm processes the message in 512-bit chunks and outputs a 128-bit hash. Like SHA-256, MD5 also involves padding and dividing the message into blocks.
Padding: The input message is padded so that its length is congruent to 448 modulo 512 (i.e., 64 bits less than a multiple of 512). As in SHA-256, a single 1 bit is appended, followed by 0 bits, and the final 64 bits store the original message length.
Initialize MD5 Constants: MD5 initializes four 32-bit variables (A, B, C, D) to fixed constants: 0x67452301, 0xefcdab89, 0x98badcfe, and 0x10325476.
Process Message in 512-bit Chunks: For each 512-bit chunk, the algorithm performs 64 iterations divided into four rounds. Each round applies a non-linear function (F, G, H, I), bitwise operations, and additions.
Final Hash Value: After processing all the chunks, the final 128-bit MD5 hash is formed by concatenating the updated variables A, B, C, and D.
Example of MD5 Calculation:
For the input "hello world", the resulting MD5 hash is:
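The value itself is not preserved in this copy, but it can be reproduced. A minimal sketch, assuming Node.js and its built-in crypto module (MD5 is shown for illustration only, not as a recommendation):

```javascript
const { createHash } = require("crypto");

const md5Digest = createHash("md5").update("hello world", "utf8").digest("hex");

console.log(md5Digest);
// 5eb63bbbe01eeed093cb22bb8f5acdc3
console.log(md5Digest.length * 4); // 128 (bits)
```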
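Each MD5 step replaces $a_i$ with $b_i + ((a_i + F(b_i, c_i, d_i) + X[k] + T[i]) \lll s)$, the step operation defined in RFC 1321, where:
$a_i, b_i, c_i, d_i$ are 32-bit words
$F$ is a nonlinear function
$X[k]$ is the $k$-th 32-bit word of the message
$T[i]$ is the $i$-th element of the table of constants
$\lll s$ denotes a left bit rotation by $s$ positions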
SHA-256: Considered secure and widely used in blockchain for data integrity and digital signatures.
MD5: Known for vulnerabilities (collision attacks), making it insecure for cryptographic uses but still used in non-critical scenarios like file checksums.
Feature | SHA-256 | MD5 |
---|---|---|
Output Size | 256 bits | 128 bits |
Security | Secure for most uses | Vulnerable to collisions |
Use in Blockchain | Widely used | Not recommended |
Speed | Slower | Faster |
Encryption converts plaintext into ciphertext using an algorithm and a key.
Reversible: With the correct key, the ciphertext can be decrypted back to plaintext.
Key-Dependent: The security of encryption relies on the secrecy of the key.
1. Symmetric Encryption:
Definition: The same key is used for both encryption and decryption.
Common Algorithms:
AES (Advanced Encryption Standard)
DES (Data Encryption Standard), now considered insecure because of its 56-bit key
💡 AES Encryption and Decryption Online (devglan.com)
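The "same key on both sides" idea can be sketched with AES-256-GCM from Node's built-in crypto module. The key here is random and hypothetical, and the function names encrypt and decrypt are illustrative.

```javascript
const crypto = require("crypto");

// Hypothetical shared secret: a random 256-bit key known to both parties.
const key = crypto.randomBytes(32);

function encrypt(plaintext) {
  const iv = crypto.randomBytes(12); // fresh 96-bit nonce for every message
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

function decrypt({ iv, ciphertext, tag }) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag); // decryption fails if the data was modified
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}

const sealed = encrypt("send 1 BTC to Alice");
const opened = decrypt(sealed);
console.log(sealed.ciphertext.toString("hex"), opened);
```

GCM is used here because it also authenticates the ciphertext; this is a design choice for the sketch, not something the text above prescribes.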
2. Asymmetric Encryption:
Uses a pair of keys, a public key and a private key, for encryption and decryption.
Public Key: Can be shared openly and is used to encrypt data.
Private Key: Must be kept confidential and is used to decrypt data encrypted with the corresponding public key.
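A minimal sketch of this encrypt-with-public, decrypt-with-private flow, assuming Node.js and a freshly generated 2048-bit RSA key pair (the message is an arbitrary example):

```javascript
const crypto = require("crypto");

// Generate a throwaway RSA key pair for the demonstration.
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
});

// Anyone holding the public key can encrypt...
const rsaCiphertext = crypto.publicEncrypt(publicKey, Buffer.from("hello", "utf8"));

// ...but only the private key holder can decrypt.
const rsaPlaintext = crypto.privateDecrypt(privateKey, rsaCiphertext).toString("utf8");

console.log(rsaCiphertext.length, rsaPlaintext); // 256 hello
```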
RSA (Rivest-Shamir-Adleman)
ECC (Elliptic Curve Cryptography) - Used by ETH and BTC
EdDSA (Edwards-curve Digital Signature Algorithm) - Used by SOL
secp256k1: Used in Bitcoin (BTC) and Ethereum (ETH).
ed25519: Used in Solana (SOL).
Use Cases of Public-Key Cryptography:
SSL/TLS Certificates: Ensuring secure communication over the internet.
SSH Keys: For secure server access or pushing code to GitHub.
Blockchains and Cryptocurrencies: Ensuring secure and verifiable transactions.
💡 A message on the blockchain is signed using the private key.
A miner verifies the transaction using the signature and the public key.
Blockchain Demo: Public / Private Keys & Signing (andersbrownworth.com)
EdDSA - Edwards-curve Digital Signature Algorithm - ED25519
ECDSA (Elliptic Curve Digital Signature Algorithm) - secp256k1
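The original examples for these two schemes are not preserved in this copy. As a stand-in sketch using only Node's built-in crypto module, the code below signs and verifies one message with each scheme. The message text is arbitrary, and the SHA-256 digest for ECDSA is an illustrative choice: real chains apply their own transaction hashing before signing.

```javascript
const crypto = require("crypto");

const message = Buffer.from("transfer 5 SOL to Bob", "utf8");

// Ed25519 (EdDSA): hashing is part of the algorithm, so the digest argument is null.
const ed = crypto.generateKeyPairSync("ed25519");
const edSignature = crypto.sign(null, message, ed.privateKey);
const edValid = crypto.verify(null, message, ed.publicKey, edSignature);

// ECDSA over the secp256k1 curve, signing a SHA-256 digest of the message.
const ec = crypto.generateKeyPairSync("ec", { namedCurve: "secp256k1" });
const ecSignature = crypto.sign("sha256", message, ec.privateKey);
const ecValid = crypto.verify("sha256", message, ec.publicKey, ecSignature);

// A signature does not verify against a different message.
const tampered = Buffer.from("transfer 500 SOL to Bob", "utf8");
const tamperedValid = crypto.verify(null, tampered, ed.publicKey, edSignature);

console.log(edValid, ecValid, tamperedValid); // true true false
```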
HD wallets generate a tree of key pairs from a single seed, allowing users to manage multiple addresses from one root seed.
Problem:
Traditionally, maintaining multiple wallets required storing multiple public-private key pairs.
This is cumbersome and risky, as losing any one of these keys can result in the loss of associated funds.
Bitcoin Improvement Proposal 32 (BIP-32), introduced by Bitcoin Core developer Pieter Wuille in 2012, addresses this problem by standardizing the derivation of private and public keys from a single master seed.
BIP-32 introduced the concept of hierarchical deterministic (HD) wallets, which use a tree-like structure to manage multiple accounts easily.
Mnemonics
A mnemonic phrase, or seed phrase, is a human-readable sequence of words used to generate a cryptographic seed.
BIP-39, which builds on BIP-32, defines how mnemonic phrases are generated and converted into a seed.
Example Code to Generate a Mnemonic:
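The original snippet is not preserved in this copy. As a stand-in, the sketch below follows the BIP-39 steps for a 12-word phrase using only Node's crypto module. It stops at the twelve word indices because the 2,048-word list is too long to embed; the function name entropyToWordIndices is illustrative.

```javascript
const crypto = require("crypto");

// Turn 128 bits of entropy into the twelve 11-bit word indices defined by BIP-39.
function entropyToWordIndices(entropy) {
  if (entropy.length !== 16) throw new Error("This sketch handles 128-bit entropy only");

  // Checksum: the first 4 bits (entropy bits / 32) of SHA-256(entropy).
  const hash = crypto.createHash("sha256").update(entropy).digest();
  const entropyBits = Array.from(entropy, (b) => b.toString(2).padStart(8, "0")).join("");
  const checksumBits = hash[0].toString(2).padStart(8, "0").slice(0, 4);
  const bits = entropyBits + checksumBits; // 128 + 4 = 132 bits

  // Split into 12 groups of 11 bits; each group is an index from 0 to 2047.
  const indices = [];
  for (let i = 0; i < bits.length; i += 11) {
    indices.push(parseInt(bits.slice(i, i + 11), 2));
  }
  return indices;
}

const indices = entropyToWordIndices(crypto.randomBytes(16));
console.log(indices); // twelve numbers, each selecting one word from the list
```

Looking up each index in the BIP-39 English wordlist would give the final phrase.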
💡 Reference:
Example of where this is done in Backpack: GitHub Link
Seed Phrase
The seed is a 512-bit binary value derived from the mnemonic phrase (BIP-39 uses the PBKDF2 key-stretching function for this). This seed is used to generate the master private key.
Example Code to Generate a Seed from a Mnemonic:
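The original snippet is not preserved in this copy. As a stand-in sketch, BIP-39 seed derivation can be expressed with Node's built-in PBKDF2. The function name mnemonicToSeed is illustrative, and the phrase below is a widely published test phrase that must never hold real funds.

```javascript
const crypto = require("crypto");

// BIP-39: PBKDF2-HMAC-SHA512, 2048 iterations, salt "mnemonic" + passphrase, 64-byte output.
function mnemonicToSeed(mnemonic, passphrase = "") {
  const password = Buffer.from(mnemonic.normalize("NFKD"), "utf8");
  const salt = Buffer.from(("mnemonic" + passphrase).normalize("NFKD"), "utf8");
  return crypto.pbkdf2Sync(password, salt, 2048, 64, "sha512");
}

// Public test phrase, for demonstration only.
const mnemonic =
  "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about";

const seed = mnemonicToSeed(mnemonic);
console.log(seed.length);          // 64 bytes = 512 bits
console.log(seed.toString("hex"));
```

An optional passphrase changes the salt, so the same words with a different passphrase lead to a completely different seed.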
Reference:
Example in Backpack: GitHub Link
Derivation Paths
Derivation paths specify a systematic way to derive various keys from the master seed.
They allow users to recreate the same set of addresses and private keys from the seed across different wallets, ensuring interoperability and consistency.
A derivation path is typically expressed in a format like m / purpose' / coin_type' / account' / change / address_index.
m: Refers to the master node, or the root of the HD wallet.
purpose: A constant that defines the purpose of the wallet (e.g., 44' for BIP44, which is a standard for HD wallets).
coin_type: Indicates the type of cryptocurrency (e.g., 0' for Bitcoin, 60' for Ethereum, 501' for Solana).
account: Specifies the account number (e.g., 0' for the first account).
change: This is either 0 or 1, where 0 typically represents external addresses (receiving addresses), and 1 represents internal addresses (change addresses).
address_index: A sequential index to generate multiple addresses under the same account and change path.
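To show how such a path breaks down, here is a small hedged sketch of a parser. The function name parseDerivationPath is illustrative. The apostrophe marks hardened derivation, which BIP-32 encodes by adding 2^31 to the index.

```javascript
const HARDENED_OFFSET = 0x80000000; // 2^31

function parseDerivationPath(path) {
  const [root, ...segments] = path.replace(/\s+/g, "").split("/");
  if (root !== "m") throw new Error("Path must start with m");

  return segments.map((segment) => {
    if (!/^\d+'?$/.test(segment)) throw new Error(`Invalid path segment: ${segment}`);
    const hardened = segment.endsWith("'");
    const index = Number(hardened ? segment.slice(0, -1) : segment);
    if (index >= HARDENED_OFFSET) throw new Error(`Index too large: ${segment}`);
    // childNumber is the 32-bit value actually fed into key derivation.
    return { index, hardened, childNumber: hardened ? index + HARDENED_OFFSET : index };
  });
}

// Ethereum's first address under BIP44: purpose 44', coin type 60', account 0', change 0, index 0.
const parsed = parseDerivationPath("m/44'/60'/0'/0/0");
console.log(parsed);
```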
Example Code for Deriving Paths and Generating Keys:
Reference:
Solana-specific Implementation:
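The original snippet is not preserved in this copy. As a stand-in, the sketch below implements hardened-only Ed25519 derivation as specified in SLIP-0010, which is the scheme Solana wallets commonly use, with Node's built-in HMAC. The path m/44'/501'/account'/0', the function names, and the demo seed are assumptions for illustration, not the author's code.

```javascript
const crypto = require("crypto");

const HARDENED_OFFSET = 0x80000000;

// Master node: HMAC-SHA512 keyed with the fixed string "ed25519 seed".
function masterNode(seed) {
  const I = crypto.createHmac("sha512", "ed25519 seed").update(seed).digest();
  return { key: I.subarray(0, 32), chainCode: I.subarray(32) };
}

// Hardened child: HMAC-SHA512(chainCode, 0x00 || parentKey || index as 32-bit big-endian).
function hardenedChild(parent, index) {
  const data = Buffer.alloc(37); // byte 0 stays 0x00
  parent.key.copy(data, 1);
  data.writeUInt32BE(index + HARDENED_OFFSET, 33);
  const I = crypto.createHmac("sha512", parent.chainCode).update(data).digest();
  return { key: I.subarray(0, 32), chainCode: I.subarray(32) };
}

// Walk m/44'/501'/account'/0' one level at a time.
function deriveSolanaKey(seed, account) {
  return [44, 501, account, 0].reduce(hardenedChild, masterNode(seed));
}

// Hypothetical demo seed; a real wallet would use the 64-byte BIP-39 seed.
const demoSeed = Buffer.from("000102030405060708090a0b0c0d0e0f", "hex");
const first = deriveSolanaKey(demoSeed, 0);
const second = deriveSolanaKey(demoSeed, 1);
console.log(first.key.toString("hex"), second.key.toString("hex"));
```

Each 32-byte key is an Ed25519 private seed; a keypair library would expand it into the public key that serves as the account address.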
💡 Can you guess the 12-word recovery phrase? [Explanation with Calculations] 💡
Understanding the 12-Word Recovery Phrase
Mnemonic Phrases (BIP39 Standard):
The 12-word recovery phrase is based on the BIP39 standard, which is commonly used to generate and restore wallets.
These phrases are used to generate the wallet's private key. The words are chosen from a specific list of 2,048 words (known as the BIP39 wordlist).
Combinatorial Explosion:
A 12-word recovery phrase can be any combination of 12 words from this list.
The number of possible combinations of 12 words from a list of 2,048 words is astronomical.
Computation of Combinations
To compute the total number of possible 12-word combinations:
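$$\text{Total combinations} = 2048^{12}$$
Let's calculate that:
$$2048^{12} = 2^{132} \approx 5.444517870735016 \times 10^{39}$$
This is approximately $5.4 \times 10^{39}$ possible combinations. Strictly speaking, the final 4 bits of a 12-word BIP39 phrase are a checksum, so only $2^{128} \approx 3.4 \times 10^{38}$ of these are valid phrases, which is still astronomically large.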
Probability of Guessing Correctly
The probability of correctly guessing a 12-word recovery phrase is the inverse of the number of combinations:
$$\text{Probability} = \frac{1}{2048^{12}} \approx 1.8 \times 10^{-40}$$
This probability is incredibly small, making it nearly impossible to guess the correct recovery phrase by chance.
Computational Effort and Time
Let's assume you could check a huge number of phrases per second:
Hypothetical Scenario:
Suppose you could check 1 billion ($10^9$) phrases per second. This is an unrealistically high number but will help illustrate the difficulty.
Number of seconds in a year: 31,536,000 seconds/year
Number of checks per year: $10^9 \times 31{,}536{,}000 = 3.1536 \times 10^{16}$
Even at this rate, it would take:
$$\frac{5.4 \times 10^{39}}{3.1536 \times 10^{16}} \approx 1.71 \times 10^{23} \text{ years}$$
This is longer than the current age of the universe by many orders of magnitude.
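The arithmetic above can be checked exactly with JavaScript's BigInt. The rate of one billion guesses per second is the same hypothetical assumption used in the text.

```javascript
const WORDLIST_SIZE = 2048n;
const WORDS = 12n;

// Every possible sequence of 12 words.
const total = WORDLIST_SIZE ** WORDS;

const guessesPerSecond = 10n ** 9n;  // hypothetical rate
const secondsPerYear = 31_536_000n;  // 365 days
const years = total / (guessesPerSecond * secondsPerYear);

console.log(total.toString());        // 5444517870735015415413993718908291383296
console.log(total === 2n ** 132n);    // true
console.log(years.toString().length); // 24 digits, i.e. on the order of 10^23 years
```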
Practical Considerations
Random Generation Is Impractical: Generating a random 12-word phrase and finding a matching wallet by brute force is practically impossible due to the enormous number of possible combinations.
Cryptographic Security: Modern cryptocurrencies are designed with security in mind, making brute force attacks infeasible.
Conclusion
It is theoretically possible to find a 12-word recovery phrase by luck or by generating random phrases, but the probability of success is so low that it is effectively impossible.
Even with the most powerful computational resources, the time required would exceed the age of the universe by an unimaginable factor.
Cryptocurrencies rely on this extremely low probability to ensure the security of wallet keys, making it virtually impossible to guess or brute-force someone's private key or recovery phrase.