# EIP-6690: EVM Modular Arithmetic Extensions (EVMMAX)

Create modular addition, subtraction, and multiplication opcodes.

- Status: Draft
- Type: Standards Track
- Category: Core
- Authors: Jared Wasinger (@jwasinger), Alex Beregszaszi (@axic)
- Created: 2023-03-15
- Discussion Link: https://ethereum-magicians.org/t/eip-6690-evm-modular-arithmetic-extensions-evmmax-decoupled-from-eof/13322

## Abstract

This EIP proposes the addition of new optimized modular addition, subtraction and multiplication opcodes to the EVM. These support odd moduli up to 4096 bits in size.

## Motivation

Benefits of the changes proposed in this EIP:

- Enables elliptic curve arithmetic operations on various curves, including BLS12-381, to be implemented as EVM contracts.
- For operations on values up to 256 bits in size, reduces gas cost per operation by 90-95% compared to the current MULMOD and ADDMOD opcodes.
- In all cases where the modexp precompile is useful, it could now be implemented as an EVM contract.
- Enables substantial cost reductions for algebraic hash functions (e.g. MiMC/Poseidon) and ZKP verification in the EVM.

## Specification

### Overview

During contract execution, a contract calls a setup instruction SETUPX, sourcing a modulus from a specified memory offset/size and computing several parameters used to speed up modular multiplication (referred to as “Montgomery” parameters). A zeroed memory space (whose size is a stack parameter passed to SETUPX) is allocated separately from EVM memory.
The modulus, computed parameters and memory space are associated with the current call frame state and referred to as the active modulus state. If SETUPX is called again to switch to a different modulus, the memory space and Montgomery parameters of the previous active modulus state remain allocated (the memory spaces of the active and previously-active modulus states are separate).

New store and load opcodes STOREX/LOADX are used to copy multiple values between EVM memory and the memory space of the active modulus state.

Arithmetic is performed with the ADDMODX/SUBMODX/MULMODX opcodes, which take and return no stack items and require a 3-byte immediate value appended to the opcode. The immediate is interpreted as three 1-byte values z, x, y, which are indexes into the array of EVMMAX values that comprise the memory space of the active modulus state. An arithmetic operation is performed on the inputs at indexes x and y, placing the result at index z.

### Conventions

- `x === y % m`: `x % m == y % m`
- `pow(x, -1, m)`: the modular multiplicative inverse of `x` with respect to modulus `m`.
- Opcode definition syntax is formatted as `mnemonic {immediate - type} {immediate2 - type} ...: stack_arg_1, stack_arg_2, ...` where immediates are listed in the order in which they follow the opcode and stack arguments are ordered starting at the top of the stack.
- In the provided pseudocode, it is assumed that opcode gas charging logic is executed prior to execution logic.
- Any exception thrown should immediately end the current execution frame and return to the caller.
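The immediate encoding described in the overview can be sketched as follows. This is a minimal illustration, not the EIP's reference code: `decode_immediate` and `addmodx_sketch` are hypothetical names, the value array holds plain Python integers rather than Montgomery-form values, and the opcode byte `0x22` is taken from the opcode table in this document.

```python
from typing import List, Tuple


def decode_immediate(code: bytes, pc: int) -> Tuple[int, int, int]:
    """Return the (z, x, y) slot indexes stored in the 3 bytes after the opcode at `pc`."""
    imm = code[pc + 1:pc + 4]
    if len(imm) != 3:
        raise ValueError("truncated immediate")
    z, x, y = imm  # iterating over bytes yields ints in the range 0..255
    return z, x, y


def addmodx_sketch(values: List[int], mod: int, code: bytes, pc: int) -> None:
    """values[z] <- (values[x] + values[y]) % mod, with bounds checks on all three indexes."""
    z, x, y = decode_immediate(code, pc)
    if max(z, x, y) >= len(values):
        raise IndexError("out of bounds value reference")
    values[z] = (values[x] + values[y]) % mod


# Opcode byte 0x22 (ADDMODX) followed by the immediate z=2, x=0, y=1.
code = bytes([0x22, 2, 0, 1])
values = [5, 9, 0]
addmodx_sketch(values, 11, code, 0)

# The pow(x, -1, m) convention: modular multiplicative inverse (Python 3.8+).
inv = pow(3, -1, 11)
```

Because the indexes live in the bytecode rather than on the stack, an arithmetic instruction needs no stack traffic, at the price of limiting each modulus state to 256 addressable values.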
### Constants

| Name | Value | Description |
|---|---|---|
| `STOREX_BASE_GAS` | 3 | base gas cost for STOREX opcode |
| `LOADX_BASE_GAS` | 3 | base gas cost for LOADX opcode |
| `SETUPX_BASE_GAS` | 3 | base gas cost for SETUPX opcode |
| `EVMMAX_MAX_MEM` | 65,536 bytes | maximum amount of EVMMAX memory that can be used in a call frame |
| `MAX_MOD_SIZE` | 4096 bits | tentative modulus size limit (can probably be removed because `EVMMAX_MAX_MEM` effectively caps the modulus size) |
| `MULMODX_SUBQUADRATIC_START` | 50 | modulus size in multiples of 8 bytes where we switch to the subquadratic mulmont cost model |
| `SYSTEM_WORD_SIZE_BITS` | varies depending on the system | word size in bits of a client’s CPU |

### Context Variables

| Name | Type | Meaning |
|---|---|---|
| `evmmax_state` | `EVMMAXState` | a variable representing ephemeral state which exists for the duration of the current call and in the scope of the current call frame |
| `evm_memory` | `bytes` | EVM memory for the current call context |
| `expand_evm_memory` | `func(size_words: int)` | expands EVM memory by `size_words * 32` bytes |
| `cost_evm_memory_expansion` | `func(new_size_evm_words: int) -> int` | EVM memory expansion cost function, modified according to this EIP |
| `evm_stack` | `object` | allows access to the stack via `pop()` and `peek(n)`, which return `int` stack elements |
| `contract_code` | `bytes` | code of the currently-executing contract |
| `pc` | `int` | EVM program counter |

    import math

    class EVMMAXState():
        def __init__(self):
            # ModState currently being used
            self.active_mod_state = None
            # a lookup of mod_id (int) -> ModState
            self.mods = {}

    class ModState():
        def __init__(self, mod: int, num_vals_used: int, r: int, r_squared: int, mod_inv_full=None, mod_inv=None):
            self.mod = mod
            # size (expressed in multiples of 8 bytes) needed to represent mod
            self.val_size_multiplier = math.ceil(len(hex(mod)[2:]) / (2 * 8))
            self.num_vals_used = num_vals_used
            self.mod_inv = mod_inv
            self.mod_inv_full = mod_inv_full
            self.r = r
            self.r_squared = r_squared
            # a memory space of num_vals_used values, each val_size_multiplier * 8 bytes wide
            self.values = [0] * self.num_vals_used

### Helpers

    # -----------------------------------------------------------------------------
    # gas-charging helpers

    def cost_precompute_mont(val_size_multiplier: int) -> int:
        # the four coefficients below are left unspecified ("?") in this draft
        PRECOMPUTE_MONT_LO_GAS_A = ?
        PRECOMPUTE_MONT_LO_GAS_B = ?
        PRECOMPUTE_MONT_HI_GAS_A = ?
        PRECOMPUTE_MONT_HI_GAS_B = ?

        cost = 0
        if val_size_multiplier < MULMODX_SUBQUADRATIC_START:
            cost = math.ceil(PRECOMPUTE_MONT_LO_GAS_A * val_size_multiplier + \
                             PRECOMPUTE_MONT_LO_GAS_B)
        else:
            cost = math.ceil(PRECOMPUTE_MONT_HI_GAS_A * val_size_multiplier + \
                             PRECOMPUTE_MONT_HI_GAS_B)
        return cost

    def cost_addmodx(val_size_multiplier: int) -> int:
        ADDMODX_GAS_A = 0.20
        ADDMODX_GAS_B = 0.15

        cost = 0
        if val_size_multiplier == 6:
            cost = 1
        else:
            cost = round(ADDMODX_GAS_A * val_size_multiplier + ADDMODX_GAS_B)
        if cost == 0:
            cost = 1
        return cost

    def cost_mulmodx(val_size_multiplier: int) -> int:
        MULMODX_LO_GAS_A = 0.090
        MULMODX_LO_GAS_B = 0
        MULMODX_LO_GAS_C = 0.24

        MULMODX_HI_GAS_A = 0
        MULMODX_HI_GAS_B = 10.0
        MULMODX_HI_GAS_C = -270.0

        cost = 0
        if val_size_multiplier == 6:
            cost = 2
        elif val_size_multiplier < MULMODX_SUBQUADRATIC_START:
            cost = math.ceil(MULMODX_LO_GAS_A * (val_size_multiplier ** 2) + \
                             MULMODX_LO_GAS_B * val_size_multiplier + \
                             MULMODX_LO_GAS_C)
        else:
            cost = math.ceil(MULMODX_HI_GAS_A * val_size_multiplier ** 2 + \
                             MULMODX_HI_GAS_B * val_size_multiplier + \
                             MULMODX_HI_GAS_C)
        if cost == 0:
            cost = 1
        return cost

    # -----------------------------------------------------------------------------
    # bigint helpers
    # a bigint is an unsigned number represented as a list of unsigned system words
    # in descending order of significance

    # split a double-width value into hi/low words
    def hi_lo(double_width_val: int) -> (int, int):
        base = 2**SYSTEM_WORD_SIZE_BITS
        assert double_width_val < base**2, "val must fit in two words"
        return (double_width_val >> SYSTEM_WORD_SIZE_BITS) % base, double_width_val % base

    def bigint_to_int(x: [int]) -> int:
        res = 0
        for i in reversed(range(len(x))):
            res += x[i] * 2**(SYSTEM_WORD_SIZE_BITS * (len(x) - i - 1))
        return res

    def int_to_bigint(x: int, word_count: int):
        res = [0] * word_count
        for i in range(word_count):
            res[word_count - i - 1] = x & (2**SYSTEM_WORD_SIZE_BITS - 1)
            x >>= SYSTEM_WORD_SIZE_BITS
        return res

    # return (borrow_out, (x - y - borrow_in) mod 2**SYSTEM_WORD_SIZE_BITS)
    # note: sub_with_borrow is referenced but not defined in the original text;
    # this definition is an assumption about its intended behavior
    def sub_with_borrow(x: int, y: int, borrow: int) -> (int, int):
        res = x - y - borrow
        if res < 0:
            return 1, res + 2**SYSTEM_WORD_SIZE_BITS
        return 0, res

    # return x - y (omitting borrow-out)
    def bigint_sub(x: [int], y: [int]) -> [int]:
        num_words = len(x)
        res = [0] * num_words
        c = 0
        for i in reversed(range(num_words)):
            c, res[i] = sub_with_borrow(x[i], y[i], c)
        return res

    # return x >= y
    def bigint_gte(x: [int], y: [int]) -> bool:
        for (x_word, y_word) in list(zip(x, y)):
            if x_word > y_word:
                return True
            elif x_word < y_word:
                return False
        # x == y
        return True

    # CIOS Montgomery multiplication algorithm
    #
    # input:
    # * x, y, mod - bigint inputs of equal word length. the most significant word of the modulus cannot be zero.
    # * mod_inv - pow(-mod, -1, 2**SYSTEM_WORD_SIZE_BITS)
    # requires:
    # * x < mod and y < mod
    # * mod_int % 2 != 0
    # * mod[0] != 0
    # returns:
    #   (x * y * pow(2**(SYSTEM_WORD_SIZE_BITS * len(mod)), -1, mod)) % mod represented as a bigint
    # note: references to x_int/y_int/mod_int/t_int refer to the python int representation
    # of the corresponding bigint variable
    def mulmont_quadratic(x: [int], y: [int], mod: [int], mod_inv: int) -> [int]:
        assert len(x) == len(y) and len(y) == len(mod), "{}, {}, {}".format(x, y, mod)
        assert mod[0] != 0, "modulus must occupy all words"

        base = 2**SYSTEM_WORD_SIZE_BITS
        word_count = len(mod)
        t = [0] * (word_count + 2)

        for i in reversed(range(word_count)):
            # first inner-loop: t <- t + x_int * y[i]
            c = 0
            for j in reversed(range(word_count)):
                c, t[j + 2] = hi_lo(t[j + 2] + x[j] * y[i] + c)

            t[0], t[1] = hi_lo(t[1] + c)

            m = (mod_inv * t[-1]) % base
            c, _ = hi_lo(m * mod[-1] + t[-1])

            # second inner-loop:
            #   1. t_int <- t_int + m * mod_int, where m = (mod_inv * t[-1]) % 2**SYSTEM_WORD_SIZE_BITS
            #   2. t_int <- t_int // (2**SYSTEM_WORD_SIZE_BITS)
            # note:
            #   after step 1:
            #   * mod_inv * mod_int === -1 % (2**SYSTEM_WORD_SIZE_BITS), so m * mod_int === -t[-1] % (2**SYSTEM_WORD_SIZE_BITS)
            #   * t_int === 0 % (2**SYSTEM_WORD_SIZE_BITS)
            #   so the division in step 2 is an exact word-sized right shift.
            #   Steps 1 and 2 are combined and the shift is implicit.
            for j in reversed(range(1, word_count)):
                c, t[j + 2] = hi_lo(t[j + 1] + mod[j - 1] * m + c)

            hi, t[2] = hi_lo(t[1] + c)
            t[1] = t[0] + hi

        # at this point t_int === x_int * y_int * pow(2**(SYSTEM_WORD_SIZE_BITS * len(mod)), -1, mod_int) % mod_int
        # and 0 <= t_int < 2 * mod_int
        t = t[1:]
        if t[0] != 0:
            # result occupies len(mod) + 1 words so it must be greater than modulus
            return bigint_sub(t, [0] + mod)[1:]
        elif bigint_gte(t[1:], mod):
            return bigint_sub(t[1:], mod)
        else:
            return t[1:]

    # subquadratic mulmont: same general algorithm as mulmont_quadratic with the assumption
    # that any multiplications will be performed using the Karatsuba subquadratic multiplication algorithm
    # input:
    #   x, y (int) - x < mod and y < mod
    #   mod (int) - an odd modulus
    #   R (int) - a power of two, and greater than mod
    #   mod_inv_full (int) - pow(-mod, -1, R)
    # output:
    #   (x * y * pow(R, -1, mod)) % mod
    #
    def mulmont_subquadratic(x: int, y: int, mod: int, mod_inv_full: int, R: int) -> int:
        T = x * y
        m = ((T % R) * mod_inv_full) % R
        T = T + m * mod
        T //= R
        if T >= mod:
            T -= mod
        return T

    def mulmont(mod_state: ModState, x: int, y: int) -> int:
        if mod_state.val_size_multiplier >= MULMODX_SUBQUADRATIC_START:
            return mulmont_subquadratic(x, y, mod_state.mod, mod_state.mod_inv_full, mod_state.r)
        else:
            word_count = (mod_state.val_size_multiplier * 64) // SYSTEM_WORD_SIZE_BITS
            x_bigint = int_to_bigint(x, word_count)
            y_bigint = int_to_bigint(y, word_count)
            mod_bigint = int_to_bigint(mod_state.mod, word_count)
            return bigint_to_int(mulmont_quadratic(x_bigint, y_bigint, mod_bigint, mod_state.mod_inv))

### New Opcodes

| Mnemonic | Opcode | Immediate size (bytes) | Stack in | Stack out |
|---|---|---|---|---|
| SETUPX | 0x21 | 0 | 4 | 0 |
| ADDMODX | 0x22 | 3 | 0 | 0 |
| SUBMODX | 0x23 | 3 | 0 | 0 |
| MULMODX | 0x24 | 3 | 0 | 0 |
| LOADX | 0x25 | 0 | 3 | 0 |
| STOREX | 0x26 | 0 | 3 | 0 |

#### SETUPX

`SETUPX : mod_id, mod_offset, mod_size, vals_used`

##### Gas Charging

    mod_id = evm_stack.peek(0)
    mod_offset = evm_stack.peek(1)
    mod_size = evm_stack.peek(2)
    vals_used = evm_stack.peek(3)

    cost = SETUPX_BASE_GAS

    if mod_id in evmmax_state.mods:
        # the modulus state keyed by mod_id was already active in this call-frame.
        # no additional charge beyond SETUPX_BASE_GAS
        return

    if vals_used > 256:
        raise Exception("cannot use more than 256 values for a given mod_id")

    if mod_offset + mod_size > len(evm_memory):
        raise Exception("cannot load a modulus that would extend beyond the bounds of EVM memory")

    val_size_multiplier = math.ceil(mod_size / 8)

    cost += cost_precompute_mont(val_size_multiplier)
    cost += cost_evm_memory_expansion(math.ceil((vals_used * val_size_multiplier * 8) / 32))

##### Execution

    mod_id = evm_stack.pop()
    mod_offset = evm_stack.pop()
    mod_size = evm_stack.pop()
    vals_used = evm_stack.pop()

    mod_inv = None

    if mod_id in evmmax_state.mods:
        # this mod state was previously used in this call frame.
        # the associated montgomery parameters and memory space are already allocated.
        # mark mod_id as the current active modulus state
        evmmax_state.active_mod_state = evmmax_state.mods[mod_id]
        return

    val_size_multiplier = math.ceil(mod_size / 8)

    mod = int.from_bytes(evm_memory[mod_offset:mod_offset + mod_size], byteorder='big')
    if mod == 0 or mod % 2 == 0:
        raise Exception("modulus must be nonzero and odd")

    r = 2**(SYSTEM_WORD_SIZE_BITS * val_size_multiplier)
    r_squared = r**2 % mod

    mod_inv, mod_inv_full = None, None
    if val_size_multiplier >= MULMODX_SUBQUADRATIC_START:
        mod_inv_full = pow(-mod, -1, r)
    else:
        mod_inv = pow(-mod, -1, 2**SYSTEM_WORD_SIZE_BITS)

    mod_state = ModState(mod, vals_used, r, r_squared, mod_inv_full=mod_inv_full, mod_inv=mod_inv)

    evmmax_state.mods[mod_id] = mod_state
    evmmax_state.active_mod_state = mod_state

#### LOADX

`LOADX: dst_offset, val_idx, num_vals`

##### Description

Load EVMMAX values in the current active modulus state to EVM memory.

##### Gas Charging

    cost = LOADX_BASE_GAS
    dst_offset = evm_stack.peek(0)
    val_idx = evm_stack.peek(1)
    num_vals = evm_stack.peek(2)

    val_size_multiplier = evmmax_state.active_mod_state.val_size_multiplier
    end_offset = dst_offset + num_vals * val_size_multiplier * 8
    if end_offset > len(evm_memory):
        cost += cost_evm_memory_expansion(math.ceil(end_offset / 32))

    cost += cost_mulmodx(val_size_multiplier) * num_vals

##### Execution

    dst_offset = evm_stack.pop()
    val_idx = evm_stack.pop()
    num_vals = evm_stack.pop()

    if num_vals == 0:
        return

    mod_state = evmmax_state.active_mod_state
    if mod_state is None:
        raise Exception("no modulus set")

    if val_idx + num_vals > len(mod_state.values):
        raise Exception("attempt to load beyond allocated values")

    val_size = mod_state.val_size_multiplier * 8
    end_offset = dst_offset + num_vals * val_size
    if end_offset > len(evm_memory):
        expand_evm_memory(math.ceil((end_offset - len(evm_memory)) / 32))

    cur_dst_offset = dst_offset
    for i in range(num_vals):
        mont_val = mod_state.values[val_idx + i]

        # convert the value to canonical form
        val = mulmont(mod_state, mont_val, 1)

        evm_memory[cur_dst_offset:cur_dst_offset + val_size] = val.to_bytes(val_size, byteorder='big')
        cur_dst_offset += val_size

#### STOREX

`STOREX: dst_val, offset, num_vals`

##### Description

Store values from EVM memory into the EVMMAX memory space of the current active modulus state, validating that they are reduced by the modulus.

##### Gas Charging

    dst_val = evm_stack.peek(0)
    offset = evm_stack.peek(1)
    num_vals = evm_stack.peek(2)

    val_size_multiplier = evmmax_state.active_mod_state.val_size_multiplier
    cost = STOREX_BASE_GAS + num_vals * cost_mulmodx(val_size_multiplier)

##### Execution

    dst_val = evm_stack.pop()
    offset = evm_stack.pop()
    num_vals = evm_stack.pop()

    if num_vals == 0:
        return

    mod_state = evmmax_state.active_mod_state
    if mod_state is None:
        raise Exception("no modulus set")

    if dst_val + num_vals > len(mod_state.values):
        raise Exception("attempt to copy to destination beyond allocated values")

    val_size = mod_state.val_size_multiplier * 8
    if offset + num_vals * val_size > len(evm_memory):
        raise Exception("source of copy would extend beyond allocated memory")

    cur_src_offset = offset
    for i in range(num_vals):
        val = int.from_bytes(evm_memory[cur_src_offset:cur_src_offset + val_size], byteorder='big')

        if val >= mod_state.mod:
            raise Exception("values must be less than the modulus")

        # convert the value to Montgomery form
        mont_val = mulmont(mod_state, val, mod_state.r_squared)

        mod_state.values[dst_val + i] = mont_val
        cur_src_offset += val_size

#### ADDMODX

`ADDMODX {z_offset - byte}, {x_offset - byte}, {y_offset - byte}:`

##### Description

Compute the modular addition of two EVMMAX values in the current active modulus state, storing the result in an output.
##### Gas Charging

    val_size_multiplier = evmmax_state.active_mod_state.val_size_multiplier
    cost = cost_addmodx(val_size_multiplier)

##### Execution

    mod_state = evmmax_state.active_mod_state
    if mod_state is None:
        raise Exception("no mod state set")

    # indexing bytes yields an int, so each immediate byte is read directly
    z_offset = contract_code[pc + 1]
    x_offset = contract_code[pc + 2]
    y_offset = contract_code[pc + 3]

    if x_offset >= mod_state.num_vals_used or y_offset >= mod_state.num_vals_used or z_offset >= mod_state.num_vals_used:
        raise Exception("out of bounds value reference")

    mod_state.values[z_offset] = (mod_state.values[x_offset] + mod_state.values[y_offset]) % mod_state.mod

#### SUBMODX

`SUBMODX {z_offset - byte}, {x_offset - byte}, {y_offset - byte}:`

##### Description

Compute the modular subtraction of two EVMMAX values in the current active modulus state, storing the result in an output.

##### Gas Charging

Same as ADDMODX.

##### Execution

    mod_state = evmmax_state.active_mod_state
    if mod_state is None:
        raise Exception("no mod state set")

    z_offset = contract_code[pc + 1]
    x_offset = contract_code[pc + 2]
    y_offset = contract_code[pc + 3]

    if x_offset >= mod_state.num_vals_used or y_offset >= mod_state.num_vals_used or z_offset >= mod_state.num_vals_used:
        raise Exception("out of bounds value reference")

    mod_state.values[z_offset] = (mod_state.values[x_offset] - mod_state.values[y_offset]) % mod_state.mod

#### MULMODX

`MULMODX {z_offset - byte}, {x_offset - byte}, {y_offset - byte}:`

##### Description

Compute the Montgomery modular multiplication of two EVMMAX values in the current active modulus state, storing the result in an output.
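To make the role of Montgomery form concrete, the following sketch walks a pair of values through the conversions that STOREX and LOADX perform around a MULMODX-style multiplication. It is illustrative only: `mulmont_ref`, `to_mont` and `from_mont` are hypothetical names, the Montgomery product is computed naively with a modular inverse rather than with the CIOS routine, and the modulus 101 with `r = 2**64` is an arbitrary choice.

```python
def mulmont_ref(x: int, y: int, mod: int, r: int) -> int:
    """Naive Montgomery product: (x * y * r^-1) % mod. Unoptimized, for illustration."""
    return (x * y * pow(r, -1, mod)) % mod


mod = 101                  # small odd modulus (assumption for the example)
r = 2**64                  # a power of two greater than mod
r_squared = (r * r) % mod


def to_mont(v: int) -> int:
    # v * r^2 * r^-1 == v * r (mod mod): the conversion STOREX applies
    return mulmont_ref(v, r_squared, mod, r)


def from_mont(v: int) -> int:
    # v * 1 * r^-1 (mod mod): the conversion LOADX applies
    return mulmont_ref(v, 1, mod, r)


a, b = 17, 42
a_m, b_m = to_mont(a), to_mont(b)

# Multiplication needs the Montgomery product to stay in Montgomery form.
prod = from_mont(mulmont_ref(a_m, b_m, mod, r))

# Addition works on Montgomery-form values exactly as on canonical values.
total = from_mont((a_m + b_m) % mod)

print(prod, total)  # 7 59
```

The round trip returns `(17 * 42) % 101 == 7` and `(17 + 42) % 101 == 59`, which is why only STOREX and LOADX pay a conversion cost while the arithmetic opcodes operate on stored values directly.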
##### Gas Charging

    val_size_multiplier = evmmax_state.active_mod_state.val_size_multiplier
    cost = cost_mulmodx(val_size_multiplier)

##### Execution

    mod_state = evmmax_state.active_mod_state
    if mod_state is None:
        raise Exception("no mod state set")

    z_offset = contract_code[pc + 1]
    x_offset = contract_code[pc + 2]
    y_offset = contract_code[pc + 3]

    if x_offset >= mod_state.num_vals_used or y_offset >= mod_state.num_vals_used or z_offset >= mod_state.num_vals_used:
        raise Exception("out of bounds value reference")

    mod_state.values[z_offset] = mulmont(mod_state, mod_state.values[x_offset], mod_state.values[y_offset])

### Changes to Contract Execution

#### EVM Memory Expansion Cost Function

Any EVM operation which expands memory by `x` bytes will charge to expand memory to `cur_evm_mem_size + x + evmmax_mem_size` bytes, where `evmmax_mem_size` is the size of all allocated EVMMAX values in the current call context (the sum of the values used by each `mod_id` that has been previously or currently set with SETUPX).

#### Jumpdest Analysis

Jumpdest analysis is modified to disallow jumps into immediate data for ADDMODX/SUBMODX/MULMODX.

## Rationale

### Montgomery Modular Multiplication

EVMMAX values are stored internally in Montgomery form. Expressing values in Montgomery form enables the use of Montgomery reduction in modular multiplication, which gives a substantial performance gain over naive modular multiplication.

Modular addition and subtraction on Montgomery-form values are computed the same way as on canonical values.

### Memory Alignment for EVMMAX Values

LOADX/STOREX move 64-bit-aligned big-endian values to/from the memory space of the active modulus state. SETUPX memory expansion pricing assumes that values will be stored as 64-bit-aligned values in their EVMMAX memory space.

This choice is made to keep EVMMAX memory aligned to ensure performance.

### Gas Costs

Gas models assume a rate of 1 gas per 25ns of execution time.
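As a worked example of the cost model, the sketch below evaluates the two arithmetic cost functions from the Helpers section for a few value sizes and converts the result to an execution-time budget at the stated rate of 25ns per gas. The functions are condensed copies of the helper definitions with the same coefficients; the `budget_ns` helper and the choice of sample sizes are assumptions made for illustration.

```python
import math

MULMODX_SUBQUADRATIC_START = 50  # from the Constants table
NS_PER_GAS = 25                  # rate stated in the Gas Costs section


def cost_addmodx(val_size_multiplier: int) -> int:
    if val_size_multiplier == 6:
        return 1
    return max(1, round(0.20 * val_size_multiplier + 0.15))


def cost_mulmodx(val_size_multiplier: int) -> int:
    if val_size_multiplier == 6:
        cost = 2  # 321-384 bit values are priced as a special case
    elif val_size_multiplier < MULMODX_SUBQUADRATIC_START:
        cost = math.ceil(0.090 * val_size_multiplier ** 2 + 0.24)
    else:
        cost = math.ceil(10.0 * val_size_multiplier - 270.0)
    return max(1, cost)


def budget_ns(gas: int) -> int:
    """Hypothetical helper: execution time implied by a gas cost."""
    return gas * NS_PER_GAS


# val_size_multiplier counts 8-byte units: 4 -> 256 bits, 6 -> 384 bits, 64 -> 4096 bits
for size in (4, 6, 49, 50, 64):
    add_gas, mul_gas = cost_addmodx(size), cost_mulmodx(size)
    print(size * 64, "bits:", add_gas, "gas add,", mul_gas, "gas mul,", budget_ns(mul_gas), "ns mul budget")
```

Under these formulas a 256-bit MULMODX costs 2 gas, and the quadratic curve reaches 217 gas at 49 units before the linear segment starts at 230 gas for 50 units.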
#### ADDMODX/SUBMODX/MULMODX

ADDMODX and SUBMODX can each be implemented using a single extended-precision addition and a single extended-precision subtraction. This justifies a linear cost model.

MULMODX runtime scales quadratically with input size. After a certain threshold, the quadratic complexity of `mulmont_quadratic` dominates and it becomes more performant to use `mulmont_subquadratic`. Thus, there is a segmented cost model to reflect the different asymptotic behavior of quadratic and subquadratic mulmont.

ADDMODX/SUBMODX/MULMODX pricing includes the cost of arithmetic and the latency of accessing input values from CPU cache.

The price model assumes that the implementation will be generic for most bit widths, with the exception of 321-384 bits, which is priced aggressively.

#### LOADX/STOREX

These perform conversion between Montgomery and canonical forms for each value copied (a single mulmont per value converted). The overhead of memory loading/copying is covered by `cost_mulmodx`.

#### SETUPX

## Backwards Compatibility

Jumpdest analysis changes in this EIP could potentially break existing contracts where a jump destination occurs in the 3 bytes following a `0x22`/`0x23`/`0x24` byte. This is unlikely to affect many existing contracts. Further analysis of deployed contract bytecode can determine with certainty which contracts, if any, could be broken.

## Security Considerations

## Copyright

Copyright and related rights waived via CC0.

## Citation

Please cite this document as:

Jared Wasinger (@jwasinger), Alex Beregszaszi (@axic), "EIP-6690: EVM Modular Arithmetic Extensions (EVMMAX) [DRAFT]," Ethereum Improvement Proposals, no. 6690, March 2023. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-6690.