ERC-2678: Revised Ethereum Smart Contract Packaging Standard (EthPM v3) Ethereum Improvement Proposals AllCoreNetworkingInterfaceERCMetaInformational Standards Track: ERC ERC-2678: Revised Ethereum Smart Contract Packaging Standard (EthPM v3) Authors g. nicholas d’andrea (@gnidan), Piper Merriam (@pipermerriam), Nick Gheorghita (@njgheorghita), Christian Reitwiessner (@chriseth), Ben Hauser (@iamdefinitelyahuman), Bryant Eisenbach (@fubuloubu) Created 2020-05-26 Table of Contents Simple Summary Abstract Motivation Guiding Principles Use Cases Package Specification Conventions Document Format Document Specification EthPM Manifest Version Package Name Package Version Package Metadata Sources Contract Types Compilers Deployments Build Dependencies Object Definitions The Link Reference Object The Link Value Object The Bytecode Object The Package Meta Object The Sources Object The Source Object The Checksum Object The Contract Type Object The Contract Instance Object The Compiler Information Object BIP122 URI Glossary Rationale Minification Package Names BIP122 Compilers Backwards Compatibility Security Considerations Copyright Simple Summary A data format describing a smart contract software package. Abstract This EIP defines a data format for package manifest documents, representing a package of one or more smart contracts, optionally including source code and any/all deployed instances across multiple networks. Package manifests are minified JSON objects, to be distributed via content addressable storage networks, such as IPFS. Packages are then published to on-chain EthPM registries, defined in EIP-1319, from where they can be freely accessed. This document presents a natural language description of a formal specification for version 3 of this format. Motivation This standard aims to encourage the Ethereum development ecosystem towards software best practices around code reuse. By defining an open, community-driven package data format standard, this effort seeks to provide support for package management tools development by offering a general-purpose solution that has been designed with observed common practices in mind. Updates the schema for a package manifest to be compatible with the metadata output for compilers. Updates the "sources" object definition to support a wider range of source file types and serve as JSON input for a compiler. Moves compiler definitions to a top-level "compilers" array in order to: Simplify the links between a compiler version, sources, and the compiled assets. Simplify packages that use multiple compiler versions. Updates key formatting from snake_case to camelCase to be more consistent with JSON convention. Guiding Principles This specification makes the following assumptions about the document lifecycle. Package manifests are intended to be generated programmatically by package management software as part of the release process. Package manifests will be consumed by package managers during tasks like installing package dependencies or building and deploying new releases. Package manifests will typically not be stored alongside the source, but rather by package registries or referenced by package registries and stored in something akin to IPFS. Package manifests can be used to verify public deployments of source contracts. Use Cases The following use cases were considered during the creation of this specification. owned: A package which contains contracts which are not meant to be used by themselves but rather as base contracts to provide functionality to other contracts through inheritance. transferable: A package which has a single dependency. standard-token: A package which contains a reusable contract. safe-math-lib: A package which contains deployed instance of one of the package contracts. piper-coin: A package which contains a deployed instance of a reusable contract from a dependency. escrow: A package which contains a deployed instance of a local contract which is linked against a deployed instance of a local library. wallet: A package with a deployed instance of a local contract which is linked against a deployed instance of a library from a dependency. wallet-with-send: A package with a deployed instance which links against a deep dependency. simple-auction: Compiler "metadata" field output. Package Specification Conventions RFC2119 The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119. https://www.ietf.org/rfc/rfc2119.txt Prefixed vs Unprefixed A prefixed hexadecimal value begins with 0x. Unprefixed values have no prefix. Unless otherwise specified, all hexadecimal values should be represented with the 0x prefix. Prefixed: 0xdeadbeef Unprefixed: deadbeef Document Format The canonical format is a single JSON object. Packages must conform to the following serialization rules. The document must be tightly packed, meaning no linebreaks or extra whitespace. The keys in all objects must be sorted alphabetically. Duplicate keys in the same object are invalid. The document must use UTF-8 encoding. The document must not have a trailing newline. To ensure backwards compatibility, manifest_version is a forbidden top-level key. Document Specification The following fields are defined for the package. Custom fields may be included. Custom fields should be prefixed with x- to prevent name collisions with future versions of the specification. See Also: Formalized (JSON-Schema) version of this specification: package.spec.json Jump To: Definitions EthPM Manifest Version The manifest field defines the specification version that this document conforms to. Packages must include this field. Required: Yes Key: manifest Type: String Allowed Values: ethpm/3 Package Name The name field defines a human readable name for this package. Packages should include this field to be released on an EthPM registry. Package names must begin with a lowercase letter and be comprised of only the lowercase letters a-z, numeric characters 0-9, and the dash character -. Package names must not exceed 255 characters in length. Required: If version is included. Key: name Type: String Format: must match the regular expression ^[a-z][-a-z0-9]{0,255}$ Package Version The version field declares the version number of this release. Packages should include this field to be released on an EthPM registry. This value should conform to the semver version numbering specification. Required: If name is included. Key: version Type: String Package Metadata The meta field defines a location for metadata about the package which is not integral in nature for package installation, but may be important or convenient to have on-hand for other reasons. This field should be included in all Packages. Required: No Key: meta Type: Package Meta Object Sources The sources field defines a source tree that should comprise the full source tree necessary to recompile the contracts contained in this release. Required: No Key: sources Type: Object (String: Sources Object) Contract Types The contractTypes field hosts the Contract Types which have been included in this release. Packages should only include contract types that can be found in the source files for this package. Packages should not include contract types from dependencies. Packages should not include abstract contracts in the contract types section of a release. Required: No Key: contractTypes Type: Object (String: Contract Type Object) Format: Keys must be valid Contract Aliases. Values must conform to the Contract Type Object definition. Compilers The compilers field holds the information about the compilers and their settings that have been used to generate the various contractTypes included in this release. Required: No Key: compilers Type: Array (Compiler Information Object) Deployments The deployments field holds the information for the chains on which this release has Contract Instances as well as the Contract Types and other deployment details for those deployed contract instances. The set of chains defined by the BIP122 URI keys for this object must be unique. There cannot be two different URI keys in a deployments field representing the same blockchain. Required: No Key: deployments Type: Object (String: Object(String: Contract Instance Object)) Format: Keys must be a valid BIP122 URI chain definition. Values must be objects which conform to the following format: - Keys must be valid Contract Instance Names - Values must be a valid Contract Instance Object Build Dependencies The buildDependencies field defines a key/value mapping of EthPM packages that this project depends on. Required: No Key: buildDependencies Type: Object (String: String) Format: Keys must be valid package names. Values must be a Content Addressable URI which resolves to a valid package that conforms the same EthPM manifest version as its parent. Object Definitions Definitions for different objects used within the Package. All objects allow custom fields to be included. Custom fields should be prefixed with x- to prevent name collisions with future versions of the specification. The Link Reference Object A Link Reference object has the following key/value pairs. All link references are assumed to be associated with some corresponding Bytecode. Offsets: offsets The offsets field is an array of integers, corresponding to each of the start positions where the link reference appears in the bytecode. Locations are 0-indexed from the beginning of the bytes representation of the corresponding bytecode. This field is invalid if it references a position that is beyond the end of the bytecode. Required: Yes Type: Array Length: length The length field is an integer which defines the length in bytes of the link reference. This field is invalid if the end of the defined link reference exceeds the end of the bytecode. Required: Yes Type: Integer Name: name The name field is a string which must be a valid Identifier. Any link references which should be linked with the same link value should be given the same name. Required: No Type: String Format: must conform to the Identifier format. The Link Value Object Describes a single Link Value. A Link Value object is defined to have the following key/value pairs. Offsets: offsets The offsets field defines the locations within the corresponding bytecode where the value for this link value was written. These locations are 0-indexed from the beginning of the bytes representation of the corresponding bytecode. Required: Yes Type: Integer Format: See below. Format Array of integers, where each integer must conform to all of the following. greater than or equal to zero strictly less than the length of the unprefixed hexadecimal representation of the corresponding bytecode. Type: type The type field defines the value type for determining what is encoded when linking the corresponding bytecode. Required: Yes Type: String Allowed Values: "literal" for bytecode literals. "reference" for named references to a particular Contract Instance Value: value The value field defines the value which should be written when linking the corresponding bytecode. Required: Yes Type: String Format: Determined based on type, see below. Format For static value literals (e.g. address), value must be a 0x-prefixed hexadecimal string representing bytes. To reference the address of a Contract Instance from the current package the value should be the name of that contract instance. This value must be a valid Contract Instance Name. The chain definition under which the contract instance that this link value belongs to must contain this value within its keys. This value may not reference the same contract instance that this link value belongs to. To reference a contract instance from a Package from somewhere within the dependency tree the value is constructed as follows. Let [p1, p2, .. pn] define a path down the dependency tree. Each of p1, p2, pn must be valid package names. p1 must be present in keys of the buildDependencies for the current package. For every pn where n > 1, pn must be present in the keys of the buildDependencies of the package for pn-1. The value is represented by the string ::<...>:: where all of , , are valid package names and is a valid Contract Name. The value must be a valid Contract Instance Name. Within the package of the dependency defined by , all of the following must be satisfiable: There must be exactly one chain defined under the deployments key which matches the chain definition that this link value is nested under. The value must be present in the keys of the matching chain. The Bytecode Object A bytecode object has the following key/value pairs. Bytecode: bytecode The bytecode field is a string containing the 0x prefixed hexadecimal representation of the bytecode. Required: Yes Type: String Format: 0x prefixed hexadecimal. Link References: linkReferences The linkReferences field defines the locations in the corresponding bytecode which require linking. Required: No Type: Array Format: All values must be valid Link Reference objects. See also below. Format This field is considered invalid if any of the Link References are invalid when applied to the corresponding bytecode field, or if any of the link references intersect. Intersection is defined as two link references which overlap. Link Dependencies: linkDependencies The linkDependencies defines the Link Values that have been used to link the corresponding bytecode. Required: No Type: Array Format: All values must be valid Link Value objects. See also below. Format Validation of this field includes the following: Two link value objects must not contain any of the same values for offsets. Each link value object must have a corresponding link reference object under the linkReferences field. The length of the resolved value must be equal to the length of the corresponding Link Reference. The Package Meta Object The Package Meta object is defined to have the following key/value pairs. Authors The authors field defines a list of human readable names for the authors of this package. Packages may include this field. Required: No Key: authors Type: Array(String) License The license field declares the license associated with this package. This value should conform to the SPDX format. Packages should include this field. If a file Source Object defines its own license, that license takes precedence for that particular file over this package-scoped meta license. Required: No Key: license Type: String Description The description field provides additional detail that may be relevant for the package. Packages may include this field. Required: No Key: description Type: String Keywords The keywords field provides relevant keywords related to this package. Required: No Key: keywords Type: Array(String) Links The links field provides URIs to relevant resources associated with this package. When possible, authors should use the following keys for the following common resources. website: Primary website for the package. documentation: Package Documentation repository: Location of the project source code. Required: No Key: links Type: Object (String: String) The Sources Object A Sources object is defined to have the following fields. Key: A unique identifier for the source file. (String) Value: Source Object The Source Object Checksum: checksum Hash of the source file. Required: Only if the content field is missing and none of the provided URLs contain a content hash. Key: checksum Value: Checksum Object URLS: urls Array of urls that resolve to the same source file. Urls should be stored on a content-addressable filesystem. If they are not, then either content or checksum must be included. Urls must be prefixed with a scheme. If the resulting document is a directory the key should be interpreted as a directory path. If the resulting document is a file the key should be interpreted as a file path. Required: If content is not included. Key: urls Value: Array(String) Content: content Inlined contract source. If both urls and content are provided, the content value must match the content of the files identified in urls. Required: If urls is not included. Key: content Value: String Install Path: installPath Filesystem path of source file. Must be a relative filesystem path that begins with a ./. Must resolve to a path that is within the current virtual working directory. Must be unique across all included sources. Must not contain ../ to avoid accessing files outside of the source folder in improper implementations. Required: This field must be included for the package to be writable to disk. Key: installPath Value: String Type: type The type field declares the type of the source file. The field should be one of the following values: solidity, vyper, abi-json, solidity-ast-json. Required: No Key: type Value: String License: license The license field declares the type of license associated with this source file. When defined, this license overrides the package-scoped meta license. Required: No Key: license Value: String The Checksum Object A Checksum object is defined to have the following key/value pairs. Algorithm: algorithm The algorithm used to generate the corresponding hash. Possible algorithms include, but are not limited to sha3, sha256, md5, keccak256. Required: Yes Type: String Hash: hash The hash of a source files contents generated with the corresponding algorithm. Required: Yes Type: String The Contract Type Object A Contract Type object is defined to have the following key/value pairs. Contract Name: contractName The contractName field defines the Contract Name for this Contract Type. Required: If the Contract Name and Contract Alias are not the same. Type: String Format: Must be a valid Contract Name Source ID: sourceId The global source identifier for the source file from which this contract type was generated. Required: No Type: String Format: Must match a unique source ID included in the Sources Object for this package. Deployment Bytecode: deploymentBytecode The deploymentBytecode field defines the bytecode for this Contract Type. Required: No Type: Object Format: Must conform to the Bytecode object format. Runtime Bytecode: runtimeBytecode The runtimeBytecode field defines the unlinked 0x-prefixed runtime portion of Bytecode for this Contract Type. Required: No Type: Object Format: Must conform to the Bytecode object format. ABI: abi Required: No Type: Array Format: Must conform to the Ethereum Contract ABI JSON format. UserDoc: userdoc Required: No Type: Object Format: Must conform to the UserDoc format. DevDoc: devdoc Required: No Type: Object Format: Must conform to the DevDoc format. The Contract Instance Object A Contract Instance Object represents a single deployed Contract Instance and is defined to have the following key/value pairs. Contract Type: contractType The contractType field defines the Contract Type for this Contract Instance. This can reference any of the contract types included in this Package or any of the contract types found in any of the package dependencies from the buildDependencies section of the Package Manifest. Required: Yes Type: String Format: See below. Format Values for this field must conform to one of the two formats herein. To reference a contract type from this Package, use the format . The value must be a valid Contract Alias. The value must be present in the keys of the contractTypes section of this Package. To reference a contract type from a dependency, use the format :. The value must be present in the keys of the buildDependencies of this Package. The value must be be a valid Contract Alias. The resolved package for must contain the value in the keys of the contractTypes section. Address: address The address field defines the Address of the Contract Instance. Required: Yes Type: String Format: Hex encoded 0x prefixed Ethereum address matching the regular expression ^0x[0-9a-fA-F]{40}$. Transaction: transaction The transaction field defines the transaction hash in which this Contract Instance was created. Required: No Type: String Format: 0x prefixed hex encoded transaction hash. Block: block The block field defines the block hash in which this the transaction which created this contract instance was mined. Required: No Type: String Format: 0x prefixed hex encoded block hash. Runtime Bytecode: runtimeBytecode The runtimeBytecode field defines the runtime portion of bytecode for this Contract Instance. When present, the value from this field supersedes the runtimeBytecode from the Contract Type for this Contract Instance. Required: No Type: Object Format: Must conform to the Bytecode Object format. Every entry in the linkReferences for this bytecode must have a corresponding entry in the linkDependencies section. The Compiler Information Object The compilers field defines the various compilers and settings used during compilation of any Contract Types or Contract Instance included in this package. A Compiler Information object is defined to have the following key/value pairs. Name: name The name field defines which compiler was used in compilation. Required: Yes Key: name Type: String Version: version The version field defines the version of the compiler. The field should be OS agnostic (OS not included in the string) and take the form of either the stable version in semver format or if built on a nightly should be denoted in the form of - ex: 0.4.8-commit.60cc1668. Required: Yes Key: version Type: String Settings: settings The settings field defines any settings or configuration that was used in compilation. For the "solc" compiler, this should conform to the Compiler Input and Output Description. Required: No Key: settings Type: Object Contract Types: contractTypes A list of the Contract Alias or Contract Types in this package that used this compiler to generate its outputs. All contractTypes that locally declare runtimeBytecode should be attributed for by a compiler object. A single contractTypes must not be attributed to more than one compiler. Required: No Key: contractTypes Type: Array(Contract Alias) BIP122 URI BIP122 URIs are used to define a blockchain via a subset of the BIP-122 spec. blockchain:///block/ The represents the blockhash of the first block on the chain, and represents the hash of the latest block that’s been reliably confirmed (package managers should be free to choose their desired level of confirmations). Glossary The terms in this glossary have been updated to reflect the changes made in V3. ABI The JSON representation of the application binary interface. See the official specification for more information. Address A public identifier for an account on a particular chain Bytecode The set of EVM instructions as produced by a compiler. Unless otherwise specified this should be assumed to be hexadecimal encoded, representing a whole number of bytes, and prefixed with 0x. Bytecode can either be linked or unlinked. (see Linking) Unlinked Bytecode: The hexadecimal representation of a contract’s EVM instructions that contains sections of code that requires linking for the contract to be functional. The sections of code which are unlinked must be filled in with zero bytes. Example: 0x606060405260e06000730000000000000000000000000000000000000000634d536f Linked Bytecode: The hexadecimal representation of a contract’s EVM instructions which has had all Link References replaced with the desired Link Values. Example: 0x606060405260e06000736fe36000604051602001526040518160e060020a634d536f Chain Definition This definition originates from BIP122 URI. A URI in the format blockchain:///block/ chain_id is the unprefixed hexadecimal representation of the genesis hash for the chain. block_hash is the unprefixed hexadecimal representation of the hash of a block on the chain. A chain is considered to match a chain definition if the the genesis block hash matches the chain_id and the block defined by block_hash can be found on that chain. It is possible for multiple chains to match a single URI, in which case all chains are considered valid matches Content Addressable URI Any URI which contains a cryptographic hash which can be used to verify the integrity of the content found at the URI. The URI format is defined in RFC3986 It is recommended that tools support IPFS and Swarm. Contract Alias This is a name used to reference a specific Contract Type. Contract aliases must be unique within a single Package. The contract alias must use one of the following naming schemes: The portion must be the same as the Contract Name for this contract type. The portion must match the regular expression ^[-a-zA-Z0-9]{1,256}$. Contract Instance A contract instance a specific deployed version of a Contract Type. All contract instances have an Address on some specific chain. Contract Instance Name A name which refers to a specific Contract Instance on a specific chain from the deployments of a single Package. This name must be unique across all other contract instances for the given chain. The name must conform to the regular expression ^[a-zA-Z_$][a-zA-Z0-9_$]{0,255}$ In cases where there is a single deployed instance of a given Contract Type, package managers should use the Contract Alias for that contract type for this name. In cases where there are multiple deployed instances of a given contract type, package managers should use a name which provides some added semantic information as to help differentiate the two deployed instances in a meaningful way. Contract Name The name found in the source code that defines a specific Contract Type. These names must conform to the regular expression ^[a-zA-Z_$][a-zA-Z0-9_$]{0,255}$. There can be multiple contracts with the same contract name in a projects source files. Contract Type Refers to a specific contract in the package source. This term can be used to refer to an abstract contract, a normal contract, or a library. Two contracts are of the same contract type if they have the same bytecode. Example: contract Wallet { ... } A deployed instance of the Wallet contract would be of of type Wallet. Identifier Refers generally to a named entity in the Package. A string matching the regular expression ^[a-zA-Z][-_a-zA-Z0-9]{0,255}$ Link Reference A location within a contract’s bytecode which needs to be linked. A link reference has the following properties. offset: Defines the location within the bytecode where the link reference begins. length: Defines the length of the reference. name: (optional) A string to identify the reference. Link Value A link value is the value which can be inserted in place of a Link Reference Linking The act of replacing Link References with Link Values within some Bytecode. Package Distribution of an application’s source or compiled bytecode along with metadata related to authorship, license, versioning, et al. For brevity, the term Package is often used metonymously to mean Package Manifest. Package Manifest A machine-readable description of a package. Prefixed Bytecode string with leading 0x. Example: 0xdeadbeef Unprefixed Not Prefixed. Example: deadbeef Rationale Minification EthPM packages are distributed as alphabetically-ordered & minified JSON to ensure consistency. Since packages are published on content-addressable filesystems (eg. IPFS), this restriction guarantees that any given set of contract assets will always resolve to the same content-addressed URI. Package Names Package names are restricted to lower-case characters, numbers, and - to improve the readability of the package name, in turn improving the security properties for a package. A user is more likely to accurately identify their target package with this restricted set of characters, and not confuse a malicious package that disguises itself as a trusted package with similar but different characters (e.g. O and 0). BIP122 The BIP-122 standard has been used since EthPM v1 since it is an industry standard URI scheme for identifying different blockchains and distinguishing between forks. Compilers Compilers are now defined in a top-level array, simplifying the task for tooling to identify the compiler types needed to interact with or validate the contract assets. This also removes unnecessarily duplicated information, should multiple contractTypes share the same compiler type. Backwards Compatibility To improve understanding and readability of the EthPM spec, the manifest_version field was updated to manifest in v3. To ensure backwards compatibility, v3 packages must define a top-level "manifest" with a value of "ethpm/3". Additionally, "manifest_version" is a forbidden top-level key in v3 packages. Security Considerations Using EthPM packages implicitly requires importing &/or executing code written by others. The EthPM spec guarantees that when using a properly constructed and released EthPM package, the user will have the exact same code that was included in the package by the package author. However, it is impossible to guarantee that this code is safe to interact with. Therefore, it is critical that end users only interact with EthPM packages authored and released by individuals or organizations that they trust to include non-malicious code. Copyright Copyright and related rights waived via CC0. Citation Please cite this document as: g. nicholas d’andrea (@gnidan), Piper Merriam (@pipermerriam), Nick Gheorghita (@njgheorghita), Christian Reitwiessner (@chriseth), Ben Hauser (@iamdefinitelyahuman), Bryant Eisenbach (@fubuloubu), "ERC-2678: Revised Ethereum Smart Contract Packaging Standard (EthPM v3)," Ethereum Improvement Proposals, no. 2678, May 2020. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-2678. Ethereum Improvement Proposals Ethereum Improvement Proposals ethereum/EIPs Ethereum Improvement Proposals (EIPs) describe standards for the Ethereum platform, including core protocol specifications, client APIs, and contract standards.