IPFS (Filecoin) InterPlanetary File System White Paper.pdf


IPFS - Content Addressed, Versioned, P2P File System (DRAFT 3)

Juan Benet
juan@benet.ai

ABSTRACT

The InterPlanetary File System (IPFS) is a peer-to-peer distributed file system that seeks to connect all computing devices with the same system of files. In some ways, IPFS is similar to the Web, but IPFS could be seen as a single BitTorrent swarm, exchanging objects within one Git repository. In other words, IPFS provides a high throughput content-addressed block storage model, with content-addressed hyper links. This forms a generalized Merkle DAG, a data structure upon which one can build versioned file systems, blockchains, and even a Permanent Web. IPFS combines a distributed hashtable, an incentivized block exchange, and a self-certifying namespace. IPFS has no single point of failure, and nodes do not need to trust each other.

1. INTRODUCTION

There have been many attempts at constructing a global distributed file system. Some systems have seen significant success, and others failed completely. Among the academic attempts, AFS [6] has succeeded widely and is still in use today. Others [7, ] have not attained the same success. Outside of academia, the most successful systems have been peer-to-peer file-sharing applications primarily geared toward large media (audio and video). Most notably, Napster, KaZaA, and BitTorrent [2] deployed large file distribution systems supporting over 100 million simultaneous users. Even today, BitTorrent maintains a massive deployment where tens of millions of nodes churn daily [16]. These applications saw greater numbers of users and files distributed than their academic file system counterparts. However, the applications were not designed as infrastructure to be built upon. While there have been successful repurposings (see footnote 1), no general file-system has emerged that offers global, low-latency, and decentralized distribution.

Perhaps this is because a "good enough" system for most use cases already exists: HTTP.
By far, HTTP is the most successful "distributed system of files" ever deployed. Coupled with the browser, HTTP has had enormous technical and social impact. It has become the de facto way to transmit files across the internet. Yet, it fails to take advantage of dozens of brilliant file distribution techniques invented in the last fifteen years. From one perspective, evolving Web infrastructure is near-impossible, given the number of backwards compatibility constraints and the number of strong parties invested in the current model. But from another perspective, new protocols have emerged and gained wide use since the emergence of HTTP. What is lacking is upgrading design: enhancing the current HTTP web, and introducing new functionality without degrading user experience.

Footnote 1: For example, Linux distributions use BitTorrent to transmit disk images, and Blizzard, Inc. uses it to distribute video game content.

Industry has gotten away with using HTTP this long because moving small files around is relatively cheap, even for small organizations with lots of traffic. But we are entering a new era of data distribution with new challenges: (a) hosting and distributing petabyte datasets, (b) computing on large data across organizations, (c) high-volume high-definition on-demand or real-time media streams, (d) versioning and linking of massive datasets, (e) preventing accidental disappearance of important files, and more. Many of these can be boiled down to "lots of data, accessible everywhere." Pressed by critical features and bandwidth concerns, we have already given up HTTP for different data distribution protocols. The next step is making them part of the Web itself.

Orthogonal to efficient data distribution, version control systems have managed to develop important data collaboration workflows. Git, the distributed source code version control system, developed many useful ways to model and implement distributed data operations.
The Git toolchain offers versatile versioning functionality that large file distribution systems severely lack. New solutions inspired by Git are emerging, such as Camlistore [], a personal file storage system, and Dat [], a data collaboration toolchain and dataset package manager. Git has already influenced distributed filesystem design [9], as its content addressed Merkle DAG data model enables powerful file distribution strategies. What remains to be explored is how this data structure can influence the design of high-throughput oriented file systems, and how it might upgrade the Web itself.

This paper introduces IPFS, a novel peer-to-peer version-controlled filesystem seeking to reconcile these issues. IPFS synthesizes learnings from many past successful systems. Careful interface-focused integration yields a system greater than the sum of its parts. The central IPFS principle is modeling all data as part of the same Merkle DAG.

2. BACKGROUND

This section reviews important properties of successful peer-to-peer systems, which IPFS combines.

2.1 Distributed Hash Tables

Distributed Hash Tables (DHTs) are widely used to coordinate and maintain metadata about peer-to-peer systems. For example, the BitTorrent MainlineDHT tracks sets of peers part of a torrent swarm.

2.1.1 Kademlia DHT

Kademlia [10] is a popular DHT that provides:

1. Efficient lookup through massive networks: queries on average contact ⌈log2(n)⌉ nodes (e.g. 24 hops for a network of 10,000,000 nodes).
2. Low coordination overhead: it optimizes the number of control messages it sends to other nodes.
3. Resistance to various attacks by preferring long-lived nodes.
4. Wide usage in peer-to-peer applications, including Gnutella and BitTorrent, forming networks of over 20 million nodes [16].

2.1.2 Coral DSHT

While some peer-to-peer filesystems store data blocks directly in DHTs, this "wastes storage and bandwidth, as data must be stored at nodes where it is not needed" [5]. The Coral DSHT extends Kademlia in three particularly important ways:

1. Kademlia stores values in nodes whose ids are "nearest" (using XOR-distance) to the key. This does not take into account application data locality, ignores "far" nodes that may already have the data, and forces "nearest" nodes to store it, whether they need it or not. This wastes significant storage and bandwidth. Instead, Coral stores addresses to peers who can provide the data blocks.
2. Coral relaxes the DHT API from get_value(key) to get_any_values(key) (the "sloppy" in DSHT). This still works since Coral users only need a single (working) peer, not the complete list. In return, Coral can distribute only subsets of the values to the "nearest" nodes, avoiding hot-spots (overloading all the nearest nodes when a key becomes popular).
3. Additionally, Coral organizes a hierarchy of separate DSHTs called clusters depending on region and size. This enables nodes to query peers in their region first, "finding nearby data without querying distant nodes" [5] and greatly reducing the latency of lookups.

2.1.3 S/Kademlia DHT

S/Kademlia [1] extends Kademlia to protect against malicious attacks in two particularly important ways:

1. S/Kademlia provides schemes to secure NodeId generation, and prevent Sybil attacks. It requires nodes to create a PKI key pair, derive their identity from it, and sign their messages to each other. One scheme includes a proof-of-work crypto puzzle to make generating Sybils expensive.
2. S/Kademlia nodes lookup values over disjoint paths, in order to ensure honest nodes can connect to each other in the presence of a large fraction of adversaries in the network. S/Kademlia achieves a success rate of 0.85 even with an adversarial fraction as large as half of the nodes.

2.2 Block Exchanges - BitTorrent

BitTorrent [3] is a widely successful peer-to-peer filesharing system, which succeeds in coordinating networks of untrusting peers (swarms) to cooperate in distributing pieces of files to each other. Key features from BitTorrent and its ecosystem that inform IPFS design include:

1. BitTorrent's data exchange protocol uses a quasi tit-for-tat strategy that rewards nodes who contribute to each other, and punishes nodes who only leech others' resources.
2. BitTorrent peers track the availability of file pieces, prioritizing sending rarest pieces first. This takes load off seeds, making non-seed peers capable of trading with each other.
3. BitTorrent's standard tit-for-tat is vulnerable to some exploitative bandwidth sharing strategies. PropShare [8] is a different peer bandwidth allocation strategy that better resists exploitative strategies, and improves the performance of swarms.

2.3 Version Control Systems - Git

Version Control Systems provide facilities to model files changing over time and distribute different versions efficiently. The popular version control system Git provides a powerful Merkle DAG (see footnote 2) object model that captures changes to a filesystem tree in a distributed-friendly way.

1. Immutable objects represent Files (blob), Directories (tree), and Changes (commit).
2. Objects are content-addressed, by the cryptographic hash of their contents.
3. Links to other objects are embedded, forming a Merkle DAG. This provides many useful integrity and workflow properties.
4. Most versioning metadata (branches, tags, etc.) are simply pointer references, and thus inexpensive to create and update.
5. Version changes only update references or add objects.
6. Distributing version changes to other users is simply transferring objects and updating remote references.

2.4 Self-Certified Filesystems - SFS

SFS [12, 11] proposed compelling implementations of both (a) distributed trust chains, and (b) egalitarian shared global namespaces. SFS introduced a technique for building Self-Certified Filesystems: addressing remote filesystems using the following scheme

    /sfs/

where Location is the server network address, and

    HostID = hash(public_key || Location)

Thus the name of an SFS file system certifies its server. The user can verify the public key offered by the server, negotiate a shared secret, and secure all traffic.
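As a rough illustration of the self-certifying idea in Section 2.4, the Go sketch below derives a HostID as hash(public_key || Location) and checks whether a key offered by a server matches the HostID embedded in a name. The choice of SHA-256, the hex encoding, and the helper names hostID and certifies are assumptions made for this sketch; they are not taken from SFS or from this paper.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hostID computes hash(public_key || Location).
// Assumption: SHA-256 with hex encoding; the paper does not fix a hash function here.
func hostID(publicKey []byte, location string) string {
	h := sha256.New()
	h.Write(publicKey)        // public_key
	h.Write([]byte(location)) // || Location
	return hex.EncodeToString(h.Sum(nil))
}

// certifies reports whether the key offered by the server at `location`
// hashes to the HostID that appears in the file system name.
// Hypothetical helper, for illustration only.
func certifies(nameHostID string, offeredKey []byte, location string) bool {
	return hostID(offeredKey, location) == nameHostID
}

func main() {
	key := []byte("example-public-key") // placeholder bytes, not a real key
	id := hostID(key, "sfs.example.org")

	fmt.Println(certifies(id, key, "sfs.example.org"))                 // true
	fmt.Println(certifies(id, []byte("other-key"), "sfs.example.org")) // false
}
```

Because the name itself commits to the key, a client needs no external certificate authority to detect a server presenting the wrong key.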
All SFS instances share a global namespace where name allocation is cryptographic, not gated by any centralized body.

Footnote 2: Merkle Directed Acyclic Graph: a similar but more general construction than a Merkle Tree. Deduplicated, does not need to be balanced, and non-leaf nodes contain data.

3. IPFS DESIGN

IPFS is a distributed file system which synthesizes successful ideas from previous peer-to-peer systems, including DHTs, BitTorrent, Git, and SFS. The contribution of IPFS is simplifying, evolving, and connecting proven techniques into a single cohesive system, greater than the sum of its parts. IPFS presents a new platform for writing and deploying applications, and a new system for distributing and versioning large data. IPFS could even evolve the web itself.

IPFS is peer-to-peer; no nodes are privileged. IPFS nodes store IPFS objects in local storage. Nodes connect to each other and transfer objects. These objects represent files and other data structures. The IPFS Protocol is divided into a stack of sub-protocols responsible for different functionality:

1. Identities - manage node identity generation and verification. Described in Section 3.1.
2. Network - manages connections to other peers, uses various underlying network protocols. Configurable. Described in Section 3.2.
3. Routing - maintains information to locate specific peers and objects. Responds to both local and remote queries. Defaults to a DHT, but is swappable. Described in Section 3.3.
4. Exchange - a novel block exchange protocol (BitSwap) that governs efficient block distribution. Modelled as a market, weakly incentivizes data replication. Trade Strategies swappable. Described in Section 3.4.
5. Objects - a Merkle DAG of content-addressed immutable objects with links. Used to represent arbitrary datastructures, e.g. file hierarchies and communication systems. Described in Section 3.5.
6. Files - versioned file system hierarchy inspired by Git. Described in Section 3.6.
7. Naming - A self-certifying mutable name system.
   Described in Section 3.7.

These subsystems are not independent; they are integrated and leverage blended properties. However, it is useful to describe them separately, building the protocol stack from the bottom up.

Notation: data structures and functions below are specified in Go syntax.

3.1 Identities

Nodes are identified by a NodeId, the cryptographic hash (see footnote 3) of a public-key, created with S/Kademlia's static crypto puzzle [1]. Nodes store their public and private keys (encrypted with a passphrase). Users are free to instantiate a "new" node identity on every launch, though that loses accrued network benefits. Nodes are incentivized to remain the same.

    type NodeId Multihash
    type Multihash []byte
    // self-describing cryptographic hash digest
    type PublicKey []byte
    type PrivateKey []byte
    // self-describing keys

    type Node struct {
        NodeId  NodeId
        PubKey  PublicKey
        PrivKey PrivateKey
    }

Footnote 3: Throughout this document, hash and checksum refer specifically to cryptographic hash checksums of data.

S/Kademlia based IPFS identity generation:

    difficulty = <integer parameter>
    n = Node{}
    do {
        n.PubKey, n.PrivKey = PKI.genKeyPair()
        n.NodeId = hash(n.PubKey)
        p = count_preceding_zero_bits(hash(n.NodeId))
    } while (p < difficulty)

[...]

Storing hash digests as self-describing Multihash values allows the system to (a) choose the best function for the use case (e.g. stronger security vs faster performance), and (b) evolve as function choices change.
Self-describing values allow using different parameter choices compatibly.

3.2 Network

IPFS nodes communicate regularly with hundreds of other nodes in the network, potentially across the wide internet. The IPFS network stack features:

• Transport: IPFS can use any transport protocol, and is best suited for WebRTC DataChannels [] (for browser connectivity) or uTP (LEDBAT [14]).
• Reliability: IPFS can provide reliability if underlying networks do not provide it, using uTP (LEDBAT [14]) or SCTP [15].
• Connectivity: IPFS also uses the ICE NAT traversal techniques [13].
• Integrity: optionally checks integrity of messages using a hash checksum.
• Authenticity: optionally checks authenticity of messages using HMAC with sender's public key.

3.2.1 Note on Peer Addressing

IPFS can use any network; it does not rely on or assume access to IP. This allows IPFS to be used in overlay networks. IPFS stores addresses as multiaddr formatted byte strings for the underlying network to use. multiaddr provides a way to express addresses and their protocols, including support for encapsulation. For example:

    an SCTP/IPv4 connection:
        /ip4/10.20.30.40/sctp/1234/

    an SCTP/IPv4 connection proxied over TCP/IPv4:
        /ip4/5.6.7.8/tcp/5678/ip4/1.2.3.4/sctp/1234/

3.3 Routing

IPFS nodes require a routing system that can find (a) other peers' network addresses and (b) peers who can serve particular objects. IPFS achieves this using a DSHT based on S/Kademlia and Coral, using the properties discussed in 2.1. The size of objects and use patterns of IPFS are similar to Coral [5] and Mainline [16], so the IPFS DHT makes a distinction for values stored based on their size. Small values (equal to or less than 1KB) are stored directly on the DHT. For values larger, the DHT stores references, which are the NodeIds of peers who can serve the block.

The inter
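The size-based storage rule described in Section 3.3 can be sketched as follows. This is only an illustration: the in-memory map standing in for the DSHT, the Record type, the Put method, and the reading of 1KB as 1024 bytes are all assumptions of this sketch, not part of the IPFS routing interface.

```go
package main

import "fmt"

// maxInlineSize is the threshold from the text.
// Assumption: "1KB" is read as 1024 bytes.
const maxInlineSize = 1024

// NodeId is simplified to a string for this sketch.
type NodeId string

// Record holds either an inline value or references to providers.
type Record struct {
	Value     []byte   // set for small values stored directly
	Providers []NodeId // set for larger values: peers who can serve the block
}

// DHT is a toy in-memory stand-in for the real distributed table.
type DHT map[string]Record

// Put stores small values directly and, for larger values,
// records the NodeId of the peer that can serve them.
func (d DHT) Put(key string, value []byte, self NodeId) {
	if len(value) <= maxInlineSize {
		d[key] = Record{Value: value}
		return
	}
	d[key] = Record{Providers: append(d[key].Providers, self)}
}

func main() {
	dht := DHT{}
	dht.Put("small", []byte("hello"), "node-A")
	dht.Put("large", make([]byte, 4096), "node-A")
	dht.Put("large", make([]byte, 4096), "node-B")

	fmt.Println(len(dht["small"].Value), len(dht["small"].Providers)) // 5 0
	fmt.Println(len(dht["large"].Value), dht["large"].Providers)      // 0 [node-A node-B]
}
```

Keeping only references for large values means the table itself stays small, while the data is fetched from peers that already hold it.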
