Open Source Column: Tribler: P2P Search, Share and Stream

Volume 4, Issue 1, March 2012 (ISSN 1947-4598)


 

By Niels Zeilemaker and Johan Pouwelse

URL: http://www.tribler.org

Niels Zeilemaker, Delft University of Technology, The Netherlands niels@zeilemaker.nl

Johan Pouwelse, Delft University of Technology, The Netherlands peer2peer@gmail.com

Six years ago, we created a new open source P2P file sharing program called Tribler. During this time over one million users have used it, and three generations of Ph.D. students tested their algorithms in the real world.
Tribler is built around BitTorrent. Introduced in 2001, BitTorrent revolutionized the P2P world because of its unprecedented efficiency. However, some problems are not properly addressed in BitTorrent. First, it does not specify how to search the network, relying instead on central websites. These websites allow users to find and download small metadata files called torrents. A torrent describes the content and is required for downloading to start. Second, BitTorrent's unique method for downloading files is incompatible for streaming. This is due to the fact that it is optimized for speedy and reliable downloading, not providing a method for quick buffering.
Tribler is the first client which continuously tries to improve upon the basic BitTorrent implementation by addressing some of the flaws described above. It implements, amongst others, remote search, streaming, channels and reputation-management. All these features are implemented in a completely distributed manner, not relying on any centralized component. Still, Tribler manages to remain fully backwards compatible with BitTorrent. Work on Tribler was initiated in 2005 and has been supported by multiple European grants.

In order to maximize the resource contribution of peers (other computers downloading/uploading the same file), BitTorrent splits a file into small pieces. This way, downloaders (called leechers) can upload completed pieces to other leechers, without the need to have the complete file first. Furthermore, uploading is encouraged by the tit-for-tat incentive mechanism: a leecher will rank its connected peers by their upload speed and will upload only to the fastest uploaders. Peers which have the complete file can help others by sending them pieces for free, these peers are called seeders.
Before being able to download a file, a peer first has to obtain a torrent. This torrent describes the content of the file and includes the SHA-1 cryptographic hash per piece. This mechanism protects against transfer errors and malicious modifications of the file.

Tribler Design and Features

Tribler component overview

A basic overview of Tribler is shown in the figure on the right, it consists of four distinct components.

  • GUI: build using wxWidgets in order to be platform independent
  • BTengine: a BitTorrent engine, which has been altered to allow for our customizations
  • BuddyCast: our custom BitTorrent overlay, which is slowly being phased out
  • Dispersy: our new custom protocol build with NAT traversal and distributed permissions

Tribler is built around PermIds, permanent identities that allow us to identify the actions of users. PermIds are stored as a public/private keypair and are used in Tribler to sign every message.
Communication between peers is established by using a 'special' BitTorrent swarm. All Tribler peers connect to this swarm and communicate with each other using the BuddyCast protocol. Peers are discovered by connecting to SuperPeers. Identical to 'normal' peers, but are considered to always be online. BuddyCast connects to a new peer every 15s and will exchange its preference list. This list contains the last downloads of a peer. By collecting them, we can calculate which peers are most similar to a given user. Those peers, taste buddies, are then used during search. While performing the BuddyCast handshake the user will exchange his current connections as well, allowing it to hillclimb towards finding his most similar peers and discovering new peers.

 

Search
Tribler search results

Performing remote search in a decentralized manner has been a problem for many years. An early P2P protocol, Gnutella, used to send a message to all of its neighbors, which was then forwarded until a TTL of 7 was reached. Such an implementation is called flooding, as it causes a search-query to be sent to almost all peers in the network. Flooding a network is very quick, but it consumes huge amounts of bandwidth.
In contrast, Tribler uses a TTL of 1 (i.e., it only uses its neighbors to perform a remote search). Using the connections of the user to similar peers, we can obtain good results. Using only taste buddies, hitrates over 60% are possible. This figure is further improved by local caches deployed at every peer. The caches contain information for up to 50 000 torrents, which are used for improving search. Tribler connects to up to 10 taste buddies and to 10 random peers, thus allowing the user to search within up to 1050000 torrents.
A torrent is collected when our algorithms deem it interesting for a peer. This is calculated by using the same user-item matrix that is used to find similar peers. The user-item matrix is constructed by storing the BuddyCast preferences. Collaborative filtering allows us to 'recommend' torrents to be collected. Collected torrents are thus tailored for every user, resulting in quicker search results, as we can display in the GUI the locally cached results before receiving any response from a peer.
More details are available in our papers [1,2].

Streaming

Tribler VOD streaming

Tribler supports two distinct types of streaming: Video-On-Demand and Live-Steaming. Both of them extend BitTorrent, by replacing one aspect of it. VOD requires a different approach for downloading pieces. The default policy of BitTorrent is to download the rarest piece first, ensuring the health of all pieces in the swarm. In contrast, in VOD we want to download the first few pieces as soon as possible to commence playback as soon as possible.
To allow for this we have defined three priorities (high, mid and low). Priorities are assigned to pieces, based on the current playback position. High priority pieces are downloaded in-order, thus allowing to start playback quickly. After all high priority pieces are downloaded, we start downloading mid priority pieces. Those are downloaded rarest-first, this to ensure we maintain the overall health of the swarm. Because mid priority pieces are only a subset of all available pieces, we still ensure that the playback-buffer remains stable. After downloading all high and mid priority pieces, we start downloading the low priority pieces, rarest-first. Because the playback position is moving forward, the priority of the pieces will be continuously modified.
Furthermore, we replaced the default BitTorrent incentive mechanism (tit-for-tat) with Give-to-Get. This incentive mechanism will rank peers according to their forwarding rank. A metric describing how well a peer is sending pieces to other peers. Full details are available in our paper [5].

For live streaming we had to modify the actual torrent file. Because in live streaming pieces are not know beforehand, we cannot include their SHA-1 hashes. We replaced the verification scheme by specifying the public key of the original source in the torrent file. Using the public key, every peer can then verify the validity of the pieces. Because live streaming may have an indefinite duration, we keep pieces that are at most 15 minutes old relative to our playback position.

Channels
Channel overview: listing all torrents
Channel comments: listing latest received comments for this Channel
Channel activity: listing latest activity for this Channel

Since December 2011 we are evaluating, in the wild, the performance of our new transport protocol, Dispersy. Dispersy is the successor of BuddyCast, and it is focused on NAT-traversal. Instead of using TCP, Dispersy uses UDP. Furthermore, while BuddyCast implemented one global overlay to connect all Tribler-peers using a 'special' BitTorrent swarm, Dispersy creates a separate overlay per protocol.
Using Dispersy we implemented Channels. These are created by users and consist of a list of torrents they like. Channels are implemented as separate Dispersy overlays and are discovered through one special overlay to which all peers connect. Channels have evolved from a simple list of torrents to a community in which users can comment on torrents, modify their name and description and organize playlists. Modifications are publicly visible, in a system that resembles Wikipedia. By allowing everyone to edit/improve the metadata of torrents, we hope to get a similar quality of experience as in Wikipedia. If a channel owner (the user creating the channel) does not want other users to interfere with his channel, he can limit which messages other users are able to send. This is enabled by using the decentralized permission system build into Dispersy, and it allows for a flexible configuration of channels.
By voting on a Channel, Dispersy will start to collect its contents. Before voting on a Channel only a snapshot of its content is available. More details regarding the voting is described in our paper [6]. Currently, popular channels have well over 30 000 torrents. Furthermore, our users have currently casted over 60 000 votes.

Reputation
BarterCast graph: showing data transfers between peers

A feature lacking in BitTorrent is the cross-swarm identification of peers. While downloading a file, peers have an incentive to upload to others due to tit-for-tat. But after completing a download, no incentives are in place to motivate a peer to keep uploading a file.
In order to address this, Tribler uses its PermIds to identify Tribler peers in other swarms. Additionally, we employ a mechanism called BarterCast, which builds a history of upload and download traffic between Tribler peers. We can then build a graph consisting of the download behavior of peers, scoring them accordingly. A peer which has shown to upload more than others is rewarded by being able to download at a faster rate, while lacking peers can be prevented from downloading at all. More details are available in our papers [3,4].

Acknowledgments

Since the start of the project in 2005 many many people have contributed to the project. Amongst others are A. Bakker, J.J.D. Mol, J. Yang, L. d'Acunto, J.A. Pouwelse, J. Wang, P. Garbacki, A. Iosup, J. Doumen, J. Roozenburg, Y. Yuan, M. ten Brinke, L. Musat, F. Zindel, F. van der Werf, M. Meulpolder, J. Taal, R. Rahman, B. Schoon and N.S.M. Zeilemaker. Tribler is a project which continues to evolve with the help of its community. Currently we have an active userbase who are commenting and suggesting features in the forums and continues to innovate together with our european partners.
If you are interested by the text above and want to try out Tribler, you can download it from our website http://www.tribler.org. Furthermore, the website has even more documentation of feature Tribler has and had, feel free to look around and leave a comment in the forums.

References
  1. Pouwelse JA, Garbacki P, Wang J, et al. Tribler: A social-based peer-to-peer system. Concurrency and Computation: Practice and Experience 2008. Available at: http://www3.interscience.wiley.com/journal/114219988/abstract
  2. Zeilemaker N, Capota M, Bakker A. Tribler: P2P media search and sharing. Proceedings of the 19th ACM international conference on Multimedia (ACM MM) 2011. Available at: http://dl.acm.org/citation.cfm?id=2072433
  3. Meulpolder M, Pouwelse J. Bartercast: A practical approach to prevent lazy freeriding in p2p networks. Sixth Int'l Workshop on Hot Topics in P2P Systems (HoT-P2P) 2009. Available at: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5160954
  4. Delaviz R, Andrade N, Pouwelse JA. Improving accuracy and coverage in an internet-deployed reputation mechanism. IEEE Tenth International Conference on Peer-to-Peer Computing (IEEE P2P) 2010. Available at: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5569965
  5. Mol J, Pouwelse J, Meulpolder M, Epema D, Sips H. Give-to-get: Free-riding-resilient video-on-demand in p2p systems. Fifteenth Annual Multimedia Computing and Networking (MMCN) 2008. Available at: http://www.pds.ewi.tudelft.nl/pubs/papers/mmcn2008.pdf
  6. Rahman R, Hales D, Meulpolder M, Heinink V, Pouwelse JA and Sips H. Robust vote sampling in a P2P media distribution system. In: Proceedings of IPDPS 2009. Available at: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5160946
  7. Tribler Protocol Specification. Available at: http://svn.tribler.org/bt2-design/proto-spec-unified/trunk/proto-spec-current.pdf

Previous Section Table of Contents Next Section